Release Notes - Spark - Version 3.3.4

Sub-task

  • [SPARK-44857] - Fix getBaseURI error in Spark Worker LogPage UI buttons
  • [SPARK-45187] - Fix WorkerPage to use the same pattern for `logPage` urls
  • [SPARK-45749] - Fix Spark History Server to sort `Duration` column properly
  • [SPARK-46012] - EventLogFileReader should not read rolling logs if appStatus is missing
  • [SPARK-46095] - Document REST API for Spark Standalone Cluster

Bug

  • [SPARK-43327] - Trigger `committer.setupJob` before plan execute in `FileFormatWriter`
  • [SPARK-43393] - Sequence expression can overflow
  • [SPARK-44074] - `Logging plan changes for execution` test failed
  • [SPARK-44547] - BlockManagerDecommissioner throws exceptions when migrating RDD cached blocks to fallback storage
  • [SPARK-44581] - ShutdownHookManager gets wrong Hadoop user group information
  • [SPARK-44805] - Data lost after union using spark.sql.parquet.enableNestedColumnVectorizedReader=true
  • [SPARK-44813] - The JIRA Python misses our assignee when it searches user again
  • [SPARK-44843] - flaky test: RocksDBStateStoreStreamingAggregationSuite
  • [SPARK-44871] - Fix PERCENTILE_DISC behaviour
  • [SPARK-44925] - K8s default service token file should not be materialized into token
  • [SPARK-44935] - Fix `RELEASE` file to have the correct information in Docker images
  • [SPARK-44973] - Fix ArrayIndexOutOfBoundsException in conv()
  • [SPARK-44990] - CSV conversion performance severely degraded for null fields
  • [SPARK-45057] - Deadlock caused by rdd replication level of 2
  • [SPARK-45079] - percentile_approx() fails with an internal error on NULL accuracy
  • [SPARK-45100] - reflect() fails with an internal error on NULL class and method
  • [SPARK-45210] - Switch languages consistently across docs for all code snippets (Spark 3.4 and below)
  • [SPARK-45227] - Fix a subtle thread-safety issue with CoarseGrainedExecutorBackend where an executor process randomly gets stuck
  • [SPARK-45430] - FramelessOffsetWindowFunctionFrame fails when ignore nulls and offset > # of rows
  • [SPARK-45508] - Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+
  • [SPARK-45580] - Subquery changes the output schema of the outer query
  • [SPARK-45670] - SparkSubmit does not support --total-executor-cores when deploying on K8s
  • [SPARK-45885] - Upgrade ORC to 1.7.10
  • [SPARK-45920] - group by ordinal should be idempotent
  • [SPARK-45935] - Fix RST files link substitutions error
  • [SPARK-46006] - YarnAllocator misses cleaning targetNumExecutorsPerResourceProfileId after YarnSchedulerBackend calls stop
  • [SPARK-46019] - Fix HiveThriftServer2ListenerSuite and ThriftServerPageSuite to create java.io.tmpdir if it doesn't exist
  • [SPARK-46092] - Overflow in Parquet row group filter creation causes incorrect results
  • [SPARK-46239] - Hide Jetty info
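
Several of the fixes above involve runtime configuration. For example, the JVM option from SPARK-45508 can be supplied through Spark's standard `extraJavaOptions` properties; the sketch below shows one way to do so in `spark-defaults.conf`, assuming a Java 9+ runtime (exact placement may differ for your deployment).

```properties
# Illustrative spark-defaults.conf entries passing the --add-opens flag
# from SPARK-45508 to both driver and executor JVMs.
spark.driver.extraJavaOptions   --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED
spark.executor.extraJavaOptions --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED
```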

Improvement

  • [SPARK-44920] - Use await() instead of awaitUninterruptibly() in TransportClientFactory.createClient()
  • [SPARK-45127] - Exclude README.md from document build
  • [SPARK-45286] - Add back Matomo analytics to release docs
  • [SPARK-45751] - The default value of `spark.executor.logs.rolling.maxRetainedFiles` on the official website is incorrect
  • [SPARK-45829] - The default value of `spark.executor.logs.rolling.maxSize` on the official website is incorrect
  • [SPARK-46286] - Document spark.io.compression.zstd.bufferPool.enabled
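
The settings referenced above are ordinary Spark configuration properties. A minimal sketch of how they might appear in `spark-defaults.conf` follows; the values shown are illustrative examples only, not the corrected defaults (see the linked JIRA issues and the official configuration docs for those).

```properties
# Example (not default) values for the properties touched in this section.
spark.io.compression.zstd.bufferPool.enabled  true
spark.executor.logs.rolling.maxRetainedFiles  10
spark.executor.logs.rolling.maxSize           1048576
```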

Test

  • [SPARK-45568] - WholeStageCodegenSparkSubmitSuite flakiness

Documentation

  • [SPARK-44725] - Document spark.network.timeoutInterval
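
The newly documented property above can be set like any other network timeout knob; a hedged example in `spark-defaults.conf`, with an assumed illustrative value rather than the documented default:

```properties
# Example only: interval at which the driver checks for network timeouts.
spark.network.timeoutInterval  60s
```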
