Release Notes - Spark - Version 3.2.4 - HTML format

Sub-task

  • [SPARK-41388] - getReusablePVCs should ignore recently created PVCs in the previous batch
  • [SPARK-42071] - Register scala.math.Ordering$Reverse to KryoSerializer

Bug

  • [SPARK-38173] - Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
  • [SPARK-39399] - proxy-user not working for Spark on k8s in cluster deploy mode
  • [SPARK-39596] - Run `Linters, licenses, dependencies and documentation generation` GitHub Actions failed
  • [SPARK-40817] - Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode
  • [SPARK-40819] - Parquet INT64 (TIMESTAMP(NANOS,true)) now throwing Illegal Parquet type instead of automatically converting to LongType
  • [SPARK-41162] - Anti-join must not be pushed below aggregation with ambiguous predicates
  • [SPARK-41254] - YarnAllocator.rpIdToYarnResource map is not properly updated
  • [SPARK-41376] - Executor netty direct memory check should respect spark.shuffle.io.preferDirectBufs
  • [SPARK-41554] - Decimal.changePrecision produces ArrayIndexOutOfBoundsException
  • [SPARK-41732] - Session window: analysis rule "SessionWindowing" does not apply tree-pattern based pruning
  • [SPARK-41952] - Upgrade Parquet to fix off-heap memory leaks in Zstd codec
  • [SPARK-41989] - PYARROW_IGNORE_TIMEZONE warning can break application logging setup
  • [SPARK-42090] - Introduce sasl retry count in RetryingBlockTransferor
  • [SPARK-42157] - `spark.scheduler.mode=FAIR` should provide FAIR scheduler
  • [SPARK-42168] - CoGroup with window function returns incorrect result when partition keys differ in order
  • [SPARK-42188] - Force SBT protobuf version to match Maven on branch 3.2 and 3.3
  • [SPARK-42201] - `build/sbt` should allow SBT_OPTS to override JVM memory setting
  • [SPARK-42259] - ResolveGroupingAnalytics should take care of Python UDAF
  • [SPARK-42462] - Prevent `docker-image-tool.sh` from publishing OCI manifests
  • [SPARK-42478] - Make a serializable jobTrackerId instead of a non-serializable JobID in FileWriterFactory
  • [SPARK-42596] - [YARN] OMP_NUM_THREADS not set to number of executor cores by default
  • [SPARK-42649] - Remove the standard Apache License header from the top of third-party source files
  • [SPARK-42673] - Make build/mvn build Spark only with the verified maven version
  • [SPARK-42697] - /api/v1/applications returns 0 for duration
  • [SPARK-42747] - Fix incorrect internal status of LoR and AFT
  • [SPARK-42785] - [K8S][Core] spark-submit without --deploy-mode hits an NPE on Kubernetes
  • [SPARK-42799] - Update SBT build `xercesImpl` version to match with pom.xml
  • [SPARK-42906] - Replace a starting digit with `x` in resource name prefix
  • [SPARK-42967] - Fix SparkListenerTaskStart.stageAttemptId when a task is started after the stage is cancelled
  • [SPARK-43004] - vendor==vendor typo in ResourceRequest.equals()
  • [SPARK-43005] - `v is v >= 0` typo in pyspark/pandas/config.py
  • [SPARK-43069] - Use `sbt-eclipse` instead of `sbteclipse-plugin`

Improvement

  • [SPARK-41360] - Avoid BlockManager re-registration if the executor has been lost
  • [SPARK-42934] - Testing OrcEncryptionSuite using maven is always skipped
  • [SPARK-43395] - Exclude macOS tar extended metadata in make-distribution.sh

Test

  • [SPARK-36883] - Upgrade R version to 4.1.1 in CI images
  • [SPARK-41863] - Skip `flake8` tests if the command is not available
  • [SPARK-41865] - Use pycodestyle 2.7.0 to fix pycodestyle errors
