Sub-task
- [SPARK-38697] - Extend SparkSessionExtensions to inject rules into AQE Optimizer
- [SPARK-39200] - Stream is corrupted Exception while fetching the blocks from fallback storage system
- [SPARK-39965] - Skip PVC cleanup when driver doesn't own PVCs
- [SPARK-40459] - recoverDiskStore should not stop by existing recomputed files
- [SPARK-40636] - Fix wrong remained shuffles log in BlockManagerDecommissioner
Bug
- [SPARK-8731] - Beeline doesn't work with -e option when started in background
- [SPARK-32380] - Spark SQL cannot access Hive table when data is in HBase
- [SPARK-35542] - Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols can not be loaded after saving it.
- [SPARK-39184] - ArrayIndexOutOfBoundsException for some date/time sequences in some time-zones
- [SPARK-39647] - Block push fails with java.lang.IllegalArgumentException: Active local dirs list has not been updated by any executor registration even when the NodeManager hasn't been restarted
- [SPARK-39775] - Regression due to AVRO-2035
- [SPARK-39833] - Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true
- [SPARK-39835] - Fix EliminateSorts removing global sort below the local sort
- [SPARK-39839] - Handle special case of null variable-length Decimal with non-zero offsetAndSize in UnsafeRow structural integrity check
- [SPARK-39847] - Race condition related to interruption of task threads while they are in RocksDBLoader.loadLibrary()
- [SPARK-39867] - Global limit should not inherit OrderPreservingUnaryNode
- [SPARK-39887] - Expression transform error
- [SPARK-39900] - Issue with querying dataframe produced by 'binaryFile' format using 'not' operator
- [SPARK-39932] - WindowExec should clear the final partition buffer
- [SPARK-39952] - SaveIntoDataSourceCommand should recache result relation
- [SPARK-39962] - Global aggregation against pandas aggregate UDF does not take the column order into account
- [SPARK-39972] - Revert the test case of SPARK-39962 in branch-3.2 and branch-3.1
- [SPARK-40002] - Limit improperly pushed down through window using ntile function
- [SPARK-40065] - Executor ConfigMap is not mounted if profile is not default
- [SPARK-40079] - Add Imputer inputCols validation for empty input case
- [SPARK-40089] - Sorting of at least Decimal(20, 2) fails for some values near the max.
- [SPARK-40117] - Convert condition to java in DataFrameWriterV2.overwrite
- [SPARK-40121] - Initialize projection used for Python UDF
- [SPARK-40124] - Update TPCDS v1.4 q32 for Plan Stability tests
- [SPARK-40149] - Star expansion after outer join asymmetrically includes joining key
- [SPARK-40169] - Fix the issue with Parquet column index and predicate pushdown in Data source V1
- [SPARK-40212] - SparkSQL castPartValue does not properly handle byte & short
- [SPARK-40218] - GROUPING SETS should preserve the grouping columns
- [SPARK-40270] - Make compute.max_rows as None working in DataFrame.style
- [SPARK-40280] - Failure to create parquet predicate push down for ints and longs on some valid files
- [SPARK-40315] - Non-deterministic hashCode() calculations for ArrayBasedMapData on equal objects
- [SPARK-40407] - Repartition of DataFrame can result in severe data skew in some special case
- [SPARK-40470] - arrays_zip output unexpected alias column names when using GetMapValue and GetArrayStructFields
- [SPARK-40493] - Revert "[SPARK-33861][SQL] Simplify conditional in predicate"
- [SPARK-40562] - Add spark.sql.legacy.groupingIdWithAppendedUserGroupBy
- [SPARK-40583] - Documentation error in "Integration with Cloud Infrastructures"
- [SPARK-40588] - Sorting issue with partitioned-writing and AQE turned on
- [SPARK-40612] - On Kubernetes for long running app Spark using an invalid principal to renew the delegation token
- [SPARK-40660] - Switch to XORShiftRandom to distribute elements
- [SPARK-40829] - STORED AS serde in CREATE TABLE LIKE view does not work
- [SPARK-40851] - TimestampFormatter behavior changed when using the latest Java 8/11/17
- [SPARK-40869] - KubernetesConf.getResourceNamePrefix creates invalid name prefixes
- [SPARK-40874] - Fix broadcasts in Python UDFs when encryption is enabled
- [SPARK-40902] - Quick submission of drivers in tests to Mesos scheduler results in dropping drivers
- [SPARK-40963] - ExtractGenerator sets incorrect nullability in new Project
- [SPARK-40987] - Avoid creating a directory when deleting a block, causing DAGScheduler to not work
- [SPARK-41035] - Incorrect results or NPE when a literal is reused across distinct aggregations
- [SPARK-41091] - Fix Docker release tool for branch-3.2
- [SPARK-41188] - Set executorEnv OMP_NUM_THREADS to be spark.task.cpus by default for spark executor JVM processes
- [SPARK-41327] - Fix SparkStatusTracker.getExecutorInfos by switching On/OffHeapStorageMemory info
- [SPARK-41395] - InterpretedMutableProjection can corrupt unsafe buffer when used with decimal data
- [SPARK-41448] - Make consistent MR job IDs in FileBatchWriter and FileFormatWriter
- [SPARK-41522] - GA dependencies test failed
- [SPARK-41535] - InterpretedUnsafeProjection and InterpretedMutableProjection can corrupt unsafe buffer when used with calendar interval data
- [SPARK-41668] - DECODE function returns wrong results when passed NULL
Improvement
- [SPARK-38034] - Optimize time complexity and extend applicable cases for TransposeWindow
- [SPARK-39831] - R dependencies installation starts to fail after devtools_2.4.4 was released
- [SPARK-39879] - Reduce local-cluster memory configuration in BroadcastJoinSuite* and HiveSparkSubmitSuite
- [SPARK-40022] - YarnClusterSuite should not be ABORTED when there is no Python3 environment
- [SPARK-40241] - Correct the link of GenericUDTF
- [SPARK-40490] - `YarnShuffleIntegrationSuite` no longer verifies `registeredExecFile` reload after SPARK-17321
- [SPARK-40574] - Add PURGE to DROP TABLE doc
- [SPARK-41541] - Fix wrong child call in SQLShuffleWriteMetricsReporter.decRecordsWritten()
Test
- [SPARK-40172] - Temporarily disable flaky test cases in ImageFileFormatSuite
- [SPARK-40461] - Set upperbound for pyzmq 24.0.0 for Python linter
Task
- [SPARK-40213] - Incorrect ASCII value for Latin-1 Supplement characters
- [SPARK-40292] - arrays_zip output unexpected alias column names
Dependency upgrade
- [SPARK-40801] - Upgrade Apache Commons Text to 1.10
Documentation
- [SPARK-40043] - Document DataStreamWriter.toTable and DataStreamReader.table
- [SPARK-40983] - Remove Hadoop requirements for zstd mention in Parquet compression codec