Sub-task
- [SPARK-33976] - Add a dedicated SQL document page for the TRANSFORM-related functionality
- [SPARK-34507] - Spark artefacts built against Scala 2.13 incorrectly depend on Scala 2.12
- [SPARK-34543] - Respect case sensitivity in V1 ALTER TABLE .. SET LOCATION
- [SPARK-35093] - AQE columnar mismatch on exchange reuse
- [SPARK-35159] - Extract doc of Hive format
- [SPARK-35168] - mapred.reduce.tasks should be shuffle.partitions not adaptive.coalescePartitions.initialPartitionNum
- [SPARK-35695] - QueryExecutionListener does not see any observed metrics fired before persist/cache
Bug
- [SPARK-32924] - Web UI sort on duration is wrong
- [SPARK-33482] - V2 Datasources that extend FileScan preclude exchange reuse
- [SPARK-33504] - The application log in the Spark history server contains sensitive attributes such as passwords that should be redacted instead of stored in plain text
- [SPARK-34392] - Invalid ID for offset-based ZoneId since Spark 3.0
- [SPARK-34421] - Custom functions can't be used in temporary views with CTEs
- [SPARK-34449] - Upgrade Jetty to fix CVE-2020-27218
- [SPARK-34534] - New protocol FetchShuffleBlocks in OneForOneBlockFetcher leads to data loss or correctness issues
- [SPARK-34545] - PySpark Python UDFs return inconsistent results when applying 2 UDFs with different return types to 2 columns together
- [SPARK-34556] - Checking duplicate static partition columns doesn't respect the case-sensitivity conf
- [SPARK-34596] - NewInstance.doGenCode should not throw malformed class name error
- [SPARK-34607] - NewInstance.resolved should not throw malformed class name error
- [SPARK-34676] - TableCapabilityCheckSuite should not inherit all tests from AnalysisSuite
- [SPARK-34696] - Fix CodegenInterpretedPlanTest to generate correct test cases
- [SPARK-34697] - Allow DESCRIBE FUNCTION and SHOW FUNCTIONS explain about || (string concatenation operator).
- [SPARK-34719] - fail if the view query has duplicated column names
- [SPARK-34723] - Correct parameter type for subexpression elimination under whole-stage
- [SPARK-34724] - Fix Interpreted evaluation by using getClass.getMethod instead of getDeclaredMethod
- [SPARK-34743] - ExpressionEncoderSuite should use deepEquals when we expect `array of array`
- [SPARK-34747] - Add virtual operators to the built-in function document.
- [SPARK-34756] - Fix FileScan equality check
- [SPARK-34760] - Running JavaSQLDataSourceExample fails with an exception in runBasicDataSourceExample()
- [SPARK-34763] - col(), $"<name>" and df("name") should handle quoted column names properly.
- [SPARK-34768] - Respect the default input buffer size in Univocity
- [SPARK-34772] - RebaseDateTime loadRebaseRecords should use Spark classloader instead of context
- [SPARK-34774] - The `change-scala-version.sh` script does not replace the scala.version property correctly
- [SPARK-34776] - Catalyst error on certain struct operation (Couldn't find _gen_alias_)
- [SPARK-34794] - Nested higher-order functions broken in DSL
- [SPARK-34798] - Fix incorrect join condition
- [SPARK-34811] - Redact fs.s3a.access.key like secret and token
- [SPARK-34832] - ExternalAppendOnlyUnsafeRowArrayBenchmark can't run with spark-submit
- [SPARK-34834] - There is a potential Netty memory leak in TransportResponseHandler.
- [SPARK-34845] - ProcfsMetricsGetter.computeAllMetrics may return partial metrics when some of the child PIDs' metrics are missing
- [SPARK-34874] - Recover test reports for failed GA builds
- [SPARK-34876] - Non-nullable aggregates can return NULL in a correlated subquery
- [SPARK-34897] - Support reconcile schemas based on index after nested column pruning
- [SPARK-34900] - Some `spark-submit` commands used to run benchmarks in the user's guide are wrong
- [SPARK-34909] - conv() does not convert negative inputs to unsigned correctly
- [SPARK-34926] - PartitionUtils.getPathFragment should handle null value
- [SPARK-34933] - Remove the description that || and && can be used as logical operators from the document.
- [SPARK-34939] - Throw fetch failure exception when unable to deserialize broadcasted map statuses
- [SPARK-34963] - Nested column pruning fails to extract case-insensitive struct field from array
- [SPARK-34988] - Upgrade Jetty for CVE-2021-28165
- [SPARK-35014] - A foldable expression could not be replaced by an AttributeReference
- [SPARK-35080] - Correlated subqueries with equality predicates can return wrong results
- [SPARK-35096] - foreachBatch throws ArrayIndexOutOfBoundsException if schema is case-insensitive
- [SPARK-35106] - HadoopMapReduceCommitProtocol performs bad rename when dynamic partition overwrite is used
- [SPARK-35142] - `OneVsRest` classifier uses incorrect data type for `rawPrediction` column
- [SPARK-35178] - Maven auto-download failing
- [SPARK-35210] - Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue
- [SPARK-35244] - invoke should throw the original exception
- [SPARK-35278] - Invoke should find the method with correct number of parameters
- [SPARK-35288] - StaticInvoke should find the method without exact argument classes match
- [SPARK-35296] - Dataset.observe fails with an assertion
- [SPARK-35393] - PIP packaging test is skipped in GitHub Actions build
- [SPARK-35425] - Pin jinja2 in spark-rm/Dockerfile and add as a required dependency in the release README.md
- [SPARK-35458] - ARM CI failed: failed to validate maven sha512
- [SPARK-35463] - Skip checking checksum on a system that doesn't have `shasum`
- [SPARK-35482] - Case-sensitive block manager port key should be used in BasicExecutorFeatureStep
- [SPARK-35493] - spark.blockManager.port does not work for driver pod
- [SPARK-35566] - Fix number of output rows for StateStoreRestoreExec
- [SPARK-35573] - Make SparkR tests pass with R 4.1+
- [SPARK-35610] - Memory leak in Spark interpreter
- [SPARK-35653] - [SQL] CatalystToExternalMap interpreted path fails for Map with case classes as keys or values
- [SPARK-35659] - Avoid write null to StateStore
- [SPARK-35673] - Spark fails on unrecognized hint in subquery
- [SPARK-35679] - Overflow on converting valid Timestamp to Microseconds
- [SPARK-38141] - NoSuchMethodError: org.json4s.JsonDSL$JsonAssoc org.json4s.JsonDSL$.pair2Assoc
Improvement
- [SPARK-34683] - Update the documents to explain the usage of LIST FILE and LIST JAR in case they take multiple file names
- [SPARK-34915] - Cache Maven, SBT and Scala in all jobs that use them
- [SPARK-34922] - Use better CBO cost function
- [SPARK-34940] - Fix minor unit test in BasicWriteTaskStatsTrackerSuite
- [SPARK-35002] - Fix the java.net.BindException when testing with Github Action
- [SPARK-35045] - Add an internal option to control input buffer in univocity
- [SPARK-35127] - When we switch between different stage-detail pages, the entry item in the newly-opened page may be blank.
- [SPARK-35227] - Replace Bintray with the new repository service for the spark-packages resolver in SparkSubmit
- [SPARK-35358] - Set maximum Java heap used for release build
- [SPARK-35373] - Verify checksums of downloaded artifacts in build/mvn
- [SPARK-35687] - PythonUDFSuite move assume into its methods
- [SPARK-35714] - Bug fix for deadlock during the executor shutdown
Test
- [SPARK-24931] - Recover lint-r job in GitHub Actions workflow
- [SPARK-34424] - HiveOrcHadoopFsRelationSuite fails with seed 610710213676
- [SPARK-34604] - Flaky test: TaskContextTestsWithWorkerReuse.test_task_context_correct_with_python_worker_reuse
- [SPARK-34610] - Fix Python UDF used in GroupedAggPandasUDFTests.
- [SPARK-34795] - Adds a new job in GitHub Actions to check the output of TPC-DS queries
- [SPARK-34951] - Recover Python linter (Sphinx build) in GitHub Actions
- [SPARK-35192] - Port minimal TPC-DS datagen code from databricks/spark-sql-perf
- [SPARK-35293] - Use the newer dsdgen for TPCDSQueryTestSuite
- [SPARK-35327] - Filters out the TPC-DS queries that can cause flaky test results
- [SPARK-35413] - Use the SHA of the latest commit when checking out databricks/tpcds-kit
Task
- [SPARK-34970] - Redact map-type options in the output of explain()
- [SPARK-35233] - Switch from bintray to scala.jfrog.io for SBT download in branch 2.4 and 3.0
- [SPARK-35495] - Change SparkR maintainer for CRAN
Documentation
- [SPARK-35405] - Submitting Applications documentation has outdated information about K8s client mode support