Release Notes - ASF JIRA

Release Notes - Spark - Version 3.1.2 - HTML format

Sub-task

[SPARK-33976] - Add a dedicated SQL document page for the TRANSFORM-related functionality
[SPARK-34507] - Spark artefacts built against Scala 2.13 incorrectly depend on Scala 2.12
[SPARK-34543] - Respect case sensitivity in V1 ALTER TABLE .. SET LOCATION
[SPARK-34561] - Cannot drop/add columns from/to a dataset of v2 `DESCRIBE TABLE`
[SPARK-34577] - Cannot drop/add columns from/to a dataset of v2 `DESCRIBE NAMESPACE`
[SPARK-34630] - Add type hints of pyspark.__version__ and pyspark.sql.Column.contains
[SPARK-34682] - Regression in "operating on canonicalized plan" check in CustomShuffleReaderExec
[SPARK-34711] - Exercise code-gen enable/disable code paths for SHJ in join test suites
[SPARK-34790] - Fail in fetch shuffle blocks in batch when i/o encryption is enabled.
[SPARK-34840] - Fix cases of corruption in merged shuffle blocks that are pushed
[SPARK-35019] - Improve type hints on pyspark.sql.*
[SPARK-35093] - AQE columnar mismatch on exchange reuse
[SPARK-35159] - extract doc of hive format
[SPARK-35168] - mapred.reduce.tasks should be shuffle.partitions not adaptive.coalescePartitions.initialPartitionNum
[SPARK-35431] - Sort elements generated by collect_set in SQLQueryTestSuite

Bug

[SPARK-32924] - Web UI sort on duration is wrong
[SPARK-33482] - V2 Datasources that extend FileScan preclude exchange reuse
[SPARK-34128] - Suppress excessive logging of TTransportExceptions in Spark ThriftServer
[SPARK-34225] - Jars or file paths that contain spaces generate FileNotFoundException
[SPARK-34361] - Dynamic allocation on K8s kills executors with running tasks
[SPARK-34392] - Invalid ID for offset-based ZoneId since Spark 3.0
[SPARK-34417] - org.apache.spark.sql.DataFrameNaFunctions.fillMap(values: Seq[(String, Any)]) fails for column name having a dot
[SPARK-34436] - DPP support LIKE ANY/ALL
[SPARK-34473] - avoid NPE in DataFrameReader.schema(StructType)
[SPARK-34490] - table may be resolved as a view if the table is dropped
[SPARK-34497] - JDBC connection provider is not removing kerberos credentials from JVM security context
[SPARK-34504] - Avoid unnecessary view resolving and remove the `performCheck` flag
[SPARK-34515] - Fix NPE if InSet contains null value during getPartitionsByFilter
[SPARK-34531] - Remove Experimental API tag in PrometheusServlet
[SPARK-34534] - New protocol FetchShuffleBlocks in OneForOneBlockFetcher leads to data loss or correctness issues
[SPARK-34545] - PySpark Python UDF returns inconsistent results when applying 2 UDFs with different return types to 2 columns together
[SPARK-34547] - Resolve using child metadata attributes as fallback
[SPARK-34551] - generate-contributors.py, releaseutils.py and translate-contributors.py are broken
[SPARK-34555] - Resolve metadata output from DataFrame
[SPARK-34556] - Checking duplicate static partition columns doesn't respect case sensitive conf
[SPARK-34567] - CreateTableAsSelect should have metrics update too
[SPARK-34584] - When inserting into a partitioned table with an illegal partition value, DSv2 behaves differently from others
[SPARK-34596] - NewInstance.doGenCode should not throw malformed class name error
[SPARK-34599] - INSERT INTO OVERWRITE doesn't support partition columns containing dot for DSv2
[SPARK-34607] - NewInstance.resolved should not throw malformed class name error
[SPARK-34613] - Fix view does not capture disable hint config
[SPARK-34642] - TypeError in Pyspark Linear Regression docs
[SPARK-34643] - Use CRAN URL in canonical form
[SPARK-34660] - Don't use ParVector with `withExistingConf` which is not thread-safe
[SPARK-34674] - Spark app on k8s doesn't terminate without call to sparkContext.stop() method
[SPARK-34676] - TableCapabilityCheckSuite should not inherit all tests from AnalysisSuite
[SPARK-34681] - Full outer shuffled hash join when building left side produces wrong result
[SPARK-34696] - Fix CodegenInterpretedPlanTest to generate correct test cases
[SPARK-34697] - Allow DESCRIBE FUNCTION and SHOW FUNCTIONS to explain || (string concatenation operator).
[SPARK-34713] - group by CreateStruct with ExtractValue fails analysis
[SPARK-34714] - collect_list(struct()) fails when used with GROUP BY
[SPARK-34719] - fail if the view query has duplicated column names
[SPARK-34723] - Correct parameter type for subexpression elimination under whole-stage
[SPARK-34724] - Fix Interpreted evaluation by using getClass.getMethod instead of getDeclaredMethod
[SPARK-34727] - Difference in results of casting float to timestamp
[SPARK-34731] - ConcurrentModificationException in EventLoggingListener when redacting properties
[SPARK-34737] - Discrepancy between TIMESTAMP_SECONDS and cast from float
[SPARK-34743] - ExpressionEncoderSuite should use deepEquals when we expect `array of array`
[SPARK-34747] - Add virtual operators to the built-in function document.
[SPARK-34756] - Fix FileScan equality check
[SPARK-34760] - run JavaSQLDataSourceExample failed with Exception in runBasicDataSourceExample().
[SPARK-34763] - col(), $"<name>" and df("name") should handle quoted column names properly.
[SPARK-34768] - Respect the default input buffer size in Univocity
[SPARK-34770] - InMemoryCatalog.tableExists should not fail if database doesn't exist
[SPARK-34772] - RebaseDateTime loadRebaseRecords should use Spark classloader instead of context
[SPARK-34774] - The `change-scala-version.sh` script does not replace the scala.version property correctly
[SPARK-34776] - Catalyst error on certain struct operation (Couldn't find _gen_alias_)
[SPARK-34794] - Nested higher-order functions broken in DSL
[SPARK-34796] - Codegen compilation error for query with LIMIT operator and without AQE
[SPARK-34798] - Fix incorrect join condition
[SPARK-34803] - Util methods requiring certain versions of Pandas & PyArrow don't pass through the raised ImportError
[SPARK-34811] - Redact fs.s3a.access.key like secret and token
[SPARK-34814] - LikeSimplification should handle NULL
[SPARK-34820] - K8s Integration test failed (due to libldap installation failed)
[SPARK-34829] - transform_values returns identical values when used with a UDF that returns a reference type
[SPARK-34832] - ExternalAppendOnlyUnsafeRowArrayBenchmark can't run with spark-submit
[SPARK-34833] - Apply right-padding correctly for correlated subqueries
[SPARK-34834] - There is a potential Netty memory leak in TransportResponseHandler.
[SPARK-34842] - Corrects the type of date_dim.d_quarter_name in the TPCDS schema
[SPARK-34845] - ProcfsMetricsGetter.computeAllMetrics may return partial metrics when some child PIDs' metrics are missing
[SPARK-34874] - Recover test reports for failed GA builds
[SPARK-34876] - Non-nullable aggregates can return NULL in a correlated subquery
[SPARK-34897] - Support reconcile schemas based on index after nested column pruning
[SPARK-34900] - Some `spark-submit` commands used to run benchmarks in the user's guide are wrong
[SPARK-34909] - conv() does not convert negative inputs to unsigned correctly
[SPARK-34926] - PartitionUtils.getPathFragment should handle null value
[SPARK-34933] - Remove the description that || and && can be used as logical operators from the document.
[SPARK-34939] - Throw fetch failure exception when unable to deserialize broadcasted map statuses
[SPARK-34948] - Add ownerReference to executor configmap to fix leakages
[SPARK-34949] - Executor.reportHeartBeat reregisters blockManager even when Executor is shutting down
[SPARK-34963] - Nested column pruning fails to extract case-insensitive struct field from array
[SPARK-34965] - Remove .sbtopts that duplicately sets the default memory
[SPARK-34988] - Upgrade Jetty for CVE-2021-28165
[SPARK-35004] - Fix Incorrect assertion of "master/worker web ui available behind front-end reverseProxy" in MasterSuite
[SPARK-35014] - A foldable expression could not be replaced by an AttributeReference
[SPARK-35079] - Transform with udf gives incorrect result
[SPARK-35080] - Correlated subqueries with equality predicates can return wrong results
[SPARK-35096] - foreachBatch throws ArrayIndexOutOfBoundsException if schema is case-insensitive
[SPARK-35106] - HadoopMapReduceCommitProtocol performs bad rename when dynamic partition overwrite is used
[SPARK-35117] - UI progress bar no longer highlights in progress tasks
[SPARK-35136] - Initial null value of LiveStage.info can lead to NPE
[SPARK-35142] - `OneVsRest` classifier uses incorrect data type for `rawPrediction` column
[SPARK-35178] - maven autodownload failing
[SPARK-35210] - Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue
[SPARK-35213] - Corrupt DataFrame for certain withField patterns
[SPARK-35226] - JDBC datasources should accept refreshKrb5Config parameter
[SPARK-35244] - invoke should throw the original exception
[SPARK-35278] - Invoke should find the method with correct number of parameters
[SPARK-35288] - StaticInvoke should find the method without exact argument classes match
[SPARK-35359] - Inserting data with char/varchar datatype will fail when data length exceeds the length limitation
[SPARK-35375] - Use Jinja2 < 3.0.0 for Python linter dependency in GA
[SPARK-35381] - Fix lambda variable name issues in nested DataFrame functions in R APIs
[SPARK-35382] - Fix lambda variable name issues in nested DataFrame functions in Python APIs
[SPARK-35393] - PIP packaging test is skipped in GitHub Actions build
[SPARK-35425] - Pin jinja2 in spark-rm/Dockerfile and add as a required dependency in the release README.md
[SPARK-35458] - ARM CI failed: failed to validate maven sha512
[SPARK-35463] - Skip checking checksum on a system that doesn't have `shasum`
[SPARK-35482] - case sensitive block manager port key should be used in BasicExecutorFeatureStep
[SPARK-35493] - spark.blockManager.port does not work for driver pod
[SPARK-36765] - Spark Support for MS Sql JDBC connector with Kerberos/Keytab
[SPARK-38208] - 'Column' object is not callable

Improvement

[SPARK-34482] - Correct the active SparkSession for streaming query
[SPARK-34550] - Skip InSet null value during push filter to Hive metastore
[SPARK-34639] - always remove unnecessary Alias in Analyzer.resolveExpression
[SPARK-34683] - Update the documents to explain the usage of LIST FILE and LIST JAR in case they take multiple file names
[SPARK-34749] - Simplify CreateNamedStruct
[SPARK-34752] - Upgrade Jetty to 9.4.37 to fix CVE-2020-27223
[SPARK-34762] - Many PR's Scala 2.13 build action failed
[SPARK-34766] - Do not capture maven config for views
[SPARK-34915] - Cache Maven, SBT and Scala in all jobs that use them
[SPARK-34922] - Use better CBO cost function
[SPARK-34923] - Metadata output should not always be propagated
[SPARK-34940] - Fix minor unit test in BasicWriteTaskStatsTrackerSuite
[SPARK-35002] - Fix the java.net.BindException when testing with Github Action
[SPARK-35045] - Add an internal option to control input buffer in univocity
[SPARK-35087] - Some columns in table `Aggregated Metrics by Executor` of stage-detail page are shown incorrectly.
[SPARK-35127] - When we switch between different stage-detail pages, the entry item in the newly-opened page may be blank.
[SPARK-35171] - Declare the markdown package as a dependency of the SparkR package
[SPARK-35227] - Replace Bintray with the new repository service for the spark-packages resolver in SparkSubmit
[SPARK-35358] - Set maximum Java heap used for release build
[SPARK-35373] - Verify checksums of downloaded artifacts in build/mvn
[SPARK-35411] - Essential information missing in TreeNode json string

Test

[SPARK-24931] - Recover lint-r job in GitHub Actions workflow
[SPARK-34604] - Flaky test: TaskContextTestsWithWorkerReuse.test_task_context_correct_with_python_worker_reuse
[SPARK-34610] - Fix Python UDF used in GroupedAggPandasUDFTests.
[SPARK-34795] - Adds a new job in GitHub Actions to check the output of TPC-DS queries
[SPARK-34813] - Remove Scala 2.13 build GitHub Action job from branch-3.1
[SPARK-34951] - Recover Python linter (Sphinx build) in GitHub Actions
[SPARK-35192] - Port minimal TPC-DS datagen code from databricks/spark-sql-perf
[SPARK-35293] - Use the newer dsdgen for TPCDSQueryTestSuite
[SPARK-35327] - Filters out the TPC-DS queries that can cause flaky test results
[SPARK-35413] - Use the SHA of the latest commit when checking out databricks/tpcds-kit

Task

[SPARK-34970] - Redact map-type options in the output of explain()
[SPARK-35495] - Change SparkR maintainer for CRAN

Documentation

[SPARK-35250] - SQL DataFrameReader unescapedQuoteHandling parameter is misdocumented
[SPARK-35405] - Submitting Applications documentation has outdated information about K8s client mode support

Edit/Copy Release Notes

The text area below allows the project release notes to be edited and copied to another document.

Release Notes - Spark - Version 3.1.2
    
<h2>        Sub-task
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-33976'>SPARK-33976</a>] -         Add a dedicated SQL document page for the TRANSFORM-related functionality,
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34507'>SPARK-34507</a>] -         Spark artefacts built against Scala 2.13 incorrectly depend on Scala 2.12
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34543'>SPARK-34543</a>] -         Respect case sensitivity in V1 ALTER TABLE .. SET LOCATION
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34561'>SPARK-34561</a>] -         Cannot drop/add columns from/to a dataset of v2 `DESCRIBE TABLE`
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34577'>SPARK-34577</a>] -         Cannot drop/add columns from/to a dataset of v2 `DESCRIBE NAMESPACE`
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34630'>SPARK-34630</a>] -         Add type hints of pyspark.__version__ and pyspark.sql.Column.contains
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34682'>SPARK-34682</a>] -         Regression in &quot;operating on canonicalized plan&quot; check in CustomShuffleReaderExec
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34711'>SPARK-34711</a>] -         Exercise code-gen enable/disable code paths for SHJ in join test suites
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34790'>SPARK-34790</a>] -         Fail in fetch shuffle blocks in batch when i/o encryption is enabled.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34840'>SPARK-34840</a>] -         Fix cases of corruption in merged shuffle blocks that are pushed
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35019'>SPARK-35019</a>] -         Improve type hints on pyspark.sql.*
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35093'>SPARK-35093</a>] -         AQE columnar mismatch on exchange reuse
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35159'>SPARK-35159</a>] -         extract doc of hive format
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35168'>SPARK-35168</a>] -         mapred.reduce.tasks should be shuffle.partitions not adaptive.coalescePartitions.initialPartitionNum
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35431'>SPARK-35431</a>] -         Sort elements generated by collect_set in SQLQueryTestSuite 
</li>
</ul>
            
<h2>        Bug
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-32924'>SPARK-32924</a>] -         Web UI sort on duration is wrong
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-33482'>SPARK-33482</a>] -         V2 Datasources that extend FileScan preclude exchange reuse
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34128'>SPARK-34128</a>] -          Suppress excessive logging of TTransportExceptions in Spark ThriftServer
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34225'>SPARK-34225</a>] -         Jars or file paths which contain spaces are generating FileNotFoundException exception
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34361'>SPARK-34361</a>] -         Dynamic allocation on K8s kills executors with running tasks
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34392'>SPARK-34392</a>] -         Invalid ID for offset-based ZoneId since Spark 3.0
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34417'>SPARK-34417</a>] -         org.apache.spark.sql.DataFrameNaFunctions.fillMap(values: Seq[(String, Any)]) fails for column name having a dot
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34436'>SPARK-34436</a>] -         DPP support LIKE ANY/ALL
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34473'>SPARK-34473</a>] -         avoid NPE in DataFrameReader.schema(StructType)
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34490'>SPARK-34490</a>] -         table maybe resolved as a view if the table is dropped
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34497'>SPARK-34497</a>] -         JDBC connection provider is not removing kerberos credentials from JVM security context
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34504'>SPARK-34504</a>] -         Avoid unnecessary view resolving and remove the `performCheck` flag
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34515'>SPARK-34515</a>] -         Fix NPE if InSet contains null value during getPartitionsByFilter
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34531'>SPARK-34531</a>] -         Remove Experimental API tag in PrometheusServlet
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34534'>SPARK-34534</a>] -         New protocol FetchShuffleBlocks in OneForOneBlockFetcher lead to data loss or correctness
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34545'>SPARK-34545</a>] -         PySpark Python UDF return inconsistent results when applying 2 UDFs with different return type to 2 columns together
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34547'>SPARK-34547</a>] -         Resolve using child metadata attributes as fallback
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34551'>SPARK-34551</a>] -         generate-contributors.py, releaseutils.py and translate-contributors.py are broken
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34555'>SPARK-34555</a>] -         Resolve metadata output from DataFrame
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34556'>SPARK-34556</a>] -         Checking duplicate static partition columns doesn&#39;t respect case sensitive conf
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34567'>SPARK-34567</a>] -         CreateTableAsSelect should have metrics update too
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34584'>SPARK-34584</a>] -         When insert into a partition table with a illegal partition value, DSV2 behavior different as others
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34596'>SPARK-34596</a>] -         NewInstance.doGenCode should not throw malformed class name error
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34599'>SPARK-34599</a>] -         INSERT INTO OVERWRITE doesn&#39;t support partition columns containing dot for DSv2
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34607'>SPARK-34607</a>] -         NewInstance.resolved should not throw malformed class name error
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34613'>SPARK-34613</a>] -         Fix view does not capture disable hint config
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34642'>SPARK-34642</a>] -         TypeError in Pyspark Linear Regression docs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34643'>SPARK-34643</a>] -         Use CRAN URL in canonical form
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34660'>SPARK-34660</a>] -         Don&#39;t use ParVector with `withExistingConf` which is not thread-safe
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34674'>SPARK-34674</a>] -         Spark app on k8s doesn&#39;t terminate without call to sparkContext.stop() method
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34676'>SPARK-34676</a>] -         TableCapabilityCheckSuite should not inherit all tests from AnalysisSuite
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34681'>SPARK-34681</a>] -         Full outer shuffled hash join when building left side produces wrong result
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34696'>SPARK-34696</a>] -         Fix CodegenInterpretedPlanTest to generate correct test cases
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34697'>SPARK-34697</a>] -         Allow DESCRIBE FUNCTION and SHOW FUNCTIONS explain about ||  (string concatenation operator).
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34713'>SPARK-34713</a>] -         group by CreateStruct with ExtractValue fails analysis
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34714'>SPARK-34714</a>] -         collect_list(struct()) fails when used with GROUP BY
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34719'>SPARK-34719</a>] -         fail if the view query has duplicated column names
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34723'>SPARK-34723</a>] -         Correct parameter type for subexpression elimination under whole-stage
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34724'>SPARK-34724</a>] -         Fix Interpreted evaluation by using getClass.getMethod instead of getDeclaredMethod
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34727'>SPARK-34727</a>] -         Difference in results of casting float to timestamp
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34731'>SPARK-34731</a>] -         ConcurrentModificationException in EventLoggingListener when redacting properties
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34737'>SPARK-34737</a>] -         Discrepancy between TIMESTAMP_SECONDS and cast from float
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34743'>SPARK-34743</a>] -         ExpressionEncoderSuite should use deepEquals when we expect `array of array`
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34747'>SPARK-34747</a>] -         Add virtual operators to the built-in function document.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34756'>SPARK-34756</a>] -         Fix FileScan equality check
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34760'>SPARK-34760</a>] -         run JavaSQLDataSourceExample failed with Exception in runBasicDataSourceExample().
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34763'>SPARK-34763</a>] -         col(), $&quot;&lt;name&gt;&quot; and df(&quot;name&quot;) should handle quoted column names properly.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34768'>SPARK-34768</a>] -         Respect the default input buffer size in Univocity
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34770'>SPARK-34770</a>] -         InMemoryCatalog.tableExists should not fail if database doesn&#39;t exist
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34772'>SPARK-34772</a>] -         RebaseDateTime loadRebaseRecords should use Spark classloader instead of context
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34774'>SPARK-34774</a>] -         The `change-scala- version.sh` script not replaced scala.version property correctly
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34776'>SPARK-34776</a>] -         Catalyst error on on certain struct operation (Couldn&#39;t find _gen_alias_)
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34794'>SPARK-34794</a>] -         Nested higher-order functions broken in DSL
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34796'>SPARK-34796</a>] -         Codegen compilation error for query with LIMIT operator and without AQE
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34798'>SPARK-34798</a>] -         Fix incorrect join condition
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34803'>SPARK-34803</a>] -         Util methods requiring certain versions of Pandas &amp; PyArrow don&#39;t pass through the raised ImportError
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34811'>SPARK-34811</a>] -         Redact fs.s3a.access.key like secret and token
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34814'>SPARK-34814</a>] -         LikeSimplification should handle NULL
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34820'>SPARK-34820</a>] -         K8s Integration test failed (due to libldap installation failed)
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34829'>SPARK-34829</a>] -         transform_values return identical values when it&#39;s used with udf that returns reference type
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34832'>SPARK-34832</a>] -         ExternalAppendOnlyUnsafeRowArrayBenchmark can&#39;t run with spark-submit
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34833'>SPARK-34833</a>] -         Apply right-padding correctly for correlated subqueries
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34834'>SPARK-34834</a>] -         There is a potential Netty memory leak in TransportResponseHandler.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34842'>SPARK-34842</a>] -         Corrects the type of date_dim.d_quarter_name in the TPCDS schema
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34845'>SPARK-34845</a>] -         ProcfsMetricsGetter.computeAllMetrics may return partial metrics when some of child pids metrics are missing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34874'>SPARK-34874</a>] -         Recover test reports for failed GA builds
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34876'>SPARK-34876</a>] -         Non-nullable aggregates can return NULL in a correlated subquery
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34897'>SPARK-34897</a>] -         Support reconcile schemas based on index after nested column pruning
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34900'>SPARK-34900</a>] -         Some `spark-submit`  commands used to run benchmarks in the user&#39;s guide is wrong
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34909'>SPARK-34909</a>] -         conv() does not convert negative inputs to unsigned correctly
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34926'>SPARK-34926</a>] -         PartitionUtils.getPathFragment should handle null value
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34933'>SPARK-34933</a>] -         Remove the description that || and &amp;&amp; can be used as logical operators from the document.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34939'>SPARK-34939</a>] -         Throw fetch failure exception when unable to deserialize broadcasted map statuses
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34948'>SPARK-34948</a>] -         Add ownerReference to executor configmap to fix leakages
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34949'>SPARK-34949</a>] -         Executor.reportHeartBeat reregisters blockManager even when Executor is shutting down
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34963'>SPARK-34963</a>] -         Nested column pruning fails to extract case-insensitive struct field from array
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34965'>SPARK-34965</a>] -         Remove .sbtopts that duplicately sets the default memory
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34988'>SPARK-34988</a>] -         Upgrade Jetty for CVE-2021-28165
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35004'>SPARK-35004</a>] -         Fix Incorrect assertion of &quot;master/worker web ui available behind front-end reverseProxy&quot; in MasterSuite
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35014'>SPARK-35014</a>] -         A foldable expression could not be replaced by an AttributeReference
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35079'>SPARK-35079</a>] -         Transform with udf gives incorrect result
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35080'>SPARK-35080</a>] -         Correlated subqueries with equality predicates can return wrong results
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35096'>SPARK-35096</a>] -         foreachBatch throws ArrayIndexOutOfBoundsException if schema is case Insensitive
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35106'>SPARK-35106</a>] -         HadoopMapReduceCommitProtocol performs bad rename when dynamic partition overwrite is used
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35117'>SPARK-35117</a>] -         UI progress bar no longer highlights in progress tasks
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35136'>SPARK-35136</a>] -         Initial null value of LiveStage.info can lead to NPE
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35142'>SPARK-35142</a>] -         `OneVsRest` classifier uses incorrect data type for `rawPrediction` column
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35178'>SPARK-35178</a>] -         maven autodownload failing
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35210'>SPARK-35210</a>] -         Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35213'>SPARK-35213</a>] -         Corrupt DataFrame for certain withField patterns
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35226'>SPARK-35226</a>] -         JDBC datasources should accept refreshKrb5Config parameter
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35244'>SPARK-35244</a>] -         invoke should throw the original exception
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35278'>SPARK-35278</a>] -         Invoke should find the method with correct number of parameters
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35288'>SPARK-35288</a>] -         StaticInvoke should find the method without exact argument classes match
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35359'>SPARK-35359</a>] -         Insert data with char/varchar datatype will fail when data length exceed length limitation
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35375'>SPARK-35375</a>] -         Use Jinja2 &lt; 3.0.0 for Python linter dependency in GA
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35381'>SPARK-35381</a>] -         Fix lambda variable name issues in nested DataFrame functions in R APIs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35382'>SPARK-35382</a>] -         Fix lambda variable name issues in nested DataFrame functions in Python APIs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35393'>SPARK-35393</a>] -         PIP packaging test is skipped in GitHub Actions build
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35425'>SPARK-35425</a>] -         Pin jinja2 in spark-rm/Dockerfile and add as a required dependency in the release README.md
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35458'>SPARK-35458</a>] -         ARM CI failed: failed to validate maven sha512
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35463'>SPARK-35463</a>] -         Skip checking checksum on a system that doesn&#39;t have `shasum`
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35482'>SPARK-35482</a>] -         Case-sensitive block manager port key should be used in BasicExecutorFeatureStep
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35493'>SPARK-35493</a>] -         spark.blockManager.port does not work for driver pod
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-36765'>SPARK-36765</a>] -         Spark support for MS SQL JDBC connector with Kerberos/Keytab
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-38208'>SPARK-38208</a>] -         &#39;Column&#39; object is not callable
</li>
</ul>
                
<h2>        Improvement
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34482'>SPARK-34482</a>] -         Correct the active SparkSession for streaming query
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34550'>SPARK-34550</a>] -         Skip InSet null value during push filter to Hive metastore
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34639'>SPARK-34639</a>] -         always remove unnecessary Alias in Analyzer.resolveExpression
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34683'>SPARK-34683</a>] -         Update the documents to explain the usage of LIST FILE and LIST JAR in case they take multiple file names
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34749'>SPARK-34749</a>] -         Simplify CreateNamedStruct
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34752'>SPARK-34752</a>] -         Upgrade Jetty to 9.4.37 to fix CVE-2020-27223
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34762'>SPARK-34762</a>] -         Scala 2.13 build action failed for many PRs
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34766'>SPARK-34766</a>] -         Do not capture maven config for views
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34915'>SPARK-34915</a>] -         Cache Maven, SBT and Scala in all jobs that use them
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34922'>SPARK-34922</a>] -         Use better CBO cost function
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34923'>SPARK-34923</a>] -         Metadata output should not always be propagated
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34940'>SPARK-34940</a>] -         Fix minor unit test in BasicWriteTaskStatsTrackerSuite
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35002'>SPARK-35002</a>] -         Fix the java.net.BindException when testing with GitHub Actions
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35045'>SPARK-35045</a>] -         Add an internal option to control input buffer in univocity
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35087'>SPARK-35087</a>] -         Some columns in the `Aggregated Metrics by Executor` table of the stage-detail page are shown incorrectly
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35127'>SPARK-35127</a>] -         When switching between different stage-detail pages, the entry item in the newly opened page may be blank
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35171'>SPARK-35171</a>] -         Declare the markdown package as a dependency of the SparkR package
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35227'>SPARK-35227</a>] -         Replace Bintray with the new repository service for the spark-packages resolver in SparkSubmit
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35358'>SPARK-35358</a>] -         Set maximum Java heap used for release build
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35373'>SPARK-35373</a>] -         Verify checksums of downloaded artifacts in build/mvn
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35411'>SPARK-35411</a>] -         Essential information missing in TreeNode json string
</li>
</ul>
    
<h2>        Test
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-24931'>SPARK-24931</a>] -         Recover lint-r job in GitHub Actions workflow
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34604'>SPARK-34604</a>] -         Flaky test: TaskContextTestsWithWorkerReuse.test_task_context_correct_with_python_worker_reuse
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34610'>SPARK-34610</a>] -         Fix Python UDF used in GroupedAggPandasUDFTests.
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34795'>SPARK-34795</a>] -         Add a new job in GitHub Actions to check the output of TPC-DS queries
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34813'>SPARK-34813</a>] -         Remove Scala 2.13 build GitHub Action job from branch-3.1
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34951'>SPARK-34951</a>] -         Recover Python linter (Sphinx build) in GitHub Actions
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35192'>SPARK-35192</a>] -         Port minimal TPC-DS datagen code from databricks/spark-sql-perf
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35293'>SPARK-35293</a>] -         Use the newer dsdgen for TPCDSQueryTestSuite
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35327'>SPARK-35327</a>] -         Filter out the TPC-DS queries that can cause flaky test results
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35413'>SPARK-35413</a>] -         Use the SHA of the latest commit when checking out databricks/tpcds-kit
</li>
</ul>
        
<h2>        Task
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-34970'>SPARK-34970</a>] -         Redact map-type options in the output of explain()
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35495'>SPARK-35495</a>] -         Change SparkR maintainer for CRAN
</li>
</ul>
                                                                                                                                        
<h2>        Documentation
</h2>
<ul>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35250'>SPARK-35250</a>] -         SQL DataFrameReader unescapedQuoteHandling parameter is misdocumented
</li>
<li>[<a href='https://issues.apache.org/jira/browse/SPARK-35405'>SPARK-35405</a>] -         Submitting Applications documentation has outdated information about K8s client mode support
</li>
</ul>