Release Notes - Spark - Version 3.3.0

Sub-task

  • [SPARK-6305] - Add support for log4j 2.x to Spark
  • [SPARK-27442] - ParquetFileFormat fails to read column named with invalid characters
  • [SPARK-27974] - Support ANSI Aggregate Function: array_agg
  • [SPARK-28137] - Data Type Formatting Functions: `to_number`
  • [SPARK-32567] - Code-gen for full outer shuffled hash join
  • [SPARK-32709] - Write Hive ORC/Parquet bucketed table with hivehash (for Hive 1 and 2)
  • [SPARK-32712] - Support writing Hive non-ORC/Parquet bucketed table
  • [SPARK-33701] - Adaptive shuffle merge finalization for push-based shuffle
  • [SPARK-33832] - Add an option in AQE to mitigate skew even if it causes a new shuffle
  • [SPARK-34112] - Upgrade ORC to 1.7.0
  • [SPARK-34183] - DataSource V2: Support required distribution and ordering in SS
  • [SPARK-34332] - Unify v1 and v2 ALTER NAMESPACE .. SET LOCATION tests
  • [SPARK-34544] - pyspark toPandas() should return pd.DataFrame
  • [SPARK-34826] - Adaptive fetch of shuffle mergers for Push based shuffle
  • [SPARK-34863] - Support nested column in Spark Parquet vectorized readers
  • [SPARK-34960] - Aggregate (Min/Max/Count) push down for ORC
  • [SPARK-34980] - Support coalesce partition through union
  • [SPARK-35352] - Add code-gen for full outer sort merge join
  • [SPARK-35437] - Use expressions to filter Hive partitions at client side
  • [SPARK-35496] - Upgrade Scala 2.13 to 2.13.7
  • [SPARK-35663] - Add Timestamp without time zone type
  • [SPARK-35664] - Support java.time.LocalDateTime as an external type of TimestampWithoutTZ type
  • [SPARK-35674] - Test timestamp without time zone in UDF
  • [SPARK-35697] - Test TimestampWithoutTZType as ordered and atomic type
  • [SPARK-35698] - Support casting of timestamp without time zone to strings
  • [SPARK-35711] - Support casting of timestamp without time zone to timestamp type
  • [SPARK-35716] - Support casting of timestamp without time zone to date type
  • [SPARK-35718] - Support casting of Date to timestamp without time zone type
  • [SPARK-35719] - Support type conversion between timestamp and timestamp without time zone type
  • [SPARK-35720] - Support casting of String to timestamp without time zone type
  • [SPARK-35764] - Assign pretty names to TimestampWithoutTZType
  • [SPARK-35785] - Cleanup support for RocksDB instance
  • [SPARK-35839] - New SQL function: to_timestamp_ntz
  • [SPARK-35854] - Improve the error message of to_timestamp_ntz with invalid format pattern
  • [SPARK-35867] - Enable vectorized read for VectorizedPlainValuesReader.readBooleans
  • [SPARK-35889] - Support adding TimestampWithoutTZ with Interval types
  • [SPARK-35895] - Support subtracting Intervals from TimestampWithoutTZ
  • [SPARK-35916] - Support subtraction among Date/Timestamp/TimestampWithoutTZ
  • [SPARK-35925] - Support DayTimeIntervalType in width-bucket function
  • [SPARK-35926] - Support YearMonthIntervalType in width-bucket function
  • [SPARK-35927] - Remove type collection AllTimestampTypes
  • [SPARK-35932] - Support extracting hour/minute/second from timestamp without time zone
  • [SPARK-35953] - Support extracting date fields from timestamp without time zone
  • [SPARK-35963] - Rename TimestampWithoutTZType to TimestampNTZType
  • [SPARK-35968] - Make sure partitions are not too small in AQE partition coalescing
  • [SPARK-35971] - Rename the type name of TimestampNTZType as "timestamp_ntz"
  • [SPARK-35975] - New configuration spark.sql.timestampType for the default timestamp type
  • [SPARK-35977] - Support non-reserved keyword TIMESTAMP_NTZ
  • [SPARK-35978] - Support non-reserved keyword TIMESTAMP_LTZ
  • [SPARK-35979] - Return different timestamp literals based on the default timestamp type
  • [SPARK-35987] - The ANSI flags of Sum and Avg should be kept after being copied
  • [SPARK-36015] - Support TimestampNTZType in the Window spec definition
  • [SPARK-36016] - Support TimestampNTZType in expression ApproxCountDistinctForIntervals
  • [SPARK-36017] - Support TimestampNTZType in expression ApproximatePercentile
  • [SPARK-36037] - Support ANSI SQL LOCALTIMESTAMP datetime value function
  • [SPARK-36043] - Add end-to-end tests with default timestamp type as TIMESTAMP_NTZ
  • [SPARK-36044] - Support TimestampNTZ in functions unix_timestamp/to_unix_timestamp
  • [SPARK-36046] - Support new functions make_timestamp_ntz and make_timestamp_ltz
  • [SPARK-36050] - Spark doesn’t support reading/writing TIMESTAMP_NTZ with ORC
  • [SPARK-36054] - Support group by TimestampNTZ column
  • [SPARK-36055] - Assign pretty SQL string to TimestampNTZ literals
  • [SPARK-36058] - Support replicasets/job API
  • [SPARK-36059] - Add the ability to specify a scheduler
  • [SPARK-36061] - Add `volcano` module and feature step
  • [SPARK-36072] - TO_TIMESTAMP: return different results based on the default timestamp type
  • [SPARK-36075] - Support for specifying executor/driver node selector
  • [SPARK-36083] - make_timestamp: return different result based on the default timestamp type
  • [SPARK-36090] - Support TimestampNTZType in expression Sequence
  • [SPARK-36091] - Support TimestampNTZ type in expression TimeWindow
  • [SPARK-36095] - Group exception messages in core/rdd
  • [SPARK-36097] - Group exception messages in core/scheduler
  • [SPARK-36098] - Group exception messages in core/storage
  • [SPARK-36101] - Group exception messages in core/api
  • [SPARK-36107] - Refactor first set of 20 query execution errors to use error classes
  • [SPARK-36110] - Upgrade SBT to 1.5.5
  • [SPARK-36119] - Add new SQL function to_timestamp_ltz
  • [SPARK-36120] - Support TimestampNTZ type in cache table
  • [SPARK-36135] - Support TimestampNTZ type in file partitioning
  • [SPARK-36139] - Remove Python 3.6 from `pyspark` GitHub Action job
  • [SPARK-36144] - Use Python 3.9 in `run-pip-tests` conda environment
  • [SPARK-36146] - Upgrade Python version from 3.6 to higher version in GitHub linter
  • [SPARK-36152] - Add Scala 2.13 daily build and test GitHub Action job
  • [SPARK-36175] - Support TimestampNTZ in Avro data source
  • [SPARK-36179] - Support TimestampNTZType in SparkGetColumnsOperation
  • [SPARK-36182] - Support TimestampNTZ type in Parquet file source
  • [SPARK-36208] - SparkScriptTransformation should support ANSI interval types
  • [SPARK-36227] - Remove TimestampNTZ type support in Spark 3.2
  • [SPARK-36230] - hasnans for Series of Decimal(`NaN`)
  • [SPARK-36231] - Support arithmetic operations of Series containing Decimal(np.nan)
  • [SPARK-36232] - Support creating a ps.Series/Index with `Decimal('NaN')` with Arrow disabled
  • [SPARK-36255] - FileNotFoundException from the shuffle push can cause the executor to terminate
  • [SPARK-36256] - Upgrade lz4-java to 1.8.0
  • [SPARK-36257] - Update the version of TimestampNTZ-related changes to 3.3.0
  • [SPARK-36332] - Cleanup RemoteBlockPushResolver log messages
  • [SPARK-36336] - Define new exceptions that mix in SparkThrowable for all base exceptions in QueryExecutionErrors
  • [SPARK-36337] - decimal('NaN') is unsupported in net.razorvine.pickle
  • [SPARK-36346] - Support TimestampNTZ type in Orc file source
  • [SPARK-36357] - Support pushdown Timestamp with local time zone for orc
  • [SPARK-36368] - Fix CategoricalOps.astype to follow pandas 1.3
  • [SPARK-36378] - Minor changes to address a few identified server side inefficiencies
  • [SPARK-36396] - Implement DataFrame.cov
  • [SPARK-36399] - Implement DataFrame.combine_first
  • [SPARK-36401] - Implement Series.cov
  • [SPARK-36409] - Splitting test cases from datetime.sql
  • [SPARK-36424] - Support eliminate limits in AQE Optimizer
  • [SPARK-36435] - Implement MultIndex.equal_levels
  • [SPARK-36438] - Support list-like Python objects for Series comparison
  • [SPARK-36490] - Make from_csv/to_csv to handle timestamp_ntz type properly
  • [SPARK-36491] - Make from_json/to_json to handle timestamp_ntz type properly
  • [SPARK-36506] - Improve test coverage for series.py and indexes/*.py.
  • [SPARK-36526] - Add supportsIndex interface
  • [SPARK-36540] - AM should not just finish with Success when disconnected
  • [SPARK-36556] - Add DSV2 filters
  • [SPARK-36587] - Migrate CreateNamespaceStatement to v2 command framework
  • [SPARK-36608] - Support TimestampNTZ in Arrow
  • [SPARK-36609] - Add `errors` argument for `ps.to_numeric`.
  • [SPARK-36615] - SparkContext should register shutdown hook earlier
  • [SPARK-36618] - Support dropping rows of a single-indexed DataFrame
  • [SPARK-36624] - When application killed, sc should not exit with code 0
  • [SPARK-36625] - Support TimestampNTZ in pandas API on Spark
  • [SPARK-36626] - Support TimestampNTZ in createDataFrame/toPandas and Python UDFs
  • [SPARK-36645] - Aggregate (Min/Max/Count) push down for Parquet
  • [SPARK-36646] - Push down group by partition column for Aggregate (Min/Max/Count) for Parquet
  • [SPARK-36647] - Push down filter by partition column for Aggregate (Min/Max/Count) for Parquet
  • [SPARK-36650] - ApplicationMaster shutdown hook should catch timeout exception
  • [SPARK-36652] - AQE dynamic join selection should not apply to non-equi join
  • [SPARK-36653] - Implement Series.__xor__
  • [SPARK-36655] - Add `versionadded` for API added in Spark 3.3.0
  • [SPARK-36656] - CollapseProject should not collapse correlated scalar subqueries
  • [SPARK-36661] - Support TimestampNTZ in Py4J
  • [SPARK-36675] - Support ScriptTransformation for timestamp_ntz
  • [SPARK-36678] - Migrate SHOW TABLES to use V2 command by default
  • [SPARK-36687] - Rename error classes with _ERROR suffix
  • [SPARK-36708] - Support numpy.typing for annotating ArrayType
  • [SPARK-36709] - Support new syntax for specifying index type and name
  • [SPARK-36710] - Support new syntax in function apply APIs
  • [SPARK-36711] - Support multi-index in new syntax
  • [SPARK-36713] - Document new syntax for specifying index type
  • [SPARK-36724] - Support timestamp_ntz as a type of time column for SessionWindow
  • [SPARK-36742] - Fix ps.to_datetime with plurals of keys like years, months, days
  • [SPARK-36746] - Refactor _select_rows_by_iterable in iLocIndexer to use Column.isin
  • [SPARK-36748] - Introduce the 'compute.isin_limit' option
  • [SPARK-36754] - array_intersect should handle Double.NaN and Float.NaN
  • [SPARK-36760] - Add interface SupportsPushDownV2Filters
  • [SPARK-36769] - Improve `filter` of single-indexed DataFrame
  • [SPARK-36771] - Fix `pop` of Categorical Series
  • [SPARK-36778] - Support ILIKE API on Scala(dataframe)
  • [SPARK-36779] - Error when list of data type tuples has len = 1
  • [SPARK-36785] - Fix ps.DataFrame.isin
  • [SPARK-36794] - Ignore duplicated join keys when building relation for SEMI/ANTI shuffle hash join
  • [SPARK-36796] - Make sql/core and dependent modules all UTs pass on Java 17
  • [SPARK-36813] - Implement ps.merge_asof
  • [SPARK-36818] - Fix filtering a Series by a boolean Series
  • [SPARK-36825] - Read/write dataframes with ANSI intervals from/to parquet files
  • [SPARK-36830] - Read/write dataframes with ANSI intervals from/to JSON files
  • [SPARK-36831] - Read/write dataframes with ANSI intervals from/to CSV files
  • [SPARK-36846] - Inline most of type hint files under pyspark/sql/pandas folder
  • [SPARK-36848] - Migrate ShowCurrentNamespaceStatement to v2 command framework
  • [SPARK-36849] - Migrate UseStatement to v2 command framework
  • [SPARK-36850] - Migrate CreateTableStatement to v2 command framework
  • [SPARK-36852] - Test ANSI interval support by the Parquet datasource
  • [SPARK-36854] - Parquet reader fails on load of ANSI interval when off-heap is enabled
  • [SPARK-36866] - Pushdown filters with ANSI interval values to parquet
  • [SPARK-36868] - Migrate CreateFunctionStatement to v2 command framework
  • [SPARK-36871] - Migrate CreateViewStatement to v2 command
  • [SPARK-36879] - Support Parquet v2 data page encodings for the vectorized path
  • [SPARK-36880] - Inline type hints for python/pyspark/sql/functions.py
  • [SPARK-36881] - Inline type hints for python/pyspark/sql/catalog.py
  • [SPARK-36882] - Support ILIKE API on Python
  • [SPARK-36884] - Inline type hints for python/pyspark/sql/session.py
  • [SPARK-36885] - Inline type hints for python/pyspark/sql/dataframe.py
  • [SPARK-36886] - Inline type hints for python/pyspark/sql/context.py
  • [SPARK-36891] - Refactor SpecificParquetRecordReaderBase and add more coverage on vectorized Parquet decoding
  • [SPARK-36895] - Add Create Index syntax support
  • [SPARK-36897] - Replace collections.namedtuple() by typing.NamedTuple
  • [SPARK-36899] - Support ILIKE API on R
  • [SPARK-36900] - "SPARK-36464: size returns correct positive number even with over 2GB data" will oom with JDK17
  • [SPARK-36902] - Migrate CreateTableAsSelectStatement to v2 command
  • [SPARK-36906] - Inline type hints for conf.py and observation.py in python/pyspark/sql
  • [SPARK-36910] - Inline type hints for python/pyspark/sql/types.py
  • [SPARK-36913] - Implement createIndex and IndexExists in JDBC (MySQL dialect)
  • [SPARK-36914] - Implement dropIndex and listIndexes in JDBC (MySQL dialect)
  • [SPARK-36920] - Support ANSI intervals by ABS
  • [SPARK-36921] - The DIV function should support ANSI intervals
  • [SPARK-36922] - The SIGN/SIGNUM functions should support ANSI intervals
  • [SPARK-36924] - CAST between ANSI intervals and numerics
  • [SPARK-36927] - Inline type hints for python/pyspark/sql/window.py
  • [SPARK-36928] - Handle ANSI intervals in ColumnarRow, ColumnarBatchRow and ColumnarArray
  • [SPARK-36930] - Support ps.MultiIndex.dtypes
  • [SPARK-36931] - Read/write dataframes with ANSI intervals from/to ORC files
  • [SPARK-36935] - Enhance ParquetSchemaConverter to capture Parquet repetition & definition level
  • [SPARK-36938] - Inline type hints for group.py in python/pyspark/sql
  • [SPARK-36940] - Inline type hints for python/pyspark/sql/avro/functions.py
  • [SPARK-36941] - Check saving of a dataframe with ANSI intervals to a Hive parquet table
  • [SPARK-36942] - Inline type hints for python/pyspark/sql/readwriter.py
  • [SPARK-36944] - Remove unused python/pyspark/sql/__init__.pyi
  • [SPARK-36945] - Inline type hints for python/pyspark/sql/udf.py
  • [SPARK-36946] - Support time for ps.to_datetime
  • [SPARK-36948] - Check CREATE TABLE with ANSI intervals using Hive external catalog and Parquet
  • [SPARK-36949] - Fix CREATE TABLE AS SELECT of ANSI intervals
  • [SPARK-36951] - Inline type hints for python/pyspark/sql/column.py
  • [SPARK-36952] - Inline type hints for python/pyspark/resource/information.py and python/pyspark/resource/profile.py
  • [SPARK-36960] - Pushdown filters with ANSI interval values to ORC
  • [SPARK-36968] - ps.Series.dot raise "matrices are not aligned" if index is not same
  • [SPARK-36969] - Inline type hints for SparkContext
  • [SPARK-36970] - Manually disable format `B` for the `date_format` function for compatibility with Java 8 behavior
  • [SPARK-36977] - Update docs to reflect that Python 3.6 is no longer supported
  • [SPARK-36982] - Migrate SHOW NAMESPACES to use V2 command by default
  • [SPARK-36991] - Inline type hints for spark/python/pyspark/sql/streaming.py
  • [SPARK-37000] - Add type hints to python/pyspark/sql/util.py
  • [SPARK-37008] - WholeStageCodegenSparkSubmitSuite Failed with Java 17
  • [SPARK-37013] - `select format_string('%0$s', 'Hello')` has different behavior on Java 8 and Java 17
  • [SPARK-37014] - Inline type hints for python/pyspark/streaming/context.py
  • [SPARK-37015] - Inline type hints for python/pyspark/streaming/dstream.py
  • [SPARK-37023] - Avoid fetching merge status when shuffleMergeEnabled is false for a shuffleDependency during retry
  • [SPARK-37031] - Unify v1 and v2 DESCRIBE NAMESPACE tests
  • [SPARK-37033] - Inline type hints for python/pyspark/resource/requests.py
  • [SPARK-37038] - Sample push down in DS v2
  • [SPARK-37042] - Inline type hints for kinesis.py and listener.py in python/pyspark/streaming
  • [SPARK-37048] - Clean up inlining type hints under SQL module
  • [SPARK-37056] - Fix unused code in test about history server & MetricsSystem
  • [SPARK-37066] - Improve ORC RecordReader's error message
  • [SPARK-37070] - Pass all UTs in `mllib-local` and `mllib` with Java 17
  • [SPARK-37072] - Pass all UTs in `repl` with Java 17
  • [SPARK-37073] - Pass all UTs in `external/avro` with Java 17
  • [SPARK-37083] - Inline type hints for python/pyspark/accumulators.py
  • [SPARK-37091] - Support Java 17 in SparkR SystemRequirements
  • [SPARK-37095] - Inline type hints for files in python/pyspark/broadcast.py
  • [SPARK-37105] - Pass all UTs in `sql/hive` with Java 17
  • [SPARK-37106] - Pass all UTs in `yarn` with Java 17
  • [SPARK-37107] - Inline type hints for files in python/pyspark/status.py
  • [SPARK-37120] - Add Daily GitHub Action jobs for Java11/17
  • [SPARK-37125] - Support AnsiInterval radix sort
  • [SPARK-37129] - Supplement all micro benchmark results using Java 17
  • [SPARK-37137] - Inline type hints for python/pyspark/conf.py
  • [SPARK-37138] - Support ANSI Interval in functions that support numeric type
  • [SPARK-37139] - Inline type hints for python/pyspark/taskcontext.py and python/pyspark/version.py
  • [SPARK-37140] - Inline type hints for python/pyspark/resultiterable.py
  • [SPARK-37144] - Inline type hints for python/pyspark/file.py
  • [SPARK-37145] - Add KubernetesCustom[Driver/Executor]FeatureConfigStep developer API
  • [SPARK-37146] - Inline type hints for python/pyspark/__init__.py
  • [SPARK-37149] - Improve error messages for arithmetic overflow under ANSI mode
  • [SPARK-37150] - Migrate DESCRIBE NAMESPACE to use V2 command by default
  • [SPARK-37152] - Inline type hints for python/pyspark/context.py
  • [SPARK-37153] - Inline type hints for python/pyspark/profiler.py
  • [SPARK-37154] - Inline type hints for python/pyspark/rdd.py
  • [SPARK-37155] - Inline type hints for python/pyspark/statcounter.py
  • [SPARK-37156] - Inline type hints for python/pyspark/storagelevel.py
  • [SPARK-37157] - Inline type hints for python/pyspark/util.py
  • [SPARK-37159] - Change HiveExternalCatalogVersionsSuite to be able to test with Java 17
  • [SPARK-37161] - RowToColumnConverter support AnsiIntervalType
  • [SPARK-37166] - SPIP: Storage Partitioned Join
  • [SPARK-37168] - Improve error messages for SQL functions and operators under ANSI mode
  • [SPARK-37179] - ANSI mode: Add a config to allow casting between Datetime and Numeric
  • [SPARK-37181] - pyspark.pandas.read_csv() should support latin-1 encoding
  • [SPARK-37188] - pyspark.pandas histogram accepts the title option but does not add a title to the plot
  • [SPARK-37190] - Improve error messages for casting under ANSI mode
  • [SPARK-37192] - Migrate SHOW TBLPROPERTIES to use V2 command by default
  • [SPARK-37195] - Unify v1 and v2 SHOW TBLPROPERTIES tests
  • [SPARK-37200] - Drop index support
  • [SPARK-37212] - Improve the implementation of aggregate pushdown
  • [SPARK-37220] - Do not split input file for Parquet reader with aggregate push down
  • [SPARK-37225] - Read/write dataframes with ANSI intervals from/to Avro files
  • [SPARK-37228] - Implement DataFrame.mapInArrow in Python (see the sketch after this list)
  • [SPARK-37230] - Document DataFrame.mapInArrow
  • [SPARK-37231] - Dynamic writes/reads of ANSI interval partitions
  • [SPARK-37232] - Upgrade ORC to 1.7.1
  • [SPARK-37234] - Inline type hints for python/pyspark/mllib/stat/_statistics.py
  • [SPARK-37235] - Inline type hints for distribution.py and __init__.py in python/pyspark/mllib/stat
  • [SPARK-37236] - Inline type hints for KernelDensity.pyi, test.py in python/pyspark/mllib/stat/
  • [SPARK-37240] - Cannot read partitioned parquet files with ANSI interval partition values
  • [SPARK-37258] - Upgrade kubernetes-client to 5.12.0
  • [SPARK-37261] - Check adding partitions with ANSI intervals
  • [SPARK-37262] - Do not log empty aggregate and group-by in JDBCScan
  • [SPARK-37264] - Exclude hadoop-client-api transitive dependency from orc-core
  • [SPARK-37267] - OptimizeSkewInRebalancePartitions support optimize non-root node
  • [SPARK-37272] - Add `ExtendedRocksDBTest` and disable RocksDB tests on Apple Silicon
  • [SPARK-37277] - Support DayTimeIntervalType in Arrow
  • [SPARK-37279] - Support DayTimeIntervalType in createDataFrame/toPandas and Python UDFs
  • [SPARK-37281] - Support DayTimeIntervalType in Py4J
  • [SPARK-37282] - Add ExtendedLevelDBTest and disable LevelDB tests on Apple Silicon
  • [SPARK-37286] - Move compileAggregates from JDBCRDD to JdbcDialect
  • [SPARK-37291] - PySpark init SparkSession should copy conf to sharedState
  • [SPARK-37293] - Remove explicit GC options from Scala tests
  • [SPARK-37294] - Check inserting of ANSI intervals into a table partitioned by the interval columns
  • [SPARK-37296] - Add missing type hints in python/pyspark/util.py
  • [SPARK-37304] - Check replacing columns with ANSI intervals
  • [SPARK-37310] - Migrate ALTER NAMESPACE ... SET PROPERTIES to use v2 command by default
  • [SPARK-37311] - Migrate ALTER NAMESPACE ... SET LOCATION to use v2 command by default
  • [SPARK-37312] - Add `.java-version` to `.gitignore` and `.rat-excludes`
  • [SPARK-37316] - Add code-gen for existence sort merge join
  • [SPARK-37317] - Reduce weights in GaussianMixtureSuite
  • [SPARK-37319] - Support K8s image building with Java 17
  • [SPARK-37326] - Support TimestampNTZ in CSV data source
  • [SPARK-37328] - SPARK-33832 introduces a bug where OptimizeSkewedJoin may not work since it is applied to the whole plan instead of the new stage plan
  • [SPARK-37330] - Migrate ReplaceTableStatement to v2 command
  • [SPARK-37331] - Add the ability to create resources before driver pod
  • [SPARK-37332] - Check adding of ANSI interval columns to v1/v2 tables
  • [SPARK-37343] - Implement createIndex and IndexExists in JDBC (Postgres dialect)
  • [SPARK-37345] - Add java.security.jgss/sun.security.krb5 to DEFAULT_MODULE_OPTIONS
  • [SPARK-37354] - Make the Java version installed on the container image used by the K8s integration tests with SBT configurable
  • [SPARK-37357] - Add small partition factor for rebalance partitions
  • [SPARK-37360] - Support TimestampNTZ in JSON data source
  • [SPARK-37376] - SPJ: Introduce a new DataSource V2 interface HasPartitionKey
  • [SPARK-37377] - SPJ: Initial implementation of Storage-Partitioned Join
  • [SPARK-37379] - Add tree pattern pruning to CTESubstitution rule
  • [SPARK-37381] - Unify v1 and v2 SHOW CREATE TABLE tests
  • [SPARK-37385] - Add tests for TimestampNTZ and TimestampLTZ for Parquet data source
  • [SPARK-37389] - Check unclosed bracketed comments
  • [SPARK-37397] - Inline type hints for python/pyspark/ml/base.py
  • [SPARK-37398] - Inline type hints for python/pyspark/ml/classification.py
  • [SPARK-37399] - Inline type hints for python/pyspark/ml/common.py
  • [SPARK-37400] - Inline type hints for python/pyspark/mllib/classification.py
  • [SPARK-37401] - Inline type hints for python/pyspark/ml/clustering.py
  • [SPARK-37402] - Inline type hints for python/pyspark/mllib/clustering.py
  • [SPARK-37403] - Inline type hints for python/pyspark/mllib/common.py
  • [SPARK-37404] - Inline type hints for python/pyspark/ml/evaluation.py
  • [SPARK-37405] - Inline type hints for python/pyspark/ml/feature.py
  • [SPARK-37406] - Inline type hints for python/pyspark/ml/fpm.py
  • [SPARK-37407] - Inline type hints for python/pyspark/ml/functions.py
  • [SPARK-37408] - Inline type hints for python/pyspark/ml/image.py
  • [SPARK-37409] - Inline type hints for python/pyspark/ml/pipeline.py
  • [SPARK-37410] - Inline type hints for python/pyspark/ml/recommendation.py
  • [SPARK-37411] - Inline type hints for python/pyspark/ml/regression.py
  • [SPARK-37412] - Inline type hints for python/pyspark/ml/stat.py
  • [SPARK-37413] - Inline type hints for python/pyspark/ml/tree.py
  • [SPARK-37414] - Inline type hints for python/pyspark/ml/tuning.py
  • [SPARK-37415] - Inline type hints for python/pyspark/ml/util.py
  • [SPARK-37416] - Inline type hints for python/pyspark/ml/wrapper.py
  • [SPARK-37417] - Inline type hints for python/pyspark/ml/linalg/__init__.py
  • [SPARK-37418] - Inline type hints for python/pyspark/ml/param/__init__.py
  • [SPARK-37419] - Inline type hints for python/pyspark/ml/param/shared.py
  • [SPARK-37421] - Inline type hints for python/pyspark/mllib/evaluation.py
  • [SPARK-37422] - Inline type hints for python/pyspark/mllib/feature.py
  • [SPARK-37423] - Inline type hints for python/pyspark/mllib/fpm.py
  • [SPARK-37424] - Inline type hints for python/pyspark/mllib/random.py
  • [SPARK-37426] - Inline type hints for python/pyspark/mllib/regression.py
  • [SPARK-37427] - Inline type hints for python/pyspark/mllib/tree.py
  • [SPARK-37428] - Inline type hints for python/pyspark/mllib/util.py
  • [SPARK-37429] - Inline type hints for python/pyspark/mllib/linalg/__init__.py
  • [SPARK-37430] - Inline type hints for python/pyspark/mllib/linalg/distributed.py
  • [SPARK-37438] - ANSI mode: Use store assignment rules for resolving function invocation
  • [SPARK-37442] - In AQE, wrong InMemoryRelation size estimation causes "Cannot broadcast the table that is larger than 8GB: 8 GB" failure
  • [SPARK-37444] - ALTER NAMESPACE ... SET LOCATION should handle empty location consistently across v1 and v2 command
  • [SPARK-37455] - Replace hash with sort aggregate if child is already sorted
  • [SPARK-37456] - CREATE NAMESPACE should qualify location for v2 command
  • [SPARK-37463] - Read/write TIMESTAMP_NTZ from/to ORC using int64
  • [SPARK-37478] - Unify v1 and v2 DROP NAMESPACE tests
  • [SPARK-37479] - Migrate DROP NAMESPACE to use V2 command by default
  • [SPARK-37482] - Skip check monotonic increasing for Series.asof with 'compute.eager_check'
  • [SPARK-37483] - Support push down top N to JDBC data source V2
  • [SPARK-37489] - Skip hasnans check in numops if eager_check is disabled
  • [SPARK-37490] - Show hint if analyzer fails due to ANSI type coercion
  • [SPARK-37494] - Unify v1 and v2 options output of `SHOW CREATE TABLE` command
  • [SPARK-37495] - Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled
  • [SPARK-37501] - CREATE/REPLACE TABLE should qualify location for v2 command
  • [SPARK-37504] - pyspark should not pass all options to session states.
  • [SPARK-37509] - Improve Fallback Storage upload speed by avoiding S3 rate limiter
  • [SPARK-37510] - Support TimedeltaIndex in pandas API on Spark
  • [SPARK-37511] - Introduce TimedeltaIndex to pandas API on Spark
  • [SPARK-37512] - Support TimedeltaIndex creation (from Series/Index) and TimedeltaIndex.astype
  • [SPARK-37522] - Fix MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction
  • [SPARK-37526] - Add Java17 PySpark daily test coverage
  • [SPARK-37527] - Translate more standard aggregate functions for pushdown
  • [SPARK-37529] - Support K8s integration tests for Java 17
  • [SPARK-37533] - New SQL function: try_element_at (see the sketch after this list)
  • [SPARK-37543] - Document Java 17 support
  • [SPARK-37548] - Add Java17 SparkR daily test coverage
  • [SPARK-37557] - Replace object hash with sort aggregate if child is already sorted
  • [SPARK-37563] - Implement days, seconds, microseconds properties of TimedeltaIndex
  • [SPARK-37564] - Support sort aggregate code-gen without grouping keys
  • [SPARK-37576] - Support built-in K8s executor roll plugin
  • [SPARK-37590] - Unify v1 and v2 ALTER NAMESPACE ... SET PROPERTIES tests
  • [SPARK-37613] - Support ANSI Aggregate Function: regr_count
  • [SPARK-37614] - Support ANSI Aggregate Function: regr_avgx & regr_avgy
  • [SPARK-37619] - Upgrade Maven to 3.8.4
  • [SPARK-37620] - Use more precise types for SparkContext Optional fields (i.e. _gateway, _jvm)
  • [SPARK-37622] - Support K8s executor rolling policy
  • [SPARK-37632] - Drop code targetting Python < 3.7
  • [SPARK-37636] - Migrate CREATE NAMESPACE to use v2 command by default
  • [SPARK-37638] - Use existing active Spark session instead of SparkSession.getOrCreate in pandas API on Spark
  • [SPARK-37641] - Support ANSI Aggregate Function: regr_r2
  • [SPARK-37644] - Support datasource v2 complete aggregate pushdown
  • [SPARK-37645] - Spelling error: "labeled" spelled as "labled"
  • [SPARK-37651] - Use existing active Spark session in all places of pandas API on Spark
  • [SPARK-37652] - Support optimize skewed join through union
  • [SPARK-37653] - Upgrade RoaringBitmap to 0.9.23
  • [SPARK-37655] - Add RocksDB Implementation for KVStore
  • [SPARK-37664] - Add InMemoryColumnarBenchmark and StateStoreBasicOperationsBenchmark Java 11/17 result
  • [SPARK-37669] - Remove unnecessary usages of OrderedDict
  • [SPARK-37673] - Implement `ps.timedelta_range` method
  • [SPARK-37675] - Prevent overwriting of push shuffle merged files once the shuffle is finalized
  • [SPARK-37679] - Add a new executor roll policy, FAILED_TASKS
  • [SPARK-37680] - Support RocksDB backend in Spark History Server
  • [SPARK-37684] - Upgrade log4j to 2.17
  • [SPARK-37685] - Make log event immutable for LogAppender
  • [SPARK-37695] - Skip diagnosis of merged blocks from push-based shuffle
  • [SPARK-37699] - Fix failed K8S integration test in SparkConfPropagateSuite
  • [SPARK-37707] - Allow store assignment between TimestampNTZ and Date/Timestamp
  • [SPARK-37709] - Add AVERAGE_DURATION executor roll policy
  • [SPARK-37714] - ANSI mode: allow casting between numeric type and timestamp type
  • [SPARK-37719] - Try to remove `--add-exports` compile option for Java 17
  • [SPARK-37727] - Show ignored confs & hide warnings for conf already set in SparkSession.builder.getOrCreate
  • [SPARK-37729] - SparkSession.setLogLevel not working in Spark Shell
  • [SPARK-37732] - Improve the implementation of JDBCV2Suite
  • [SPARK-37734] - Upgrade h2 from 1.4.195 to 2.0.202
  • [SPARK-37735] - Add appId interface to KubernetesConf
  • [SPARK-37741] - Remove Jenkins badge in README.md
  • [SPARK-37746] - log4j2-defaults.properties is not working since log4j 2 is always initialized by default
  • [SPARK-37755] - Optimize RocksDB KVStore configurations
  • [SPARK-37760] - Upgrade SBT to 1.6.0
  • [SPARK-37768] - Schema pruning for the metadata struct
  • [SPARK-37769] - Filter on the metadata struct
  • [SPARK-37773] - Disable certain doctests of `ps.to_timedelta` for pandas<=1.0.5
  • [SPARK-37774] - Upgrade log4j from 2.17 to 2.17.1
  • [SPARK-37775] - [PYSPARK] Fix mlflow doctest
  • [SPARK-37790] - Upgrade SLF4J to 1.7.32
  • [SPARK-37791] - Use log4j2 in examples.
  • [SPARK-37792] - Spark shell sets log level to INFO by default
  • [SPARK-37794] - Remove log4j bridge api usage
  • [SPARK-37795] - Add a scalastyle rule to ban `org.apache.log4j`
  • [SPARK-37801] - List PyPy3 installed libraries in build_and_test workflow
  • [SPARK-37804] - Unify v1 and v2 CREATE NAMESPACE tests
  • [SPARK-37805] - Refactor TestUtils#configTestLog4j method to use log4j2 api
  • [SPARK-37806] - Support minimum number of tasks per executor before being rolled
  • [SPARK-37819] - Add OUTLIER executor roll policy and use it by default
  • [SPARK-37824] - Document K8s executor rolling configurations
  • [SPARK-37827] - Put some built-in table properties into V1Table.properties to adapt to V2 commands
  • [SPARK-37839] - DS V2 supports partial aggregate push-down AVG
  • [SPARK-37843] - Suppress NoSuchFieldError at setMDCForTask
  • [SPARK-37844] - Remove slf4j-log4j12 dependency from hadoop-minikdc
  • [SPARK-37847] - PushBlockStreamCallback should check isTooLate first to avoid NPE
  • [SPARK-37853] - Clean up deprecation compilation warning related to log4j2
  • [SPARK-37858] - Throw Spark exceptions from AES functions
  • [SPARK-37864] - Support Parquet v2 data page RLE encoding (for Boolean Values) for the vectorized path
  • [SPARK-37866] - Set file.encoding to UTF-8 for SBT tests
  • [SPARK-37867] - Compile aggregate functions of built-in JDBC dialect
  • [SPARK-37870] - Enable Apple Silicon Jenkins CI (Java/Scala/Python/R)
  • [SPARK-37875] - Support ARM64 in Java 17 docker image
  • [SPARK-37878] - Migrate SHOW CREATE TABLE to use v2 command by default
  • [SPARK-37880] - Upgrade Scala to 2.13.8
  • [SPARK-37887] - PySpark shell sets log level to INFO by default
  • [SPARK-37889] - Log4j2 MarkerFilter can not filter unnecessary thrift errors
  • [SPARK-37923] - Generate partition transforms for BucketSpec inside parser
  • [SPARK-37929] - Support cascade mode for `dropNamespace` API
  • [SPARK-37937] - Use error classes in the parsing errors of lateral join
  • [SPARK-37941] - Use error classes in the compilation errors of casting
  • [SPARK-37943] - Use error classes in the compilation errors of grouping
  • [SPARK-37957] - Deterministic flag is not handled for V2 functions
  • [SPARK-37960] - A new framework to represent catalyst expressions in DS v2 APIs
  • [SPARK-37979] - Switch to more generic error classes in AES functions
  • [SPARK-37983] - Backout agg build time metrics from sort aggregate
  • [SPARK-37986] - Support TimestampNTZ radix sort
  • [SPARK-37990] - Support TimestampNTZ in RowToColumnConverter
  • [SPARK-37995] - TPCDS 1TB q72 fails when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false
  • [SPARK-37998] - Use `rbac.authorization.k8s.io/v1` instead of `v1beta1`
  • [SPARK-38001] - Replace the unsupported error classes by `UNSUPPORTED_FEATURE`
  • [SPARK-38013] - AQE can change broadcast hash join to sort merge join if no extra shuffle is introduced
  • [SPARK-38015] - Mark legacy file naming functions as deprecated in FileCommitProtocol
  • [SPARK-38019] - ExecutorMonitor.timedOutExecutors should be deterministic
  • [SPARK-38022] - Use relativePath for K8s remote file test in BasicTestsSuite
  • [SPARK-38023] - ExecutorMonitor.onExecutorRemoved should handle ExecutorDecommission as finished
  • [SPARK-38029] - Support docker-desktop K8S integration test in SBT
  • [SPARK-38030] - Query with cast containing non-nullable columns fails with AQE on Spark 3.1.1
  • [SPARK-38047] - Add OUTLIER_NO_FALLBACK executor roll policy
  • [SPARK-38048] - Add IntegrationTestBackend.describePods to support all K8s test backends
  • [SPARK-38049] - Use Java 17 in K8s integration tests
  • [SPARK-38062] - FallbackStorage shouldn't attempt to resolve arbitrary "remote" hostname
  • [SPARK-38071] - Support K8s namespace parameter in SBT K8s IT
  • [SPARK-38072] - Support K8s imageTag parameter in SBT K8s IT
  • [SPARK-38081] - Support cloud-backend in K8s IT with SBT
  • [SPARK-38085] - DataSource V2: Handle DELETE commands for group-based sources
  • [SPARK-38095] - HistoryServerDiskManager.appStorePath should use backend-based extensions
  • [SPARK-38097] - Improve the error for pivoting of unsupported value types
  • [SPARK-38103] - Use error classes in the parsing errors of transform
  • [SPARK-38104] - Use error classes in the parsing errors of windows
  • [SPARK-38105] - Use error classes in the parsing errors of joins
  • [SPARK-38107] - Use error classes in the compilation errors of python/pandas UDFs
  • [SPARK-38112] - Use error classes in the execution errors of date/timestamp handling
  • [SPARK-38113] - Use error classes in the execution errors of pivoting
  • [SPARK-38125] - Use static factory methods instead of the deprecated `Byte/Short/Integer/Long` constructors
  • [SPARK-38126] - Check the whole message of error classes
  • [SPARK-38131] - Keep only user-facing error classes
  • [SPARK-38145] - Address the 'pyspark' tagged tests when "spark.sql.ansi.enabled" is True
  • [SPARK-38155] - Disallow distinct aggregate in lateral subqueries with unsupported correlated predicates
  • [SPARK-38157] - Fix /sql/hive-thriftserver/org.apache.spark.sql.hive.thriftserver.ThriftServerQueryTestSuite under ANSI mode
  • [SPARK-38159] - Minor refactor of MetadataAttribute unapply method
  • [SPARK-38162] - Optimize one row plan in normal and AQE Optimizer
  • [SPARK-38163] - Preserve the error class of `AnalysisException` while constructing of function builder
  • [SPARK-38164] - New SQL function: try_subtract and try_multiply
  • [SPARK-38176] - ANSI mode: allow implicitly casting String to other simple types
  • [SPARK-38180] - Allow safe up-cast expressions in correlated equality predicates
  • [SPARK-38187] - Support resource reservation (Introduce minCPU/minMemory) with volcano implementations
  • [SPARK-38188] - Support queue scheduling (Introduce queue) with volcano implementations
  • [SPARK-38196] - Refactor framework so that a JDBC dialect can compile expressions its own way
  • [SPARK-38203] - Fix SQLInsertTestSuite and SchemaPruningSuite under ANSI mode
  • [SPARK-38226] - Fix HiveCompatibilitySuite under ANSI mode
  • [SPARK-38228] - Legacy store assignment should not fail on error under ANSI mode
  • [SPARK-38232] - Explain formatted does not collect subqueries under query stage in AQE
  • [SPARK-38241] - Close KubernetesClient in K8S integrations tests
  • [SPARK-38244] - Upgrade kubernetes-client to 5.12.1
  • [SPARK-38246] - Refactor KVUtils and add UTs related to RocksDB
  • [SPARK-38251] - Change Cast.toString as "cast" instead of "ansi_cast" under ANSI mode
  • [SPARK-38268] - Hide the "failOnError" field in the toString method of Abs/CheckOverflow
  • [SPARK-38272] - Use docker-desktop instead of docker-for-desktop for Docker K8S IT deployMode and context name
  • [SPARK-38276] - Add approved TPCDS plan under ANSI mode
  • [SPARK-38281] - Fix AnalysisSuite under ANSI mode
  • [SPARK-38283] - Test invalid datetime parsing under ANSI mode
  • [SPARK-38290] - Fix JsonSuite and ParquetIOSuite under ANSI mode
  • [SPARK-38295] - Fix ArithmeticExpressionSuite under ANSI mode
  • [SPARK-38298] - Fix DataExpressionSuite, NullExpressionsSuite, StringExpressionsSuite, complexTypesSuite, CastSuite under ANSI mode
  • [SPARK-38302] - Use Java 17 in K8S integration tests when setting spark-tgz
  • [SPARK-38306] - Fix ExplainSuite,StatisticsCollectionSuite and StringFunctionsSuite under ANSI mode
  • [SPARK-38307] - Fix ExpressionTypeCheckingSuite and CollectionExpressionsSuite under ANSI mode
  • [SPARK-38311] - Fix DynamicPartitionPruning/BucketedReadSuite/ExpressionInfoSuite under ANSI mode
  • [SPARK-38312] - Use error classes in org.apache.spark.metrics
  • [SPARK-38316] - Fix SQLViewSuite/TriggerAvailableNowSuite/UnwrapCastInBinaryComparisonSuite/UnwrapCastInComparisonEndToEndSuite under ANSI mode
  • [SPARK-38321] - Fix BooleanSimplificationSuite under ANSI mode
  • [SPARK-38325] - ANSI mode: avoid potential runtime error in HashJoin.extractKeyExprAt()
  • [SPARK-38343] - Fix SQLQuerySuite under ANSI mode
  • [SPARK-38352] - Fix DataFrameAggregateSuite/DataFrameSetOperationsSuite/DataFrameWindowFunctionsSuite under ANSI mode
  • [SPARK-38361] - Add factory method getConnection into JDBCDialect.
  • [SPARK-38363] - Avoid runtime error in Dataset.summary() when ANSI mode is on
  • [SPARK-38383] - Support APP_ID and EXECUTOR_ID placeholder in annotations
  • [SPARK-38385] - Improve error messages of 'mismatched input' cases from ANTLR
  • [SPARK-38387] - Support `na_action` and Series input correspondence in `Series.map`
  • [SPARK-38391] - Datasource v2 supports partial topN push-down
  • [SPARK-38392] - Add `spark-` prefix to namespaces and `-driver` suffix to drivers during IT
  • [SPARK-38398] - Add `priorityClassName` integration test case
  • [SPARK-38400] - Enable Series.rename to change index labels
  • [SPARK-38406] - Improve performance of ShufflePartitionsUtil createSkewPartitionSpecs
  • [SPARK-38407] - ANSI Cast: loosen the limitation of casting non-null complex types
  • [SPARK-38410] - Support specifying the initial partition number for rebalance
  • [SPARK-38417] - Remove `Experimental` from `RDD.cleanShuffleDependencies` API
  • [SPARK-38418] - Add PySpark cleanShuffleDependencies API
  • [SPARK-38423] - Support priority scheduling with volcano implementations
  • [SPARK-38430] - Add SBT commands to K8s IT readme
  • [SPARK-38432] - Refactor framework so that a JDBC dialect can compile filters its own way
  • [SPARK-38442] - Fix ConstantFoldingSuite/ColumnExpressionSuite/DataFrameSuite/AdaptiveQueryExecSuite under ANSI mode
  • [SPARK-38450] - Fix HiveQuerySuite/PushFoldableIntoBranchesSuite/TransposeWindowSuite
  • [SPARK-38451] - Fix R tests under ANSI mode
  • [SPARK-38452] - Support pyDockerfile and rDockerfile in SBT K8s IT
  • [SPARK-38453] - Add volcano section to K8s IT README.md
  • [SPARK-38455] - Support driver/executor PodGroup templates
  • [SPARK-38456] - Improve error messages of no viable alternative, extraneous input and missing token
  • [SPARK-38480] - Remove spark.kubernetes.job.queue in favor of spark.kubernetes.driver.podGroupTemplateFile
  • [SPARK-38481] - Substitute Java overflow exception from TIMESTAMPADD by Spark exception
  • [SPARK-38486] - Upgrade the minimum Minikube version to 1.18.0
  • [SPARK-38490] - Add Github action test job for ANSI SQL mode
  • [SPARK-38491] - Support `ignore_index` of `Series.sort_values`
  • [SPARK-38501] - Fix thriftserver test failures under ANSI mode
  • [SPARK-38504] - Can't read TimestampNTZ as TimestampLTZ
  • [SPARK-38508] - Volcano feature doesn't work on EKS graviton instances
  • [SPARK-38511] - Remove priorityClassName propagation in favor of explicit settings
  • [SPARK-38513] - Move custom scheduler-specific configs to under `spark.kubernetes.scheduler.NAME` prefix
  • [SPARK-38515] - Volcano queue is not deleted
  • [SPARK-38518] - Implement `skipna` of `Series.all/Index.all` to exclude NA/null values
  • [SPARK-38519] - AQE throw exception should respect SparkFatalException
  • [SPARK-38524] - Fix Volcano weight to be positive integer and use cpu capability instead
  • [SPARK-38527] - Set the minimum Volcano version
  • [SPARK-38533] - DS V2 aggregate push-down supports project with alias
  • [SPARK-38534] - Disable to_timestamp('366', 'DD') test case
  • [SPARK-38537] - Unify Statefulset* to StatefulSet*
  • [SPARK-38538] - Fix driver environment verification in BasicDriverFeatureStepSuite
  • [SPARK-38544] - Upgrade log4j2 to 2.17.2
  • [SPARK-38548] - New SQL function: try_sum
  • [SPARK-38553] - Bump minimum Volcano version to v1.5.1
  • [SPARK-38560] - If `Sum`, `Count`, or `Any` accompany DISTINCT, partial aggregate push-down cannot be done
  • [SPARK-38561] - Add doc for "Customized Kubernetes Schedulers"
  • [SPARK-38562] - Add doc for Volcano scheduler
  • [SPARK-38589] - New SQL function: try_avg
  • [SPARK-38590] - New SQL function: try_to_binary
  • [SPARK-38616] - Keep track of SQL query text in Catalyst TreeNode
  • [SPARK-38625] - DataSource V2: Add APIs for group-based row-level operations
  • [SPARK-38626] - Make condition in DeleteFromTable required
  • [SPARK-38633] - Support push down Cast to JDBC data source V2
  • [SPARK-38644] - DS V2 topN push-down supports project with alias
  • [SPARK-38676] - Provide query context in runtime error of Add/Subtract/Multiply
  • [SPARK-38698] - Provide query context in runtime error of Divide/Div/Remainder/Pmod
  • [SPARK-38716] - Provide query context in runtime error when a map key does not exist
  • [SPARK-38761] - DS V2 supports push down misc non-aggregate functions
  • [SPARK-38762] - Provide query context in Decimal overflow errors
  • [SPARK-38763] - Pandas API on Spark can't apply lambda to columns
  • [SPARK-38787] - Possible correctness issue on stream-stream join when handling edge case
  • [SPARK-38791] - Output parameter values of error classes in SQL style
  • [SPARK-38809] - Implement option to skip null values in symmetric hash impl of stream-stream joins
  • [SPARK-38813] - Remove TimestampNTZ type support in Spark 3.3
  • [SPARK-38817] - Upgrade kubernetes-client to 5.12.2
  • [SPARK-38828] - Remove TimestampNTZ type Python support in Spark 3.3
  • [SPARK-38829] - New configuration for controlling timestamp inference of Parquet
  • [SPARK-38837] - Implement `dropna` parameter of `SeriesGroupBy.value_counts`
  • [SPARK-38855] - DS V2 supports push down math functions
  • [SPARK-38865] - Update document of JDBC options for pushDownAggregate and pushDownLimit
  • [SPARK-38891] - Skip allocating vectors for repetition & definition levels when possible
  • [SPARK-38908] - Provide query context in runtime error of Casting from String to Number/Date/Timestamp/Boolean
  • [SPARK-38913] - Output identifiers in error messages in SQL style
  • [SPARK-38926] - Output types in error messages in SQL style
  • [SPARK-38949] - Wrap SQL statements by double quotes in error messages
  • [SPARK-38950] - Return Array of Predicate for SupportsPushDownCatalystFilters.pushedFilters
  • [SPARK-38967] - Turn spark.sql.ansi.strictIndexOperator into internal config
  • [SPARK-38996] - Use double quotes for types in error messages
  • [SPARK-38997] - DS V2 aggregate push-down supports group by expressions
  • [SPARK-39007] - Use double quotes for SQL configs in error messages
  • [SPARK-39027] - Output SQL statements in upper case in error messages
  • [SPARK-39040] - Respect NaNvl in EquivalentExpressions for expression elimination
  • [SPARK-39046] - Return an empty context string if TreeNode.origin is wrongly set
  • [SPARK-39087] - Improve error messages: step 1
  • [SPARK-39105] - Add ConditionalExpression trait
  • [SPARK-39106] - Correct conditional expression constant folding
  • [SPARK-39121] - Fix doc format/syntax error
  • [SPARK-39135] - DS V2 aggregate partial push-down should support group by without aggregate functions
  • [SPARK-39157] - H2Dialect should override getJDBCType so that the data type is correct
  • [SPARK-39162] - JDBC dialects should decide which functions can be pushed down
  • [SPARK-39164] - Wrap asserts/illegal state exceptions by the INTERNAL_ERROR exception in actions
  • [SPARK-39165] - Replace sys.error by IllegalStateException in Spark SQL
  • [SPARK-39166] - Provide runtime error query context for Binary Arithmetic when WSCG is off
  • [SPARK-39175] - Provide runtime error query context for Cast when WSCG is off
  • [SPARK-39177] - Provide query context for the map-key-does-not-exist error when WSCG is off
  • [SPARK-39187] - Remove SparkIllegalStateException
  • [SPARK-39190] - Provide query context for decimal precision overflow error when WSCG is off
  • [SPARK-39193] - Speed up Timestamp type inference of the default format in JSON/CSV data sources
  • [SPARK-39208] - Fix query context bugs in decimal overflow under codegen mode
  • [SPARK-39210] - Provide query context of Decimal overflow in AVG when WSCG is off
  • [SPARK-39212] - Use double quotes for values of SQL configs/DS options in error messages
  • [SPARK-39214] - Improve errors related to CAST
  • [SPARK-39229] - Separate query contexts from error-classes.json
  • [SPARK-39234] - Code clean up in SparkThrowableHelper.getMessage
  • [SPARK-39243] - Describe the rules of quoting elements in error messages
  • [SPARK-39255] - Improve error messages: step 2
  • [SPARK-39272] - Increase the start position of query context by 1
  • [SPARK-39322] - Remove `Experimental` from `spark.dynamicAllocation.shuffleTracking.enabled`
  • [SPARK-39327] - ExecutorRollPolicy.ID should treat the ID as a number
  • [SPARK-39346] - Convert asserts/illegal state exception to internal errors on each phase
  • [SPARK-39886] - Disable DEFAULT column SQLConf until implementation is complete
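
The try_* functions above (SPARK-37533, SPARK-38164, SPARK-38548, SPARK-38589) and the ILIKE API (SPARK-36778, SPARK-36882, SPARK-36899) are user-facing enough to warrant a quick illustration. Below is a minimal PySpark sketch, assuming a local session; the app name, sample data, and column names are illustrative assumptions, not taken from the issues:

```python
# Minimal sketch (illustrative): the try_* functions return NULL instead of
# raising a runtime error, which matters under ANSI mode; ILIKE is a
# case-insensitive LIKE. App and column names below are made up.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-330-notes-sketch").getOrCreate()

spark.sql("""
    SELECT
      try_element_at(array(1, 2, 3), 5) AS missing_elem,  -- NULL, no error
      try_multiply(2147483647, 2)       AS no_overflow,   -- NULL on int overflow
      try_sum(v)                        AS safe_sum,      -- 3
      try_avg(v)                        AS safe_avg       -- 1.5
    FROM VALUES (1), (2), (NULL) AS t(v)
""").show()

# ILIKE in SQL and as the Column.ilike method (SPARK-36882).
spark.sql("SELECT 'Spark' ILIKE '_park' AS matched").show()
df = spark.createDataFrame([("Spark",), ("FLINK",)], ["name"])
df.filter(df.name.ilike("spark")).show()  # keeps only the 'Spark' row
```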
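
SPARK-37228 above adds DataFrame.mapInArrow to PySpark. A minimal sketch of its shape, assuming pyarrow is installed; the doubling transform and column name are purely illustrative:

```python
# Minimal sketch (illustrative) of DataFrame.mapInArrow (SPARK-37228): the
# user function receives an iterator of pyarrow.RecordBatch and must yield
# RecordBatches that match the declared output schema.
import pyarrow as pa
import pyarrow.compute as pc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(8).toDF("x")  # one bigint column named x

def double_batches(batches):
    for batch in batches:
        doubled = pc.multiply(batch.column(0), 2)  # double the x column
        yield pa.RecordBatch.from_arrays([doubled], names=["x"])

# The output schema is declared up front, as with mapInPandas.
df.mapInArrow(double_batches, schema="x long").show()
```

Unlike mapInPandas, the user function sees pyarrow.RecordBatch objects directly, skipping the Arrow-to-pandas conversion on each batch.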

Bug

  • [SPARK-8582] - Optimize checkpointing to avoid computing an RDD twice
  • [SPARK-18621] - PySpark SQL types (aka DataFrame schema) have __repr__() with a Scala rather than a Python representation
  • [SPARK-23626] - DAGScheduler blocked due to JobSubmitted event
  • [SPARK-30062] - Add IMMEDIATE statement to the DB2 dialect truncate implementation
  • [SPARK-30537] - toPandas gets wrong dtypes when applied to an empty DataFrame with Arrow enabled
  • [SPARK-32079] - PySpark <> Beam pickling issues for collections.namedtuple
  • [SPARK-33206] - Spark Shuffle Index Cache calculates memory usage incorrectly
  • [SPARK-34521] - spark.createDataFrame does not support Pandas StringDtype extension type
  • [SPARK-34805] - PySpark loses metadata in DataFrame fields when selecting nested columns
  • [SPARK-35011] - False active executor in UI caused by BlockManager re-registration
  • [SPARK-35430] - Investigate the failure of "PVs with local storage" integration test on Docker driver
  • [SPARK-35531] - Cannot insert into a Hive bucketed table created with an upper-case schema
  • [SPARK-35561] - Partition result is incorrect when inserting into a partitioned table with an int partition column
  • [SPARK-35672] - Spark fails to launch executors with very large user classpath lists on YARN
  • [SPARK-35803] - Spark SQL does not support creating views using DataSource v2 based data sources
  • [SPARK-35881] - [SQL] AQE does not support columnar execution for the final query stage
  • [SPARK-35912] - [SQL] JSON read behavior is different depending on the cache setting when nullable is false.
  • [SPARK-35929] - Schema inference of nested structs defaults to map
  • [SPARK-36004] - Update MiMa and audit Scala/Java API changes
  • [SPARK-36007] - Failed to run benchmark in GA
  • [SPARK-36009] - Missing GraphX classes in registerKryoClasses util method
  • [SPARK-36013] - Upgrade Dropwizard Metrics to 4.2.2
  • [SPARK-36014] - Use uuid as app id in kubernetes client mode
  • [SPARK-36026] - Upgrade Kubernetes Client Version to 5.5.0
  • [SPARK-36036] - Regression: Remote blocks stored on disk by BlockManager are not deleted
  • [SPARK-36052] - Introduce pending pod limit for Spark on K8s
  • [SPARK-36122] - Spark does not pass needClientAuth on to Jetty's SSLContextFactory, so mTLS authentication cannot be configured
  • [SPARK-36169] - Make 'spark.sql.sources.disabledJdbcConnProviderList' a static conf (as documented)
  • [SPARK-36211] - type check fails for `F.udf(...).asNonDeterministic()`
  • [SPARK-36237] - SparkUI should bind handler after application started
  • [SPARK-36242] - Ensure the spill file is closed before setting success to true in ExternalSorter.spillMemoryIteratorToDisk
  • [SPARK-36327] - Spark SQL creates the staging directory inside the database directory rather than inside the table directory
  • [SPARK-36341] - In the stage page's 'Aggregated Metrics by Executor' table, the underline shown when hovering over a link is blocked
  • [SPARK-36348] - unexpected Index loaded: pd.Index([10, 20, None], name="x")
  • [SPARK-36358] - Upgrade Kubernetes Client Version to 5.6.0
  • [SPARK-36379] - Null at root level of a JSON array causes the parsing failure (w/ permissive mode)
  • [SPARK-36382] - Remove noisy footer from the summary table for metrics
  • [SPARK-36383] - NullPointerException throws during executor shutdown
  • [SPARK-36389] - Revert the change that accepts negative mapId in ShuffleBlockId
  • [SPARK-36391] - Improve the error message when fetching a chunk throws an NPE
  • [SPARK-36421] - Validate all SQL configs to prevent wrong use of ConfigEntry
  • [SPARK-36433] - Logs should show correct URL of where HistoryServer is started
  • [SPARK-36448] - Exceptions in NoSuchItemException.scala have to be case classes to preserve specific exceptions
  • [SPARK-36488] - "Invalid usage of '*' in expression" error due to the feature of 'quotedRegexColumnNames' in some scenarios.
  • [SPARK-36507] - Remove/Replace missing links to AMP Camp materials from index.md
  • [SPARK-36512] - Fix UISeleniumSuite in sql/hive-thriftserver
  • [SPARK-36553] - KMeans fails with NegativeArraySizeException for K = 50000 after issue #27758 was introduced
  • [SPARK-36554] - Error message when using Spark SQL functions directly on DataFrame columns without a select expression
  • [SPARK-36568] - Missed broadcast join in V2 plan
  • [SPARK-36605] - Upgrade Jackson to 2.12.5
  • [SPARK-36627] - Tasks with Java proxy objects fail to deserialize
  • [SPARK-36681] - Fail to load Snappy codec
  • [SPARK-36700] - BlockManager re-registration is broken due to deferred removal of BlockManager
  • [SPARK-36717] - Wrong order of variable initialization may lead to incorrect behavior
  • [SPARK-36733] - Performance issue in SchemaPruning when a struct has many fields
  • [SPARK-36773] - Fix the unit tests that check Parquet compression
  • [SPARK-36798] - When SparkContext is stopped, metrics system should be flushed after listeners have finished processing
  • [SPARK-36804] - Using the verbose parameter in yarn mode would cause application submission failure
  • [SPARK-36806] - Use R 4.0.4 in K8s R image
  • [SPARK-36861] - Partition columns are overly eagerly parsed as dates
  • [SPARK-36867] - Misleading Error Message with Invalid Column and Group By
  • [SPARK-36889] - Respect `spark.sql.parquet.filterPushdown` by explain() for DSv2
  • [SPARK-36896] - Return boolean for `dropTempView` and `dropGlobalTempView`
  • [SPARK-36905] - Reading Hive view without explicit column names fails in Spark
  • [SPARK-36985] - Future typing errors in pyspark.pandas
  • [SPARK-36993] - Fix json_tuple throwing an NPE when fields contain a foldable null value
  • [SPARK-37004] - Job cancellation causes py4j errors on Jupyter due to pinned thread mode
  • [SPARK-37017] - Reduce the scope of synchronized to prevent deadlock.
  • [SPARK-37026] - Ensure the element type of ResolvedRFormula.terms is scala.Seq for Scala 2.13
  • [SPARK-37046] - Alter view does not preserve column case
  • [SPARK-37049] - executorIdleTimeout is not working for pending pods on K8s
  • [SPARK-37052] - Fix Spark 3.2 so that --verbose can be used with spark-shell
  • [SPARK-37057] - Fix wrong DocSearch facet filter in release-tag.sh
  • [SPARK-37059] - Ensure the sort order of the output in the PySpark doctests
  • [SPARK-37060] - Report driver status does not handle response from backup masters
  • [SPARK-37061] - Custom V2 Metrics uses wrong classname for lookup
  • [SPARK-37064] - Fix outer join return the wrong max rows if other side is empty
  • [SPARK-37069] - HiveClientImpl throws NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.getWithoutRegisterFns
  • [SPARK-37076] - Implement StructType.toString explicitly for Scala 2.13
  • [SPARK-37078] - Support old 3-parameter Sink constructors
  • [SPARK-37079] - Fix DataFrameWriterV2.partitionedBy to send the arguments to JVM properly
  • [SPARK-37086] - Fix the R test of FPGrowthModel for Scala 2.13
  • [SPARK-37088] - Python UDF after off-heap vectorized reader can cause crash due to use-after-free in writer thread
  • [SPARK-37089] - ParquetFileFormat registers task completion listeners lazily, causing Python writer thread to segfault when off-heap vectorized reader is enabled
  • [SPARK-37098] - Alter table properties should invalidate cache
  • [SPARK-37102] - Missing dependencies for hadoop-azure
  • [SPARK-37103] - Switch from Maven to SBT to build Spark on AppVeyor
  • [SPARK-37112] - Fix MiMa failure with Scala 2.13
  • [SPARK-37117] - Can't read files in one of the Parquet encryption modes (external key material)
  • [SPARK-37121] - TestUtils.isPythonVersionAtLeast38 returns incorrect results
  • [SPARK-37135] - Fix some micro-benchmarks that fail to run
  • [SPARK-37141] - WorkerSuite cannot run on Mac OS
  • [SPARK-37143] - Supplement the missing Java 11 benchmark result files
  • [SPARK-37147] - MetricsReporter produces a NullPointerException when the element 'triggerExecution' is not present in the map
  • [SPARK-37170] - Pin PySpark version installed in the Binder environment for tagged commit
  • [SPARK-37191] - Allow merging DecimalTypes with different precision values
  • [SPARK-37196] - NPE in org.apache.spark.sql.hive.HiveShim$.toCatalystDecimal(HiveShim.scala:106)
  • [SPARK-37202] - Temp view didn't collect temp functions registered with the catalog API
  • [SPARK-37203] - Fix NotSerializableException when observe with TypedImperativeAggregate
  • [SPARK-37209] - YarnShuffleIntegrationSuite and two other similar cases in `resource-managers` tests failed
  • [SPARK-37217] - The number of dynamic partitions should be checked early when writing to external tables
  • [SPARK-37252] - Ignore test_memory_limit on non-Linux environment
  • [SPARK-37253] - try_simplify_traceback should not fail when tb_frame.f_lineno is None
  • [SPARK-37290] - Exponential planning time in case of non-deterministic function
  • [SPARK-37302] - Explicitly download the dependencies of guava and jetty-io in test-dependencies.sh
  • [SPARK-37308] - Flaky Test: DDLParserSuite.create view -- basic
  • [SPARK-37314] - Upgrade kubernetes-client to 5.10.1
  • [SPARK-37315] - Mitigate ConcurrentModificationException thrown from a test in MLEventSuite
  • [SPARK-37318] - Make FallbackStorageSuite robust in terms of DNS
  • [SPARK-37320] - Delete py_container_checks.zip after the test in DepsTestsSuite finishes
  • [SPARK-37356] - Add fine grained locking to BlockInfoManager
  • [SPARK-37374] - StatCounter should use mergeStats when merging with self.
  • [SPARK-37388] - WidthBucket throws NullPointerException in WholeStageCodegenExec
  • [SPARK-37390] - Buggy method retrieval in pyspark.docs.conf.setup
  • [SPARK-37391] - Significant bottleneck introduced by the fix for SPARK-32001
  • [SPARK-37392] - Catalyst optimizer is very time-consuming and memory-intensive with some "explode(array)" queries
  • [SPARK-37452] - Char and Varchar break backward compatibility between v3 and v2
  • [SPARK-37459] - Upgrade commons-cli to 1.5.0
  • [SPARK-37465] - PySpark tests failing on Pandas 0.23
  • [SPARK-37480] - Configurations in docs/running-on-kubernetes.md are not up to date
  • [SPARK-37481] - Disappearance of skipped stages misleads bug hunting
  • [SPARK-37498] - test_reuse_worker_of_parallelize_range is flaky
  • [SPARK-37524] - We should drop all tables after testing dynamic partition pruning
  • [SPARK-37534] - Bump dev.ludovic.netlib to 2.2.1
  • [SPARK-37544] - sequence over dates with month interval is producing incorrect results
  • [SPARK-37545] - V2 CreateTableAsSelect command should qualify location
  • [SPARK-37546] - V2 ReplaceTableAsSelect command should qualify location
  • [SPARK-37554] - Add PyArrow, pandas and plotly to release Docker image dependencies
  • [SPARK-37556] - Deserializing the void class fails with Java serialization
  • [SPARK-37569] - View Analysis incorrectly marks nested fields as nullable
  • [SPARK-37573] - IsolatedClient fallbackVersion should be the built-in version, not always 2.7.4
  • [SPARK-37575] - null values should be saved as nothing rather than quoted empty Strings "" with default settings
  • [SPARK-37577] - ClassCastException: ArrayType cannot be cast to StructType
  • [SPARK-37585] - DSV2 InputMetrics are not getting updated in a corner case
  • [SPARK-37598] - Pyspark's newAPIHadoopRDD() method fails with ShortWritables
  • [SPARK-37615] - Upgrade SBT to 1.5.6
  • [SPARK-37633] - Unwrap cast should skip if downcast failed with ansi enabled
  • [SPARK-37635] - SHOW TBLPROPERTIES should print the fully qualified table name
  • [SPARK-37643] - When charVarcharAsString is true, queries on char-type partitioned tables return incorrect results
  • [SPARK-37654] - Regression - NullPointerException in Row.getSeq when the field is null
  • [SPARK-37656] - Upgrade SBT to 1.5.7
  • [SPARK-37658] - Skip PIP packaging test if Python version is lower than 3.7
  • [SPARK-37659] - Fix FsHistoryProvider race condition between listing and deleting log info
  • [SPARK-37663] - Mitigate ConcurrentModificationException thrown from tests in SparkContextSuite
  • [SPARK-37668] - 'Index' object has no attribute 'levels' in pyspark.pandas.frame.DataFrame.insert
  • [SPARK-37678] - Incorrect annotations in SeriesGroupBy._cleanup_and_return
  • [SPARK-37690] - Recursive view `df` detected (cycle: `df` -> `df`)
  • [SPARK-37693] - Fix ChildProcAppHandleSuite failed in spark-master-test-maven-hadoop-3.2
  • [SPARK-37694] - Disallow deleting resources in the Spark SQL CLI
  • [SPARK-37703] - Upgrade SBT to 1.5.8
  • [SPARK-37713] - No namespace assigned in Executor Pod ConfigMap
  • [SPARK-37721] - Failed to execute pyspark test in Win WSL
  • [SPARK-37728] - reading nested columns with ORC vectorized reader can cause ArrayIndexOutOfBoundsException
  • [SPARK-37730] - plot.hist throws AttributeError on pandas=1.3.5
  • [SPARK-37754] - Fix black version in dev/reformat-python
  • [SPARK-37763] - Upgrade jackson to 2.13.1
  • [SPARK-37778] - Upgrade SBT to 1.6.1
  • [SPARK-37779] - Make ColumnarToRowExec plan canonicalizable after (de)serialization
  • [SPARK-37793] - Invalid LocalMergedBlockData causes tasks to hang
  • [SPARK-37800] - TreeNode.argString incorrectly formats arguments of type Set[_]
  • [SPARK-37802] - composite field name like `field name` doesn't work with Aggregate push down
  • [SPARK-37807] - Fix a typo in HttpAuthenticationException message
  • [SPARK-37820] - Replace ApacheCommonBase64 with JavaBase64 for string functions
  • [SPARK-37834] - Reenable length check in Python linter
  • [SPARK-37841] - BasicWriteTaskStatsTracker should not try to get status for a skipped file
  • [SPARK-37846] - TaskContext is used at wrong place in BlockManagerDecommissionIntegrationSuite
  • [SPARK-37855] - IllegalStateException when transforming an array inside a nested struct
  • [SPARK-37859] - SQL tables created with JDBC with Spark 3.1 are not readable with 3.2
  • [SPARK-37860] - [BUG] Revert: Fix taskid in the stage page task event timeline
  • [SPARK-37865] - Spark should not dedup the groupingExpressions when the first child of Union has duplicate columns
  • [SPARK-37874] - Link to Pandas UDF documentation is broken
  • [SPARK-37884] - Upgrade kubernetes-client to 5.10.2
  • [SPARK-37893] - Fix flaky test: AdaptiveQueryExecSuite with Scala 2.13
  • [SPARK-37895] - Error while joining two tables with non-English field names
  • [SPARK-37905] - Make `merge_spark_pr.py` set primary author from the first commit in case of ties
  • [SPARK-37918] - Use the specified constructor when instantiating SessionStateBuilder
  • [SPARK-37920] - Remove tab character and trailing space in pom.xml
  • [SPARK-37932] - Analyzer can fail when join left side and right side are the same view
  • [SPARK-37947] - Cannot use <func>_outer generators in a lateral view
  • [SPARK-37958] - PySpark SparkContext.addFile() does not respect spark.files.overwrite
  • [SPARK-37963] - Need to update Partition URI after renaming table in InMemoryCatalog
  • [SPARK-37972] - Typing incompatibilities with numpy==1.22.x
  • [SPARK-38016] - Fix the API doc for session_window to say it supports TimestampNTZType too as timeColumn
  • [SPARK-38018] - Fix ColumnVectorUtils.populate to handle CalendarIntervalType correctly
  • [SPARK-38042] - Encoder cannot be found when a tuple component is a type alias for an Array
  • [SPARK-38056] - Structured streaming not working in history server when using LevelDB
  • [SPARK-38060] - Inconsistent behavior from JSON option allowNonNumericNumbers
  • [SPARK-38067] - Inconsistent missing values handling in Pandas on Spark to_json
  • [SPARK-38073] - Update atexit function to avoid issues with late binding
  • [SPARK-38075] - Hive script transform with order by and limit will return fake rows
  • [SPARK-38118] - Func(wrong data type) in HAVING clause should throw data mismatch error
  • [SPARK-38120] - HiveExternalCatalog.listPartitions fails when the partition column name is upper case and the partition value contains a dot
  • [SPARK-38124] - Revive HashClusteredDistribution and apply to stream-stream join
  • [SPARK-38130] - array_sort does not allow non-orderable datatypes
  • [SPARK-38132] - Remove NotPropagation
  • [SPARK-38133] - Grouping by timestamp_ntz will sometimes corrupt the results
  • [SPARK-38139] - ml.recommendation.ALS doctests failures
  • [SPARK-38140] - Desc column stats (min, max) for timestamp type is not consistent with the value due to time zone difference
  • [SPARK-38146] - UDAF fails to aggregate TIMESTAMP_NTZ column
  • [SPARK-38151] - Handle `Pacific/Kanton` in DateTimeUtilsSuite
  • [SPARK-38171] - Upgrade ORC to 1.7.3
  • [SPARK-38173] - Quoted column cannot be recognized correctly when quotedRegexColumnNames is true
  • [SPARK-38178] - Correct the logic to measure the memory usage of RocksDB
  • [SPARK-38182] - Fix NoSuchElementException if pushed filter does not contain any references
  • [SPARK-38185] - Fix incorrect data when the aggregate function list is empty
  • [SPARK-38192] - Use try-with-resources in Level/RocksDBSuite.java
  • [SPARK-38198] - Fix `QueryExecution.debug#toFile` use the passed in `maxFields` when `explainMode` is `CodegenMode`
  • [SPARK-38201] - Fix KubernetesUtils#uploadFileToHadoopCompatibleFS use passed in `delSrc` and `overwrite`
  • [SPARK-38204] - All state operators are at a risk of inconsistency between state partitioning and operator partitioning
  • [SPARK-38206] - Relax the requirement of data type comparison for keys in stream-stream join
  • [SPARK-38221] - Group by a stream of complex expressions fails
  • [SPARK-38227] - Apply strict nullability of nested column in time window / session window
  • [SPARK-38236] - Absolute file paths specified in create/alter table are treated as relative
  • [SPARK-38239] - AttributeError: 'LogisticRegressionModel' object has no attribute '_call_java'
  • [SPARK-38243] - Unintended exception thrown in pyspark.ml.LogisticRegression.getThreshold
  • [SPARK-38271] - PoissonSampler may output more rows than MaxRows
  • [SPARK-38273] - decodeUnsafeRows's iterators should close underlying input streams
  • [SPARK-38275] - Consider to include WriteBatch's memory in the memory usage of RocksDB state store
  • [SPARK-38285] - ClassCastException: GenericArrayData cannot be cast to InternalRow
  • [SPARK-38286] - Union's maxRows and maxRowsPerPartition may overflow
  • [SPARK-38304] - Elt() should return null if the index is null under ANSI mode (see the example after this list)
  • [SPARK-38308] - Select of a stream of window expressions fails
  • [SPARK-38309] - SHS has incorrect percentiles for shuffle read bytes and shuffle total blocks metrics
  • [SPARK-38314] - Fail to read parquet files after writing the hidden file metadata in
  • [SPARK-38320] - (flat)MapGroupsWithState can timeout groups which just received inputs in the same microbatch
  • [SPARK-38333] - DPP cause DataSourceScanExec java.lang.NullPointerException
  • [SPARK-38344] - Avoid submitting tasks when there are no requests to push in push-based shuffle
  • [SPARK-38347] - Nullability propagation in transformUpWithNewOutput
  • [SPARK-38355] - Change mktemp() to mkstemp()
  • [SPARK-38357] - StackOverflowError with OR(data filter, partition filter)
  • [SPARK-38394] - Build of Spark SQL against hadoop-3.4.0-SNAPSHOT fails with a bouncycastle classpath error
  • [SPARK-38411] - Use UTF-8 when doMergeApplicationListingInternal reads event logs
  • [SPARK-38412] - `from` and `to` are swapped in the StateSchemaCompatibilityChecker
  • [SPARK-38416] - Change day to month
  • [SPARK-38436] - Fix `test_ceil` to test `ceil`
  • [SPARK-38446] - Deadlock between ExecutorClassLoader and FileDownloadCallback caused by Log4j
  • [SPARK-38458] - Fix always false condition in LogDivertAppender#initLayout
  • [SPARK-38516] - Add log4j-core, log4j-api and log4j-slf4j-impl to classpath if active hadoop-provided
  • [SPARK-38517] - Fix PySpark documentation generation (missing ipython_genutils)
  • [SPARK-38523] - Failure on referring to the corrupt record from CSV
  • [SPARK-38526] - fix misleading function alias name for RuntimeReplaceable
  • [SPARK-38528] - NullPointerException when selecting a generator in a Stream of aggregate expressions
  • [SPARK-38530] - GeneratorNestedColumnAliasing does not work correctly for some expressions
  • [SPARK-38542] - UnsafeHashedRelation should serialize numKeys out
  • [SPARK-38563] - Upgrade to Py4J 0.10.9.5
  • [SPARK-38567] - Enable GitHub Action build_and_test on branch-3.3
  • [SPARK-38579] - Requesting the REST API can cause NullPointerException
  • [SPARK-38583] - to_timestamp should allow numeric types
  • [SPARK-38586] - Trigger notifying workflow in branch-3.3 and other future branches
  • [SPARK-38587] - Validating new location for rename command should use formatted names
  • [SPARK-38600] - Include unit into the sql string of TIMESTAMPADD/DIFF
  • [SPARK-38604] - ceil and floor return different types when called from scala than sql
  • [SPARK-38612] - Fix Inline type hint for duplicated.keep
  • [SPARK-38630] - K8s app name label should start and end with alphanumeric char
  • [SPARK-38631] - Arbitrary shell command injection via Utils.unpack()
  • [SPARK-38640] - NPE when unpersisting a memory-only RDD with RDD fetching from shuffle service enabled
  • [SPARK-38652] - uploadFileUri should preserve file scheme
  • [SPARK-38655] - OffsetWindowFunctionFrameBase cannot find the offset row whose input is not null
  • [SPARK-38665] - upgrade jackson due to CVE-2020-36518
  • [SPARK-38666] - Missing aggregate filter checks
  • [SPARK-38675] - Race condition in BlockInfoManager during unlock
  • [SPARK-38677] - PySpark hangs in local mode when running an RDD map operation
  • [SPARK-38680] - Set upperbound for pandas-stubs in CI
  • [SPARK-38681] - Support nested generic case classes
  • [SPARK-38684] - Stream-stream outer join has a possible correctness issue due to weakly read consistent on outer iterators
  • [SPARK-38696] - Add `commons-collections` back
  • [SPARK-38706] - Use URI in FallbackStorage.copy
  • [SPARK-38776] - Flaky test: ALSSuite.'ALS validate input dataset'
  • [SPARK-38807] - Error when starting spark shell on Windows system
  • [SPARK-38818] - Fix the docs of try_multiply/try_subtract/ANSI cast
  • [SPARK-38823] - Incorrect result of dataset reduceGroups in java
  • [SPARK-38830] - Warn on corrupted block messages
  • [SPARK-38866] - Update ORC to 1.7.4
  • [SPARK-38868] - `assert_true` fails unconditionally after `left_outer` joins
  • [SPARK-38882] - The usage logger attachment logic should handle static methods properly.
  • [SPARK-38889] - Invalid column name while querying bit type column in MSSQL
  • [SPARK-38916] - Tasks not killed caused by race conditions between killTask() and launchTask()
  • [SPARK-38918] - Nested column pruning should filter out attributes that do not belong to the current relation
  • [SPARK-38922] - TaskLocation.apply throws NullPointerException
  • [SPARK-38931] - RocksDB file manager does not create the initial DFS directory when the number of keys is unknown on the first empty checkpoint
  • [SPARK-38941] - Skip RocksDB-based test case in StreamingJoinSuite on Apple Silicon
  • [SPARK-38942] - Skip RocksDB-based test case in FlatMapGroupsWithStateSuite on Apple Silicon
  • [SPARK-38955] - Disable lineSep option in 'from_csv' and 'schema_of_csv'
  • [SPARK-38973] - When push-based shuffle is enabled, a stage may not complete when retried
  • [SPARK-38974] - List functions should only list registered functions in the specified database
  • [SPARK-38977] - Fix schema pruning with correlated subqueries
  • [SPARK-38988] - Pandas API - "PerformanceWarning: DataFrame is highly fragmented." gets printed many times
  • [SPARK-38990] - date_trunc and trunc both fail with format from column in inline table
  • [SPARK-38992] - Avoid using bash -c in ShellBasedGroupsMappingProvider
  • [SPARK-39012] - Spark SQL partition value parsing does not support all data types
  • [SPARK-39015] - SparkRuntimeException when trying to get non-existent key in a map
  • [SPARK-39055] - Fix documentation 404 page
  • [SPARK-39060] - Typo in error messages of decimal overflow
  • [SPARK-39083] - Fix FsHistoryProvider race condition between update and clean app data
  • [SPARK-39084] - df.rdd.isEmpty() results in unexpected executor failure and JVM crash
  • [SPARK-39093] - Dividing interval by integral can result in codegen compilation error
  • [SPARK-39104] - NullPointerException on unpersist call
  • [SPARK-39107] - Silent change in regexp_replace's handling of empty strings
  • [SPARK-39112] - UnsupportedOperationException if spark.sql.ui.explainMode is set to cost
  • [SPARK-39144] - Nested subquery expressions deduplicate relations should be done bottom up
  • [SPARK-39149] - SHOW DATABASES command should not quote database names under legacy mode
  • [SPARK-39216] - Do not collapse projects in CombineUnions if it hasCorrelatedSubquery
  • [SPARK-39218] - Python foreachBatch streaming query cannot be stopped gracefully after pin thread mode is enabled
  • [SPARK-39226] - Fix the precision of the return type of round-like functions
  • [SPARK-39233] - Remove the check for TimestampNTZ output in Analyzer
  • [SPARK-39250] - Upgrade Jackson to 2.13.3
  • [SPARK-39258] - Fix `Hide credentials in show create table` after SPARK-35378
  • [SPARK-39259] - Timestamps returned by now() and equivalent functions are not consistent in subqueries
  • [SPARK-39283] - Spark tasks stuck forever due to deadlock between TaskMemoryManager and UnsafeExternalSorter
  • [SPARK-39286] - Documentation for the decode function has an incorrect reference
  • [SPARK-39293] - The accumulator of ArrayAggregate should copy the intermediate result if string, struct, array, or map
  • [SPARK-39313] - V2ExpressionUtils.toCatalystOrdering should fail if V2Expression can not be translated
  • [SPARK-39341] - KubernetesExecutorBackend should allow IPv6 pod IP
  • [SPARK-39354] - The analysis exception is incorrect
  • [SPARK-39360] - Recover spark.kubernetes.memoryOverheadFactor doc and remove deprecation
  • [SPARK-39376] - Do not output duplicated columns in star expansion of subquery alias of NATURAL/USING JOIN
  • [SPARK-39393] - Parquet data source only supports push-down predicate filters for non-repeated primitive types
  • [SPARK-39411] - Release candidates do not have the correct version for PySpark
  • [SPARK-39412] - IllegalStateException from connector does not work well with error class framework
  • [SPARK-39417] - Handle Null partition values in PartitioningUtils
  • [SPARK-39421] - Sphinx build fails with "node class 'meta' is already registered, its visitors will be overridden"
  • [SPARK-39427] - Disable ANSI intervals in the percentile functions
  • [SPARK-40804] - Missing handling a catalog name in destination tables in `RenameTableExec`
  • [SPARK-45601] - stackoverflow when executing rule ExtractWindowExpressions
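
Many of the fixes above change observable query behavior. As one example, SPARK-38304 (flagged above) makes elt() return NULL for a null index under ANSI mode. A minimal PySpark sketch of the fixed behavior follows; this is a hedged illustration assuming Spark 3.3.0, and the literal arguments are made up for the example:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    spark.conf.set("spark.sql.ansi.enabled", "true")

    # Under ANSI mode, a NULL index now yields NULL rather than an error.
    spark.sql("SELECT elt(CAST(NULL AS INT), 'scala', 'java') AS r").show()
    # Expected output: a single row whose value r is null.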

Story

  • [SPARK-36642] - Add df.withMetadata: syntax sugar to update the metadata of a DataFrame
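
A minimal PySpark sketch of this sugar (assuming Spark 3.3.0; the column name and metadata payload are illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, 25)], ["id", "age"])

    # Shorthand for rebuilding the column with an alias that carries metadata.
    df2 = df.withMetadata("age", {"comment": "age in years"})
    print(df2.schema["age"].metadata)  # {'comment': 'age in years'}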

New Feature

  • [SPARK-595] - Document "local-cluster" mode
  • [SPARK-12567] - Add aes_encrypt and aes_decrypt UDFs
  • [SPARK-28955] - Support for LocalDateTime semantics
  • [SPARK-32268] - Bloom Filter Join
  • [SPARK-33772] - Build and Run Spark on Java 17
  • [SPARK-34735] - Add modified configs for SQL execution in UI
  • [SPARK-34755] - Support utils for transforming number formats
  • [SPARK-34806] - Helper class for batch Dataset.observe()
  • [SPARK-35334] - Spark should be more resilient to intermittent K8s flakiness
  • [SPARK-35781] - Support Spark on Apple Silicon on macOS natively on Java 17
  • [SPARK-36194] - Add a logical plan visitor to propagate the distinct attributes
  • [SPARK-36263] - Add Dataset.observe(Observation, Column, Column*) to PySpark
  • [SPARK-36371] - Support raw string literal
  • [SPARK-36425] - PySpark: support CrossValidatorModel get standard deviation of metrics for each paramMap
  • [SPARK-36533] - Allow streaming queries with Trigger.Once to run in multiple batches
  • [SPARK-36674] - Support ILIKE - case-insensitive LIKE (see the sketch after this list)
  • [SPARK-36736] - Support ILIKE (ALL | ANY | SOME) - case-insensitive LIKE
  • [SPARK-37047] - Add overloads for lpad and rpad for BINARY strings
  • [SPARK-37062] - Introduce a new data source for providing consistent set of rows per microbatch
  • [SPARK-37205] - Support mapreduce.job.send-token-conf when starting containers in YARN
  • [SPARK-37207] - Python API does not have isEmpty
  • [SPARK-37219] - support AS OF syntax
  • [SPARK-37375] - Umbrella: Storage Partitioned Join (SPJ)
  • [SPARK-37475] - Add Scale Parameter to Floor and Ceil functions
  • [SPARK-37492] - Optimize Orc test code with withAllNativeOrcReaders
  • [SPARK-37507] - Add the TO_BINARY() function
  • [SPARK-37508] - Add CONTAINS() function
  • [SPARK-37520] - Add the startswith() and endswith() string functions
  • [SPARK-37552] - Add the convert_timezone() function
  • [SPARK-37568] - Support 2-arguments by the convert_timezone() function
  • [SPARK-37582] - Support the binary type by contains()
  • [SPARK-37583] - Support the binary type by startswith() and endswith()
  • [SPARK-37584] - New SQL function: map_contains_key
  • [SPARK-37671] - Support ANSI Aggregation Function of regression
  • [SPARK-37676] - Support ANSI Aggregation Function: percentile_cont
  • [SPARK-37691] - Support ANSI Aggregation Function: percentile_disc
  • [SPARK-37810] - Executor Rolling in Kubernetes environment
  • [SPARK-37863] - Add submitTime for Spark Application
  • [SPARK-37970] - Introduce a new interface on streaming data source to notify the latest seen offset
  • [SPARK-38035] - Add docker tests for built-in JDBC dialects
  • [SPARK-38054] - DS V2 supports list namespaces of MySQL
  • [SPARK-38094] - Parquet: enable matching schema columns by field id
  • [SPARK-38195] - Add the TIMESTAMPADD() function
  • [SPARK-38278] - Add SparkContext.addArchive in PySpark
  • [SPARK-38284] - Add the TIMESTAMPDIFF() function
  • [SPARK-38332] - Add the `DATEADD()` alias for `TIMESTAMPADD()`
  • [SPARK-38345] - Introduce SQL function ARRAY_SIZE
  • [SPARK-38389] - Add the `DATEDIFF` and `DATE_DIFF` aliases for `TIMESTAMPDIFF()`
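
Several of the new SQL functions listed above can be tried directly from SQL. The sketch below is a hedged illustration assuming Spark 3.3.0; all literals, including the 16-character AES key, are invented for the example and not recommendations:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Case-insensitive pattern matching (SPARK-36674).
    spark.sql("SELECT 'Spark' ILIKE 'spark%'").show()

    # New string predicates (SPARK-37508, SPARK-37520).
    spark.sql(
        "SELECT contains('Spark SQL', 'SQL'), "
        "startswith('Spark', 'Sp'), endswith('Spark', 'rk')"
    ).show()

    # Collection helpers (SPARK-37584, SPARK-38345).
    spark.sql(
        "SELECT map_contains_key(map(1, 'a'), 1), array_size(array(1, 2, 3))"
    ).show()

    # Timestamp arithmetic (SPARK-38195, SPARK-38284).
    spark.sql("SELECT timestampadd(HOUR, 2, TIMESTAMP'2022-06-01 00:00:00')").show()

    # AES round trip (SPARK-12567).
    spark.sql(
        "SELECT CAST(aes_decrypt(aes_encrypt('secret', '0000111122223333'), "
        "'0000111122223333') AS STRING)"
    ).show()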

Improvement

  • [SPARK-20384] - supporting value classes over primitives in DataSets
  • [SPARK-27790] - Support ANSI SQL INTERVAL types
  • [SPARK-32797] - Install mypy on the Jenkins CI workers
  • [SPARK-32940] - Collect, first and last should be deterministic aggregate functions
  • [SPARK-32986] - Add bucket scan info in explain output of FileSourceScanExec
  • [SPARK-34079] - Merge non-correlated scalar subqueries
  • [SPARK-34378] - Loosen AvroSerializer validation to allow extra nullable user-provided fields
  • [SPARK-34629] - Python type hints improvement
  • [SPARK-34943] - Upgrade flake8 to 3.8.0 or above in Jenkins
  • [SPARK-35173] - Support adding columns in batch in the PySpark DataFrame
  • [SPARK-35174] - Avoid opening watch when waitAppCompletion is false
  • [SPARK-35221] - Add join hint build side check
  • [SPARK-35320] - from_json cannot parse maps with timestamp as key
  • [SPARK-35442] - Support propagate empty relation through aggregate
  • [SPARK-35460] - invalid `spark.kubernetes.executor.podNamePrefix` causes app to hang
  • [SPARK-35703] - Relax constraint for Spark bucket join and remove HashClusteredDistribution
  • [SPARK-35848] - Spark Bloom Filter, others using treeAggregate can throw OutOfMemoryError
  • [SPARK-35907] - Use Files#createDirectories instead of File#mkdirs
  • [SPARK-35918] - Consolidate logic between AvroSerializer/AvroDeserializer for schema mismatch handling and error messages
  • [SPARK-35956] - Support auto-assigning labels to less important pods (e.g. decommissioning pods)
  • [SPARK-35980] - ThreadAudit test helper should log whether a thread is a Daemon thread
  • [SPARK-35986] - fix pyspark.rdd.RDD.histogram's buckets argument
  • [SPARK-35991] - Add PlanStability suite for TPCH
  • [SPARK-36000] - Support creation and operations of ps.Series/Index with Decimal('NaN')
  • [SPARK-36010] - Upgrade sbt-antlr4 from 0.8.2 to 0.8.3
  • [SPARK-36018] - Some Improvement for Spark Core
  • [SPARK-36038] - Basic speculation metrics at stage level
  • [SPARK-36047] - Replace the handwriting compare methods with static compare methods in Java code
  • [SPARK-36069] - The from_json function should output field name, field type and field value when FAILFAST mode throws an exception
  • [SPARK-36070] - Add time cost info for writing rows out and committing the task.
  • [SPARK-36073] - EquivalentExpressions fixes and improvements
  • [SPARK-36137] - HiveShim always fallback to getAllPartitionsOf regardless of whether directSQL is enabled in remote HMS
  • [SPARK-36147] - [SQL] Log level should be warning if files are not found in BasicWriteStatsTracker
  • [SPARK-36149] - dayofweek documentation for python and R
  • [SPARK-36154] - pyspark documentation doesn't mention week and quarter as valid format arguments to trunc
  • [SPARK-36157] - TimeWindow expression: apply filter before project
  • [SPARK-36158] - pyspark sql/functions documentation for months_between isn't as precise as scala version
  • [SPARK-36160] - pyspark sql/column documentation doesn't always match scala documentation
  • [SPARK-36163] - Propagate correct JDBC properties in JDBC connector provider and add "connectionProvider" option
  • [SPARK-36173] - [CORE] Support getting CPU number in TaskContext
  • [SPARK-36176] - Expose tableExists in pyspark.sql.catalog
  • [SPARK-36183] - Push down limit 1 through Aggregate
  • [SPARK-36207] - Export databaseExists in pyspark.sql.catalog
  • [SPARK-36243] - pyspark catalog.tableExists doesn't work for temporary views
  • [SPARK-36258] - Export functionExists in pyspark catalog
  • [SPARK-36276] - Update maven-checkstyle-plugin to 3.1.2 and checkstyle to 8.43
  • [SPARK-36280] - Remove redundant aliases after RewritePredicateSubquery
  • [SPARK-36319] - Have Observation return Map instead of Row
  • [SPARK-36326] - Use Map.computeIfAbsent to simplify the process of HeapMemoryAllocator.bufferPoolsBySize init new item
  • [SPARK-36334] - Add a new conf to allow K8s API server-side cache for pod listing
  • [SPARK-36351] - Separate partition filters and data filters in PushDownUtils
  • [SPARK-36359] - Coalesce should drop all expressions after the first non-nullable expression
  • [SPARK-36361] - Install coverage in Python 3.9 and PyPy 3 in GitHub Actions image
  • [SPARK-36362] - Omnibus Java code static analyzer warning fixes
  • [SPARK-36373] - DecimalPrecision should only add necessary casts
  • [SPARK-36404] - Support nested columns in ORC vectorized reader for data source v2
  • [SPARK-36405] - Check that error class SQLSTATEs are valid
  • [SPARK-36406] - Do not truncate a failed-write file before deleting it in DiskBlockObjectWriter
  • [SPARK-36407] - Avoid potential integer multiplications overflow risk
  • [SPARK-36410] - Replace anonymous classes with lambda expressions
  • [SPARK-36418] - Use CAST in parsing of dates/timestamps with default pattern
  • [SPARK-36419] - Move final aggregation in RDD.treeAggregate to executor
  • [SPARK-36420] - Use `isEmpty` to improve performance in Pregel's superstep
  • [SPARK-36450] - Remove unused UnresolvedV2Relation
  • [SPARK-36451] - Ivy skips looking for source and doc pom
  • [SPARK-36475] - Add doc about spark.shuffle.service.fetch.rdd.enabled
  • [SPARK-36481] - Expose LogisticRegression.setInitialModel
  • [SPARK-36487] - Improve the executor-exit logging logic
  • [SPARK-36495] - Use Type match to simplify CatalystTypeConverter.toCatalyst
  • [SPARK-36498] - Reorder inner fields of the input query in byName V2 write
  • [SPARK-36502] - Remove jaxb-api from `sql/catalyst` module
  • [SPARK-36503] - Add RowToColumnConverter for BinaryType
  • [SPARK-36536] - Split the JSON/CSV option of datetime format to in read and in write
  • [SPARK-36546] - Make unionByName null-filling behavior work with array of struct columns
  • [SPARK-36550] - Propagate the cause when UDF reflection fails
  • [SPARK-36560] - Deflake PySpark coverage report
  • [SPARK-36566] - Add Spark appname as a label to the executor pods
  • [SPARK-36573] - Add a default value to ORACLE_DOCKER_IMAGE
  • [SPARK-36575] - Executor lost may cause spark stage to hang
  • [SPARK-36576] - Improve range split calculation for Kafka Source minPartitions option
  • [SPARK-36580] - Replace filter and contains with intersect
  • [SPARK-36583] - Upgrade commons-pool2 from 2.6.2 to 2.11.1
  • [SPARK-36602] - Clean up redundant asInstanceOf casts
  • [SPARK-36607] - Support BooleanType in UnwrapCastInBinaryComparison
  • [SPARK-36613] - The return value of the Table.capabilities should use EnumSet instead of HashSet
  • [SPARK-36643] - Add more information in ERROR log while SparkConf is modified when spark.sql.legacy.setCommandRejectsSparkCoreConfs is set
  • [SPARK-36644] - Push down boolean column filter
  • [SPARK-36649] - Support Trigger.AvailableNow on Kafka data source
  • [SPARK-36654] - Drop type ignores from numpy imports
  • [SPARK-36660] - Cotangent is not supported by Dataframe
  • [SPARK-36663] - When the existing field name is a number, an error will be reported when reading the orc file
  • [SPARK-36665] - Add more Not operator optimizations
  • [SPARK-36679] - Remove lz4 hadoop wrapper classes after Hadoop 3.3.2
  • [SPARK-36683] - Support secant and cosecant
  • [SPARK-36688] - Add cot as an R function
  • [SPARK-36689] - Cleanup the deprecated APIs and raise proper warning message.
  • [SPARK-36690] - Clean up deprecated api usage related to commons-pool2
  • [SPARK-36692] - Improve Error statement when requesting thread dump while executor already stopped
  • [SPARK-36703] - Remove the Sort if it is the child of RepartitionByExpression
  • [SPARK-36718] - only collapse projects if we don't duplicate expensive expressions
  • [SPARK-36719] - Supporting Netty Logging at the network layer
  • [SPARK-36721] - Simplify boolean equalities if one side is literal
  • [SPARK-36735] - Adjust overhead of cached relation for DPP
  • [SPARK-36737] - Upgrade commons-io to 2.11.0 and revert change of SPARK-36456
  • [SPARK-36745] - ExtractEquiJoinKeys should return the original predicates on join keys
  • [SPARK-36751] - octet_length/bit_length API is not implemented on Scala/Python/R
  • [SPARK-36797] - Union should resolve nested columns as top-level columns
  • [SPARK-36799] - Pass queryExecution name in CLI
  • [SPARK-36805] - Upgrade kubernetes-client to 5.7.3
  • [SPARK-36808] - Upgrade Kafka to 2.8.1
  • [SPARK-36809] - Remove broadcast for InSubqueryExec used in DPP
  • [SPARK-36814] - Make class ColumnarBatch extendable
  • [SPARK-36821] - Create a test to extend ColumnarBatch
  • [SPARK-36822] - BroadcastNestedLoopJoinExec should use all condition instead of non-equi condition
  • [SPARK-36824] - Add sec and csc as R functions
  • [SPARK-36829] - Refactor null-check code related to collection operations
  • [SPARK-36834] - Namespace log lines in External Shuffle Service
  • [SPARK-36838] - Improve InSet NaN check generated code performance
  • [SPARK-36841] - Provide ANSI syntax `set catalog xxx` to change the current catalog
  • [SPARK-36847] - Explicitly specify error codes when ignoring type hint errors
  • [SPARK-36859] - Upgrade kubernetes-client to 5.8.0
  • [SPARK-36863] - Update dependency manifests for all released artifacts
  • [SPARK-36870] - Introduce INTERNAL_ERROR error class
  • [SPARK-36876] - Support Dynamic Partition pruning for HiveTableScanExec
  • [SPARK-36890] - Use default WebsocketPingInterval for Kubernetes watches
  • [SPARK-36893] - Upgrade Mesos to 1.4.3
  • [SPARK-36894] - RDD.toDF should be synchronized with dispatched variants of SparkSession.createDataFrame
  • [SPARK-36898] - Make the shuffle hash join factor configurable
  • [SPARK-36915] - Pin actions to a full length commit SHA
  • [SPARK-36918] - unionByName shouldn't consider types when comparing structs
  • [SPARK-36933] - Reduce duplication in TaskMemoryManager.acquireExecutionMemory
  • [SPARK-36937] - Change OrcSourceSuite to test both V1 and V2 sources.
  • [SPARK-36943] - Improve error message for missing column
  • [SPARK-36953] - Expose SQL state and error class in PySpark exceptions
  • [SPARK-36961] - Use PEP526 style variable type hints
  • [SPARK-36963] - Add max_by/min_by to sql.functions (see the sketch after this list)
  • [SPARK-36965] - Extend python test runner by logging out the temp output files
  • [SPARK-36967] - Report accurate shuffle block size if it is skewed
  • [SPARK-36972] - Add max_by/min_by API to PySpark
  • [SPARK-36973] - Deduplicate prepare data method for HistogramPlotBase and KdePlotBase
  • [SPARK-36976] - Add max_by/min_by API to SparkR
  • [SPARK-36978] - InferConstraints rule should create IsNotNull constraints on the nested field instead of the root nested type
  • [SPARK-36981] - Upgrade joda-time to 2.10.12
  • [SPARK-36989] - Migrate type hint data tests
  • [SPARK-36992] - Improve byte array sort perf by unifying the getPrefix function of UTF8String and ByteArray
  • [SPARK-36997] - Test type hints against examples
  • [SPARK-37001] - Disable two level of map for final hash aggregation by default
  • [SPARK-37002] - Introduce the 'compute.eager_check' option
  • [SPARK-37003] - Merge INSERT related docs
  • [SPARK-37010] - Remove unnecessary "noqa: F401" comments in pandas-on-Spark
  • [SPARK-37011] - Upgrade flake8 to 3.9.0 or above in Jenkins
  • [SPARK-37022] - Use black as a formatter for the whole PySpark codebase.
  • [SPARK-37025] - Upgrade RoaringBitmap to 0.9.22
  • [SPARK-37032] - Remove unusable link in the Spark 3.2.0 docs
  • [SPARK-37036] - Add util function to raise advice warning for pandas API on Spark.
  • [SPARK-37037] - Improve byte array sort by unifying the compareTo function of UTF8String and ByteArray
  • [SPARK-37041] - Backport HIVE-15025: Secure-Socket-Layer (SSL) support for HMS
  • [SPARK-37044] - Add Row to __all__ in pyspark.sql.types
  • [SPARK-37058] - Add spark-shell command line unit test
  • [SPARK-37071] - OpenHashMap should be serializable without reference tracking
  • [SPARK-37075] - move UDAF expression building from sql/catalyst to sql/core
  • [SPARK-37077] - Annotations for pyspark.sql.context.SQLContext.createDataFrame are broken
  • [SPARK-37080] - Add benchmark tool guide in pull request template
  • [SPARK-37081] - Upgrade the version of RDBMS and corresponding JDBC drivers used by docker-integration-tests
  • [SPARK-37084] - Set spark.sql.files.openCostInBytes to bytesConf
  • [SPARK-37085] - Missing overloads in functions accepting both varargs and single arg collection
  • [SPARK-37087] - merge three relation resolutions into one
  • [SPARK-37101] - In class ShuffleBlockPusher, use config instead of key
  • [SPARK-37104] - RDD and DStream should be covariant
  • [SPARK-37108] - Expose make_date expression in R
  • [SPARK-37113] - Upgrade Parquet to 1.12.2
  • [SPARK-37115] - Replace HiveClient call with hive shim
  • [SPARK-37118] - Add KMeans distanceMeasure param to PythonMLLibAPI
  • [SPARK-37126] - Support TimestampNTZ in PySpark
  • [SPARK-37133] - Add a config to optionally enforce ANSI reserved keywords
  • [SPARK-37134] - documentation - unclear "Using PySpark Native Features"
  • [SPARK-37151] - Avoid executor state sync attempt fail continuously in a short timeframe
  • [SPARK-37160] - Add a config to optionally disable padding for the char type
  • [SPARK-37164] - Add ExpressionBuilder for functions with complex overloads
  • [SPARK-37165] - Add REPEATABLE in TABLESAMPLE to specify seed
  • [SPARK-37176] - JsonSource's infer should have the same exception handle logic as JacksonParser's parse logic
  • [SPARK-37199] - Add a deterministic field to QueryPlan
  • [SPARK-37206] - Upgrade Avro to 1.11.0
  • [SPARK-37208] - Support mapping Spark gpu/fpga resource types to custom YARN resource type
  • [SPARK-37211] - More descriptions and adding an image to the failure message about enabling GitHub Actions
  • [SPARK-37214] - Fail query analysis earlier with invalid identifiers
  • [SPARK-37221] - The collect-like API in SparkPlan should support columnar output
  • [SPARK-37224] - Optimize write path on RocksDB state store provider
  • [SPARK-37237] - Upgrade kubernetes-client to 5.9.0
  • [SPARK-37239] - Avoid unnecessary `setReplication` in Yarn mode
  • [SPARK-37241] - Upgrade Jackson to 2.13.0
  • [SPARK-37243] - Fix the format of the document
  • [SPARK-37244] - Build and test on Python 3.10
  • [SPARK-37256] - Replace `ScalaObjectMapper` with `ClassTagExtensions` to fix compilation warnings
  • [SPARK-37257] - Update setup.py for Python 3.10
  • [SPARK-37263] - Add PandasAPIOnSparkAdviceWarning class
  • [SPARK-37266] - View text can only be SELECT queries
  • [SPARK-37268] - Remove unused method call in FileScanRDD
  • [SPARK-37273] - Hidden File Metadata Support for Spark SQL
  • [SPARK-37283] - Don't try to store a V1 table which contains ANSI intervals in Hive compatible format
  • [SPARK-37284] - Upgrade Jekyll to 4.2.1
  • [SPARK-37289] - Refactoring: remove the unnecessary function with partitionSchemaOption
  • [SPARK-37292] - Removes outer join if it only has DISTINCT on streamed side with alias
  • [SPARK-37298] - Use unique exprId in RewriteAsOfJoin
  • [SPARK-37300] - TaskSchedulerImpl should ignore task-finished events if the task was already in a finished state
  • [SPARK-37307] - Don't obtain JDBC connection for empty partition
  • [SPARK-37327] - Silence the to_pandas() advice log for internal usage
  • [SPARK-37335] - Clarify output of FPGrowth
  • [SPARK-37336] - Migrate _java2py to SparkSession
  • [SPARK-37337] - Improve the API of Spark DataFrame to pandas-on-Spark DataFrame conversion
  • [SPARK-37339] - Add `spark-version` label to driver and executor pods
  • [SPARK-37341] - Avoid unnecessary buffer and copy in full outer sort merge join
  • [SPARK-37342] - Upgrade Apache Arrow to 6.0.0
  • [SPARK-37346] - Link the migration guide for Structured Streaming
  • [SPARK-37352] - Silence the `index_col` advice in `to_spark()` for internal usage
  • [SPARK-37369] - Avoid redundant ColumnarToRow transition on InMemoryTableScan
  • [SPARK-37370] - Add SQL configs to control newly added join code-gen in 3.3
  • [SPARK-37371] - UnionExec should support columnar if all children support columnar
  • [SPARK-37372] - Removing redundant label addition
  • [SPARK-37373] - Collect LocalSparkContext worker logs in case of test failure
  • [SPARK-37380] - Miscellaneous Python lint infra cleanup
  • [SPARK-37386] - simplify OptimizeSkewedJoin to not run the cost evaluator
  • [SPARK-37436] - Uses Python's standard string formatter for SQL API in pandas API on Spark
  • [SPARK-37443] - Provide a profiler for Python/Pandas UDFs
  • [SPARK-37447] - Cache LogicalPlan.isStreaming() in a lazy val
  • [SPARK-37450] - Spark SQL reads unnecessary nested fields (another type of pruning case)
  • [SPARK-37454] - support expressions in time travel timestamp
  • [SPARK-37457] - Update cloudpickle to v2.0.0
  • [SPARK-37458] - Remove unnecessary object serialization on foreachBatch
  • [SPARK-37460] - ALTER (DATABASE|SCHEMA|NAMESPACE) ... SET LOCATION command not documented
  • [SPARK-37462] - Avoid unnecessarily calculating the number of outstanding fetch requests and RPCs
  • [SPARK-37464] - SCHEMA and DATABASE should simply be aliases of NAMESPACE
  • [SPARK-37468] - Support ANSI intervals and TimestampNTZ for UnionEstimation
  • [SPARK-37469] - Unified "fetchWaitTime" and "shuffleReadTime" metrics On UI
  • [SPARK-37474] - Migrate SparkR documentation to pkgdown
  • [SPARK-37484] - Replace Get and getOrElse with getOrElse
  • [SPARK-37485] - Replace map whose expressions produce no result with foreach
  • [SPARK-37493] - Expose driver GC time and duration
  • [SPARK-37503] - Improve SparkSession/PySpark SparkSession startup
  • [SPARK-37505] - mesos module is missing log4j.properties file for UT
  • [SPARK-37506] - Change the never changed 'var' to 'val'
  • [SPARK-37513] - date +/- interval with only day-time fields returns different data type between Spark3.2 and Spark3.1
  • [SPARK-37514] - Remove workarounds due to older pandas
  • [SPARK-37516] - Uses Python's standard string formatter for SQL API in PySpark
  • [SPARK-37530] - Spark reads many paths very slowly through newAPIHadoopFile
  • [SPARK-37531] - Use PyArrow 6.0.0 in Python 3.9 tests at GitHub Action job
  • [SPARK-37540] - Detect more unsupported time travel
  • [SPARK-37558] - Improve spark-sql cli command doc
  • [SPARK-37561] - Avoid loading all functions when obtaining hive's DelegationToken
  • [SPARK-37565] - Upgrade mysql-connector-java to 8.0.27
  • [SPARK-37578] - DSV2 is not updating Output Metrics
  • [SPARK-37580] - Reset numFailures when one of task attempts succeeds
  • [SPARK-37586] - Add cipher mode option and set default cipher mode for aes_encrypt and aes_decrypt
  • [SPARK-37591] - Support the GCM mode by aes_encrypt()/aes_decrypt()
  • [SPARK-37592] - Improve performance of JoinSelection
  • [SPARK-37593] - Reduce default page size by LONG_ARRAY_OFFSET if G1GC and ON_HEAP are used
  • [SPARK-37594] - Make UT test("SPARK-34399: Add job commit duration metrics for DataWritingCommand") more stable
  • [SPARK-37600] - Upgrade to Hadoop 3.3.2
  • [SPARK-37611] - Remove upper limit of spark.kubernetes.memoryOverheadFactor
  • [SPARK-37618] - Support cleaning up shuffle blocks from external shuffle service
  • [SPARK-37627] - Add sorted column in BucketTransform
  • [SPARK-37628] - Upgrade Netty from 4.1.68 to 4.1.72
  • [SPARK-37629] - speed up Expression.canonicalized
  • [SPARK-37646] - Avoid touching Scala reflection APIs in the lit function
  • [SPARK-37649] - Switch default index to distributed-sequence by default in pandas API on Spark
  • [SPARK-37657] - Support str and timestamp for (Series|DataFrame).describe()
  • [SPARK-37666] - Set `GCM` as the default mode in `aes_encrypt()`/`aes_decrypt()`
  • [SPARK-37670] - Support predicate pushdown and column pruning for de-duped CTEs
  • [SPARK-37686] - Migrate remaining pyspark.sql.functions to _invoke_* style
  • [SPARK-37688] - ExecutorMonitor should ignore SparkListenerBlockUpdated event if executor was not active
  • [SPARK-37689] - Expand should be supported in PropagateEmptyRelation
  • [SPARK-37698] - Update ORC to 1.7.2
  • [SPARK-37704] - Update mypy in tests to 0.920
  • [SPARK-37710] - Add detailed log message for java.io.IOException occurring on Kryo flow
  • [SPARK-37712] - Requesting YARN cluster metrics is slow and causes unnecessary delay
  • [SPARK-37715] - Remove ojdbc6 dependency and update docker-integration test docs
  • [SPARK-37726] - Add spill size metrics for sort merge join
  • [SPARK-37731] - refactor and cleanup function lookup in Analyzer
  • [SPARK-37737] - Update Black to 21.12.b0
  • [SPARK-37738] - PySpark date_add only accepts an integer as its second parameter
  • [SPARK-37739] - Upgrade Arrow to 6.0.1
  • [SPARK-37747] - Upgrade zstd-jni to 1.5.1-1
  • [SPARK-37753] - Fine tune logic to demote Broadcast hash join in DynamicJoinSelection
  • [SPARK-37756] - Enable matplotlib test for pandas API on Spark
  • [SPARK-37761] - Install matplotlib in Python 3.9 and PyPy 3 in GitHub Actions image
  • [SPARK-37764] - Preserve bucket information when converting metastore relations to data source relations
  • [SPARK-37776] - Upgrade silencer to 1.7.7
  • [SPARK-37777] - update the SQL syntax of SHOW FUNCTIONS
  • [SPARK-37780] - QueryExecutionListener should also support SQLConf
  • [SPARK-37782] - Make DataFrame.transform take the parameters for the function.
  • [SPARK-37783] - Add @tailrec wherever possible
  • [SPARK-37784] - CodeGenerator.addBufferedState() does not properly handle UDTs
  • [SPARK-37785] - Add Utils.isAtExecutor
  • [SPARK-37786] - StreamingQueryListener should also support SQLConf
  • [SPARK-37789] - Add a class to represent general aggregate functions in DS V2
  • [SPARK-37796] - ByteArrayMethods arrayEquals should fast-skip the alignment check on platforms that support unaligned access
  • [SPARK-37803] - Create new benchmarks for struct deserializer improvement.
  • [SPARK-37812] - When deserializing an Orc struct, reuse the result row when possible
  • [SPARK-37822] - SQL function `split` should return an array of non-nullable elements
  • [SPARK-37826] - Use zstd codec name in ORC file names for hive orc impl
  • [SPARK-37828] - Push down filters through RebalancePartitions
  • [SPARK-37831] - Add task partition id in metrics
  • [SPARK-37832] - Orc struct serializer should look up field converters in an array rather than a linked list
  • [SPARK-37833] - Add `precondition` job for skip the main GitHub Action jobs
  • [SPARK-37835] - Fix the comments in SQLQueryTestSuite.scala/ThriftServerQueryTestSuite.scala to be more explicit
  • [SPARK-37836] - Enable more flake8 rules for PEP 8 compliance
  • [SPARK-37837] - Enable black formatter in dev Python scripts
  • [SPARK-37838] - Upgrade scalatestplus artifacts to 3.3.0-SNAP3
  • [SPARK-37850] - Enable flake's E731 rule in PySpark
  • [SPARK-37851] - Mark org.apache.spark.sql.hive.execution as slow tests
  • [SPARK-37852] - Enable flake's E741 rule in PySpark
  • [SPARK-37854] - Use type match to simplify TestUtils#withHttpConnection
  • [SPARK-37862] - RecordBinaryComparator should fast-skip the alignment check on platforms that support unaligned access
  • [SPARK-37869] - Update pytest-mypy-plugins to 1.9.3
  • [SPARK-37876] - Move SpecificParquetRecordReaderBase.listDirectory to TestUtils
  • [SPARK-37879] - Show test report in GitHub Actions builds from PRs
  • [SPARK-37885] - Allow pandas_udf to take type annotations with future annotations enabled
  • [SPARK-37896] - ConstantColumnVector: a column vector with same values
  • [SPARK-37900] - Use SparkMasterRegex.KUBERNETES_REGEX in SecurityManager
  • [SPARK-37901] - Upgrade Netty from 4.1.72 to 4.1.73
  • [SPARK-37902] - Update annotations to resolve issues detected with mypy==0.931
  • [SPARK-37903] - Replace string_typehints with get_type_hints.
  • [SPARK-37904] - Improve RebalancePartitions in rules of Optimizer
  • [SPARK-37909] - Restore F403 checks
  • [SPARK-37915] - Combine unions if there is a project between them
  • [SPARK-37917] - Push down limit 1 for right side of left semi/anti join
  • [SPARK-37922] - Combine to one cast if we can safely up-cast two casts
  • [SPARK-37924] - Sort table properties by key in SHOW CREATE TABLE on VIEW (v1)
  • [SPARK-37928] - Add Parquet Data Page V2 bench scenario to DataSourceReadBenchmark
  • [SPARK-37934] - Upgrade Jetty version to 9.4.44
  • [SPARK-37949] - Improve Rebalance statistics estimation
  • [SPARK-37950] - Take EXTERNAL as a reserved table property
  • [SPARK-37952] - Add missing statements to ALTER TABLE document.
  • [SPARK-37959] - Fix the UT of checking norm in KMeans & BiKMeans
  • [SPARK-37968] - Upgrade commons-collections3 to commons-collections4
  • [SPARK-37974] - Implement vectorized DELTA_BYTE_ARRAY and DELTA_LENGTH_BYTE_ARRAY encodings for Parquet V2 support
  • [SPARK-37984] - Avoid calculating all outstanding requests to improve performance.
  • [SPARK-37992] - Restore mypy version check in dev/lint-python
  • [SPARK-38002] - Upgrade ZSTD-JNI to 1.5.2-1
  • [SPARK-38006] - Clean up duplicated planner logic for window operator
  • [SPARK-38007] - Update K8s doc to recommend K8s 1.20+
  • [SPARK-38008] - Fix the method description of refill
  • [SPARK-38011] - Remove useless and duplicated configuration in ParquetFileFormat buildReader
  • [SPARK-38014] - Add Parquet Data Page V2 test scenario for BuiltInDataSourceWriteBenchmark
  • [SPARK-38021] - Upgrade dropwizard metrics from 4.2.2 to 4.2.7
  • [SPARK-38028] - Expose Arrow Vector from ArrowColumnVector
  • [SPARK-38033] - The structured streaming processing cannot be started because the commitId and offsetId are inconsistent
  • [SPARK-38036] - Refactor `VersionsSuite` as a subclass of `HiveVersionSuite`
  • [SPARK-38046] - Fix KafkaSource/KafkaMicroBatch flaky test due to non-deterministic timing
  • [SPARK-38051] - Update Roxygen reference to 7.1.2
  • [SPARK-38069] - Improve the window calculation in Structured Streaming
  • [SPARK-38076] - Remove redundant null-check that is covered by a later condition
  • [SPARK-38082] - Update minimum numpy version to 1.15
  • [SPARK-38086] - Make ArrowColumnVector Extendable
  • [SPARK-38089] - Show the root cause exception in TestUtils.assertExceptionMsg
  • [SPARK-38096] - Update sbt to 1.6.2
  • [SPARK-38100] - Remove unused method in `Decimal`
  • [SPARK-38121] - Use SparkSession instead of SQLContext inside PySpark
  • [SPARK-38123] - Uniformly use `DataType.catalogString` as the `targetType` of `QueryExecutionErrors#castingCauseOverflowError`
  • [SPARK-38128] - Show full stacktrace in tests by default in PySpark tests
  • [SPARK-38134] - Upgrade Arrow to 7.0.0
  • [SPARK-38138] - Materialize QueryPlan subqueries
  • [SPARK-38147] - Upgrade shapeless to 2.3.7
  • [SPARK-38148] - Do not add dynamic partition pruning if there exists static partition pruning
  • [SPARK-38149] - Upgrade joda-time to 2.10.13
  • [SPARK-38154] - Set up a new GA job to run tests with ANSI mode
  • [SPARK-38175] - Clean up unused parameters in private methods signature
  • [SPARK-38177] - Fix wrong transformExpressions in Optimizer
  • [SPARK-38183] - Show warning when creating pandas-on-Spark session under ANSI mode.
  • [SPARK-38184] - Fix malformed ExpressionDescription of `decode`
  • [SPARK-38186] - Improve the README of Spark docs
  • [SPARK-38191] - The staging directory of a write job only needs to be initialized once in HadoopMapReduceCommitProtocol
  • [SPARK-38194] - Make memory overhead factor configurable
  • [SPARK-38199] - Delete the unused `dataType` specified in the definition of `IntervalColumnAccessor`
  • [SPARK-38211] - Add SQL migration guide on restoring loose upcast from string
  • [SPARK-38214] - No need to filter windows when windowDuration is multiple of slideDuration
  • [SPARK-38216] - When creating a Hive table, fail early if all the columns are partitioned columns
  • [SPARK-38219] - Support ANSI aggregation function percentile_cont as window function
  • [SPARK-38220] - Upgrade `commons-math3` to 3.6.1
  • [SPARK-38225] - Adjust input `format` of function `to_binary`
  • [SPARK-38229] - Shouldn't check temp/external/ifNotExists in visitReplaceTable during parsing
  • [SPARK-38231] - Upgrade commons-text to 1.9
  • [SPARK-38235] - Add test util for testing grouped aggregate pandas UDF.
  • [SPARK-38240] - Improve RuntimeReplaceable and add a guideline for adding new functions
  • [SPARK-38247] - Unify the output of df.explain and "explain" when the plan is a command
  • [SPARK-38249] - Cleanup unused private methods and fields
  • [SPARK-38256] - Upgrade `org.scalatestplus:mockito` to 3.2.11.0
  • [SPARK-38259] - Upgrade netty to 4.1.74
  • [SPARK-38260] - Remove dependence on commons-net
  • [SPARK-38267] - Replace pattern matches on boolean expressions with conditional statements
  • [SPARK-38269] - Clean up redundant type cast
  • [SPARK-38274] - Upgrade junit4 to 4.13.2 and the corresponding junit-interface to 0.13.3
  • [SPARK-38279] - Pin markupsafe to 2.0.1 to fix linter failure
  • [SPARK-38299] - Clean up deprecated usage of `StringBuilder.newBuilder`
  • [SPARK-38300] - Use ByteStreams.toByteArray to simplify fileToString and resourceToBytes in catalyst.util
  • [SPARK-38301] - Remove unused scala-actors dependency
  • [SPARK-38305] - Check existence of file before untarring/zipping
  • [SPARK-38322] - Support showing query stage runtime statistics in formatted explain mode
  • [SPARK-38323] - Support the hidden file metadata in Streaming
  • [SPARK-38337] - Replace `toIterator` with `iterator` for `IterableLike`/`IterableOnce` to cleanup deprecated api usage
  • [SPARK-38338] - Remove test dependency of hamcrest
  • [SPARK-38339] - Upgrade RoaringBitmap to 0.9.25
  • [SPARK-38342] - Clean up deprecated api usage of Ivy
  • [SPARK-38348] - Upgrade tink to 1.6.1
  • [SPARK-38351] - [TESTS] Replace 'abc symbols with Symbol("abc") in tests
  • [SPARK-38353] - Instrument __enter__ and __exit__ magic methods for pandas API on Spark
  • [SPARK-38360] - Introduce an `exists` function for `TreeNode` to eliminate duplicate code patterns
  • [SPARK-38362] - Move eclipse.m2e Maven plugin config in its own profile
  • [SPARK-38378] - ANTLR grammar definition in separate Parser and Lexer files
  • [SPARK-38382] - Refactor migration guide's sentences
  • [SPARK-38384] - Improve error messages of ParseException from ANTLR
  • [SPARK-38393] - Clean up deprecated usage of GenSeq/GenMap
  • [SPARK-38414] - Remove redundant SuppressWarnings
  • [SPARK-38415] - Update histogram_numeric (x, y) result type to make x == input type
  • [SPARK-38424] - Disallow unused casts and ignores
  • [SPARK-38428] - Check the FetchShuffleBlocks message only once to improve iteration in external shuffle service
  • [SPARK-38434] - Correct semantic of CheckAnalysis.getDataTypesAreCompatibleFn method
  • [SPARK-38437] - Dynamic serialization of Java datetime objects to micros/days
  • [SPARK-38443] - Document config STREAMING_SESSION_WINDOW_MERGE_SESSIONS_IN_LOCAL_PARTITION
  • [SPARK-38484] - Move usage logging instrumentation util functions from pandas module to pyspark.util module
  • [SPARK-38487] - Fix docstrings of nlargest/nsmallest of DataFrame
  • [SPARK-38489] - Aggregate.groupOnly should support foldable expressions
  • [SPARK-38499] - Upgrade Jackson to 2.13.2
  • [SPARK-38500] - Add ASF License header to all Service Provider configuration files
  • [SPARK-38509] - Unregister the TIMESTAMPADD/DIFF functions and remove DATE_ADD/DIFF
  • [SPARK-38529] - Prevent GeneratorNestedColumnAliasing to be applied to non-Explode generators
  • [SPARK-38535] - Add the datetimeUnit enum to the grammar and use it in TIMESTAMPADD/DIFF
  • [SPARK-38540] - Upgrade compress-lzf from 1.0.3 to 1.1.0
  • [SPARK-38549] - SessionWindowStateStoreRestoreExec should provide numRowsDroppedByWatermark metric
  • [SPARK-38558] - Remove unnecessary casts between IntegerType and IntDecimal
  • [SPARK-38565] - Support Left Semi join in row level runtime filters
  • [SPARK-38570] - Incorrect DynamicPartitionPruning caused by Literal
  • [SPARK-38574] - Enrich Avro data source documentation
  • [SPARK-38593] - Incorporate numRowsDroppedByWatermark metric from SessionWindowStateStoreRestoreExec into StateOperatorProgress
  • [SPARK-38607] - Test result report for ANSI mode
  • [SPARK-38609] - Add PYSPARK_PANDAS_USAGE_LOGGER environment variable as an alias of KOALAS_USAGE_LOGGER
  • [SPARK-38623] - Add more comments and tests for HashShuffleSpec
  • [SPARK-38628] - Complete the copy method in subclasses of InternalRow, ArrayData, and MapData to safely copy their instances.
  • [SPARK-38650] - Better ParseException message for char without length
  • [SPARK-38654] - Show default index type in SQL plans for pandas API on Spark
  • [SPARK-38656] - Show options for Pandas API on Spark in UI
  • [SPARK-38657] - Rename "SQL" to "SQL/DataFrame" in Spark UI
  • [SPARK-38709] - remove trailing $ from function class name in sql-expression-schema.md
  • [SPARK-38710] - use SparkArithmeticException for Arithmetic overflow runtime errors
  • [SPARK-38778] - Replace http with https for project url in pom
  • [SPARK-38796] - Implement the to_number and try_to_number SQL functions according to a new specification
  • [SPARK-38816] - Wrong comment in the random matrix generator in the Spark ALS algorithm
  • [SPARK-38825] - Add a test to cover parquet notIn filter
  • [SPARK-38833] - PySpark applyInPandas should allow to return empty DataFrame without columns
  • [SPARK-38892] - Fix the unit test of the schema equality assertion
  • [SPARK-38924] - Update dataTables to 1.10.25 for security issue
  • [SPARK-38929] - Improve error messages for cast failures in ANSI
  • [SPARK-38936] - Script transform feed thread should have name
  • [SPARK-38939] - Support ALTER TABLE ... DROP [IF EXISTS] COLUMN .. syntax
  • [SPARK-38953] - Document PySpark common exceptions / errors
  • [SPARK-38957] - Use multipartIdentifier for parsing table-valued functions
  • [SPARK-38972] - [SQL] Support <paramName> style for error message parameters
  • [SPARK-39008] - Change ASF as a single author in Spark distribution
  • [SPARK-39030] - Rename sum to avoid shadowing the built-in Python function
  • [SPARK-39049] - Remove unneeded pass
  • [SPARK-39154] - Remove outdated statements on distributed-sequence default index
  • [SPARK-39155] - Access to JVM through passed-in GatewayClient during type conversion
  • [SPARK-39174] - Catalogs loading swallows missing classname for ClassNotFoundException
  • [SPARK-39186] - make skew consistent with pandas
  • [SPARK-39215] - Reduce Py4J calls in pyspark.sql.utils.is_timestamp_ntz_preferred
  • [SPARK-39240] - Source and binary releases use different tools to generate integrity hashes
  • [SPARK-39295] - Improve documentation of pandas API support list.
  • [SPARK-39361] - Stop using Log4J2's extended throwable logging pattern in default logging configurations
  • [SPARK-39392] - Refine ANSI error messages and remove 'To return NULL instead'
  • [SPARK-39633] - Dataframe options for time travel via `timestampAsOf` should respect both formats of specifying timestamp
  • [SPARK-44700] - Rule OptimizeCsvJsonExprs should not be applied to expression like from_json(regexp_replace)
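
Two of the API-facing improvements above combine nicely in one example: max_by/min_by in sql.functions (SPARK-36963/SPARK-36972, flagged above) and argument forwarding in DataFrame.transform (SPARK-37782). A minimal PySpark sketch (assuming Spark 3.3.0; the sample data and the add_shift helper are invented for illustration):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("a", 1), ("b", 3), ("c", 2)], ["name", "score"])

    # max_by/min_by: the value of one column at the row where another
    # column is maximal/minimal ("b" and "a" here).
    df.select(F.max_by("name", "score"), F.min_by("name", "score")).show()

    # DataFrame.transform now forwards extra positional/keyword arguments
    # to the supplied function.
    def add_shift(sdf, value):
        return sdf.withColumn("shifted", F.col("score") + value)

    df.transform(add_shift, 10).show()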

Test

  • [SPARK-29871] - Flaky test: ImageFileFormatTest.test_read_images
  • [SPARK-32391] - Install pydata_sphinx_theme in Jenkins machines
  • [SPARK-32666] - Install ipython and nbsphinx in Jenkins for Binder integration
  • [SPARK-33242] - Install numpydoc in Jenkins machines
  • [SPARK-35345] - Add BloomFilter Benchmark test for Parquet
  • [SPARK-36048] - Wrong HealthTrackerSuite.allExecutorAndHostIds
  • [SPARK-36151] - Enable MiMa for Scala 2.13 artifacts after Spark 3.2.0 release
  • [SPARK-36165] - Fix SQL doc generation in GitHub Action
  • [SPARK-36204] - Deduplicate Scala 2.13 daily build
  • [SPARK-36820] - Disable LZ4 test for Hadoop 2.7 profile
  • [SPARK-36839] - Add daily build with Hadoop 2 profile in GitHub Actions build
  • [SPARK-36883] - Upgrade R version to 4.1.1 in CI images
  • [SPARK-36929] - Remove Unused Method EliminateSubqueryAliasesSuite#assertEquivalent
  • [SPARK-37218] - Parameterize `spark.sql.shuffle.partitions` in TPCDSQueryBenchmark
  • [SPARK-37223] - Fix unit test check in JoinHintSuite
  • [SPARK-37322] - `run_scala_tests` should respect test module order
  • [SPARK-37367] - Reenable exception test in DDLParserSuite.create view -- basic
  • [SPARK-37368] - Deflake TPC-DS build
  • [SPARK-37384] - Flaky test: HealthTrackerIntegrationSuite.If preferred node is bad, without excludeOnFailure job will fail
  • [SPARK-37453] - Split TPC-DS build in GitHub Actions
  • [SPARK-37813] - ORC read benchmark should enable vectorization for nested column
  • [SPARK-37823] - Add `is-changed.py` dev script
  • [SPARK-37871] - Use python3 instead of python in BaseScriptTransformation tests
  • [SPARK-37908] - Refactoring on pod label test in BasicFeatureStepSuite
  • [SPARK-37921] - Update OrcReadBenchmark to use Hive ORC reader as the basis
  • [SPARK-37987] - Flaky Test: StreamingAggregationSuite.changing schema of state when restarting query - state format version 1
  • [SPARK-38031] - Update document type conversion for Pandas UDFs (pyarrow 6.0.1, pandas 1.4.0, Python 3.9)
  • [SPARK-38032] - Upgrade Arrow version < 7.0.0 for Python UDF tests in SQL
  • [SPARK-38040] - Enable binary compatibility check for APIs in Catalyst, KVStore and Avro modules
  • [SPARK-38045] - More strict validation on plan check for stream-stream join unit test
  • [SPARK-38080] - Flaky test: StreamingQueryManagerSuite: 'awaitAnyTermination with timeout and resetTerminated'
  • [SPARK-38084] - Support `SKIP_PYTHON` and `SKIP_R` in `run-tests.py`
  • [SPARK-38136] - Update GitHub Action test image
  • [SPARK-38142] - Move ArrowColumnVectorSuite to org.apache.spark.sql.vectorized
  • [SPARK-38297] - Fix mypy failure on DataFrame.to_numpy in pandas API on Spark
  • [SPARK-38532] - Add test case for invalid gapDuration of sessionwindow
  • [SPARK-38780] - PySpark docs build should fail when there are warnings
  • [SPARK-38786] - Test Bug in StatisticsSuite "change stats after add/drop partition command"
  • [SPARK-38800] - Explicitly document the supported pandas version
  • [SPARK-38927] - Skip NumPy/Pandas tests in `test_rdd.py` if not available
  • [SPARK-38928] - Skip Pandas UDF test in `QueryCompilationErrorsSuite` if not available
  • [SPARK-39019] - Use `withTempPath` to clean up temporary data directory after `SPARK-37463: read/write Timestamp ntz to Orc with different time zone`
  • [SPARK-39252] - Flaky Test: pyspark.sql.tests.test_dataframe.DataFrameTests test_df_is_empty
  • [SPARK-39253] - Improve PySpark API reference to be more readable
  • [SPARK-39273] - Make PandasOnSparkTestCase inherit ReusedSQLTestCase
  • [SPARK-39334] - Change to exclude `slf4j-reload4j` for `hadoop-minikdc`
  • [SPARK-39394] - Make the PySpark Structured Streaming page more readable

Wish

  • [SPARK-36611] - Remove unused listener in HiveThriftServer2AppStatusStore
  • [SPARK-37931] - Quote the column name if needed
  • [SPARK-38242] - Sort the SparkSubmit debug output

Task

  • [SPARK-35973] - DataSourceV2: Support SHOW CATALOGS
  • [SPARK-35996] - Setting version to 3.3.0-SNAPSHOT
  • [SPARK-36034] - Incorrect datetime filter when reading Parquet files written in legacy mode
  • [SPARK-36148] - Missing validation of regexp_replace inputs
  • [SPARK-36223] - TPCDSQueryTestSuite should run with different config set
  • [SPARK-36888] - Sha2 with bit_length 512 not being tested
  • [SPARK-36975] - Refactor the Hive client call-collection logic in HiveClientImpl
  • [SPARK-36980] - Support queries with CTEs in INSERT
  • [SPARK-37050] - Update conda installation instructions
  • [SPARK-37067] - DateTimeUtils.stringToTimestamp() incorrectly rejects timezone without colon
  • [SPARK-37136] - Remove code about Hive built-in functions
  • [SPARK-37437] - Remove unused hive-2.3 profile
  • [SPARK-37445] - Update hadoop-profile
  • [SPARK-37446] - Use the invoke method for hive-2.3.9-related API calls
  • [SPARK-37461] - In yarn-client mode, the client's appId value is null
  • [SPARK-37471] - Support nested bracketed comments in spark-sql
  • [SPARK-37497] - Promote ExecutorPods[PollingSnapshot|WatchSnapshot]Source to DeveloperApi
  • [SPARK-37555] - spark-sql should pass the last unclosed comment to the backend and throw an exception on execution
  • [SPARK-37631] - Code clean up on promoting strings in math functions
  • [SPARK-37716] - Allow LateralJoin node to host non-deterministic expressions when the outer query is a single row relation
  • [SPARK-37724] - ANSI mode: disable ANSI reserved keywords by default
  • [SPARK-37750] - ANSI mode: optionally return null result if element not exists in array/map
  • [SPARK-37766] - Regenerate benchmark results
  • [SPARK-37815] - Fix the github action job "test_report"
  • [SPARK-37817] - Remove unreachable code in complexTypeExtractors.scala
  • [SPARK-37906] - spark-sql should not pass the last simple comment to the backend
  • [SPARK-37907] - StaticInvoke should support ConstantFolding
  • [SPARK-37951] - Refactor ImageFileFormatSuite
  • [SPARK-37965] - Remove check field name when reading/writing existing data in ORC
  • [SPARK-37967] - ConstantFolding/Literal.create should support ObjectType
  • [SPARK-37969] - Hive Serde insert should check schema before execution
  • [SPARK-37985] - Fix flaky test SPARK-37578
  • [SPARK-38003] - Differentiate scalar and table function lookup in LookupFunctions
  • [SPARK-38063] - Support SQL split_part function (see the sketch after this list)
  • [SPARK-38122] - Update App Key of DocSearch
  • [SPARK-38144] - Remove unused `spark.storage.safetyFraction` config
  • [SPARK-38150] - Update comment of RelationConversions
  • [SPARK-38153] - Remove option newlines.topLevelStatements in scalafmt.conf
  • [SPARK-38189] - Add priority scheduling doc for Spark on K8S
  • [SPARK-38197] - Improve error message of BlockManager.fetchRemoteManagedBuffer
  • [SPARK-38215] - InsertIntoHiveDir should support metadata conversion
  • [SPARK-38237] - Introduce a new config to require all cluster keys on Aggregate
  • [SPARK-38318] - Regression when replacing a dataset view
  • [SPARK-38358] - Add migration guide for spark.sql.hive.convertMetastoreInsertDir and spark.sql.hive.convertMetastoreCtas
  • [SPARK-38419] - Remove tab character and trailing space in script
  • [SPARK-38449] - Do not call createTable when ifNotExist=true and the table exists
  • [SPARK-38566] - Revert the parser changes for DEFAULT column support
  • [SPARK-38784] - Upgrade Jetty to 9.4.46
  • [SPARK-39178] - When throwing SparkFatalException, the root cause should be shown too
  • [SPARK-39367] - Review and fix issues in Scala/Java API docs of SQL module
  • [SPARK-39371] - Review and fix issues in Scala/Java API docs of Core module
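
Of the changes above, the new split_part function from SPARK-38063 is simple to demonstrate. A minimal PySpark sketch, assuming Spark 3.3+ and a local session; the sample address string is made up for illustration:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").getOrCreate()

    # split_part(str, delimiter, partNum) splits str on the delimiter and
    # returns the 1-based partNum-th field; a negative index counts from
    # the end of the string.
    spark.sql("SELECT split_part('192.168.0.1', '.', 1) AS first_octet").show()
    spark.sql("SELECT split_part('192.168.0.1', '.', -1) AS last_octet").show()

    spark.stop()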

Dependency upgrade

  • [SPARK-38287] - Upgrade h2 from 2.0.204 to 2.1.210 in /sql/core
  • [SPARK-38291] - Upgrade postgresql from 42.3.0 to 42.3.3
  • [SPARK-38303] - Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev
  • [SPARK-39099] - Add dependencies to Dockerfile for building Spark releases
  • [SPARK-39183] - Upgrade Apache Xerces Java to 2.12.2

Question

  • [SPARK-37788] - ColumnOrName vs Column in PySpark Functions module

Umbrella

  • [SPARK-34705] - Add code-gen for all join types of sort merge join
  • [SPARK-36504] - Improve test coverage for pandas API on Spark
  • [SPARK-36707] - Support to specify index type and name in pandas API on Spark
  • [SPARK-37093] - Inline type hints python/pyspark/streaming
  • [SPARK-37094] - Inline type hints for files in python/pyspark
  • [SPARK-37275] - Support ANSI intervals in PySpark
  • [SPARK-37395] - Inline type hint files for files in python/pyspark/ml
  • [SPARK-37396] - Inline type hint files for files in python/pyspark/mllib
  • [SPARK-37814] - Migrating from log4j 1 to log4j 2
  • [SPARK-37886] - Use ComparisonTestBase to reduce redundant test code
  • [SPARK-38396] - Improve K8s Integration Tests

Documentation

  • [SPARK-31907] - Spark SQL functions documentation refers to SQL API documentation without linking to it
  • [SPARK-36377] - Fix documentation in spark-env.sh.template
  • [SPARK-36474] - Mention pandas API on Spark in Spark overview pages
  • [SPARK-37550] - from_json documentation lacks examples for complex types
  • [SPARK-37624] - Suppress warnings for live pandas-on-Spark quickstart notebooks
  • [SPARK-37692] - Fix wrong description in sql-migration-guide
  • [SPARK-37718] - Demo SQL is incorrect
  • [SPARK-37818] - Add option for SHOW CREATE TABLE command
  • [SPARK-37925] - Update document to mention the workaround for YARN-11053
  • [SPARK-38606] - Update document to make a good guide of multiple versions of the Spark Shuffle Service
  • [SPARK-38629] - Two links beneath Spark SQL Guide/Data Sources do not work properly
  • [SPARK-38933] - Add examples of window functions into SQL docs (see the sketch after this list)
  • [SPARK-39001] - Document which options are unsupported in CSV and JSON functions
  • [SPARK-39032] - Incorrectly formatted examples in pyspark.sql.functions.when
  • [SPARK-39219] - Promote Structured Streaming over Spark Streaming
  • [SPARK-39237] - Update the ANSI SQL mode documentation
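
SPARK-38933 adds window-function examples to the SQL docs. The sketch below shows the kind of query those examples cover, a running sum per group; the view name and columns (vals, grp, value) are invented for illustration:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[*]").getOrCreate()

    spark.createDataFrame(
        [("a", 1), ("a", 2), ("b", 3)], ["grp", "value"]
    ).createOrReplaceTempView("vals")

    # Running sum within each group, ordered by value.
    spark.sql("""
        SELECT grp, value,
               SUM(value) OVER (PARTITION BY grp ORDER BY value) AS running_sum
        FROM vals
    """).show()

    spark.stop()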
