Release Notes - Spark - Version 3.4.0 - HTML format

Sub-task

  • [SPARK-28330] - ANSI SQL: Top-level <result offset clause> in <query expression> (see the sketch after this list)
  • [SPARK-28516] - Data Type Formatting Functions: `to_char` (see the sketch after this list)
  • [SPARK-30220] - Support Filter expressions that use IN/EXISTS predicate sub-queries
  • [SPARK-30661] - KMeans blockify input vectors
  • [SPARK-30835] - Add support for YARN decommissioning & pre-emption
  • [SPARK-33236] - Enable Push-based shuffle service to store state in NM level DB for work preserving restart
  • [SPARK-33573] - Server side metrics related to push-based shuffle
  • [SPARK-34305] - Unify v1 and v2 ALTER TABLE .. SET SERDE tests
  • [SPARK-36114] - Support subqueries with correlated non-equality predicates
  • [SPARK-36124] - Support set operators to be on correlation paths
  • [SPARK-36511] - Remove ColumnIO once PARQUET-2050 is released in Parquet 1.13
  • [SPARK-36620] - Client side related push-based shuffle metrics
  • [SPARK-37194] - Avoid unnecessary sort in FileFormatWriter when there is no dynamic partition
  • [SPARK-37287] - Pull out dynamic partition and bucket sort from FileFormatWriter
  • [SPARK-37378] - SPJ: Convert V2 Transform expressions into catalyst expressions and load their associated functions from V2 FunctionCatalog
  • [SPARK-37425] - Inline type hints for python/pyspark/mllib/recommendation.py
  • [SPARK-37599] - Unify v1 and v2 ALTER TABLE .. SET LOCATION tests
  • [SPARK-37623] - Support ANSI Aggregate Function: regr_intercept
  • [SPARK-37672] - Support ANSI Aggregate Function: regr_sxx
  • [SPARK-37681] - Support ANSI Aggregate Function: regr_sxy
  • [SPARK-37702] - Support ANSI Aggregate Function: regr_syy
  • [SPARK-37888] - Unify v1 and v2 DESCRIBE TABLE tests
  • [SPARK-37938] - Use error classes in the parsing errors of partitions
  • [SPARK-37939] - Use error classes in the parsing errors of properties
  • [SPARK-37945] - Use error classes in the execution errors of arithmetic ops
  • [SPARK-37982] - Use error classes in the execution errors related to unsupported input type
  • [SPARK-38005] - Support cleaning up merged shuffle files and state from external shuffle service
  • [SPARK-38106] - Use error classes in the parsing errors of functions
  • [SPARK-38108] - Use error classes in the compilation errors of UDF/UDAF
  • [SPARK-38257] - Upgrade rocksdbjni to 7.0.3
  • [SPARK-38270] - SQL CLI AM should keep the same exit code as the client
  • [SPARK-38335] - Parser changes for DEFAULT column support
  • [SPARK-38336] - Catalyst changes for DEFAULT column support
  • [SPARK-38441] - Support string and bool `regex` in `Series.replace`
  • [SPARK-38479] - Add `Series.duplicated` to indicate duplicate Series values.
  • [SPARK-38493] - Improve the test coverage for pyspark/pandas module
  • [SPARK-38496] - Improve the test coverage for pyspark/sql module
  • [SPARK-38552] - Implement `keep` parameter of `frame.nlargest/nsmallest` to decide how to resolve ties
  • [SPARK-38576] - Implement `numeric_only` parameter for `DataFrame/Series.rank` to rank numeric columns only
  • [SPARK-38588] - Validate input dataset of ml.classification
  • [SPARK-38608] - Implement `bool_only` parameter of `DataFrame.all` and `DataFrame.any`
  • [SPARK-38669] - Validate input dataset of ml.clustering
  • [SPARK-38678] - Enable RocksDB tests on Apple Silicon on MacOS
  • [SPARK-38686] - Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`
  • [SPARK-38687] - Use error classes in the compilation errors of generators
  • [SPARK-38688] - Use error classes in the compilation errors of deserializer
  • [SPARK-38689] - Use error classes in the compilation errors of not allowed DESC PARTITION
  • [SPARK-38697] - Extend SparkSessionExtensions to inject rules into AQE Optimizer
  • [SPARK-38700] - Use error classes in the execution errors of save mode
  • [SPARK-38701] - Inline IllegalStateException out from QueryExecutionErrors
  • [SPARK-38704] - Support string `inclusive` parameter of `Series.between`
  • [SPARK-38718] - Test the error class: AMBIGUOUS_FIELD_NAME
  • [SPARK-38720] - Test the error class: CANNOT_CHANGE_DECIMAL_PRECISION
  • [SPARK-38721] - Test the error class: CANNOT_PARSE_DECIMAL
  • [SPARK-38722] - Test the error class: CAST_CAUSES_OVERFLOW
  • [SPARK-38724] - Test the error class: DIVIDE_BY_ZERO
  • [SPARK-38725] - Test the error class: DUPLICATE_KEY
  • [SPARK-38726] - Support `how` parameter of `MultiIndex.dropna`
  • [SPARK-38727] - Test the error class: FAILED_EXECUTE_UDF
  • [SPARK-38728] - Test the error class: FAILED_RENAME_PATH
  • [SPARK-38729] - Test the error class: FAILED_SET_ORIGINAL_PERMISSION_BACK
  • [SPARK-38730] - Move tests for the grouping error classes to QueryCompilationErrorsSuite
  • [SPARK-38731] - Move the tests `GROUPING_SIZE_LIMIT_EXCEEDED` to QueryCompilationErrorsSuite
  • [SPARK-38732] - Test the error class: INCOMPARABLE_PIVOT_COLUMN
  • [SPARK-38733] - Test the error class: INCOMPATIBLE_DATASOURCE_REGISTER
  • [SPARK-38734] - Test the error class: INDEX_OUT_OF_BOUNDS
  • [SPARK-38736] - Test the error classes: INVALID_ARRAY_INDEX*
  • [SPARK-38737] - Test the error classes: INVALID_FIELD_NAME
  • [SPARK-38738] - Test the error class: INVALID_FRACTION_OF_SECOND
  • [SPARK-38739] - Test the error class: INVALID_INPUT_SYNTAX_FOR_NUMERIC_TYPE
  • [SPARK-38740] - Test the error class: INVALID_JSON_SCHEMA_MAPTYPE
  • [SPARK-38741] - Test the error class: MAP_KEY_DOES_NOT_EXIST*
  • [SPARK-38742] - Move the tests `MISSING_COLUMN` to QueryCompilationErrorsSuite
  • [SPARK-38744] - Test the pivot error classes
  • [SPARK-38745] - Move the tests for `NON_PARTITION_COLUMN` to QueryCompilationErrorsSuite
  • [SPARK-38746] - Move the tests for `PARSE_EMPTY_STATEMENT` to QueryParsingErrorsSuite
  • [SPARK-38747] - Move the tests for `PARSE_SYNTAX_ERROR` to QueryParsingErrorsSuite
  • [SPARK-38748] - Test the error class: PIVOT_VALUE_DATA_TYPE_MISMATCH
  • [SPARK-38749] - Test the error class: RENAME_SRC_PATH_NOT_FOUND
  • [SPARK-38750] - Test the error class: SECOND_FUNCTION_ARGUMENT_NOT_INTEGER
  • [SPARK-38751] - Test the error class: UNRECOGNIZED_SQL_TYPE
  • [SPARK-38752] - Test the error class: UNSUPPORTED_DATATYPE
  • [SPARK-38753] - Move the tests for `WRITING_JOB_ABORTED` to QueryExecutionErrorsSuite
  • [SPARK-38765] - Implement `inplace` parameter of `Series.clip`
  • [SPARK-38768] - If the limit can be pushed down and the data source has only one partition, DS V2 should not apply the limit again
  • [SPARK-38774] - Implement Series.autocorr
  • [SPARK-38775] - Clean up validation functions
  • [SPARK-38785] - Implement Series.ewm and DataFrame.ewm (see the sketch after this list)
  • [SPARK-38791] - Output parameter values of error classes in SQL style
  • [SPARK-38793] - Support `return_indexer` parameter of `Index/MultiIndex.sort_values`
  • [SPARK-38795] - Support INSERT INTO user-specified column lists with DEFAULT values (see the sketch after this list)
  • [SPARK-38811] - Support ALTER TABLE ADD COLUMN commands with DEFAULT values
  • [SPARK-38820] - Refresh dtype when astype("category")
  • [SPARK-38821] - test_nsmallest test failed due to pandas 1.4.0-1.4.2 bug
  • [SPARK-38822] - Raise IndexError when insert loc is out of bounds
  • [SPARK-38827] - Improve the test coverage for pyspark/find_spark_home.py
  • [SPARK-38834] - Update the version of TimestampNTZ related changes as 3.4.0
  • [SPARK-38837] - Implement `dropna` parameter of `SeriesGroupBy.value_counts`
  • [SPARK-38838] - Support ALTER TABLE ALTER COLUMN commands with DEFAULT values
  • [SPARK-38840] - Enable spark.sql.parquet.enableNestedColumnVectorizedReader on master branch by default
  • [SPARK-38844] - Implement Series.interpolate and DataFrame.interpolate
  • [SPARK-38854] - Improve the test coverage for pyspark/statcounter.py
  • [SPARK-38857] - Series name should be preserved in Series.mode()
  • [SPARK-38859] - iloc setitem failed due to "Cannot convert * into bool"
  • [SPARK-38863] - Implement `skipna` parameter of `DataFrame.all`
  • [SPARK-38865] - Update document of JDBC options for pushDownAggregate and pushDownLimit
  • [SPARK-38869] - Respect Table capability `ACCEPT_ANY_SCHEMA` in default column resolution
  • [SPARK-38877] - CLONE - Improve the test coverage for pyspark/find_spark_home.py
  • [SPARK-38878] - CLONE - Improve the test coverage for pyspark/statcounter.py
  • [SPARK-38879] - Improve the test coverage for pyspark/rddsampler.py
  • [SPARK-38880] - Implement `numeric_only` parameter of `GroupBy.max/min`
  • [SPARK-38890] - Implement `ignore_index` of `DataFrame.sort_index`.
  • [SPARK-38891] - Skip allocating vectors for repetition & definition levels when possible
  • [SPARK-38894] - Exclude pyspark.cloudpickle in test coverage report
  • [SPARK-38897] - DS V2 supports push down string functions
  • [SPARK-38899] - DS V2 supports push down datetime functions
  • [SPARK-38901] - DS V2 supports push down misc functions
  • [SPARK-38903] - Implement `ignore_index` of `Series.sort_values` and `Series.sort_index`
  • [SPARK-38907] - Implement DataFrame.corrwith
  • [SPARK-38913] - Output identifiers in error messages in SQL style
  • [SPARK-38937] - Support the `limit_direction` parameter in interpolate
  • [SPARK-38938] - Implement `inplace` and `columns` parameters of `Series.drop`
  • [SPARK-38943] - Support ignore_na in EWM
  • [SPARK-38946] - Generate a new dataframe instead of operating in place in setitem
  • [SPARK-38947] - Support Groupby positional indexing
  • [SPARK-38949] - Wrap SQL statements by double quotes in error messages
  • [SPARK-38952] - Implement `numeric_only` of `GroupBy.first` and `GroupBy.last`
  • [SPARK-38959] - DataSource V2: Support runtime group filtering in row-level commands
  • [SPARK-38978] - DS V2 supports push down OFFSET operator
  • [SPARK-38980] - Move error class tests requiring ANSI SQL mode to QueryExecutionAnsiErrorsSuite
  • [SPARK-38982] - test_categories_setter failed due to pandas bug
  • [SPARK-38984] - Allow comparison between TimestampNTZ and Timestamp/Date
  • [SPARK-38986] - Prepend error class tag to error messages
  • [SPARK-38987] - Handle fallback when merged shuffle blocks are corrupted and spark.shuffle.detectCorrupt is set to true
  • [SPARK-38989] - Implement `ignore_index` of `DataFrame/Series.sample`
  • [SPARK-38993] - Implement DataFrame.boxplot and DataFrame.plot.box
  • [SPARK-38996] - Use double quotes for types in error messages
  • [SPARK-39000] - Convert bools to ints in basic statistical functions of GroupBy objects
  • [SPARK-39006] - Show a directional error message for PVC Dynamic Allocation Failure
  • [SPARK-39007] - Use double quotes for SQL configs in error messages
  • [SPARK-39018] - Add support for YARN decommissioning when ESS is Disabled
  • [SPARK-39028] - Use SparkDateTimeException when casting to datetime types failed
  • [SPARK-39029] - Improve the test coverage for pyspark/broadcast.py
  • [SPARK-39037] - DS V2 Top N push-down supports order by expressions
  • [SPARK-39047] - Replace the error class ILLEGAL_SUBSTRING by INVALID_PARAMETER_VALUE
  • [SPARK-39053] - test_multi_index_dtypes failed due to index mismatch
  • [SPARK-39054] - GroupByTest failed due to axis Length mismatch
  • [SPARK-39077] - Implement `skipna` of basic statistical functions of DataFrame and Series
  • [SPARK-39078] - Support UPDATE commands with DEFAULT values
  • [SPARK-39081] - Implement DataFrame.resample and Series.resample
  • [SPARK-39085] - Move error message of INCONSISTENT_BEHAVIOR_CROSS_VERSION to the json file
  • [SPARK-39086] - Support UDT in Spark Parquet vectorized reader
  • [SPARK-39087] - Improve error messages: step 1
  • [SPARK-39095] - Adjust `GroupBy.std` to match pandas 1.4
  • [SPARK-39096] - Support MERGE commands with DEFAULT values
  • [SPARK-39097] - Improve the test coverage for pyspark/taskcontext.py
  • [SPARK-39108] - Show hints for try_add/try_subtract/try_multiply in error messages of int/long overflow
  • [SPARK-39109] - Adjust `GroupBy.mean/median` to match pandas 1.4
  • [SPARK-39114] - ml.optim.aggregator avoid re-allocating buffers
  • [SPARK-39121] - Fix doc format/syntax error
  • [SPARK-39139] - DS V2 supports push down DS V2 UDF
  • [SPARK-39143] - Support CSV file scans with DEFAULT values
  • [SPARK-39148] - DS V2 aggregate push down can work with OFFSET or LIMIT
  • [SPARK-39163] - Throw an exception w/ error class for an invalid bucket file
  • [SPARK-39164] - Wrap asserts/illegal state exceptions by the INTERNAL_ERROR exception in actions
  • [SPARK-39165] - Replace sys.error by IllegalStateException in Spark SQL
  • [SPARK-39167] - Throw an exception w/ an error class for multiple rows from a subquery used as an expression
  • [SPARK-39170] - ImportError when creating the pyspark.pandas document "Supported APIs" if the pandas version is too old
  • [SPARK-39179] - Improve the test coverage for pyspark/shuffle.py
  • [SPARK-39187] - Remove SparkIllegalStateException
  • [SPARK-39189] - interpolate supports limit_area
  • [SPARK-39197] - Implement `skipna` parameter of `GroupBy.all`
  • [SPARK-39200] - Stream is corrupted Exception while fetching the blocks from fallback storage system
  • [SPARK-39201] - Implement `ignore_index` of `DataFrame.explode` and `DataFrame.drop_duplicates`
  • [SPARK-39211] - Support JSON file scans with default values
  • [SPARK-39214] - Improve errors related to CAST
  • [SPARK-39223] - Implement skew and kurt in Rolling/RollingGroupby/Expanding/ExpandingGroupby
  • [SPARK-39230] - Support ANSI Aggregate Function: regr_slope
  • [SPARK-39234] - Code clean up in SparkThrowableHelper.getMessage
  • [SPARK-39236] - Make CreateTable API and ListTables API compatible
  • [SPARK-39243] - Describe the rules of quoting elements in error messages
  • [SPARK-39246] - Implement Groupby.skew
  • [SPARK-39255] - Improve error messages: step 2
  • [SPARK-39263] - GetTable, TableExists and DatabaseExists
  • [SPARK-39265] - Support Parquet file scans with DEFAULT values
  • [SPARK-39270] - JDBC dialect supports registering dialect specific functions
  • [SPARK-39271] - Upgrade pandas to 1.4.3
  • [SPARK-39285] - Spark should not check field names when reading data
  • [SPARK-39294] - Support Orc file scans with DEFAULT values
  • [SPARK-39309] - '_SubTest' object has no attribute 'elapsed_time'
  • [SPARK-39310] - Rename `required_same_anchor`
  • [SPARK-39314] - Respect ps.concat sort parameter to follow pandas behavior
  • [SPARK-39316] - Merge PromotePrecision and CheckOverflow into decimal binary arithmetic
  • [SPARK-39317] - groupby.apply doctest fails when SPARK_CONF_ARROW_ENABLED is disabled
  • [SPARK-39319] - Make query context as part of SparkThrowable
  • [SPARK-39324] - Log ExecutorDecommission as INFO level in TaskSchedulerImpl
  • [SPARK-39326] - Replace "NaN" with a real "None" value in indexes in doctests
  • [SPARK-39335] - DescribeTableCommand should redact properties
  • [SPARK-39339] - Support TimestampNTZ in JDBC data source
  • [SPARK-39342] - ShowTablePropertiesCommand/ShowTablePropertiesExec should redact properties.
  • [SPARK-39343] - DescribeTableExec should redact properties
  • [SPARK-39346] - Convert asserts/illegal state exception to internal errors on each phase
  • [SPARK-39350] - DescribeNamespace should redact properties
  • [SPARK-39351] - ShowCreateTable should redact properties
  • [SPARK-39359] - Restrict DEFAULT columns to allowlist of supported data source types
  • [SPARK-39383] - Support V2 data sources with DEFAULT values
  • [SPARK-39384] - Compile build-in linear regression aggregate functions for JDBC dialect
  • [SPARK-39385] - Translate linear regression aggregate functions for pushdown
  • [SPARK-39406] - Accept NumPy array in createDataFrame (see the sketch after this list)
  • [SPARK-39413] - Capitalize sql keywords in JDBCV2Suite
  • [SPARK-39425] - Add migration guide for PS behavior changes
  • [SPARK-39432] - element_at(*, 0) does not return INVALID_ARRAY_INDEX_IN_ELEMENT_AT
  • [SPARK-39434] - Provide runtime error query context when array index is out of bound
  • [SPARK-39450] - Reuse PVCs by default
  • [SPARK-39451] - Support casting intervals to integrals in ANSI mode
  • [SPARK-39453] - DS V2 supports push down misc non-aggregate functions(non ANSI)
  • [SPARK-39459] - local*HostName* methods should support IPv6
  • [SPARK-39460] - Fix CoarseGrainedSchedulerBackendSuite to handle fast allocations
  • [SPARK-39461] - Print `SPARK_LOCAL_(HOSTNAME|IP)` in `build/{mvn|sbt}`
  • [SPARK-39464] - Use `Utils.localCanonicalHostName` instead of `localhost` in tests
  • [SPARK-39468] - Improve RpcAddress to add [] to IPv6 if needed
  • [SPARK-39470] - Support cast of ANSI intervals to decimals
  • [SPARK-39479] - DS V2 supports push down math functions(non ANSI)
  • [SPARK-39482] - Add build and test documentation on IPv6
  • [SPARK-39490] - Support `ipFamilyPolicy` and `ipFamilies` in Driver Service
  • [SPARK-39491] - Hadoop 2.7 build fails due to org.apache.hadoop.yarn.api.records.NodeState.DECOMMISSIONING
  • [SPARK-39501] - Propagate `java.net.preferIPv6Addresses=true` in SBT tests
  • [SPARK-39502] - Downgrade scala-maven-plugin to 4.6.1
  • [SPARK-39503] - Add session catalog name for v1 database table and function
  • [SPARK-39506] - CacheTable, isCached, UncacheTable, setCurrentCatalog, currentCatalog, listCatalogs
  • [SPARK-39507] - SocketAuthServer should respect Java IPv6 options
  • [SPARK-39508] - Support IPv6 between JVM and Python Daemon in PySpark
  • [SPARK-39509] - Support DEFAULT_ARTIFACT_REPOSITORY in check-license
  • [SPARK-39514] - LauncherBackendSuite should add java.net.preferIPv6Addresses conf
  • [SPARK-39516] - Set a scheduled build for branch-3.3
  • [SPARK-39517] - Recover branch-3.2 build broken by is-changed.py script missing
  • [SPARK-39519] - Test failure in SPARK-39387 with JDK 11
  • [SPARK-39520] - ExpressionSetSuite test failure with Scala 2.13
  • [SPARK-39521] - Define each workflow for each scheduled job in GitHub Actions
  • [SPARK-39522] - Add Apache Spark infra GA image cache
  • [SPARK-39528] - Use V2 Filter in SupportsRuntimeFiltering
  • [SPARK-39529] - Refactor and merge all related job selection logic into precondition
  • [SPARK-39530] - Fix KafkaTestUtils to support IPv6
  • [SPARK-39542] - Improve YARN client mode to support IPv6
  • [SPARK-39552] - Unify v1 and v2 DESCRIBE TABLE
  • [SPARK-39553] - Failed to remove shuffle ${shuffleId} - null when using Scala 2.13
  • [SPARK-39555] - Make createTable and listTables in the python side support 3-layer-namespace
  • [SPARK-39557] - Support ARRAY, STRUCT, MAP types as DEFAULT values
  • [SPARK-39559] - Support IPv6 in WebUI
  • [SPARK-39561] - Improve SparkContext to propagate `java.net.preferIPv6Addresses`
  • [SPARK-39562] - Make hive-thrift server module passes in IPv6 environment
  • [SPARK-39563] - Use localHostNameForURI in UISuite
  • [SPARK-39566] - Improve YARN cluster mode to support IPv6
  • [SPARK-39571] - Add net-tools to Spark docker files
  • [SPARK-39572] - Fix `test_daemon.py` to support IPv6
  • [SPARK-39574] - Better error message when `ps.Index` is used for DataFrame/Series creation
  • [SPARK-39579] - Make ListFunctions/getFunction/functionExists API compatible
  • [SPARK-39583] - Make RefreshTable be compatible with 3 layer namespace
  • [SPARK-39594] - Improve logs to show addresses in addition to port
  • [SPARK-39597] - Make GetTable, TableExists and DatabaseExists in the python side support 3-layer-namespace
  • [SPARK-39598] - Make *cache*, *catalog* in the python side support 3-layer-namespace
  • [SPARK-39607] - DataSourceV2: Distribution and ordering support V2 function in writing
  • [SPARK-39610] - Add safe.directory for container based job
  • [SPARK-39611] - PySpark support numpy 1.23.X
  • [SPARK-39615] - Make listColumns be compatible with 3 layer namespace
  • [SPARK-39627] - DS V2 pushdown should unify the compile API
  • [SPARK-39629] - Support v2 SHOW FUNCTIONS
  • [SPARK-39641] - Unify v1 and v2 SHOW FUNCTIONS tests
  • [SPARK-39643] - Prohibit subquery expressions in DEFAULT values for now
  • [SPARK-39645] - Make getDatabase and listDatabases compatible with 3 layer namespace
  • [SPARK-39646] - Make setCurrentDatabase compatible with 3 layer namespace
  • [SPARK-39649] - Make listDatabases / getDatabase / listColumns / refreshTable in PySpark support 3-layer-namespace
  • [SPARK-39686] - Disable scheduled builds that do not pass even once
  • [SPARK-39687] - Make sure new catalog methods listed in API reference
  • [SPARK-39688] - getReusablePVCs should handle accounts with no PVC permission
  • [SPARK-39697] - Add REFRESH_DATE flag and use previous cache to build cache image
  • [SPARK-39700] - Update two-parameter listColumns/getTable/getFunction/tableExists/functionExists functions docs to mention limitation
  • [SPARK-39704] - Implement createIndex & dropIndex & IndexExists in JDBC (H2 dialect)
  • [SPARK-39716] - Make currentDatabase/setCurrentDatabase/listCatalogs in SparkR support 3L namespace
  • [SPARK-39718] - Enable base image build in PySpark job
  • [SPARK-39719] - Implement databaseExists/getDatabase in SparkR support 3L namespace
  • [SPARK-39720] - Implement tableExists/getTable in SparkR for 3L namespace
  • [SPARK-39723] - Implement functionExists/getFunc in SparkR for 3L namespace
  • [SPARK-39735] - Enable base image build in lint job and fix sparkr env
  • [SPARK-39736] - Enable base image build in SparkR job
  • [SPARK-39756] - Better error messages for missing pandas scalars
  • [SPARK-39759] - Implement listIndexes in JDBC (H2 dialect)
  • [SPARK-39762] - Support numpy 1.23.0 (Remove numpy<1.23.0 version limit)
  • [SPARK-39772] - namespace should be null when database is null in the old constructors
  • [SPARK-39773] - Update document of JDBC options for pushDownOffset
  • [SPARK-39778] - Improve error messages: step 3
  • [SPARK-39787] - Use error class in the parsing error of function to_timestamp
  • [SPARK-39788] - Rename catalogName to dialectName for JdbcUtils
  • [SPARK-39792] - Add DecimalDivideWithOverflowCheck for decimal average
  • [SPARK-39795] - New SQL function: try_to_timestamp
  • [SPARK-39799] - DataSourceV2: View catalog interface
  • [SPARK-39807] - Respect `Series.concat` sort parameter to follow pandas 1.4.3 behavior
  • [SPARK-39810] - Catalog.tableExists should handle nested namespace
  • [SPARK-39818] - Fix bug in ARRAY, STRUCT, MAP types with DEFAULT values with NULL field(s)
  • [SPARK-39819] - DS V2 aggregate push down can work with Top N or Paging (Sort with group expressions)
  • [SPARK-39827] - add_months() returns a Java error on overflow
  • [SPARK-39828] - Catalog.listTables() should respect currentCatalog
  • [SPARK-39836] - Simplify V2ExpressionBuilder by extracting a common method
  • [SPARK-39844] - Restrict adding DEFAULT columns for existing tables to allowlist of supported data source types
  • [SPARK-39846] - Enable spark.dynamicAllocation.shuffleTracking.enabled by default
  • [SPARK-39852] - Unify v1 and v2 DESCRIBE TABLE tests for columns
  • [SPARK-39859] - Support v2 `DESCRIBE TABLE EXTENDED` for columns
  • [SPARK-39862] - Fix bug in existence DEFAULT value lookups for V2 data sources
  • [SPARK-39884] - KubernetesExecutorBackend should handle IPv6 hostname
  • [SPARK-39889] - Use different error classes for numeric/interval divided by 0
  • [SPARK-39898] - Upgrade kubernetes-client to 5.12.3
  • [SPARK-39899] - Incorrect passing of message parameters in InvalidUDFClassException
  • [SPARK-39905] - Remove checkErrorClass()
  • [SPARK-39907] - Implement axis and skipna of Series.argmin
  • [SPARK-39909] - Organize the check of push down information for JDBCV2Suite
  • [SPARK-39914] - Add DS V2 Filter to V1 Filter conversion
  • [SPARK-39917] - Use different error classes for numeric/interval arithmetic overflow
  • [SPARK-39923] - Put QueryContext into an array instead of an Option
  • [SPARK-39926] - Fix bug in existence DEFAULT value lookups for non-vectorized Parquet scans
  • [SPARK-39928] - Optimize Utils.getIteratorSize for Scala 2.13 refer to IterableOnceOps.size
  • [SPARK-39929] - DS V2 supports push down string functions(non ANSI)
  • [SPARK-39933] - Check query context by checkError()
  • [SPARK-39935] - Switch validateParsingError onto checkError
  • [SPARK-39949] - Principals in KafkaTestUtils should use canonical host name
  • [SPARK-39961] - DS V2 push-down translate Cast if the cast is safe
  • [SPARK-39964] - DS V2 pushdown should unify the translate path
  • [SPARK-39965] - Skip PVC cleanup when driver doesn't own PVCs
  • [SPARK-39966] - Use V2 Filter in SupportsDelete
  • [SPARK-39985] - Test DEFAULT column values with DataFrames
  • [SPARK-39987] - Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor rolling policy
  • [SPARK-40000] - Add config to toggle whether to automatically add default values for INSERTs without user-specified fields
  • [SPARK-40001] - Add config to make DEFAULT values in JSON tables mutually exclusive with SQLConf.JSON_GENERATOR_IGNORE_NULL_FIELDS
  • [SPARK-40006] - Make pyspark.sql.group examples self-contained
  • [SPARK-40008] - Support casting integrals to intervals in ANSI mode
  • [SPARK-40010] - Make pyspark.sql.window examples self-contained
  • [SPARK-40012] - Make pyspark.sql.dataframe examples self-contained
  • [SPARK-40013] - DS V2 expressions should have the default toString
  • [SPARK-40014] - Support cast of decimals to ANSI intervals
  • [SPARK-40016] - Remove unnecessary TryEval in TrySum
  • [SPARK-40018] - Output SparkThrowable to SQL golden files in JSON format
  • [SPARK-40027] - Make pyspark.sql.streaming.readwriter examples self-contained
  • [SPARK-40029] - Make pyspark.sql.types examples self-contained
  • [SPARK-40041] - Add Document Parameters for pyspark.sql.window
  • [SPARK-40042] - Make pyspark.sql.streaming.query examples self-contained
  • [SPARK-40044] - Incorrect target interval type in cast overflow errors
  • [SPARK-40051] - Make pyspark.sql.catalog examples self-contained
  • [SPARK-40054] - Restore the error handling syntax of try_cast()
  • [SPARK-40055] - listCatalogs should also return spark_catalog even when the spark_catalog implementation is defaultSessionCatalog
  • [SPARK-40060] - Add numberDecommissioningExecutors metric
  • [SPARK-40061] - Document cast of ANSI intervals
  • [SPARK-40064] - Use V2 Filter in SupportsOverwrite
  • [SPARK-40066] - ANSI mode: always return null on invalid access to map column
  • [SPARK-40077] - Make pyspark.context examples self-contained
  • [SPARK-40078] - Make pyspark.sql.column examples self-contained
  • [SPARK-40081] - Add Document Parameters for pyspark.sql.streaming.query
  • [SPARK-40098] - Format error messages in the Thrift Server
  • [SPARK-40102] - Use SparkException instead of IllegalStateException in SparkPlan
  • [SPARK-40107] - Pull out empty2null conversion from FileFormatWriter
  • [SPARK-40109] - New SQL function: get()
  • [SPARK-40111] - Make pyspark.rdd examples self-contained
  • [SPARK-40120] - Make pyspark.sql.readwriter examples self-contained
  • [SPARK-40135] - Support ps.Index in DataFrame creation
  • [SPARK-40136] - Incorrect fragment of query context
  • [SPARK-40138] - Implement DataFrame.mode
  • [SPARK-40142] - Make pyspark.sql.functions examples self-contained
  • [SPARK-40147] - Make pyspark.sql.session examples self-contained
  • [SPARK-40157] - Make pyspark.files examples self-contained
  • [SPARK-40160] - Make pyspark.broadcast examples self-contained
  • [SPARK-40161] - Make Series.mode apply PandasMode
  • [SPARK-40173] - Make pyspark.taskcontext examples self-contained
  • [SPARK-40180] - Format error messages by spark-sql
  • [SPARK-40183] - Use error class NUMERIC_VALUE_OUT_OF_RANGE for overflow in decimal conversion
  • [SPARK-40187] - Add doc for using Apache YuniKorn as a customized scheduler
  • [SPARK-40191] - Make pyspark.resource examples self-contained
  • [SPARK-40196] - Consolidate `lit` function with NumPy scalar in sql and pandas module
  • [SPARK-40198] - Enable spark.storage.decommission.(rdd|shuffle)Blocks.enabled by default
  • [SPARK-40205] - Provide a query context of ELEMENT_AT_BY_INDEX_ZERO
  • [SPARK-40209] - Incorrect value in the error message of NUMERIC_VALUE_OUT_OF_RANGE
  • [SPARK-40220] - Don't output the empty map of error message parameters
  • [SPARK-40222] - Numeric try_add/try_divide/try_subtract/try_multiply should throw errors from their children
  • [SPARK-40257] - Remove since usage in streaming/query.py and window.py
  • [SPARK-40260] - Use error classes in the compilation errors of GROUP BY a position
  • [SPARK-40269] - Randomize the orders of peer in BlockManagerDecommissioner
  • [SPARK-40291] - Improve the message for column not in group by clause error
  • [SPARK-40300] - Migrate onto the DATATYPE_MISMATCH error classes
  • [SPARK-40302] - Add YuniKornSuite
  • [SPARK-40304] - Add decomTestTag to K8s Integration Test
  • [SPARK-40305] - Implement Groupby.sem
  • [SPARK-40310] - try_sum() should throw the exceptions from its child
  • [SPARK-40313] - ps.DataFrame(data, index) should support the same anchor
  • [SPARK-40318] - try_avg() should throw the exceptions from its child
  • [SPARK-40324] - Provide a query context of ParseException
  • [SPARK-40330] - Implement `Series.searchsorted`.
  • [SPARK-40332] - Implement `GroupBy.quantile`.
  • [SPARK-40333] - Implement `GroupBy.nth`.
  • [SPARK-40334] - Implement `GroupBy.prod`.
  • [SPARK-40339] - Implement `Expanding.quantile`.
  • [SPARK-40342] - Implement `Rolling.quantile`.
  • [SPARK-40345] - Implement `ExpandingGroupby.quantile`.
  • [SPARK-40348] - Implement `RollingGroupby.quantile`.
  • [SPARK-40356] - Upgrade pandas to 1.4.4
  • [SPARK-40357] - Migrate window type check failures onto error classes
  • [SPARK-40358] - Migrate collection type check failures onto error classes
  • [SPARK-40359] - Migrate JSON type check failures onto error classes
  • [SPARK-40361] - Migrate arithmetic type check failures onto error classes
  • [SPARK-40368] - Migrate Bloom Filter type check failures onto error classes
  • [SPARK-40369] - Migrate the type check failures of calls via reflection onto error classes
  • [SPARK-40370] - Migrate cast type check failures onto error classes
  • [SPARK-40371] - Migrate type check failures of NthValue and NTile onto error classes
  • [SPARK-40372] - Migrate failures of array type checks onto error classes
  • [SPARK-40374] - Migrate type check failures of type creators onto error classes
  • [SPARK-40379] - Propagate decommission executor loss reason during onDisconnect in K8s
  • [SPARK-40386] - Implement `ddof` in `DataFrame.cov`
  • [SPARK-40391] - Test the error class UNSUPPORTED_FEATURE.JDBC_TRANSACTION
  • [SPARK-40393] - Refactor expanding and rolling test for function with input
  • [SPARK-40399] - Make `pearson` correlation in `DataFrame.corr` support missing values and `min_periods`
  • [SPARK-40400] - Pass error message parameters to exceptions as a map
  • [SPARK-40416] - Add error classes for subquery expression CheckAnalysis failures
  • [SPARK-40417] - Use YuniKorn v1.1+
  • [SPARK-40420] - Sort message parameters in the JSON formats
  • [SPARK-40421] - Make `spearman` correlation in `DataFrame.corr` support missing values and `min_periods`
  • [SPARK-40423] - Add explicit YuniKorn queue submission test coverage
  • [SPARK-40426] - Return a map from SparkThrowable.getMessageParameters
  • [SPARK-40432] - Introduce GroupStateImpl and GroupStateTimeout in PySpark
  • [SPARK-40433] - Add toJVMRow in PythonSQLUtils to convert pickled PySpark Row to JVM Row
  • [SPARK-40434] - Implement applyInPandasWithState in PySpark
  • [SPARK-40435] - Add test suites for applyInPandasWithState in PySpark
  • [SPARK-40445] - Refactor Resampler
  • [SPARK-40446] - Rename `_MissingPandasXXX` as `MissingPandasXXX`
  • [SPARK-40447] - Implement `kendall` correlation in `DataFrame.corr`
  • [SPARK-40448] - Initial prototype implementation
  • [SPARK-40453] - Improve error handling for GRPC server
  • [SPARK-40454] - Initial DSL framework for protobuf testing
  • [SPARK-40458] - Bump Kubernetes Client Version to 6.1.1
  • [SPARK-40459] - recoverDiskStore should not be stopped by existing recomputed files
  • [SPARK-40473] - Migrate parsing errors onto error classes
  • [SPARK-40479] - Migrate unexpected input type error to an error class
  • [SPARK-40481] - Ignore stage fetch failure caused by decommissioned executor
  • [SPARK-40483] - Add `CONNECT` label
  • [SPARK-40486] - Implement `spearman` and `kendall` in `DataFrame.corrwith`
  • [SPARK-40498] - Implement `kendall` and `min_periods` in `Series.corr`
  • [SPARK-40503] - Add resampling to API references
  • [SPARK-40509] - Construct an example of applyInPandasWithState in examples directory
  • [SPARK-40510] - Implement `ddof` in `Series.cov`
  • [SPARK-40512] - Upgrade pandas to 1.5.0
  • [SPARK-40515] - Add apache/spark-docker repo
  • [SPARK-40516] - Add official image dockerfile for Spark v3.3.0
  • [SPARK-40519] - Add "Publish workflow" to help release apache/spark image
  • [SPARK-40520] - Add a script to generate DOI manifest
  • [SPARK-40528] - Add dockerfile template
  • [SPARK-40529] - Remove `pyspark.pandas.ml`
  • [SPARK-40532] - Python version for UDF should follow the server's version
  • [SPARK-40533] - Extend type support for Spark Connect literals
  • [SPARK-40534] - Extend support for Join Relation
  • [SPARK-40536] - Make Spark Connect port configurable.
  • [SPARK-40537] - Re-enable mypy support
  • [SPARK-40538] - Add missing PySpark functions to Spark Connect
  • [SPARK-40539] - PySpark read API parity for Spark Connect
  • [SPARK-40540] - Migrate compilation errors onto error classes
  • [SPARK-40542] - Make `ddof` in `DataFrame.std` and `Series.std` accept arbitrary integers
  • [SPARK-40543] - Make `ddof` in `DataFrame.var` and `Series.var` accept arbitrary integers
  • [SPARK-40550] - DataSource V2: Handle DELETE commands for delta-based sources
  • [SPARK-40551] - DataSource V2: Add APIs for delta-based row-level operations
  • [SPARK-40554] - Make `ddof` in `DataFrame.sem` and `Series.sem` accept arbitrary integers
  • [SPARK-40557] - Re-generate Spark Connect Python protos
  • [SPARK-40560] - Rename message to messageFormat in the STANDARD format of errors
  • [SPARK-40561] - Implement `min_count` in GroupBy.min
  • [SPARK-40569] - Add smoke test in standalone cluster for spark-docker
  • [SPARK-40571] - Construct a test case to verify fault-tolerance semantic with random python worker failures
  • [SPARK-40573] - Make `ddof` in `GroupBy.std`, `GroupBy.var` and `GroupBy.sem` accept arbitrary integers
  • [SPARK-40577] - Fix CategoricalIndex.append
  • [SPARK-40578] - Fix `IndexesTest.test_to_frame` when pandas 1.5.0
  • [SPARK-40579] - `GroupBy.first` should skip nulls
  • [SPARK-40580] - Update the document for DataFrame.to_orc
  • [SPARK-40587] - SELECT * shouldn't be an empty project list in proto
  • [SPARK-40589] - Fix test for `DataFrame.corr_with` to skip the pandas regression
  • [SPARK-40590] - Fix `ps.read_parquet` when pandas_metadata is True
  • [SPARK-40592] - Implement `min_count` in `GroupBy.max`
  • [SPARK-40593] - protoc-3.21.1-linux-x86_64.exe requires GLIBC_2.14
  • [SPARK-40605] - Connect module should use log4j2.properties to configure test log output as other modules
  • [SPARK-40613] - Update sbt-protoc to 1.0.6
  • [SPARK-40615] - Check unsupported data type when decorrelating subqueries
  • [SPARK-40621] - Implement `numeric_only` and `min_count` in `GroupBy.sum`
  • [SPARK-40631] - Implement `min_count` in `GroupBy.first`
  • [SPARK-40636] - Fix wrong remained shuffles log in BlockManagerDecommissioner
  • [SPARK-40643] - Implement `min_count` in `GroupBy.last`
  • [SPARK-40645] - Throw exception for Collect() and recommend to use toPandas()
  • [SPARK-40663] - Migrate execution errors onto error classes
  • [SPARK-40665] - Avoid embedding Spark Connect in the Apache Spark binary release
  • [SPARK-40671] - Support driver service labels
  • [SPARK-40672] - Run Scala side tests in GitHub Actions
  • [SPARK-40674] - Use unittest's asserts instead of built-in assert
  • [SPARK-40677] - Shade more dependency to be able to run separately
  • [SPARK-40680] - Avoid hardcoded versions in SBT build
  • [SPARK-40687] - Support data masking built-in Function 'mask'
  • [SPARK-40693] - mypy complains accessing the variable defined in the class method
  • [SPARK-40698] - Improve the precision of `product` for integral inputs
  • [SPARK-40699] - Supplement undocumented yarn configuration in documentation
  • [SPARK-40702] - Confusing partition specs in PartitionsAlreadyExistException
  • [SPARK-40707] - Add groupby to connect DSL and test more than one grouping expressions
  • [SPARK-40709] - Supplement undocumented avro configurations in documentation
  • [SPARK-40710] - Supplement undocumented parquet configurations in documentation
  • [SPARK-40713] - Improve SET operation support in the proto and the server
  • [SPARK-40714] - Remove PartitionAlreadyExistsException
  • [SPARK-40717] - Support Column Alias in connect DSL
  • [SPARK-40718] - Replace shaded netty with grpc netty to avoid double shaded dependency.
  • [SPARK-40726] - Supplement undocumented orc configurations in documentation
  • [SPARK-40727] - Add merge_spark_docker_pr.py to help merge commit
  • [SPARK-40729] - spark-shell fails to run with Java 19
  • [SPARK-40733] - ShowCreateTableSuite test failed
  • [SPARK-40737] - Add basic support for DataFrameWriter
  • [SPARK-40743] - StructType should contain a list of StructField and each field should have a name
  • [SPARK-40744] - Make `_reduce_for_stat_function` in `groupby` accept `min_count`
  • [SPARK-40746] - Make Dockerfile build workflow work in apache repo
  • [SPARK-40748] - Migrate type check failures of conditions onto error classes
  • [SPARK-40749] - Migrate type check failures of generators onto error classes
  • [SPARK-40750] - Migrate type check failures of math expressions onto error classes
  • [SPARK-40751] - Migrate type check failures of high order functions onto error classes
  • [SPARK-40752] - Migrate type check failures of misc expressions onto error classes
  • [SPARK-40754] - Add LICENSE and NOTICE for apache/spark-docker
  • [SPARK-40755] - Migrate type check failures of number formatting onto error classes
  • [SPARK-40756] - Migrate type check failures of string expressions onto error classes
  • [SPARK-40757] - Add PULL_REQUEST_TEMPLATE for spark-docker
  • [SPARK-40759] - Migrate type check failures of time window onto error classes
  • [SPARK-40760] - Migrate type check failures of interval expressions onto error classes
  • [SPARK-40761] - Migrate type check failures of percentile expressions onto error classes
  • [SPARK-40762] - Check error classes in ErrorParserSuite
  • [SPARK-40768] - Migrate type check failures of bloom_filter_agg() onto error classes
  • [SPARK-40769] - Migrate type check failures of aggregate expressions onto error classes
  • [SPARK-40773] - Refactor checkCorrelationsInSubquery
  • [SPARK-40774] - Add Sample to proto and DSL
  • [SPARK-40779] - Fix `corrwith` to work properly with a different anchor
  • [SPARK-40780] - Add WHERE to Connect proto and DSL
  • [SPARK-40783] - Enable Spark on K8s integration test for official dockerfiles
  • [SPARK-40784] - Check error classes in DDLParserSuite
  • [SPARK-40785] - Check error classes in ExpressionParserSuite
  • [SPARK-40786] - Check error classes in PlanParserSuite
  • [SPARK-40787] - Check error classes in SparkSqlParserSuite
  • [SPARK-40788] - Check error classes in CreateNamespaceParserSuite
  • [SPARK-40790] - Check error classes in DDL parsing tests
  • [SPARK-40796] - Check the generated python protos in GitHub Actions
  • [SPARK-40799] - Enforce Scalafmt for Spark Connect Module
  • [SPARK-40800] - Always inline expressions in OptimizeOneRowRelationSubquery
  • [SPARK-40805] - Use `spark` username in official image
  • [SPARK-40809] - Add as(alias: String) to connect DSL
  • [SPARK-40810] - Use SparkIllegalArgumentException instead of IllegalArgumentException in CreateDatabaseCommand & AlterDatabaseSetLocationCommand
  • [SPARK-40811] - Use checkError() to intercept ParseException
  • [SPARK-40812] - Add Deduplicate to Connect proto
  • [SPARK-40813] - Add limit and offset to Connect DSL
  • [SPARK-40816] - Python: rename LogicalPlan.collect to LogicalPlan.to_proto
  • [SPARK-40823] - Connect Proto should carry unparsed identifiers
  • [SPARK-40827] - Re-enable the DataFrame.corrwith test after it is fixed in a future pandas release
  • [SPARK-40828] - Drop Python test tables before and after unit tests
  • [SPARK-40832] - Add README for spark-docker
  • [SPARK-40833] - Cleanup apt lists cache in Dockerfile
  • [SPARK-40836] - AnalyzeResult should use struct for schema
  • [SPARK-40839] - [Python] Implement `DataFrame.sample`
  • [SPARK-40845] - Add template support for SPARK_GPG_KEY
  • [SPARK-40852] - Implement `DataFrame.summary`
  • [SPARK-40854] - Change default serialization from 'broken' CSV to Spark DF JSON
  • [SPARK-40856] - Update the error template of WRONG_NUM_PARAMS
  • [SPARK-40857] - Allow configurable gRPC interceptors for Spark Connect
  • [SPARK-40859] - Upgrade action/checkout to v3
  • [SPARK-40860] - Change `set-output` to `GITHUB_EVENT` in spark infra code
  • [SPARK-40862] - Unexpected operators when rewriting scalar subqueries with non-deterministic expressions
  • [SPARK-40864] - Remove pip/setuptools dynamic upgrade
  • [SPARK-40866] - Rename Check Spark repo as Check Spark Docker repo in GA
  • [SPARK-40870] - Upgrade docker actions to cleanup warning
  • [SPARK-40871] - Upgrade actions/script to v6 and fix notify workflow
  • [SPARK-40872] - Fallback to original shuffle block when a push-merged shuffle chunk is zero-size
  • [SPARK-40875] - Add .agg() to Connect DSL
  • [SPARK-40877] - Reimplement `crosstab` with dataframe operations
  • [SPARK-40878] - Pin 'grpcio==1.48.1' and 'protobuf==4.21.6'
  • [SPARK-40879] - Support Join UsingColumns in proto
  • [SPARK-40880] - Reimplement `summary` with dataframe operations
  • [SPARK-40881] - Upgrade actions/cache to v3 and actions/upload-artifact to v3
  • [SPARK-40882] - Upgrade actions/setup-java to v3 with distribution specified
  • [SPARK-40883] - Support Range in Connect proto
  • [SPARK-40888] - Check error classes in HiveQuerySuite
  • [SPARK-40889] - Check error classes in PlanResolutionSuite
  • [SPARK-40890] - Check error classes in DataSourceV2SQLSuite
  • [SPARK-40891] - Check error classes in TableIdentifierParserSuite
  • [SPARK-40896] - Fix doctest for `Index.(isin|isnull|notnull)` to work properly with pandas 1.5
  • [SPARK-40898] - Quote function names in datatype mismatch errors
  • [SPARK-40899] - UserContext should be extensible
  • [SPARK-40900] - Reimplement `frequentItems` with dataframe operations
  • [SPARK-40910] - Replace UnsupportedOperationException with SparkUnsupportedOperationException
  • [SPARK-40914] - Mark internal API to be private[connect]
  • [SPARK-40915] - Improve `on` in Join in Python client
  • [SPARK-40926] - Refactor server side tests to only use DataFrame API
  • [SPARK-40929] - Add official image dockerfile for Spark v3.3.1
  • [SPARK-40930] - Support Collect() in Python client
  • [SPARK-40933] - Reimplement df.stat.{cov, corr} with built-in sql functions
  • [SPARK-40938] - Support Alias for every Relation
  • [SPARK-40941] - Use Java 17 in K8s Dockerfile by default and remove `Dockerfile.java17`
  • [SPARK-40947] - Upgrade pandas to 1.5.1
  • [SPARK-40948] - Introduce new error class: PATH_NOT_FOUND
  • [SPARK-40949] - Implement `DataFrame.sortWithinPartitions`
  • [SPARK-40951] - pyspark-connect tests should be skipped if pandas doesn't exist
  • [SPARK-40953] - Add missing `limit(n)` in DataFrame.head
  • [SPARK-40965] - Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1208
  • [SPARK-40966] - Fix `read_parquet` with `pandas_metadata`
  • [SPARK-40967] - Migrate failAnalysis() onto error classes
  • [SPARK-40970] - Support List[Column] for Join's on argument.
  • [SPARK-40971] - Import more from the connect proto package to avoid calling `proto.` for Connect DSL
  • [SPARK-40973] - Rename _LEGACY_ERROR_TEMP_0055 to UNCLOSED_BRACKETED_COMMENT
  • [SPARK-40975] - Assign a name to the legacy error class _LEGACY_ERROR_TEMP_0021
  • [SPARK-40977] - Complete Support for Union in Python client
  • [SPARK-40978] - Migrate failAnalysis() w/o context onto error classes
  • [SPARK-40979] - Keep removed executor info in decommission state
  • [SPARK-40980] - Support session.sql in Connect DSL
  • [SPARK-40981] - Support session.range in Python client
  • [SPARK-40984] - Replace `FRAME_LESS_OFFSET_WITHOUT_FOLDABLE` with `NON_FOLDABLE_INPUT`
  • [SPARK-40989] - Improve `session.sql` testing coverage in Python client
  • [SPARK-40990] - DataFrame creation from 2d NumPy array with arbitrary columns
  • [SPARK-40992] - Support toDF(columnNames) in Connect DSL
  • [SPARK-40995] - Developer Documentation for Spark Connect
  • [SPARK-40998] - Assign a name to the legacy error class _LEGACY_ERROR_TEMP_0040
  • [SPARK-41001] - Connection string support for Python client (see the sketch after this list)
  • [SPARK-41002] - Compatible `take`, `head` and `first` API in Python client
  • [SPARK-41004] - Check error classes in InterceptorRegistrySuite
  • [SPARK-41005] - Arrow based collect
  • [SPARK-41009] - Assign a name to the legacy error class _LEGACY_ERROR_TEMP_1070
  • [SPARK-41010] - Complete Support for Except and Intersect in Python client
  • [SPARK-41012] - Rename _LEGACY_ERROR_TEMP_1022 to ORDER_BY_POS_OUT_OF_RANGE
  • [SPARK-41019] - Provide a query context to failAnalysis
  • [SPARK-41020] - Assign a name to the legacy error class _LEGACY_ERROR_TEMP_2440
  • [SPARK-41021] - Test some subclasses of error class DATATYPE_MISMATCH
  • [SPARK-41022] - Test the error class: DEFAULT_DATABASE_NOT_EXISTS, INDEX_ALREADY_EXISTS, INDEX_NOT_FOUND, ROUTINE_NOT_FOUND
  • [SPARK-41026] - Support Repartition in Connect DSL
  • [SPARK-41027] - Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`
  • [SPARK-41034] - Connect DataFrame should require RemoteSparkSession
  • [SPARK-41036] - `columns` API should use `schema` API to avoid data fetching
  • [SPARK-41038] - Rename `MULTI_VALUE_SUBQUERY_ERROR` to `SCALAR_SUBQUERY_TOO_MANY_ROWS`
  • [SPARK-41041] - Integrate _LEGACY_ERROR_TEMP_1279 into TABLE_OR_VIEW_ALREADY_EXISTS
  • [SPARK-41042] - Rename PARSE_CHAR_MISSING_LENGTH to DATA_TYPE_MISSING_SIZE
  • [SPARK-41043] - Assign a name to the legacy error class _LEGACY_ERROR_TEMP_2429
  • [SPARK-41044] - Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR
  • [SPARK-41046] - Support CreateView in Connect DSL
  • [SPARK-41054] - Support disk-based KVStore in live UI
  • [SPARK-41055] - Rename _LEGACY_ERROR_TEMP_2424 to GROUP_BY_AGGREGATE
  • [SPARK-41058] - Remove unused code in connect
  • [SPARK-41059] - Rename _LEGACY_ERROR_TEMP_2420 to NESTED_AGGREGATE_FUNCTION
  • [SPARK-41061] - Support SelectExpr which apply Projection by expressions in Strings in Connect DSL
  • [SPARK-41062] - Rename UNSUPPORTED_CORRELATED_REFERENCE to CORRELATED_REFERENCE
  • [SPARK-41064] - Implement `DataFrame.crosstab` and `DataFrame.stat.crosstab`
  • [SPARK-41065] - Implement `DataFrame.freqItems` and `DataFrame.stat.freqItems`
  • [SPARK-41066] - Implement `DataFrame.sampleBy` and `DataFrame.stat.sampleBy`
  • [SPARK-41067] - Implement `DataFrame.stat.cov`
  • [SPARK-41068] - Implement `DataFrame.stat.corr`
  • [SPARK-41069] - Implement `DataFrame.approxQuantile` and `DataFrame.stat.approxQuantile`
  • [SPARK-41072] - Convert the internal error about failed stream to user-facing error
  • [SPARK-41077] - Rename `ColumnRef` to `Column` in Python client implementation
  • [SPARK-41078] - DataFrame `withColumnsRenamed` can be implemented through `RenameColumns` proto
  • [SPARK-41095] - Convert unresolved operators to internal errors
  • [SPARK-41098] - Rename GROUP_BY_POS_REFERS_AGG_EXPR to GROUP_BY_POS_AGGREGATE
  • [SPARK-41102] - Merge SparkConnectPlanner and SparkConnectCommandPlanner
  • [SPARK-41103] - Document how to add a new proto field of messages
  • [SPARK-41105] - Adopt `optional` keyword from proto3 which offers `hasXXX` to differentiate if a field is set or unset
  • [SPARK-41108] - Control the max size of arrow batch
  • [SPARK-41109] - Rename the error class _LEGACY_ERROR_TEMP_1216 to INVALID_LIKE_PATTERN
  • [SPARK-41110] - Implement `DataFrame.sparkSession` in Python client
  • [SPARK-41111] - Implement `DataFrame.show`
  • [SPARK-41114] - Support local data for LocalRelation
  • [SPARK-41115] - Add ClientType to proto to indicate which client sends a request
  • [SPARK-41116] - Input relation can be optional for Project in Connect proto
  • [SPARK-41122] - Explain API can support different modes
  • [SPARK-41127] - Implement DataFrame.CreateGlobalView in Python client
  • [SPARK-41128] - Implement `DataFrame.fillna` and `DataFrame.na.fill`
  • [SPARK-41130] - Rename OUT_OF_DECIMAL_TYPE_RANGE to NUMERIC_OUT_OF_SUPPORTED_RANGE
  • [SPARK-41131] - Improve error message for UNRESOLVED_MAP_KEY.WITHOUT_SUGGESTION
  • [SPARK-41133] - Integrate UNSCALED_VALUE_TOO_LARGE_FOR_PRECISION into NUMERIC_VALUE_OUT_OF_RANGE
  • [SPARK-41135] - Rename UNSUPPORTED_EMPTY_LOCATION to INVALID_EMPTY_LOCATION
  • [SPARK-41137] - Rename LATERAL_JOIN_OF_TYPE to INVALID_LATERAL_JOIN_TYPE
  • [SPARK-41139] - Improve error message for PYTHON_UDF_IN_ON_CLAUSE
  • [SPARK-41140] - Assign a name to the legacy error class _LEGACY_ERROR_TEMP_2440
  • [SPARK-41148] - Implement `DataFrame.dropna` and `DataFrame.na.drop`
  • [SPARK-41150] - Document debugging with PySpark memory profiler
  • [SPARK-41157] - Show detailed differences in dataframe comparison
  • [SPARK-41158] - Use `checkError()` to check `DATATYPE_MISMATCH` in `DataFrameFunctionsSuite`
  • [SPARK-41164] - Update relations.proto to follow Connect Proto development guidance
  • [SPARK-41166] - Check errorSubClass of DataTypeMismatch in *ExpressionSuites
  • [SPARK-41169] - Implement `DataFrame.drop`
  • [SPARK-41172] - Migrate the ambiguous ref error to an error class
  • [SPARK-41173] - Move `require()` out from the constructors of string expressions
  • [SPARK-41174] - Propagate an error class to users for invalid `format` of `to_binary()`
  • [SPARK-41175] - Assign a name to the error class _LEGACY_ERROR_TEMP_1078
  • [SPARK-41176] - Assign a name to the error class _LEGACY_ERROR_TEMP_1042
  • [SPARK-41179] - Assign a name to the error class _LEGACY_ERROR_TEMP_1092
  • [SPARK-41180] - Assign an error class to "Cannot parse the data type"
  • [SPARK-41181] - Migrate the map options errors onto error classes
  • [SPARK-41182] - Assign a name to the error class _LEGACY_ERROR_TEMP_1102
  • [SPARK-41196] - Homogenize the protobuf version across server and client
  • [SPARK-41201] - Implement `DataFrame.SelectExpr` in Python client
  • [SPARK-41203] - Dataframe.transform in Python client support
  • [SPARK-41206] - Assign a name to the error class _LEGACY_ERROR_TEMP_1233
  • [SPARK-41212] - Implement `DataFrame.isEmpty`
  • [SPARK-41213] - Implement `DataFrame.__repr__` and `DataFrame.dtypes`
  • [SPARK-41215] - protoc-3.21.9-linux-x86_64.exe requires GLIBC_2.14
  • [SPARK-41216] - Make AnalyzePlan support multiple analysis tasks
  • [SPARK-41217] - Add an error class for failures of built-in function calls
  • [SPARK-41221] - Add the error class INVALID_FORMAT
  • [SPARK-41222] - Unify the typing definitions
  • [SPARK-41225] - Disable unsupported functions
  • [SPARK-41227] - Implement DataFrame cross join
  • [SPARK-41228] - Rename COLUMN_NOT_IN_GROUP_BY_CLAUSE to MISSING_AGGREGATION
  • [SPARK-41230] - Remove `str` from Aggregate expression type
  • [SPARK-41232] - High-order function: array_append
  • [SPARK-41234] - High-order function: array_insert
  • [SPARK-41235] - High-order function: array_compact
  • [SPARK-41237] - Assign a name to the error class _LEGACY_ERROR_TEMP_0030
  • [SPARK-41238] - Support more datatypes
  • [SPARK-41243] - Update the protobuf version in README
  • [SPARK-41244] - Introducing a Protobuf serializer for UI data on KV store
  • [SPARK-41250] - DataFrame.to_pandas should not return optional pandas dataframe
  • [SPARK-41253] - Make K8s volcano IT work in Github Action
  • [SPARK-41255] - RemoteSparkSession should be called SparkSession
  • [SPARK-41256] - Implement DataFrame.withColumn(s)
  • [SPARK-41258] - Upgrade spark-docker actions
  • [SPARK-41263] - Upgrade buf to v1.9.0
  • [SPARK-41264] - Make Literal support more datatypes
  • [SPARK-41265] - Check and upgrade buf.build/protocolbuffers/plugins/python to 3.19.5
  • [SPARK-41268] - Refactor "Column" for API Compatibility
  • [SPARK-41269] - Move image matrix into version's workflow
  • [SPARK-41272] - Assign a name to the error class _LEGACY_ERROR_TEMP_2019
  • [SPARK-41278] - Clean up unused QualifiedAttribute in Expression.proto
  • [SPARK-41280] - Implement DataFrame.tail
  • [SPARK-41287] - Add a test workflow to help test image in fork repo
  • [SPARK-41291] - `DataFrame.explain` should print and return None
  • [SPARK-41292] - Window-function support
  • [SPARK-41293] - Code cleanup for assertXXX methods in ExpressionTypeCheckingSuite
  • [SPARK-41295] - Assign a name to the error class _LEGACY_ERROR_TEMP_1105
  • [SPARK-41296] - Assign a name to the error class _LEGACY_ERROR_TEMP_1106
  • [SPARK-41297] - Support string sql expressions in DF.where()
  • [SPARK-41300] - Read.schema is incorrectly read when unset
  • [SPARK-41301] - SparkSession.range should treat end as optional
  • [SPARK-41302] - Assign a name to the error class _LEGACY_ERROR_TEMP_1185
  • [SPARK-41304] - Add missing docs for DataFrame API
  • [SPARK-41306] - Improve Connect Expression proto documentation
  • [SPARK-41308] - Improve `DataFrame.count()`
  • [SPARK-41309] - Assign a name to the error class _LEGACY_ERROR_TEMP_1093
  • [SPARK-41310] - Implement DataFrame.toDF
  • [SPARK-41311] - Rewrite test RENAME_SRC_PATH_NOT_FOUND to trigger the error from user space
  • [SPARK-41312] - Implement DataFrame.withColumnRenamed
  • [SPARK-41314] - Assign a name to the error class _LEGACY_ERROR_TEMP_1094
  • [SPARK-41315] - Implement `DataFrame.replace ` and `DataFrame.na.replace `
  • [SPARK-41317] - PySpark write API for Spark Connect
  • [SPARK-41319] - when-otherwise support
  • [SPARK-41321] - Support target field for UnresolvedStar
  • [SPARK-41325] - Add missing avg() to DF group
  • [SPARK-41326] - Bug in Deduplicate Python transformation
  • [SPARK-41328] - Add logical and string API to Column
  • [SPARK-41329] - Solve circular import between Column and _typing/functions
  • [SPARK-41330] - Improve Documentation for Take, Tail, Limit and Offset
  • [SPARK-41331] - Add orderBy and drop_duplicates
  • [SPARK-41332] - Fix `nullOrdering` in `SortOrder`
  • [SPARK-41333] - Make `Groupby.{min, max, sum, avg, mean}` compatible with PySpark
  • [SPARK-41334] - move SortField from relations.proto to expressions.proto
  • [SPARK-41335] - Support IsNull and IsNotNull in Column
  • [SPARK-41343] - Move FunctionName parsing to server side
  • [SPARK-41345] - Add Hint to Connect Proto
  • [SPARK-41346] - Implement asc and desc methods
  • [SPARK-41347] - Add Cast to Expression proto
  • [SPARK-41348] - Refactor `UnsafeArrayWriterSuite` to check error class
  • [SPARK-41349] - Implement `DataFrame.hint`
  • [SPARK-41351] - Column does not support !=
  • [SPARK-41354] - Implement `DataFrame.repartitionByRange`
  • [SPARK-41357] - Implement math functions
  • [SPARK-41358] - Use `PhysicalDataType` instead of DataType in ColumnVectorUtils
  • [SPARK-41363] - Implement normal functions
  • [SPARK-41364] - Implement `broadcast` function
  • [SPARK-41366] - DF.groupby.agg() API should be compatible
  • [SPARK-41371] - Improve Documentation for Command proto
  • [SPARK-41380] - Implement aggregation functions
  • [SPARK-41381] - Implement count_distinct and sum_distinct functions
  • [SPARK-41382] - Implement `product` function
  • [SPARK-41383] - Implement `DataFrame.cube`
  • [SPARK-41388] - getReusablePVCs should ignore recently created PVCs in the previous batch
  • [SPARK-41389] - Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1044`
  • [SPARK-41394] - Skip MemoryProfilerTests when pandas is not installed
  • [SPARK-41397] - Implement part of string/binary functions
  • [SPARK-41398] - SPJ: Relax constraints on Storage-Partitioned Join when partition keys after runtime filtering do not match
  • [SPARK-41399] - Refactor column related tests to test_connect_column
  • [SPARK-41403] - Implement DataFrame.describe
  • [SPARK-41406] - Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic
  • [SPARK-41407] - Pull out v1 write to WriteFiles
  • [SPARK-41409] - Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1043`
  • [SPARK-41410] - Support PVC-oriented executor pod allocation
  • [SPARK-41412] - Implement `Cast`
  • [SPARK-41413] - SPJ: Avoid shuffle when partition keys mismatch, but join expressions are compatible
  • [SPARK-41414] - Implement date/timestamp functions
  • [SPARK-41417] - Assign a name to the error class _LEGACY_ERROR_TEMP_0019
  • [SPARK-41420] - Protobuf serializer for ApplicationInfoWrapper
  • [SPARK-41421] - Protobuf serializer for ApplicationEnvironmentInfoWrapper
  • [SPARK-41422] - Protobuf serializer for ExecutorSummaryWrapper
  • [SPARK-41423] - Protobuf serializer for StageDataWrapper
  • [SPARK-41424] - Protobuf serializer for TaskDataWrapper
  • [SPARK-41425] - Protobuf serializer for RDDStorageInfoWrapper
  • [SPARK-41426] - Protobuf serializer for ResourceProfileWrapper
  • [SPARK-41427] - Protobuf serializer for ExecutorStageSummaryWrapper
  • [SPARK-41428] - Protobuf serializer for SpeculationStageSummaryWrapper
  • [SPARK-41429] - Protobuf serializer for RDDOperationGraphWrapper
  • [SPARK-41430] - Protobuf serializer for ProcessSummaryWrapper
  • [SPARK-41431] - Protobuf serializer for SQLExecutionUIData
  • [SPARK-41432] - Protobuf serializer for SparkPlanGraphWrapper
  • [SPARK-41433] - Make Max Arrow BatchSize configurable
  • [SPARK-41434] - Support LambdaFunction expression
  • [SPARK-41435] - Make `curdate()` throw `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1043` when args is not null
  • [SPARK-41436] - Implement `collection` functions: A~C
  • [SPARK-41438] - Implement DataFrame.colRegex
  • [SPARK-41439] - Implement `DataFrame.melt` and `DataFrame.unpivot`
  • [SPARK-41440] - Implement DataFrame.randomSplit
  • [SPARK-41441] - Allow Generate with no required child output to host outer references
  • [SPARK-41443] - Assign a name to the error class _LEGACY_ERROR_TEMP_1061
  • [SPARK-41444] - Implement DataFrameReader.json
  • [SPARK-41445] - Implement DataFrameReader.parquet
  • [SPARK-41446] - Make `createDataFrame` support schema and more input dataset types
  • [SPARK-41453] - Implement DataFrame.subtract
  • [SPARK-41455] - Resolve dtypes inconsistencies of date/timestamp functions
  • [SPARK-41457] - Refactor pandas, pyarrow and grpc check in tests
  • [SPARK-41461] - protoc-3.21.9-linux-x86_64.exe requires GLIBC_2.14
  • [SPARK-41462] - Date and timestamp type can up cast to TimestampNTZ
  • [SPARK-41464] - Implement DataFrame.to
  • [SPARK-41465] - Assign a name to the error class _LEGACY_ERROR_TEMP_1235
  • [SPARK-41470] - SPJ: Spark shouldn't assume InternalRow implements equals and hashCode
  • [SPARK-41472] - Implement the rest of string/binary functions
  • [SPARK-41473] - Implement `functions.format_number`
  • [SPARK-41477] - Correctly infer the datatype of literal integers
  • [SPARK-41478] - Assign a name to the error class _LEGACY_ERROR_TEMP_1234
  • [SPARK-41479] - Add `IPv4 and IPv6` section to K8s document
  • [SPARK-41481] - Reuse `INVALID_TYPED_LITERAL` instead of `_LEGACY_ERROR_TEMP_0020`
  • [SPARK-41484] - Implement `collection` functions: E~M
  • [SPARK-41485] - Unify the environment variable of *_PROTOC_EXEC_PATH
  • [SPARK-41488] - Assign name to _LEGACY_ERROR_TEMP_1176
  • [SPARK-41489] - Assign name to _LEGACY_ERROR_TEMP_2415
  • [SPARK-41490] - Assign name to _LEGACY_ERROR_TEMP_2441
  • [SPARK-41492] - implement MISC function
  • [SPARK-41493] - Make csv functions support options
  • [SPARK-41495] - Implement `collection` functions: P~Z
  • [SPARK-41502] - Upgrade the minimum Minikube version to 1.28.0
  • [SPARK-41503] - Implement Partition Transformation Functions
  • [SPARK-41506] - Refactor LiteralExpression to support DataType
  • [SPARK-41508] - Assign name to _LEGACY_ERROR_TEMP_1179 and unwrap the existing SparkThrowable
  • [SPARK-41513] - Implement an Accumulator to collect per-mapper row count metrics
  • [SPARK-41514] - Add `PVC-oriented executor pod allocation` section and revise config name
  • [SPARK-41518] - Assign a name to the error class _LEGACY_ERROR_TEMP_2422
  • [SPARK-41525] - Improve onNewSnapshots to use unique list of known executor IDs and PVC names
  • [SPARK-41526] - Implement `Column.isin`
  • [SPARK-41528] - Sharing namespace between PySpark and Spark Connect
  • [SPARK-41529] - Implement SparkSession.stop
  • [SPARK-41533] - GRPC Errors on the client should be cleaned up
  • [SPARK-41536] - Remove `Dynamic Resource Allocation` from K8s Future Work
  • [SPARK-41540] - Add `DISK_USED` executor roll policy
  • [SPARK-41542] - Run Coverage report for Spark Connect
  • [SPARK-41543] - Add `TOTAL_SHUFFLE_WRITE` executor roll policy
  • [SPARK-41546] - pyspark_types_to_proto_types should support StructType
  • [SPARK-41548] - Disable ANSI mode in pyspark.sql.tests.connect.test_connect_functions
  • [SPARK-41552] - Upgrade `kubernetes-client` to 6.3.1
  • [SPARK-41565] - Add the error class UNRESOLVED_ROUTINE
  • [SPARK-41568] - Assign name to _LEGACY_ERROR_TEMP_1236
  • [SPARK-41571] - Assign name to _LEGACY_ERROR_TEMP_2310
  • [SPARK-41572] - Assign name to _LEGACY_ERROR_TEMP_2149
  • [SPARK-41573] - Assign name to _LEGACY_ERROR_TEMP_2136
  • [SPARK-41574] - Assign name to _LEGACY_ERROR_TEMP_2009
  • [SPARK-41575] - Assign name to _LEGACY_ERROR_TEMP_2054
  • [SPARK-41576] - Assign name to _LEGACY_ERROR_TEMP_2051
  • [SPARK-41578] - Assign name to _LEGACY_ERROR_TEMP_2141
  • [SPARK-41579] - Assign name to _LEGACY_ERROR_TEMP_1249
  • [SPARK-41580] - Assign name to _LEGACY_ERROR_TEMP_2137
  • [SPARK-41581] - Assign name to _LEGACY_ERROR_TEMP_1230
  • [SPARK-41582] - Reuse `INVALID_TYPED_LITERAL` instead of `_LEGACY_ERROR_TEMP_0022`
  • [SPARK-41583] - Add Spark Connect and protobuf into setup.py with specifying dependencies
  • [SPARK-41586] - Introduce new PySpark package: pyspark.errors
  • [SPARK-41591] - Implement functionality for training a PyTorch file locally
  • [SPARK-41592] - Implement functionality for training a PyTorch file on the executors
  • [SPARK-41593] - Implement logging from the executor nodes
  • [SPARK-41595] - Support generator function explode/explode_outer in the FROM clause
  • [SPARK-41598] - Migrate the errors from `pyspark/sql/functions.py` into error classes
  • [SPARK-41600] - Support Catalog.cacheTable
  • [SPARK-41612] - Support Catalog.isCached
  • [SPARK-41623] - Support Catalog.uncacheTable
  • [SPARK-41629] - Support for protocol extensions
  • [SPARK-41630] - Support lateral column alias in Project code path
  • [SPARK-41631] - Support lateral column alias in Aggregate code path
  • [SPARK-41640] - implement `Window` functions
  • [SPARK-41641] - Implement `Column.over`
  • [SPARK-41643] - Deduplicate docstrings in pyspark.sql.connect.column
  • [SPARK-41644] - Introducing SPI mechanism to make it easy for other modules to register ProtoBufSerializer
  • [SPARK-41645] - Deduplicate docstrings in pyspark.sql.connect.dataframe
  • [SPARK-41647] - Deduplicate docstrings in pyspark.sql.connect.functions
  • [SPARK-41648] - Deduplicate docstrings in pyspark.sql.connect.readwriter
  • [SPARK-41649] - Deduplicate docstrings in pyspark.sql.connect.window
  • [SPARK-41654] - Enable doctests in pyspark.sql.connect.window
  • [SPARK-41655] - Enable doctests in pyspark.sql.connect.column
  • [SPARK-41656] - Enable doctests in pyspark.sql.connect.dataframe
  • [SPARK-41657] - Enable doctests in pyspark.sql.connect.session
  • [SPARK-41659] - Enable doctests in pyspark.sql.connect.readwriter
  • [SPARK-41663] - Implement the rest of Lambda functions
  • [SPARK-41672] - Enable the deprecated functions
  • [SPARK-41673] - Implement `Column.astype`
  • [SPARK-41675] - Make column op support `datetime`
  • [SPARK-41676] - Protobuf serializer for StreamingQueryData
  • [SPARK-41677] - Protobuf serializer for StreamingQueryProgressWrapper
  • [SPARK-41679] - Protobuf serializer for StreamBlockData
  • [SPARK-41680] - Protobuf serializer for CachedQuantile
  • [SPARK-41681] - Factor GroupedData out to group.py
  • [SPARK-41685] - Support optionally using the Protobuf serializer for KVStore in History server
  • [SPARK-41687] - Deduplicate docstrings in pyspark.sql.connect.group
  • [SPARK-41688] - Move Expressions to expressions.py
  • [SPARK-41689] - Enable doctests in pyspark.sql.connect.group
  • [SPARK-41692] - implement `DataFrame.rollup`
  • [SPARK-41693] - Implement `GroupedData.pivot`
  • [SPARK-41694] - Add new config to clean up the `spark.ui.store.path` directory when SparkContext.stop() is called
  • [SPARK-41697] - Enable test_df_show, test_drop, test_dropna, test_toDF_with_schema_string and test_with_columns_renamed
  • [SPARK-41698] - Enable 16 tests that pass
  • [SPARK-41699] - Upgrade buf to v1.11.0
  • [SPARK-41700] - Remove `FunctionBuilder`
  • [SPARK-41701] - Make column op support `decimal`
  • [SPARK-41702] - Add invalid ops
  • [SPARK-41703] - Combine NullType and typed_null
  • [SPARK-41706] - pyspark_types_to_proto_types should support MapType
  • [SPARK-41707] - Implement initial Catalog.* API
  • [SPARK-41708] - Pull v1write information to WriteFiles
  • [SPARK-41709] - Explicitly define `Seq` as `collection.Seq` to reduce `toSeq` calls when creating UI objects from protobuf objects for Scala 2.13
  • [SPARK-41710] - Implement `Column.between`
  • [SPARK-41712] - Migrate the Spark Connect errors into PySpark error framework.
  • [SPARK-41713] - Make CTAS hold a nested execution for data writing
  • [SPARK-41715] - Catch specific exceptions for both Spark Connect and PySpark
  • [SPARK-41716] - Factor pyspark.sql.connect.Catalog._catalog_to_pandas to client.py
  • [SPARK-41717] - Implement the command logic for print and _repr_html_
  • [SPARK-41721] - Enable doctests in pyspark.sql.connect.catalog
  • [SPARK-41722] - Implement time window functions
  • [SPARK-41723] - Implement `sequence` function
  • [SPARK-41724] - Implement `call_udf` function
  • [SPARK-41725] - Remove the workaround of sql(...).collect back in PySpark tests
  • [SPARK-41726] - Remove OptimizedCreateHiveTableAsSelectCommand
  • [SPARK-41728] - Implement `unwrap_udt` function
  • [SPARK-41729] - Assign name to _LEGACY_ERROR_TEMP_0011
  • [SPARK-41731] - Implement the column accessor
  • [SPARK-41734] - Wrap catalog messages into a parent message
  • [SPARK-41736] - pyspark_types_to_proto_types should support ArrayType
  • [SPARK-41737] - Implement `GroupedData.{min, max, avg, sum}`
  • [SPARK-41738] - Client ID should be mixed into SparkSession cache
  • [SPARK-41740] - Implement `Column.name`
  • [SPARK-41742] - Support star in groupBy.agg()
  • [SPARK-41743] - groupBy(...).agg(...).sort does not actually sort the output
  • [SPARK-41744] - Support multiple arguments in groupBy.max(...)
  • [SPARK-41745] - SparkSession.createDataFrame does not respect the column names in the row
  • [SPARK-41746] - SparkSession.createDataFrame does not support nested datatypes
  • [SPARK-41747] - Support multiple arguments in groupBy.avg(...)
  • [SPARK-41748] - Support multiple arguments in groupBy.min(...)
  • [SPARK-41749] - Support multiple arguments in groupBy.sum(...)
  • [SPARK-41751] - Support Column.bitwiseAND, bitwiseOR, bitwiseXOR, eqNullSafe, isNotNull, isNull, isin
  • [SPARK-41754] - Add simple developer guides for UI protobuf serializer
  • [SPARK-41757] - Compatibility of string representation in Column
  • [SPARK-41759] - Use `weakIntern` on string values when creating new objects during deserialization
  • [SPARK-41761] - Fix arithmetic ops: negate, pow
  • [SPARK-41764] - Make the internal string op name consistent with FunctionRegistry
  • [SPARK-41767] - Implement `Column.{withField, dropFields}`
  • [SPARK-41768] - Refactor the definition of the `JobExecutionStatus` enum to follow the code style
  • [SPARK-41770] - eqNullSafe does not support None as its argument
  • [SPARK-41771] - __getitem__ does not work with Column.isin
  • [SPARK-41772] - Enable pyspark.sql.connect.column.Column.withField doctest
  • [SPARK-41773] - Window.partitionBy is not respected with row_number
  • [SPARK-41775] - Implement training functions as input
  • [SPARK-41777] - Add Integration Tests
  • [SPARK-41779] - Make getitem support filter and select
  • [SPARK-41783] - Make column op support None
  • [SPARK-41784] - Add missing `__rmod__`
  • [SPARK-41785] - Implement `GroupedData.mean`
  • [SPARK-41786] - Deduplicate helper functions
  • [SPARK-41789] - Make `createDataFrame` support list of Rows
  • [SPARK-41796] - Test the error class: UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE
  • [SPARK-41797] - Enable test for `array_repeat`
  • [SPARK-41799] - Combine plan-related tests
  • [SPARK-41803] - log() function variations are missing
  • [SPARK-41807] - Remove non-existent error class: UNSUPPORTED_FEATURE.DISTRIBUTE_BY
  • [SPARK-41808] - Make json functions support options
  • [SPARK-41809] - Make json functions support DataType Schema
  • [SPARK-41810] - SparkSession.createDataFrame does not respect the column names in the dictionary
  • [SPARK-41812] - DataFrame.join: ambiguous column
  • [SPARK-41815] - Column.isNull returns nan instead of None
  • [SPARK-41817] - SparkSession.read support reading with schema
  • [SPARK-41821] - Fix DataFrame.describe
  • [SPARK-41824] - Implement DataFrame.explain format to be similar to PySpark
  • [SPARK-41825] - DataFrame.show formatting int as double
  • [SPARK-41827] - DataFrame.groupBy requires all cols be Column or str
  • [SPARK-41828] - Implement creating an empty DataFrame
  • [SPARK-41829] - Implement DataFrame.sort, sortWithinPartitions ordering
  • [SPARK-41830] - Fix DataFrame.sample parameters
  • [SPARK-41831] - DataFrame.transform: Only Column or String can be used for projections
  • [SPARK-41832] - DataFrame.unionByName output is wrong
  • [SPARK-41833] - DataFrame.collect() output parity with pyspark
  • [SPARK-41834] - Implement SparkSession.conf
  • [SPARK-41835] - Implement `transform_keys` function
  • [SPARK-41836] - Implement `transform_values` function
  • [SPARK-41837] - DataFrame.createDataFrame datatype conversion error
  • [SPARK-41838] - DataFrame.show(): fix map printing
  • [SPARK-41840] - DataFrame.show(): 'Column' object is not callable
  • [SPARK-41842] - Support data type Timestamp(NANOSECOND, null)
  • [SPARK-41844] - Implement `intX2` function
  • [SPARK-41845] - Fix `count(expr("*"))` function
  • [SPARK-41846] - DataFrame windowspec functions: unresolved columns
  • [SPARK-41847] - DataFrame mapfield, structlist invalid type
  • [SPARK-41849] - Implement DataFrameReader.text
  • [SPARK-41850] - Fix `isnan` function
  • [SPARK-41851] - Fix `nanvl` function
  • [SPARK-41852] - Fix `pmod` function
  • [SPARK-41855] - `createDataFrame` doesn't handle None/NaN properly
  • [SPARK-41856] - Enable test_freqItems, test_input_files, test_toDF_with_schema_string, test_to_pandas_required_pandas_not_found
  • [SPARK-41857] - Enable test_between_function, test_datetime_functions, test_expr, test_math_functions, test_window_functions_cumulative_sum, test_corr, test_cov, test_crosstab, test_approxQuantile
  • [SPARK-41862] - Fix a correctness bug in existence DEFAULT value lookups for the Orc data source
  • [SPARK-41866] - Make `createDataFrame` support array
  • [SPARK-41868] - Support data type Duration(NANOSECOND)
  • [SPARK-41869] - DataFrame dropDuplicates should throw error on non-list argument
  • [SPARK-41870] - Handle duplicate columns in `createDataFrame`
  • [SPARK-41871] - DataFrame hint parameter can be str, float or int
  • [SPARK-41872] - Fix DataFrame createDataFrame handling of None
  • [SPARK-41874] - Implement DataFrame `sameSemantics`
  • [SPARK-41875] - Throw proper errors in Dataset.to()
  • [SPARK-41876] - Implement DataFrame `toLocalIterator`
  • [SPARK-41877] - SparkSession.createDataFrame error parity
  • [SPARK-41878] - Add JIRAs or messages for skipped tests
  • [SPARK-41879] - `DataFrame.collect` should support nested types
  • [SPARK-41880] - Function `from_json` should support non-literal expression
  • [SPARK-41881] - `DataFrame.collect` should handle None/NaN properly
  • [SPARK-41882] - Add tests for SQLAppStatusStore with RocksDB Backend
  • [SPARK-41884] - DataFrame `toPandas` parity in return types
  • [SPARK-41886] - `DataFrame.intersect` doctest output has different order
  • [SPARK-41887] - Support DataFrame hint parameter to be list
  • [SPARK-41889] - Attach root cause to invalidPatternError
  • [SPARK-41890] - Reduce `toSeq` in `RDDOperationGraphWrapperSerializer`/`SparkPlanGraphWrapperSerializer` for Scala 2.13
  • [SPARK-41891] - Enable test_add_months_function, test_array_repeat, test_dayofweek, test_first_last_ignorenulls, test_function_parity, test_inline, test_window_time, test_reciprocal_trig_functions
  • [SPARK-41892] - Add JIRAs or messages for skipped messages
  • [SPARK-41895] - Add tests for streaming UI with RocksDB backend
  • [SPARK-41897] - Parity in Error types between pyspark and connect functions
  • [SPARK-41898] - Window.rowsBetween should handle `float("-inf")` and `float("+inf")` as argument
  • [SPARK-41899] - DataFrame.createDataFrame converting int to bigint
  • [SPARK-41900] - Support data type int8
  • [SPARK-41901] - Parity in String representation of Column
  • [SPARK-41902] - Parity in String representation of higher_order_function's output
  • [SPARK-41903] - Support data type ndarray
  • [SPARK-41905] - Function `slice` should handle string in params
  • [SPARK-41906] - Handle Function `rand()`
  • [SPARK-41907] - Function `sampleby` return parity
  • [SPARK-41921] - Enable doctests in connect.column and connect.functions
  • [SPARK-41923] - Add `DataFrame.writeTo` to the unsupported list
  • [SPARK-41924] - Make StructType support metadata and Implement `DataFrame.withMetadata`
  • [SPARK-41926] - Add Github action test job with RocksDB as UI backend
  • [SPARK-41927] - Add the unsupported list for `GroupedData`
  • [SPARK-41928] - Add the unsupported list for functions
  • [SPARK-41929] - Add function array_compact
  • [SPARK-41933] - Provide local mode that automatically starts the server
  • [SPARK-41934] - Add the unsupported function list for `session`
  • [SPARK-41936] - Make `withMetadata` reuse the `withColumns` proto
  • [SPARK-41939] - Add the unsupported list for catalog functions
  • [SPARK-41944] - Pass configurations when local remote mode is on
  • [SPARK-41945] - Python: connect client lost column data with pyarrow.Table.to_pylist
  • [SPARK-41957] - Enable the doctest for `DataFrame.hint`
  • [SPARK-41959] - Improve v1 writes with empty2null
  • [SPARK-41960] - Assign name to _LEGACY_ERROR_TEMP_1056
  • [SPARK-41961] - Support table-valued functions with LATERAL
  • [SPARK-41963] - Different exception message in DataFrame.unpivot
  • [SPARK-41964] - Add the unsupported function list
  • [SPARK-41968] - Refactor ProtobufSerDe to ProtobufSerDe[T]
  • [SPARK-41973] - Assign name to _LEGACY_ERROR_TEMP_1311
  • [SPARK-41974] - Turn `INCORRECT_END_OFFSET` into `INTERNAL_ERROR`
  • [SPARK-41975] - Improve error message for `INDEX_ALREADY_EXISTS`
  • [SPARK-41976] - Improve error message for `INDEX_NOT_FOUND`
  • [SPARK-41977] - Enable test_generic_hints
  • [SPARK-41978] - SparkSession.range to take float as arguments
  • [SPARK-41980] - Enable test_functions_broadcast
  • [SPARK-41983] - Rename error class: NULL_COMPARISON_RESULT
  • [SPARK-41984] - Rename & improve error message for RESET_PERMISSION_TO_ORIGINAL
  • [SPARK-41988] - Fix map_filter and map_zip_with output order
  • [SPARK-41999] - NPE for bucketed write (ReadwriterTests.test_bucketed_write)
  • [SPARK-42000] - saveAsTable fail to find the default source (ReadwriterTests.test_insert_into)
  • [SPARK-42001] - Unexpected schema set to DefaultSource plan (ReadwriterTests.test_save_and_load)
  • [SPARK-42002] - Implement DataFrameWriterV2 (ReadwriterV2Tests)
  • [SPARK-42004] - Migrate "XX000" sqlState onto `INTERNAL_ERROR`
  • [SPARK-42007] - Reuse pyspark.sql.tests.test_group test cases
  • [SPARK-42008] - Reuse pyspark.sql.tests.test_datasources test cases
  • [SPARK-42009] - Reuse pyspark.sql.tests.test_serde test cases
  • [SPARK-42010] - Reuse pyspark.sql.tests.test_column test cases
  • [SPARK-42011] - Implement DataFrameReader.csv
  • [SPARK-42012] - Implement DataFrameReader.orc
  • [SPARK-42013] - Implement DataFrameReader.text to take multiple paths
  • [SPARK-42014] - Support aware datetimes
  • [SPARK-42016] - Type inconsistency of struct and map when accessing the nested column
  • [SPARK-42019] - Reuse pyspark.sql.tests.test_types test cases
  • [SPARK-42021] - createDataFrame with array.array
  • [SPARK-42022] - createDataFrame should autogenerate missing column names
  • [SPARK-42023] - createDataFrame should coerce types of string false to bool false
  • [SPARK-42026] - Protobuf serializer for AppSummary and PoolData
  • [SPARK-42028] - Support Pandas DF to Spark DF with Nanosecond Timestamps
  • [SPARK-42029] - Distribution build for Spark Connect does not work with Spark Shell
  • [SPARK-42032] - Map data shown in different order
  • [SPARK-42038] - SPJ: Support partially clustered distribution
  • [SPARK-42039] - SPJ: Remove Option in KeyGroupedPartitioning#partitionValues
  • [SPARK-42041] - DataFrameReader should support list of paths
  • [SPARK-42042] - DataFrameReader should support StructType schema
  • [SPARK-42044] - Fix wrong error message for `MUST_AGGREGATE_CORRELATED_SCALAR_SUBQUERY`
  • [SPARK-42045] - Round/Bround should return an error on integral overflow
  • [SPARK-42047] - Literal should support numpy datatypes
  • [SPARK-42048] - Different column name of lit(np.int8)
  • [SPARK-42062] - Enforce scalafmt for connect-common
  • [SPARK-42063] - Register `byte[][]` to KryoSerializer
  • [SPARK-42070] - Change the default value of argument of Mask udf from -1 to NULL
  • [SPARK-42071] - Register scala.math.Ordering$Reverse to KryoSerializer
  • [SPARK-42073] - Enable pyspark.sql.tests.test_types 2 test cases
  • [SPARK-42074] - Enable KryoSerializer in TPCDSQueryBenchmark to enforce SQL class registration
  • [SPARK-42076] - Factor data conversion `arrow -> rows` out to `conversion.py`
  • [SPARK-42077] - Literal should throw TypeError for unsupported DataType
  • [SPARK-42078] - Migrate errors thrown by JVM into PySpark Exception.
  • [SPARK-42079] - Rename proto messages for `toDF` and `withColumnsRenamed`
  • [SPARK-42080] - Add guideline for PySpark errors.
  • [SPARK-42082] - Introduce `PySparkValueError` and `PySparkTypeError`
  • [SPARK-42085] - Make `from_arrow_schema` support nested types
  • [SPARK-42089] - Different result in nested lambda function
  • [SPARK-42095] - Fix gRPC check in tests
  • [SPARK-42097] - Register SerializedLambda and BitSet to KryoSerializer
  • [SPARK-42099] - Make `count(*)` work correctly
  • [SPARK-42100] - Protect null `SQLExecutionUIData#description` in `SQLExecutionUIDataSerializer`
  • [SPARK-42119] - Add built-in table-valued functions inline and inline_outer
  • [SPARK-42120] - Add built-in table-valued function json_tuple
  • [SPARK-42121] - Add built-in table-valued functions posexplode and posexplode_outer
  • [SPARK-42122] - Add built-in table-valued function stack
  • [SPARK-42123] - Include column default values in DESCRIBE output for V1 tables
  • [SPARK-42124] - Scalar Inline Python UDF in Spark Connect
  • [SPARK-42130] - Handle null string values in AccumulableInfo and ProcessSummary
  • [SPARK-42137] - Enable spark.kryo.unsafe by default
  • [SPARK-42138] - Handle null string values in JobData/TaskDataWrapper/ExecutorStageSummaryWrapper
  • [SPARK-42139] - Handle null string values in SQLExecutionUIData/SQLPlanMetric/SparkPlanGraphWrapper
  • [SPARK-42140] - Handle null string values in ApplicationEnvironmentInfoWrapper/ApplicationInfoWrapper
  • [SPARK-42142] - Handle null string values in CachedQuantile/ExecutorSummary/PoolData
  • [SPARK-42143] - Handle null string values in RDDStorageInfo/RDDDataDistribution/RDDPartitionInfo
  • [SPARK-42144] - Handle null string values in StageData/StreamBlockData/StreamingQueryData
  • [SPARK-42146] - Refactor `Utils#setStringField` to make the Maven build pass when the sql module uses this method
  • [SPARK-42148] - Upgrade `kubernetes-client` to 6.4.0
  • [SPARK-42150] - Upgrade Volcano to 1.7.0
  • [SPARK-42153] - Handle null string values in PairStrings/RDDOperationNode/RDDOperationClusterWrapper
  • [SPARK-42154] - Enable Volcano unit tests and integration tests in GitHub Action
  • [SPARK-42164] - Register partitioned-table-related classes to KryoSerializer
  • [SPARK-42173] - IPv6 address mapping can fail with sparse addresses
  • [SPARK-42178] - Handle remaining null string values in ui protobuf serializer and add tests
  • [SPARK-42182] - Make `ReusedConnectTestCase` take Spark configurations
  • [SPARK-42187] - Avoid using RemoteSparkSession.builder.getOrCreate in tests
  • [SPARK-42190] - Support `local[*]` in `spark-submit` in K8s environment
  • [SPARK-42192] - Migrate the `TypeError` from `pyspark/sql/dataframe.py` into `PySparkTypeError`.
  • [SPARK-42197] - Reuse JVM initialization, and separate configuration groups to set in remote local mode
  • [SPARK-42210] - Standardize registered pickled Python UDFs
  • [SPARK-42213] - Failed to test ClientE2ETestSuite with maven
  • [SPARK-42217] - Support lateral column alias in queries with Window
  • [SPARK-42221] - Introduce a new conf for TimestampNTZ schema inference in JSON/CSV
  • [SPARK-42224] - Migrate `TypeError` into error framework for Spark Connect functions
  • [SPARK-42225] - Add `SparkConnectIllegalArgumentException` to handle Spark Connect error precisely.
  • [SPARK-42229] - Migrate SparkCoreErrors into error class
  • [SPARK-42231] - Rename error class: MISSING_STATIC_PARTITION_COLUMN
  • [SPARK-42232] - Rename error class: UNSUPPORTED_FEATURE.JDBC_TRANSACTION
  • [SPARK-42233] - Improve error message for PIVOT_AFTER_GROUP_BY
  • [SPARK-42234] - Rename error class: UNSUPPORTED_FEATURE.REPEATED_PIVOT
  • [SPARK-42236] - Refine `NULLABLE_ARRAY_OR_MAP_ELEMENT`
  • [SPARK-42238] - Introduce `INCOMPATIBLE_JOIN_TYPES`
  • [SPARK-42239] - Integrate MUST_AGGREGATE_CORRELATED_SCALAR_SUBQUERY
  • [SPARK-42243] - Use `spark.sql.inferTimestampNTZInDataSources.enabled` to infer timestamp type on partition columns
  • [SPARK-42244] - Refine error message by using Python types.
  • [SPARK-42249] - Refine HTML strings in error messages
  • [SPARK-42253] - Add test for detecting duplicated error class
  • [SPARK-42254] - Assign name to _LEGACY_ERROR_TEMP_1117
  • [SPARK-42255] - Assign name to _LEGACY_ERROR_TEMP_2430
  • [SPARK-42263] - Implement `spark.catalog.registerFunction`
  • [SPARK-42266] - Local mode should work with IPython
  • [SPARK-42267] - Support left_outer join
  • [SPARK-42268] - Add UserDefinedType in protos
  • [SPARK-42269] - Support complex return types in DDL strings
  • [SPARK-42271] - Reuse UDF test cases under `pyspark.sql.tests`
  • [SPARK-42272] - Use available ephemeral port for Spark Connect server in testing
  • [SPARK-42273] - Skip Spark Connect tests if dependencies are not installed
  • [SPARK-42275] - Avoid using built-in list, dict in static typing
  • [SPARK-42278] - DS V2 pushdown supports JDBC dialects compiling `SortOrder` by themselves
  • [SPARK-42281] - Update Debugging PySpark documents to show error message properly
  • [SPARK-42294] - Include column default values in DESCRIBE output for V2 tables
  • [SPARK-42295] - Tear down the test cleanly
  • [SPARK-42296] - Apply spark.sql.inferTimestampNTZInDataSources.enabled on JDBC data source
  • [SPARK-42297] - Assign name to _LEGACY_ERROR_TEMP_2412
  • [SPARK-42301] - Assign name to _LEGACY_ERROR_TEMP_1129
  • [SPARK-42302] - Assign name to _LEGACY_ERROR_TEMP_2135
  • [SPARK-42303] - Assign name to _LEGACY_ERROR_TEMP_1326
  • [SPARK-42305] - Assign name to _LEGACY_ERROR_TEMP_1229
  • [SPARK-42306] - Assign name to _LEGACY_ERROR_TEMP_1317
  • [SPARK-42310] - Assign name to _LEGACY_ERROR_TEMP_1289
  • [SPARK-42312] - Assign name to _LEGACY_ERROR_TEMP_0042
  • [SPARK-42313] - Assign name to _LEGACY_ERROR_TEMP_1152
  • [SPARK-42314] - Assign name to _LEGACY_ERROR_TEMP_2127
  • [SPARK-42315] - Assign name to _LEGACY_ERROR_TEMP_2092
  • [SPARK-42318] - Assign name to _LEGACY_ERROR_TEMP_2125
  • [SPARK-42319] - Assign name to _LEGACY_ERROR_TEMP_2123
  • [SPARK-42320] - Assign name to _LEGACY_ERROR_TEMP_2188
  • [SPARK-42324] - Assign name to _LEGACY_ERROR_TEMP_1001
  • [SPARK-42326] - Assign name to _LEGACY_ERROR_TEMP_2099
  • [SPARK-42327] - Assign name to _LEGACY_ERROR_TEMP_2177
  • [SPARK-42338] - Different exception in DataFrame.sample
  • [SPARK-42342] - Introduce base hierarchy to exceptions.
  • [SPARK-42343] - Ignore `IOException` in `handleBlockRemovalFailure` if SparkContext is stopped
  • [SPARK-42345] - Rename TimestampNTZ inference conf as spark.sql.sources.timestampNTZTypeInference.enabled
  • [SPARK-42348] - Add SQLSTATE
  • [SPARK-42357] - Log `exitCode` when `SparkContext.stop` starts
  • [SPARK-42363] - Remove session.register_udf
  • [SPARK-42367] - DataFrame.drop should handle duplicated columns properly
  • [SPARK-42371] - Add scripts to start and stop Spark Connect server
  • [SPARK-42378] - Make `DataFrame.select` support `a.*`
  • [SPARK-42381] - `CreateDataFrame` should accept objects
  • [SPARK-42402] - Support parameterized SQL by sql()
  • [SPARK-42408] - Register DoubleType to KryoSerializer
  • [SPARK-42419] - Migrate `TypeError` into error framework for Spark Connect column API.
  • [SPARK-42420] - Register WriteTaskResult, BasicWriteTaskStats, and ExecutedWriteSummary to KryoSerializer
  • [SPARK-42426] - insertInto fails when the column names are different from the table columns
  • [SPARK-42427] - Conv should return an error if the internal conversion overflows
  • [SPARK-42428] - Standardize __repr__ of CommonInlineUserDefinedFunction
  • [SPARK-42430] - Add documentation for TimestampNTZ type
  • [SPARK-42431] - Union should avoid calling `output` before analysis
  • [SPARK-42433] - Add `array_insert` to Connect
  • [SPARK-42434] - `array_append` should accept `Any` value
  • [SPARK-42455] - Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
  • [SPARK-42458] - createDataFrame should support DDL string as schema
  • [SPARK-42459] - Create pyspark.sql.connect.utils to keep common codes
  • [SPARK-42468] - Implement agg by (String, String)*
  • [SPARK-42475] - Getting Started: Live Notebook for Spark Connect
  • [SPARK-42476] - Spark Connect API reference.
  • [SPARK-42481] - Implement agg.{max,min,mean,count,avg,sum}
  • [SPARK-42510] - Implement `DataFrame.mapInPandas`
  • [SPARK-42521] - Add NULL values for INSERT commands with user-specified lists of fewer columns than the target table
  • [SPARK-42522] - Fix DataFrameWriterV2 to find the default source
  • [SPARK-42524] - Upgrade numpy and pandas in the release Dockerfile
  • [SPARK-42532] - Update YuniKorn documentation with v1.2
  • [SPARK-42545] - Remove `experimental` from Volcano docs
  • [SPARK-42568] - SparkConnectStreamHandler should manage configs properly while creating plans.
  • [SPARK-42574] - DataFrame.toPandas should handle duplicated column names
  • [SPARK-42593] - Deprecate & remove the APIs that will be removed in pandas 2.0.
  • [SPARK-42609] - Add tests for grouping() and grouping_id() functions
  • [SPARK-42612] - Enable more parity tests related to functions
  • [SPARK-42630] - Make `parse_data_type` use new proto message `DDLParse`
  • [SPARK-42641] - Upgrade buf to v1.15.0
  • [SPARK-42643] - Register Java (aggregate) user-defined functions
  • [SPARK-42666] - Fix `createDataFrame` to work properly with rows and schema
  • [SPARK-42705] - SparkSession.sql doesn't return values from commands.
  • [SPARK-42707] - Remove experimental warning in developer documentation
  • [SPARK-42710] - Rename FrameMap proto to MapPartitions
  • [SPARK-42723] - Support parsing data type JSON "timestamp_ltz" as TimestampType
  • [SPARK-42724] - Upgrade buf to v1.15.1
  • [SPARK-42725] - Make LiteralExpression support array
  • [SPARK-42726] - Implement `DataFrame.mapInArrow`
  • [SPARK-42739] - Ensure release tag to be pushed to release branch
  • [SPARK-42861] - Review and fix issues in SQL API docs
  • [SPARK-42864] - Review and fix issues in MLlib API docs
  • [SPARK-42865] - Review and fix issues in Streaming API docs
  • [SPARK-42875] - Fix toPandas to handle timezone and map types properly.
  • [SPARK-42889] - Implement cache, persist, unpersist, and storageLevel
  • [SPARK-42893] - Block Arrow Python UDFs
  • [SPARK-42900] - Fix createDataFrame to respect both type inference and column names.
  • [SPARK-42920] - Python UDF with UDT
  • [SPARK-42983] - Fix the error message of createDataFrame from np.array(0)
  • [SPARK-42998] - Fix DataFrame.collect with null struct.
  • [SPARK-43011] - array_insert should fail with 0 index
  • [SPARK-43018] - Fix bug with timestamp literals
  • [SPARK-43085] - Fix bug in column DEFAULT assignment for target tables with multi-part names
  • [SPARK-44681] - Fix issue referencing github.com/apache/spark-connect-go as a Go library

Bug

  • [SPARK-8731] - Beeline doesn't work with -e option when started in background
  • [SPARK-28090] - Spark hangs when an execution plan has many projections on nested structs
  • [SPARK-33782] - Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode
  • [SPARK-34777] - [UI] StagePage input size/records not shown when records greater than zero
  • [SPARK-35084] - [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not added to sparkContext
  • [SPARK-35542] - Bucketizer created for multiple columns with parameters splitsArray, inputCols and outputCols cannot be loaded after saving it
  • [SPARK-35579] - Update to janino 3.1.7 to fix a bug
  • [SPARK-37259] - JDBC read is always going to wrap the query in a select statement
  • [SPARK-38404] - Spark does not find CTE inside nested CTE
  • [SPARK-38488] - Spark doc build does not work on macOS M1
  • [SPARK-38503] - Add warning for getAdditionalPreKubernetesResources on the executor side
  • [SPARK-38510] - Failure fetching JSON representation of Spark plans with Hive UDFs
  • [SPARK-38521] - Throw Exception if overwriting hive partition table with dynamic and staticPartitionOverwriteMode
  • [SPARK-38597] - Enable Spark on K8S integration tests
  • [SPARK-38613] - Fix RemoteBlockPushResolverSuite#testWritingPendingBufsIsAbortedImmediatelyDuringComplete
  • [SPARK-38614] - Don't push down limit through window that's using percent_rank
  • [SPARK-38708] - Upgrade Hive Metastore Client to the 3.1.3 for Hive 3.1
  • [SPARK-38717] - Handle Hive's bucket spec case preserving behaviour
  • [SPARK-38799] - Fix scala license declaration
  • [SPARK-38802] - Support spark.kubernetes.test.(driver|executor)RequestCores
  • [SPARK-38846] - Teradata's Number is converted to either its floor or ceiling value, discarding its fractional part
  • [SPARK-38870] - SparkSession.builder returns a new builder in Scala, but not in Python
  • [SPARK-38898] - Failed to build python docker images due to .cache not found
  • [SPARK-38918] - Nested column pruning should filter out attributes that do not belong to the current relation
  • [SPARK-38956] - Fix FAILED_EXECUTE_UDF test case on Java 17
  • [SPARK-38962] - Fix wrong computeStats at DataSourceV2Relation
  • [SPARK-38969] - Graceful decommissioning on Kubernetes fails / decom script error
  • [SPARK-38994] - Add a Python example of StreamingQueryListener
  • [SPARK-39015] - SparkRuntimeException when trying to get non-existent key in a map
  • [SPARK-39041] - Mapping Spark Query ResultSet/Schema to TRowSet/TTableSchema directly
  • [SPARK-39060] - Typo in error messages of decimal overflow
  • [SPARK-39079] - Catalog name should not contain dot
  • [SPARK-39104] - Null Pointer Exception on unpersist call
  • [SPARK-39184] - ArrayIndexOutOfBoundsException for some date/time sequences in some time-zones
  • [SPARK-39221] - sensitive information is not redacted correctly on thrift job/stage page
  • [SPARK-39242] - AwaitOffset does not wait correctly for at least the expected offset and RateStreamProvider test is flaky
  • [SPARK-39259] - Timestamps returned by now() and equivalent functions are not consistent in subqueries
  • [SPARK-39296] - Replace `Array.toString` with `Array.mkString`
  • [SPARK-39313] - V2ExpressionUtils.toCatalystOrdering should fail if V2Expression can not be translated
  • [SPARK-39338] - Remove dynamic pruning subquery if pruningKey's references is empty
  • [SPARK-39340] - DS v2 agg pushdown should allow dots in the name of top-level columns
  • [SPARK-39347] - Generate wrong time window when (timestamp-startTime) % slideDuration < 0
  • [SPARK-39354] - The analysis exception is incorrect
  • [SPARK-39355] - Single column uses quoted to construct UnresolvedAttribute
  • [SPARK-39391] - Reuse Partitioner Classes
  • [SPARK-39393] - Parquet data source only supports push-down predicate filters for non-repeated primitive types
  • [SPARK-39396] - Spark Thriftserver with LDAP enabled, error using beeline connection: error code 49 - invalid credentials
  • [SPARK-39399] - proxy-user not working for Spark on k8s in cluster deploy mode
  • [SPARK-39400] - spark-sql leaves the Hive resource download dir behind after exit
  • [SPARK-39401] - Replace withView with withTempView in CTEInlineSuite
  • [SPARK-39404] - Unable to query _metadata in streaming if getBatch returns multiple logical nodes in the DataFrame
  • [SPARK-39411] - Release candidates do not have the correct version for PySpark
  • [SPARK-39412] - IllegalStateException from connector does not work well with error class framework
  • [SPARK-39417] - Handle Null partition values in PartitioningUtils
  • [SPARK-39421] - Sphinx build fails with "node class 'meta' is already registered, its visitors will be overridden"
  • [SPARK-39427] - Disable ANSI intervals in the percentile functions
  • [SPARK-39437] - normalize plan id separately in PlanStabilitySuite
  • [SPARK-39444] - Add OptimizeSubqueries into nonExcludableRules list
  • [SPARK-39445] - Remove the window if windowExpressions is empty in column pruning
  • [SPARK-39447] - Only non-broadcast query stage can propagate empty relation
  • [SPARK-39448] - Add ReplaceCTERefWithRepartition into nonExcludableRules list
  • [SPARK-39476] - Disable unwrap-cast optimization when casting from Long to Float/Double or from Integer to Float
  • [SPARK-39493] - Update ORC to 1.7.5
  • [SPARK-39496] - Inline eval path cannot handle null structs
  • [SPARK-39505] - Escape log content rendered in UI
  • [SPARK-39543] - The option of DataFrameWriterV2 should be passed to storage properties if fallback to v1
  • [SPARK-39547] - V2SessionCatalog should not throw NoSuchDatabaseException in loadNamespaceMetadata
  • [SPARK-39548] - CreateView command with a window clause query hits a wrong "window definition not found" issue
  • [SPARK-39551] - Add AQE invalid plan check
  • [SPARK-39570] - inline table should allow expressions with alias
  • [SPARK-39575] - ByteBuffer forget to rewind after get in AvroDeserializer
  • [SPARK-39582] - "Since <version>" docs on array_agg are incorrect
  • [SPARK-39596] - Running the `Linters, licenses, dependencies and documentation generation` GitHub Actions job failed
  • [SPARK-39601] - AllocationFailure should not be treated as exitCausedByApp when driver is shutting down
  • [SPARK-39612] - The DataFrame returned by exceptAll() can no longer perform operations such as count() or isEmpty() without throwing an exception
  • [SPARK-39614] - K8s pod name follows `DNS Subdomain Names` rule
  • [SPARK-39620] - History server page and API are using inconsistent conditions to filter running applications
  • [SPARK-39621] - Make run-tests.py robust by avoiding `rmtree` usage
  • [SPARK-39622] - ParquetIOSuite fails intermittently on master branch
  • [SPARK-39647] - Block push fails with java.lang.IllegalArgumentException: Active local dirs list has not been updated by any executor registration even when the NodeManager hasn't been restarted
  • [SPARK-39648] - Fix type hints of `like`, `rlike`, `ilike` of Column
  • [SPARK-39650] - Streaming Deduplication should not check the schema of "value"
  • [SPARK-39672] - NotExists subquery failed with conflicting attributes
  • [SPARK-39696] - Uncaught exception in thread executor-heartbeater java.util.ConcurrentModificationException: mutation occurred during iteration
  • [SPARK-39703] - Mima complains with Scala 2.13 in the master branch
  • [SPARK-39714] - Resolve pyspark mypy part tests.
  • [SPARK-39731] - Correctness issue when parsing dates with yyyyMMdd format in CSV and JSON
  • [SPARK-39743] - Unable to set zstd compression level while writing parquet files
  • [SPARK-39758] - NPE on invalid patterns from the regexp functions
  • [SPARK-39761] - Add Apache Spark images info in running-on-kubernetes doc
  • [SPARK-39775] - Regression due to AVRO-2035
  • [SPARK-39776] - Join's verbose string doesn't contain JoinType
  • [SPARK-39783] - Column backticks are misplaced in the AnalysisException [UNRESOLVED_COLUMN] error message when using field with "."
  • [SPARK-39829] - Upgrade log4j2 to 2.18.0
  • [SPARK-39830] - Add a test case to read ORC table that requires type promotion
  • [SPARK-39833] - Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true
  • [SPARK-39835] - Fix EliminateSorts remove global sort below the local sort
  • [SPARK-39839] - Handle special case of null variable-length Decimal with non-zero offsetAndSize in UnsafeRow structural integrity check
  • [SPARK-39847] - Race condition related to interruption of task threads while they are in RocksDBLoader.loadLibrary()
  • [SPARK-39848] - Upgrade Kafka to 3.2.1
  • [SPARK-39857] - V2ExpressionBuilder uses the wrong LiteralValue data type for In predicate
  • [SPARK-39867] - Global limit should not inherit OrderPreservingUnaryNode
  • [SPARK-39880] - V2 SHOW FUNCTIONS command should print qualified function name like v1
  • [SPARK-39887] - Expression transform error
  • [SPARK-39895] - pyspark drop doesn't accept *cols
  • [SPARK-39896] - The structural integrity of the plan is broken after UnwrapCastInBinaryComparison
  • [SPARK-39900] - Issue with querying dataframe produced by 'binaryFile' format using 'not' operator
  • [SPARK-39915] - Dataset.repartition(N) may not create N partitions
  • [SPARK-39932] - WindowExec should clear the final partition buffer
  • [SPARK-39936] - Spark View creation with hyphens in column-type names fails
  • [SPARK-39939] - shift() function needs to support periods=0
  • [SPARK-39940] - Batch query cannot read the updates from streaming query if streaming query writes to the catalog table via DSv1 sink
  • [SPARK-39943] - Upgrade rocksdbjni to 7.4.4
  • [SPARK-39945] - Upgrade sbt-mima-plugin to 1.1.0
  • [SPARK-39952] - SaveIntoDataSourceCommand should recache result relation
  • [SPARK-39962] - Global aggregation against pandas aggregate UDF does not take the column order into account
  • [SPARK-39974] - Create separate static image tag for infra cache
  • [SPARK-39976] - NULL check in ArrayIntersect adds extraneous null from first param
  • [SPARK-39980] - Change infra image to static tag
  • [SPARK-39981] - CheckOverflowInTableInsert returns exception rather than throwing it
  • [SPARK-39988] - LevelDBIterator not closed after use in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver`
  • [SPARK-40002] - Limit improperly pushed down through window using ntile function
  • [SPARK-40036] - LevelDB/RocksDBIterator.next should return false after iterator or db close
  • [SPARK-40045] - The order of filtering predicates is not reasonable
  • [SPARK-40052] - Handle direct byte buffers in VectorizedDeltaBinaryPackedReader
  • [SPARK-40057] - Cleanup "<BLANKLINE>" in doctest
  • [SPARK-40079] - Add Imputer inputCols validation for empty input case
  • [SPARK-40089] - Sorting of at least Decimal(20, 2) fails for some values near the max.
  • [SPARK-40094] - Send TaskEnd event when task failed with NotSerializableException or TaskOutputFileAlreadyExistException to release executors for dynamic allocation
  • [SPARK-40096] - Finalize shuffle merge slow due to connection creation fails
  • [SPARK-40114] - Arrow 9.0.0 support with SparkR
  • [SPARK-40117] - Convert condition to java in DataFrameWriterV2.overwrite
  • [SPARK-40121] - Initialize projection used for Python UDF
  • [SPARK-40124] - Update TPCDS v1.4 q32 for Plan Stability tests
  • [SPARK-40132] - MultilayerPerceptronClassifier rawPredictionCol param missing from setParams
  • [SPARK-40134] - Update ORC to 1.7.6
  • [SPARK-40149] - Star expansion after outer join asymmetrically includes joining key
  • [SPARK-40151] - Fix return type for new median(interval) function
  • [SPARK-40152] - Codegen compilation error when using split_part
  • [SPARK-40156] - url_decode() exposes a Java error
  • [SPARK-40168] - Handle FileNotFoundException when shuffle file deleted in decommissioner
  • [SPARK-40169] - Fix the issue with Parquet column index and predicate pushdown in Data source V1
  • [SPARK-40202] - Allow a dictionary in SparkSession.config in PySpark
  • [SPARK-40212] - SparkSQL castPartValue does not properly handle byte & short
  • [SPARK-40218] - GROUPING SETS should preserve the grouping columns
  • [SPARK-40245] - Fix FileScan equality check when partition or data filter columns are not read
  • [SPARK-40247] - Fix BitSet equality check
  • [SPARK-40261] - DirectTaskResult meta should not be counted into result size
  • [SPARK-40270] - Make compute.max_rows as None work in DataFrame.style
  • [SPARK-40280] - Failure to create parquet predicate push down for ints and longs on some valid files
  • [SPARK-40295] - Allow v2 functions with literal args in write distribution and ordering
  • [SPARK-40297] - CTE outer reference nested in CTE main body cannot be resolved
  • [SPARK-40303] - The performance will be worse after codegen
  • [SPARK-40314] - Add inline Scala and Python bindings
  • [SPARK-40315] - Non-deterministic hashCode() calculations for ArrayBasedMapData on equal objects
  • [SPARK-40320] - When the Executor plugin fails to initialize, the Executor shows as active but never accepts tasks, as if it were hung
  • [SPARK-40322] - Fix all dead links
  • [SPARK-40323] - Update ORC to 1.8.0
  • [SPARK-40380] - Constant-folding of InvokeLike should not result in non-serializable result
  • [SPARK-40385] - Classes with companion object constructor fails interpreted path
  • [SPARK-40403] - Negative size in error message when unsafe array is too big
  • [SPARK-40407] - Repartition of DataFrame can result in severe data skew in some special case
  • [SPARK-40429] - Only set KeyGroupedPartitioning when the referenced column is in the output
  • [SPARK-40440] - Fix wrong reference and content in PS windows related doc
  • [SPARK-40460] - Streaming metrics is zero when select _metadata
  • [SPARK-40468] - Column pruning is not handled correctly in CSV when _corrupt_record is used
  • [SPARK-40470] - arrays_zip output unexpected alias column names when using GetMapValue and GetArrayStructFields
  • [SPARK-40480] - Remove push-based shuffle data after query finished
  • [SPARK-40482] - Revert SPARK-24544 Print actual failure cause when look up function failed
  • [SPARK-40492] - Perform maintenance of StateStore instances when they become inactive
  • [SPARK-40496] - Configs to control "enableDateTimeParsingFallback" are incorrectly swapped
  • [SPARK-40508] - Treat unknown partitioning as UnknownPartitioning
  • [SPARK-40521] - PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition
  • [SPARK-40535] - NPE from observe of collect_list
  • [SPARK-40562] - Add spark.sql.legacy.groupingIdWithAppendedUserGroupBy
  • [SPARK-40563] - Error in WHERE clause when a SQL CASE executes the ELSE branch
  • [SPARK-40565] - Non-deterministic filters shouldn't get pushed to V2 file sources
  • [SPARK-40583] - Documentation error in "Integration with Cloud Infrastructures"
  • [SPARK-40612] - On Kubernetes, for long-running apps, Spark uses an invalid principal to renew the delegation token
  • [SPARK-40617] - Assertion failed in ExecutorMetricsPoller "task count shouldn't below 0"
  • [SPARK-40618] - Bug in MergeScalarSubqueries rule attempting to merge nested subquery with parent
  • [SPARK-40622] - Result of a single task in collect() must fit in 2GB
  • [SPARK-40635] - Scala 2.12 + Hadoop 2 + JDK 8 Daily Test failed
  • [SPARK-40660] - Switch to XORShiftRandom to distribute elements
  • [SPARK-40670] - NPE in applyInPandasWithState when the input schema has "non-nullable" column(s)
  • [SPARK-40694] - Add permission for label GitHub Action job
  • [SPARK-40695] - Add permission for notify and status update job
  • [SPARK-40696] - Add permission for infra image
  • [SPARK-40703] - Performance regression for joins in Spark 3.3 vs Spark 3.2
  • [SPARK-40705] - Issue with spark converting Row to Json using Scala 2.13
  • [SPARK-40738] - spark-shell fails with "bad array subscript" in cygwin or msys bash session
  • [SPARK-40739] - "sbt packageBin" fails in cygwin or other windows bash session
  • [SPARK-40753] - Fix bug in test case for catalog directory operation
  • [SPARK-40771] - Estimated size in log message can overflow Int
  • [SPARK-40775] - V2 file scans have duplicative descriptions
  • [SPARK-40798] - Alter partition should verify value
  • [SPARK-40806] - Typo fix: CREATE TABLE -> REPLACE TABLE
  • [SPARK-40815] - SymlinkTextInputFormat returns incorrect result due to enabled spark.hadoopRDD.ignoreEmptySplits
  • [SPARK-40817] - Remote spark.jars URIs ignored for Spark on Kubernetes in cluster mode
  • [SPARK-40819] - Parquet INT64 (TIMESTAMP(NANOS,true)) now throwing Illegal Parquet type instead of automatically converting to LongType
  • [SPARK-40829] - STORED AS serde in CREATE TABLE LIKE view does not work
  • [SPARK-40838] - Upgrade infra base image to focal-20220922
  • [SPARK-40851] - TimestampFormatter behavior changed when using the latest Java 8/11/17
  • [SPARK-40858] - Cleanup github action warning
  • [SPARK-40867] - Flaky test ProtobufCatalystDataConversionSuite
  • [SPARK-40869] - KubernetesConf.getResourceNamePrefix creates invalid name prefixes
  • [SPARK-40874] - Fix broadcasts in Python UDFs when encryption is enabled
  • [SPARK-40901] - Unable to store Spark Driver logs with Absolute Hadoop based URI FS Path
  • [SPARK-40902] - Quick submission of drivers in tests to mesos scheduler results in dropping drivers
  • [SPARK-40906] - `Mode` should copy keys before inserting into Map
  • [SPARK-40907] - `PandasMode` should copy keys before inserting into Map
  • [SPARK-40924] - Unhex function works incorrectly when input has uneven number of symbols
  • [SPARK-40932] - Barrier: messages for allGather will be overridden by the following barrier APIs
  • [SPARK-40944] - Relax ordering constraint for CREATE TABLE column options
  • [SPARK-40963] - ExtractGenerator sets incorrect nullability in new Project
  • [SPARK-40969] - Unable to download spark 3.3.0 tarball after 3.3.1 release in spark-docker
  • [SPARK-40987] - Avoid creating a directory when deleting a block, causing DAGScheduler to not work
  • [SPARK-40999] - Hints on subqueries are not properly propagated
  • [SPARK-41003] - BHJ LeftAnti does not update numOutputRows when codegen is disabled
  • [SPARK-41007] - BigInteger Serialization doesn't work with JavaBean Encoder
  • [SPARK-41008] - Isotonic regression result differs from sklearn implementation
  • [SPARK-41015] - Failure of ProtobufCatalystDataConversionSuite.scala
  • [SPARK-41035] - Incorrect results or NPE when a literal is reused across distinct aggregations
  • [SPARK-41040] - Self-union streaming query may fail when using readStream.table
  • [SPARK-41047] - Remove legacy example of round function with negative scale
  • [SPARK-41049] - Nondeterministic expressions have unstable values if they are children of CodegenFallback expressions
  • [SPARK-41056] - Fix new R_LIBS_SITE behavior introduced in R 4.2
  • [SPARK-41093] - Remove netty-tcnative-classes from Spark dependencyList
  • [SPARK-41118] - to_number/try_to_number throws NullPointerException when format is null
  • [SPARK-41136] - Shorten graceful shutdown time of ExecutorPodsSnapshotsStoreImpl to prevent blocking shutdown process
  • [SPARK-41144] - UnresolvedHint should not cause query failure
  • [SPARK-41149] - Fix `SparkSession.builder.config` to support bool
  • [SPARK-41151] - Keep built-in file _metadata column nullable value consistent
  • [SPARK-41154] - Incorrect relation caching for queries with time travel spec
  • [SPARK-41162] - Anti-join must not be pushed below aggregation with ambiguous predicates
  • [SPARK-41165] - Arrow collect should factor in failures
  • [SPARK-41177] - maven test `protobuf` module failed
  • [SPARK-41178] - fix parser rule precedence between JOIN and comma
  • [SPARK-41184] - Fill NA tests are flaky
  • [SPARK-41186] - Fix doctest for new mlflow version
  • [SPARK-41187] - [Core] LiveExecutor MemoryLeak in AppStatusListener when ExecutorLost happens
  • [SPARK-41188] - Set executorEnv OMP_NUM_THREADS to be spark.task.cpus by default for spark executor JVM processes
  • [SPARK-41189] - Add an environment to switch on and off namedtuple hack
  • [SPARK-41192] - Task finished before speculative task scheduled leads to holding idle executors
  • [SPARK-41193] - Ignore `collect data with single partition larger than 2GB bytes array limit` in `DatasetLargeResultCollectingSuite` as default
  • [SPARK-41198] - Streaming query metrics is broken with CTE
  • [SPARK-41199] - Streaming query metrics is broken with mixed-up usage of DSv1 streaming source and DSv2 streaming source
  • [SPARK-41219] - Regression in IntegralDivide returning null instead of 0
  • [SPARK-41254] - YarnAllocator.rpIdToYarnResource map is not properly updated
  • [SPARK-41261] - applyInPandasWithState can produce incorrect key value in user function for timed out state
  • [SPARK-41313] - AM shutdown hook fails with IllegalStateException if AM crashes on startup (recurrence of SPARK-3900)
  • [SPARK-41327] - Fix SparkStatusTracker.getExecutorInfos by switch On/OffHeapStorageMemory info
  • [SPARK-41339] - RocksDB state store WriteBatch doesn't clean up native memory
  • [SPARK-41344] - Reading V2 datasource masks underlying error
  • [SPARK-41350] - allow simple name access of using join hidden columns after subquery alias
  • [SPARK-41365] - Stages UI page fails to load for proxy in some yarn versions
  • [SPARK-41374] - Update ORC to 1.8.1
  • [SPARK-41375] - Avoid empty latest KafkaSourceOffset
  • [SPARK-41376] - Executor netty direct memory check should respect spark.shuffle.io.preferDirectBufs
  • [SPARK-41377] - Fix spark-version-info.properties not found on Windows
  • [SPARK-41379] - Inconsistency of spark session in DataFrame in user function for foreachBatch sink in PySpark
  • [SPARK-41385] - Replace deprecated `.newInstance()` in K8s module
  • [SPARK-41395] - InterpretedMutableProjection can corrupt unsafe buffer when used with decimal data
  • [SPARK-41411] - Multi-Stateful Operator watermark support bug fix
  • [SPARK-41437] - Do not optimize the input query twice for v1 write fallback
  • [SPARK-41448] - Make consistent MR job IDs in FileBatchWriter and FileFormatWriter
  • [SPARK-41452] - to_char throws NullPointerException when format is null
  • [SPARK-41458] - Correctly transform the SPI services for Yarn Shuffle Service
  • [SPARK-41468] - Fix PlanExpression handling in EquivalentExpressions
  • [SPARK-41475] - Fix lint-scala command error
  • [SPARK-41522] - GA dependencies test failed
  • [SPARK-41535] - InterpretedUnsafeProjection and InterpretedMutableProjection can corrupt unsafe buffer when used with calendar interval data
  • [SPARK-41539] - stats and constraints in LogicalRDD may not be in sync with output attributes
  • [SPARK-41554] - Decimal.changePrecision produces ArrayIndexOutOfBoundsException
  • [SPARK-41668] - DECODE function returns wrong results when passed NULL
  • [SPARK-41683] - Spark UI: In jobs API, numActiveStages can be negative in some cases
  • [SPARK-41732] - Session window: analysis rule "SessionWindowing" does not apply tree-pattern based pruning
  • [SPARK-41733] - Session window: analysis rule "ResolveWindowTime" does not apply tree-pattern based pruning
  • [SPARK-41735] - Any SparkThrowable (with an error class) not in error-classes.json is masked in SQLExecution.withNewExecutionId and end-user will see "org.apache.spark.SparkException: [INTERNAL_ERROR]"
  • [SPARK-41741] - [SQL] ParquetFilters StringStartsWith push down matching string does not use UTF-8
  • [SPARK-41780] - `regexp_replace('', '[a\\\\d]{0, 2}', 'x')` causes an internal error
  • [SPARK-41790] - Set TRANSFORM reader and writer's format correctly
  • [SPARK-41792] - Shuffle merge finalization removes the wrong finalization state from the DB
  • [SPARK-41793] - Incorrect result for window frames defined by a range clause on large decimals
  • [SPARK-41804] - InterpretedUnsafeProjection doesn't properly handle an array of UDTs
  • [SPARK-41848] - Tasks are over-scheduled with TaskResourceProfile
  • [SPARK-41858] - Fix ORC reader perf regression due to DEFAULT value feature
  • [SPARK-41859] - CreateHiveTableAsSelectCommand should set the overwrite flag correctly
  • [SPARK-41894] - sql/core module mvn clean failed
  • [SPARK-41896] - Filtering by row_index always returns empty results
  • [SPARK-41912] - Subquery should not validate CTE
  • [SPARK-41914] - Sorting issue with partitioned-writing and planned write optimization disabled
  • [SPARK-41937] - SparkR datetime column compare with Sys.time() throws error in R (>= 4.2.0)
  • [SPARK-41947] - Update the contents of error class guidelines
  • [SPARK-41948] - Fix NPE for error classes: CANNOT_PARSE_JSON_FIELD
  • [SPARK-41952] - Upgrade Parquet to fix off-heap memory leaks in Zstd codec
  • [SPARK-41958] - Disallow arbitrary custom classpath with proxy user in cluster mode
  • [SPARK-41982] - When the inserted partition type is of string type, a value like `dt=01` will be converted to `dt=1`
  • [SPARK-41985] - Centralize more column resolution rules
  • [SPARK-41989] - PYARROW_IGNORE_TIMEZONE warning can break application logging setup
  • [SPARK-41990] - Filtering by composite field name like `field name` doesn't work with pushDownPredicate = true
  • [SPARK-41991] - Interpreted mode subexpression elimination can throw exception during insert
  • [SPARK-42046] - Add `connect-client-jvm` to connect module
  • [SPARK-42057] - Avoid losing exception info in Protobuf errors
  • [SPARK-42059] - Update ORC to 1.8.2
  • [SPARK-42061] - Mark Expressions that have state as stateful
  • [SPARK-42066] - The DATATYPE_MISMATCH error class contains inappropriate and duplicating subclasses
  • [SPARK-42084] - Avoid leaking the qualified-access-only restriction
  • [SPARK-42088] - Running python3 setup.py sdist on windows reports a permission error
  • [SPARK-42090] - Introduce sasl retry count in RetryingBlockTransferor
  • [SPARK-42109] - Upgrade Kafka to 3.3.2
  • [SPARK-42112] - Add null check before `ContinuousWriteRDD#compute` method close dataWriter
  • [SPARK-42113] - Upgrade pandas to 1.5.3
  • [SPARK-42115] - Push down limit through Python UDFs
  • [SPARK-42134] - Fix getPartitionFiltersAndDataFilters() to handle filters without referenced attributes
  • [SPARK-42156] - Support client-side retries in Spark Connect Python client
  • [SPARK-42157] - `spark.scheduler.mode=FAIR` should provide FAIR scheduler
  • [SPARK-42162] - Memory usage on executors increased drastically for a complex query with large number of addition operations
  • [SPARK-42163] - Schema pruning fails on non-foldable array index or map key
  • [SPARK-42171] - Fix `pyspark-errors` module and enable it in GitHub Action
  • [SPARK-42174] - Use scikit-learn instead of sklearn
  • [SPARK-42176] - Cast boolean to timestamp fails with ClassCastException
  • [SPARK-42177] - Change master to branch-3.4 in GitHub Actions
  • [SPARK-42186] - Make SparkR able to stop properly when the connection is timed-out
  • [SPARK-42196] - Typo in StreamingQuery.scala
  • [SPARK-42201] - `build/sbt` should allow SBT_OPTS to override JVM memory setting
  • [SPARK-42228] - connect-client-jvm module should shade and relocate grpc
  • [SPARK-42241] - Correct the condition for `SparkConnectServerUtils#findSparkConnectJar` to find the correct connect server jar for maven
  • [SPARK-42242] - Upgrade snappy-java to 1.1.9.1
  • [SPARK-42250] - predict_batch_udf with float fails when the batch consists of a single value
  • [SPARK-42259] - ResolveGroupingAnalytics should take care of Python UDAF
  • [SPARK-42274] - Upgrade `compress-lzf` to 1.1.2
  • [SPARK-42276] - Add ServicesResourceTransformer to connect server module shade configuration
  • [SPARK-42286] - Fix internal error for valid CASE WHEN expression with CAST when inserting into a table
  • [SPARK-42331] - Fix metadata col that cannot be resolved
  • [SPARK-42344] - The default value of CONFIG_MAP_MAXSIZE should not be greater than 1048576
  • [SPARK-42346] - distinct(count colname) with UNION ALL causes query analyzer bug
  • [SPARK-42384] - Mask function's generated code does not handle null input
  • [SPARK-42401] - Incorrect results or NPE when inserting null value into array using array_insert/array_append
  • [SPARK-42403] - JsonProtocol should handle null JSON strings
  • [SPARK-42406] - [PROTOBUF] Recursive field handling is incompatible with delta
  • [SPARK-42410] - Support Scala 2.12/2.13 tests in connect module
  • [SPARK-42416] - Dataset operations should not resolve the analyzed logical plan again
  • [SPARK-42444] - DataFrame.drop should handle multi columns properly
  • [SPARK-42445] - Fix SparkR install.spark function
  • [SPARK-42448] - Spark SQL shell prompts wrong database info
  • [SPARK-42462] - Prevent `docker-image-tool.sh` from publishing OCI manifests
  • [SPARK-42478] - Make a serializable jobTrackerId instead of a non-serializable JobID in FileWriterFactory
  • [SPARK-42515] - ClientE2ETestSuite local test failed
  • [SPARK-42516] - Non-captured session time zone in view creation
  • [SPARK-42534] - Fix DB2 Limit clause
  • [SPARK-42547] - Make PySpark work with Python 3.7
  • [SPARK-42596] - [YARN] OMP_NUM_THREADS not set to number of executor cores by default
  • [SPARK-42600] - currentDatabase Shall use NamespaceHelper instead of MultipartIdentifierHelper
  • [SPARK-42608] - Use full column names for inner fields in resolution errors
  • [SPARK-42611] - Insert char/varchar length checks for inner fields during resolution
  • [SPARK-42616] - SparkSQLCLIDriver shall only close started hive sessionState
  • [SPARK-42655] - Incorrect ambiguous column reference error
  • [SPARK-42665] - `simple udf` test failed using Maven
  • [SPARK-42673] - Make build/mvn build Spark only with the verified maven version
  • [SPARK-42677] - Fix the invalid tests for broadcast hint
  • [SPARK-42697] - /api/v1/applications returns 0 for duration
  • [SPARK-42700] - Add h2 as test dependency of connect-server module
  • [SPARK-42709] - Do not rely on __file__
  • [SPARK-42851] - EquivalentExpressions methods need to be consistently guarded by supportedExpression
  • [SPARK-42928] - Make resolvePersistentFunction synchronized
  • [SPARK-42936] - Unresolved having at the end of analysis when using LCA with a having clause that can be resolved directly by its child Aggregate
  • [SPARK-42967] - Fix SparkListenerTaskStart.stageAttemptId when a task is started after the stage is cancelled
  • [SPARK-42971] - When processing the WorkDirCleanup event, if appDirs is empty, should print workdir
  • [SPARK-43041] - Restore constructors of exceptions for compatibility in connector API
  • [SPARK-43158] - Set upperbound of pandas version in binder integrations
  • [SPARK-43538] - Spark Homebrew Formulae currently depend on non-officially-supported Java 20

Epic

  • [SPARK-32082] - Project Zen: Improving Python usability
  • [SPARK-40653] - Protobuf Support in Structured Streaming
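
  A minimal, hypothetical sketch of the Protobuf support tracked by SPARK-40653 above (the from_protobuf/to_protobuf functions themselves appear under Improvement as SPARK-40654, SPARK-40655 and SPARK-40657). It assumes the spark-protobuf package is on the classpath and that ./events.desc is a compiled descriptor set containing an Event message with id and name fields; the file name, message name and field names are made up for illustration.

      # Hypothetical round trip through the Protobuf functions added in 3.4.
      # Assumption: ./events.desc is a compiled descriptor set with a message
      # named "Event" holding `id` (int64) and `name` (string) fields.
      from pyspark.sql import SparkSession
      from pyspark.sql.functions import struct
      from pyspark.sql.protobuf.functions import from_protobuf, to_protobuf

      spark = SparkSession.builder.appName("protobuf-sketch").getOrCreate()

      df = spark.createDataFrame([(1, "click"), (2, "view")], ["id", "name"])

      # Encode a struct column into Protobuf binary ...
      encoded = df.select(
          to_protobuf(struct("id", "name"), "Event", descFilePath="./events.desc").alias("value")
      )

      # ... and decode it back into a struct, as a streaming pipeline would.
      decoded = encoded.select(
          from_protobuf("value", "Event", descFilePath="./events.desc").alias("event")
      )
      decoded.select("event.id", "event.name").show()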

Story

  • [SPARK-40211] - Allow executeTake() / collectLimit's number of starting partitions to be customized

New Feature

  • [SPARK-27561] - Support "lateral column alias references" to allow column aliases to be used within SELECT clauses
  • [SPARK-30641] - Project Matrix: Linear Models revisit and refactor
  • [SPARK-35662] - Support Timestamp without time zone data type
  • [SPARK-37568] - Support 2 arguments in the convert_timezone() function
  • [SPARK-37671] - Support ANSI Aggregation Function of regression
  • [SPARK-38591] - Add sortWithinGroups to KeyValueGroupedDataset
  • [SPARK-38647] - Add SupportsReportOrdering mix in interface for Scan
  • [SPARK-38864] - Unpivot / melt function for Dataset API
  • [SPARK-38904] - Low cost DataFrame schema swap util
  • [SPARK-39057] - Offset could work without Limit
  • [SPARK-39071] - Add unwrap_udt function for unwrapping UserDefinedType columns
  • [SPARK-39159] - Add new Dataset API for Offset
  • [SPARK-39168] - Consider all values in a python list when inferring schema
  • [SPARK-39305] - Implement the EQUAL_NULL function
  • [SPARK-39306] - support scalar subquery in time travel
  • [SPARK-39320] - Add the MEDIAN() function
  • [SPARK-39457] - Support IPv6-only environment
  • [SPARK-39567] - Support ANSI intervals in the percentile functions
  • [SPARK-39618] - Add the REGEXP_COUNT function
  • [SPARK-39625] - add Dataset.to(StructType)
  • [SPARK-39695] - Add the REGEXP_SUBSTR function
  • [SPARK-39741] - Support url encode/decode as built-in function
  • [SPARK-39744] - Add the REGEXP_INSTR function
  • [SPARK-39808] - Support aggregate function MODE
  • [SPARK-39876] - Unpivot / melt function for SQL
  • [SPARK-39877] - Unpivot / melt function for PySpark (see the sketch after this list)
  • [SPARK-40003] - Add median to PySpark
  • [SPARK-40007] - Add Mode to PySpark
  • [SPARK-40015] - Add sc.listArchives and sc.listFiles to PySpark
  • [SPARK-40087] - Support multiple Column drop in R
  • [SPARK-40264] - Add helper function for DL model inference in pyspark.ml.functions
  • [SPARK-40281] - Memory Profiler on Executors
  • [SPARK-40530] - Add error-related developer APIs
  • [SPARK-40585] - Support double-quoted identifiers
  • [SPARK-40849] - Async log purge
  • [SPARK-40956] - SQL Equivalent for Dataframe overwrite command
  • [SPARK-40957] - Add in memory cache in HDFSMetadataLog
  • [SPARK-41183] - Add an extension API to do plan normalization for caching
  • [SPARK-41195] - Support PIVOT/UNPIVOT with join children
  • [SPARK-41271] - Parameterized SQL
  • [SPARK-41290] - Support GENERATED ALWAYS AS syntax in create/replace table to create a generated column
  • [SPARK-41323] - Support CURRENT_SCHEMA() as alias for CURRENT_DATABASE()
  • [SPARK-41378] - Support Column Stats in DS V2
  • [SPARK-41515] - PVC-oriented executor pod allocation
  • [SPARK-41635] - GROUP BY ALL
  • [SPARK-41637] - ORDER BY ALL
  • [SPARK-41666] - Support parameterized SQL in PySpark
  • [SPARK-42477] - python: accept user_agent in spark connect's connection string
  • [SPARK-42556] - Dataset.colregex should link a plan_id when it only matches a single column.
  • [SPARK-42610] - Add implicit encoders to SQLImplicits
  • [SPARK-42614] - Make all constructors private[sql]
  • [SPARK-42632] - Fix scala paths in tests
  • [SPARK-42637] - Add SparkSession.stop
  • [SPARK-42680] - Create the helper function withSQLConf for connect's test
  • [SPARK-42690] - Implement CSV/JSON parsing functions
  • [SPARK-42884] - Add Ammonite REPL support
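
  Several of the user-facing additions above compose naturally. The following is a minimal PySpark sketch on a toy in-memory DataFrame (all column and view names are made up) of unpivot/melt (SPARK-38864, SPARK-39877), the MEDIAN() aggregate (SPARK-39320) and GROUP BY ALL / ORDER BY ALL (SPARK-41635, SPARK-41637).

      # Toy data: one row per id with quarterly sales in separate columns.
      from pyspark.sql import SparkSession

      spark = SparkSession.builder.appName("spark-3-4-new-features").getOrCreate()

      wide = spark.createDataFrame(
          [(1, 10.0, 12.0), (2, 11.0, 14.0), (3, 9.0, 14.0)],
          ["id", "q1_sales", "q2_sales"],
      )

      # unpivot/melt (SPARK-39877): turn the quarter columns into (quarter, sales) rows.
      long_df = wide.unpivot(
          ids=["id"],
          values=["q1_sales", "q2_sales"],
          variableColumnName="quarter",
          valueColumnName="sales",
      )
      long_df.createOrReplaceTempView("sales_long")

      # GROUP BY ALL groups by every non-aggregate item in the SELECT list (SPARK-41635),
      # ORDER BY ALL orders by the whole SELECT list (SPARK-41637), and MEDIAN() is one
      # of the new aggregate functions (SPARK-39320).
      spark.sql("""
          SELECT quarter, median(sales) AS median_sales
          FROM sales_long
          GROUP BY ALL
          ORDER BY ALL
      """).show()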

Improvement

  • [SPARK-25050] - Handle more than two types in avro union types when writing avro files
  • [SPARK-29260] - Enable supported Hive metastore versions once it supports altering database location
  • [SPARK-32170] - Improve the speculation for the inefficient tasks by the task metrics.
  • [SPARK-33605] - Add gcs-connector to hadoop-cloud module
  • [SPARK-33753] - Reduce the memory footprint and gc of the cache (hadoopJobMetadata)
  • [SPARK-34265] - Instrument Python UDF execution using SQL Metrics
  • [SPARK-34659] - Web UI does not correctly get appId
  • [SPARK-34927] - Support TPCDSQueryBenchmark in Benchmarks
  • [SPARK-35242] - Support change catalog default database for spark
  • [SPARK-35739] - [Spark SQL] Add Java-compatible Dataset.join overloads
  • [SPARK-35743] - Improve Parquet vectorized reader
  • [SPARK-36259] - Expose localtimestamp in pyspark.sql.functions
  • [SPARK-36462] - Allow Spark on Kube to operate without polling or watchers
  • [SPARK-36664] - Log time spent waiting for cluster resources
  • [SPARK-36837] - Upgrade Kafka to 3.1.0
  • [SPARK-37348] - PySpark pmod function
  • [SPARK-37523] - Support optimize skewed partitions in Distribution and Ordering if numPartitions is not specified
  • [SPARK-37825] - Make spark beeline be able to handle javaOpts
  • [SPARK-37956] - Add Java and Python examples to the Parquet encryption feature documentation
  • [SPARK-37961] - override maxRows/maxRowsPerPartition for some logical operators
  • [SPARK-37980] - Extend METADATA column to support row indices for file based data sources
  • [SPARK-38034] - Optimize time complexity and extend applicable cases for TransposeWindow
  • [SPARK-38098] - Add support for ArrayType of nested StructType to arrow-based conversion
  • [SPARK-38194] - Make memory overhead factor configurable
  • [SPARK-38277] - Clear write batch after RocksDB state store's commit
  • [SPARK-38334] - Implement support for DEFAULT values for columns in tables
  • [SPARK-38349] - No need to filter events when session window gapDuration is greater than 0
  • [SPARK-38522] - Strengthen the contract on iterator method in StateStore
  • [SPARK-38541] - Upgrade netty to 4.1.75
  • [SPARK-38545] - Upgrade scala-maven-plugin from 4.4.0 to 4.5.6
  • [SPARK-38555] - Avoid contention and get or create clientPools quickly in the TransportClientFactory
  • [SPARK-38564] - Support collecting metrics from streaming sinks
  • [SPARK-38568] - Upgrade ZSTD-JNI to 1.5.2-2
  • [SPARK-38569] - external top-level directory is problematic for bazel
  • [SPARK-38573] - Support Auto Partition Statistics Collection
  • [SPARK-38575] - Deduplicate branch specification in GitHub Actions workflow
  • [SPARK-38582] - Add KubernetesUtils.buildEnvVars(WithFieldRef)? utility functions
  • [SPARK-38584] - Unify the data validation
  • [SPARK-38585] - Simplify the code of TreeNode.clone()
  • [SPARK-38593] - Incorporate numRowsDroppedByWatermark metric from SessionWindowStateStoreRestoreExec into StateOperatorProgress
  • [SPARK-38594] - Change to use `NettyUtils` to create `EventLoop` and `ChannelClass` in RBackend
  • [SPARK-38611] - Use `assertThrows` instead of handwriting `intercept` method in `CatalogLoadingSuite`
  • [SPARK-38619] - Clean up Junit api usage in scalatest
  • [SPARK-38620] - Replace `value.formatted(formatString)` with `formatString.format(value)` to clean up compilation warning
  • [SPARK-38622] - Upgrade jersey to 2.35
  • [SPARK-38624] - Reduce UnsafeProjection.create call times when Percentile function serializes the aggregation buffer object
  • [SPARK-38635] - Remove duplicate log for spark ApplicationMaster
  • [SPARK-38641] - Get rid of invalid configuration elements in mvn_scalafmt in main pom.xml
  • [SPARK-38646] - Pull a trait out for Python functions
  • [SPARK-38660] - PySpark DeprecationWarning: distutils Version classes are deprecated
  • [SPARK-38661] - [TESTS] Replace 'abc & Symbol("abc") symbols with $"abc" in tests
  • [SPARK-38670] - Add offset commit time to streaming query listener
  • [SPARK-38671] - Publish snapshot from branch-3.3
  • [SPARK-38673] - Replace java assert with Junit api in Java UTs
  • [SPARK-38674] - Remove useless deduplicate in SubqueryBroadcastExec
  • [SPARK-38679] - Expose the number partitions in a stage to TaskContext
  • [SPARK-38683] - It is unnecessary to release the ShuffleManagedBufferIterator or ShuffleChunkManagedBufferIterator or ManagedBufferIterator buffers when the client channel's connection is terminated
  • [SPARK-38694] - Simplify Java UT code with Junit `assertThrows`
  • [SPARK-38711] - Refactor pyspark.sql.streaming module
  • [SPARK-38713] - Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set
  • [SPARK-38756] - Clean up useless security configs in `TransportConf`
  • [SPARK-38757] - Update the Oracle docker image version used for test and integration
  • [SPARK-38759] - Add StreamingQueryListener support in PySpark
  • [SPARK-38760] - Implement DataFrame.observe in PySpark
  • [SPARK-38767] - Support ignoreCorruptFiles and ignoreMissingFiles in Data Source options
  • [SPARK-38770] - Remove renameMainAppResource from baseDriverContainer
  • [SPARK-38772] - Formatting the log plan in AdaptiveSparkPlanExec
  • [SPARK-38779] - Unify the pushed operator checking between FileSource test suite and JDBC test suite
  • [SPARK-38797] - Runtime Filter support pushdown through window
  • [SPARK-38798] - Make `spark.file.transferTo` as an `ConfigEntry`
  • [SPARK-38803] - Set minio cpu to 250m (0.25) in K8s IT
  • [SPARK-38804] - Add StreamingQueryManager.removeListener in PySpark
  • [SPARK-38826] - dropFieldIfAllNull option does not work for empty JSON struct
  • [SPARK-38832] - Remove unnecessary distinct in aggregate expression by distinctKeys
  • [SPARK-38835] - Refactor FsHistoryProviderSuite to test rocks db
  • [SPARK-38836] - Increase the performance of ExpressionSet
  • [SPARK-38841] - Enable Bloom filter join by default
  • [SPARK-38847] - Introduce a `viewToSeq` function for `KVUtils`
  • [SPARK-38848] - Replace all `@Test(expected = XXException)` with assertThrows
  • [SPARK-38850] - Upgrade Kafka to 3.2.0
  • [SPARK-38851] - Refactor `HistoryServerSuite` to add UTs for RocksDB
  • [SPARK-38881] - PySpark Kinesis Streaming should expose metricsLevel CloudWatch config that is already supported in the Scala/Java APIs
  • [SPARK-38885] - Upgrade netty to 4.1.76
  • [SPARK-38886] - Remove outer join if aggregate functions are duplicate agnostic on streamed side
  • [SPARK-38888] - Add `RocksDBProvider` similar to `LevelDBProvider`
  • [SPARK-38896] - Use tryWithResource to recycling KVStoreIterator
  • [SPARK-38909] - Encapsulate LevelDB used by ExternalShuffleBlockResolver and YarnShuffleService as LocalDB
  • [SPARK-38914] - Allow user to insert specified columns into insertable view
  • [SPARK-38921] - Use k8s-client to create queue resource in Volcano IT
  • [SPARK-38929] - Improve error messages for cast failures in ANSI
  • [SPARK-38940] - Test Series' anchor frame for in-place updates on Series
  • [SPARK-38966] - Fix CI for fork branches in-sync with upstream master
  • [SPARK-38968] - remove hadoopConf from KerberosConfDriverFeatureStep
  • [SPARK-38970] - Skip build-and-test workflow on forks when scheduled
  • [SPARK-38971] - Test anchor frame for in-place `Series.rename_axis`
  • [SPARK-38979] - Improve error log readability in OrcUtils.requestedColumnIds
  • [SPARK-38985] - Support sub-error-class for UNSUPPORTED_FEATURE et al
  • [SPARK-38999] - Refactor DataSourceScanExec code to
  • [SPARK-39002] - StringEndsWith/Contains support push down to Parquet so that we can leverage dictionary filter
  • [SPARK-39014] - Respect ignoreMissingFiles from Data Source options in InMemoryFileIndex
  • [SPARK-39016] - Fix compilation warnings related to "`enum` will become a keyword in Scala 3"
  • [SPARK-39038] - Skip reporting test results if triggering workflow was skipped
  • [SPARK-39042] - Use `Map.values()` instead of `Map.entrySet()` in scenarios that do not use `keys`
  • [SPARK-39050] - Convert UNSUPPORTED_OPERATION to UNSUPPORTED_FEATURE
  • [SPARK-39051] - Minor refactoring of `python/pyspark/sql/pandas/conversion.py`
  • [SPARK-39052] - Support Char in Literal.create
  • [SPARK-39062] - Add Standalone backend support for Stage Level Scheduling
  • [SPARK-39067] - Upgrade scala-maven-plugin to 4.6.1
  • [SPARK-39068] - Make thriftserver and sparksql-cli support in-memory catalog
  • [SPARK-39073] - Keep rowCount after hive table partition pruning if table only have hive statistics
  • [SPARK-39102] - Replace the usage of guava's Files.createTempDir() with java.nio.file.Files.createTempDirectory()
  • [SPARK-39111] - Mark overriden methods with `@override` annotation
  • [SPARK-39113] - rename self to cls in python/pyspark/mllib/clustering.py
  • [SPARK-39116] - Replace double negation in exists with forall
  • [SPARK-39119] - Upgrade to Hadoop 3.3.3
  • [SPARK-39123] - Upgrade `org.scalatestplus:mockito` to 3.2.12.0
  • [SPARK-39124] - Upgrade rocksdbjni to 7.1.2
  • [SPARK-39133] - Mention log level setting in PYSPARK_JVM_STACKTRACE_ENABLED
  • [SPARK-39134] - Add custom metric of skipped null values for stream join operator
  • [SPARK-39137] - Use slice instead of take and drop
  • [SPARK-39138] - Add ANSI general value specification and function -user
  • [SPARK-39146] - The singleton Jackson ObjectMapper should be preferred
  • [SPARK-39147] - Code simplification, use count() instead of filter().size, etc.
  • [SPARK-39152] - StreamCorruptedException cause job failure for disk persisted RDD
  • [SPARK-39156] - Remove ParquetLogRedirector usage from ParquetFileFormat
  • [SPARK-39160] - Remove workaround for ARROW-1948
  • [SPARK-39161] - Upgrade rocksdbjni to 7.2.2
  • [SPARK-39171] - Unify the Cast expression
  • [SPARK-39172] - Remove outer join if all output comes from the streamed side and the buffered side join keys are unique
  • [SPARK-39180] - Simplify the planning of limit and offset
  • [SPARK-39182] - Upgrade to Arrow 8.0.0
  • [SPARK-39186] - make skew consistent with pandas
  • [SPARK-39192] - make pandas-on-spark's kurt consistent with pandas
  • [SPARK-39196] - Replace getOrElse(null) with orNull
  • [SPARK-39204] - Replace `Utils.createTempDir` related methods with JavaUtils
  • [SPARK-39205] - Add `PANDAS API ON SPARK` label
  • [SPARK-39213] - Create ANY_VALUE aggregate function
  • [SPARK-39217] - Make DPP support pruning sides that have a Union
  • [SPARK-39225] - Support spark.history.fs.update.batchSize
  • [SPARK-39231] - Change to use `ConstantColumnVector` to store partition columns in `VectorizedParquetRecordReader`
  • [SPARK-39235] - Make Catalog API be compatible with 3-layer-namespace
  • [SPARK-39248] - Decimal divide much slower than multiply
  • [SPARK-39251] - Simplify MultiLike if remainPatterns is empty
  • [SPARK-39254] - Upgrade ZSTD-JNI to 1.5.2-3
  • [SPARK-39256] - Reduce multiple file attribute calls of JavaUtils#deleteRecursivelyUsingJavaIO
  • [SPARK-39260] - Use `Reader.getSchema` instead of `Reader.getTypes`
  • [SPARK-39261] - Improve newline formatting for error messages
  • [SPARK-39262] - Correct the behavior of creating DataFrame from an RDD
  • [SPARK-39266] - Cleanup unused spark.rpc.numRetries and spark.rpc.retry.wait configs
  • [SPARK-39267] - Clean up dsl unnecessary symbol
  • [SPARK-39277] - Make Optimizer extends SQLConfHelper
  • [SPARK-39282] - Replace If-Else branch with bitwise operators in roundNumberOfBytesToNearestWord
  • [SPARK-39295] - Improve documentation of pandas API support list.
  • [SPARK-39298] - Change to use `seq.indices` when constructing ranges
  • [SPARK-39299] - Series.autocorr use SQL.corr to avoid conversion to vector
  • [SPARK-39301] - Leverage LocalRelation in createDataFrame with Arrow optimization
  • [SPARK-39308] - Upgrade parquet to 1.12.3
  • [SPARK-39312] - Use Parquet in predicate for Spark In filter
  • [SPARK-39318] - Remove tpch-plan-stability WithStats golden files
  • [SPARK-39321] - Refactor TryCast to use RuntimeReplaceable
  • [SPARK-39323] - Hide empty `taskResourceAssignments` from INFO log
  • [SPARK-39325] - Improve MapOutputTracker convertMapStatuses performance
  • [SPARK-39332] - Upgrade RoaringBitmap to 0.9.28
  • [SPARK-39333] - Change to use `foreach` when `map` produce no result
  • [SPARK-39349] - Add a CheckError() method to SparkFunSuite
  • [SPARK-39368] - Move RewritePredicateSubquery into InjectRuntimeFilter
  • [SPARK-39374] - Improve error message for user specified column list
  • [SPARK-39377] - Normalize expr ids in ListQuery and Exists expressions
  • [SPARK-39381] - Make vectorized ORC columnar writer batch size configurable
  • [SPARK-39387] - Upgrade hive-storage-api to 2.7.3
  • [SPARK-39388] - Reuse orcSchema when push down Orc predicates
  • [SPARK-39390] - Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log
  • [SPARK-39392] - Refine ANSI error messages and remove 'To return NULL instead'
  • [SPARK-39397] - Relax AliasAwareOutputExpression to support alias with expression
  • [SPARK-39409] - Upgrade scala-maven-plugin to 4.6.2
  • [SPARK-39414] - Upgrade Scala to 2.12.16
  • [SPARK-39428] - use code block for `Coalesce Hints for SQL Queries`
  • [SPARK-39439] - Suppress error log for in-progress event log not found
  • [SPARK-39440] - Add a config to disable event timeline
  • [SPARK-39441] - Speed up DeduplicateRelations
  • [SPARK-39443] - Improve docstring of pyspark.sql.functions.col/first
  • [SPARK-39446] - Add relevance score for nDCG evaluation in MLLIB
  • [SPARK-39449] - Propagate empty relation through Window
  • [SPARK-39456] - Fix broken function links in the auto-generated pandas API support list documentation.
  • [SPARK-39466] - Clean `core/temp-secrets/` after executing `SecurityManagerSuite`
  • [SPARK-39469] - Infer date type for CSV schema inference
  • [SPARK-39488] - Simplify the error handling of TempResolvedColumn
  • [SPARK-39489] - Improve EventLoggingListener and ReplayListener performance by replacing Json4S ASTs with Jackson trees
  • [SPARK-39492] - Rework MISSING_COLUMN error class
  • [SPARK-39497] - Improve the analysis exception of missing map key column
  • [SPARK-39511] - Push limit 1 to right side if join type is LeftSemiOrAnti and join condition is empty
  • [SPARK-39512] - Document the Spark Docker container release process
  • [SPARK-39533] - Deprecate scoreLabelsWeight in BinaryClassificationMetrics
  • [SPARK-39534] - Series.argmax only needs single pass
  • [SPARK-39538] - Convert CaseInsensitiveStringMap#logger to static
  • [SPARK-39545] - Override `concat` method for `ExpressionSet` in Scala 2.13 to improve the performance
  • [SPARK-39546] - Support ports definition in executor pod template
  • [SPARK-39564] - Expose the information of catalog table to the logical plan in streaming query
  • [SPARK-39576] - Support GitHub Actions generate benchmark results using Scala 2.13
  • [SPARK-39591] - SPIP: Asynchronous Offset Management in Structured Streaming
  • [SPARK-39595] - Upgrade rocksdbjni to 7.3.1
  • [SPARK-39599] - Upgrade maven to 3.8.6
  • [SPARK-39606] - Use child stats to estimate order operator
  • [SPARK-39613] - Upgrade shapeless to 2.3.9
  • [SPARK-39616] - Upgrade Breeze to 2.0
  • [SPARK-39626] - Upgrade RoaringBitmap from 0.9.28 to 0.9.30
  • [SPARK-39633] - Dataframe options for time travel via `timestampAsOf` should respect both formats of specifying timestamp
  • [SPARK-39635] - Custom driver metrics for Datasource v2
  • [SPARK-39636] - Fix multiple small bugs in JsonProtocol, impacting StorageLevel and Task/Executor resource requests
  • [SPARK-39638] - Change to use `ConstantColumnVector` to store partition columns in `OrcColumnarBatchReader`
  • [SPARK-39651] - Prune filter condition if compare with rand is deterministic
  • [SPARK-39653] - Remove `ColumnVectorUtils#populate(WritableColumnVector, InternalRow, int) ` method
  • [SPARK-39657] - YARN AM client should call the non-static setTokensConf method
  • [SPARK-39661] - Avoid creating unnecessary SLF4J Logger
  • [SPARK-39662] - Upgrade HtmlUnit and its related artifacts from 2.50.0 to 2.62.0.
  • [SPARK-39666] - Use UnsafeProjection.create to respect `spark.sql.codegen.factoryMode` in ExpressionEncoder
  • [SPARK-39667] - Add another workaround when there is not enough memory to build and broadcast the table
  • [SPARK-39675] - Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose
  • [SPARK-39676] - Add task partition id for Task assertEquals method in JsonProtocolSuite
  • [SPARK-39679] - TakeOrderedAndProjectExec should respect child output ordering
  • [SPARK-39689] - Support 2-chars lineSep in CSV datasource
  • [SPARK-39691] - Supplement `MapStatusesConvertBenchmark` result generated by Java 11 and 17
  • [SPARK-39693] - `tpcds-1g-gen` shouldn't execute if the benchmark GA run does not specify TPCDSQueryBenchmark
  • [SPARK-39694] - Update `${sbtProject}/test:runMain` to `${sbtProject}/Test/runMain`
  • [SPARK-39699] - Make CollapseProject smarter about collection creation expressions
  • [SPARK-39702] - Reduce memory overhead of TransportCipher$EncryptedMessage's byteRawChannel buffer
  • [SPARK-39706] - Set missing column with defaultValue as constant in `ParquetColumnVector`
  • [SPARK-39713] - ANSI mode: add suggestion of using try_element_at for INVALID_ARRAY_INDEX error
  • [SPARK-39724] - Remove duplicate `.setAccessible(true)` in `kvstore.KVTypeInfo`
  • [SPARK-39727] - Upgrade joda-time from 2.10.13 to 2.10.14
  • [SPARK-39728] - Test for parity of SQL functions between Python and JVM DataFrame API's
  • [SPARK-39733] - Add map_contains_key to pyspark.sql.functions
  • [SPARK-39734] - Add call_udf to pyspark.sql.functions
  • [SPARK-39739] - Upgrade sbt to 1.7.0
  • [SPARK-39748] - Include the origin logical plan for LogicalRDD if it comes from DataFrame
  • [SPARK-39749] - ANSI SQL mode: use plain string representation on casting Decimal to String
  • [SPARK-39751] - Better naming for hash aggregate key probing metric
  • [SPARK-39754] - Remove unused import or unnecessary {}
  • [SPARK-39755] - Improve LocalDirsFeatureStep to randomize local directories
  • [SPARK-39757] - Upgrade sbt from 1.7.0 to 1.7.1
  • [SPARK-39760] - Support Varchar in PySpark
  • [SPARK-39764] - Make PhysicalOperation the same as ScanOperation
  • [SPARK-39767] - Remove UnresolvedDBObjectName and add UnresolvedIdentifier
  • [SPARK-39784] - Put Literal values on the right side of the data source filter after translating Catalyst Expression to data source filter
  • [SPARK-39785] - Use setBufferedIo instead of withBufferedIo to cleanup log4j2 deprecated api usage
  • [SPARK-39789] - Remove unused method and redundant throw exception declare
  • [SPARK-39798] - Simplify `GenericArrayData` constructor implementation
  • [SPARK-39803] - Use commons-text LevenshteinDistance instead of commons-lang3 `StringUtils.getLevenshteinDistance`
  • [SPARK-39806] - Queries accessing METADATA struct crash on partitioned tables
  • [SPARK-39809] - Support CharType in PySpark
  • [SPARK-39812] - Simplify code to construct AggregateExpression with toAggregateExpression
  • [SPARK-39823] - add DataFrame.as(StructType) in PySpark
  • [SPARK-39831] - R dependencies installation start to fail after devtools_2.4.4 was released
  • [SPARK-39832] - regexp_replace should support column arguments
  • [SPARK-39834] - Include the origin stats and constraints for LogicalRDD if it comes from DataFrame
  • [SPARK-39840] - Factor PythonArrowInput out as a symmetry to PythonArrowOutput
  • [SPARK-39849] - Dataset.as(StructType) fills missing new columns with null value
  • [SPARK-39851] - Improve join stats estimation if one side can keep uniqueness
  • [SPARK-39853] - Support stage level schedule for standalone cluster when dynamic allocation is disabled
  • [SPARK-39858] - Remove unnecessary AliasHelper or PredicateHelper for some rules
  • [SPARK-39860] - More expressions should extend Predicate
  • [SPARK-39863] - Upgrade Hadoop to 3.3.4
  • [SPARK-39864] - ExecutionListenerManager's registration of the ExecutionListenerBus should be lazy
  • [SPARK-39868] - StageFailed event should attach with the root cause
  • [SPARK-39870] - Add flag to run-tests.py to retain the test output.
  • [SPARK-39872] - HeapByteBuffer#get(int) is a hotspot path when using BytePackerForLong#unpack8Values with ByteBuffer input API
  • [SPARK-39873] - Remove OptimizeLimitZero and merge it into EliminateLimits
  • [SPARK-39875] - The method in a final class should not be declared as protected
  • [SPARK-39879] - Reduce local-cluster memory configuration in BroadcastJoinSuite* and HiveSparkSubmitSuite
  • [SPARK-39881] - Python Lint does not actually check for `black` formatter
  • [SPARK-39882] - Upgrade rocksdbjni to 7.4.3
  • [SPARK-39883] - Add DataFrame function parity check
  • [SPARK-39890] - Make TakeOrderedAndProjectExec inherit AliasAwareOutputOrdering
  • [SPARK-39891] - Bump h2 to 2.1.214
  • [SPARK-39902] - Add Scan details to spark plan scan node in SparkUI
  • [SPARK-39904] - Rename inferDate to preferDate and fix an issue when inferring schema
  • [SPARK-39906] - Eliminate build warnings - 'sbt 0.13 shell syntax is deprecated; use slash syntax instead'
  • [SPARK-39911] - Optimize global Sort to RepartitionByExpression
  • [SPARK-39912] - Refine CatalogImpl
  • [SPARK-39913] - Upgrade Arrow to 9.0.0
  • [SPARK-39925] - Add array_sort(column, comparator) overload to DataFrame operations
  • [SPARK-39944] - Upgrade dropwizard metrics to 4.2.10
  • [SPARK-39947] - Upgrade jersey to 2.36
  • [SPARK-39948] - Exclude hive-vector-code-gen dependency
  • [SPARK-39951] - Support columnar batches with nested fields in Parquet V2
  • [SPARK-39954] - Upgrade ASM to 9.3
  • [SPARK-39955] - Improve LaunchTask process to avoid Stage failures caused by fail-to-send LaunchTask messages
  • [SPARK-39957] - Delay onDisconnected to enable Driver receives ExecutorExitCode
  • [SPARK-39958] - Add warning log when unable to load custom metric object
  • [SPARK-39960] - Upgrade mysql-connector-java to 8.0.30
  • [SPARK-39963] - Simplify the implementation of SimplifyCasts
  • [SPARK-39973] - Avoid noisy warnings logs when spark.scheduler.listenerbus.metrics.maxListenerClassesTimed = 0
  • [SPARK-39975] - Upgrade rocksdbjni to 7.4.5
  • [SPARK-39977] - Remove unnecessary guava exclusion from jackson-module-scala
  • [SPARK-39982] - StructType.fromJson method missing documentation
  • [SPARK-39983] - Should not cache unserialized broadcast relations on the driver
  • [SPARK-39986] - Better example for Co-grouped Map
  • [SPARK-39989] - Support estimate column statistics if it is foldable expression
  • [SPARK-39991] - AQE should use available column statistics from completed query stages
  • [SPARK-40004] - Redundant `LevelDB.get` in `RemoteBlockPushResolver`
  • [SPARK-40009] - Add missing doc string info to DataFrame API
  • [SPARK-40019] - Refactor comment of ArrayType
  • [SPARK-40020] - centralize the code of qualifying identifiers in SessionCatalog
  • [SPARK-40022] - YarnClusterSuite should not be ABORTED when there is no Python3 environment
  • [SPARK-40030] - Upgrade scala-maven-plugin to 4.7.1
  • [SPARK-40033] - Nested schema pruning support through element_at
  • [SPARK-40039] - Introducing a streaming checkpoint file manager based on Hadoop's Abortable interface
  • [SPARK-40040] - Push local limit to both sides if join condition is empty
  • [SPARK-40050] - Enhance EliminateSorts to support removing sorts via LocalLimit
  • [SPARK-40053] - HiveExternalCatalogVersionsSuite will test all Spark versions and abort when Python 2.7 is used
  • [SPARK-40056] - Upgrade mvn-scalafmt from 1.0.4 to 1.1.1640084764.9f463a9
  • [SPARK-40058] - Avoid filter twice in HadoopFSUtils
  • [SPARK-40067] - Add table name to Spark plan node in SparkUI
  • [SPARK-40071] - Update plugins to latest versions
  • [SPARK-40072] - MAVEN_OPTS in make-distributions.sh is different from one specified in pom.xml
  • [SPARK-40073] - Should Use `connector/${moduleName}` instead of `external/${moduleName}`
  • [SPARK-40084] - Upgrade Py4J from 0.10.9.5 to 0.10.9.7
  • [SPARK-40085] - use INTERNAL_ERROR error class instead of IllegalStateException to indicate bugs
  • [SPARK-40086] - Improve AliasAwareOutputPartitioning to take all aliases into account
  • [SPARK-40095] - sc.uiWebUrl should not throw exception when webui is disabled
  • [SPARK-40105] - Improve repartition in ReplaceCTERefWithRepartition
  • [SPARK-40106] - Task failure handlers should always run if the task failed
  • [SPARK-40112] - Improve the TO_BINARY() function
  • [SPARK-40113] - Refactor ParquetScanBuilder DataSourceV2 interface implementation
  • [SPARK-40128] - Add DELTA_LENGTH_BYTE_ARRAY as a recognized standalone encoding in VectorizedColumnReader
  • [SPARK-40145] - Create infra image when cutting down branches
  • [SPARK-40146] - Simplify the codegen of getting map value
  • [SPARK-40153] - Unify the logic of resolve functions and table-valued functions
  • [SPARK-40162] - Upgrade RoaringBitmap from 0.9.30 to 0.9.31
  • [SPARK-40163] - [SPARK][SQL] feat: SparkSession.config(Map)
  • [SPARK-40165] - Update test plugins to latest versions
  • [SPARK-40166] - Add array_sort(column, comparator) to PySpark
  • [SPARK-40167] - Add array_sort(column, comparator) to SparkR
  • [SPARK-40175] - Converting Tuple2 to Scala Map via `.toMap` is slow
  • [SPARK-40185] - Remove column suggestion when the candidate list is empty for unresolved column/attribute/map key
  • [SPARK-40192] - Remove redundant groupby
  • [SPARK-40194] - SPLIT function on empty regex should truncate trailing empty string.
  • [SPARK-40197] - Replace query plan with context for MULTI_VALUE_SUBQUERY_ERROR
  • [SPARK-40207] - Specify the column name when the data type is not supported by datasource
  • [SPARK-40214] - Add `get` to dataframe functions
  • [SPARK-40215] - Add SQL configs to control CSV/JSON date and timestamp parsing behaviour
  • [SPARK-40216] - Extract common `prepareWrite` method for `ParquetFileFormat` and `ParquetWrite` to eliminate duplicate code
  • [SPARK-40219] - resolved view plan should hold the schema to avoid redundant lookup
  • [SPARK-40224] - Make ObjectHashAggregateExec release memory eagerly when fallback to sort-based
  • [SPARK-40225] - PySpark rdd.takeOrdered should check num and numPartitions
  • [SPARK-40228] - Don't simplify multiLike if child is not attribute
  • [SPARK-40234] - Clean only MDC items set by Spark
  • [SPARK-40235] - Use interruptible lock instead of synchronized in Executor.updateDependencies()
  • [SPARK-40239] - Remove duplicated 'fraction' validation in RDD.sample
  • [SPARK-40240] - PySpark rdd.takeSample should validate `num > maxSampleSize` at first
  • [SPARK-40241] - Correct the link of GenericUDTF
  • [SPARK-40243] - Enhance Hive UDF support documentation
  • [SPARK-40248] - Use larger number of bits to build bloom filter
  • [SPARK-40251] - Upgrade dev.ludovic.netlib from 2.2.1 to 3.0.2
  • [SPARK-40252] - Replace `Stream.collect(Collectors.joining(delimiter))` with `StringJoiner` Api
  • [SPARK-40254] - Upgrade netty from 4.1.77 to 4.1.80
  • [SPARK-40256] - Switch base image from openjdk to eclipse-temurin
  • [SPARK-40276] - reduce the result size of RDD.takeOrdered
  • [SPARK-40283] - Update mima's previousSparkVersion to 3.3.0
  • [SPARK-40285] - Simplify the roundTo[Numeric] for Decimal
  • [SPARK-40293] - Make the V2 table error message more meaningful
  • [SPARK-40301] - Add parameter validation in pyspark.rdd
  • [SPARK-40308] - str_to_map should accept non-foldable delimiter arguments
  • [SPARK-40311] - Introduce withColumnsRenamed
  • [SPARK-40312] - Add missing configuration documentation in Spark History Server
  • [SPARK-40321] - Upgrade rocksdbjni to 7.5.3
  • [SPARK-40352] - Add function aliases: len, datepart, dateadd, date_diff and curdate
  • [SPARK-40360] - Convert some DDL exception to new error framework
  • [SPARK-40365] - Bump ANTLR runtime version from 4.8 to 4.9.3
  • [SPARK-40376] - `np.bool` will be deprecated
  • [SPARK-40382] - Reduce projections in Expand when multiple distinct aggregations have semantically equivalent children
  • [SPARK-40383] - Pin mypy ==0.920 in dev/requirements.txt
  • [SPARK-40387] - Improve the implementation of Spark Decimal
  • [SPARK-40396] - Update scalatest and scalatestplus to use latest version
  • [SPARK-40397] - Migrate selenium-java from 3.1 to 4.2 and upgrade org.scalatestplus:selenium to 3.2.13.0
  • [SPARK-40398] - Use Loop instead of Arrays.stream api
  • [SPARK-40401] - Remove the support of deprecated `spark.akka.*` config
  • [SPARK-40404] - Fix the wrong description related to `spark.shuffle.service.db` in the document
  • [SPARK-40406] - The default logging should go to stderr
  • [SPARK-40411] - Refactor FlatMapGroupsWithStateExec to have a parent trait
  • [SPARK-40414] - Fix PythonArrowInput and PythonArrowOutput to be more generic to handle complicated type/data
  • [SPARK-40419] - Integrate Grouped Aggregate Pandas UDFs into *.sql test cases
  • [SPARK-40424] - Refactor ChromeUIHistoryServerSuite to test rocksdb
  • [SPARK-40425] - DROP TABLE does not need to do table lookup
  • [SPARK-40428] - Add a shutdownhook to CoarseGrained scheduler to avoid dangling resources during abnormal shutdown
  • [SPARK-40436] - Upgrade Scala to 2.12.17
  • [SPARK-40456] - PartitionIterator.hasNext should be cheap to call repeatedly
  • [SPARK-40463] - Update gpg's keyserver
  • [SPARK-40466] - Improve the error message if the DSv2 source is disabled but DSv1 streaming source is not available
  • [SPARK-40471] - Upgrade RoaringBitmap to 0.9.32
  • [SPARK-40474] - Correct CSV schema inference and data parsing behavior on columns with mixed dates and timestamps
  • [SPARK-40476] - Reduce the shuffle size of ALS
  • [SPARK-40478] - Add create datasource table options docs
  • [SPARK-40484] - Upgrade log4j2 to 2.19.0
  • [SPARK-40487] - Make defaultJoin in BroadcastNestedLoopJoinExec running in parallel
  • [SPARK-40488] - Do not wrap exceptions thrown in FileFormatWriter.write with SparkException
  • [SPARK-40490] - `YarnShuffleIntegrationSuite` no longer verifies `registeredExecFile` reload after SPARK-17321
  • [SPARK-40494] - Optimize the performance of `keys.zipWithIndex.toMap` code pattern
  • [SPARK-40500] - Use `pd.items` instead of `pd.iteritems`
  • [SPARK-40501] - Add PushProjectionThroughLimit for Optimizer
  • [SPARK-40511] - Upgrade slf4j to 2.x
  • [SPARK-40527] - Keep struct field names or map keys in CreateStruct
  • [SPARK-40531] - Upgrade zstd-jni from 1.5.2-3 to 1.5.2-4
  • [SPARK-40544] - The file size of `sql/hive/target/unit-tests.log` is too big
  • [SPARK-40545] - SparkSQLEnvSuite failed to clean the `spark_derby` directory after execution
  • [SPARK-40547] - Fix dead links in sparkr-vignettes.Rmd
  • [SPARK-40548] - Upgrade rocksdbjni from 7.5.3 to 7.6.0
  • [SPARK-40556] - Unpersist the intermediate datasets cached in AttachDistributedSequenceExec
  • [SPARK-40574] - Add PURGE to DROP TABLE doc
  • [SPARK-40575] - Add badges for PySpark downloads
  • [SPARK-40595] - Improve error message for unused CTE relations
  • [SPARK-40599] - Add multiTransform methods to TreeNode to generate alternatives
  • [SPARK-40601] - Improve error when cogrouping groups with mismatching key sizes
  • [SPARK-40604] - Verify the temporary column names in PS
  • [SPARK-40606] - Eliminate `to_pandas` warnings in test
  • [SPARK-40607] - Remove redundant string interpolator operations
  • [SPARK-40611] - Improve the performance for setInterval & getInterval of UnsafeRow
  • [SPARK-40619] - HivePartitionFilteringSuites test aborted due to `java.lang.OutOfMemoryError: Metaspace`
  • [SPARK-40620] - Deduplication of WorkerOffer build in CoarseGrainedSchedulerBackend
  • [SPARK-40628] - Do not push complex left semi/anti join condition through project
  • [SPARK-40633] - Upgrade janino to 3.1.9
  • [SPARK-40634] - Upgrade jodatime to 2.11.2
  • [SPARK-40639] - Upgrade sbt from 1.7.1 to 1.7.2
  • [SPARK-40640] - SparkHadoopUtil to set origin of hadoop/hive config options
  • [SPARK-40646] - Fix returning partial results in JSON data source and JSON functions
  • [SPARK-40648] - Add `@ExtendedLevelDBTest` to the leveldb relevant tests in the yarn module
  • [SPARK-40654] - Protobuf support MVP with descriptor files
  • [SPARK-40655] - Protobuf functions in Python
  • [SPARK-40657] - Add support for compiled classes (Java classes)
  • [SPARK-40661] - Upgrade `jetty-http` from 9.4.48.v20220622 to 9.4.49.v20220914
  • [SPARK-40667] - Refactor File Data Source Options
  • [SPARK-40675] - Supplement missing spark configuration in documentation
  • [SPARK-40676] - Upgrade scalatest related test dependencies to 3.2.14
  • [SPARK-40697] - Add read-side char/varchar handling to cover external data files
  • [SPARK-40711] - Add spill size metrics for window
  • [SPARK-40712] - upgrade sbt-assembly plugin to 1.2.0
  • [SPARK-40724] - Simplify `corr` with method `inline`
  • [SPARK-40725] - Add mypy-protobuf to requirements
  • [SPARK-40728] - Upgrade ASM to 9.4
  • [SPARK-40735] - Consistently invoke bash with /usr/bin/env bash in scripts to make code more portable
  • [SPARK-40740] - Improve listFunctions in SessionCatalog
  • [SPARK-40742] - Java compilation warnings related to generic type
  • [SPARK-40745] - Reduce the shuffle size of ALS in mllib
  • [SPARK-40765] - Optimize redundant fs operations in `CommandUtils#calculateSingleLocationSize#getPathSize` method
  • [SPARK-40766] - Upgrade the guava defined in `plugins.sbt` to `31.0.1-jre`
  • [SPARK-40772] - Improve spark.sql.adaptive.skewJoin.skewedPartitionFactor to support float values
  • [SPARK-40776] - Add documentation (similar to Avro functions).
  • [SPARK-40777] - Use error classes for Protobuf exceptions
  • [SPARK-40778] - Make HeartbeatReceiver as an IsolatedRpcEndpoint
  • [SPARK-40782] - Upgrade Jackson-databind to 2.13.4.1
  • [SPARK-40794] - Upgrade Netty from 4.1.80 to 4.1.84
  • [SPARK-40795] - Exclude redundant jars from spark-protobuf-assembly jar
  • [SPARK-40797] - Force grouped import onto single line with Scalafmt
  • [SPARK-40803] - LZ4CompressionCodec looks up configuration on each stream creation
  • [SPARK-40821] - Introduce window_time function to extract event time from the window column
  • [SPARK-40826] - Add additional checkpoint rename file check
  • [SPARK-40834] - Use SparkListenerSQLExecutionEnd to track final SQL status in UI
  • [SPARK-40843] - Clean up deprecated api usage in SparkThrowableSuite
  • [SPARK-40846] - GA test failed with Java 8u352
  • [SPARK-40853] - Pin mypy-protobuf==3.3.0
  • [SPARK-40863] - Upgrade dropwizard metrics from 4.2.10 to 4.2.12
  • [SPARK-40865] - Upgrade jodatime to 2.12.0
  • [SPARK-40886] - Bump Jackson Databind 2.13.4.2
  • [SPARK-40892] - Loosen the requirement of window_time rule - allow multiple window_time calls
  • [SPARK-40895] - Upgrade Arrow to 10.0.0
  • [SPARK-40897] - Add missing PySpark APIs to References
  • [SPARK-40904] - Support zsh in K8s `entrypoint.sh`
  • [SPARK-40905] - Upgrade rocksdbjni to 7.7.3
  • [SPARK-40913] - Pin `pytest==7.1.3`
  • [SPARK-40919] - Bad case of `AnalysisTest#assertAnalysisErrorClass` when `expectedMessageParameters.size between [2, 4]`
  • [SPARK-40921] - Add WHEN NOT MATCHED BY SOURCE clause to MERGE INTO command
  • [SPARK-40925] - Fix late record filtering to support chaining of stateful operators
  • [SPARK-40935] - Upgrade ZSTD-JNI to 1.5.2-5
  • [SPARK-40936] - Refactor `AnalysisTest#assertAnalysisErrorClass` by reusing the `SparkFunSuite#checkError`
  • [SPARK-40940] - Fix the unsupported ops checker to allow chaining of stateful operators
  • [SPARK-40943] - Make MSCK optional in MSCK REPAIR TABLE commands
  • [SPARK-40950] - isRemoteAddressMaxedOut performance overhead on scala 2.13
  • [SPARK-40976] - Upgrade sbt to 1.7.3
  • [SPARK-40985] - Upgrade RoaringBitmap to 0.9.35
  • [SPARK-40991] - Update cloudpickle to v2.2.0
  • [SPARK-40996] - Upgrade `sbt-checkstyle-plugin` to 4.0.0
  • [SPARK-41017] - Support column pruning with multiple nondeterministic Filters
  • [SPARK-41023] - Upgrade Jackson to 2.14.0
  • [SPARK-41024] - Upgrade scala-maven-plugin to 4.7.2
  • [SPARK-41029] - Optimize the use of `GenericArrayData` constructor for Scala 2.13
  • [SPARK-41031] - Upgrade `org.tukaani:xz` to 1.9
  • [SPARK-41039] - Upgrade `scala-parallel-collections` to 1.0.4 for Scala 2.13
  • [SPARK-41045] - Pre-compute to eliminate ScalaReflection calls after deserializer is created
  • [SPARK-41048] - Improve output partitioning and ordering with AQE cache
  • [SPARK-41050] - Upgrade scalafmt from 3.5.9 to 3.6.1
  • [SPARK-41051] - Optimize ProcfsMetrics file acquisition
  • [SPARK-41071] - Metaspace OOM when running dev/make-distribution.sh locally
  • [SPARK-41074] - Add option `--upgrade` in dependency installation command
  • [SPARK-41087] - Make `build/mvn` use the same JAVA_OPTS as `dev/make-distribution.sh`
  • [SPARK-41089] - Relocate Netty native arm64 libs
  • [SPARK-41090] - Enhance Dataset.createTempView testing coverage for db_name.view_name
  • [SPARK-41092] - Do not use identifier to match interval units
  • [SPARK-41096] - Support reading parquet FIXED_LEN_BYTE_ARRAY type
  • [SPARK-41097] - Remove redundant collection conversion for Scala 2.13
  • [SPARK-41106] - Reduce collection conversion when create AttributeMap
  • [SPARK-41112] - RuntimeFilter should apply ColumnPruning eagerly with in-subquery filter
  • [SPARK-41113] - Upgrade sbt to 1.8.0
  • [SPARK-41120] - Upgrade joda-time from 2.12.0 to 2.12.1
  • [SPARK-41121] - Upgrade sbt-assembly from 1.2.0 to 2.0.0
  • [SPARK-41123] - Upgrade mysql-connector-java from 8.0.30 to 8.0.31
  • [SPARK-41126] - `entrypoint.sh` should use its WORKDIR instead of `/tmp` directory
  • [SPARK-41134] - improve error message of internal errors
  • [SPARK-41153] - Log migrated shuffle data size and migration time
  • [SPARK-41155] - Add error message to SchemaColumnConvertNotSupportedException
  • [SPARK-41161] - Upgrade `scala-parser-combinators` to 2.1.1
  • [SPARK-41167] - Optimize LikeSimplification rule to improve multi like performance
  • [SPARK-41194] - Add log4j2.properties for testing to the protobuf module
  • [SPARK-41197] - Upgrade Kafka to 3.3.1
  • [SPARK-41209] - Improve PySpark type inference in _merge_type method
  • [SPARK-41211] - Upgrade ZooKeeper to 3.6.3
  • [SPARK-41223] - Upgrade slf4j to 2.0.4
  • [SPARK-41226] - Refactor Spark types by introducing physical types
  • [SPARK-41239] - Upgrade Jackson to 2.14.1
  • [SPARK-41248] - Add config flag to control the behavior of JSON partial results parsing in SPARK-40646
  • [SPARK-41251] - Upgrade pandas from 1.5.1 to 1.5.2
  • [SPARK-41252] - Upgrade arrow from 10.0.0 to 10.0.1
  • [SPARK-41260] - Cast NumPy instances to Python primitive types in GroupState update
  • [SPARK-41267] - Add unpivot / melt to SparkR
  • [SPARK-41273] - Update plugins to latest versions
  • [SPARK-41275] - Upgrade pickle to 1.3
  • [SPARK-41276] - Optimize constructor use of `StructType`
  • [SPARK-41316] - Add @tailrec wherever possible
  • [SPARK-41338] - resolve outer references and normal columns in the same analyzer batch
  • [SPARK-41355] - Workaround hive table name validation issue
  • [SPARK-41360] - Avoid BlockManager re-registration if the executor has been lost
  • [SPARK-41369] - Refactor connect directory structure
  • [SPARK-41373] - Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION
  • [SPARK-41387] - Add assertion on end offset range for Kafka data source with Trigger.AvailableNow
  • [SPARK-41390] - Update the script used to generate register function in UDFRegistration
  • [SPARK-41393] - Upgrade slf4j to 2.0.5
  • [SPARK-41402] - Override nodeName of StringDecode
  • [SPARK-41404] - Support `ColumnarBatchSuite#testRandomRows` to test more primitive dataType
  • [SPARK-41405] - centralize the column resolution logic
  • [SPARK-41408] - Upgrade scala-maven-plugin to 4.8.0
  • [SPARK-41442] - Only update SQLMetric value if merging with valid metric
  • [SPARK-41447] - Reduce the number of doMergeApplicationListing invocations
  • [SPARK-41450] - PySpark built from master code raise error "java.lang.ClassNotFoundException: org.eclipse.jetty.server.Handler"
  • [SPARK-41454] - Support Python 3.11
  • [SPARK-41456] - Improve the performance of try_cast
  • [SPARK-41460] - Introduce IsolatedThreadSafeRpcEndpoint to extend IsolatedRpcEndpoint
  • [SPARK-41463] - Ensure error class (and subclass) names contain only capital letters, numbers and underscores
  • [SPARK-41466] - Change Scala Style configuration to catch AnyFunSuite instead of FunSuite
  • [SPARK-41467] - Upgrade httpclient from 4.5.13 to 4.5.14
  • [SPARK-41469] - Task rerun on decommissioned executor can be avoided if shuffle data has migrated
  • [SPARK-41474] - Exclude proto files from spark-protobuf
  • [SPARK-41476] - Prevent `README.md` from triggering CIs
  • [SPARK-41482] - Upgrade dropwizard metrics 4.2.13
  • [SPARK-41491] - Update postgres docker image to 15.1
  • [SPARK-41509] - Delay execution hash until after aggregation for semi-join runtime filter.
  • [SPARK-41511] - LongToUnsafeRowMap support ignoresDuplicatedKey
  • [SPARK-41520] - Split AND_OR TreePattern to separate AND and OR TreePatterns
  • [SPARK-41523] - `protoc-jar-maven-plugin` should uniformly use `protoc-jar-maven-plugin.version` as the version
  • [SPARK-41524] - Expose SQL confs and extraOptions separately in o.a.s.sql.execution.streaming.state.RocksDBConf
  • [SPARK-41530] - Rename MedianHeap to PercentileMap and support percentile
  • [SPARK-41534] - Setup initial client module for Spark Connect
  • [SPARK-41541] - Fix wrong child call in SQLShuffleWriteMetricsReporter.decRecordsWritten()
  • [SPARK-41544] - Upgrade `versions-maven-plugin` to 2.14.1
  • [SPARK-41553] - Fix the documentation for num_files
  • [SPARK-41561] - Upgrade slf4j related dependencies from 2.0.5 to 2.0.6
  • [SPARK-41562] - Upgrade joda-time from 2.12.1 to 2.12.2
  • [SPARK-41567] - Move configuration of `versions-maven-plugin` to parent pom
  • [SPARK-41569] - Upgrade rocksdbjni to 7.8.3
  • [SPARK-41584] - Upgrade RoaringBitmap to 0.9.36
  • [SPARK-41587] - Upgrade org.scalatestplus:selenium-4-4 to org.scalatestplus:selenium-4-7
  • [SPARK-41588] - Make "Rule id not found" error message more actionable
  • [SPARK-41660] - only propagate metadata columns if they are used
  • [SPARK-41669] - Speed up CollapseProject for wide tables
  • [SPARK-41704] - Upgrade `sbt-assembly` from 2.0.0 to 2.1.0
  • [SPARK-41711] - Upgrade protobuf-java to 3.21.12
  • [SPARK-41714] - Update maven-checkstyle-plugin from 3.1.2 to 3.2.0
  • [SPARK-41719] - Spark SSLOptions sub settings should be set only when ssl is enabled
  • [SPARK-41720] - Rename UnresolvedFunc to UnresolvedFunctionName
  • [SPARK-41750] - Upgrade dev.ludovic.netlib to 3.0.3
  • [SPARK-41760] - Enforce scalafmt for Spark Connect Client module
  • [SPARK-41778] - Add an alias "reduce" to ArrayAggregate
  • [SPARK-41787] - Upgrade silencer from 1.7.10 to 1.7.12
  • [SPARK-41791] - Create distinct metadata attributes for metadata that is constant or file and metadata that is generated during the scan
  • [SPARK-41798] - Upgrade hive-storage-api to 2.8.1
  • [SPARK-41800] - Upgrade commons-compress to 1.22
  • [SPARK-41802] - Upgrade Apache httpcore to 4.4.16
  • [SPARK-41805] - Reuse expressions in WindowSpecDefinition
  • [SPARK-41806] - Use AppendData.byName for SQL INSERT INTO by name for DSV2 and block ambiguous queries with static partition columns
  • [SPARK-41822] - Setup Scala/JVM Client Connection
  • [SPARK-41860] - Make AvroScanBuilder and JsonScanBuilder case classes
  • [SPARK-41861] - Make v2 ScanBuilders' build() return typed scan
  • [SPARK-41883] - Upgrade dropwizard metrics 4.2.15
  • [SPARK-41893] - Publish SBOM artifacts
  • [SPARK-41925] - Enable spark.sql.orc.enableNestedColumnVectorizedReader by default
  • [SPARK-41938] - Upgrade sbt from 1.8.0 to 1.8.2
  • [SPARK-41941] - Upgrade scalatest related test dependencies to 3.2.15
  • [SPARK-41943] - Use Java API to create files and grant permissions in DiskBlockManager
  • [SPARK-41949] - Make stage scheduling support local-cluster mode
  • [SPARK-41962] - Update the import order of scala package in class SpecificParquetRecordReaderBase
  • [SPARK-41965] - Add DataFrameWriterV2 to PySpark API references
  • [SPARK-41966] - Add `CharType` and `TimestampNTZType` to PySpark API references
  • [SPARK-41970] - Introduce SparkPath to address paths and URIs
  • [SPARK-41986] - Introduce shuffle on SinglePartition
  • [SPARK-41994] - Harden SQLSTATE usage for error classes
  • [SPARK-42031] - Clean up remove methods that do not need override
  • [SPARK-42037] - Rename AMPLAB_ to SPARK_ prefix in build environment variables
  • [SPARK-42043] - Basic Scala Client Result Implementation
  • [SPARK-42049] - Improve AliasAwareOutputExpression
  • [SPARK-42055] - Upgrade scalatest-maven-plugin from 2.1.0 to 2.2.0
  • [SPARK-42056] - Add missing options for Protobuf functions.
  • [SPARK-42058] - Harden SQLSTATE usage for error classes (2)
  • [SPARK-42065] - Remove duplicated test_freqItems
  • [SPARK-42067] - Upgrade buf from 1.11.0 to 1.12.0
  • [SPARK-42081] - improve the plan change validation
  • [SPARK-42083] - Make (Executor|StatefulSet)PodsAllocator extendable
  • [SPARK-42086] - Sort test cases in SQLQueryTestSuite
  • [SPARK-42091] - Upgrade jetty to 9.4.50.v20221201
  • [SPARK-42092] - Upgrade RoaringBitmap to 0.9.38
  • [SPARK-42096] - Code cleanup for connect module
  • [SPARK-42106] - [Pyspark] Hide parameters when re-printing user provided remote URL in REPL
  • [SPARK-42108] - Make Analyzer transform Count(*) into Count(1)
  • [SPARK-42111] - Mark Orc*FilterSuite/OrcV*SchemaPruningSuite as ExtendedSQLTest
  • [SPARK-42114] - Add uniform parquet encryption test case
  • [SPARK-42116] - Mark ColumnarBatchSuite as ExtendedSQLTest
  • [SPARK-42129] - Upgrade rocksdbjni to 7.9.2
  • [SPARK-42133] - Add basic Dataset API methods to Spark Connect Scala Client
  • [SPARK-42149] - Remove the env `SPARK_USE_CONC_INCR_GC` used to enable CMS GC for Yarn AM
  • [SPARK-42152] - Use `_` instead of `-` in `shadedPattern` for relocation package name
  • [SPARK-42161] - Upgrade Arrow to 11.0.0
  • [SPARK-42166] - Make `docker-image-tool.sh` usage message up-to-date
  • [SPARK-42167] - Improve GitHub Action `lint` job to stop on failures earlier
  • [SPARK-42172] - Compatibility check for Scala Client
  • [SPARK-42180] - Update `SCALA_VERSION` in `_config.yml`
  • [SPARK-42202] - Scala Client E2E test should stop the server gracefully
  • [SPARK-42220] - Upgrade buf from 1.12.0 to 1.13.1
  • [SPARK-42230] - Improve `lint` job by skipping PySpark and SparkR docs if unchanged
  • [SPARK-42237] - change binary to unsupported dataType in csv format
  • [SPARK-42277] - Use ROCKSDB for spark.history.store.hybridStore.diskBackend by default
  • [SPARK-42283] - Add Simple Scala UDFs to Scala/JVM Client
  • [SPARK-42287] - Optimize the packaging strategy of connect client module
  • [SPARK-42333] - Change log level to debug when fetching result set from SparkExecuteStatementOperation
  • [SPARK-42334] - Make sure connect client assembly and sql package are built before running client tests - SBT
  • [SPARK-42354] - Upgrade Jackson to 2.14.2
  • [SPARK-42372] - Improve performance of HiveGenericUDTF by making inputProjection instantiate once
  • [SPARK-42390] - Upgrade buf from 1.13.1 to 1.14.0
  • [SPARK-42394] - Fix the usage information of bin/spark-sql --help
  • [SPARK-42398] - refine default column value framework
  • [SPARK-42422] - Upgrade `maven-shade-plugin` to 3.4.1
  • [SPARK-42423] - Add metadata column file block start and length
  • [SPARK-42429] - IntelliJ Build issue: value getArgument is not a member of org.mockito.invocation.InvocationOnMock
  • [SPARK-42436] - Improve multiTransform to generate alternatives dynamically
  • [SPARK-42457] - Scala Client Session Read API
  • [SPARK-42480] - Improve the performance of drop partitions
  • [SPARK-42482] - Scala client Write API V1
  • [SPARK-42514] - Scala Client add partition transforms functions
  • [SPARK-42518] - Scala client Write API V2
  • [SPARK-42526] - Add Classifier.getNumClasses back
  • [SPARK-42527] - Scala Client add Window functions
  • [SPARK-42543] - Specify protocol for UDF artifact transfer in JVM/Scala client
  • [SPARK-42548] - Add ReferenceAllColumns to skip rewriting attributes
  • [SPARK-42599] - Make `CompatibilitySuite` a tool like `dev/mima`
  • [SPARK-42653] - Artifact transfer from Scala/JVM client to Server
  • [SPARK-42656] - Spark Connect Scala Client Shell Script
  • [SPARK-42675] - Should clean up temp view after test
  • [SPARK-42684] - v2 catalog should not allow column default value by default
  • [SPARK-42712] - Improve docstring of mapInPandas and mapInArrow
  • [SPARK-42722] - Python Connect def schema() should not cache the schema
  • [SPARK-42895] - ValueError when invoking any session operations on a stopped Spark session
  • [SPARK-42904] - Char/Varchar Support for JDBC Catalog
  • [SPARK-42908] - Raise RuntimeError if SparkContext is not initialized when parsing DDL-formatted type strings
  • [SPARK-42917] - Correct getUpdateColumnNullabilityQuery for DerbyDialect
  • [SPARK-42946] - Sensitive data could still be exposed by variable substitution
  • [SPARK-43009] - Parameterized sql() with constants
  • [SPARK-43075] - Change gRPC to grpcio when it is not installed.
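
For [SPARK-43009] above, the parameterized sql() API accepts plain Python constants for named parameters. A minimal PySpark sketch; the query, parameter name, and values are illustrative assumptions, not taken from the ticket:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Named markers such as :minId are bound from the args dict; Python constants
    # are converted to SQL literals by the parameterized sql() API.
    df = spark.sql(
        "SELECT id FROM range(10) WHERE id > :minId",
        args={"minId": 7},
    )
    df.show()  # rows with id 8 and 9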

Test

  • [SPARK-38755] - Add file to address missing pandas general functions
  • [SPARK-38786] - Test Bug in StatisticsSuite "change stats after add/drop partition command"
  • [SPARK-38893] - Test SourceProgress in PySpark
  • [SPARK-38920] - Add ORC blockSize tests to BloomFilterBenchmark
  • [SPARK-38923] - Regenerate benchmark results
  • [SPARK-38944] - Close `NioBufferedFileInputStream` opened by `ExternalAppendOnlyUnsafeRowArraySuite`
  • [SPARK-38948] - `DiskRowQueue` leak in `PythonForeachWriterSuite`
  • [SPARK-39034] - Add tests for options from `to_json` and `from_json`.
  • [SPARK-39035] - Add tests for options from `to_csv` and `from_csv`.
  • [SPARK-39117] - Do not include number of functions in sql-expression-schema.md
  • [SPARK-39181] - SessionCatalog.reset should not drop temp functions twice
  • [SPARK-39253] - Improve PySpark API reference to be more readable
  • [SPARK-39331] - Flaky test: StreamingListenerTests.test_listener_events
  • [SPARK-39369] - Use JAVA_OPTS for AppVeyor build to increase the memory properly
  • [SPARK-39372] - Support R 4.2.0 in SparkR
  • [SPARK-39394] - Improve the PySpark structured streaming page to be more readable
  • [SPARK-39463] - Use UUID for test database location in JavaJdbcRDDSuite
  • [SPARK-39477] - Remove "Number of queries" info from the golden files of SQLQueryTestSuite
  • [SPARK-39495] - Support SPARK_TEST_HIVE_CLIENT_VERSIONS for HiveClientVersions
  • [SPARK-39584] - Fix TPCDSQueryBenchmark Measuring Performance of Wrong Query Results
  • [SPARK-39604] - Missing UT for DerbyDialect's getCatalystType
  • [SPARK-39631] - Update FilterPushdownBenchmark results
  • [SPARK-39663] - Missing UT for MysqlDialect's listIndexes
  • [SPARK-39701] - Move withSecretFile to SparkFunSuite to reuse
  • [SPARK-39711] - Remove redundant trait: BeforeAndAfterAll & BeforeAndAfterEach & Logging
  • [SPARK-39826] - Bump scalatest-maven-plugin to 2.1.0
  • [SPARK-39856] - Avoid OOM in TPC-DS build with SMJ
  • [SPARK-39869] - Fix flaky hive - slow tests because of out-of-memory
  • [SPARK-39874] - Deflake BroadcastJoinSuite*
  • [SPARK-39959] - Recover SparkR CRAN check in GitHub Actions CI
  • [SPARK-40116] - Remove Arrow in AppVeyor for now
  • [SPARK-40133] - Regenerate excludedTpcdsQueries's golden files if regenerateGoldenFiles is true
  • [SPARK-40172] - Temporarily disable flaky test cases in ImageFileFormatSuite
  • [SPARK-40203] - Add test cases for Spark Decimal
  • [SPARK-40229] - Re-enable excel I/O test for pandas API on Spark.
  • [SPARK-40265] - Fix the inconsistent behavior for Index.intersection.
  • [SPARK-40271] - Support list type for pyspark.sql.functions.lit
  • [SPARK-40273] - Fix the documents "Contributing and Maintaining Type Hints".
  • [SPARK-40410] - Migrate trait QueryErrorsSuiteBase into SparkFunSuite
  • [SPARK-40461] - Set upper bound for pyzmq 24.0.0 for Python linter
  • [SPARK-40495] - Add additional tests to StreamingSessionWindowSuite
  • [SPARK-40669] - Parameterize InMemoryColumnarBenchmark
  • [SPARK-40682] - Set spark.driver.maxResultSize to 3g in SqlBasedBenchmark
  • [SPARK-40789] - Separate tests under pyspark.sql.tests
  • [SPARK-40903] - Avoid reordering decimal Add for canonicalization
  • [SPARK-40968] - Fix some wrong/misleading comments in DAGSchedulerSuite
  • [SPARK-41486] - Upgrade MySQL docker image to 8.0.31 to support arm64
  • [SPARK-41504] - Update R version to 4.1.2 in Dockerfile comment
  • [SPARK-41558] - Disable Coverage in python.pyspark.tests.test_memory_profiler
  • [SPARK-41559] - Reenable Codecov report in the scheduled job
  • [SPARK-41753] - Add tests for ArrayZip to check the result size and nullability.
  • [SPARK-41774] - Remove def test_vectorized_udf_unsupported_types
  • [SPARK-41782] - Regenerate benchmark results
  • [SPARK-41854] - Automatically reformat/check python/setup.py
  • [SPARK-41863] - Skip `flake8` tests if the command is not available
  • [SPARK-41864] - Fix mypy linter errors
  • [SPARK-41996] - KafkaMicroBatchV2SourceSuite failed for topic partitions unavailable test due to kafka operations taking longer
  • [SPARK-42087] - Use `--no-same-owner` when HiveExternalCatalogVersionsSuite untars.
  • [SPARK-42110] - Reduce the number of repetition in ParquetDeltaEncodingSuite.`random data test`
  • [SPARK-42181] - Skip `torch` tests when torch is not installed
  • [SPARK-42183] - Exclude pyspark.ml.torch.tests in MyPy tests
  • [SPARK-42279] - Simplify `pyspark.pandas.tests.test_resample`
  • [SPARK-42282] - Split 'pyspark.pandas.tests.test_groupby'
  • [SPARK-42341] - Fix JoinSelectionHelperSuite and PlanStabilitySuite to use explicit broadcast threshold
  • [SPARK-42364] - Split 'pyspark.pandas.tests.test_dataframe'
  • [SPARK-42365] - Split 'pyspark.pandas.tests.test_ops_on_diff_frames'
  • [SPARK-42368] - Ignore SparkRemoteFileTest K8s IT test case in GitHub Action
  • [SPARK-42474] - Add extraJVMOptions JVM GC option K8s test cases
  • [SPARK-42507] - Simplify ORC schema merging conflict error check
  • [SPARK-42587] - Use wrapper versions for SBT and Maven in `connect` module tests
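
For [SPARK-40271] above, pyspark.sql.functions.lit gained support for Python lists. A minimal PySpark sketch; the column name is an illustrative assumption:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import lit

    spark = SparkSession.builder.getOrCreate()

    # Passing a Python list to lit() produces an array literal column.
    df = spark.range(1).select(lit([1, 2, 3]).alias("xs"))
    df.show()  # prints a single row containing [1, 2, 3]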

Task

  • [SPARK-28764] - Remove unnecessary writePartitionedFile method from ExternalSorter
  • [SPARK-35208] - Add docs for LATERAL subqueries
  • [SPARK-38181] - Update comments for KafkaDataConsumer
  • [SPARK-38289] - Refactor SQL CLI exit code related code
  • [SPARK-38550] - Use a disk-based store to save more information in live UI to help debug
  • [SPARK-38572] - Setting version to 3.4.0-SNAPSHOT
  • [SPARK-38651] - Writing out empty or nested empty schemas in Datasource should be configurable
  • [SPARK-38705] - Use function identifier in create and drop function command
  • [SPARK-38910] - Clean up sparkStaging dir before unregister()
  • [SPARK-39110] - Add metrics properties to Environment page
  • [SPARK-39178] - When throwing SparkFatalException, the root cause should be shown too.
  • [SPARK-39195] - Spark OutputCommitCoordinator should abort stage when committed file is not consistent with task status
  • [SPARK-39224] - Lower general ProcfsMetricsGetter error log levels except /proc/ lookup error
  • [SPARK-39244] - Use `--no-echo` instead of `--slave` in R 4.0
  • [SPARK-39264] - Add explicit type checks and casting for awaitOffset fix
  • [SPARK-39781] - Add support for configuring max_open_files through RocksDB state store provider
  • [SPARK-39805] - Deprecate Trigger.Once and Promote Trigger.AvailableNow
  • [SPARK-39861] - Deprecate Python 3.7 Support
  • [SPARK-39918] - Replace the wording "un-comparable" with "incomparable"
  • [SPARK-40213] - Incorrect ASCII value for Latin-1 Supplement characters
  • [SPARK-40292] - arrays_zip output unexpected alias column names
  • [SPARK-40319] - Remove duplicated query execution error method for PARSE_DATETIME_BY_NEW_PARSER
  • [SPARK-40389] - Decimals can't upcast as integral types if the cast can overflow
  • [SPARK-40467] - Split FlatMapGroupsWithState down to multiple test suites
  • [SPARK-40491] - Remove too old TODO for JdbcRDD
  • [SPARK-40651] - Drop Hadoop2 binary distribution from release process
  • [SPARK-40844] - Flip the default value of Kafka offset fetching config
  • [SPARK-41101] - Add messageClassName support for pyspark-protobuf
  • [SPARK-41224] - Optimize Arrow collect to stream the result from server to client
  • [SPARK-41247] - Unify the protobuf versions in Spark connect and protobuf connector
  • [SPARK-41249] - Add acceptance test for self-union on streaming query
  • [SPARK-41396] - Oneof field support and recursive fields
  • [SPARK-41415] - SASL Request Retries
  • [SPARK-41499] - Upgrade protobuf version to 3.21.11
  • [SPARK-41538] - Metadata column should be appended at the end of project list
  • [SPARK-41639] - Remove ScalaReflectionLock
  • [SPARK-41690] - Introduce AgnosticEncoders
  • [SPARK-41752] - UI improvement for nested SQL executions
  • [SPARK-41853] - Use Map in place of SortedMap for ErrorClassesJsonReader
  • [SPARK-41930] - Remove `branch-3.1` from publish_snapshot job
  • [SPARK-41972] - Fix flaky test in StreamingQueryStatusListenerSuite
  • [SPARK-41993] - Move RowEncoder to AgnosticEncoders
  • [SPARK-42003] - Reduce duplicate code in ResolveGroupByAll
  • [SPARK-42075] - Deprecate DStream API
  • [SPARK-42093] - Move JavaTypeInference to AgnosticEncoders
  • [SPARK-42105] - Document work (Release note & Guide doc) for SPARK-40925
  • [SPARK-42284] - Make sure Connect Server assembly jar is available before we run Scala Client tests
  • [SPARK-42377] - Test Framework for Connect Scala Client
  • [SPARK-42440] - Implement First batch of Dataset APIs
  • [SPARK-42441] - Scala Client - Implement Column API
  • [SPARK-42453] - Implement function max in Scala client
  • [SPARK-42460] - E2E test should clean-up results
  • [SPARK-42461] - Scala Client - Initial Set of Functions
  • [SPARK-42464] - Fix 2.13 build errors caused by explain output changes and udfs.
  • [SPARK-42465] - ProtoToPlanTestSuite should analyze its input plans
  • [SPARK-42495] - Scala Client: Add 2nd batch of functions
  • [SPARK-42512] - Scala Client: Add 3rd batch of functions
  • [SPARK-42520] - Spark Connect Scala Client: Window
  • [SPARK-42569] - Throw unsupported exceptions for non-supported API
  • [SPARK-42624] - Reorganize imports in test_functions
  • [SPARK-42876] - DataType's physicalDataType should be private[sql]
  • [SPARK-42878] - Named Table should support options
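
For [SPARK-39805] above, Trigger.Once is deprecated in favor of Trigger.AvailableNow. A minimal PySpark sketch; the rate source and console sink are illustrative assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # availableNow=True replaces the deprecated once=True: it processes all data
    # available when the query starts, possibly across several micro-batches, then stops.
    query = (
        spark.readStream.format("rate").load()
        .writeStream
        .format("console")
        .trigger(availableNow=True)
        .start()
    )
    query.awaitTermination()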

Dependency upgrade

  • [SPARK-39099] - Add dependencies to Dockerfile for building Spark releases
  • [SPARK-39125] - Upgrade netty and netty-tcnative
  • [SPARK-39183] - Upgrade Apache Xerces Java to 2.12.2
  • [SPARK-39540] - Upgrade mysql-connector-java to 8.0.29
  • [SPARK-39725] - Upgrade jetty-http from 9.4.46.v20220331 to 9.4.48.v20220622
  • [SPARK-39927] - Upgrade Avro to version 1.11.1
  • [SPARK-39992] - Upgrade slf4j to 1.7.36
  • [SPARK-39996] - Upgrade postgresql to 42.5.0
  • [SPARK-40037] - Upgrade com.google.crypto.tink:tink from 1.6.1 to 1.7.0
  • [SPARK-40326] - upgrade com.fasterxml.jackson.dataformat:jackson-dataformat-yaml from 2.13.3 to 2.13.4
  • [SPARK-40522] - Upgrade Apache Kafka from 3.2.1 to 3.2.3
  • [SPARK-40552] - Upgrade protobuf-python from 4.21.5 to 4.21.6
  • [SPARK-40801] - Upgrade Apache Commons Text to 1.10
  • [SPARK-40884] - Upgrade fabric8io - kubernetes-client to 6.2.0
  • [SPARK-41030] - Upgrade Apache Ivy to 2.5.1
  • [SPARK-41076] - Upgrade protobuf-java to 3.21.9
  • [SPARK-41240] - Upgrade Protobuf from 3.19.4 to 3.19.5
  • [SPARK-41245] - Upgrade postgresql from 42.5.0 to 42.5.1
  • [SPARK-41566] - Upgrade netty from 4.1.84.Final to 4.1.86.Final
  • [SPARK-41634] - Upgrade minimatch to 3.1.2
  • [SPARK-42218] - Upgrade netty to version 4.1.87.Final
  • [SPARK-42362] - Upgrade kubernetes-client from 6.4.0 to 6.4.1

Umbrella

  • [SPARK-39515] - Improve/recover scheduled jobs in GitHub Actions
  • [SPARK-40576] - Support pandas 1.5.x.
  • [SPARK-41053] - Better Spark UI scalability and Driver stability for large applications
  • [SPARK-41283] - Feature parity: Functions API in Spark Connect
  • [SPARK-41550] - Dynamic Allocation on K8S GA
  • [SPARK-41594] - Support table-valued generator functions in the FROM clause
  • [SPARK-41597] - Improve PySpark errors
  • [SPARK-41642] - Deduplicate docstrings in Python Spark Connect
  • [SPARK-42339] - Improve Kryo Serializer Support
  • [SPARK-42802] - Customized K8s Scheduler GA

Documentation

  • [SPARK-38581] - List of supported pandas APIs for pandas API on Spark docs.
  • [SPARK-38961] - Enhance to automatically generate the pandas API support list
  • [SPARK-39001] - Document which options are unsupported in CSV and JSON functions
  • [SPARK-39577] - Add SQL reference for built-in functions
  • [SPARK-39677] - Wrong args item formatting of the regexp functions
  • [SPARK-39707] - Add SQL reference for aggregate functions
  • [SPARK-39737] - PERCENTILE_CONT and PERCENTILE_DISC should support aggregate filter
  • [SPARK-39777] - Remove Hive bucketing incompatibility doc
  • [SPARK-39780] - Add an additional usage example for the map_zip_with function
  • [SPARK-39968] - Update K8s doc to recommend K8s 1.22+
  • [SPARK-40028] - Add binary examples for string expressions
  • [SPARK-40043] - Document DataStreamWriter.toTable and DataStreamReader.table
  • [SPARK-40266] - Corrected console output in quick-start - Datatype Integer instead of Long
  • [SPARK-40279] - Document spark.yarn.report.interval
  • [SPARK-40922] - pyspark.pandas.read_csv supports reading multiple files, but that is undocumented
  • [SPARK-40983] - Remove Hadoop requirements for zstd mention in Parquet compression codec
  • [SPARK-40994] - Add code example for JDBC data source with partitionColumn
  • [SPARK-41014] - Improve documentation and typing of applyInPandas for groupby and cogroup
  • [SPARK-41596] - Document the new feature "Async Progress Tracking" to Structured Streaming guide doc
  • [SPARK-41951] - Update SQL migration guide and documentations
  • [SPARK-42405] - Better documentation of array_insert function
  • [SPARK-42418] - Updating PySpark documentation to support new users better
  • [SPARK-42446] - Updating PySpark documentation to enhance usability
  • [SPARK-42456] - Consolidating the PySpark version upgrade note pages into a single page to make it easier to read
  • [SPARK-42530] - Remove Hadoop 2 from PySpark installation guide
  • [SPARK-42592] - Document SS guide doc for supporting multiple stateful operators (especially chained aggregations)
  • [SPARK-42628] - Add a migration note for bloom filter join
  • [SPARK-42713] - Add '__getattr__' and '__getitem__' of DataFrame and Column to API reference
  • [SPARK-42903] - Avoid documenting None as a return value in docstring
  • [SPARK-42924] - Clarify the comment of parameterized SQL args
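
For [SPARK-40994] above, a JDBC read can be parallelized with partitionColumn. A minimal PySpark sketch; the URL, table, column, and bounds are placeholder assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # partitionColumn/lowerBound/upperBound/numPartitions split the scan into parallel
    # range queries; the bounds only shape the partition stride, they do not filter rows.
    df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://db.example.com:5432/shop")  # placeholder
        .option("dbtable", "orders")                                  # placeholder
        .option("partitionColumn", "order_id")
        .option("lowerBound", "1")
        .option("upperBound", "1000000")
        .option("numPartitions", "8")
        .load()
    )
    print(df.rdd.getNumPartitions())  # expected: 8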
