Sub-task
- [SPARK-25572] - SparkR tests failed on CRAN on Java 10
- [SPARK-26010] - SparkR vignette fails on CRAN on Java 11
- [SPARK-26327] - Metrics in FileSourceScanExec not updated correctly while relation.partitionSchema is set
Bug
- [SPARK-21402] - Fix java array of structs deserialization
- [SPARK-24677] - TaskSetManager not updating successfulTaskDurations for old stage attempts
- [SPARK-24687] - NoClassDefError thrown during task serialization causes job to hang
- [SPARK-24755] - Executor loss can cause task to not be resubmitted
- [SPARK-25081] - Nested spill in ShuffleExternalSorter may access a released memory page
- [SPARK-25425] - Extra options must overwrite sessions options
- [SPARK-25450] - PushProjectThroughUnion rule uses the same exprId for project expressions in each Union child, causing mistakes in constant propagation
- [SPARK-25471] - Fix tests for Python 3.6 with Pandas 0.23+
- [SPARK-25502] - [Spark Job History] Empty Page when page number exceeds the retainedTask size
- [SPARK-25503] - [Spark Job History] Total task message in stage page is ambiguous
- [SPARK-25509] - SHS V2 cannot be enabled on Windows because POSIX permissions are not supported
- [SPARK-25533] - Inconsistent message for Completed Jobs in the JobUI, when there are failed jobs, compared to spark2.2
- [SPARK-25536] - executorSource.METRIC read wrong record in Executor.scala Line444
- [SPARK-25568] - Continue to update the remaining accumulators when failing to update one accumulator
- [SPARK-25570] - Replace 2.3.1 with 2.3.2 in HiveExternalCatalogVersionsSuite
- [SPARK-25591] - PySpark Accumulators with multiple PythonUDFs
- [SPARK-25674] - If the records are incremented by more than 1 at a time, the number of bytes might rarely ever get updated
- [SPARK-25714] - Null Handling in the Optimizer rule BooleanSimplification
- [SPARK-25726] - Flaky test: SaveIntoDataSourceCommandSuite.`simpleString is redacted`
- [SPARK-25767] - Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library
- [SPARK-25768] - Constant argument expecting Hive UDAFs doesn't work
- [SPARK-25786] - If the ByteBuffer.hasArray is false, it will throw UnsupportedOperationException for Kryo
- [SPARK-25795] - Fix CSV SparkR SQL Example
- [SPARK-25797] - Views created via 2.1 cannot be read via 2.2+
- [SPARK-25816] - Functions does not resolve Columns correctly
- [SPARK-25822] - Fix a race condition when releasing a Python worker
- [SPARK-25837] - Web UI does not respect spark.ui.retainedJobs in some instances
- [SPARK-25854] - mvn helper script always exits w/1, causing mvn builds to fail
- [SPARK-25934] - Mesos: SPARK_CONF_DIR should not be propagated by spark submit
- [SPARK-26011] - pyspark app with "spark.jars.packages" config does not work
- [SPARK-26019] - pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()
- [SPARK-26078] - WHERE .. IN fails to filter rows when used in combination with UNION
- [SPARK-26084] - AggregateExpression.references fails on unresolved expression trees
- [SPARK-26109] - Duration in the task summary metrics table and the task table are different
- [SPARK-26137] - Linux file separator is hard coded in DependencyUtils used in deploy process
- [SPARK-26198] - Metadata serialize null values throw NPE
- [SPARK-26201] - python broadcast.value on driver fails with disk encryption enabled
- [SPARK-26211] - Fix InSet for binary, and struct and array with null.
- [SPARK-26228] - OOM issue encountered when computing Gramian matrix
- [SPARK-26233] - Incorrect decimal value with java beans and first/last/max... functions
- [SPARK-26272] - Please delete old releases from mirroring system
- [SPARK-26274] - Download page must link to https://www.apache.org/dist/spark for current releases
- [SPARK-26307] - Fix CTAS when INSERT a partitioned table using Hive serde
- [SPARK-26315] - auto cast threshold from Integer to Float in approxSimilarityJoin of BucketedRandomProjectionLSHModel
- [SPARK-26351] - Documented formula of precision at k does not match the actual code
- [SPARK-26352] - Join reordering should not change the order of output attributes
- [SPARK-26366] - Except with transform regression
- [SPARK-26379] - Use dummy TimeZoneId for CurrentTimestamp to avoid UnresolvedException in CurrentBatchTimestamp
- [SPARK-26394] - Annotation error for Utils.timeStringAsMs
- [SPARK-26422] - Unable to disable Hive support in SparkR when Hadoop version is unsupported
- [SPARK-26444] - Stage color doesn't change with its status
- [SPARK-26496] - Avoid to use Random.nextString in StreamingInnerJoinSuite
- [SPARK-26537] - update the release scripts to point to gitbox
- [SPARK-26538] - Postgres numeric array support
- [SPARK-26545] - Fix typo in EqualNullSafe's truth table comment
- [SPARK-26553] - NameError: global name '_exception_message' is not defined
- [SPARK-26638] - Pyspark vector classes always return error for unary negation
- [SPARK-26665] - BlockTransferService.fetchBlockSync may hang forever
- [SPARK-26680] - StackOverflowError if Stream passed to groupBy
- [SPARK-26682] - Task attempt ID collision causes lost data
- [SPARK-26706] - Fix Cast$mayTruncate for bytes
- [SPARK-26709] - OptimizeMetadataOnlyQuery does not correctly handle the files with zero record
- [SPARK-26718] - Fixed integer overflow in SS kafka rateLimit calculation
- [SPARK-26726] - Synchronize the amount of memory used by the broadcast variable to the UI display
- [SPARK-26732] - Flaky test: SparkContextInfoSuite.getRDDStorageInfo only reports on RDDs that actually persist data
- [SPARK-26751] - HiveSessionImpl might have memory leak since Operation does not close properly
- [SPARK-26757] - GraphX EdgeRDDImpl and VertexRDDImpl `count` method cannot handle empty RDDs
- [SPARK-26806] - EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly
- [SPARK-28626] - Spark leaves unencrypted data on local disk, even with encryption turned on (CVE-2019-10099)
New Feature
- [SPARK-26118] - Make Jetty's requestHeaderSize configurable in Spark
Improvement
- [SPARK-25253] - Refactor pyspark connection & authentication
- [SPARK-25754] - Change CDN for MathJax
- [SPARK-26316] - Because of the perf degradation in TPC-DS, partially revert SPARK-21052: Add hash map metrics to join
Test
- [SPARK-26120] - Fix a streaming query leak in Structured Streaming R tests
Task
- [SPARK-26607] - Remove Spark 2.2.x testing from HiveExternalCatalogVersionsSuite
Documentation
- [SPARK-25583] - Add newly added History server related configurations in the documentation
- [SPARK-25933] - Fix pstats reference for spark.python.profile.dump in configuration.md