Sub-task
- [SPARK-23491] - continuous symptom
- [SPARK-27441] - Add read/write tests to Hive serde tables
Bug
- [SPARK-21882] - OutputMetrics doesn't count written bytes correctly in the saveAsHadoopDataset function
- [SPARK-23408] - Flaky test: StreamingOuterJoinSuite.left outer early state exclusion on right
- [SPARK-23416] - Flaky test: KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
- [SPARK-24211] - Flaky test: StreamingOuterJoinSuite
- [SPARK-24239] - Flaky test: KafkaContinuousSourceSuite.subscribing topic by name from earliest offsets
- [SPARK-24669] - Managed table was not cleared of path after drop database cascade
- [SPARK-24935] - Problem with Executing Hive UDF's from Spark 2.2 Onwards
- [SPARK-25139] - PythonRunner#WriterThread released block after TaskRunner finally block which invoke BlockManager#releaseAllLocksForTask
- [SPARK-25863] - java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala:1475)
- [SPARK-26082] - Misnaming of spark.mesos.fetch(er)Cache.enable in MesosClusterScheduler
- [SPARK-26572] - Join on distinct column with monotonically_increasing_id produces wrong output
- [SPARK-26606] - parameters passed in extraJavaOptions are not being picked up
- [SPARK-26734] - StackOverflowError on WAL serialization caused by large receivedBlockQueue
- [SPARK-26758] - Idle Executors are not getting killed after spark.dynamicAllocation.executorIdleTimeout value
- [SPARK-26859] - Fix field writer index bug in non-vectorized ORC deserializer
- [SPARK-26873] - FileFormatWriter creates inconsistent MR job IDs
- [SPARK-26895] - When running spark 2.3 as a proxy user (--proxy-user), SparkSubmit fails to resolve globs owned by target user
- [SPARK-26927] - Race condition may cause dynamic allocation not working
- [SPARK-26950] - Make RandomDataGenerator use Float.NaN or Double.NaN for all NaN values
- [SPARK-26961] - Found Java-level deadlock in Spark Driver
- [SPARK-26998] - spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode
- [SPARK-27018] - Checkpointed RDD deleted prematurely when using GBTClassifier
- [SPARK-27065] - avoid more than one active task set managers for a stage
- [SPARK-27080] - Read parquet file with merging metastore schema should compare schema field in uniform case.
- [SPARK-27111] - A continuous query may fail with InterruptedException when kafka consumer temporarily has 0 partitions
- [SPARK-27112] - Spark Scheduler encounters two independent Deadlocks when trying to kill executors either due to dynamic allocation or blacklisting
- [SPARK-27160] - Incorrect Literal Casting of DecimalType in OrcFilters
- [SPARK-27216] - Upgrade RoaringBitmap to 0.7.45 to fix Kryo unsafe ser/dser issue
- [SPARK-27244] - Redact Passwords While Using Option logConf=true
- [SPARK-27275] - Potential corruption in EncryptedMessage.transferTo
- [SPARK-27301] - DStreamCheckpointData failed to clean up because it's fileSystem cached
- [SPARK-27338] - Deadlock between TaskMemoryManager and UnsafeExternalSorter$SpillableIterator
- [SPARK-27347] - Fix supervised driver retry logic when agent crashes/restarts
- [SPARK-27496] - RPC should send back the fatal errors
- [SPARK-27577] - Wrong thresholds selected by BinaryClassificationMetrics when downsampling
- [SPARK-27621] - Calling transform() method on a LinearRegressionModel throws NoSuchElementException
- [SPARK-27624] - Fix CalendarInterval to show an empty interval correctly
- [SPARK-27626] - Fix `docker-image-tool.sh` to be robust in non-bash shell env
- [SPARK-27735] - Interval string in upper case is not supported in Trigger
- [SPARK-27798] - ConvertToLocalRelation should tolerate expression reusing output object
- [SPARK-27869] - Redact sensitive information in System Properties from UI
- [SPARK-27907] - HiveUDAF should return NULL in case of 0 rows
- [SPARK-28081] - word2vec 'large' count value too low for very large corpora
- [SPARK-28156] - Join plan sometimes does not use cached query
- [SPARK-28157] - Make SHS clear KVStore LogInfo for the blacklisted entries
- [SPARK-28160] - TransportClient.sendRpcSync may hang forever
- [SPARK-28164] - usage description does not match with shell scripts
- [SPARK-28302] - SparkLauncher: The process cannot access the file because it is being used by another process
- [SPARK-28308] - CalendarInterval sub-second part should be padded before parsing
- [SPARK-28404] - Fix negative timeout value in RateStreamContinuousPartitionReader
- [SPARK-28430] - Some stage table rows render wrong number of columns if tasks are missing metrics
- [SPARK-28582] - Pyspark daemon exit failed when receive SIGTERM on py3.7
- [SPARK-28699] - Cache an indeterminate RDD could lead to incorrect result while stage rerun
- [SPARK-28766] - Fix CRAN incoming feasibility warning on invalid URL
- [SPARK-28775] - DateTimeUtilsSuite fails for JDKs using the tzdata2018i or newer timezone database
- [SPARK-28780] - Delete the incorrect setWeightCol method in LinearSVCModel
- [SPARK-28844] - Fix typo in SQLConf FILE_COMRESSION_FACTOR
Improvement
- [SPARK-24898] - Adding spark.checkpoint.compress to the docs
- [SPARK-26604] - Register channel for stream request
- [SPARK-27358] - Update jquery to 1.12.x to pick up security fixes
- [SPARK-27563] - automatically get the latest Spark versions in HiveExternalCatalogVersionsSuite
- [SPARK-27672] - Add since info to string expressions
- [SPARK-27673] - Add since info to random, regex, null expressions
- [SPARK-27771] - Add SQL description for grouping functions (cube, rollup, grouping and grouping_id)
- [SPARK-28545] - Add the hash map size to the directional log of ObjectAggregationIterator
- [SPARK-28891] - do-release-docker.sh in master does not work for branch-2.3
Test
- [SPARK-24352] - Flaky test: StandaloneDynamicAllocationSuite
- [SPARK-28261] - Flaky test: org.apache.spark.network.TransportClientFactorySuite.reuseClientsUpToConfigVariable
- [SPARK-28335] - Flaky test: org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.offset recovery from kafka
- [SPARK-28357] - Fix Flaky Test - FileAppenderSuite.rolling file appender - size-based rolling compressed
- [SPARK-28361] - Test equality of generated code with id in class name
- [SPARK-28418] - Flaky Test: pyspark.sql.tests.test_dataframe: test_query_execution_listener_on_collect
- [SPARK-28535] - Flaky test: JobCancellationSuite."interruptible iterator of shuffle reader"
Task
- [SPARK-26897] - Update Spark 2.3.x testing from HiveExternalCatalogVersionsSuite
Documentation
- [SPARK-27800] - Example for xor function has a wrong answer
- [SPARK-28777] - Pyspark sql function "format_string" has the wrong parameters in doc string