Bug
- [SPARK-17902] - collect() ignores stringsAsFactors
- [SPARK-18971] - Netty issue may cause the shuffle client to hang
- [SPARK-20466] - HadoopRDD#addLocalConfiguration throws NPE
- [SPARK-21278] - Upgrade to Py4J 0.10.6
- [SPARK-21551] - pyspark's collect fails when getaddrinfo is too slow
- [SPARK-21991] - [LAUNCHER] LauncherServer acceptConnections thread sometimes dies if machine has very high load
- [SPARK-22083] - When dropping multiple blocks to disk, Spark should release all locks on a failure
- [SPARK-22206] - gapply in R can't work on empty grouping columns
- [SPARK-22273] - Fix key/value schema field names in HashMapGenerators.
- [SPARK-22327] - R CRAN check fails on non-latest branches
- [SPARK-22373] - Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom
- [SPARK-22377] - Maven nightly snapshot jenkins jobs are broken on multiple workers due to lsof
- [SPARK-22429] - Streaming checkpointing code does not retry after failure due to NullPointerException
- [SPARK-22548] - Incorrect nested AND expression pushed down to JDBC data source
- [SPARK-22862] - Docs on lazy elimination of columns missing from an encoder.
- [SPARK-23053] - taskBinarySerialization and task partitions calculate in DagScheduler.submitMissingTasks should keep the same RDD checkpoint status
- [SPARK-23438] - DStreams could lose blocks with WAL enabled when driver crashes
- [SPARK-23697] - Accumulators of Spark 1.x no longer work with Spark 2.x
- [SPARK-23732] - Broken link to scala source code in Spark Scala api Scaladoc
- [SPARK-24257] - LongToUnsafeRowMap calculate the new size may be wrong
- [SPARK-24589] - OutputCommitCoordinator may allow duplicate commits
Improvement
- [SPARK-18136] - Make PySpark pip install work on Windows
- [SPARK-22688] - Upgrade Janino version to 3.0.8
- [SPARK-22897] - Expose stageAttemptId in TaskContext