Release Notes - ASF JIRA

Release Notes - Spark - Version 2.1.3

Bug

[SPARK-17902] - collect() ignores stringsAsFactors
[SPARK-18971] - Netty issue may cause the shuffle client to hang
[SPARK-20466] - HadoopRDD#addLocalConfiguration throws NPE
[SPARK-21278] - Upgrade to Py4J 0.10.6
[SPARK-21551] - pyspark's collect fails when getaddrinfo is too slow
[SPARK-21991] - [LAUNCHER] LauncherServer acceptConnections thread sometimes dies if the machine is under very high load
[SPARK-22083] - When dropping multiple blocks to disk, Spark should release all locks on a failure
[SPARK-22206] - gapply in R does not work with empty grouping columns
[SPARK-22273] - Fix key/value schema field names in HashMapGenerators
[SPARK-22327] - R CRAN check fails on non-latest branches
[SPARK-22373] - Intermittent NullPointerException in org.codehaus.janino.IClass.isAssignableFrom
[SPARK-22377] - Maven nightly snapshot Jenkins jobs are broken on multiple workers due to lsof
[SPARK-22429] - Streaming checkpointing code does not retry after failure due to NullPointerException
[SPARK-22548] - Incorrect nested AND expression pushed down to JDBC data source
[SPARK-22862] - Docs on lazy elimination of columns missing from an encoder
[SPARK-23053] - taskBinarySerialization and task partition calculation in DAGScheduler.submitMissingTasks should keep the same RDD checkpoint status
[SPARK-23438] - DStreams could lose blocks with WAL enabled when driver crashes
[SPARK-23697] - Accumulators of Spark 1.x no longer work with Spark 2.x
[SPARK-23732] - Broken link to Scala source code in Spark Scala API Scaladoc
[SPARK-24257] - LongToUnsafeRowMap may calculate the new size incorrectly
[SPARK-24589] - OutputCommitCoordinator may allow duplicate commits

Improvement

[SPARK-18136] - Make PySpark pip install work on Windows
[SPARK-22688] - Upgrade Janino version to 3.0.8
[SPARK-22897] - Expose stageAttemptId in TaskContext
