spark
https://github.com/apache/spark
Scala
Apache Spark - A unified analytics engine for large-scale data processing
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Scala not yet supported
75 Subscribers
Help out
- Issues
  - [SPARK-56966][SPARK-56967][CORE] Auto-create missing event log directories
  - [SQL][DML] Resolve Path Based Tables for both Reads and Writes
  - [SPARK-55791][PYTHON] Fix pandas-on-Spark equality comparisons under ANSI mode
  - [SPARK-56952][CORE] Preserve heartbeat timeout executor loss reason
  - Spark Declarative Pipelines: enable primary key check as expectation
  - [SPARK-56932][SQL] Rewrite top-level single-column NOT IN subqueries with UNION
  - Automatically create the spark history log directory
  - [SPARK-56942][SQL] Widen DSv2 row-id resolution to support nested columns
  - [SPARK-56940][SQL] Extend OptimizeRand optimizer to support arithmetic expressions
  - [WIP] Fixing PySpark benchmark build & install
- Docs
  - Scala not yet supported