spark
https://github.com/apache/spark
Scala
Apache Spark - A unified analytics engine for large-scale data processing
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Scala not yet supported
76 Subscribers
Help out
- Issues
- [SPARK-30628][SQL] Support DPP and subquery partition pruning for V2 file sources
- Project DSv2 leaf node to Scan's schema
- [DO NOT MERGE][SPARK-55952][SPARK-55953][SQL] Add ResolveChangelogTable analyzer rule for batch CDC post-processing
- PushPredicateThroughNonJoin assertion failure when pushing filter through Project into Union (Hive view)
- [WIP][CONNECT][PYTHON] Add SQLContext wrapper for Spark Connect
- [GRAPHX] Fix sccWorkGraph cache leak in StronglyConnectedComponents
- [GRAPHX] Unpersist sccWorkGraph in StronglyConnectedComponents
- [SPARK-47998][PS] Fix misleading error message in pandas-on-Spark concat
- [SPARK-56633][SQL][TESTS] Add comprehensive Parquet vectorized-reader benchmark coverage
- [SPARK-56642][SQL] Add pipelined JVM-Python UDF data transfer
- Docs
- Scala not yet supported