spark
https://github.com/apache/spark
Scala
Apache Spark - A unified analytics engine for large-scale data processing
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Scala not yet supported
75 Subscribers
Help out
- Issues
  - [SPARK-56368] Make the number of rows per custom metrics update configurable
  - [SPARK-56373][PYSPARK] Add docstring annotations to classify PySpark APIs for Spark Connect compatibility
  - KLL_INVALID_INPUT_SKETCH_BUFFER error when empty KLL sketch is read
  - NullPointerException in KafkaMicroBatchStream.metrics when latestPartitionOffsets is uninitialized
  - [SPARK-56231][SQL] Bucket pruning and bucket join optimization for V2 file read path
  - [SPARK-56232][SQL][SS] V2 streaming read for FileTable
  - [SPARK-56233][SQL][SS] V2 streaming write for FileTable
  - [SPARK-56367][SS][PYTHON][DOCS] Fix latestOffset docstring and tutorial to use correct field name and signature
  - [SPARK-56370] Optimize Platform.copyMemory with a fast path for small copies
  - [PYTHON] Allow `PathLike` path objects as input to `readwriter`
- Docs
  - Scala not yet supported