spark
https://github.com/apache/spark
Scala
Apache Spark - A unified analytics engine for large-scale data processing
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Scala not yet supported
75 Subscribers
Help out
- Issues
- Pyspark: `DataFrame` methods behind `is_remote_only()` statically evaluate to `Union` during typechecking
- Add regression test for issue #55392 - Named parameters without legacy config
- [SPARK-57091][SQL] Add BroadcastNearestByJoinExec to avoid cross-product materialization
- [SPARK-57057][SQL] Target a specific branch on SELECT / INSERT via temporal clause and session config
- [SPARK-57064][SQL] Widen bucketing rule pattern matches to use FileSourceScanLike trait
- [SPARK-57055][SQL][DOCS] Document non-binary collation gap in DataFrameStatFunctions.bloomFilter
- [SPARK-57076][SQL] Add session-level default collation support
- [SPARK-57056][SQL] Add `SupportsBranching` DSv2 interface and branching DDL
- [SPARK-57029][SQL][TESTS] Add byte-level visibility golden for ICU collation sort keys
- [SPARK-57052][SS] Add state row format validation to multiGet in RocksDBStateStoreProvider
- Docs
- Scala not yet supported