spark
https://github.com/apache/spark
Scala
Apache Spark - A unified analytics engine for large-scale data processing
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Scala not yet supported
75 Subscribers
Help out
- Issues
- [SPARK-57521][ML][CONNECT] Exclude parent from Model.estimatedSize to fix overcounting in ML cache
- [SPARK-57515][SQL] Surface MALFORMED_CSV_RECORD instead of ArrayIndexOutOfBoundsException when CSV header exceeds maxColumns
- [SPARK-57352][SDP] Detect resolved table references in pipeline flow lineage
- [SPARK-56971][SS] Add CommitMetadataV3 and SinkMetadataInfo for sink evolution
- [SPARK-57500][SQL] Escape backslash in MySQL JDBC pushdown for string literals in comparison/IN predicates
- Fix: enable spark archive reader config
- [SPARK-57471][SQL] Populate input/output size metrics for JDBC reads and writes
- [SPARK-57491][CORE] Prevent and detect stale push-based shuffle data from duplicate map task attempts
- [SPARK-57472][SQL] Make FileTable.mergedOptions merge table and relation options case-insensitively
- [SPARK-45154][ML] Fixed an issue where CrossValidator yields non-deterministic trees in Spark 3 due to Python's hash randomization on the default seed. Replaced it with a stable zlib.crc32 seed.
- Docs
- Scala not yet supported