beam
https://github.com/apache/beam
Java
Apache Beam is a unified programming model for Batch and Streaming data processing.
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Java not yet supported
57 Subscribers
Help out
- Issues
- Use Executor and JobClient Flink APIs on FlinkRunner
- Failed to run dataflow to consume pubsub
- PubsubIO.readMessagesWithMessageId does not work with streaming dataflow pipeline
- Make Python side input tags always key, value pairs instead of depending on index suffixed tag names
- Autodetect Avro schema from Avro file
- Apache Beam/Dataflow flowed a CalledProcessError with beam.Pipeline("DataflowRunner", options=opts)
- Use FlinkRunner instead of PortableRunner for load tests
- Ensure that the environment is propagated through from ExpansionService to Dataflow
- Reading from pubsub in portable FlinkRunner (ambiguous ReadFromPubSub transform)
- Too many shards in GCS
- Docs
- Java not yet supported