Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream processing. We consider it has a huge potential to improve traditional development patterns in both transactional and analytical processing of data. Specifically it can be applied to event-driven applications, data pipelines and streaming analytics.

Employing dataflow programming, Beam supports a range of I/O connectors, but we find some gaps in the existing connectors especially in relation to the Python SDK. It fueled us to start the Apache Beam Python I/O Connectors project.