Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream processing. We consider it has a huge potential to improve traditional development patterns in both transactional and analytical processing of data. Specifically it can be applied to event-driven applications, data pipelines and streaming analytics.

Employing dataflow programming, Beam supports a range of I/O connectors, but we find some gaps in the existing connectors especially in relation to the Python SDK. It fueled us to start the Apache Beam Python I/O Connectors project.

Amazon Data Firehose is a fully managed service for delivering real-time streaming data to destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon OpenSearch Service and Amazon OpenSearch Serverless. The Apache Beam Python I/O connector for Amazon Data Firehose (firehose_pyio) provides a data sink feature that facilitates integration with those services.