We are happy to present the first release of the Apache Beam Python I/O connector for Amazon Data Firehose.
✨NEW
- Add a composite transform (`WriteToFirehose`) that puts records into a Firehose delivery stream in batches, using the `put_record_batch` method of the boto3 package.
- Provide options that control how individual records are handled.
  - `jsonify` - A flag that indicates whether to convert a record into JSON. Note that a record should be *bytes*, *bytearray*, or a file-like object; if it is not of a supported type (e.g. an integer), it can be converted into a JSON string by setting this flag to True.
  - `multiline` - A flag that indicates whether to append a newline character (`\n`) to each record, which is useful for saving records into a CSV or JSON Lines file.
- Create a dedicated pipeline option (`FirehoseOptions`) that reads AWS-related values (e.g. `aws_access_key_id`) from pipeline arguments.
- Implement metric objects that record the total, succeeded, and failed element counts.
- Add unit and integration test cases. The moto and localstack-utils packages are used for unit and integration testing, respectively. A custom test client is also created to test the retrying of failed elements, which is not supported by the moto package.
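For context on the batch-writing behaviour: the Firehose `PutRecordBatch` API accepts at most 500 records per request (an AWS service limit, not something specific to this connector), so any batch writer has to chunk its input. A minimal, connector-independent sketch of that chunking:

```python
# PutRecordBatch accepts at most 500 records per request (AWS API limit).
MAX_BATCH_SIZE = 500


def chunk_records(records, max_batch_size=MAX_BATCH_SIZE):
    """Yield lists of records, each no larger than max_batch_size."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) >= max_batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch
```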
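The effect of the two record options can be sketched in plain Python. Here `prepare_record` is a hypothetical helper written for illustration only, not the connector's actual implementation:

```python
import json


def prepare_record(record, jsonify=False, multiline=False):
    """Illustrate the jsonify and multiline flags on a single record."""
    if jsonify:
        # Convert a record of an unsupported type (e.g. dict, int)
        # into a JSON string.
        record = json.dumps(record)
    if multiline:
        # Append a newline so records concatenate into CSV or
        # JSON Lines rows.
        record = record + "\n" if isinstance(record, str) else record + b"\n"
    return record
```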
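Retrying failed elements can be sketched against the `put_record_batch` response shape, which reports a `FailedPutCount` and per-record `RequestResponses` entries that carry an `ErrorCode` on failure. The fake client below stands in for boto3 purely for illustration; it is not the custom test client mentioned above:

```python
def retry_failed_records(client, stream, records, max_attempts=3):
    """Re-send only the records whose response entries carry an ErrorCode."""
    for _ in range(max_attempts):
        resp = client.put_record_batch(
            DeliveryStreamName=stream, Records=records
        )
        if resp["FailedPutCount"] == 0:
            return []
        # Keep only the records whose matching response entry failed.
        records = [
            rec for rec, res in zip(records, resp["RequestResponses"])
            if "ErrorCode" in res
        ]
    return records  # records still failing after all attempts


class FlakyClient:
    """Fake client: fails every record once, then accepts it."""

    def __init__(self):
        self.seen = set()

    def put_record_batch(self, DeliveryStreamName, Records):
        responses, failed = [], 0
        for rec in Records:
            if rec["Data"] in self.seen:
                responses.append({"RecordId": "ok"})
            else:
                self.seen.add(rec["Data"])
                responses.append({"ErrorCode": "ServiceUnavailableException",
                                  "ErrorMessage": "throttled"})
                failed += 1
        return {"FailedPutCount": failed, "RequestResponses": responses}
```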
See this post for more examples.