Feature request: Preprocessing for batch processor #5722

@sarflux

Description

Use case

I use Powertools for AWS Lambda (Python) for batch processing, specifically async_process_partial_response, to process incoming records before invoking other AWS services.

For example, one of my use cases involves consuming a batch of messages from SQS, processing each record, and then calling EventBridge put_events through aiobotocore from the async_record_handler.

Because the record handler is tightly coupled to async_process_partial_response, I cannot preprocess the batch and fail individual items before async_record_handler is invoked. Every item has to pass through async_record_handler, so my synchronous preprocessing logic has to live inside that function.
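To illustrate the coupling, here is a minimal sketch of the current pattern. It uses plain asyncio stand-ins rather than Powertools or aiobotocore, and validate_record and publish are hypothetical helpers invented for this example.

```python
import asyncio
import json


def validate_record(record: dict) -> dict:
    """Hypothetical synchronous preprocessing: parse and check the message body."""
    body = json.loads(record["body"])
    if "detail" not in body:
        raise ValueError("missing 'detail'")
    return body


async def publish(body: dict) -> str:
    """Hypothetical stand-in for an aiobotocore put_events call."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return f"published:{body['detail']}"


async def async_record_handler(record: dict) -> str:
    # The synchronous check runs here, once per record, inside the async handler.
    # There is no earlier hook where the whole batch could be validated first.
    body = validate_record(record)
    return await publish(body)
```

A record that fails validate_record raises from inside the async handler, which is the only place a record can currently be marked as failed.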

Solution/User Experience

It would be a big performance boost for my workload if I could preprocess the entire batch up front, mark the items that fail my preprocessing function, and then process only the remaining items in async_record_handler. async_process_partial_response could then collect the failed records from both the preprocessing function and the async record handler and report them together in the partial response.

I imagine a synchronous preprocessing handler being passed in as an additional parameter, like the following:

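from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.batch import AsyncBatchProcessor, EventType, async_process_partial_response
from aws_lambda_powertools.utilities.typing import LambdaContext

logger = Logger()
processor = AsyncBatchProcessor(event_type=EventType.SQS)

@logger.inject_lambda_context(log_event=True, clear_state=True)
def lambda_handler(event, context: LambdaContext):
    return async_process_partial_response(
        event=event,
        preprocessing_record_handler=preprocess_sync_handler,  # proposed parameter, not part of the current API
        record_handler=async_record_handler,
        processor=processor,
        context=context,
    )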
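As a rough sketch of the behaviour I have in mind (not how Powertools is implemented), the two phases could look like this in plain asyncio. The function name two_phase_partial_response and its parameters are hypothetical; only the batchItemFailures / itemIdentifier response shape is the standard SQS partial batch response.

```python
import asyncio
from typing import Any, Awaitable, Callable


def two_phase_partial_response(
    event: dict,
    preprocess: Callable[[dict], Any],
    record_handler: Callable[[Any], Awaitable[Any]],
) -> dict:
    """Hypothetical helper: sync preprocessing first, async handling second."""
    failed_ids = []
    ready = []  # (messageId, preprocessed item) pairs that passed phase 1

    # Phase 1: run the synchronous preprocessing over the whole batch.
    for record in event.get("Records", []):
        try:
            ready.append((record["messageId"], preprocess(record)))
        except Exception:
            failed_ids.append(record["messageId"])

    # Phase 2: run the async handler concurrently on the surviving items only.
    async def run_all():
        return await asyncio.gather(
            *(record_handler(item) for _, item in ready),
            return_exceptions=True,
        )

    results = asyncio.run(run_all())
    for (message_id, _), result in zip(ready, results):
        if isinstance(result, BaseException):
            failed_ids.append(message_id)

    # Failures from both phases are reported together.
    return {"batchItemFailures": [{"itemIdentifier": i} for i in failed_ids]}
```

Records rejected in phase 1 never reach the async handler, and the returned response lists failures from both phases, preprocessing failures first.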

Alternative solutions

I can currently solve my use case, but only by not using the batch processing utility of Powertools for AWS Lambda (Python).

I am happy to provide snippets and/or PRs to help develop this.

Metadata

Labels

batch: Batch processing utility

Projects

Status

Closed
