
AWS Serverless ETL Sample

This project demonstrates a serverless ETL process using AWS services. It automatically processes CSV files uploaded to an S3 bucket by converting them to Parquet format.

Architecture

  • Amazon S3 buckets for raw (sample-raw) and processed (sample-processed) data
  • AWS Lambda function for data processing
  • Amazon EventBridge for event-driven processing
  • AWS SAM for infrastructure as code
  • Docker container image (Python 3.9 with pyarrow) used as the Lambda runtime
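A minimal sketch of how these pieces could be wired together in template.yaml, assuming EventBridge notifications on the raw bucket and a container-image Lambda (the logical IDs, bucket names, and property values here are illustrative, not the actual template):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  # Raw bucket forwards "Object Created" events to EventBridge
  RawBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: sample-raw
      NotificationConfiguration:
        EventBridgeConfiguration:
          EventBridgeEnabled: true

  ProcessedBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: sample-processed

  # Container-based Lambda built from the Dockerfile in this repository
  ProcessFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      Timeout: 300
      MemorySize: 1024
      Policies:
        - S3ReadPolicy:
            BucketName: sample-raw
        - S3CrudPolicy:
            BucketName: sample-processed
      Events:
        CsvUploaded:
          Type: EventBridgeRule
          Properties:
            Pattern:
              source: ["aws.s3"]
              detail-type: ["Object Created"]
              detail:
                bucket:
                  name: ["sample-raw"]
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: .
```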

Prerequisites

  • AWS SAM CLI
  • Docker
  • AWS CLI configured with appropriate credentials
  • GitHub repository with AWS credentials configured as secrets (needed only for automated deployment):
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
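A quick way to confirm the local tooling and credentials before deploying:

```bash
sam --version                  # AWS SAM CLI installed
docker --version               # Docker available for the container build
aws sts get-caller-identity    # AWS CLI credentials resolve to an account
```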

Deployment

Manual Deployment

  1. Build the application:
     sam build
  2. Deploy the application:
     sam deploy --guided
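The guided deploy prompts for a stack name, region, and related settings, then saves the answers to samconfig.toml, so later deployments can simply rerun the two commands:

```bash
# First deployment: answer the prompts; SAM writes the choices to samconfig.toml
sam build
sam deploy --guided

# Later deployments reuse samconfig.toml
sam build
sam deploy
```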

Automated Deployment

The project includes a GitHub Actions workflow that automatically builds and deploys the application when changes are pushed to the main branch. The workflow:

  1. Sets up Python and AWS SAM
  2. Configures AWS credentials
  3. Builds the application using SAM
  4. Deploys to AWS using SAM
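A sketch of what such a workflow might look like (the file path, action versions, region, and deploy flags are assumptions, not necessarily the repository's actual workflow):

```yaml
# .github/workflows/deploy.yml (illustrative)
name: Deploy

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # 1. Set up Python and AWS SAM
      - uses: actions/setup-python@v5
        with:
          python-version: '3.9'
      - uses: aws-actions/setup-sam@v2

      # 2. Configure AWS credentials from repository secrets
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      # 3. Build the application using SAM
      - run: sam build

      # 4. Deploy to AWS using SAM
      - run: sam deploy --no-confirm-changeset --no-fail-on-empty-changeset
```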

To use the automated deployment:

  1. Fork this repository
  2. Configure AWS credentials as GitHub repository secrets:
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
  3. Push changes to the main branch to trigger the deployment

Usage

  1. Upload a CSV file to the sample-raw bucket
  2. The Lambda function will automatically:
    • Read the uploaded CSV
    • Print the row count to CloudWatch Logs
    • Save the data as Parquet in the sample-processed bucket
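For example, assuming the bucket names above and a local file named data.csv:

```bash
# Upload a CSV to trigger the pipeline
aws s3 cp data.csv s3://sample-raw/data.csv

# Once the Lambda has run, the Parquet output should be listed here
aws s3 ls s3://sample-processed/

# The row count is printed to CloudWatch Logs (log group name depends on the function name)
aws logs tail /aws/lambda/ProcessFunction --follow
```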

Project Structure

  • template.yaml: SAM template defining AWS resources
  • src/app.py: Lambda function code
  • Dockerfile: Container configuration for Lambda
  • requirements.txt: Python dependencies
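The handler in src/app.py follows this general shape, shown here as a minimal sketch assuming an EventBridge "Object Created" event, boto3, and pyarrow (the function name, environment variable, and output key convention are illustrative):

```python
import os

import boto3
import pyarrow.csv as pacsv
import pyarrow.parquet as pq

s3 = boto3.client("s3")

# Illustrative: the processed bucket could also be hard-coded or injected by the template
PROCESSED_BUCKET = os.environ.get("PROCESSED_BUCKET", "sample-processed")


def handler(event, context):
    # EventBridge S3 "Object Created" events carry bucket and key in the detail section
    bucket = event["detail"]["bucket"]["name"]
    key = event["detail"]["object"]["key"]

    # Download the CSV to Lambda's writable scratch space
    local_csv = f"/tmp/{os.path.basename(key)}"
    s3.download_file(bucket, key, local_csv)

    # Read the CSV with pyarrow and report the row count to CloudWatch Logs
    table = pacsv.read_csv(local_csv)
    print(f"{key}: {table.num_rows} rows")

    # Convert to Parquet and upload to the processed bucket
    local_parquet = os.path.splitext(local_csv)[0] + ".parquet"
    pq.write_table(table, local_parquet)
    out_key = os.path.splitext(key)[0] + ".parquet"
    s3.upload_file(local_parquet, PROCESSED_BUCKET, out_key)

    return {"rows": table.num_rows, "output": f"s3://{PROCESSED_BUCKET}/{out_key}"}
```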
