aws-glue-data-catalog

Here are 15 public repositories matching this topic...

This project manages, processes, and analyzes YouTube video data using AWS services. The pipeline handles both structured statistics and trending key metrics.

  • Updated Mar 20, 2024
  • Python

An ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. It stores the data in Parquet format for efficient querying and runs data quality checks to catch bad records before visualization.

  • Updated Jun 25, 2024
  • Python
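
The description above does not say what the data quality checks look like. As a rough illustration only, a minimal sketch of a pre-storage validation step might resemble the following. The field names (`symbol`, `price`, `volume`), the `Quote` type, and the `validate` function are all hypothetical and are not taken from the repository; the actual API response shape is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """Hypothetical cleaned record; field names are assumptions."""
    symbol: str
    price: float
    volume: int


def validate(records):
    """Split raw dicts into valid Quote rows and (record, reason) rejections."""
    valid, rejected = [], []
    for rec in records:
        symbol = rec.get("symbol")
        price = rec.get("price")
        volume = rec.get("volume")
        # bool is a subclass of int in Python, so exclude it explicitly.
        price_ok = isinstance(price, (int, float)) and not isinstance(price, bool)
        volume_ok = isinstance(volume, int) and not isinstance(volume, bool)
        if not isinstance(symbol, str) or not symbol.strip():
            rejected.append((rec, "missing symbol"))
        elif not price_ok or price <= 0:
            rejected.append((rec, "non-positive or missing price"))
        elif not volume_ok or volume < 0:
            rejected.append((rec, "negative or missing volume"))
        else:
            valid.append(Quote(symbol.strip().upper(), float(price), volume))
    return valid, rejected
```

Only the `valid` rows would then be written out (for example to Parquet), while the `rejected` pairs could be logged for inspection.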

This project creates a scalable data pipeline to analyze YouTube data from Kaggle using AWS services: S3, Glue, Lambda, Athena, and QuickSight. It processes raw JSON and CSV files into cleansed, partitioned datasets, integrates them with ETL workflows, and catalogs data for querying. Final insights are visualized in QuickSight dashboards.

  • Updated Jan 25, 2025
  • Python
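
For readers unfamiliar with partitioned datasets on S3, Glue and Athena recognize Hive-style `key=value` path segments as partitions. The sketch below shows one way such object keys could be built. The prefix, table name, `region` partition column, and the `partition_key` helper are illustrative assumptions, not details from the repository.

```python
from pathlib import PurePosixPath


def partition_key(prefix, table, partitions, filename):
    """Build a Hive-style S3 object key such as prefix/table/region=us/file."""
    segments = []
    for column, value in partitions.items():
        value = str(value)
        # A slash or equals sign inside a value would corrupt the path layout.
        if not value or "/" in value or "=" in value:
            raise ValueError(f"invalid partition value for {column!r}: {value!r}")
        segments.append(f"{column}={value}")
    return str(PurePosixPath(prefix, table, *segments, filename))


# Hypothetical names, for illustration only.
key = partition_key("cleansed", "raw_statistics", {"region": "us"}, "part-0000.parquet")
print(key)
```

Once objects land under keys like this, queries that filter on the partition column only need to scan the matching prefixes.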
