
Python package that pulls in data from MongoDB and Google Analytics and presents it as a Polars DataFrame


Cyber4All/secured_python_pipeline


SecurEd Python Data Pipeline

A Python module that takes in data from Google Analytics and joins it with the appropriate MongoDB SecurEd collections, returning the result as Polars DataFrames for easier data analysis.

The DataFrame can be converted to other data types and file formats, such as CSV, a Pandas DataFrame, or an Apache Arrow table.

The version of PyMongo used throughout this module leverages the PyMongoArrow extension to output a Polars DataFrame automatically.

PyMongoArrow is a PyMongo extension containing tools for loading MongoDB query result sets as Apache Arrow tables, Pandas DataFrames, and NumPy arrays.

Requirements

A Google Analytics account that supplies the following credentials, stored as environment variables in a .env file:

  • GOOGLE_SERVICE_ACCOUNT_EMAIL
  • GOOGLE_PRIVATE_KEY

A MongoDB database URI:

  • MONGO_DB_URI
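Before running the pipeline, it can help to confirm that all three variables are actually set. The following is a minimal sketch using only the standard library; `parse_env_text` and `missing_vars` are hypothetical helpers written for this example and are not part of the package, and the parser only handles simple single-line `KEY=VALUE` entries.

```python
import os

# Variable names taken from the requirements listed above.
REQUIRED_VARS = (
    "GOOGLE_SERVICE_ACCOUNT_EMAIL",
    "GOOGLE_PRIVATE_KEY",
    "MONGO_DB_URI",
)


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper: does not handle multi-line or escaped values.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with placeholder values (not real credentials).
sample = (
    "GOOGLE_SERVICE_ACCOUNT_EMAIL=svc@example.iam.gserviceaccount.com\n"
    'MONGO_DB_URI="mongodb://localhost:27017"\n'
)
print(missing_vars(parse_env_text(sample)))  # ['GOOGLE_PRIVATE_KEY']
```

To check the live process environment instead of a file, call `missing_vars(os.environ)`.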

Importing

This package isn't currently on PyPI, so install it directly from Git:

uv

uv add git+https://github.com/Cyber4All/secured_python_pipeline.git

pip

pip install git+https://github.com/Cyber4All/secured_python_pipeline.git
