
Simple Python script to convert multiple JSON files into a Parquet or ORC file to be used on Hadoop.


galanteh/JSON2Tables


What is this project?

This utility converts JSON files into a Parquet or ORC file that can then be used on platforms such as a Cloudera cluster, Snowflake, or Databricks.

How to use it?

Parquet

python3 json2tables.py -i ./input -o ./output/cfdi.parquet

ORC

python3 json2tables.py -i ./input -o ./output/cfdi.orc
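
The source of json2tables.py is not shown here, so the following is only a minimal sketch of how such a conversion could work, not the script's actual implementation. It assumes that the output format is chosen from the extension of the -o path (as the two commands above suggest), that every *.json file in the input folder holds either one object or a list of objects, and that the third-party pyarrow package does the writing. The function names detect_format, load_records, and write_table are hypothetical.

```python
import json
from pathlib import Path


def detect_format(output_path):
    """Pick the output format from the file extension (assumed behaviour)."""
    suffix = Path(output_path).suffix.lower()
    if suffix == ".parquet":
        return "parquet"
    if suffix == ".orc":
        return "orc"
    raise ValueError(f"unsupported output extension: {suffix!r}")


def load_records(input_dir):
    """Read every *.json file in input_dir into one list of dicts."""
    records = []
    # Sort the paths so the row order is stable between runs.
    for path in sorted(Path(input_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: a file holds either one object or a list of objects.
        if isinstance(data, list):
            records.extend(data)
        else:
            records.append(data)
    return records


def write_table(records, output_path):
    """Write the records as Parquet or ORC using pyarrow (third-party)."""
    # Imported lazily so the helpers above work without pyarrow installed.
    import pyarrow as pa

    table = pa.Table.from_pylist(records)
    if detect_format(output_path) == "parquet":
        import pyarrow.parquet as pq
        pq.write_table(table, output_path)
    else:
        import pyarrow.orc as orc
        orc.write_table(table, output_path)
```

With pyarrow installed, calling write_table(load_records("./input"), "./output/cfdi.parquet") would roughly mirror the Parquet command above, and the same call with a path ending in .orc would mirror the ORC one. Building the table from a list of dicts keeps the sketch short, but it assumes the records share a compatible set of fields.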
