FredrikBakken/lakefs-datafusion

Prerequisites

This guide assumes that you have already installed the following:

  • Docker
  • Docker Compose
  • Rust (v1.78.0 used for development)

Docker

LakeFS runs in Docker containers. Start the service by running the following command in the terminal:

docker compose up --build

Confirm that LakeFS is running by opening http://localhost:8000 in a browser.
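If you prefer to check from code rather than a browser, a plain TCP connection attempt is enough to tell whether anything is listening on the port. The sketch below uses only the Rust standard library; the helper name `is_reachable` and the two-second timeout are illustrative choices, not part of this repository. It confirms only that the port accepts connections, not that the listener is LakeFS.

```rust
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` (for example "localhost:8000")
/// succeeds within `timeout`. Hypothetical helper, not part of the repository.
fn is_reachable(addr: &str, timeout: Duration) -> bool {
    // Resolve the address; a malformed or unresolvable address counts as unreachable.
    let mut addrs = match addr.to_socket_addrs() {
        Ok(addrs) => addrs,
        Err(_) => return false,
    };
    // "localhost" may resolve to both IPv4 and IPv6; accept the first that connects.
    addrs.any(|a| TcpStream::connect_timeout(&a, timeout).is_ok())
}

fn main() {
    let addr = "localhost:8000";
    if is_reachable(addr, Duration::from_secs(2)) {
        println!("something is listening on {addr}");
    } else {
        println!("nothing is listening on {addr}; is `docker compose up --build` still running?");
    }
}
```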

Dataset

The dataset used in this example is the NYC Taxi Trip dataset. Download the .parquet file from here: https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2022-02.parquet

Once downloaded, go to the demo repository and upload the dataset to the path taxi_data/input/.
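LakeFS exposes an S3-compatible gateway in which the repository acts as the bucket and the object key begins with the branch (or other ref) name. The sketch below shows how the uploaded file would be addressed under that scheme. It assumes the file was uploaded to the main branch and that the application reaches LakeFS through the S3 gateway; the helper name `lakefs_uri` is hypothetical and does not come from this repository.

```rust
/// Builds an S3-style URI of the form s3://<repository>/<ref>/<path>.
/// Hypothetical helper for illustration only.
fn lakefs_uri(repository: &str, reference: &str, path: &str) -> String {
    // Drop any leading slash so the URI never contains "//" after the ref.
    format!("s3://{}/{}/{}", repository, reference, path.trim_start_matches('/'))
}

fn main() {
    // Assumption: the dataset was uploaded to the main branch of the demo repository.
    let input = lakefs_uri(
        "demo",
        "main",
        "taxi_data/input/yellow_tripdata_2022-02.parquet",
    );
    println!("{input}");
}
```

Keeping the ref as a separate argument makes it easy to point the same code at another branch without changing the object path.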

Running the Application

All that remains is to build and run the application with:

cargo build
cargo run

When the run completes, the application prints its output to the terminal window. It also writes a set of files to taxi_data/output/ on the main branch of the demo repository, which you can browse at: http://localhost:8000/repositories/demo/objects?ref=main&path=taxi_data%2Foutput%2F.
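The link above percent-encodes the object path (each `/` becomes `%2F`) and puts it in the `path` query parameter. To browse a different ref or path, the same URL shape can be rebuilt as in the following sketch. The helpers `percent_encode` and `objects_url` are illustrative, not part of this repository, and the URL layout is inferred only from the link shown above.

```rust
/// Percent-encodes every byte except the RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Builds an object-browser URL in the same shape as the link in this README.
/// Hypothetical helper; the layout is copied from that link, not from an API reference.
fn objects_url(base: &str, repository: &str, reference: &str, path: &str) -> String {
    format!(
        "{}/repositories/{}/objects?ref={}&path={}",
        base.trim_end_matches('/'),
        repository,
        percent_encode(reference),
        percent_encode(path)
    )
}

fn main() {
    let url = objects_url("http://localhost:8000", "demo", "main", "taxi_data/output/");
    println!("{url}");
}
```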

About

:octocat: Read/write files from/to LakeFS using Apache DataFusion
