This project streams large datasets into a PostgreSQL database. Tools used:
Dataset: Datablist provides free CSV files for testing.
Each dataset is available in several CSV sizes, from 100 to 2 million records. The first line contains the CSV headers. Every file has an index column whose value starts at 1 for the first data row and increments with each row.
All datasets are free to download and experiment with. The data is random, so the files must only be used for testing.
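
Given the file layout described above (a header line, then rows with an index that counts up from 1), a file can be read in fixed-size batches rather than loaded whole. The following is a minimal sketch only: the names `iter_batches` and `check_index`, the column name `Index`, and the sample rows are assumptions made for illustration, not part of this project.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, TextIO


def iter_batches(handle: TextIO, batch_size: int = 1000) -> Iterator[list[dict[str, str]]]:
    """Yield lists of rows (dicts keyed by header) without loading the whole file."""
    reader = csv.DictReader(handle)  # the first line supplies the column names
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


def check_index(batches: Iterable[list[dict[str, str]]], column: str = "Index") -> int:
    """Verify the index column counts up from 1; return the number of rows seen."""
    expected = 1
    for batch in batches:
        for row in batch:
            if int(row[column]) != expected:
                raise ValueError(f"expected index {expected}, got {row[column]}")
            expected += 1
    return expected - 1


# Tiny in-memory stand-in for a downloaded file (hypothetical columns and values).
sample = io.StringIO("Index,Name\n1,Ada\n2,Grace\n3,Linus\n")
batches = list(iter_batches(sample, batch_size=2))
row_count = check_index(batches)
```

For a real file, the handle would come from something like `open(path, newline="", encoding="utf-8")`, and each batch could be handed to whatever inserts the rows into the database.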
docker run -d \
--name 10mdspf \
--restart=unless-stopped \
-e POSTGRES_USER=10mdspf \
-e POSTGRES_PASSWORD=10mdspf \
-e POSTGRES_DB=10mdspf \
-e PGDATA=/var/lib/postgresql/data \
-p 5433:5432 \
-v 10mdspf_data:/var/lib/postgresql/data \
postgres:latest

Prerequisites:
Before running the container with a mounted volume, ensure the host directory has the correct permissions. The RustFS container runs as a non-root user (UID 10001), so the host directory owner must be set to match this, or you will encounter "Permission denied" errors.
# Create data and logs directories on the host
mkdir -p data logs
# Change the owner of these directories to UID 10001
sudo chown -R 10001:10001 data logs

Running the Container
Use the docker run command to start the RustFS container in the background, mapping ports and volumes:
docker run -d \
--name rustfs \
--restart=unless-stopped \
-p 9000:9000 \
-p 9001:9001 \
-v $(pwd)/data:/data \
-v $(pwd)/logs:/logs \
-e RUSTFS_ACCESS_KEY=rustfsadmin \
-e RUSTFS_SECRET_KEY=rustfsadmin \
rustfs/rustfs:latest

celery -A config.celery_app worker -l info
celery -A config.celery_app flower
Flower UI: http://localhost:5555
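
The PostgreSQL `docker run` command earlier in this document publishes the database on host port 5433 and uses `10mdspf` as user, password, and database name. As a rough sketch, those settings correspond to the connection URL built below. The helper name `postgres_url` is hypothetical, and `localhost` assumes the client runs on the Docker host.

```python
from urllib.parse import quote, urlsplit


def postgres_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a postgresql:// connection URL, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Values copied from the PostgreSQL `docker run` command; "localhost" is an
# assumption (client on the Docker host). 5433 is the published host port,
# not the container's internal 5432.
DATABASE_URL = postgres_url("10mdspf", "10mdspf", "localhost", 5433, "10mdspf")
parts = urlsplit(DATABASE_URL)
```

How the project itself reads its database settings is not shown here. A URL of this shape is one common way to pass them to a client library or an environment variable.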