This code was mostly generated by Gemini AI. Good job on such a simple and tedious task, buddy!
This project exists to back up the Cover Art Archive's original-size cover art images.
Create a virtual environment and install the Python 3 prerequisites:
python -m venv .ve
source .ve/bin/activate
pip install -r requirements.txt
Now copy dot-env-sample to .env:
cp dot-env-sample .env
Then edit .env according to your needs:
PG_CONN_STRING -- the PostgreSQL connection string for access to a MusicBrainz database
DB_PATH="caa_backup.db" -- the location of the local database file that keeps track of progress
BACKUP_DIR="caa-backup" -- the cache directory where the downloaded files are stored
DOWNLOAD_THREADS=12 -- the number of threads to use for simultaneous downloads
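As an illustration of how these settings might be consumed, here is a minimal sketch. It is not the project's actual loader: the Settings dataclass and the load_settings helper are hypothetical names, the fallback values are simply the sample values listed above, and it assumes the variables from .env have already been exported into the environment (for example by a dotenv-style loader).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    pg_conn_string: str
    db_path: str
    backup_dir: str
    download_threads: int


def load_settings(env=None):
    """Build Settings from a mapping of environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # The connection string has no sensible default, so fail early without it.
    if "PG_CONN_STRING" not in env:
        raise KeyError("PG_CONN_STRING must be set")
    threads = int(env.get("DOWNLOAD_THREADS", "12"))
    if threads < 1:
        raise ValueError("DOWNLOAD_THREADS must be at least 1")
    return Settings(
        pg_conn_string=env["PG_CONN_STRING"],
        db_path=env.get("DB_PATH", "caa_backup.db"),
        backup_dir=env.get("BACKUP_DIR", "caa-backup"),
        download_threads=threads,
    )
```

Passing a plain dict instead of os.environ makes the helper easy to try out without touching the real environment.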
The easiest way to use this system is through the manage.py script, which provides a unified interface:
# Activate your virtual environment first
source .ve/bin/activate
# View all available commands
python manage.py --help
# Check system status and configuration
python manage.py status
# Import data from PostgreSQL (first run)
python manage.py import-data
# Download cover art images (this will take DAYS or WEEKS!)
python manage.py download
# Verify local cache against database
python manage.py verify
# Start standalone monitoring server
python manage.py monitor --port 8080
- import-data: Import from PostgreSQL to SQLite
    --batch-size INTEGER: Records per batch (default: 1000)
    --force: Overwrite existing database
    --incremental: Import only new records since last import
- download: Download cover art images
    --threads INTEGER: Download threads (default: 8)
    --batch-size INTEGER: Records per batch (default: 1000)
    --monitor-port INTEGER: Monitoring port (default: 8080)
- verify: Verify cache against database
- monitor: Standalone monitoring server
    --port INTEGER: Server port (default: 8080)
    --host TEXT: Server host (default: localhost)
- status: Display system status and statistics
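For readers who want to see how a command layout like this hangs together, the sketch below rebuilds the same commands, options, and defaults with the standard library's argparse. It is only an approximation under stated assumptions: the real manage.py may well be built on a different CLI library, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Mirror the documented commands and defaults (illustrative only)."""
    parser = argparse.ArgumentParser(prog="manage.py")
    sub = parser.add_subparsers(dest="command", required=True)

    imp = sub.add_parser("import-data", help="Import from PostgreSQL to SQLite")
    imp.add_argument("--batch-size", type=int, default=1000)
    imp.add_argument("--force", action="store_true")
    imp.add_argument("--incremental", action="store_true")

    dl = sub.add_parser("download", help="Download cover art images")
    dl.add_argument("--threads", type=int, default=8)
    dl.add_argument("--batch-size", type=int, default=1000)
    dl.add_argument("--monitor-port", type=int, default=8080)

    sub.add_parser("verify", help="Verify cache against database")

    mon = sub.add_parser("monitor", help="Standalone monitoring server")
    mon.add_argument("--port", type=int, default=8080)
    mon.add_argument("--host", default="localhost")

    sub.add_parser("status", help="Display system status and statistics")
    return parser
```

Calling build_parser().parse_args(["download", "--threads", "4"]) yields a namespace with command set to "download", threads set to 4, and the remaining options at their documented defaults.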
For the first run:
- Import the database:
python manage.py import-data
- Download images:
python manage.py download
To keep the backup up-to-date:
- Update with new records:
python manage.py import-data --incremental
- Verify existing files:
python manage.py verify
- Download new files:
python manage.py download
Alternatively, for a complete refresh:
- Re-import all data:
python manage.py import-data --force
- Verify existing files:
python manage.py verify
- Download new files:
python manage.py download
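The difference between an incremental import and a full re-import can be sketched as follows. This is a rough illustration, not the project's importer: the cover_art table layout with a filename column, the import_incremental helper, and the assumption that new records always receive higher ids are all hypothetical. The source_rows argument stands in for rows fetched from PostgreSQL.

```python
import sqlite3
from itertools import islice


def import_incremental(conn, source_rows, batch_size=1000):
    """Insert rows whose id is greater than the highest id already stored.

    source_rows is an iterable of (id, filename) tuples standing in for the
    PostgreSQL result set. Table and column names are hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cover_art ("
        "id INTEGER PRIMARY KEY, filename TEXT, downloaded INTEGER DEFAULT 0)"
    )
    last_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM cover_art"
    ).fetchone()[0]
    # Lazily skip everything that was imported on a previous run.
    new_rows = (row for row in source_rows if row[0] > last_id)
    imported = 0
    while True:
        batch = list(islice(new_rows, batch_size))
        if not batch:
            break
        conn.executemany(
            "INSERT INTO cover_art (id, filename) VALUES (?, ?)", batch
        )
        conn.commit()  # commit per batch so progress survives an interruption
        imported += len(batch)
    return imported
```

Against a real database the id filter would more sensibly be pushed into the PostgreSQL query itself rather than applied in Python.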
You can still run the individual scripts directly:
Run caa_importer.py to download the cover_art_archive.cover_art table into SQLite.
Then, run caa_downloader.py to download the cover art images. This is going to take DAYS, if not WEEKS!
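A multi-threaded download loop of the kind described here could look roughly like the sketch below. It is an assumption-laden illustration rather than the contents of caa_downloader.py: download_all is a hypothetical name, and fetch is a caller-supplied function that downloads a single item and raises an exception on failure.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(items, fetch, threads=8):
    """Run fetch(item) on a thread pool and split the items into ok/failed."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Map each future back to its item so failures can be reported.
        futures = {pool.submit(fetch, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                future.result()
                succeeded.append(item)
            except Exception:
                # A failed item is recorded and can be retried on a later run.
                failed.append(item)
    return succeeded, failed
```

Collecting failures instead of aborting matters for a job that runs for days: one bad response should not stop the remaining downloads.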
To keep the backup up-to-date, run the caa_importer.py script again to re-download all the CAA data, then re-run caa_verify.py to mark all the downloaded files as downloaded.
Finally, run caa_downloader.py again to download any new files that were added since the last run.
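The verify step, which marks files already on disk as downloaded after a fresh import, might work along these lines. This is a minimal sketch under assumptions: the cover_art table with filename and downloaded columns and the verify_cache helper are hypothetical, and the real caa_verify.py may track files differently.

```python
import os
import sqlite3


def verify_cache(conn, backup_dir):
    """Set downloaded=1 for rows whose file exists on disk, else 0.

    Assumes a hypothetical table cover_art(id, filename, downloaded) where
    filename is relative to backup_dir. Returns the number of files found.
    """
    rows = conn.execute("SELECT id, filename FROM cover_art").fetchall()
    updates = [
        (1 if os.path.isfile(os.path.join(backup_dir, filename)) else 0, row_id)
        for row_id, filename in rows
    ]
    conn.executemany("UPDATE cover_art SET downloaded = ? WHERE id = ?", updates)
    conn.commit()
    return sum(flag for flag, _ in updates)
```

After this pass, a downloader only needs to fetch the rows where downloaded is still 0.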