This is the number cruncher we use to generate fingerprints for whole audio
files that artists have uploaded, and to match monitored use of music so that it
can be associated with artists. It uses click to run as a command line tool
with options such as `preview`, `checksum`, or `fingerprint`.

This is the current implementation status:
- preview
  - creates a smaller audio preview file
  - cuts out a one minute excerpt from the file
  - retrieves metadata from the file and provides it in the `content` db table
- checksum
  - looks for files in `previewed_path` (see config.ini)
  - hashes each file with sha256, both the complete file and each 1 MiB block of the file
  - moves it into `checksummed_path`
  - sets the status in the `creation` table
  - writes the hash to the `content` table
- fingerprint
  - looks for files in `checksummed_path` (see config.ini) and uses echoprint-codegen to get a raw JSON with an audio fingerprint
  - fills in metadata that is missing in the JSON from the `content` and `creation` tables via proteus
  - sets the status in the `creation` table to fingerprinted
  - uploads ('ingests') the JSONs including the fingerprints to our EchoPrint server
  - makes a test query before and after ingesting the print and stores statistical data (score, uniqueness factor)
  - moves the file into `fingerprinted_path`
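
The hashing described for the checksum step can be sketched as follows. This is a minimal illustration, not the code in repro.py: the function name `hash_file` and the constant `BLOCK_SIZE` are hypothetical, and only the algorithm (sha256 over the complete file and over each 1 MiB block) is taken from the list above.

```python
import hashlib

BLOCK_SIZE = 1024 * 1024  # 1 MiB, the block size named in the checksum step


def hash_file(path, block_size=BLOCK_SIZE):
    """Return (file_hexdigest, [block_hexdigests]) using sha256.

    Hypothetical helper: reads the file once, feeding every block into a
    running hash for the complete file and hashing each block on its own.
    """
    file_hash = hashlib.sha256()
    block_hashes = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            file_hash.update(block)
            block_hashes.append(hashlib.sha256(block).hexdigest())
    return file_hash.hexdigest(), block_hashes
```

Reading the file block by block keeps memory use constant even for large audio files; the last block is simply shorter when the file size is not a multiple of 1 MiB.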

The following commands are available:

- `./repro.py preview` - previews an audio file
- `./repro.py checksum` - hashes an audio file
- `./repro.py fingerprint` - fingerprints an audio file
- `./repro.py all` - does all the above steps with a priority on preview
- `./repro.py loop` - repeats the `all` option endlessly with 10 sec pauses in between
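
The relationship between `all` and `loop` can be pictured with a small sketch. It is an assumption about the control flow, not the actual repro.py code: `run_all` and `loop` are hypothetical names, running the steps in the order given (preview first) is only one possible reading of "priority on preview", and the `max_rounds` and `sleep` parameters exist only so the sketch can be stopped and tested.

```python
import time


def run_all(steps):
    """Run every step once, in the order given, and collect the results."""
    return [step() for step in steps]


def loop(steps, pause=10, max_rounds=None, sleep=time.sleep):
    """Repeat run_all() with a pause (in seconds) after each round.

    With max_rounds=None this never returns, mirroring an endless loop.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        run_all(steps)
        rounds += 1
        sleep(pause)
    return rounds
```

For example, `loop([preview, checksum, fingerprint])` would run the three (hypothetical) step functions in order, wait 10 seconds, and start over.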

Installation and first run:

- sudo apt-get install libpq-dev
- git clone [email protected]:C3S/c3sRepertoireProcessing.git
- cd c3sRepertoireProcessing
- cp config.ini.EXAMPLE config.ini
- pip install virtualenv # set up own Python environment
- virtualenv env
- env/bin/python setup.py develop # only once
- build an EchoPrint fingerprinter binary in ../echoprint-codegen/echoprint-codegen
- get ffmpeg (recommended: download static build to /usr/bin)
- set up & run c3s.ado.repertoire and get some 'uploaded' sample data from ado/etc/tmp/upload after uploading some audio files
- chmod a+x repro.py
- ./repro.py all
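
Once the echoprint-codegen binary from the steps above is built, calling it from Python might look like the sketch below. The helper names are hypothetical and this is not the code in repro.py. It assumes that the binary takes the audio file as its first argument and prints a JSON list with one entry per file, which is how echoprint-codegen is commonly invoked; check this against your build.

```python
import json
import subprocess

# Path taken from the installation step above.
CODEGEN = "../echoprint-codegen/echoprint-codegen"


def codegen_command(audio_path, codegen=CODEGEN):
    """Build the argument list for one codegen run (hypothetical helper)."""
    return [codegen, audio_path]


def parse_codegen_output(raw):
    """Return the first entry of the JSON list printed by the binary.

    Assumption: the output is a JSON list with one object per input file.
    """
    entries = json.loads(raw)
    return entries[0]


def fingerprint(audio_path):
    """Run the binary on one file and return the parsed JSON entry."""
    raw = subprocess.check_output(codegen_command(audio_path))
    return parse_codegen_output(raw.decode("utf-8"))
```

Keeping command construction and output parsing separate from the subprocess call makes both parts easy to check without the binary installed.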