jpollack/download_loc_audio

A simple script to download LoC (Library of Congress) audio archives,
transcripts, and metadata locally for further processing.

Requires the Perl modules LWP::Curl, HTML::TreeBuilder::XPath, and JSON.

The LWP::Curl requirement should be fixed: the module is nice, but it pulls in a lot of dependencies.

This is part of a larger LoC Slave Narratives audio cleanup project.

To download everything to the 'data' directory, run:

./download_loc_audio.pl 'https://www.loc.gov/audio/?c=150&fa=subject:slave+narratives&fo=json'

The full download will take 8GB.
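
Since the full download needs 8GB, a quick check of free space beforehand can save a failed run. The sketch below is not part of this repository: the function name has_free_space is hypothetical, and it assumes the 'data' directory is created under the current working directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): succeed only if the
# filesystem holding directory $1 has at least $2 kilobytes available.
has_free_space() {
    # POSIX "df -Pk" prints one header line, then one line per filesystem;
    # the fourth column is the available space in 1024-byte blocks.
    available_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge "$2" ]
}

# 8GB expressed in kilobytes (8 * 1024 * 1024).
required_kb=8388608

# Assumption: 'data' will live on the same filesystem as the current directory.
if has_free_space . "$required_kb"; then
    echo "enough free space for the full download"
else
    echo "less than 8GB free in the current directory"
fi
```

If 'data' will sit on a different filesystem, pass that location instead of "." as the first argument.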
