scraper

TODO: description

Running the scraper as a script

Use the file

Run yarn install in the scraper repository root
Edit run/urls.yml to specify the URLs to scrape for each provider. If a provider's name starts with a ., all of its URLs are ignored
Run yarn scrape to start scraping!
Results are written to ./run/{provider}-{date}-batch{number}.json
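
As a rough illustration of the two conventions above (dot-prefixed providers are skipped, and results land in a per-provider batch file), here is a minimal sketch. It is not the scraper's actual code: the `UrlConfig` shape, the helper names `activeProviders` and `outputPath`, the `acme` provider, and the YYYY-MM-DD date format are all assumptions made for the example.

```typescript
// Hypothetical shape of run/urls.yml once parsed: provider name -> list of URLs.
type UrlConfig = Record<string, string[]>;

// Keep only providers whose name does not start with "." (those are ignored).
function activeProviders(config: UrlConfig): UrlConfig {
  const result: UrlConfig = {};
  for (const provider of Object.keys(config)) {
    if (provider.charAt(0) !== ".") {
      result[provider] = config[provider];
    }
  }
  return result;
}

// Build the result path ./run/{provider}-{date}-batch{number}.json.
// The YYYY-MM-DD date format is an assumption; the README does not specify it.
function outputPath(provider: string, date: Date, batch: number): string {
  const day = date.toISOString().slice(0, 10);
  return `./run/${provider}-${day}-batch${batch}.json`;
}

// Example data (made up for illustration).
const config: UrlConfig = {
  acme: ["https://example.com/a", "https://example.com/b"],
  ".disabled": ["https://example.com/ignored"],
};

console.log(Object.keys(activeProviders(config)));
console.log(outputPath("acme", new Date("2024-01-15T00:00:00Z"), 1));
```

Prefixing a provider with a . is a quick way to disable it temporarily without deleting its URLs from the file.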

Alternatively, you can run it using the launch option in VSCode (it will also attach the debugger!)

To run the stack in headless mode, set up a .env file like the following:

# .env
HEADLESS=true

Run yarn generate {name}
Code the scraper in the generated file at src/providers/{name}/scraper.ts
Register your scraper by adding export * as {name} from './{name}' to src/providers/index.ts
Run your scraper! :)

The Scraper uses the following environment variables:

HEADLESS: Whether to launch Chromium in headless mode or headful (with a GUI). false by default
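
One way a flag like HEADLESS could be read is sketched below. This is an assumption about the behavior, not the scraper's actual implementation: the helper name `isHeadless` is hypothetical, and treating only the string "true" (case-insensitive) as enabled is a choice made for the example. The environment is passed in as a plain object so the snippet stays self-contained.

```typescript
// Hypothetical helper: decide headless mode from an environment-like object.
// Anything other than "true" (including an unset variable) counts as false,
// which matches the documented default of false.
function isHeadless(env: Record<string, string | undefined>): boolean {
  const raw = (env["HEADLESS"] || "").trim().toLowerCase();
  return raw === "true";
}

console.log(isHeadless({ HEADLESS: "true" }));
console.log(isHeadless({}));
```

In a Node.js process this would typically be called as isHeadless(process.env) after the .env file has been loaded.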
