scrapy-finance

Scrapy spiders that crawl financial text suitable for training word vectors.

List of sources

How to use this

  1. Install Scrapy.
     pip3 install scrapy
  2. Run the scrapy crawl command from the repository root with the name of a spider, for example bloomberg.
     (py3) hardik@shire:~/scrapy-finance$ scrapy crawl bloomberg

How to modify spiders for your use

To adapt a spider, start from one of the spider files in text/spiders, such as wikipedia.py. They are relatively easy to follow and modify. The repository is laid out as follows:

.
├── LICENSE
├── README.md
├── scrapy.cfg
└── text
    ├── __init__.py
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    ├── settings.py
    └── spiders
        ├── bloomberg.py
        ├── __init__.py
        ├── investopedia.py
        ├── qplum.py
        └── wikipedia.py
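
As a starting point for a new spider, here is a minimal sketch of what a file in text/spiders could look like. It is not taken from the existing spiders: the spider name `example`, the start URL, the `clean_paragraphs` helper and the `p::text` selector are all assumptions for illustration, and the real spiders may extract and store text differently. The lower-casing mirrors the behaviour described in the notes.

```python
try:
    import scrapy
    BaseSpider = scrapy.Spider
except ImportError:
    # Fallback so clean_paragraphs() can be tried without Scrapy installed;
    # the spider itself only works when Scrapy is available.
    BaseSpider = object


def clean_paragraphs(paragraphs):
    """Join paragraph strings into one lower-case line of text."""
    # split() with no arguments drops empty pieces and collapses whitespace.
    words = " ".join(paragraphs).split()
    return " ".join(words).lower()


class ExampleSpider(BaseSpider):
    # Hypothetical spider name and start URL, not part of this repository.
    name = "example"
    start_urls = ["https://example.com/"]

    def parse(self, response):
        # Collect the text nodes directly inside <p> elements.
        paragraphs = response.css("p::text").getall()
        text = clean_paragraphs(paragraphs)
        if text:
            yield {"url": response.url, "text": text}
```

Saved as, say, text/spiders/example.py, this should be runnable with `scrapy crawl example`, assuming the project settings discover spiders in that package.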

Notes

  • All spiders currently write the text data in lower case.
  • This has not been tested with Python 2.

Contributing

Please feel free to submit a pull request to add relevant spiders.

LICENSE

MIT