Django feeds is a simple RSS & Atom feed scraper with a couple of extra features to make your news reading & listening habits simpler. Feedparser is used to gather the entry data from the feeds. NLTK is used in conjunction with functions derived from Jonathon Vogel's summarize.py script to summarize entry content. Currently, the text-to-speech functionality only supports OS X's built-in say command.
When available, the Feed's etag or last_modified attributes are used to reduce bandwidth consumption when refreshing news feeds. To benefit from this feature, however, the server hosting the feed must have this support enabled and configured as well.
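To illustrate the mechanism behind etag and last_modified, here is a minimal sketch of an HTTP conditional request using only the standard library. It is not this app's code: the function names conditional_headers and fetch_if_changed are hypothetical. Feedparser offers the same behaviour through the etag and modified keyword arguments of feedparser.parse.

```python
try:  # Python 3
    from urllib.request import Request, urlopen
    from urllib.error import HTTPError
except ImportError:  # Python 2
    from urllib2 import Request, urlopen, HTTPError


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers for a conditional GET (hypothetical helper)."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if nothing changed."""
    request = Request(url, headers=conditional_headers(etag, last_modified))
    try:
        response = urlopen(request)
    except HTTPError as error:
        if error.code == 304:
            # 304 Not Modified: the server sent no body, so keep the old values.
            return None, etag, last_modified
        raise
    info = response.info()
    # Servers without conditional-request support simply omit these headers.
    return response.read(), info.get("ETag"), info.get("Last-Modified")
```

Storing the returned values and passing them back on the next refresh is what saves bandwidth; a server that ignores the headers just returns the full feed every time.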
- Linux text-to-speech support with Festival
- AlchemyAPI integration
- Python 2.7+ (Tested with 2.7.5)
- Django 1.4+ (Tested with 1.4.5)
Install using pip
...
pip install django-feeds
In settings.py, add 'feeds' to your INSTALLED_APPS setting.
INSTALLED_APPS = (
    ...
    'feeds',
)
Download the NLTK corpora. In a terminal session, run:
python -m nltk.downloader all
This may take a while depending on your internet connection.
Optional: Update the logger settings to display feed logs.
In settings.py, add formatters to the LOGGING definition:
'formatters': {
    'verbose': {
        'format' : '%(asctime)s %(levelname)s - PID=%(process)d Thread=%(thread)d - %(pathname)s:%(lineno)-5s in function %(funcName)s | %(message)s',
        'datefmt' : '%Y-%m-%d %H:%M:%S'
    },
    'standard': {
        'format' : '%(asctime)s %(levelname)s - %(name)s:%(lineno)-5s | %(message)s',
        'datefmt' : '%Y-%m-%d %H:%M:%S'
    },
    'simple': {
        'format' : '%(levelname)s - %(name)s:%(lineno)-5s | %(message)s',
    },
},
Also add the following to handlers:
'console' : {
    'level' : 'DEBUG',
    'class' : 'logging.StreamHandler',
    'formatter' : 'standard'
},
Also add the following to loggers:
'feeds': {
    'handlers' : ['console'],
    'level' : 'DEBUG',
    'propagate' : True,
},
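For orientation, the three fragments above could sit together in one LOGGING definition roughly as follows. This is a sketch, not the project's shipped configuration: the 'version' and 'disable_existing_loggers' keys are assumptions added because Python's dictConfig format requires a version, and only the 'standard' formatter is shown to keep it short.

```python
import logging
import logging.config

LOGGING = {
    'version': 1,                       # required by the dictConfig schema
    'disable_existing_loggers': False,  # assumption: keep other loggers active
    'formatters': {
        'standard': {
            'format': '%(asctime)s %(levelname)s - %(name)s:%(lineno)-5s | %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S',
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'standard',
        },
    },
    'loggers': {
        'feeds': {
            'handlers': ['console'],
            'level': 'DEBUG',
            'propagate': True,
        },
    },
}

# Django applies the LOGGING setting for you at startup; calling dictConfig
# directly is only a way to sanity-check the dictionary outside Django.
logging.config.dictConfig(LOGGING)
logging.getLogger('feeds').debug('feeds logger configured')
```

If your settings.py already defines LOGGING, merge the entries into the existing 'formatters', 'handlers', and 'loggers' sections rather than replacing the whole dictionary.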
Optional: In settings.py, set the SUMMARIZE_WORD_COUNT_LIMIT to limit the length of the content created by the summarizer.
SUMMARIZE_WORD_COUNT_LIMIT = 500
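How the summarizer applies this limit internally is not documented here. As a rough illustration of what a word-count limit means, the sketch below truncates text to a maximum number of whitespace-separated words. The limit_words function is hypothetical and is not part of django-feeds.

```python
def limit_words(text, limit=None):
    """Return text cut down to at most `limit` words (hypothetical helper).

    A limit of None means "no limit", mirroring an unset setting.
    """
    words = text.split()
    if limit is None or len(words) <= limit:
        return text
    return " ".join(words[:limit])


# In a Django project the limit would typically be read with something like
# getattr(settings, 'SUMMARIZE_WORD_COUNT_LIMIT', None); here it is passed in.
summary = limit_words("one two three four five", limit=3)
```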
Start up the dev server to enter your feed data.
python manage.py runserver localhost:8000
In the admin, add a new Feed object for each feed you want scraped.
localhost:8000/admin/feeds
After you've created your Feed objects, you don't need the server running anymore.
The first step is to refresh the news feeds. This scrapes the feeds for data, summarizes the URLs, and updates the cached content field.
python manage.py refresh_news_feeds
Once it has begun summarizing entries, you can start having your news read! You don't have to wait for it to completely finish if you open up another console window.
python manage.py speak_entries
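Since text-to-speech currently relies on OS X's built-in say command, reading an entry aloud comes down to passing text to that command. The sketch below shows one way to do this from Python. The say_command and speak helpers are hypothetical and are not the speak_entries implementation; the -v flag is say's voice option.

```python
import subprocess
import sys


def say_command(text, voice=None):
    """Build the argument list for OS X's say command (hypothetical helper)."""
    command = ["say"]
    if voice:
        command += ["-v", voice]  # choose a specific system voice
    command.append(text)
    return command


def speak(text, voice=None):
    """Speak text aloud on OS X; return False on other platforms."""
    if sys.platform != "darwin":
        return False
    # Blocks until the text has been spoken.
    subprocess.call(say_command(text, voice))
    return True
```

Passing the text as a list element rather than through a shell string avoids quoting problems with punctuation in entry summaries.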
Enjoy!