🌍 Twitter Sentiment Visualisations

Visualising sentiment trends from real-time social media data
sentiment-sweep.com

Important

As of Feb 2023, X no longer provides a free API. This makes it impractical to continue running this app affordably or at scale.

Further to this, now that general purpose AI models are faster, more available and less expensive than ever before, the static analysis approach (used here) no longer makes sense, given that AI sentiment analysis will yield much more accurate and insightful results.

For these reasons, the public instance of Sentiment Sweep is no longer available, and this project will cease to be maintained going forwards.

We had a good run! The Sentiment Sweep website ran for nearly a decade, saw upwards of 1 million sessions, and won 2 awards. It was a great learning experience, and a fun little project. Thank you everybody who used, supported and visited the website! 💖

About

What

A project to make large quantities of social media data more understandable.

How

The app streams live social media data and runs it through a custom sentiment analysis algorithm to determine trends, which are then visualised with a series of dynamic real-time charts.
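As a rough illustration of what "determining trends" can mean, here is a minimal sketch (not the app's actual code) that groups already-scored tweets by hour of day and averages their sentiment. The tweet shape (`createdAt`, `sentiment`) and the function name `averageSentimentByHour` are assumptions made for this example.

```javascript
// Hypothetical tweet shape: { createdAt: ISO-8601 string, sentiment: number }
function averageSentimentByHour(tweets) {
  const buckets = new Map(); // hour of day (0-23, UTC) -> { total, count }
  for (const tweet of tweets) {
    const hour = new Date(tweet.createdAt).getUTCHours();
    const bucket = buckets.get(hour) || { total: 0, count: 0 };
    bucket.total += tweet.sentiment;
    bucket.count += 1;
    buckets.set(hour, bucket);
  }
  // Convert the running totals into a plain { hour: average } object
  const averages = {};
  for (const [hour, { total, count }] of buckets) {
    averages[hour] = total / count;
  }
  return averages;
}

// Made-up sample data, for illustration only
const sample = [
  { createdAt: "2015-06-01T09:15:00Z", sentiment: 0.5 },
  { createdAt: "2015-06-01T09:45:00Z", sentiment: -0.25 },
  { createdAt: "2015-06-01T18:05:00Z", sentiment: 1 },
];
console.log(averageSentimentByHour(sample));
```

The same bucketing idea would apply to other factors, such as region or topic, by swapping the key used for grouping.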

Why

The aim of the app is to allow trends to be found between sentiment and other factors such as geographical location, time of day, demographics, similar topics, etc.

It has a range of uses, such as analysing the effectiveness of a marketing campaign, comparing competing products, viewing local trends, gauging public opinion by location, determining the best time of day to advertise to certain audiences, etc.

Where

A live demo is available at: http://sentiment-sweep.com (archived)

When

This project was initially developed in 2015. Some of the technologies used are a little outdated now, although the app still works great. A few of the external services that were used to provide additional context (like HP Idol on Demand, IBM Watson, and certain GCP features) have been discontinued, meaning certain features may now be unavailable on the live instance.


Building

Developing

See the Dev Setup docs for local dev setup.

  1. Prerequisites - You will need Node.js, MongoDB and git installed on your system.
  2. Get the files - git clone https://github.com/Lissy93/twitter-sentiment-visualisation.git tsv then cd tsv
  3. Install dependencies - npm i / yarn will download requirements into node_modules, then automatically kick off a bower install for frontend libraries
  4. Set Config - yarn run config will generate the config/src/keys.coffee file, which you will then need to populate with your API keys and save.
  5. Apply Settings - Check that you're happy with the general app config in config/src/app-config.coffee
  6. Build Project - yarn build will compile the project from the source, outputting files into dist ready to be published
  7. Start MongoDB - mongod will start a MongoDB instance (run in a separate terminal instance, see instructions: Starting a MongoDB instance)
  8. Run the project - yarn dev will build the project and start the dev server, with live-reload and auto-testing
  9. Open Browser - Navigate to the specified port to view the running app, e.g. http://localhost:8080

Deploying

See the Prod Deployment docs (/docs/build-environment.md) for more info.

Follow the instructions above, then:

  1. Execute Tests - yarn test - Ensure all tests pass and everything is working as expected
  2. Build for Prod - yarn build - Compile all source files to the dist directory
  3. Start Server - yarn start - Spin up the HTTP server to start the API and serve up compiled files

Testing

See the Test Strategy Docs for more info.
TSV is fully unit tested, and follows a BDD pattern. Unit, integration, coverage and dependency tests can be run using yarn test.

Pass/Fail Criteria

| Test Type | Pass Condition |
| --- | --- |
| Functional Testing | All acceptance criteria must be met, checked and documented |
| Unit Tests | 100% of unit tests must pass. It will be immediately clear when a unit test is failing |
| Integration Tests | 100% pass rate after every commit |
| Coverage Tests | 80% or greater |
| Code Reviews | B grade / Level 4 or higher. Ideally A grade / Level 5 if possible. |
| Dependency Checks | Mostly up-to-date dependencies except in justified circumstances. |

Testing Tools
  • Framework - Mocha
    • Used in order to store, write and run the tests in a structured way
  • Assertion Library - Chai
    • Provides a structure and syntax in order to actually write the test cases
  • Coverage Testing - Istanbul
    • Measures the proportion of your source code that is covered by your unit tests
  • Stubs, Spies and Mocking - Sinon.js
    • Mocking removes the need to call production APIs while running frontend unit tests
  • Continuous Integration Testing - Travis CI
    • Ensures that all the standalone modules function correctly when put together
  • Dependency Checking - David
    • Checks that each dependency is present, correct, secure and functional
  • Automated Code Reviews - Code Climate
    • Scans for best practices, and fails if any part of the code could be improved upon
  • Headless Browser Testing - PhantomJS
    • Runs frontend tests without the need for a GUI browser
  • Testing HTTP services - SuperTest
    • Tests API endpoints and ensures routing is working correctly

Automated Workflows

TSV uses the Gulp streaming build tool to automate the prod and dev workflows. For more info, see the Build Environment docs.

The following tasks are useful for getting started:

  • gulp generate-config - Generates correctly structured default configuration files for settings and API keys
  • gulp build - Builds the project fully, including optimization, compilation, minification and validation
  • gulp nodemon - Runs the application on the default port (probably 8080), with live refresh
  • gulp test - Executes all unit and coverage tests, and generates a report containing the results
  • gulp - Default dev task: checks the project is configured correctly, builds all the files, runs the server, watches for changes, recompiles relevant files, reloads browsers on change and keeps all browsers in sync. When a test condition changes, it also re-runs the tests. A lot going on!

Modules

The project was developed in a modular approach, made up of several distinct components. Each is published as a fully tested, documented and MIT-licensed NPM module for easy re-use.

  • sentiment-analysis - Uses the AFINN-111 approach to calculate the overall sentiment of a given sentence
  • fetch-tweets - Fetches tweets from Twitter based on topic, location, timeframe or combination
  • stream-tweets - Streams live Twitter data in real-time, based on location, given term, etc
  • remove-words - Removes all non-key words from a given string
  • place-lookup - Finds the latitude and longitude for any fuzzy place name using the Google Places API
  • hp-haven-sentiment-analysis - A Node.js client library for HP Haven OnDemand Sentiment Analysis module
  • haven-entity-extraction - Node.js client for HP Haven OnDemand Entity Extraction
  • tweet-location - Calculates the location from geo-tagged Tweets using the Twitter Geo API
  • find-region-from-location - Given a latitude and longitude calculates which region that point belongs in
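For readers unfamiliar with the AFINN approach mentioned for sentiment-analysis: AFINN-111 is a list of English words, each rated with an integer from -5 (very negative) to +5 (very positive). The sketch below shows the general idea of lexicon-based scoring. It is not the module's actual implementation or API; the tiny word list, its values and the `scoreSentence` function are placeholders invented for this example.

```javascript
// Tiny illustrative lexicon in the style of AFINN (integer scores, -5 to +5).
// These entries are placeholders for the sketch, not the real AFINN-111 list.
const LEXICON = new Map([
  ["love", 3],
  ["great", 3],
  ["good", 3],
  ["bad", -3],
  ["hate", -3],
  ["terrible", -3],
]);

// Hypothetical helper: sums the lexicon score of every word in the sentence
function scoreSentence(sentence) {
  // Lowercase, then split on anything that is not a letter or apostrophe
  const words = sentence.toLowerCase().split(/[^a-z']+/).filter(Boolean);
  // Words missing from the lexicon contribute 0
  return words.reduce((total, word) => total + (LEXICON.get(word) ?? 0), 0);
}

console.log(scoreSentence("I love this great phone"));      // 6
console.log(scoreSentence("Terrible battery, bad screen")); // -6
console.log(scoreSentence("It is a phone"));                // 0
```

A plain sum favours longer sentences, so a real implementation would likely normalise the result (for example by word count). That choice is left out here to keep the sketch short.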

Project Info

Project Planning

A set of User Stories with Acceptance Criteria and Complexity Estimates were drawn up outlining what features the finished solution should have. These were expanded upon further with wireframes in the Methodology section.

Technologies

View full tech stack at: stackshare.io/Lissy93/sentiment-sweep

The backend is primarily written in Node.js, with web-sockets facilitating the real-time communication with the frontend, and a data cache stored in MongoDB. Pages are rendered isomorphically, with data visualizations written using D3.js. Social data is fetched from Twitter, compute happens locally, and a few external APIs were used to provide additional context in the form of AI. Views are written in Pug, styles in Less, scripts in CoffeeScript and everything is compiled via a Gulp script.

The project and app are still functional; however, 5 years on, this would not be an ideal tech stack. There are now better technologies available that would enable greater performance, less code, easier project management and improved developer experience. If I were to rewrite this project in 2022, a better tech stack would likely be Go for the backend, Svelte + SvelteKit for the frontend and TypeScript for the code, with Pixi.js for the interactive content, styled-components for styling and Rollup for putting it all together.

Demo

A live demo of the application has been deployed to: http://sentiment-sweep.com

View Screenshots of each screen in the docs.


Awards

The first stages of the project were developed at StartHack Switzerland 2014, where it won first place.

It was then further expanded upon, and used as part of my undergraduate thesis, where it won the Oxford BCS best Dissertation Award.




The University Project received 96%, so feel free to use it as an example - here's the Final Report in PDF format (warning - it's 300 pages!). The deck used for the technical presentation is available at: presentation.sentiment-sweep.com


Documentation


License

twitter-sentiment-visualisation was developed by Alicia Sykes, licensed under MIT © 2014 - 2022.

For information, see TLDR Legal > MIT

The MIT License (MIT)
Copyright (c) Alicia Sykes

Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files
(the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish, distribute, sub-license, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

© Alicia Sykes 2015 - 2020
Licensed under MIT

Thanks for visiting :)