Data Tools with Twitter Ingestion Server, Twitter GeoTagger and AsterixDB Ingestion Server. #807

Open
wants to merge 17 commits into
base: master

baiqiushi
Collaborator

Data Tools

Data Tools is a new module with 3 components that prepare data for the TwitterMap application.

Twitter Ingestion Server

Twitter Ingestion Server is a daemon service that ingests real-time tweets from the Twitter Filter Stream API into local gzip files, rotating the output file daily.
It is also a lightweight HTTP server with 3 endpoints:

  • /stats - HTTP GET endpoint that returns the current ingestion status in JSON format.
  • /proxy - WebSocket endpoint that pushes real-time tweets to every connected client.
  • / - HTTP GET endpoint that returns an index.html example page demonstrating the use of the two endpoints above.
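
The daily rotation described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class name `DailyGzipWriter`, the `<prefix>_<yyyy-MM-dd>.gz` file naming, and the explicit `date` parameter are all assumptions made for the example. The point it shows is that the previous gzip stream is closed before a new file is opened, because a gzip stream that is never closed has no trailer and readers may then report an unexpected end of file.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDate;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (not from the PR): writes one tweet per line to
// <dir>/<prefix>_<yyyy-MM-dd>.gz and switches files when the date changes.
class DailyGzipWriter implements AutoCloseable {
    private final Path dir;
    private final String prefix;
    private LocalDate currentDate;
    private Writer out;

    DailyGzipWriter(Path dir, String prefix) {
        this.dir = dir;
        this.prefix = prefix;
    }

    // Writes one line; rotates first if `date` differs from the open file's date.
    synchronized void write(String tweetJson, LocalDate date) throws IOException {
        if (out == null || !date.equals(currentDate)) {
            close(); // finish the previous gzip stream so its trailer is written
            Files.createDirectories(dir);
            Path file = dir.resolve(prefix + "_" + date + ".gz");
            // APPEND adds a new gzip member if the file already exists,
            // e.g. after a restart on the same day.
            out = new OutputStreamWriter(
                    new GZIPOutputStream(Files.newOutputStream(
                            file, StandardOpenOption.CREATE, StandardOpenOption.APPEND)),
                    StandardCharsets.UTF_8);
            currentDate = date;
        }
        out.write(tweetJson);
        out.write('\n');
    }

    @Override
    public synchronized void close() throws IOException {
        if (out != null) {
            out.close(); // also closes the GZIPOutputStream underneath
            out = null;
        }
    }
}
```

A caller would typically pass `LocalDate.now()` for each incoming tweet and close the writer on shutdown; taking the date as a parameter just keeps the sketch easy to test.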

Twitter GeoTagger

Twitter GeoTagger is a Java program that geotags Twitter JSON with {stateID, stateName, countyID, countyName, cityID, cityName}.
It has 2 modes:

    1. in API mode, it provides a function tagOneTweet that can be called from other programs;
    2. in process mode, it provides a main function that can be started as a JVM process to geotag tweets in a shell pipeline.
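
The relationship between the two modes can be sketched roughly as below. This is an assumption-laden illustration, not the GeoTagger's actual code: the class `GeoTagPipeline`, the `run` method, and the one-tweet-per-line framing are hypothetical, and the real signature of tagOneTweet is not shown in this description, so the tagger is passed in as a plain `UnaryOperator<String>`.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

// Hypothetical sketch: process mode is a loop that applies an API-mode
// function to every line read from stdin and prints the result to stdout.
class GeoTagPipeline {

    // Reads one JSON tweet per line, applies the tagger, and writes one
    // result per line. Blank lines are skipped. Returns the number of tweets.
    static long run(Reader in, Writer out, UnaryOperator<String> tagger) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        long count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue;
            }
            writer.write(tagger.apply(line));
            writer.write('\n');
            count++;
        }
        writer.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder only: returns each tweet unchanged. A real run would
        // pass the GeoTagger's tagging function here instead.
        UnaryOperator<String> placeholderTagger = tweet -> tweet;
        run(new InputStreamReader(System.in, StandardCharsets.UTF_8),
            new OutputStreamWriter(System.out, StandardCharsets.UTF_8),
            placeholderTagger);
    }
}
```

Keeping the loop separate from the tagging function means the same function serves both modes, and the loop can be exercised with in-memory readers and writers.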

AsterixDB Ingestion Server

TBD.

@sadeemsaleh
Contributor

NOT ready to be merged

@codecov-io

codecov-io commented Dec 29, 2020

Codecov Report

Merging #807 (6bd3e50) into master (9caf3d0) will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #807   +/-   ##
=======================================
  Coverage   63.91%   63.91%           
=======================================
  Files          75       75           
  Lines        4076     4076           
  Branches      355      355           
=======================================
  Hits         2605     2605           
  Misses       1471     1471           

Last update 9caf3d0...6bd3e50.

@codecov-commenter

codecov-commenter commented Dec 24, 2021

Codecov Report

Merging #807 (0538333) into master (9caf3d0) will not change coverage.
The diff coverage is n/a.

❗ Current head 0538333 differs from pull request most recent head 3786af4. Consider uploading reports for the commit 3786af4 to get more accurate results

@@           Coverage Diff           @@
##           master     #807   +/-   ##
=======================================
  Coverage   63.91%   63.91%           
=======================================
  Files          75       75           
  Lines        4076     4076           
  Branches      355      355           
=======================================
  Hits         2605     2605           
  Misses       1471     1471           

Last update 9caf3d0...3786af4.

…on output file rotation in TwitterIngestionServer; (2) add a parameter for switching between the general Twitter and TwitterMap output formats in AsterixDBIngestionDriver; (3) fix the unexpected end-of-file issue for output gzip files in TwitterIngestionServer;
…does not wait for the WebsocketClient to long live waiting for tweets from the Proxy server; (2) fix the bug in AsterixDBAdapterForTwitterMap: the schema should be initialized in the constructor;
…nsafe issue in AsterixDBAdapterForTwitterMap and AsterixDBAdapterForTwitter.
4 participants