Visualization to make #41

Open
sanittawan opened this issue Jun 5, 2019 · 1 comment
Labels
High Priority (must be done in sequence or must meet the deadline soon), Medium, presentation


sanittawan commented Jun 5, 2019

Data description

  1. Relational Diagram of the files (@dhruvalb)
  2. MPI run time experiment

Exploratory Analysis

  1. Top 15 tags (@tonofshell) - this file OR this file (I'm confused. Are they the same?)
  2. Users Activities (@tonofshell) - which users are most active - this file
  3. Questions with most answers per year (@tonofshell) - this file
  4. Users with gold answer badges locations (@tonofshell) - this file
  5. 2-grams of tags that appear together (network of tags) (@tonofshell) - this file

Main Analysis

  1. Time series plots of each language
  • Please ask @liu431 for the output
sanittawan added the High Priority, Medium, and presentation labels on Jun 5, 2019

sanittawan commented Jun 6, 2019

@tonofshell I updated the output of Users Activities. You have to download it; just click the link. Each record is (user ID, number of activities).

I am going to try to do some descriptive statistics on the data. I'm super impressed with Spark. It took 2 minutes and 40 seconds to do this analysis on the Posts.csv with 3 workers. It's AWESOME.
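
For whoever picks up the "most active users" plot, here is a minimal sketch of ranking the (user ID, number of activities) records. It assumes the downloaded file is plain comma-separated text with one `user_id,count` pair per line, which may not match the actual output format; the function name `most_active_users` and the sample rows are made up for illustration.

```python
import csv
import heapq
from typing import Iterable, List, Tuple


def most_active_users(rows: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n (user_id, activity_count) pairs with the highest counts.

    Assumes each input line looks like "user_id,count" (an assumption,
    not the confirmed format of the Users Activities output).
    """
    parsed = []
    for row in csv.reader(rows):
        if not row:  # skip blank lines
            continue
        user_id, count = row
        parsed.append((user_id, int(count)))
    # nlargest avoids sorting the full list when only the top n are needed
    return heapq.nlargest(n, parsed, key=lambda pair: pair[1])


# Made-up sample rows, only to show the call
sample = ["101,7", "102,42", "103,15"]
print(most_active_users(sample, n=2))  # [('102', 42), ('103', 15)]
```

With a real file, something like `most_active_users(open(path), n=15)` should work, since `csv.reader` accepts any iterable of lines.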
