Cheng111/CS526_Project_Gu_Chen

CS526 Project: Efficient MapReduce Implementation on a Graph Algorithm

Poster

  • Chen_Gu_poster.pdf: The final poster.
  • Chen_Gu_poster.zip: The original source files for the poster, which we made in Overleaf.

Abstract

  • Chen_Gu_abstract.pdf: The abstract of the final project.

Code

  • CompareDifVersion.ipynb: The Jupyter notebook used to compare the different versions of the paraclique algorithm

    • Figure 3 was generated by this code
    • The running time can be long!
    • The first part of the code analyzes the data distribution and generates Figure 3 in the poster
    • The second part records the running time of the three implementations on different samples
    • The output of the second part is a set of tables/files containing the running times
    • To change the samples, edit samplesize, samplenum, and inputfile in the code. In our project, we changed these three parameters manually to get the results for the different samples.
  • time\time_resluts_plot.ipynb: Visualizes the time tables

    • Figure 4 was generated by this code
    • After we obtained the time tables for the different samples, we moved the files into the time directory and plotted them with time_resluts_plot.ipynb
  • CompareDifVersion.py: The same code as CompareDifVersion.ipynb

    • Because the computation takes a long time, what we actually did was run this Python script in the background
    • To run this code, set PYSPARK_PYTHON to /opt/anaconda3/bin/python in .bashrc
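
The sampling-and-timing procedure described for CompareDifVersion can be illustrated with a minimal sketch. It is not the project's code: the function names time_on_samples and summarize, the seed argument, and the stand-in data and workload are all assumptions made for illustration. Only the parameter names samplesize and samplenum come from the description above. The real code times three paraclique implementations on clique data read from inputfile, most likely through PySpark, which this sketch leaves out.

```python
import random
import statistics
import time


def time_on_samples(items, func, samplesize, samplenum, seed=0):
    """Time func on samplenum random samples of samplesize items each.

    Hypothetical helper: the names and the seed argument are assumptions.
    """
    rng = random.Random(seed)
    timings = []
    for _ in range(samplenum):
        sample = rng.sample(items, samplesize)  # draw without replacement
        start = time.perf_counter()
        func(sample)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return basic statistics for a list of running times in seconds."""
    return {
        "mean": statistics.mean(timings),
        "min": min(timings),
        "max": max(timings),
    }


# Stand-in data and a stand-in workload; the project itself compares
# three paraclique implementations on cliques loaded from inputfile.
cliques = [list(range(i, i + 5)) for i in range(1000)]
timings = time_on_samples(
    cliques, lambda sample: sorted(sample, key=len), samplesize=800, samplenum=10
)
stats = summarize(timings)
print(stats)
```

Calling time_on_samples once per implementation with the same seed gives each implementation the same samples, which keeps the comparison fair.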

Input data

Output

  • time: Some of the output of our code.
    • For instance, the large800-10.time file contains the timing statistics for a test on the large clique set. The test includes 10 samples, each containing 800 large cliques.
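
The large800-10.time example suggests a naming pattern of clique set, sample size, a hyphen, number of samples, and the .time extension. The sketch below parses names under that assumption. The pattern is inferred from this single example, and the helper parse_time_filename is hypothetical, so other files in the time directory may not follow it.

```python
import re

# Assumed pattern: <clique set><sample size>-<number of samples>.time
_TIME_NAME = re.compile(
    r"^(?P<cliqueset>[A-Za-z]+)(?P<samplesize>\d+)-(?P<samplenum>\d+)\.time$"
)


def parse_time_filename(name):
    """Split a .time file name into (clique set, sample size, sample count)."""
    match = _TIME_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected .time file name: {name!r}")
    return (
        match["cliqueset"],
        int(match["samplesize"]),
        int(match["samplenum"]),
    )


print(parse_time_filename("large800-10.time"))
```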
