This repository has been archived by the owner on Dec 3, 2019. It is now read-only.

Modified populate database method to directly fetch upstream data. #53

Open
wants to merge 4 commits into
base: master
from

Conversation

shivanshuraj1333
Contributor

@shivanshuraj1333 shivanshuraj1333 commented Mar 31, 2019

Fixes #52
populate_db.py now fetches up-to-date data directly from https://github.com/spdx/license-list-data.

Also, the license_text field now contains the complete, updated license text.

Populating the database takes very little time:
Screenshot from 2019-04-01 04-35-19
Note 1: The CSV file is no longer needed to populate the database. I am not removing it in this pull request, since PR #48 has not been merged or closed yet.
Note 2: I have intentionally commented out the previous data-population method rather than deleting it, since the CSV is not removed in this PR.

@zvr
Collaborator

zvr commented Apr 1, 2019

Using PyGithub requires a GitHub account (I think, and your code confirms it), which should not be a prerequisite for running clio.

The data can be accessed without any account.

@shivanshuraj1333
Contributor Author

shivanshuraj1333 commented Apr 1, 2019

Removed user authentication from the PyGithub calls. Now anyone can populate and use Clio locally.
PyGithub will also be used in some future work.

@zvr
Collaborator

zvr commented Apr 2, 2019

comments:

  • the code shouldn't read the directory entries, but the licenses.json that describes them
  • the code should also read exceptions.json in the same manner
  • more importantly, you shouldn't read the repository itself, which might have transient commits, but use only one of the official releases: https://github.com/spdx/license-list-data/releases
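
To illustrate the three points above, here is a minimal Python sketch that reads the two index files for one release tag over plain HTTPS, with no GitHub account and no PyGithub. It is not the code from this PR. The raw-file URL layout, the json/ directory, and the key names ("licenses", "licenseId", "exceptions", "licenseExceptionId") are assumptions about how license-list-data is organized and should be checked against the chosen release. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: files of a tagged release are reachable under this raw URL layout.
RAW_BASE = "https://raw.githubusercontent.com/spdx/license-list-data"


def index_url(tag, filename):
    """Build the URL of json/<filename> at a release tag (assumed layout)."""
    return f"{RAW_BASE}/{tag}/json/{filename}"


def fetch_json(url):
    """Download and decode a JSON document anonymously over HTTPS."""
    with urlopen(url) as response:
        return json.load(response)


def extract_ids(index, list_key, id_key):
    """Return the identifiers listed in an index document.

    The key names are passed in because they are assumptions about the
    upstream JSON schema, not something this PR confirms.
    """
    return [entry[id_key] for entry in index.get(list_key, [])]


def load_release_ids(tag):
    """Read licenses.json and exceptions.json for one release tag."""
    licenses = fetch_json(index_url(tag, "licenses.json"))
    exceptions = fetch_json(index_url(tag, "exceptions.json"))
    return (
        extract_ids(licenses, "licenses", "licenseId"),
        extract_ids(exceptions, "exceptions", "licenseExceptionId"),
    )
```

Calling load_release_ids with a tag taken from the releases page pins the import to one official release, so transient commits on the default branch never reach the database.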

@zvr zvr mentioned this pull request Apr 2, 2019
@shivanshuraj1333
Contributor Author

Okay,
Thank you, I will update my script accordingly.

@shivanshuraj1333
Contributor Author

shivanshuraj1333 commented Apr 3, 2019

Flow of updated script:
Untitled Diagram

Advantages of above approach:

Development

Successfully merging this pull request may close these issues.

Populate data base from upstream data
2 participants