appssubmitter finial version #128

Merged
merged 24 commits into from
Sep 9, 2024
Changes from 11 commits
9beb1e2
appssubmitter finial version
SeverusYixin Jul 19, 2024
87323f1
File Selection Function Update
SeverusYixin Jul 22, 2024
078174b
In this version, after the PR is merged, it will also close the corre…
SeverusYixin Jul 22, 2024
5b44bda
in this version, it has finish these changes "Remove the submitter na…
SeverusYixin Jul 24, 2024
2e4bae1
All the “Remove submitter's name” has been completed under this versi…
SeverusYixin Jul 26, 2024
603e579
update for add a "tags/types/.." retriever
SeverusYixin Aug 12, 2024
849e37d
update for adding tags which are not in the tags select list
SeverusYixin Aug 19, 2024
a17af85
Delete the specific mapping in the application and renew our data source
SeverusYixin Aug 19, 2024
2596d75
Merge branch 'main' into app_for_adding_entries
haesleinhuepf Aug 26, 2024
2ea918d
update with Docstring
SeverusYixin Aug 27, 2024
738d248
Merge branch 'app_for_adding_entries' of https://github.com/NFDI4BIOI…
SeverusYixin Aug 27, 2024
58df5a7
Update the function with "def get_github_repository(repository):" a…
SeverusYixin Aug 28, 2024
2f49b88
Update the index_2 to make sure it is only about the app submitter.
SeverusYixin Aug 28, 2024
019ec5e
Update the introduction with the latest version
SeverusYixin Aug 28, 2024
94c66c8
Update index_2.md
SeverusYixin Aug 29, 2024
93ce49f
update for the import function and removal of the yml file selector
SeverusYixin Sep 5, 2024
7506ec6
make sure all the read and write is "utf-8"
SeverusYixin Sep 5, 2024
25ea9ce
renamed documentation page to reasonable filename
haesleinhuepf Sep 9, 2024
4639311
make the app-submitter just append a new entry by the end, stop it to…
haesleinhuepf Sep 9, 2024
329c52f
minor documentation update
haesleinhuepf Sep 9, 2024
7d488db
change order of fields in form
haesleinhuepf Sep 9, 2024
0224804
remove unused code
haesleinhuepf Sep 9, 2024
6e093bd
clean up code, add comments, removed unused imports
haesleinhuepf Sep 9, 2024
327a342
renamed app in documentation to be consistent
haesleinhuepf Sep 9, 2024
1 change: 1 addition & 0 deletions docs/_toc.yml
@@ -7,6 +7,7 @@ parts:
- caption: Contributing
chapters:
- file: contributing/index
- file: contributing/index_2
Member

I'm going to rename this file. "index_2" is not meaningful.

- file: contributing/format

- caption: By tag
Binary file added docs/contributing/appsubmitter.png
Binary file added docs/contributing/appsubmitter_2.png
Binary file added docs/contributing/appsubmitter_3.png
100 changes: 100 additions & 0 deletions docs/contributing/index_2.md
@@ -0,0 +1,100 @@
# AppSubmitter Guidance
# How to contribute

This repository contains lists of training materials. It is extensible using GitHub pull requests. You can find a how-to guide at the bottom of this page. The format for entries in the repository is documented on the next page.

## Quick contributing shortcut:

If you're too busy to enter everything in detail yourself, please create a [GitHub issue](https://github.com/NFDI4BIOIMAGE/training/issues) with a link to the materials you want to include in our list. We can take care of all the details.

## What to contribute

Consider adding your favorite training materials and resources. If you know a collection of resources, add it, but do not add all individual entries of the collection. We are working on collecting them automatically ([more information](https://github.com/NFDI4BIOIMAGE/training/issues/2)). However, if there are specific entries in such a collection that you think are particularly valuable, feel free to add them now.

## Inclusion criteria

We will consider merging links to all educative materials related to research data management, especially in the bio-imaging context, and bio-image analysis.

We would like to collect links to resources in various formats/content types, including slides, posters, publications, blog posts, example data, collections (of links to other materials), and more.

## Exclusion criteria

We will only merge links to materials behind a paywall in exceptional cases. We will also not merge links to materials that primarily advertise commercial products. However, if there are openly accessible training resources for commercial software, we welcome links to these resources.

## Maintenance of contributions

We reserve the right to remove and modify entries in this collection at any time.
Member

This duplicates text from the previous page. Please document only your AppSubmitter on this page.


# Using the AppSubmitter

The AppSubmitter streamlines the process of contributing new training materials to our repository. Follow these steps to use the AppSubmitter:

## Prerequisites

Ensure you have the necessary Python libraries installed. Run the following command to install them:

```
pip install streamlit pygithub pyyaml
```

Make sure you have set your GitHub API key as an environment variable. Here's how:

1. Open Command Prompt by pressing Win + R, typing `cmd`, and pressing Enter.
2. Check whether the environment variable is already set by running `echo %GITHUB_API_KEY%`.
3. If you don't have your API key, go to your GitHub account settings and follow these steps:
- Click your profile photo in the upper-right corner of any page on GitHub, then click Settings.
![Screenshot showing how to create your API Token on your github](set_api_key_1.png)
![Screenshot showing how to create your API Token on your github](set_api_key_2.png)
- In the left sidebar, click Developer settings.
![Screenshot showing how to find the API Token setting on your github](set_api_key_3.png)
- Under Personal access tokens, click Tokens (classic) to generate your API key.
![Screenshot showing how to generate the API Token on your github](set_api_key_4.png)
4. Once you have your API key, set it as an environment variable using the following command, then open a new Command Prompt window, because `setx` only affects windows opened afterwards:
```
setx GITHUB_API_KEY "your_github_api_key"
```
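
The commands above are specific to the Windows Command Prompt. If you prefer to check the variable from Python (on any platform), a small script along these lines may help. This is only a sketch: the function name `api_key_status` is made up for illustration, and the only thing taken from the AppSubmitter is the variable name `GITHUB_API_KEY`.

```python
import os

ENV_VAR = "GITHUB_API_KEY"  # variable name read by appsubmitter.py


def api_key_status(environ=os.environ):
    """Report whether the API key is set, without printing the key itself."""
    token = environ.get(ENV_VAR, "").strip()
    if not token:
        return f"{ENV_VAR} is not set"
    # Show only the last four characters so the token never appears in full.
    return f"{ENV_VAR} is set (ends with ...{token[-4:]})"


if __name__ == "__main__":
    print(api_key_status())
```

Passing the environment as a parameter keeps the check easy to try out with a plain dictionary instead of the real environment.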

## Running the AppSubmitter

Navigate to the `scripts` folder using the following command:

```
cd ../scripts
```

Run the AppSubmitter with the following command; the AppSubmitter interface will then open:

```
streamlit run appsubmitter.py
```
![Screenshot of Appsubmitter](appsubmitter.png)

In the AppSubmitter interface, you can enter the authors, name, description, and URL, select the license, tags, and type, and choose the YAML file to which the new entry should be added.

![Screenshot of Appsubmitter](appsubmitter_2.png)

After clicking Submit, a pull request will be created.

![Screenshot of Appsubmitter](appsubmitter_3.png)

If you think your contribution is substantial, feel free to send a pull request adding yourself to the list of authors [here](https://github.com/NFDI4BIOIMAGE/training/blob/main/docs/_config.yml#L2).




















Binary file added docs/contributing/set_api_key_1.png
Binary file added docs/contributing/set_api_key_2.png
Binary file added docs/contributing/set_api_key_3.png
Binary file added docs/contributing/set_api_key_4.png
2 changes: 1 addition & 1 deletion resources/nfdi4bioimage.yml
@@ -3582,4 +3582,4 @@ resources:
- collection
- video
url:
- https://www.youtube.com/watch?v=8zd4KTy-oYI&list=PLW-oxncaXRqU4XqduJzwFHvWLF06PvdVm
- https://www.youtube.com/watch?v=8zd4KTy-oYI&list=PLW-oxncaXRqU4XqduJzwFHvWLF06PvdVm
193 changes: 193 additions & 0 deletions scripts/appsubmitter.py
@@ -0,0 +1,193 @@
import streamlit as st
import os
from github import Github
import yaml
import time
from pathlib import Path

def load_yaml_data(file_path):
    """
    Load YAML data from a file.

    Args:
        file_path (str): Path to the YAML file.

    Returns:
        dict: Parsed YAML data.
    """
    with open(file_path, 'r') as file:
        return yaml.safe_load(file)
Member

Why are you not using this function?

def read_yaml_file(filename):

Collaborator Author

Done



def get_unique_values_from_yamls(resources_dir):
    """
    Get unique tags, types, and licenses from YAML files.

    Args:
        resources_dir (str): Directory containing YAML files.

    Returns:
        tuple: Sorted lists of unique tags, types, and licenses.
    """
    unique_tags = set()
    unique_types = set()
    unique_licenses = set()

    for yaml_file in Path(resources_dir).glob('*.yml'):
        yaml_data = load_yaml_data(yaml_file)
        if 'resources' in yaml_data:
            for entry in yaml_data['resources']:
                # Handle 'tags'
                tags = entry.get('tags', [])
                if isinstance(tags, list):
                    unique_tags.update(tags)
                else:
                    unique_tags.add(tags)

                # Handle 'type'
                type_ = entry.get('type', 'Unknown')
                if isinstance(type_, list):
                    unique_types.update(type_)
                else:
                    unique_types.add(type_)

                # Handle 'license'
                license_ = entry.get('license', 'Unknown')
                if isinstance(license_, list):
                    unique_licenses.update(license_)
                else:
                    unique_licenses.add(license_)

    return sorted(unique_tags), sorted(unique_types), sorted(unique_licenses)


def get_yaml_files(resources_dir):
Member

Why are you not using this function?

def load_dataframe(directory_path):

It gives you a DataFrame with all contents, which you can filter for unique entries in specific columns: df["license"].unique()
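
One caveat with this suggestion: in these YAML files `tags`, `type`, and `license` can hold either a single value or a list (the code in this PR checks for both), and `Series.unique()` raises a `TypeError` on list-valued cells because lists are not hashable. A minimal sketch of one way around that, assuming `load_dataframe` returns one row per resource with those columns. Its actual output is not shown in this PR, so the sample rows and the helper name `unique_column_values` are made up for illustration:

```python
import pandas as pd


def unique_column_values(df, column):
    """Return the sorted unique values of a column whose cells may be
    scalars or lists (hypothetical helper, not part of the repository)."""
    if column not in df.columns:
        return []
    # explode() turns each list element into its own row; scalars are kept,
    # and empty lists become NaN, which dropna() removes together with None.
    values = df[column].explode().dropna()
    return sorted({str(value) for value in values})


# Hand-built stand-in for what load_dataframe("../resources") might return.
df = pd.DataFrame({
    "license": ["CC-BY-4.0", ["MIT", "CC-BY-4.0"], None],
    "tags": [["napari", "python"], ["python"], []],
})

print(unique_column_values(df, "license"))
print(unique_column_values(df, "tags"))
```

With a helper like this, the three nested `isinstance` branches in `get_unique_values_from_yamls` could collapse into three calls on the same DataFrame.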

    """
    List YAML files in a directory.

    Args:
        resources_dir (str): Directory containing YAML files.

    Returns:
        list: Sorted list of YAML file names.
    """
    return sorted([str(yaml_file.name) for yaml_file in Path(resources_dir).glob('*.yml')])

def authenticate_with_github():
    """
    Authenticate with GitHub using the API key from environment variables.

    Returns:
        tuple: Authenticated GitHub instance and repository object.

    Raises:
        Exception: If authentication fails.
    """
    GITHUB_API_KEY = os.getenv('GITHUB_API_KEY')
    if not GITHUB_API_KEY:
        st.error("GitHub API Key is not set in the environment variables.")
        st.stop()

    try:
        g = Github(GITHUB_API_KEY)
        repo = g.get_repo("NFDI4BIOIMAGE/training")
    except Exception as e:
        st.error(f"Failed to authenticate with GitHub: {e}")
        st.stop()
Member

Consider using this function instead:

def get_github_repository(repository):
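
Only the signature of `get_github_repository` is visible in this thread, so the following is a guess at what such a helper could look like, not the repository's actual implementation. It assumes the same `GITHUB_API_KEY` environment variable and the same `Github(token).get_repo(...)` calls that `authenticate_with_github` uses above; the `env_var` parameter and the `RuntimeError` are additions made for this sketch.

```python
import os


def get_github_repository(repository, env_var="GITHUB_API_KEY"):
    """Return a PyGithub repository object for e.g. "NFDI4BIOIMAGE/training".

    Hypothetical body: only the function name and its first parameter
    come from the review comment.
    """
    token = os.getenv(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set in the environment.")
    # Imported lazily so the check above also works where PyGithub is missing.
    from github import Github
    return Github(token).get_repo(repository)
```

Raising an exception instead of calling `st.error` keeps the helper free of Streamlit, so the app can decide how to present the failure.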

Collaborator Author

done

Member

I do not see your changes here until you commit and push them

Collaborator Author

I'll push it after I'm done: "It gives you a DataFrame with all content, which you can filter for unique entries in specific columns: df["license"].unique()"

Collaborator Author

I do not see your changes here until you commit and push them

I marked it to remind myself that this one is finished XD


    return g, repo

def create_pull_request(g, repo, yaml_file, authors, license, name, description, tags, type_, url):
    """
    Create a pull request to add a new entry to a YAML file on GitHub.

    Args:
        g (Github): Authenticated GitHub instance.
        repo (Repository): Repository object.
        yaml_file (str): YAML file to update.
        authors (str): Authors of the new entry.
        license (str): License of the new entry.
        name (str): Name of the new entry.
        description (str): Description of the new entry.
        tags (list): Tags for the new entry.
        type_ (str): Type of the new entry.
        url (str): URL of the new entry.

    Raises:
        Exception: If the pull request creation fails.
    """
    try:
        file_path = f"resources/{yaml_file}"
        file_contents = repo.get_contents(file_path)
        yaml_content = file_contents.decoded_content.decode('utf-8')
        yaml_data = yaml.safe_load(yaml_content)
        new_entry = {
            'authors': authors,
            'license': license,
            'name': name,
            'description': description,
            'tags': tags,
            'type': type_,
            'url': url
        }
        if 'resources' in yaml_data:
            yaml_data['resources'].append(new_entry)
        else:
            yaml_data['resources'] = [new_entry]
        new_yaml_content = yaml.safe_dump(yaml_data, allow_unicode=True, sort_keys=False)
        base_branch = repo.get_branch("main")
        timestamp = int(time.time())
        branch_name = f"update-{yaml_file.split('.')[0]}-{timestamp}".replace(' ', '-')
        repo.create_git_ref(ref=f"refs/heads/{branch_name}", sha=base_branch.commit.sha)
        repo.update_file(file_path, f"Add new entry", new_yaml_content, file_contents.sha, branch=branch_name)
        pr_title = f"Add new training materials request to {yaml_file}"
        pr_body = "Added new training materials."
        pr = repo.create_pull(title=pr_title, body=pr_body, head=branch_name, base='main')
        st.success(f"Pull request created: {pr.html_url}")
    except Exception as e:
        st.error(f"Failed to update YAML file and create pull request: {e}")
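
One edge case in the append step of `create_pull_request`: `yaml.safe_load` returns `None` for an empty file, and `'resources' in None` raises a `TypeError`. A minimal sketch of an append step that tolerates this, written as a plain function on already-parsed data so it can be tried without GitHub access. The name `append_resource` is hypothetical and not part of this PR:

```python
def append_resource(yaml_data, new_entry):
    """Return a copy of the parsed YAML data with new_entry appended to
    its 'resources' list (hypothetical helper, not part of the PR)."""
    # An empty YAML file parses to None, so start from an empty mapping.
    data = dict(yaml_data) if yaml_data else {}
    # Copy the list so the caller's data is left untouched.
    resources = list(data.get("resources") or [])
    resources.append(new_entry)
    data["resources"] = resources
    return data


existing = {"resources": [{"name": "Existing entry"}]}
updated = append_resource(existing, {"name": "New entry"})
print([entry["name"] for entry in updated["resources"]])
```

Returning a new mapping rather than mutating the argument makes it easy to compare the data before and after the append.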

# Path to resources directory
resources_dir = Path('..') / 'resources'

# Extract dynamic values from YAML files
unique_tags, unique_types, unique_licenses = get_unique_values_from_yamls(resources_dir)

# Get list of YAML files dynamically
yaml_files = get_yaml_files(resources_dir)

g, repo = authenticate_with_github()

st.title("GitHub Training Materials Submission")
st.markdown("Welcome to the GitHub Training Materials Submission app! Please fill in the details below and click 'Submit'. Thank you for your contribution!")

with st.form(key='submission_form'):
    authors = st.text_input("Authors")
    license = st.multiselect("Licenses", unique_licenses)
    name = st.text_input("Name")
    description = st.text_area("Description")

    tags = st.multiselect("Tags", unique_tags)

    tags_input = st.text_input("Add tags which are not in the select list (comma separated)", "")

    type_ = st.multiselect("Types", unique_types)
    url = st.text_input("URL")

    yaml_file = st.selectbox("YAML File", ["Select a YAML file"] + yaml_files)

    submit_button = st.form_submit_button(label='Submit')

if submit_button:
    if tags_input:
        entered_tags = [tag.strip() for tag in tags_input.split(',') if tag.strip()]
        tags.extend(entered_tags)

    tags = sorted(set(tags))

    if not license or yaml_file == "Select a YAML file":
        st.error("Please make sure all selections are made.")
    else:
        create_pull_request(g, repo, yaml_file, authors, license, name, description, tags, type_, url)
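
The tag handling in the submit branch (combine the selected tags with the comma-separated free-text tags, drop empties and duplicates, sort) can be sketched as a standalone function. The name `merge_tags` is made up for illustration; the logic mirrors the lines in `appsubmitter.py`, but this is not code from the PR:

```python
def merge_tags(selected, free_text):
    """Combine tags picked from the list with comma-separated extra tags."""
    # Split on commas and discard entries that are empty after stripping.
    extra = [tag.strip() for tag in free_text.split(",") if tag.strip()]
    # A set removes duplicates; sorting gives a stable order in the YAML file.
    return sorted(set(selected) | set(extra))


print(merge_tags(["python", "napari"], " fiji, python ,, "))
```

Keeping this step outside the Streamlit callback makes it straightforward to check inputs such as stray commas or repeated tags.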
82 changes: 82 additions & 0 deletions scripts/tags_type_licence_retrieve.py
@@ -0,0 +1,82 @@
import yaml
import os

def load_yaml(file_path):
Member

As mentioned above, consider using pre-existing functions instead of duplicating code.

    with open(file_path, 'r', encoding='utf-8') as file:
        return yaml.safe_load(file)

def extract_tags(yaml_content):
    tags = set()
    for item in yaml_content.get('resources', []):
        if 'tags' in item:
            tags.update(item['tags'])
    return tags

def extract_licenses(yaml_content):
    licenses = set()
    for item in yaml_content.get('resources', []):
        if 'license' in item:
            license_field = item['license']
            if isinstance(license_field, list):
                licenses.update(license_field)
            else:
                licenses.add(license_field)
    return licenses

def extract_types(yaml_content):
    types = set()
    for item in yaml_content.get('resources', []):
        if 'type' in item:
            types.update(item['type'])
    return types
Member

As mentioned above, consider re-using pre-existing functions, e.g. retrieving a DataFrame and then determining the unique items in its columns.


def collect_tags_licenses_and_types_from_files(directory):
    all_tags = set()
    all_licenses = set()
    all_types = set()
    if os.path.exists(directory):
        for file_name in os.listdir(directory):
            if file_name.endswith('.yaml') or file_name.endswith('.yml'):
                file_path = os.path.join(directory, file_name)
                yaml_content = load_yaml(file_path)
                all_tags.update(extract_tags(yaml_content))
                all_licenses.update(extract_licenses(yaml_content))
                all_types.update(extract_types(yaml_content))
Member

As mentioned above, reuse code instead of implementing the same thing multiple times.

Collaborator Author

Ahh, you can delete this script, or I can delete it. We don't need tags_type_licence_retrieve.py anymore; I have already integrated it into the appsubmitter.

Member

Please delete it. It's your branch.

    else:
        print(f"Directory {directory} does not exist.")
    return all_tags, all_licenses, all_types

def write_to_file(items, output_file):
    with open(output_file, 'w', encoding='utf-8') as file:
        for item in sorted(items):
            file.write(item + '\n')

def main():
    # Get the absolute paths
    resource_directory = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', 'resources'))
    tags_output_file = os.path.abspath(os.path.join(os.path.dirname(__file__), 'tags.txt'))
    licenses_output_file = os.path.abspath(os.path.join(os.path.dirname(__file__), 'licenses.txt'))
    types_output_file = os.path.abspath(os.path.join(os.path.dirname(__file__), 'types.txt'))
Member

These three text files, what do we need them for?


    unique_tags, unique_licenses, unique_types = collect_tags_licenses_and_types_from_files(resource_directory)

    if unique_tags:
        write_to_file(unique_tags, tags_output_file)
        print(f"Tags have been written to {tags_output_file}")
    else:
        print("No tags found or directory is empty.")

    if unique_licenses:
        write_to_file(unique_licenses, licenses_output_file)
        print(f"Licenses have been written to {licenses_output_file}")
    else:
        print("No licenses found or directory is empty.")

    if unique_types:
        write_to_file(unique_types, types_output_file)
        print(f"Types have been written to {types_output_file}")
    else:
        print("No types found or directory is empty.")

if __name__ == "__main__":
    main()