Sourcery Starbot ⭐ refactored sidphbot/Auto-Research #3

sourcery-ai-bot · 2023-12-10T16:34:41Z

Thanks for starring sourcery-ai/sourcery ✨ 🌟 ✨

Here's your pull request refactoring your most popular Python repo.

If you want Sourcery to refactor all your Python repos and incoming pull requests, install our bot.

Review changes via command line

To manually merge these changes, make sure you're on the main branch, then run:

git fetch https://github.com/sourcery-ai-bot/Auto-Research main
git merge --ff-only FETCH_HEAD
git reset HEAD^

sourcery-ai-bot

Due to GitHub API limits, only the first 60 comments can be shown.

sourcery-ai-bot · 2023-12-10T16:34:42Z

app.py

-    st.sidebar.image(Image.open('logo_landscape.png'), use_column_width = 'always')
-    st.title('Auto-Research')
-    st.write('#### A no-code utility to generate a detailed well-cited survey with topic clustered sections' 
-             '(draft paper format) and other interesting artifacts from a single research query or a curated set of papers(arxiv ids).')
-    st.write('##### Data Provider: arXiv Open Archive Initiative OAI')
-    st.write('##### GitHub: https://github.com/sidphbot/Auto-Research')
-    download_placeholder = st.container()
-
-    with st.sidebar.form(key="survey_keywords_form"):
-        session_data = sp.pydantic_input(key="keywords_input_model", model=KeywordsModel)
-        st.write('or')
-        session_data.update(sp.pydantic_input(key="arxiv_ids_input_model", model=ArxivIDsModel))
-        submit = st.form_submit_button(label="Submit")
-    st.sidebar.write('#### execution log:')
-
-    run_kwargs = {'surveyor':get_surveyor_instance(_print_fn=st.sidebar.write, _survey_print_fn=st.write),
-                  'download_placeholder':download_placeholder}
-    if submit:
-        if session_data['research_keywords'] != '':
-            run_kwargs.update({'research_keywords':session_data['research_keywords'], 
-                               'max_search':session_data['max_search'], 
-                               'num_papers':session_data['num_papers']})
-        elif session_data['arxiv_ids'] != '':
-            run_kwargs.update({'arxiv_ids':[id.strip() for id in session_data['arxiv_ids'].split(',')]})
-
-        run_survey(**run_kwargs)
+     st.sidebar.image(Image.open('logo_landscape.png'), use_column_width = 'always')
+     st.title('Auto-Research')
+     st.write('#### A no-code utility to generate a detailed well-cited survey with topic clustered sections' 
+              '(draft paper format) and other interesting artifacts from a single research query or a curated set of papers(arxiv ids).')
+     st.write('##### Data Provider: arXiv Open Archive Initiative OAI')
+     st.write('##### GitHub: https://github.com/sidphbot/Auto-Research')
+     download_placeholder = st.container()
+
+     with st.sidebar.form(key="survey_keywords_form"):
+         session_data = sp.pydantic_input(key="keywords_input_model", model=KeywordsModel)
+         st.write('or')
+         session_data.update(sp.pydantic_input(key="arxiv_ids_input_model", model=ArxivIDsModel))
+         submit = st.form_submit_button(label="Submit")
+     st.sidebar.write('#### execution log:')
+
+     run_kwargs = {'surveyor':get_surveyor_instance(_print_fn=st.sidebar.write, _survey_print_fn=st.write),
+                   'download_placeholder':download_placeholder}
+     if submit:
+          if session_data['research_keywords'] != '':
+               run_kwargs.update({'research_keywords':session_data['research_keywords'], 
+                                  'max_search':session_data['max_search'], 
+                                  'num_papers':session_data['num_papers']})
+          elif session_data['arxiv_ids'] != '':
+               run_kwargs['arxiv_ids'] = [
+                   id.strip() for id in session_data['arxiv_ids'].split(',')
+               ]
+
+          run_survey(**run_kwargs)


Lines 76-101 refactored with the following changes:

Add single value to dictionary directly rather than using update() (simplify-dictionary-update)
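
As a minimal sketch of why this change is behavior-preserving: calling update() with a one-key dictionary and assigning that key directly leave the target dictionary in the same state. The helper names and the sample input below are hypothetical and only mirror the shape of the `arxiv_ids` handling in the diff.

```python
def with_update(run_kwargs, raw_ids):
    # Original style: build a throwaway one-key dict, then merge it in.
    run_kwargs.update({'arxiv_ids': [i.strip() for i in raw_ids.split(',')]})
    return run_kwargs


def with_assignment(run_kwargs, raw_ids):
    # Refactored style: assign the single key directly.
    run_kwargs['arxiv_ids'] = [i.strip() for i in raw_ids.split(',')]
    return run_kwargs


# Hypothetical input, chosen only to exercise the whitespace stripping.
before = with_update({'surveyor': None}, '1234.5678, 2345.6789')
after = with_assignment({'surveyor': None}, '1234.5678, 2345.6789')
assert before == after
```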

sourcery-ai-bot · 2023-12-10T16:34:43Z

arxiv_public_data/authors.py

-            s = '{} {}'.format(match.group(2), match.group(3))
+            s = f'{match.group(2)} {match.group(3)}'


Function _parse_author_affil_split refactored with the following changes:

Replace call to format with f-string (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:43Z

arxiv_public_data/authors.py

-        else:
-            parts.append(pt)
-            last = pt
+        parts.append(pt)
+        last = pt


Function _remove_double_commas refactored with the following changes:

Remove unnecessary else after guard condition (remove-unnecessary-else)

sourcery-ai-bot · 2023-12-10T16:34:43Z

arxiv_public_data/authors.py

-def _collaboration_at_start(names: List[str]) \
-        -> Tuple[List[str], List[List[str]], int]:
+def _collaboration_at_start(names: List[str]) -> Tuple[List[str], List[List[str]], int]:
    """Perform special handling of collaboration at start."""
    author_list = []

    back_propagate_affiliations_to = 0
-    while len(names) > 0:
+    while names:


Function _collaboration_at_start refactored with the following changes:

Simplify sequence length comparison (simplify-len-comparison)

Replace multiple comparisons of same variable with in operator (merge-comparisons)
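
The diff above only shows the length simplification, so here is a rough sketch of both rules on made-up code. The function names and the separator tokens are assumptions for illustration, not taken from authors.py. Note that `while names:` matches `len(names) > 0` for lists, but the two differ for `None`: the truthiness test simply skips the loop, while `len(None)` raises TypeError.

```python
def drain(names):
    """Consume a list front to back, relying on list truthiness."""
    out = []
    while names:  # same as `while len(names) > 0` when names is a list
        out.append(names.pop(0))
    return out


def is_separator(token):
    # merge-comparisons: one membership test instead of
    # `token == ',' or token == ';' or token == ':'`.
    return token in (',', ';', ':')
```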

sourcery-ai-bot · 2023-12-10T16:34:43Z

arxiv_public_data/authors.py

-def _enum_collaboration_at_end(author_line: str)->Dict:
+def _enum_collaboration_at_end(author_line: str) -> Dict:


Function _enum_collaboration_at_end refactored with the following changes:

Use named expression to simplify assignment and conditional (use-named-expression)

This removes the following comments (why?):

# Now expect `1) affil1 ', discard if no match
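
The changed lines for this rule are not shown in the diff, so the following is only an illustrative sketch of what a named expression (the `:=` operator, available from Python 3.8) does to a match-then-test pattern. The regular expression and the function name are assumptions, loosely modelled on the `1) affil1` comment above.

```python
import re


def leading_index(text):
    """Return (index, rest) for text like '1) affil1', else None."""
    # Before the refactoring this would be two statements:
    #     match = re.match(...)
    #     if match:
    if match := re.match(r'\s*(\d+)\)\s*(.*)', text):
        return int(match.group(1)), match.group(2)
    return None
```

Because the assignment and the test now share a line, a comment that sat between them has nowhere to go, which is one plausible reason the comment listed above was dropped.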

sourcery-ai-bot · 2023-12-10T16:34:44Z

arxiv_public_data/fulltext.py

-    log.info('Searching "{}"...'.format(globber))
-    log.info('Found: {} pdfs'.format(len(pdffiles)))
+    log.info(f'Searching "{globber}"...')
+    log.info(f'Found: {len(pdffiles)} pdfs')


Function convert_directory refactored with the following changes:

Replace call to format with f-string [×3] (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:44Z

arxiv_public_data/fulltext.py

-    log.info('Searching "{}"...'.format(globber))
-    log.info('Found: {} pdfs'.format(len(pdffiles)))
+    log.info(f'Searching "{globber}"...')
+    log.info(f'Found: {len(pdffiles)} pdfs')


Function convert_directory_parallel refactored with the following changes:

Replace call to format with f-string [×2] (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:44Z

arxiv_public_data/fulltext.py

-        log.error('File conversion failed for {}: {}'.format(pdffile, e))
+        log.error(f'File conversion failed for {pdffile}: {e}')


Function convert_safe refactored with the following changes:

Replace call to format with f-string (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:44Z

arxiv_public_data/fulltext.py

-        raise RuntimeError('No such path: %s' % path)
+        raise RuntimeError(f'No such path: {path}')


Function convert refactored with the following changes:

Replace interpolated string formatting with f-string (replace-interpolation-with-fstring)
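
A small sketch of how the two forms compare. For a plain string path they produce the same message; they differ when the value is a tuple, because `%` treats a tuple on the right-hand side as the argument list. The function names here are hypothetical.

```python
def msg_percent(path):
    return 'No such path: %s' % path


def msg_fstring(path):
    return f'No such path: {path}'


# Identical output for an ordinary string path.
assert msg_percent('/tmp/x') == msg_fstring('/tmp/x')

# A two-element tuple is formatted as a single value by the f-string,
# while the % form raises TypeError because it sees two arguments
# for one placeholder.
assert msg_fstring(('a', 'b')) == "No such path: ('a', 'b')"
```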

sourcery-ai-bot · 2023-12-10T16:34:44Z

arxiv_public_data/internal_citations.py

-        for f in files:
-            if 'txt' in f:
-                out.append(os.path.join(root, f))
-
+        out.extend(os.path.join(root, f) for f in files if 'txt' in f)


Function all_articles refactored with the following changes:

Replace a for append loop with list extend (for-append-to-extend)
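
To illustrate the equivalence, the sketch below runs both forms over a hand-written list shaped like the `(root, dirs, files)` tuples that `os.walk` yields, so nothing touches the file system. The function names and sample entries are assumptions. As in the original code, `'txt' in f` is a substring test, so it also matches names such as `txt_notes.pdf`.

```python
import os


def collect_with_append(walk_entries):
    out = []
    for root, _dirs, files in walk_entries:
        for f in files:
            if 'txt' in f:
                out.append(os.path.join(root, f))
    return out


def collect_with_extend(walk_entries):
    out = []
    for root, _dirs, files in walk_entries:
        # One extend() call with a generator replaces the inner loop.
        out.extend(os.path.join(root, f) for f in files if 'txt' in f)
    return out


entries = [('a', [], ['x.txt', 'y.pdf']), ('b', [], ['txt_notes.pdf'])]
assert collect_with_append(entries) == collect_with_extend(entries)
```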

sourcery-ai-bot · 2023-12-10T16:34:47Z

arxiv_public_data/internal_citations.py

-            log.info('Completed {} articles'.format(i))
+            log.info(f'Completed {i} articles')
        try:
            refs = extract_references(article)
            cites[path_to_id(article)] = refs
        except:
-            log.error("Error in {}".format(article))
+            log.error(f"Error in {article}")


Function citation_list_inner refactored with the following changes:

Replace call to format with f-string [×2] (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:47Z

arxiv_public_data/internal_citations.py

-    log.info('Calculating citation network for {} articles'.format(len(articles)))
+    log.info(f'Calculating citation network for {len(articles)} articles')


Function citation_list_parallel refactored with the following changes:

Replace call to format with f-string (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:47Z

arxiv_public_data/internal_citations.py

-    log.info('Saving to "{}"'.format(filename))
+    log.info(f'Saving to "{filename}"')


Function save_to_default_location refactored with the following changes:

Replace call to format with f-string (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:47Z

arxiv_public_data/oai_metadata.py

-    if response.status_code == 503:
-        secs = int(response.headers.get('Retry-After', 20)) * 1.5
-        log.info('Requested to wait, waiting {} seconds until retry...'.format(secs))
-
-        time.sleep(secs)
-        return get_list_record_chunk(resumptionToken=resumptionToken)
-    else:
+    if response.status_code != 503:
        raise Exception(
-            'Unknown error in HTTP request {}, status code: {}'.format(
-                response.url, response.status_code
-            )
+            f'Unknown error in HTTP request {response.url}, status code: {response.status_code}'
        )
+    secs = int(response.headers.get('Retry-After', 20)) * 1.5
+    log.info(f'Requested to wait, waiting {secs} seconds until retry...')
+
+    time.sleep(secs)
+    return get_list_record_chunk(resumptionToken=resumptionToken)


Function get_list_record_chunk refactored with the following changes:

Swap if/else branches (swap-if-else-branches)

Remove unnecessary else after guard condition (remove-unnecessary-else)

Replace call to format with f-string [×2] (use-fstring-for-formatting)
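
The resulting control flow can be sketched in isolation: the unexpected status becomes a guard that raises early, and the 503 handling continues at the top indentation level. `FakeResponse` and `retry_delay` are hypothetical stand-ins so the example runs without a network call or a sleep; only the guard condition, the `Retry-After` default of 20, and the 1.5 multiplier come from the diff.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the HTTP response object."""
    status_code: int
    url: str = 'http://example.invalid/oai'
    headers: dict = field(default_factory=dict)


def retry_delay(response):
    # Guard clause: anything other than 503 is treated as an error.
    if response.status_code != 503:
        raise Exception(
            f'Unknown error in HTTP request {response.url}, '
            f'status code: {response.status_code}'
        )
    # Past the guard, the status is known to be 503.
    return int(response.headers.get('Retry-After', 20)) * 1.5
```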

sourcery-ai-bot · 2023-12-10T16:34:47Z

arxiv_public_data/oai_metadata.py

-    item = elm.find('arXiv:{}'.format(name), OAI_XML_NAMESPACES)
+    item = elm.find(f'arXiv:{name}', OAI_XML_NAMESPACES)


Function _record_element_text refactored with the following changes:

Replace call to format with f-string (use-fstring-for-formatting)

sourcery-ai-bot · 2023-12-10T16:34:48Z