
Review CPU notebooks and run with Python 3.9 #1950

Merged: 11 commits into staging from miguel/review_notebooks on Jul 4, 2023

Conversation

miguelgfierro (Collaborator)

Description

Related Issues

related to #1947

References

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging branch and not to main branch.

@review-notebook-app

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter notebooks.

@miguelgfierro (Collaborator, Author)

Error appearing in AzureML, but not locally:

=================================== FAILURES ===================================
___________________________ test_sar_deep_dive_runs ____________________________

notebooks = {'als_deep_dive': '/mnt/azureml/cr/j/4e944a170be74c0e9727bb7a9e80efcd/exe/wd/examples/02_model_collaborative_filtering...rk_movielens': '/mnt/azureml/cr/j/4e944a170be74c0e9727bb7a9e80efcd/exe/wd/examples/06_benchmarks/movielens.ipynb', ...}
output_notebook = 'output.ipynb', kernel_name = 'python3'

    @pytest.mark.notebooks
    def test_sar_deep_dive_runs(notebooks, output_notebook, kernel_name):
        notebook_path = notebooks["sar_deep_dive"]
>       pm.execute_notebook(notebook_path, output_notebook, kernel_name=kernel_name)

tests/unit/examples/test_notebooks_python.py:43: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/azureml-envs/azureml_2248098658e75fe22b1e778dcf414d40/lib/python3.9/site-packages/papermill/execute.py:128: in execute_notebook
    raise_for_execution_errors(nb, output_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

nb = {'cells': [{'id': 'ed4e3ed1', 'cell_type': 'markdown', 'source': '<span style="color:red; font-family:Helvetica Neue, ...end_time': '2023-06-27T09:42:50.951864', 'duration': 12.94534, 'exception': True}, 'nbformat': 4, 'nbformat_minor': 5}
output_path = 'output.ipynb'

    def raise_for_execution_errors(nb, output_path):
        """Assigned parameters into the appropriate place in the input notebook
    
        Parameters
        ----------
        nb : NotebookNode
           Executable notebook object
        output_path : str
           Path to write executed notebook
        """
        error = None
        for index, cell in enumerate(nb.cells):
            if cell.get("outputs") is None:
                continue
    
            for output in cell.outputs:
                if output.output_type == "error":
                    if output.ename == "SystemExit" and (output.evalue == "" or output.evalue == "0"):
                        continue
                    error = PapermillExecutionError(
                        cell_index=index,
                        exec_count=cell.execution_count,
                        source=cell.source,
                        ename=output.ename,
                        evalue=output.evalue,
                        traceback=output.traceback,
                    )
                    break
    
        if error:
            # Write notebook back out with the Error Message at the top of the Notebook, and a link to
            # the relevant cell (by adding a note just before the failure with an HTML anchor)
            error_msg = ERROR_MESSAGE_TEMPLATE % str(error.exec_count)
            error_msg_cell = nbformat.v4.new_markdown_cell(error_msg)
            error_msg_cell.metadata['tags'] = [ERROR_MARKER_TAG]
            error_anchor_cell = nbformat.v4.new_markdown_cell(ERROR_ANCHOR_MSG)
            error_anchor_cell.metadata['tags'] = [ERROR_MARKER_TAG]
    
            # put the anchor before the cell with the error, before all the indices change due to the
            # heading-prepending
            nb.cells.insert(error.cell_index, error_anchor_cell)
            nb.cells.insert(0, error_msg_cell)
    
            write_ipynb(nb, output_path)
>           raise error
E           papermill.exceptions.PapermillExecutionError: 
E           ---------------------------------------------------------------------------
E           Exception encountered at "In [9]":
E           ---------------------------------------------------------------------------
E           ValueError                                Traceback (most recent call last)
E           Cell In[9], line 1
E           ----> 1 top_k = model.recommend_k_items(test, top_k=TOP_K, remove_seen=True)
E           
E           File /mnt/azureml/cr/j/4e944a170be74c0e9727bb7a9e80efcd/exe/wd/recommenders/models/sar/sar_singlenode.py:533, in SARSingleNode.recommend_k_items(self, test, top_k, sort_top_k, remove_seen)
E               520 def recommend_k_items(self, test, top_k=10, sort_top_k=True, remove_seen=False):
E               521     """Recommend top K items for all users which are in the test set
E               522 
E               523     Args:
E              (...)
E               530         pandas.DataFrame: top k recommendation items for each user
E               531     """
E           --> 533     test_scores = self.score(test, remove_seen=remove_seen)
E               535     top_items, top_scores = get_top_k_scored_items(
E               536         scores=test_scores, top_k=top_k, sort_top_k=sort_top_k
E               537     )
E               539     df = pd.DataFrame(
E               540         {
E               541             self.col_user: np.repeat(
E              (...)
E               546         }
E               547     )
E           
E           File /mnt/azureml/cr/j/4e944a170be74c0e9727bb7a9e80efcd/exe/wd/recommenders/models/sar/sar_singlenode.py:346, in SARSingleNode.score(self, test, remove_seen)
E               344 # calculate raw scores with a matrix multiplication
E               345 logger.info("Calculating recommendation scores")
E           --> 346 test_scores = self.user_affinity[user_ids, :].dot(self.item_similarity)
E               348 # ensure we're working with a dense ndarray
E               349 if isinstance(test_scores, sparse.spmatrix):
E           
E           File /azureml-envs/azureml_2248098658e75fe22b1e778dcf414d40/lib/python3.9/site-packages/scipy/sparse/_base.py:411, in _spbase.dot(self, other)
E               409     return self * other
E               410 else:
E           --> 411     return self @ other
E           
E           File /azureml-envs/azureml_2248098658e75fe22b1e778dcf414d40/lib/python3.9/site-packages/scipy/sparse/_base.py:622, in _spbase.__matmul__(self, other)
E               620 def __matmul__(self, other):
E               621     if isscalarlike(other):
E           --> 622         raise ValueError("Scalar operands are not allowed, "
E               623                          "use '*' instead")
E               624     return self._mul_dispatch(other)
E           
E           ValueError: Scalar operands are not allowed, use '*' instead

See run https://github.com/microsoft/recommenders/actions/runs/5388016531/jobs/9780549827
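
For context, the traceback shows scipy's sparse dispatch at work: `.dot()` only routes true Python scalars through `*`, so a scalar-like operand such as a 0-d ndarray falls through to `@`, which scipy rejects. Here is a minimal sketch (not the recommenders code itself) that reproduces the same failure shape, assuming the right-hand operand ends up scalar-like; the `user_affinity` name just mirrors the traceback:

```python
import numpy as np
from scipy import sparse

user_affinity = sparse.csr_matrix(np.eye(3))

# A true Python scalar passes np.isscalar, so .dot() routes it through '*':
print(user_affinity.dot(2.0).toarray())

# A 0-d ndarray fails np.isscalar, so .dot() falls through to '@', where
# scipy's isscalarlike() check rejects it with the error seen in CI:
try:
    user_affinity.dot(np.array(2.0))
except ValueError as e:
    print(e)  # Scalar operands are not allowed, use '*' instead
```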

@miguelgfierro (Collaborator, Author)

@anargyri the issue with scipy is fixed; I'm rerunning the tests, hopefully they pass. Please review.

@miguelgfierro (Collaborator, Author) commented Jul 4, 2023

One problem we are seeing is that the old version of SAR with Python 3.6 gave these results:

System version: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0]
Pandas version: 0.24.2

MAP:		 0.095544
NDCG:		 0.350232
Precision@K:	 0.305726
Recall@K:	 0.164690

while the new version with Python 3.9 gives these results:

System version: 3.9.16 (main, May 15 2023, 23:46:34) 
[GCC 11.2.0]
Pandas version: 1.5.3
NumPy version: 1.24.4
Scipy version: 1.10.1

      "MAP:\t\t 0.113796\n",
      "NDCG:\t\t 0.384809\n",
      "Precision@K:\t 0.331707\n",
      "Recall@K:\t 0.182571\n"

with Python 3.8:

System version: 3.8.13 (default, Mar 28 2022, 11:38:47) 
[GCC 7.5.0]
Pandas version: 1.4.2
NumPy version: 1.21.6
SciPy version: 1.8.0

MAP:		 0.113796
NDCG:		 0.384809
Precision@K:	 0.331707
Recall@K:	 0.182571

with Python 3.7:

System version: 3.7.16 (default, Jan 17 2023, 22:20:44) 
[GCC 11.2.0]
Pandas version: 1.3.5
NumPy version: 1.21.6
SciPy version: 1.7.3

Top K:		 10
MAP:		 0.113796
NDCG:		 0.384809
Precision@K:	 0.331707
Recall@K:	 0.182571

with Python 3.6:

System version: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) 
[GCC 7.3.0]
Pandas version: 1.1.5
NumPy version: 1.19.5
SciPy version: 1.5.4

MAP:		 0.113796
NDCG:		 0.384809
Precision@K:	 0.331707
Recall@K:	 0.182571

with Python 3.6 and Pandas<1:

System version: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) 
[GCC 7.3.0]
Pandas version: 0.24.2
NumPy version: 1.19.5
SciPy version: 1.5.4


Model:
Top K:		 10
MAP:		 0.113796
NDCG:		 0.384809
Precision@K:	 0.331707
Recall@K:	 0.182571

Comment on lines +492 to +495
"MAP:\t\t 0.113796\n",
"NDCG:\t\t 0.384809\n",
"Precision@K:\t 0.331707\n",
"Recall@K:\t 0.182571\n"
miguelgfierro (Collaborator, Author):
@anargyri I ran the notebook with Python 3.6-3.9 and I always got the same results, so I believe the numbers are correct.

A hypothesis of what could have happened: we haven't updated this notebook since before 2020 (https://github.com/microsoft/recommenders/commits/main/examples/02_model_collaborative_filtering/sar_deep_dive.ipynb), and in the meantime there have been some changes, like the TOP_K issue Chuyang found in 68b60c0. That could explain the different numbers. Another possibility is the splitter; again, since we haven't touched the notebook in a while, we might not have detected changes there.

Collaborator:

Yes, I think you are right. Some time in the past there was a change, but the notebook outputs were not checked in properly. By the way, I think we do not guard against something like this in the unit tests, do we? That is, we only test that this notebook executes without error; we don't test that the results haven't changed significantly. (I remember this can be a source of many failing tests, but maybe we should revisit?)

Collaborator:

Here the discrepancy is around 20%, which should raise a flag IMO. The error tolerance can be set high enough, but at least there should be some error checking, whereas now there is none.

miguelgfierro (Collaborator, Author):

You are right; we are not checking this particular notebook's metrics, we are only making sure it runs in the unit tests.

I have added an issue: #1955
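
For reference, a check along these lines could look like the following sketch, assuming the notebook glues its metrics with scrapbook (e.g. `sb.glue("map", eval_map)`); the scrap names and the 5% tolerance are illustrative assumptions, and the baseline values are the ones from this thread:

```python
import papermill as pm
import pytest
import scrapbook as sb


@pytest.mark.notebooks
def test_sar_deep_dive_metrics(notebooks, output_notebook, kernel_name):
    notebook_path = notebooks["sar_deep_dive"]
    pm.execute_notebook(notebook_path, output_notebook, kernel_name=kernel_name)

    # Read back the metrics glued into the executed notebook.
    results = sb.read_notebook(output_notebook).scraps.dataframe.set_index("name")["data"]

    # Fail if any metric drifts more than 5% from the checked-in baseline,
    # instead of only checking that the notebook executes without error.
    assert results["map"] == pytest.approx(0.113796, rel=0.05)
    assert results["ndcg"] == pytest.approx(0.384809, rel=0.05)
    assert results["precision"] == pytest.approx(0.331707, rel=0.05)
    assert results["recall"] == pytest.approx(0.182571, rel=0.05)
```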

@miguelgfierro miguelgfierro merged commit af53046 into staging Jul 4, 2023
26 checks passed
@miguelgfierro miguelgfierro deleted the miguel/review_notebooks branch July 4, 2023 14:31