[BUG]: AssertionError: Expected 'log_model' to be called once. Called 0 times. #1192

Closed
dagardner-nv opened this issue Sep 13, 2023 · 1 comment · Fixed by #1195
Labels: bug (Something isn't working)

@dagardner-nv (Contributor)
Version

23.11

Which installation method(s) does this occur on?

No response

Describe the bug.

As of today, this test is failing after a conda environment update:

FAILED tests/examples/digital_fingerprinting/test_dfp_mlflow_model_writer.py::test_on_data[file:///home/user/morpheus/mlruns-None] - AssertionError: Expected 'log_model' to be called once. Called 0 times.

Minimum reproducible example

export CUDA_VER=${CUDA_VER:-11.8} && \
  export PROJ=$(basename $(pwd)) && \
  conda run -n base --live-stream conda-merge \
    docker/conda/environments/cuda${CUDA_VER}_dev.yml \
    docker/conda/environments/cuda${CUDA_VER}_examples.yml \
    docs/conda_docs.yml > .tmp/merged.yml && \
  mamba env update -n ${PROJ} --prune -f .tmp/merged.yml

pytest -s -v -x --run_slow --run_kafka --fail_missing tests/examples/digital_fingerprinting/test_dfp_mlflow_model_writer.py::test_on_data
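
For faster iteration, the failing parametrization can presumably be selected on its own with pytest's -k filter (hypothetical invocation; "mlruns-None" matches the id of the failing case shown below):

pytest -s -v -x --run_slow --fail_missing tests/examples/digital_fingerprinting/test_dfp_mlflow_model_writer.py::test_on_data -k "mlruns-None"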

Relevant log output


====================================================================================== test session starts =======================================================================================
platform linux -- Python 3.10.12, pytest-7.4.2, pluggy-1.0.0 -- /home/dagardner/work/conda/envs/m2/bin/python3.10
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/dagardner/work/m2
configfile: pyproject.toml
plugins: Faker-12.3.0, kafka-0.6.0, cov-4.1.0, anyio-4.0.0, benchmark-4.0.0
collected 4 items

tests/examples/digital_fingerprinting/test_dfp_mlflow_model_writer.py::test_on_data[file:///home/user/morpheus/mlruns-None] Error uploading model to ML Flow
Traceback (most recent call last):
File "/home/dagardner/work/m2/morpheus/controllers/mlflow_model_writer_controller.py", line 262, in on_data
model_info = mlflow.pytorch.log_model(
File "/home/dagardner/work/conda/envs/m2/lib/python3.10/site-packages/mlflow/pytorch/init.py", line 294, in log_model
return Model.log(
File "/home/dagardner/work/conda/envs/m2/lib/python3.10/site-packages/mlflow/models/model.py", line 579, in log
flavor.save_model(path=local_path, mlflow_model=mlflow_model, **kwargs)
File "/home/dagardner/work/conda/envs/m2/lib/python3.10/site-packages/mlflow/pytorch/init.py", line 456, in save_model
raise TypeError("Argument 'pytorch_model' should be a torch.nn.Module")
TypeError: Argument 'pytorch_model' should be a torch.nn.Module
FAILED

============================================================================================ FAILURES ============================================================================================
______________________________________________________________________ test_on_data[file:///home/user/morpheus/mlruns-None] ______________________________________________________________________

config = Config(debug=False, log_level=30, log_config_file=None, plugins=None, mode=<PipelineModes.OTHER: 'OTHER'>, feature_len...re_scaler=<AEFeatureScalar.STANDARD: 'standard'>, use_generic_model=False, fallback_username='generic_user'), fil=None)
mock_mlflow = MockedMLFlow(MlflowClient=<MagicMock ...>, ModelSignature=<MagicMock name='ModelS...nt=<MagicMock ...>, start_run=<MagicMock ...>)
mock_requests = MockedRequests(get=<MagicMock ...>, patch=<MagicMock ...>, response=<MagicMock ...>)
dataset_pandas = <_utils.dataset_manager.DatasetManager object at 0x7f9b0a90cf70>, databricks_env = {'DATABRICKS_HOST': 'https://test_host', 'DATABRICKS_TOKEN': 'test_token'}
databricks_permissions = None, tracking_uri = 'file:///home/user/morpheus/mlruns'

@pytest.mark.parametrize("databricks_permissions", [None, {}])
@pytest.mark.parametrize("tracking_uri", ['file:///home/user/morpheus/mlruns', "databricks"])
def test_on_data(
        config: Config,
        mock_mlflow: MockedMLFlow,  # pylint: disable=redefined-outer-name
        mock_requests: MockedRequests,
        dataset_pandas: DatasetManager,
        databricks_env: dict,
        databricks_permissions: dict,
        tracking_uri: str):
    from dfp.messages.multi_dfp_message import DFPMessageMeta
    from dfp.stages.dfp_mlflow_model_writer import DFPMLFlowModelWriterStage
    from dfp.stages.dfp_mlflow_model_writer import conda_env

    should_apply_permissions = (databricks_permissions is not None and tracking_uri == "databricks")

    if not should_apply_permissions:
        # We aren't setting databricks_permissions, so we shouldn't be trying to make any request calls
        mock_requests.get.side_effect = RuntimeError("should not be called")
        mock_requests.patch.side_effect = RuntimeError("should not be called")

    mock_mlflow.get_tracking_uri.return_value = tracking_uri

    config.ae.timestamp_column_name = 'eventTime'

    input_file = os.path.join(TEST_DIRS.validation_data_dir, "dfp-cloudtrail-role-g-validation-data-input.csv")
    df = dataset_pandas[input_file]
    time_col = df['eventTime']
    min_time = time_col.min()
    max_time = time_col.max()

    mock_model = mock.MagicMock()
    mock_model.lr_decay.state_dict.return_value = {'last_epoch': 42}
    mock_model.lr = 0.1
    mock_model.batch_size = 100

    mock_embedding = mock.MagicMock()
    mock_embedding.num_embeddings = 101
    mock_embedding.embedding_dim = 102
    mock_model.categorical_fts = {'test': {'embedding': mock_embedding}}

    mock_model.prepare_df.return_value = df
    mock_model.get_anomaly_score.return_value = pd.Series(float(i) for i in range(len(df)))

    meta = DFPMessageMeta(df, 'Account-123456789')
    msg = MultiAEMessage(meta=meta, model=mock_model)

    stage = DFPMLFlowModelWriterStage(config, databricks_permissions=databricks_permissions, timeout=10)
    assert stage._controller.on_data(msg) is msg  # Should be a pass-thru

    # Test mocks in order that they're called
    mock_mlflow.end_run.assert_called_once()
    mock_mlflow.set_experiment.assert_called_once_with("/dfp-models/dfp-Account-123456789")
    mock_mlflow.start_run.assert_called_once_with(run_name="autoencoder model training run",
                                                  experiment_id="test_experiment_id")

    mock_mlflow.log_params.assert_called_once_with({
        "Algorithm": "Denosing Autoencoder",
        "Epochs": 42,
        "Learning rate": 0.1,
        "Batch size": 100,
        "Start Epoch": min_time,
        "End Epoch": max_time,
        "Log Count": len(df)
    })

    mock_mlflow.log_metrics.assert_called_once_with({
        "embedding-test-num_embeddings": 101, "embedding-test-embedding_dim": 102
    })

    mock_model.prepare_df.assert_called_once()
    mock_model.get_anomaly_score.assert_called_once()

    mock_mlflow.ModelSignature.assert_called_once()
>   mock_mlflow.pytorch_log_model.assert_called_once_with(pytorch_model=mock_model,
                                                          artifact_path="dfencoder-test_run_uuid",
                                                          conda_env=conda_env,
                                                          signature=mock_mlflow.ModelSignature)

tests/examples/digital_fingerprinting/test_dfp_mlflow_model_writer.py:304:


self = <MagicMock name='log_model' ...>, args = ()
kwargs = {'artifact_path': 'dfencoder-test_run_uuid', 'conda_env': {'channels': ['defaults', 'conda-forge'], 'dependencies': ['...'pytorch_model': <MagicMock ...>, 'signature': <MagicMock ...>}
msg = "Expected 'log_model' to be called once. Called 0 times."

def assert_called_once_with(self, /, *args, **kwargs):
    """assert that the mock was called exactly once and that that call was
    with the specified arguments."""
    if not self.call_count == 1:
        msg = ("Expected '%s' to be called once. Called %s times.%s"
               % (self._mock_name or 'mock',
                  self.call_count,
                  self._calls_repr()))
>       raise AssertionError(msg)

E AssertionError: Expected 'log_model' to be called once. Called 0 times.

../conda/envs/m2/lib/python3.10/unittest/mock.py:940: AssertionError
==================================================================================== short test summary info =====================================================================================
FAILED tests/examples/digital_fingerprinting/test_dfp_mlflow_model_writer.py::test_on_data[file:///home/user/morpheus/mlruns-None] - AssertionError: Expected 'log_model' to be called once. Called 0 times.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
======================================================================================= 1 failed in 2.75s ========================================================================================
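
The traceback above suggests the mock never survives mlflow's new type check: save_model() in mlflow >= 2.7 rejects anything that is not a torch.nn.Module, so the exception path is taken and the mocked log_model is never recorded. A minimal sketch of that failure mode follows; the spec-based workaround at the end is a hypothetical alternative, not the fix that was adopted:

# Illustrates why a plain MagicMock trips the isinstance() check that
# mlflow >= 2.7 performs in mlflow.pytorch.save_model().
from unittest import mock

import torch

plain_mock = mock.MagicMock()
print(isinstance(plain_mock, torch.nn.Module))    # False -> TypeError in mlflow >= 2.7

# Hypothetical test-side workaround: spec the mock against torch.nn.Module.
# unittest.mock sets the mock's __class__ to the spec class, so the
# isinstance() check passes.
specced_mock = mock.MagicMock(spec=torch.nn.Module)
print(isinstance(specced_mock, torch.nn.Module))  # True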

Full env printout


[Paste the results of print_env.sh here, it will be hidden by default]

Other/Misc.

No response

Code of Conduct

  • I agree to follow Morpheus' Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
dagardner-nv added the bug (Something isn't working) label on Sep 13, 2023
dagardner-nv self-assigned this on Sep 13, 2023
@dagardner-nv (Contributor, Author) commented:

Issue caused by the new mlflow 2.7 release.
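
A sketch of the interim mitigation (#1195 pins the conda environment files; the exact command below is an assumption, shown only for reproducing the pin locally):

# Constrain mlflow below 2.7 in the dev environment until the test
# mocks are updated for the new type check.
mamba install -n ${PROJ} "mlflow<2.7"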

rapids-bot (bot) pushed a commit referencing this issue on Sep 18, 2023:
…1195)

* Adopt camouflage-server 0.15; previously we had been locked on v0.9 due to outstanding bugs introduced in versions 0.10 - 0.14.1:
  - testinggospels/camouflage#203
  - testinggospels/camouflage#223  
  - testinggospels/camouflage#227
  - testinggospels/camouflage#229
* Includes an unrelated fix for running CI locally
* Restrict mlflow to versions prior to 2.7

closes #967
Fixes #1192

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Christopher Harris (https://github.com/cwharris)

URL: #1195