Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ModuleNotFoundError when deploying Azure ML model to online endpoint #6819

Open
haraldger opened this issue Sep 29, 2023 · 8 comments
Open
Assignees
Labels
Auto-Assign Auto assign by bot customer-reported Issues that are reported by GitHub users external to the Azure organization. extension/ml Machine Learning question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention This issue is responsible by Azure service team.

Comments

@haraldger
Copy link

haraldger commented Sep 29, 2023

Describe the bug

I am trying to deploy my Azure ML model to an online endpoint hosted on Azure, as described in the documentation https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=azure-cli. However, I'm failing to import one of my files that should be included in the code repository where the scoring script is located. The logic for inferencing is quite complicated, more so than in the above linked documentation, so additional files must be included apart from just the scoring script.

Related command

az ml online-deployment create -f azure/deployment.yml --resource-group <omitted> --workspace-name <omitted>

I have deployed the exact same scoring script, and everything else the same, locally and it works perfectly fine. The issue is only when trying to deploy to my Azure endpoint.

Errors

The deployment fails with ResourceNotReady. Upon inspection of the logs in Azure Studio, the error is:

2023-09-29 19:00:24,282 E [65] azmlinfsrv - Traceback (most recent call last):
File "/azureml-envs/azureml_de0eccae1d7809be16f4cee8f4e60c52/lib/python3.10/site-packages/azureml_inference_server_http/server/user_script.py", line 77, in load_script
main_module_spec.loader.exec_module(user_module)
File "", line 883, in exec_module
File "", line 241, in _call_with_frames_removed
File "/var/azureml-app/rebel-core-deployment/scoring_script.py", line 8, in
from utils.post_processing import post_process
ModuleNotFoundError: No module named 'utils.post_processing'; 'utils' is not a package

The directory structure is as follows:

root:

  • __init__.py
  • scoring_script.py
  • utils/
    • __init__.py
    • post_processing.py
  • models/
    • __init__.py
    • StrengthNet.py
    • RateNet.py
  • azure/
    • deployment.yml

Issue script & Debug output

My scoring script is as follows:

import json
import pandas as pd
import torch

from models.RateNet import RateNet
from models.StrengthNet import StrengthNet
from utils.post_processing import post_process

strength_net = None
rate_net = None


def init():
    global config
    global strength_net
    global rate_net
    
    # Initialization
    # Omitted for business sensitivity


def predict(data, metadata):
    # Omitted ...

    return predictions


def run(request):
    request = json.loads(request)
    data = torch.tensor(request["data"])
    metadata = torch.tensor(request["metadata"])

    # Omitted

    predictions = predict(data, metadata)

    return predictions.tolist()

Note that the first two imports work, which are also local files. The program only crashes on the utils package.

Deployment YAML file:

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: deployment-name-omitted
endpoint_name: endpoint-name-omitted
model: azureml:ModelNameOmitted:2
code_configuration:
  code: ../
  scoring_script: scoring_script.py
environment: azureml:env-name-omitted:14
instance_type: Standard_DS4_v2
instance_count: 1

Expected behavior

The expected behaviour is that the deployment is successful, and that the utils.post_processing file is successfully imported. When deployed locally, with the --local tag, things work fine. Additionally, the deployment successfully imports from files in the models directory, so there should logically not be any difference to the utils directory.

Environment Summary

azure-cli 2

Additional context

No response

@haraldger haraldger added the bug This issue requires a change to an existing behavior in the product in order to be resolved. label Sep 29, 2023
@microsoft-github-policy-service microsoft-github-policy-service bot added question The issue doesn't require a change to the product in order to be resolved. Most issues start as that customer-reported Issues that are reported by GitHub users external to the Azure organization. Auto-Assign Auto assign by bot Service Attention This issue is responsible by Azure service team. Machine Learning extension/ml labels Sep 29, 2023
@yonzhan
Copy link
Collaborator

yonzhan commented Sep 29, 2023

Thank you for opening this issue, we will look into it.

@microsoft-github-policy-service
Copy link
Contributor

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.

@hugoaponte
Copy link
Member

Thanks for the feedback.

Could you confirm that the dependency missing at run time is indeed included in the environment specified in the deployment?

from utils.post_processing import post_process

@hugoaponte hugoaponte self-assigned this Sep 29, 2023
@haraldger
Copy link
Author

haraldger commented Sep 29, 2023

Hi @hugoaponte , thanks for your response.

So maybe this is where I've done something wrong or misunderstood. The utils.post_processing dependency is not part of the environment specification, but rather one of my own files that's located in a child directory to the scoring script. (I've outlined the directory structure in the OP) I assumed that the yml specification for the deployment, which specifies the code location as well as the scoring script location within that code location, makes it so that the deployment knows where to find the code necessary? (I also posted my yml specification above where this code_configuration is shown, and it's aware of the directory structure).

Maybe this is what I've got wrong. However, I don't know how the environment, which is registered in Azure ML Studio, could include this directory and file as a dependency, when this code is only local to this specific deployment? They're not pip packages or anything like that, they're just my own python files

@hugoaponte
Copy link
Member

Got it. Thanks for clarifying.

If the package is not published, then it can't be added to the environment, you are right.

You could perfectly have the code in the folder specified for your code. It should work.

Could you double check the script "post_processing.py"?

If the issue was the folder structure, your scoring script should fail right when trying to import from "models".

@haraldger
Copy link
Author

haraldger commented Oct 2, 2023

@hugoaponte

After double checking, I can't find anything that seems wrong with it. I have double checked its name, which is indeed post_processing, and the function name, confirmed is post_process, and finally double checked that the directory is called utils. So the import line should work. And besides, like I mentioned above, the script works when I deploy locally.

So the import itself works when I deploy locally, but not when deploying to Azure. So it's probably nothing wrong with the referencing or naming. Again, this makes me think directory structure, but like you mentioned importing from the models directory works. So perhaps its something specific to the utils directory, for whatever reason. I have doubled checked that there is an __init__.py file in there, so that shouldn't be the issue. I'm at loss here - I can't find anything wrong on my end, and it works when making a local deployment. I think that perhaps the utils directory is not getting transferred/uploaded to Azure correctly.

@haraldger
Copy link
Author

As an additional to the above, I found in blob storage the specific uploaded code for this deployment, and what's strange is that the code and the utils package does seem to be included. So now I'm left even more confused as to why it can't recognize utils as a package. Pic 1 shows the root folder, pic 2 shows inside the utils folder.

image

image

@shrey-c
Copy link

shrey-c commented Dec 19, 2023

Hi, I came across a similar issue and this is what I was able to figure out.

Given your file structure:
image

Your "source_directory" is called "root", your inference config should look like this
inference_config = InferenceConfig(runtime = "python",
source_directory="root",
entry_script = "score.py",
conda_file = "deploymentEnv.yml")

Now all your file imports need to use root as the parent directly and need to reference from that, so your import would have to look like:

from root.utils.post_processing import post_process

And the import should work fine.

@yonzhan yonzhan removed the bug This issue requires a change to an existing behavior in the product in order to be resolved. label Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auto-Assign Auto assign by bot customer-reported Issues that are reported by GitHub users external to the Azure organization. extension/ml Machine Learning question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention This issue is responsible by Azure service team.
Projects
None yet
Development

No branches or pull requests

4 participants