Cannot import olmocr modules #54

Open
naseerfaheem opened this issue Feb 27, 2025 · 3 comments
Labels: bug (Something isn't working)

@naseerfaheem

🐛 Describe the bug

I followed all the installation instructions on an Ubuntu server, but I cannot import the package from a Jupyter notebook in VS Code.


ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 1
----> 1 import olmocr

File /mnt/datacenter/code-server/workspace/olmocr/olmocr.py:10
7 from PIL import Image
8 from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
---> 10 from olmocr.data.renderpdf import render_pdf_to_base64png
11 from olmocr.prompts import build_finetuning_prompt
12 from olmocr.prompts.anchor import get_anchor_text

ModuleNotFoundError: No module named 'olmocr.data'; 'olmocr' is not a package

I tried both Python 3.11 and Python 3.12. I get the same error.

Versions

python --version && pip freeze
Python 3.12.0
aiohappyeyeballs==2.4.6
aiohttp==3.11.13
aiosignal==1.3.2
annotated-types==0.7.0
anthropic==0.48.0
anyio==4.8.0
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1733250440834/work
attrs==25.1.0
beaker-py==1.34.1
bleach==6.2.0
boto3==1.37.3
botocore==1.37.3
cached_path==1.6.7
cachetools==5.5.2
certifi==2025.1.31
cffi==1.17.1
charset-normalizer==3.4.1
click==8.1.8
cloudpickle==3.1.1
comm @ file:///home/conda/feedstock_root/build_artifacts/comm_1733502965406/work
compressed-tensors==0.8.0
cryptography==44.0.1
cuda-bindings==12.8.0
cuda-python==12.8.0
datasets==3.3.2
debugpy @ file:///croot/debugpy_1736267418885/work
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1740384970518/work
decord==0.6.0
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
docker==7.1.0
einops==0.8.1
exceptiongroup @ file:///home/conda/feedstock_root/build_artifacts/exceptiongroup_1733208806608/work
executing==2.2.0
fastapi==0.115.9
filelock==3.17.0
flashinfer==0.1.6+cu124torch2.4
frozenlist==1.5.0
fsspec==2024.12.0
ftfy==6.3.1
fuzzysearch==0.7.3
gguf==0.10.0
google-api-core==2.24.1
google-auth==2.38.0
google-cloud-core==2.4.2
google-cloud-storage==2.19.0
google-crc32c==1.6.0
google-resumable-media==2.7.2
googleapis-common-protos==1.68.0
h11==0.14.0
hf_transfer==0.1.9
httpcore==1.0.7
httptools==0.6.4
httpx==0.28.1
huggingface-hub==0.27.1
idna==3.10
importlib_metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1737420181517/work
interegular==0.3.3
ipykernel @ file:///home/conda/feedstock_root/build_artifacts/ipykernel_1719845459717/work
ipython @ file:///home/conda/feedstock_root/build_artifacts/bld/rattler-build_ipython_1738421264/work
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1733300866624/work
Jinja2==3.1.5
jiter==0.8.2
jmespath==1.0.1
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1733440914442/work
jupyter_core @ file:///home/conda/feedstock_root/build_artifacts/jupyter_core_1727163409502/work
lark==1.2.2
lingua-language-detector==2.0.2
litellm==1.61.20
llvmlite==0.44.0
lm-format-enforcer==0.10.11
markdown-it-py==3.0.0
markdown2==2.5.3
MarkupSafe==3.0.2
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1733416936468/work
mdurl==0.1.2
mistral_common==1.5.3
modelscope==1.23.1
mpmath==1.3.0
msgpack==1.1.0
msgspec==0.19.0
multidict==6.1.0
multiprocess==0.70.16
nest_asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1733325553580/work
networkx==3.4.2
numba==0.61.0
numpy==1.26.4
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-ml-py==12.570.86
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
-e git+https://github.com/allenai/olmocr.git@505e08cbb14f8e2ee039dee061b6dfae86b123d6#egg=olmocr
openai==1.65.1
opencv-python-headless==4.11.0.86
orjson==3.10.15
outlines==0.0.46
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1733203243479/work
pandas==2.2.3
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1733271261340/work
partial-json-parser==0.2.1.1.post5
pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1733301927746/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1733327343728/work
pillow==11.1.0
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1733232627818/work
prometheus-fastapi-instrumentator==7.0.2
prometheus_client==0.21.1
prompt_toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1737453357274/work
propcache==0.3.0
proto-plus==1.26.0
protobuf==5.29.3
psutil==7.0.0
ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1733302279685/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl#sha256=92c32ff62b5fd8cf325bec5ab90d7be3d2a8ca8c8a3813ff487a8d2002630d1f
pure_eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1733569405015/work
py-cpuinfo==9.0.0
pyairports==2.1.1
pyarrow==19.0.1
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycountry==24.6.1
pycparser==2.22
pydantic==2.10.6
pydantic_core==2.27.2
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1736243443484/work
pypdf==5.3.0
pypdfium2==4.30.1
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1733215673016/work
python-dotenv==1.0.1
python-multipart==0.0.20
pytz==2025.1
PyYAML==6.0.2
pyzmq==26.2.1
RapidFuzz==3.12.1
ray==2.43.0
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
rich==13.9.4
rpds-py==0.23.1
rsa==4.9
s3transfer==0.11.3
safetensors==0.5.3
sentencepiece==0.2.0
sequence_align==0.2.0
setproctitle==1.3.5
setuptools==75.8.0
sgl-kernel==0.0.3.post1
sglang==0.4.2
six @ file:///home/conda/feedstock_root/build_artifacts/six_1733380938961/work
smart-open==7.1.0
sniffio==1.3.1
stack_data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1733569443808/work
starlette==0.45.3
sympy==1.13.1
tiktoken==0.9.0
tokenizers==0.21.0
torch==2.5.1
torchao==0.8.0
torchvision==0.20.1
tornado @ file:///croot/tornado_1733960490606/work
tqdm==4.67.1
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1733367359838/work
transformers==4.49.0
triton==3.1.0
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1733188668063/work
tzdata==2025.1
urllib3==2.3.0
uvicorn==0.34.0
uvloop==0.21.0
vllm==0.6.4.post1
watchfiles==1.0.4
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1733231326287/work
webencodings==0.5.1
websockets==15.0
wheel==0.45.1
wrapt==1.17.2
xformers==0.0.28.post3
xgrammar==0.1.14
xxhash==3.5.0
yarl==1.18.3
zipp @ file:///home/conda/feedstock_root/build_artifacts/zipp_1732827521216/work
zstandard==0.23.0

@naseerfaheem added the bug label Feb 27, 2025
@aman-17
Member

aman-17 commented Feb 28, 2025

Hey @naseerfaheem, it looks like Python isn’t recognizing olmocr as a package. Can you try installing the dependencies from the repo with pip install -e ."[all]" and also check that each subdirectory contains an __init__.py?
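
The second check can be scripted. This is a minimal sketch, assuming a plain source checkout: the helper name `dirs_missing_init` is made up for illustration and is not part of olmocr. It lists every directory under a package root that holds .py files but has no __init__.py. Python 3.3+ can still import such directories as namespace packages, so a hit is a hint rather than proof of a problem.

```python
from pathlib import Path


def dirs_missing_init(package_root):
    """Return directories under package_root that contain .py files
    but have no __init__.py (hypothetical helper, for illustration)."""
    root = Path(package_root)
    # Check the root itself, then every nested directory in sorted order.
    candidates = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    missing = []
    for directory in candidates:
        has_python_files = any(directory.glob("*.py"))
        has_init = (directory / "__init__.py").exists()
        if has_python_files and not has_init:
            missing.append(directory)
    return missing
```

Calling `dirs_missing_init("olmocr")` from the repository root would return an empty list if every source directory has its __init__.py.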

@naseerfaheem
Author

naseerfaheem commented Feb 28, 2025

I tried it, and the import works in the terminal:
Python 3.12.0 | packaged by Anaconda, Inc. | (main, Oct 2 2023, 17:29:18) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import olmocr
>>> exit()

But when I try it in VS Code with the same conda environment, the import fails.

The __init__.py in the data directory is empty.
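
When an import works in the terminal but fails in a notebook on what should be the same environment, one way to narrow it down is to run the same few lines in both places and compare the output. A minimal sketch (the function name `environment_summary` is made up for illustration): it reports the interpreter path, the working directory, and the first entries of sys.path, which are searched first when resolving imports.

```python
import os
import sys


def environment_summary(max_path_entries: int = 5) -> dict:
    """Collect the settings that most often differ between a terminal
    session and a notebook kernel (hypothetical helper, for illustration)."""
    return {
        "executable": sys.executable,  # which Python interpreter is running
        "prefix": sys.prefix,          # root of the active environment
        "cwd": os.getcwd(),            # current working directory
        "sys_path_head": sys.path[:max_path_entries],  # searched first
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

If the executable matches in both places but the working directory or the head of sys.path differs, the two sessions may be resolving `olmocr` to different files.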

@jakep-allenai
Collaborator

That's likely because you have a file named "olmocr.py" in your own project directory, so Python can't tell which one to load: the local file or the installed package. Try renaming your own test file to "olmocr_test.py", and the imports within it should work.
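
The traceback in the original report is consistent with this: the failing import is raised from a file at .../workspace/olmocr/olmocr.py, and the message ends with "'olmocr' is not a package". A minimal sketch for confirming the shadowing without importing anything, using the standard library's `importlib.util.find_spec` (the helper name `describe_import` is made up for illustration):

```python
import importlib.util
import sys


def describe_import(name: str) -> dict:
    """Report what `import name` would resolve to, without running it
    (hypothetical helper, for illustration)."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "is_package": False}
    # A package has submodule search locations; a single .py file does not.
    is_package = spec.submodule_search_locations is not None
    return {"found": True, "origin": spec.origin, "is_package": is_package}


if __name__ == "__main__":
    # Run this from the same directory and kernel as the failing notebook.
    print(describe_import("olmocr"))
    print("first sys.path entry:", repr(sys.path[0]) if sys.path else None)
```

If `origin` points at a local olmocr.py and `is_package` is False, the local file is shadowing the installed package, and renaming it as suggested above should clear the error.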
