Commit 92474df

Rajan Shukla committed
Initial commit
0 parents, commit 92474df

12 files changed

+1072
-0
lines changed

.github/workflows/publish.yml

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@

.gitignore

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
# Python
2+
__pycache__/
3+
*.py[cod]
4+
*$py.class
5+
*.so
6+
.Python
7+
build/
8+
develop-eggs/
9+
dist/
10+
downloads/
11+
eggs/
12+
.eggs/
13+
lib/
14+
lib64/
15+
parts/
16+
sdist/
17+
var/
18+
wheels/
19+
*.egg-info/
20+
.installed.cfg
21+
*.egg
22+
23+
# Virtual Environment
24+
.env
25+
.venv
26+
env/
27+
venv/
28+
ENV/
29+
30+
# IDE
31+
.idea/
32+
.vscode/
33+
*.swp
34+
*.swo
35+
36+
# OS
37+
.DS_Store
38+
Thumbs.db
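
To illustrate what a bracket pattern such as `*.py[cod]` covers, here is a rough sketch using Python's `fnmatch` module. This is only an approximation and not Git's own matcher: it checks bare file names against a few of the simple patterns above and ignores directory rules, negation, and path anchoring. The helper name `is_ignored` and the sample file names are made up for the example.

```python
from fnmatch import fnmatchcase

# A few simple name patterns copied from the .gitignore above.
PATTERNS = ["*.py[cod]", "*.so", "*.egg", "*.swp", ".DS_Store"]


def is_ignored(filename: str) -> bool:
    """Return True if the bare file name matches any simple pattern.

    Approximation only: real gitignore rules also cover directories,
    negation and path anchoring, which fnmatch does not model.
    """
    return any(fnmatchcase(filename, pattern) for pattern in PATTERNS)


if __name__ == "__main__":
    # Hypothetical file names, chosen to show matches and a non-match.
    for name in ["module.pyc", "module.py", "ext.so", ".DS_Store"]:
        print(name, is_ignored(name))
```

Under these assumptions, `module.pyc`, `module.pyo`, and `module.pyd` all match `*.py[cod]`, while the source file `module.py` does not.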

.python-version

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
3.11

LICENSE

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
MIT License
2+
3+
Copyright (c) 2024 Rajan Shukla
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.

README.md

Lines changed: 109 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
1+
# MCP OCR Server
2+
3+
A production-grade OCR server built using MCP (Model Context Protocol) that provides OCR capabilities through a simple interface.
4+
5+
## Features
6+
7+
- Extract text from images using Tesseract OCR
8+
- Support for multiple input types:
9+
- Local image files
10+
- Image URLs
11+
- Raw image bytes
12+
- Automatic Tesseract installation
13+
- Support for multiple languages
14+
- Production-ready error handling
15+
16+
## Installation
17+
18+
```bash
19+
# Using pip
20+
pip install mcp-ocr
21+
22+
# Using uv
23+
uv pip install mcp-ocr
24+
```
25+
26+
Tesseract will be installed automatically on supported platforms:
27+
- macOS (via Homebrew)
28+
- Linux (via apt, dnf, or pacman)
29+
- Windows (manual installation instructions provided)
30+
31+
## Usage
32+
33+
### As an MCP Server
34+
35+
1. Start the server:
36+
```bash
37+
python -m mcp_ocr
38+
```
39+
40+
2. Configure Claude for Desktop:
41+
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
42+
```json
43+
{
44+
"mcpServers": {
45+
"ocr": {
46+
"command": "python",
47+
"args": ["-m", "mcp_ocr"]
48+
}
49+
}
50+
}
51+
```
52+
53+
### Available Tools
54+
55+
#### perform_ocr
56+
Extract text from images:
57+
```python
58+
# From file
59+
perform_ocr("/path/to/image.jpg")
60+
61+
# From URL
62+
perform_ocr("https://example.com/image.jpg")
63+
64+
# From bytes
65+
perform_ocr(image_bytes)
66+
```
67+
68+
#### get_supported_languages
69+
List available OCR languages:
70+
```python
71+
get_supported_languages()
72+
```
73+
74+
## Development
75+
76+
1. Clone the repository:
77+
```bash
78+
git clone https://github.com/yourusername/mcp-ocr.git
79+
cd mcp-ocr
80+
```
81+
82+
2. Set up development environment:
83+
```bash
84+
uv venv
85+
source .venv/bin/activate # On Windows: .venv\Scripts\activate
86+
uv pip install -e .
87+
```
88+
89+
3. Run tests:
90+
```bash
91+
pytest
92+
```
93+
94+
## Contributing
95+
96+
1. Fork the repository
97+
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
98+
3. Commit your changes (`git commit -m 'Add amazing feature'`)
99+
4. Push to the branch (`git push origin feature/amazing-feature`)
100+
5. Open a Pull Request
101+
102+
## License
103+
104+
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
105+
106+
## Acknowledgments
107+
108+
- [Tesseract OCR](https://github.com/tesseract-ocr/tesseract)
109+
- [Model Context Protocol](https://modelcontextprotocol.io)
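
The README says `perform_ocr` accepts a local path, a URL, or raw bytes, but the `server.py` that implements it is not shown in this part of the diff. As a minimal sketch of how such an input could be classified before loading, assuming a hypothetical helper named `classify_input` (not taken from the project), one option is:

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_input(source) -> str:
    """Label an OCR input as 'bytes', 'url', or 'file'.

    Hypothetical helper for illustration; the project's actual
    dispatch logic lives in server.py, which is not shown here.
    """
    if isinstance(source, (bytes, bytearray)):
        return "bytes"
    if isinstance(source, (str, Path)):
        parsed = urlparse(str(source))
        # Only http(s) with a host counts as a URL; anything else,
        # including Windows drive paths like C:\img.png, is a file.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return "url"
        return "file"
    raise TypeError(f"Unsupported input type: {type(source).__name__}")


if __name__ == "__main__":
    print(classify_input("/path/to/image.jpg"))
    print(classify_input("https://example.com/image.jpg"))
    print(classify_input(b"\x89PNG"))
```

Checking the scheme explicitly, and not just looking for a colon, keeps Windows paths from being mistaken for URLs.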

hello.py

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
def main():
2+
print("Hello from mcp-ocr!")
3+
4+
5+
if __name__ == "__main__":
6+
main()

pyproject.toml

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
[project]
2+
name = "mcp-ocr"
3+
version = "0.1.0"
4+
description = "MCP server for OCR functionality using Tesseract"
5+
readme = "README.md"
6+
requires-python = ">=3.11"
7+
dependencies = [
8+
"mcp[cli]>=1.2.0",
9+
"pytesseract>=0.3.10",
10+
"opencv-python>=4.8.0",
11+
"numpy>=1.24.0",
12+
"pillow>=10.0.0",
13+
"httpx>=0.24.0"
14+
]
15+
authors = [
16+
{ name = "Rajan Shukla", email = "[email protected]" }
17+
]
18+
license = { text = "MIT" }
19+
classifiers = [
20+
"Development Status :: 4 - Beta",
21+
"Intended Audience :: Developers",
22+
"License :: OSI Approved :: MIT License",
23+
"Programming Language :: Python :: 3",
24+
"Programming Language :: Python :: 3.11",
25+
"Topic :: Software Development :: Libraries :: Python Modules",
26+
"Operating System :: OS Independent",
27+
"Environment :: Console",
28+
]
29+
30+
[project.scripts]
31+
mcp-ocr = "mcp_ocr.__main__:main"
32+
33+
[build-system]
34+
requires = ["hatchling"]
35+
build-backend = "hatchling.build"
36+
37+
[tool.hatch.build.targets.wheel]
38+
packages = ["src/mcp_ocr"]
39+
40+
[tool.hatch.build.hooks.custom]
41+
dependencies = ["hatch-requirements-txt"]
42+
43+
[project.urls]
44+
Homepage = "https://github.com/yourusername/mcp-ocr"
45+
Documentation = "https://github.com/yourusername/mcp-ocr#readme"
46+
Issues = "https://github.com/yourusername/mcp-ocr/issues"
47+
48+
[tool.hatch.metadata]
49+
allow-direct-references = true
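
The `[project.scripts]` entry `mcp-ocr = "mcp_ocr.__main__:main"` uses the `module:attribute` form: the installed `mcp-ocr` command imports the module and calls the named object. The sketch below is a simplified illustration of that lookup, not the code an installer actually generates. The function name `resolve_entry_point` is hypothetical, and the demo resolves a standard-library target because `mcp_ocr` may not be installed where this runs.

```python
from importlib import import_module


def resolve_entry_point(spec: str):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f"Expected 'module:attribute', got {spec!r}")
    obj = import_module(module_name)
    # The attribute part may be dotted, e.g. 'pkg.mod:Class.method'.
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    # Standard-library stand-in for "mcp_ocr.__main__:main".
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
```

With the package installed, `resolve_entry_point("mcp_ocr.__main__:main")` would return the `main` function defined in `src/mcp_ocr/__main__.py`.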

src/mcp_ocr/__init__.py

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
"""MCP OCR: A production-grade OCR server built using MCP (Model Context Protocol)."""
2+
3+
__version__ = "0.1.0"
4+
5+
from .server import mcp, perform_ocr, get_supported_languages
6+
7+
__all__ = ["mcp", "perform_ocr", "get_supported_languages"]

src/mcp_ocr/__main__.py

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
#!/usr/bin/env python3
2+
"""
3+
Command-line entry point for the MCP OCR Server.
4+
"""
5+
6+
import argparse
7+
from .server import mcp
8+
9+
def main():
10+
"""MCP OCR: Extract text from images using OCR."""
11+
parser = argparse.ArgumentParser(
12+
description="Extract text from images using OCR with support for local files, URLs, and image data."
13+
)
14+
parser.parse_args()
15+
mcp.run()
16+
17+
if __name__ == "__main__":
18+
main()
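
Because `main()` builds a parser with no arguments of its own, the command accepts only the built-in `--help` flag and rejects anything else before `mcp.run()` is reached. The sketch below shows that behavior in isolation. The helper names `build_parser` and `parse` and the `prog` value are assumptions made for the example; the MCP server itself is left out so the snippet runs on its own.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with no custom arguments, like the one in main()."""
    return argparse.ArgumentParser(
        prog="mcp-ocr",  # assumed to match the script name in pyproject.toml
        description="Extract text from images using OCR.",
    )


def parse(argv):
    """Return 'ok' if argv parses, else the exit code argparse raised."""
    try:
        build_parser().parse_args(argv)
        return "ok"
    except SystemExit as exc:
        return exc.code


if __name__ == "__main__":
    print(parse([]))             # no arguments: accepted
    print(parse(["--help"]))     # prints help text, exits with code 0
    print(parse(["image.png"]))  # unrecognized argument, exits with code 2
```

So an image path passed on the command line is an error here; images reach the server through the MCP tools, not through command-line arguments.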

src/mcp_ocr/install_tesseract.py

Lines changed: 67 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,67 @@
1+
"""Handle Tesseract installation during package setup."""
2+
3+
import os
4+
import platform
5+
import subprocess
6+
import sys
7+
from typing import Optional
8+
9+
def get_package_manager() -> Optional[str]:
10+
"""Detect the system's package manager."""
11+
system = platform.system().lower()
12+
13+
if system == "darwin":
14+
# Check if Homebrew is installed
15+
if subprocess.run(["which", "brew"], capture_output=True).returncode == 0:
16+
return "brew"
17+
elif system == "linux":
18+
# Check for apt (Debian/Ubuntu)
19+
if os.path.exists("/usr/bin/apt"):
20+
return "apt"
21+
# Check for dnf (Fedora)
22+
elif os.path.exists("/usr/bin/dnf"):
23+
return "dnf"
24+
# Check for pacman (Arch)
25+
elif os.path.exists("/usr/bin/pacman"):
26+
return "pacman"
27+
28+
return None
29+
30+
def install_tesseract():
31+
"""Install Tesseract OCR based on the operating system."""
32+
system = platform.system().lower()
33+
pkg_manager = get_package_manager()
34+
35+
try:
36+
if system == "darwin" and pkg_manager == "brew":
37+
subprocess.run(["brew", "install", "tesseract"], check=True)
38+
39+
elif system == "linux":
40+
if pkg_manager == "apt":
41+
subprocess.run(["sudo", "apt-get", "update"], check=True)
42+
subprocess.run(["sudo", "apt-get", "install", "-y", "tesseract-ocr"], check=True)
43+
elif pkg_manager == "dnf":
44+
subprocess.run(["sudo", "dnf", "install", "-y", "tesseract"], check=True)
45+
elif pkg_manager == "pacman":
46+
subprocess.run(["sudo", "pacman", "-S", "--noconfirm", "tesseract"], check=True)
47+
48+
elif system == "windows":
49+
print("For Windows users:")
50+
print("Please download and install Tesseract from: https://github.com/UB-Mannheim/tesseract/wiki")
51+
print("After installation, ensure the Tesseract installation directory is in your system PATH.")
52+
return
53+
54+
print("Tesseract OCR installed successfully!")
55+
56+
except subprocess.CalledProcessError as e:
57+
print(f"Error installing Tesseract: {str(e)}", file=sys.stderr)
58+
print("Please install Tesseract manually:", file=sys.stderr)
59+
print("- macOS: brew install tesseract", file=sys.stderr)
60+
print("- Ubuntu/Debian: sudo apt-get install tesseract-ocr", file=sys.stderr)
61+
print("- Fedora: sudo dnf install tesseract", file=sys.stderr)
62+
print("- Arch: sudo pacman -S tesseract", file=sys.stderr)
63+
print("- Windows: https://github.com/UB-Mannheim/tesseract/wiki", file=sys.stderr)
64+
sys.exit(1)
65+
66+
if __name__ == "__main__":
67+
install_tesseract()
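
`get_package_manager()` above shells out to `which` on macOS and checks fixed `/usr/bin` paths on Linux. As an alternative sketch, not the committed implementation, the same detection can be written with `shutil.which`, which searches `PATH` without spawning a process. The name `detect_package_manager`, the `CANDIDATES` table, and the injectable `which` parameter are introduced here for illustration and testing.

```python
import platform
import shutil
from typing import Callable, Optional

# Candidate package managers per platform, in the order the diff checks them.
CANDIDATES = {
    "darwin": ["brew"],
    "linux": ["apt", "dnf", "pacman"],
}


def detect_package_manager(
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if there is none."""
    system = (system or platform.system()).lower()
    for name in CANDIDATES.get(system, []):
        if which(name):
            return name
    return None


if __name__ == "__main__":
    manager = detect_package_manager()
    if manager is None:
        print("No supported package manager found; install Tesseract manually.")
    else:
        print(f"Detected package manager: {manager}")
```

Returning `None` for unsupported systems lets a caller tell "nothing to run" apart from "installed successfully", a distinction the committed `install_tesseract()` does not make on Linux when no package manager is detected.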
