This project is an advanced image processing and organization tool that utilizes machine learning models for various image-related tasks. It features a backend powered by Hugging Face's transformers.js, a frontend for user interaction, and a Bash script to automate image renaming based on generated captions.
- Image Captioning: Automatically generates captions for images.
- Object Detection: Identifies objects present in an image.
- Emotion Detection: Recognizes emotions from facial expressions.
- OCR (Optical Character Recognition): Extracts text from handwritten or printed images.
- Image Upscaling: Enhances image resolution.
- Automated Image Renaming: A Bash script renames images based on their generated captions.
The backend is built using Node.js and Express, utilizing the @huggingface/transformers library for AI-powered image processing.
- Express Server: Handles API requests.
- Multer: Handles file uploads.
- Hugging Face Models: Loads and runs various image processing models.
- Dynamic Route Generation: Each model has its own API endpoint.
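As a rough illustration of dynamic route generation, the sketch below derives one handler per entry in a model table and adds a listing route. It is not taken from the project's source: the `MODELS` table, the task names, and the `buildRoutes` and `runTask` names are all assumptions, and the sketch deliberately avoids Express and the model library so it can run on its own.

```javascript
// Hypothetical model table. The paths match the endpoints documented in this
// README; the task names are assumptions and may differ from the project's.
const MODELS = [
  { path: "/generate-caption", task: "image-to-text" },
  { path: "/detect-obj", task: "object-detection" },
  { path: "/detect-emo", task: "image-classification" },
  { path: "/generate-ocr", task: "image-to-text" },
  { path: "/upscale", task: "image-to-image" },
];

// Build one handler per model entry, plus a /routes handler that lists every
// registered path. `runTask(task, image)` is injected by the caller, so the
// sketch does not depend on any particular inference library.
function buildRoutes(models, runTask) {
  const routes = new Map();
  for (const { path, task } of models) {
    routes.set(path, (image) => runTask(task, image));
  }
  routes.set("/routes", () => [...routes.keys()]);
  return routes;
}
```

In a real server, each entry of the returned map would presumably be registered as a POST route behind the upload middleware; adding a model then only requires a new row in the table.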
Endpoint | Functionality |
---|---|
/generate-caption | Generates captions for images |
/detect-obj | Detects objects in an image |
/detect-emo | Recognizes facial emotions |
/generate-ocr | Extracts text from images |
/upscale | Enhances image resolution |
/routes | Lists all available endpoints |
- Install dependencies: `npm install`
- Start the server: `node app.js`
- The server runs at http://localhost:5100
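Once the server is running, a client can send an image to one of the endpoints as a multipart upload. The sketch below only builds the request; it assumes Node.js 18 or later (for the global `FormData`, `Blob`, and `fetch`). The form field name `image` and the helper name `buildUploadRequest` are assumptions, so check the server's Multer configuration for the actual field name. The response format is not documented here, so the usage comment simply prints the raw body.

```javascript
// Build a multipart upload request for one endpoint.
// ASSUMPTION: the server expects the file under the form field "image".
function buildUploadRequest(endpoint, bytes, filename, baseUrl = "http://localhost:5100") {
  const form = new FormData();
  form.append("image", new Blob([bytes]), filename);
  return {
    url: new URL(endpoint, baseUrl).href,
    options: { method: "POST", body: form },
  };
}

// Usage (requires the backend to be running):
// const fs = require("node:fs");
// const { url, options } = buildUploadRequest(
//   "/generate-caption", fs.readFileSync("image.jpg"), "image.jpg");
// const response = await fetch(url, options);
// console.log(await response.text());
```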
A Bash script automates renaming images based on AI-generated captions.
- Sends an image to the API.
- Retrieves the generated caption.
- Renames the image using the caption.
- Supports parallel processing for faster execution.
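The renaming step can be pictured as a small pure function that turns a caption into a file name. The version below is a sketch in JavaScript rather than the Bash script's actual logic: the function name `captionToFilename`, the set of stripped characters, and the `untitled` fallback are assumptions, while the space replacement mirrors the idea behind the `-sr` flag.

```javascript
// Turn a caption into a file name, keeping the original file's extension.
// ASSUMPTION: the sanitizing rules here are illustrative, not the script's own.
function captionToFilename(caption, originalName, spaceReplacement = "_") {
  const dot = originalName.lastIndexOf(".");
  const ext = dot > 0 ? originalName.slice(dot) : "";
  const base = caption
    .trim()
    .replace(/[\/\\:*?"<>|]/g, "") // drop characters that are unsafe in file names
    .replace(/\s+/g, () => spaceReplacement); // function form avoids "$" patterns
  return (base || "untitled") + ext;
}

console.log(captionToFilename("a cat sitting on a couch", "image.jpg"));
// prints "a_cat_sitting_on_a_couch.jpg"
```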
./rename_images.sh -p 5 -sr _ -r responses.txt -sf image.jpg
Flag | Description |
---|---|
-p | Set concurrency level (default: 3) |
-sr | Space replacement in captions |
-sf | Process a single file instead of a directory |
-r | Save API responses to a file |
-d | Set search depth for images in directories |
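To make the effect of the concurrency level (`-p`) concrete, the sketch below runs a worker over a list of items with at most `limit` tasks in flight at once. It is an illustration in JavaScript, not how the Bash script is implemented, and `mapWithConcurrency` is a hypothetical helper name.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at a time.
// Results are returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all lanes

  async function lane() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const width = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: width }, () => lane()));
  return results;
}

// Usage: caption up to 3 images at a time (captionImage is a placeholder).
// const captions = await mapWithConcurrency(files, 3, (file) => captionImage(file));
```

A higher limit shortens total run time only until the server becomes the bottleneck, since model inference on the backend is usually the slow part.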
The frontend is a React.js application that provides an intuitive UI for users to upload images and receive processed outputs.
- Upload images for processing.
- Select the desired processing task.
- View results with a clean UI.
- Monitor the processing status.
- Navigate to the frontend directory: `cd frontend`
- Install dependencies: `npm install`
- Start the development server: `npm start`
- Open http://localhost:3000 in your browser.
- Node.js & npm
- Bash (for script execution)
- Clone the repository: `git clone https://github.com/Sahil-958/ims.git`
- Set up and run the backend.
- Run the frontend.
- Use the script to rename images if needed.
- Implementing user authentication.
- Extending support for more image transformations.
- Adding a database to store processing history.