Detection, Classification and Tracking | Generation |
---|---|
This repository contains simple, fundamental computer vision projects (under development).
Computer vision is a field of artificial intelligence (AI) that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs — and take actions or make recommendations based on that information. If AI enables computers to think, computer vision enables them to see, observe and understand.
Computer vision works much the same as human vision, except humans have a head start. Human sight has the advantage of lifetimes of context to train how to tell objects apart, how far away they are, whether they are moving and whether there is something wrong in an image.
Computer vision trains machines to perform these functions, but it has to do it in much less time with cameras, data and algorithms rather than retinas, optic nerves and a visual cortex. Because a system trained to inspect products or watch a production asset can analyze thousands of products or processes a minute, noticing imperceptible defects or issues, it can quickly surpass human capabilities.
The project is divided into the following topics:
- Image
  - Face Detection and Cropping
  - Facial Emotion Detection
  - Facial Demographics Estimation
  - Classification (Use case: Food Image Classification)
  - Regression (Use case: Image Arousal and Valence Prediction)
  - Object Detection
  - Semantic Segmentation
  - Instance Segmentation
  - Panoptic Segmentation
  - Image Similarity
  - Scene Text Detection and Extraction
  - Body Pose Detection
  - Classification (Use case: Action Detection)
  - Persian Captioning
  - Unconditional Generation (Use case: Butterfly Image Generation)
  - Conditional Generation
  - .....
- Video
  - Face Detection and Tracking
  - Facial Emotion Detection and Tracking
  - Facial Demographics Estimation
  - Object Detection and Tracking
  - Semantic Segmentation
  - Semantic Clustering
  - Scene Text Detection and Extraction
  - Body Pose Detection
  - Classification (Use case: Action Detection)
  - Captioning
  - Conditional Generation
  - .....
- Vision Large Language Model
  - VLLM