Set up a Python virtual environment with Python >= 3.7
- Clone this repository
git clone https://github.com/amundra02/ai_pipeline.git
- Activate the virtual environment and install the required packages
pip install -r requirements.txt
Check the requirements file first if you think everything is already installed; most of the dependencies are common libraries.
Source Files
Get Cos Client Instance
client = get_cos_client()
| Parameter | Description |
| --- | --- |
| client | COS client instance |
Get Cos resource Instance
resource = get_cos_resource()
| Parameter | Description |
| --- | --- |
| resource | COS resource instance |
Get cloudant instance and database to fetch data
cloudant, db = get_cloudant_client()
| Parameter | Description |
| --- | --- |
| cloudant_client | Cloudant instance; allows access to the Cloudant DB |
| db | Name of the database from which documents are queried |
Get Cos Bucket to upload processed data
bucket_name = get_upload_bucket()
| Parameter | Description |
| --- | --- |
| bucket_name | COS bucket name |
Get Cloudant database name to upload processed metadata
db_name = get_cloudant_processed_db()
| Parameter | Description |
| --- | --- |
| db_name | Cloudant database name |
Read Image From COS
Converts the downloaded streaming body object to a NumPy ndarray.
| Parameter | Description |
| --- | --- |
| client | COS client instance |
| bucket | COS bucket name from which data is fetched |
| file | Name of the file to fetch |
image = read_image(cos, bucket, file)
Returns
| Parameter | Description |
| --- | --- |
| image | File fetched from the COS bucket, as a NumPy array |
Download data from IBM Cloud Object storage
Downloads data from the COS bucket, up to the requested limit.
| Parameter | Description |
| --- | --- |
| limit | Maximum number of documents to return. Possible values: value ≥ 0 |
metadata, image_data, labels = get_data_ibm_cos(limit)
| Parameter | Description |
| --- | --- |
| metadata | List of metadata files |
| image_data | List of images (NumPy arrays) |
| labels | List of labels, one per image |
Download processed data from IBM Cloud Object storage
| Parameter | Description |
| --- | --- |
| limit | Maximum number of documents to return. Possible values: value ≥ 0 |
metadata, image_data, labels, annotations = get_data_ibm_cos(limit)
| Parameter | Description |
| --- | --- |
| metadata | List of metadata files |
| image_data | List of images (NumPy arrays) |
| labels | List of labels, one per image |
| annotations | Annotation details for each image object |
Create and upload metadata document for processed image file to Cloudant database
| Parameter | Description |
| --- | --- |
| metadata | Metadata of the image to be uploaded |
| annotation_meta | Annotation details for the image object |
response = upload_metadata(metadata, annotation_meta)
| Parameter | Description |
| --- | --- |
| response | API response of the POST call |
Write Image to COS
Converts the NumPy ndarray image data into an Image object and stores it in the COS bucket.
| Parameter | Description |
| --- | --- |
| client | COS client instance |
| bucket | COS bucket name where data is uploaded |
| file | Name of the file to upload |
| image | Image data to be uploaded |
write_image_cos(cos, bucket, file, image)
Source Files:
Resize image by specifying width, height, and interpolation method
Resize the input image with the given parameters.
| Parameter | Description |
| --- | --- |
| image | Input image file |
| width | Output image width |
| height | Output image height |
| interpolation | OpenCV interpolation method |
resized_image = resize(image, width, height, interpolation_method)
| Parameter | Description |
| --- | --- |
| resized_image | Resized image |
Get Resized Data
Resizes the input data to the given width and height using the given interpolation method.
| Parameter | Description |
| --- | --- |
| width | Output image width |
| height | Output image height |
| interpolation_method | OpenCV interpolation method |
image_resize = ImageResize(width, height, interpolation_method)
metadata, resized_data, labels = image_resize.get_resized_data()
| Parameter | Description |
| --- | --- |
| metadata | List of metadata files |
| resized_data | List of resized images (NumPy arrays) |
| labels | List of labels, one per image |
Find Contour in an Image
This method finds all the contours in an input image using the selected detection method. It relies on OpenCV to remove noise, detect edges, perform adaptive thresholding, and detect contours.
| Parameter | Description |
| --- | --- |
| image | Input image |
| method | Contour detection method. Possible values: adaptive thresholding (0), edge detection (1). Default: 0 |
contours = find_contours(image, 0)
| Parameter | Description |
| --- | --- |
| contours | Detected contours |
Draw bounding rectangle on an object in an image
Finds the coordinates of the rectangle which contains the object in a given contour and draws the rectangle on an input image.
| Parameter | Description |
| --- | --- |
| contours | Detected contours of an image |
| image | Input image |
| method | Contour detection method. Possible values: adaptive thresholding (0), edge detection (1). Default: 0 |
drawn_image, coordinates = draw_bounding_rectangle(contours, image, 0)
| Parameter | Description |
| --- | --- |
| drawn_image | Image with the rectangle drawn on the object |
| coordinates | Coordinates of the drawn rectangle in the form <x, y, w, h> |
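The <x, y, w, h> form above can be illustrated without OpenCV. The sketch below is a pure-Python stand-in, not the repository's implementation: `bounding_rectangle` is a hypothetical helper, and it assumes a contour is given as a list of integer (x, y) pixel positions.

```python
from typing import Iterable, Tuple


def bounding_rectangle(points: Iterable[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Return (x, y, w, h) of the smallest upright rectangle covering all points.

    Hypothetical helper for illustration; (x, y) is the top-left corner.
    """
    pts = list(points)
    if not pts:
        raise ValueError("contour has no points")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    x, y = min(xs), min(ys)
    # +1 so that both the first and the last pixel fall inside the box
    w = max(xs) - x + 1
    h = max(ys) - y + 1
    return x, y, w, h


# Assumed example contour: four corner-like points of an object
contour = [(12, 30), (40, 28), (44, 61), (10, 58)]
print(bounding_rectangle(contour))
```

The top-left corner plus width and height is the shape that the later YOLO conversion step takes as input.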
Create the annotation details and upload the processed data
Generates the metadata for the processed image data and uploads it to the Cloudant database together with the processed files.
| Parameter | Description |
| --- | --- |
| metadata | Metadata file of an image |
| image | Processed image file |
| label | Label of the processed image |
| coordinates | Annotation coordinates of the image |
upload_processed_image(metadata, image, label, coordinates)
Get annotated data
Get the annotated processed data
annotation = Annotation()
annotated_data = annotation.get_annotated_data()
Source File: Prepare Training
Split the downloaded data into Train & Validation
- Splits the data into training and testing folders using scikit-learn's train/test split with a test size of 20%.
- Creates the label file for each image file.
- Creates a file with all the labels.
| Parameter | Description |
| --- | --- |
| metadata | List of metadata files |
| image_data | List of images (NumPy arrays) |
| labels | List of labels, one per image |
| annotations | Annotation details for each image object |
split_tarin_test_data(metadata, image_data, labels, annotations)
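The 80/20 split can be sketched as follows. This is a standard-library stand-in for illustration only, not scikit-learn and not the repository's code: `split_indices` and the seed are assumptions. It splits indices rather than the data itself so that the metadata, images, labels, and annotations lists stay aligned.

```python
import math
import random
from typing import List, Tuple


def split_indices(n_samples: int, test_size: float = 0.2, seed: int = 42) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and return (train_indices, test_indices).

    Hypothetical helper; the fixed seed only makes the split repeatable.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_samples * test_size)
    return indices[n_test:], indices[:n_test]


# Assumed toy data standing in for the downloaded lists
labels = ["cat", "dog", "cat", "bird", "dog", "cat", "bird", "dog", "cat", "dog"]
images = [f"img_{i}.jpg" for i in range(len(labels))]

train_idx, test_idx = split_indices(len(labels))
# Apply the same indices to every parallel list to keep them in step
train_images = [images[i] for i in train_idx]
test_images = [images[i] for i in test_idx]
train_labels = [labels[i] for i in train_idx]
test_labels = [labels[i] for i in test_idx]
print(len(train_images), "train /", len(test_images), "test")
```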
Create Yolo label file
- Creates the label file for each image file, one line per object, in the format <object-class> <x_center> <y_center> <width> <height>.
- The label file has the same name as the image file.
| Parameter | Description |
| --- | --- |
| annotation | Annotation coordinates for the object in an image |
| filename | Name of the file to be created |
| image | Image object |
| label_id | Label id of the object label |
create_yolo_label_file(annotation, filename, image, label_id)
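A YOLO label file holds one line per object: the class id followed by the normalized center x, center y, width, and height. The sketch below shows how such a file could be written; `write_yolo_label` and its arguments are hypothetical and do not mirror the signature of `create_yolo_label_file`.

```python
import tempfile
from pathlib import Path


def write_yolo_label(image_name, label_id, box, out_dir):
    """Write one YOLO label line to a .txt file named after the image.

    Hypothetical helper. `box` is (x_center, y_center, width, height),
    already normalized to the 0..1 range.
    """
    x_center, y_center, width, height = box
    line = f"{label_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}\n"
    # Same stem as the image, .txt extension: cat_01.jpg -> cat_01.txt
    label_path = Path(out_dir) / Path(image_name).with_suffix(".txt").name
    label_path.write_text(line)
    return label_path


# Assumed example values, written to a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    path = write_yolo_label("cat_01.jpg", 0, (0.5, 0.5, 0.25, 0.4), tmp)
    print(path.name, "->", path.read_text().strip())
```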
Convert annotation coordinates in Yolo format
Converts the standard annotation coordinates of an object to YOLO format: <x_center> <y_center> <width> <height>, each relative to the image dimensions.
| Parameter | Description |
| --- | --- |
| coordinates | Coordinates of the object in an image |
| width | Image width |
| height | Image height |
x, y, w, h = get_yolo_format_annotations(coordinates, width, height)
| Parameter | Description |
| --- | --- |
| x | x_center relative to the width of the image |
| y | y_center relative to the height of the image |
| w | Width of the object relative to the width of the image |
| h | Height of the object relative to the height of the image |
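The arithmetic behind this conversion is short enough to show directly. The sketch below assumes the input coordinates are a pixel box in <x, y, w, h> form with (x, y) as the top-left corner; `to_yolo_format` is a hypothetical name, and the repository's `get_yolo_format_annotations` may differ in detail.

```python
def to_yolo_format(coordinates, image_width, image_height):
    """Convert a pixel box (x, y, w, h) to normalized YOLO values.

    Hypothetical helper: assumes (x, y) is the top-left corner of the box.
    Returns (x_center, y_center, width, height), each divided by the
    matching image dimension.
    """
    x, y, w, h = coordinates
    x_center = (x + w / 2) / image_width
    y_center = (y + h / 2) / image_height
    return x_center, y_center, w / image_width, h / image_height


# Assumed example: a 200x100 box at (100, 50) inside a 400x200 image
print(to_yolo_format((100, 50, 200, 100), 400, 200))  # (0.5, 0.5, 0.5, 0.5)
```

Normalizing by the image size keeps the labels valid when the network resizes the image during training.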
Create file with all the classes
Creates an obj.names file which contains all the available classes in a data sample.
| Parameter | Description |
| --- | --- |
| classes | Set of all the available object classes |
create_class_names_file(classes)
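In Darknet, the names file lists one class per line, and the line position is the class id used in the label files. A minimal sketch under that convention follows; `write_class_names` is hypothetical, and sorting the set is an assumption made here only to keep the ids stable between runs.

```python
import tempfile
from pathlib import Path


def write_class_names(classes, path):
    """Write one class name per line and return a name -> id mapping.

    Hypothetical helper. Sorting is an assumption: a set has no fixed
    order, so sorting keeps class ids stable between runs.
    """
    names = sorted(classes)
    Path(path).write_text("\n".join(names) + "\n")
    return {name: idx for idx, name in enumerate(names)}


# Assumed example classes, written to a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    names_file = Path(tmp) / "obj.names"
    ids = write_class_names({"dog", "cat", "bird"}, names_file)
    print(ids)
    print(names_file.read_text())
```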
Append content of a directory in a file
- Lists the contents of the given data directory in a file. This is used to list all the train and test file names with the jpg extension, which are an input to the Yolo algorithm.
- Each filename is listed with its path relative to the darknet directory.
| Parameter | Description |
| --- | --- |
| content_path | Data directory |
| filename | File in which all the content will be listed |
append_dir_content_in_file(content_path, filename)
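One way to build such a listing is sketched below. `list_images` and its `base_dir` argument are hypothetical: `base_dir` stands in for the darknet directory that the listed paths are made relative to.

```python
import tempfile
from pathlib import Path


def list_images(content_path, list_file, base_dir, extension=".jpg"):
    """Append the path of every image in content_path to list_file.

    Hypothetical helper. Paths are written relative to base_dir, one per
    line, in sorted order. Returns the number of images listed.
    """
    base = Path(base_dir)
    images = sorted(
        p for p in Path(content_path).iterdir() if p.suffix.lower() == extension
    )
    # Append mode, so repeated calls extend the same listing file
    with open(list_file, "a") as fh:
        for image in images:
            fh.write(image.relative_to(base).as_posix() + "\n")
    return len(images)


# Assumed example layout inside a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    train_dir = base / "data" / "train"
    train_dir.mkdir(parents=True)
    for name in ("b.jpg", "a.jpg", "a.txt"):
        (train_dir / name).touch()
    listing = base / "Train.txt"
    list_images(train_dir, listing, base)
    print(listing.read_text())
```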
Create a file which contains the training details for Yolo
- append location of Train.txt file which contains path to all the training files
- append location of Test.txt file which contains path to all the validation files
- append location of classes (obj.names) file which contains all the class names
- append location of backup directory which will be used for training backups
| Parameter | Description |
| --- | --- |
| backup_dir | Path of backup directory |
append_training_details(backup_dir)
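The resulting file follows the key = value layout of a Darknet data file. The sketch below writes one; `write_training_details`, the `data/` paths, and the output file name `obj.data` are assumptions for illustration. The `classes` entry is not in the list above but is part of the usual Darknet data file, so it is included here as an assumption.

```python
import tempfile
from pathlib import Path


def write_training_details(path, n_classes, train_list, valid_list, names_file, backup_dir):
    """Write a Darknet-style data file and return its lines.

    Hypothetical helper; all paths are written exactly as given.
    """
    lines = [
        f"classes = {n_classes}",   # assumption: not listed in the README steps
        f"train = {train_list}",    # file listing the training images
        f"valid = {valid_list}",    # file listing the validation images
        f"names = {names_file}",    # file listing the class names
        f"backup = {backup_dir}",   # directory for training backups
    ]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


# Assumed example paths, written to a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "obj.data"
    write_training_details(
        data_file, 3, "data/Train.txt", "data/Test.txt", "data/obj.names", "backup/"
    )
    print(data_file.read_text())
```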
Download Pretrained weights for Yolo Custom training
Downloads the pretrained weight file from the darknet repository.
download_pretrained_weights()