Developer Guide

Setup

Set up a Python virtual environment with Python >= 3.7.

  1. Clone this repository
       git clone https://github.com/amundra02/ai_pipeline.git
  2. Activate the virtual environment and install the required packages
       pip install -r requirements.txt

If you think you already have everything installed, check the requirements file first. Most of the dependencies are common libraries.

Pipeline Buckets

Data Connections

Source Files

Methods

Get COS Client Instance
Response
 client = get_cos_client()
Parameter Description
client COS client instance
Get COS resource Instance
Response
 resource = get_cos_resource()
Parameter Description
resource COS resource instance
Get Cloudant instance and database to fetch data
Response
 cloudant, db = get_cloudant_client()
Parameter Description
cloudant Cloudant instance, which allows access to the Cloudant DB
db Name of the database from which documents are queried
Get COS Bucket to upload processed data
Response
 bucket_name = get_upload_bucket()
Parameter Description
bucket_name COS bucket name
Get Cloudant database name to upload processed metadata
Response
 db_name = get_cloudant_processed_db()
Parameter Description
db_name Cloudant database name
Read Image From COS

Convert the downloaded streaming body objects to a numpy ndarray.

Request
Parameter Description
client COS client instance
bucket COS bucket name from which data is fetched
file Name of the file to fetch
Response
 image = read_image(cos, bucket, file)

Returns

Parameter Description
image File fetched from the COS bucket, as a numpy array
Download data from IBM Cloud Object Storage

Download the data from the COS bucket as specified by the request.

Request
Parameter Description
limit specify the number of documents to limit the results to. Possible values: value ≥ 0
Response
 metadata, image_data, labels = get_data_ibm_cos(limit)
Parameter Description
metadata List of metadata files
image_data List of images (numpy array)
labels List of labels, one per image
Download processed data from IBM Cloud Object storage
Request
Parameter Description
limit specify the number of documents to limit the results to. Possible values: value ≥ 0
Response
 metadata, image_data, labels, annotations = get_data_ibm_cos(limit)
Parameter Description
metadata List of metadata files
image_data List of images (numpy array)
labels List of labels, one per image
annotations Annotation details for each image object
Create and upload metadata document for processed image file to Cloudant database
Request
Parameter Description
metadata metadata of image to be uploaded
annotation_meta Annotation details for image object
Response
 response = upload_metadata(metadata, annotation_meta)
Parameter Description
response API response of the POST call
Write Image to COS

Convert the numpy ndarray image data into an Image object and store the data in the COS bucket.

Request
Parameter Description
client COS client instance
bucket COS bucket name where data is uploaded
file Name of the file to upload
image Image data to be uploaded
Response
    write_image_cos(cos, bucket, file, image)

Data Preprocessing

Source Files:

Methods

Resize image by specifying width, height, and interpolation method

Resize the input image with the given parameters.

Request
Parameter Description
image Input image file
width Output image width
height Output image height
interpolation OpenCV interpolation method
Response
 resized_image = resize(image, width, height, interpolation_method)
Parameter Description
resized_image Resized image
Get Resized Data

Resize the input data as per the specification.

Request
Parameter Description
width Output image width
height Output image height
interpolation_method OpenCV interpolation method
Response
 image_resize = ImageResize(width, height, interpolation_method)
 metadata, resized_data, labels = image_resize.get_resized_data()
Parameter Description
metadata List of metadata files
resized_data List of resized images (numpy array)
labels List of labels, one per image
Find Contour in an Image

Finds all the contours in an input image using the selected method. It uses OpenCV methods to remove noise, detect edges, perform adaptive thresholding, and detect contours.

Request
Parameter Description
image Input image
method Contour detection method. Possible values: adaptive thresholding (0), edge detection (1). Default: 0
Response
 contours = find_contours(image, 0)
Parameter Description
contours detected contours
Draw bounding rectangle on an object in an image

Finds the coordinates of the rectangle which contains the object in a given contour and draws the rectangle on an input image.

Request
Parameter Description
contours detected contours of an image
image Input image
method Contour detection method. Possible values: adaptive thresholding (0), edge detection (1). Default: 0
Response
 drawn_image, coordinates = draw_bounding_rectangle(contours, image, 0)
Parameter Description
drawn_image Image with rectangle on the object
coordinates Coordinates of the drawn rectangle in the form <x, y, w, h>
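
As a rough illustration of what the <x, y, w, h> coordinates describe, the sketch below derives an upright bounding box from a list of contour points in plain Python. The helper `bounding_rectangle` is hypothetical and is not the guide's `draw_bounding_rectangle`, which relies on OpenCV. The pixel-inclusive width and height (the `+ 1`) are an assumption about the convention in use.

```python
from typing import Iterable, Tuple


def bounding_rectangle(points: Iterable[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Return (x, y, w, h) of the smallest upright box covering all pixel points.

    Hypothetical helper for illustration only.
    """
    pts = list(points)
    if not pts:
        raise ValueError("contour has no points")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    x_min, y_min = min(xs), min(ys)
    # Assumption: both the first and the last pixel column/row belong to the box.
    w = max(xs) - x_min + 1
    h = max(ys) - y_min + 1
    return x_min, y_min, w, h


# Made-up contour points, given as (x, y) pixel positions.
contour = [(10, 20), (40, 25), (35, 60), (12, 55)]
box = bounding_rectangle(contour)
print(box)
```

Here `x` and `y` are the top-left corner of the box, which is the form the annotation steps later in this guide consume.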
Create the annotation details and upload the processed data

Generate the metadata for the processed image data and upload the new metadata to the Cloudant database with the processed meta files.

Request
Parameter Description
metadata Metadata file of an image
image Processed image file
label Label of processed image
coordinates Annotation coordinates of image
Response
 upload_processed_image(metadata, image, label, coordinates)
Get annotated data

Get the annotated processed data.

Response
 annotation = Annotation()
 annotated_data = annotation.get_annotated_data()

Feature Engineering

Algorithm Selection

Training Infrastructure

Source File: Prepare Training

Methods

Split the downloaded data into Train & Validation
  • Splits the data into training and testing folders using scikit-learn's train_test_split with a test size of 20%.
  • Creates the label file for each image file.
  • Creates a file with all the labels.
Request
Parameter Description
metadata List of metadata files
image_data List of images (numpy array)
labels List of labels, one per image
annotations Annotation details for each image object
Response
  split_tarin_test_data(metadata, image_data, labels, annotations)
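
The 80/20 split described above can be sketched without external dependencies. This is a standard-library stand-in for scikit-learn's `train_test_split`, not the project's implementation. The helper name `split_train_test`, the fixed seed, and the sample file names are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def split_train_test(items: Sequence, test_size: float = 0.2, seed: int = 42) -> Tuple[List, List]:
    """Shuffle items and hold out a `test_size` fraction for testing.

    Hypothetical stand-in for sklearn.model_selection.train_test_split.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up (file name, label) pairs standing in for the downloaded data.
samples = [(f"img_{i}.jpg", "cat" if i % 2 else "dog") for i in range(10)]
train, test = split_train_test(samples)

# The set of labels is what would later feed the file listing all classes.
classes = sorted({label for _, label in samples})
print(len(train), len(test), classes)
```

Keeping each image paired with its label while shuffling avoids the two lists drifting out of step.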
Create Yolo label file
  • Creates the label file for each image file with format <object-class> <x_center> <y_center> <width> <height>.
  • The label file has the same name as the image.
Request
Parameter Description
annotation Annotation coordinates for object in an image
filename Name of the file to be created
image Image object
label_id Label id of object label
Response
  create_yolo_label_file(annotation, filename, image, label_id)
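
A minimal sketch of writing such a label file follows. It assumes the usual darknet layout of one line per object (class id followed by the normalised centre and size) and a `.txt` file that shares the image's base name. The helper `write_yolo_label` and the sample values are hypothetical; the actual `create_yolo_label_file` may differ.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def write_yolo_label(image_path: Path, label_id: int, box: Tuple[float, float, float, float]) -> Path:
    """Write one YOLO label line into a .txt file named after the image.

    Hypothetical helper; `box` is assumed to be already normalised to 0..1.
    """
    x_center, y_center, width, height = box
    label_path = image_path.with_suffix(".txt")
    line = f"{label_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}\n"
    label_path.write_text(line)
    return label_path


with tempfile.TemporaryDirectory() as tmp:
    # Made-up image name; only the path is needed, not the image itself.
    label_file = write_yolo_label(Path(tmp) / "sample_001.jpg", 0, (0.5, 0.5, 0.25, 0.4))
    label_name = label_file.name
    label_text = label_file.read_text()

print(label_name, label_text, end="")
```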
Convert annotation coordinates in Yolo format

Converts the standard annotation coordinates of an object into the Yolo format: x_center, y_center, width and height, each relative to the image dimensions.

Request
Parameter Description
coordinates coordinates for object in an image
width image width
height image height
Response
  x, y, w, h = get_yolo_format_annotations(coordinates, width, height)
Parameter Description
x x_center relative to width of image
y y_center relative to height of image
w width of object relative to width of image
h height of object relative to height of image
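
The conversion can be sketched as below, assuming the input coordinates are in the <x, y, w, h> pixel form with (x, y) as the top-left corner of the box. The function name `to_yolo_format` is hypothetical, and the real `get_yolo_format_annotations` may handle its input differently.

```python
from typing import Tuple


def to_yolo_format(coordinates: Tuple[float, float, float, float],
                   image_width: float,
                   image_height: float) -> Tuple[float, float, float, float]:
    """Convert a pixel box (x, y, w, h) into normalised YOLO values.

    Hypothetical helper; assumes (x, y) is the top-left corner of the box.
    """
    x, y, w, h = coordinates
    x_center = (x + w / 2) / image_width    # box centre, relative to image width
    y_center = (y + h / 2) / image_height   # box centre, relative to image height
    return x_center, y_center, w / image_width, h / image_height


# A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image.
result = to_yolo_format((100, 50, 200, 100), 400, 200)
print(result)
```

All four values fall between 0 and 1, so the labels stay valid if the image is later resized.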
Create file with all the classes

Create an obj.names file which contains all the available classes in a data sample.

Request
Parameter Description
classes Set of all the available object classes
Response
  create_class_names_file(classes)
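
One possible shape for this step is sketched below: the class names are written one per line, and the line index doubles as the label id. Sorting the set first is an assumption made here so that ids stay stable between runs. The helper `write_class_names` is hypothetical and is not the guide's `create_class_names_file`.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterable


def write_class_names(classes: Iterable[str], path: Path) -> Dict[str, int]:
    """Write one class name per line and return a name -> label id mapping.

    Hypothetical helper; sorting is an assumption, used to keep ids stable.
    """
    names = sorted(classes)
    Path(path).write_text("\n".join(names) + "\n")
    return {name: idx for idx, name in enumerate(names)}


with tempfile.TemporaryDirectory() as tmp:
    names_path = Path(tmp) / "obj.names"
    # Made-up class set for illustration.
    label_ids = write_class_names({"dog", "cat", "bird"}, names_path)
    names_lines = names_path.read_text().splitlines()

print(names_lines, label_ids)
```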
Append content of a directory in a file
  • List the contents of given data directory in a file. This is used to list all the train and test file names with jpg extension which is an input to Yolo algorithm.
  • This lists each filename with its path relative to the darknet directory.
Request
Parameter Description
content_path data directory
filename filename where all the content will be listed
Response
  append_dir_content_in_file(content_path, filename)
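
A minimal sketch of such a listing step is shown below, assuming the relative paths are computed against a base directory passed in explicitly. The helper `list_images`, its `base_dir` parameter, and the directory layout are hypothetical; the actual `append_dir_content_in_file` takes only the two parameters documented above.

```python
import os
import tempfile
from pathlib import Path


def list_images(content_path: Path, list_file: Path, base_dir: Path) -> int:
    """Append the path of every .jpg in content_path to list_file.

    Hypothetical helper; paths are written relative to base_dir.
    """
    images = sorted(p for p in Path(content_path).iterdir() if p.suffix.lower() == ".jpg")
    with open(list_file, "a") as out:
        for image in images:
            relative = os.path.relpath(image, base_dir).replace(os.sep, "/")
            out.write(relative + "\n")
    return len(images)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)  # stands in for the darknet directory
    train_dir = base / "data" / "train"
    train_dir.mkdir(parents=True)
    for name in ("a.jpg", "b.jpg", "a.txt"):
        (train_dir / name).touch()  # empty placeholder files
    count = list_images(train_dir, base / "Train.txt", base)
    listed = (base / "Train.txt").read_text().splitlines()

print(count, listed)
```

Opening the list file in append mode means repeated calls add to it, so it may need to be cleared before a fresh run.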
Create a file which contains the training details for Yolo
  • append location of Train.txt file which contains path to all the training files
  • append location of Test.txt file which contains path to all the validation files
  • append location of classes( obj.names ) file which contains all the class names
  • append location of backup directory which will be used for training backups
Request
Parameter Description
backup_dir Path of backup directory
Response
  append_training_details(backup_dir)
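
The resulting file might look like the output of the sketch below, which follows the usual darknet data-file layout of `key = value` lines. The file name `obj.data`, the `classes` count line (not mentioned in the list above), the helper `write_training_details`, and all paths are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def write_training_details(data_file: Path, num_classes: int, train_list: str,
                           valid_list: str, names_file: str, backup_dir: str) -> None:
    """Append darknet-style `key = value` training details to data_file.

    Hypothetical helper; the `classes` entry is an assumption based on the
    common darknet layout rather than on this guide.
    """
    entries = [
        ("classes", num_classes),
        ("train", train_list),    # file listing all training images
        ("valid", valid_list),    # file listing all validation images
        ("names", names_file),    # file with one class name per line
        ("backup", backup_dir),   # where training backups are stored
    ]
    with open(data_file, "a") as out:
        for key, value in entries:
            out.write(f"{key} = {value}\n")


with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "obj.data"
    # Made-up paths, relative to the darknet directory.
    write_training_details(data_path, 3, "data/Train.txt", "data/Test.txt",
                           "data/obj.names", "backup/")
    details = data_path.read_text().splitlines()

print("\n".join(details))
```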
Download Pretrained weights for Yolo Custom training

Download weight file from darknet repository

Response
  download_pretrained_weights()

Model Deployment

Continuous Improvement