Changes from all commits
52 commits
17c5fd7
updated travis for conda to update pip
Nov 5, 2018
d7d4ea3
updated README for build status, updated .travis.yml for codecoverage
Nov 5, 2018
99a0e75
travis debug
Nov 5, 2018
1ea37fc
travis debug
Nov 5, 2018
99f7ebc
travis debug
Nov 5, 2018
e259059
travis debug
Nov 5, 2018
1601d38
travis debug
Nov 5, 2018
4b4bb3e
travis debug
Nov 5, 2018
c7c6593
travis debug
Nov 5, 2018
19a13f2
travis debug
Nov 5, 2018
e9bafa8
travis debug
Nov 5, 2018
e6381f2
updated rio-tiler vesrion
Nov 5, 2018
e924aa7
updated testing for all_close
Nov 5, 2018
4a70153
updated test to use gtiff
Nov 5, 2018
319b07a
updated tests/test_main.py to compare as int
Nov 5, 2018
aa53751
updated .travis.yml for codecov
Nov 5, 2018
ec27791
updated .travis.yml for codecov
Nov 5, 2018
477d956
updtaed tile_generator to delete debug print statement
Nov 5, 2018
804a376
Tasking intereaction (#2)
dlindenbaum Nov 7, 2018
b2f7399
Tasking intereaction (#3)
dlindenbaum Nov 7, 2018
cc53c71
Merge branch 'dev' of github.com:SpaceNetChallenge/ml-export-tool int…
Nov 7, 2018
d9ad7fd
created base model for mlwork
Nov 7, 2018
3d7f682
updated baseline mlmodel class
Nov 7, 2018
9b10fe2
updated tile_generator to remove print statements
Nov 8, 2018
1d3ee36
updated to allow for pass through of indexes
Nov 8, 2018
0542482
updated to use desired_zoom_level as an absolute zoom level as oppose…
Nov 9, 2018
8e23860
updated to use requests library for downloading tiles
Nov 10, 2018
4b9ea80
updated notebooks/Write_Tile_To_COG as demonstration
Nov 10, 2018
95a87b3
updated tile_aggregrator to use pytorch datagenerator class
Nov 10, 2018
6e3500a
updated setup.py for affine, tqdm, pytorch and torchvision
Nov 10, 2018
bc76b97
updated ml_tools/mlbase to have tf_serving class
Nov 11, 2018
a546253
updated tile_aggregator to use torch dataloader for speed enhancement…
Nov 11, 2018
6063789
updated tile_generator for use of dataloader
Nov 11, 2018
889265c
updated noetbook for Tile_To_COG
Nov 11, 2018
b7b3b68
updated tests
Nov 12, 2018
628398a
updated Write_tile_To_COG
Nov 12, 2018
b592392
updated tile_aggregrator to have LZW rio-cog profile
Nov 12, 2018
2f2bbc3
updated setup.py for sat-stac requirement
Nov 28, 2018
92050c5
updated ml_export utils for s3 functionality
Dec 3, 2018
d97ce89
update nginx.conf for creater body size
Dec 3, 2018
c1f1b7f
added scripts processing_tile for demonstration of upload of tiles
Dec 3, 2018
9dd37bb
updated postprocessing for raster to geojson
Dec 3, 2018
af8f2d9
updated ml_tools opencv for inference
Dec 3, 2018
c48634b
updatign readme.md
Dec 3, 2018
6c10b98
added basic stac_tools
Dec 3, 2018
5536331
updated .travis.yml for opencv requirement
Dec 4, 2018
237c81c
Merge pull request #4 from SpaceNetChallenge/ml_interface
nrweir Dec 19, 2018
28743da
code cleanup
nrweir Dec 19, 2018
9484d25
restyling code and adding notes
nrweir Dec 19, 2018
38a19c1
adding docs and comments, code cleanup for remaining modules
nrweir Dec 19, 2018
f0753bb
fixing class name for MLModel
nrweir Dec 20, 2018
3fb54f1
Merge pull request #5 from SpaceNetChallenge/nw_doc
nrweir Dec 20, 2018
7 changes: 7 additions & 0 deletions .idea/dictionaries/dlindenbaum.xml

Some generated files are not rendered by default.

14 changes: 10 additions & 4 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -19,10 +19,16 @@ install:
   - conda update -q conda
   # Useful for debugging any issues with conda
   - conda info -a
-  - conda create --yes -n ml-export python=$TRAVIS_PYTHON_VERSION pip
+  - conda create --yes -n ml-export python=$TRAVIS_PYTHON_VERSION pip=18.1
   - source activate ml-export
-  - conda install -c conda-forge rtree
-  - pip install -q -e .[test]
+  - conda install -c conda-forge rtree pytest opencv
+  - conda info -a
+  - pip install -e .[test]
+  - conda info -a
+  - conda list
   # command to run tests
 script:
-  - pytest # or py.test for Python versions 3.5 and below
+  - source activate ml-export & pytest --cov=./ #--log-level=INFO #--cov=./# or py.test for Python versions 3.5 and below
+  - codecov


11 changes: 11 additions & 0 deletions Docker/Dockerfile
@@ -0,0 +1,11 @@
# Use the developmentseed/looking-glass TF Serving image from Docker Hub as the base image
FROM developmentseed/looking-glass:latest

# Install NGINX, used to reverse-proxy predictions from SageMaker to TF Serving
RUN apt-get update && apt-get install -y --no-install-recommends nginx git

# Copy NGINX configuration to the container
COPY nginx.conf /etc/nginx/nginx.conf

# starts NGINX and TF serving pointing to our model
ENTRYPOINT service nginx start | /usr/bin/tf_serving_entrypoint.sh
26 changes: 26 additions & 0 deletions Docker/nginx.conf
@@ -0,0 +1,26 @@
events {
# determines how many requests can simultaneously be served
# https://www.digitalocean.com/community/tutorials/how-to-optimize-nginx-configuration
# for more information
worker_connections 2048;
}

http {

client_max_body_size 50M;
server {
# configures the server to listen to the port 8080
listen 8080 deferred;

# redirects requests from SageMaker to TF Serving
location /invocations {
proxy_pass http://localhost:8501/v1/models/looking_glass_export:predict;
}

# Used by SageMaker to confirm the server is alive.
location /ping {
return 200 "OK";
}
}
}

32 changes: 31 additions & 1 deletion README.MD
@@ -1,4 +1,5 @@
# Creating an ML-Export Tool
[![Build Status](https://travis-ci.com/SpaceNetChallenge/ml-export-tool.svg?branch=dev)](https://travis-ci.com/SpaceNetChallenge/ml-export-tool)

## User Story

@@ -12,16 +13,45 @@ A user would like to perform machine learning against an area. They provide an
3. Output formats for result.


## Export End Points
# Interface End Points

### GET
1. TMS
2. Vector Tiles
3. GeoJson
4. Cloud Optimized GeoTiff

### Push:
1. New ML Prediction
2. New STAC-ITEM



## Storage Layer:
We will use the [SpatioTemporal Asset Catalog (STAC) spec](https://github.com/radiantearth/stac-spec) for storage documentation.
STAC-ITEMS: Each machine learning output will be stored as a STAC Item. Binary masks can be stored as Cloud Optimized GeoTIFFs to enable easy processing.

* STAC-ITEMS can be added to Catalog Collections based on TaskID.
STAC-ITEMS produced by the ML Service should have at least 3 assets:

* results_cog
* results_cog_binary
* results_cog_geojson

* STAC-ITEMs for other APIs can be documented:
* [AI for Earth](https://github.com/Microsoft/AIforEarth-API-Development/blob/master/Quickstart.md)
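As a sketch, a STAC Item produced by the ML service might carry the three required assets like this. All IDs, hrefs, bounds, and timestamps below are hypothetical, and only the asset keys come from the spec above:

```python
# Hypothetical minimal STAC Item for an ML result; field names follow the
# STAC Item spec, but every value here is invented for illustration.
stac_item = {
    "type": "Feature",
    "id": "ml-task-0001-tile-42",
    "bbox": [-43.30, -22.97, -43.25, -22.93],
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[-43.30, -22.97], [-43.25, -22.97],
                         [-43.25, -22.93], [-43.30, -22.93],
                         [-43.30, -22.97]]],
    },
    "properties": {"datetime": "2018-12-03T00:00:00Z"},
    "assets": {
        # The three assets the ML Service should always attach:
        "results_cog": {"href": "s3://bucket/task/results.tif",
                        "type": "image/tiff"},
        "results_cog_binary": {"href": "s3://bucket/task/results_binary.tif",
                               "type": "image/tiff"},
        "results_cog_geojson": {"href": "s3://bucket/task/results.geojson",
                                "type": "application/geo+json"},
    },
}

print(sorted(stac_item["assets"]))
```

Items shaped like this can then be grouped into catalog collections by TaskID, as described above.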



### Why STAC:
* STAC is an industry initiative to create an overarching, cloud-native, searchable catalog of geospatial data.
* Machine Learning outputs in mapping situations are Spatial and Temporal Items.
* This will allow cumulative machine learning results to be queryable and should allow flexibility as we add new types of data.
* An example of the SpaceNet STAC-Browser can be found at [SpaceNet-STAC](https://spacenet-stac.netlify.com/)



## Test items:

Test Location 1:
Empty file added ml_export/api/__init__.py
Empty file.
3 changes: 3 additions & 0 deletions ml_export/ml_tools/__init__.py
@@ -0,0 +1,3 @@
"""ml-tools"""

__version__ = '0.1'
145 changes: 145 additions & 0 deletions ml_export/ml_tools/mlbase.py
@@ -0,0 +1,145 @@
import json
import logging

import numpy as np
import requests

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)


class MLModel:

    """MLModel base class.

    The ML model base class should have four methods:

    __init__: initialize the model from a JSON string describing it.
    load_model_dict: load the model into memory based on the provided
        model dictionary.
    predict: receive a np array of shape (3, 1024, 1024) and return a
        np array of shape (1, 1024, 1024).
    predict_batch: receive a list of np arrays of [np(3, 1024, 1024)] and
        return a list of [np(1, 1024, 1024)].

    Example model dictionary::

        model_dictionary = {'model_file': "test.hdf5",
                            "model_description": "Passthrough Model",
                            "model_version": "0.1",
                            "model_speed": 20,  # numpy arrays per second
                            }
    """

    def __init__(self, model_json_string, debug=False):
        """Initialize the model using a JSON string ID for the model."""
        self.logger = logging.getLogger(__name__)
        # Create a stream handler for the log messages
        logger_handler = logging.StreamHandler()
        # Create a formatter for formatting the log messages
        logger_formatter = logging.Formatter(
            '%(name)s - %(levelname)s - %(message)s')
        # Add the formatter to the handler
        logger_handler.setFormatter(logger_formatter)
        # Add the handler to the logger
        if debug:
            self.logger.setLevel(logging.DEBUG)
        else:
            self.logger.setLevel(logging.INFO)
        self.logger.addHandler(logger_handler)
        # Assign the model dictionary
        self.model_json = model_json_string
        # Load the model into memory
        self.load_model_dict()

    def estimate_time(self, tiles_length):
        """Return a completion estimate in seconds.

        ``model_speed`` is in numpy arrays per second, so the estimate
        is tiles / speed.
        """
        return tiles_length / self.model_dict['model_speed']

    def load_model_dict(self):
        self.model_dict = json.loads(self.model_json)

    def predict(self, np_array):
        # TODO: IMPLEMENT! For now, pass band 0 through with a channel axis.
        return np_array[None, 0, :, :]

    def predict_batch(self, list_np_array):
        # TODO: IMPLEMENT! For now, apply the same passthrough per array.
        list_np_array_results = []
        for np_array in list_np_array:
            list_np_array_results.append(np_array[None, 0, :, :])
        return list_np_array_results
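The intended contract can be exercised without importing the module; the slicing below mirrors the placeholder `predict`, and the arithmetic follows the docstring's `model_speed` units (arrays per second):

```python
import json
import numpy as np

# The model dictionary travels as a JSON string, as MLModel expects.
model_json = json.dumps({"model_file": "test.hdf5",
                         "model_description": "Passthrough Model",
                         "model_version": "0.1",
                         "model_speed": 20})  # numpy arrays per second
model_dict = json.loads(model_json)

tile = np.zeros((3, 1024, 1024), dtype=np.uint8)
# The placeholder predict() keeps band 0 and adds a leading channel axis.
prediction = tile[None, 0, :, :]
print(prediction.shape)               # (1, 1024, 1024)

# With model_speed in arrays/second, 100 tiles take 100 / 20 = 5 seconds.
print(100 / model_dict["model_speed"])  # 5.0
```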


class MLTFServing:
    """MLTFServing model class.

    Same interface as MLModel, but prediction is delegated to a
    TF Serving REST endpoint:

    __init__: initialize with the endpoint location.
    load_model_dict: load the model into memory based on the provided
        model dictionary.
    predict: receive a np array of shape (3, 1024, 1024) and return a
        np array of shape (1, 1024, 1024).
    predict_batch: receive a batch of np arrays of [np(3, 1024, 1024)] and
        return a list of [np(1, 1024, 1024)].
    """

    def __init__(self, api_location, output_num_channels=1, debug=False):
        """Initialize the model."""
        self.logger = logging.getLogger(__name__)
        # Create a stream handler for the log messages
        logger_handler = logging.StreamHandler()
        # Create a formatter for formatting the log messages
        logger_formatter = logging.Formatter(
            '%(name)s - %(levelname)s - %(message)s')
        # Add the formatter to the handler
        logger_handler.setFormatter(logger_formatter)
        # Add the handler to the logger
        if debug:
            self.logger.setLevel(logging.DEBUG)
        else:
            self.logger.setLevel(logging.INFO)
        self.logger.addHandler(logger_handler)

        # Assign the prediction endpoint
        self.predict_api_loc = api_location
        self.num_channels = output_num_channels
        self.model_speed = 1

        # Load the model into memory
        self.load_model_dict()

    def estimate_time(self, tiles_length):
        """Return a completion estimate in seconds."""
        return tiles_length / self.model_speed

    def load_model_dict(self):
        # TODO: IMPLEMENT
        return 0

    def predict(self, np_array):
        # TODO: IMPLEMENT
        return np_array[None, 0, :, :]

    def predict_batch(self, super_res_tile_batch):
        # Convert (N, C, H, W) uint8 tiles to (N, H, W, C) floats in [0, 1]
        inputs = np.moveaxis(super_res_tile_batch, 1, 3).astype(np.float32) / 255
        payload = {'inputs': inputs.tolist()}
        # Send the prediction request to TF Serving
        r = requests.post(self.predict_api_loc, json=payload)
        content = json.loads(r.content)
        # Reshape the flat outputs into (N, 256, 256) masks
        all_image_preds = np.asarray(content['outputs']).reshape(len(inputs),
                                                                 256, 256)
        # Restore the channel axis: (N, 1, 256, 256)
        all_image_preds = all_image_preds[:, np.newaxis, :, :]

        return all_image_preds
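The NCHW-to-NHWC preprocessing and the response reshaping in `predict_batch` can be sanity-checked without a running TF Serving instance; the response array below is fabricated to stand in for the server's `outputs` field:

```python
import numpy as np

# Two RGB tiles in NCHW layout, as the dataloader would supply them.
batch = np.full((2, 3, 256, 256), 255, dtype=np.uint8)

# Same preprocessing as predict_batch: NCHW -> NHWC, scaled to [0, 1].
inputs = np.moveaxis(batch, 1, 3).astype(np.float32) / 255
print(inputs.shape)   # (2, 256, 256, 3)
print(inputs.max())   # 1.0

# A TF Serving JSON response carries flat 'outputs'; reshape back into
# per-tile masks and restore the channel axis, as predict_batch does.
fake_outputs = np.zeros((2 * 256 * 256,))
preds = fake_outputs.reshape(len(inputs), 256, 256)[:, np.newaxis, :, :]
print(preds.shape)    # (2, 1, 256, 256)
```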
54 changes: 54 additions & 0 deletions ml_export/ml_tools/mlopencv.py
@@ -0,0 +1,54 @@
# Note: for Mac OS X compatibility, import something from shapely.geometry
# before importing fiona, geopandas, or rasterio:
# https://github.com/Toblerity/Shapely/issues/553
from shapely import geometry  # noqa: F401 (import-order workaround above)
import mercantile
from ml_export import tile_generator
from ml_export.ml_tools.mlbase import MLModel
from torch.utils.data import Dataset


class MLOpenCV(MLModel):

    def __init__(self, model):
        super().__init__(model)


class OpenCVClassDataset(Dataset):

    def __init__(self, root_tile_obj, raster_location,
                 desired_zoom_level, super_res_zoom_level,
                 cog=True,
                 tile_size=256,
                 indexes=None):

        self.root_tile_obj = root_tile_obj
        self.desired_zoom_level = desired_zoom_level
        self.super_res_zoom_level = super_res_zoom_level
        self.raster_location = raster_location
        self.cog = cog
        self.tile_size = tile_size

        if indexes is None:
            self.indexes = [1, 2, 3]
        else:
            self.indexes = indexes

        small_tile_object_list, small_tile_position_list = \
            tile_generator.create_super_tile_list(
                root_tile_obj, desired_zoom_level=desired_zoom_level)
        self.small_tile_object_list = small_tile_object_list
        self.small_tile_position_list = small_tile_position_list  # this isn't used anywhere?

    def __len__(self):
        return len(self.small_tile_object_list)

    def __getitem__(self, idx):
        super_res_tile = tile_generator.create_super_tile_image(
            self.small_tile_object_list[idx],
            self.raster_location,
            desired_zoom_level=self.super_res_zoom_level,
            indexes=self.indexes,
            tile_size=self.tile_size,
            cog=self.cog)
        return super_res_tile, mercantile.xy_bounds(
            *self.small_tile_object_list[idx])
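`OpenCVClassDataset` follows the torch `Dataset` protocol (`__len__` and `__getitem__`), which `DataLoader` relies on for batched, parallel tile fetching. The protocol can be illustrated with a torch-free stand-in; the class and tile names here are invented for illustration:

```python
class ToyTileDataset:
    """Torch-free stand-in showing the __len__/__getitem__ contract."""

    def __init__(self, tiles):
        self.tiles = tiles

    def __len__(self):
        # DataLoader uses this to know how many samples exist.
        return len(self.tiles)

    def __getitem__(self, idx):
        # OpenCVClassDataset returns (image, bounds); mimic that pair shape.
        return self.tiles[idx], ("bounds-of", self.tiles[idx])


ds = ToyTileDataset(["tile-a", "tile-b", "tile-c"])
print(len(ds))    # 3
print(ds[1][0])   # tile-b
```

Any object with these two methods can be wrapped in `torch.utils.data.DataLoader` for batching and multi-worker loading.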
45 changes: 45 additions & 0 deletions ml_export/postprocessing/__init__.py
@@ -0,0 +1,45 @@
import geopandas as gpd
import rasterio
import rasterio.features
import rasterio.warp
from shapely.geometry import shape


def create_geojson(raster_name, geojson_name, threshold=0.5):
    """Binarize and vectorize a raster file.

    This function takes an image mask with float values between ``0`` and
    ``1`` and converts it to a binary mask, which it then polygonizes. Use
    the `threshold` argument to set the minimum pixel intensity value that
    will be included in the vectorized polygon outputs.

    Arguments
    ---------
    raster_name : str
        Path to the raster mask file to vectorize.
    geojson_name : str
        Desired output path for the geojson file.
    threshold : float, optional
        Minimum pixel intensity to include in the vectorized polygons.
        Defaults to ``0.5``.

    """
    geomList = []
    with rasterio.open(raster_name) as dataset:
        # Read the dataset's valid data mask as an ndarray.
        data = dataset.read()
        data[data >= threshold] = 1
        data[data < threshold] = 0
        mask = data == 1
        # Extract feature shapes and values from the array.
        for geom, val in rasterio.features.shapes(data, mask=mask,
                                                  transform=dataset.transform):
            # Transform shapes from the dataset's own coordinate
            # reference system to CRS84 (EPSG:4326).
            geom = rasterio.warp.transform_geom(
                dataset.crs, 'EPSG:4326', geom, precision=6)
            # Convert the GeoJSON dict to a shapely geometry for geopandas.
            geomList.append(shape(geom))
    gdf = gpd.GeoDataFrame(geometry=geomList)
    gdf.crs = {'init': 'epsg:4326'}
    gdf.to_file(geojson_name, driver="GeoJSON")

    return geojson_name
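The in-place thresholding step that `create_geojson` applies before polygonizing can be checked in isolation with plain numpy, with no raster I/O:

```python
import numpy as np

data = np.array([[0.2, 0.5], [0.7, 0.49]], dtype=np.float32)
threshold = 0.5
# Same binarization as create_geojson: >= threshold becomes 1, rest 0.
data[data >= threshold] = 1
data[data < threshold] = 0
print(data.tolist())  # [[0.0, 1.0], [1.0, 0.0]]
```

The order matters only slightly: values already set to ``1`` are never below a threshold in ``[0, 1]``, so the second assignment cannot undo the first.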
Empty file.
Empty file.
16 changes: 16 additions & 0 deletions ml_export/stac_tools/stac_collection_base.py
@@ -0,0 +1,16 @@



def create_stac_collection(stac_version="0.6.0",
                           id='taskm01',
                           title='Tasking Manager',
                           keywords='Machine-Learning, Remote-Sensing, computervision, ml',
                           version='0.1',
                           license='CC-BY-SA-4.0',
                           providers='SpaceNet',
                           extent=None,
                           temporal=None):
    # TODO: IMPLEMENT
    pass
Empty file.