Bulk Loader Service

What is the high-level flow

  1. The Shared Data Storage contains an image bucket, where administrators can centralize their registration information. Amazon S3 Inventory enumerates all files in the bucket daily (configurable) and persists the resulting report into an inventory bucket. Writing the report triggers an ObjectCreatedNotification from S3, which is forwarded to the SNS Topic InventoryCreated.

  2. This notification triggers the Bulk Loader Inventory Created Handler to process the report. The function converts the report into an S3BatchOperations_CSV_20180820 compatible format. Finally, it uses the s3:CreateJob method to fan out the import of the list.

  3. Amazon S3 Batch enumerates the list and passes each item to the RivBulkLoaderBatchHandler. This function determines whether the current item qualifies for import (e.g., valid format). The historical import information resides in the object's Amazon S3 Object Tags. After disqualifying files, the function forwards the remaining items into an Amazon SQS Queue (aka Riv Throttled Input Queue).

  4. The RivBulkLoaderThrottledIndexer pulls messages from the Input Queue and forwards them to the UserPortal Gateway. After confirming that the Gateway call succeeded, the function updates the Import History table. Finally, the message is removed from the queue.
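
The report conversion in step 2 can be sketched as follows. This is a minimal illustration, not the handler's actual code: the function name `inventory_to_manifest` is hypothetical, and it assumes the inventory report is a CSV whose first two columns are the bucket name and the object key. The S3BatchOperations_CSV_20180820 manifest only needs those two columns (plus an optional version id, omitted here).

```python
import csv
import io


def inventory_to_manifest(inventory_csv: str) -> str:
    """Reduce an S3 Inventory CSV report to `bucket,key` manifest rows.

    Hypothetical helper: assumes the first two inventory columns are the
    bucket and the (already URL-encoded) object key.
    """
    reader = csv.reader(io.StringIO(inventory_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Skip blank or malformed rows that lack a bucket and key.
        if len(row) < 2:
            continue
        bucket, key = row[0], row[1]
        writer.writerow([bucket, key])
    return out.getvalue()
```

The resulting text would then be uploaded to S3 and referenced as the manifest when creating the batch job; that upload and the job creation call are left out of this sketch.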

abstract.png

What S3 ObjectTags are supported

The bulk importer identifies any supported image (*.png and *.jpeg files) and examines the associated tags.

| Tag Key | Expected Format | Description |
|---|---|---|
| UserId | Unicode string (128 char max) | Required name of the user to create |
| Properties | s3://bucket/path/properties.json | Optional path to Amazon S3 file containing the user's property bag |
| Indexed | True or False | Marker denoting the object has been imported |
| Ignore | True or False | Marker denoting the object should never be processed |
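
As a rough sketch of how these tags could drive the qualification check in the batch handler, consider the function below. The name `qualifies_for_import` and the exact rules are assumptions for illustration only; in particular, treating tag values case-insensitively is a guess, and the real handler may apply additional checks.

```python
from typing import Mapping, Optional

# File suffixes the bulk importer treats as supported images.
SUPPORTED_SUFFIXES = (".png", ".jpeg")


def _is_true(value: Optional[str]) -> bool:
    # Assumption: tag values are compared case-insensitively.
    return (value or "").strip().lower() == "true"


def qualifies_for_import(key: str, tags: Mapping[str, str]) -> bool:
    """Return True if the object looks eligible for import (hypothetical rules)."""
    if not key.lower().endswith(SUPPORTED_SUFFIXES):
        return False
    # UserId is required and limited to 128 characters.
    user_id = tags.get("UserId", "")
    if not user_id or len(user_id) > 128:
        return False
    # Skip objects that are flagged to be ignored or were already imported.
    if _is_true(tags.get("Ignore")) or _is_true(tags.get("Indexed")):
        return False
    return True
```

For example, `qualifies_for_import("images/fred.png", {"UserId": "fred"})` passes, while the same object tagged `Indexed=True` is skipped.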

What is the expected format for properties.json

The user's properties.json must deserialize into a Mapping[str,str] data structure. Supporting any other structure would require extending the Index-Face's StorageWriter. Additionally, the entire user record in DynamoDB cannot exceed 400 KB.

{
    "fname": "Fred",
    "lname": "Flintson",
    "address": "345 Cave Stone Road"
}
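
A properties file can be checked against the Mapping[str,str] requirement before it is uploaded. The sketch below is one possible validation, with `parse_properties` being a hypothetical helper name; it does not check the 400 KB record limit, since that applies to the entire user record rather than to this file alone.

```python
import json
from typing import Dict


def parse_properties(document: str) -> Dict[str, str]:
    """Parse properties.json text and verify it is a flat string-to-string map."""
    data = json.loads(document)
    if not isinstance(data, dict):
        raise ValueError("properties.json must contain a JSON object")
    for name, value in data.items():
        # JSON object keys are always strings, so only values need checking.
        if not isinstance(value, str):
            raise ValueError(f"property {name!r} must be a string")
    return data
```

Nested objects, numbers, and lists are rejected, so a value such as `"age": 42` would need to be written as `"age": "42"`.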