file API support (multi page/frame documents)
faithoflifedev committed Aug 18, 2024
1 parent 4081bc7 commit 9357e89
Showing 13 changed files with 169 additions and 104 deletions.
13 changes: 13 additions & 0 deletions packages/google_vision/CHANGELOG.md
@@ -2,6 +2,19 @@

## 1.3.0

* support for file API
* separation of code base into file and image API
* better support for ImageContext
* deprecate old methods/classes

## 1.2.1+3

* support for file API
* separation of code base into file and image API
* deprecate old methods/classes

## 1.3.0

* custom headers with API Key authentication Issue #23

## 1.3.0
24 changes: 12 additions & 12 deletions packages/google_vision/README.md
@@ -91,18 +91,18 @@ print('done.');

| <div style="width:420px">**Method Signature** | **Description** |
| -------------------- | --------------- |
| Future\<AnnotateImageResponse> **detection**(<br/>&nbsp;&nbsp;JsonImage jsonImage,<br/>&nbsp;&nbsp;AnnotationType annotationType,<br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Lower-level method that runs a single detection type, as specified by annotationType. |
| Future\<CropHintsAnnotation?> **cropHints**(<br/>&nbsp;&nbsp;JsonImage jsonImage,<br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Crop Hints suggests vertices for a crop region on an image. |
| Future\<FullTextAnnotation?> **documentTextDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Extracts text from an image (or file); the response is optimized for dense text and documents, and includes page, block, paragraph, word, and break information. A specific use of documentTextDetection is to detect handwriting in an image. |
| Future\<List\<FaceAnnotation>> **faceDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Face Detection detects multiple faces within an image along with the associated key facial attributes such as emotional state or wearing headwear. |
| Future\<ImagePropertiesAnnotation?> **imageProperties**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | The Image Properties feature detects general attributes of the image, such as dominant color. |
| Future\<List\<EntityAnnotation>> **labelDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Labels can identify general objects, locations, activities, animal species, products, and more. Labels are returned in English only. |
| Future\<List\<EntityAnnotation>> **landmarkDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Landmark Detection detects popular natural and human-made structures within an image. |
| Future\<List\<EntityAnnotation>> **logoDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Logo Detection detects popular product logos within an image. |
| Future\<List\<LocalizedObjectAnnotation>> **objectLocalization**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | The Vision API can detect and extract multiple objects in an image with Object Localization. Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each object in the image. Each LocalizedObjectAnnotation identifies information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object. Object localization identifies both significant and less-prominent objects in an image. |
| Future\<SafeSearchAnnotation?> **safeSearchDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | SafeSearch Detection detects explicit content such as adult content or violent content within an image. This feature uses five categories (adult, spoof, medical, violence, and racy) and returns the likelihood that each is present in a given image. See the SafeSearchAnnotation page for details on these fields. |
| Future\<List\<EntityAnnotation>> **textDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words, and their bounding boxes. |
| Future\<WebDetection?> **webDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{int maxResults = 10,}<br/>) | Web Detection detects Web references to an image. |
| Future\<AnnotateImageResponse> **detection**(<br/>&nbsp;&nbsp;JsonImage jsonImage,<br/>&nbsp;&nbsp;AnnotationType annotationType,<br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Lower-level method that runs a single detection type, as specified by annotationType. |
| Future\<CropHintsAnnotation?> **cropHints**(<br/>&nbsp;&nbsp;JsonImage jsonImage,<br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Crop Hints suggests vertices for a crop region on an image. |
| Future\<FullTextAnnotation?> **documentTextDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Extracts text from an image (or file); the response is optimized for dense text and documents, and includes page, block, paragraph, word, and break information. A specific use of documentTextDetection is to detect handwriting in an image. |
| Future\<List\<FaceAnnotation>> **faceDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Face Detection detects multiple faces within an image along with the associated key facial attributes such as emotional state or wearing headwear. |
| Future\<ImagePropertiesAnnotation?> **imageProperties**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | The Image Properties feature detects general attributes of the image, such as dominant color. |
| Future\<List\<EntityAnnotation>> **labelDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Labels can identify general objects, locations, activities, animal species, products, and more. Labels are returned in English only. |
| Future\<List\<EntityAnnotation>> **landmarkDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Landmark Detection detects popular natural and human-made structures within an image. |
| Future\<List\<EntityAnnotation>> **logoDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Logo Detection detects popular product logos within an image. |
| Future\<List\<LocalizedObjectAnnotation>> **objectLocalization**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | The Vision API can detect and extract multiple objects in an image with Object Localization. Object localization identifies multiple objects in an image and provides a LocalizedObjectAnnotation for each object in the image. Each LocalizedObjectAnnotation identifies information about the object, the position of the object, and rectangular bounds for the region of the image that contains the object. Object localization identifies both significant and less-prominent objects in an image. |
| Future\<SafeSearchAnnotation?> **safeSearchDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | SafeSearch Detection detects explicit content such as adult content or violent content within an image. This feature uses five categories (adult, spoof, medical, violence, and racy) and returns the likelihood that each is present in a given image. See the SafeSearchAnnotation page for details on these fields. |
| Future\<List\<EntityAnnotation>> **textDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words, and their bounding boxes. |
| Future\<WebDetection?> **webDetection**(<br/>&nbsp;&nbsp;JsonImage jsonImage, <br/>&nbsp;&nbsp;{ImageContext? imageContext,<br/>&nbsp;&nbsp;int maxResults = 10,}<br/>) | Web Detection detects Web references to an image. |
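The rows above show that every image-API method now takes an optional `ImageContext`. A minimal sketch of using it with `cropHints` follows; the method, `JsonImage.fromFile`, `ImageContext`, and `CropHintsParams` names are taken from this commit, while the `withJwtFile` authentication call is an assumption — substitute whatever credential setup your project uses.

```dart
import 'dart:io';

import 'package:google_vision/google_vision.dart';

Future<void> main() async {
  // Assumed auth helper — replace with your own credential flow.
  final googleVision =
      await GoogleVision().withJwtFile('service_credentials.json');

  // Pass crop-hint aspect ratios through the new imageContext parameter.
  final cropHints = await googleVision.image.cropHints(
    JsonImage.fromFile(File('sample.jpg')),
    imageContext: ImageContext(
      cropHintsParams: CropHintsParams(aspectRatios: [1.0, 1.77]),
    ),
  );

  print(cropHints?.cropHints);
}
```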

## Usage with Flutter

2 changes: 1 addition & 1 deletion packages/google_vision/lib/meta.dart
@@ -6,4 +6,4 @@ library meta;
import 'dart:convert' show json;

final pubSpec = json.decode(
'{"name":"google_vision","version":"1.3.0","homepage":"https://github.com/faithoflifedev/google_vision/tree/main/packages/google_vision","environment":{"sdk":">=3.2.0 <4.0.0"},"description":"Allows you to add Google Visions image labeling, face, logo, and landmark detection, OCR, and detection of explicit content, into cross platform applications.","dependencies":{"args":"^2.5.0","collection":"^1.18.0","crypto_keys_plus":"^0.4.0","dio":"^5.5.0+1","http":"^1.2.2","image":"^4.1.7","jose_plus":"^0.4.6","json_annotation":"^4.9.0","loggy":"^2.0.3","mime":"^1.0.5","retrofit":"^4.1.0","universal_io":"^2.2.2"},"dev_dependencies":{"build_runner":"^2.4.9","grinder":"^0.9.5","json_serializable":"^6.8.0","lints":"^4.0.0","publish_tools":"^1.0.0+4","retrofit_generator":"^8.1.2"},"executables":{"vision":""},"repository":"https://github.com/faithoflifedev/google_vision"}');
'{"name":"google_vision","version":"1.3.0","homepage":"https://github.com/faithoflifedev/google_vision/tree/main/packages/google_vision","environment":{"sdk":">=3.2.0 <4.0.0"},"description":"Allows you to add Google Visions image labeling, face, logo, and landmark detection, OCR, and detection of explicit content, into cross platform applications.","dependencies":{"args":"^2.5.0","collection":"^1.18.0","crypto_keys_plus":"^0.4.0","dio":"^5.6.0","http":"^1.2.2","image":"^4.1.7","jose_plus":"^0.4.6","json_annotation":"^4.9.0","loggy":"^2.0.3","mime":"^1.0.5","retrofit":"^4.2.0","universal_io":"^2.2.2"},"dev_dependencies":{"build_runner":"^2.4.11","grinder":"^0.9.5","json_serializable":"^6.8.0","lints":"^4.0.0","publish_tools":"^1.0.0+4","retrofit_generator":"^8.2.0"},"executables":{"vision":""},"repository":"https://github.com/faithoflifedev/google_vision"}');
30 changes: 15 additions & 15 deletions packages/google_vision/lib/src/cmd/vision_crop_hint_command.dart
@@ -39,27 +39,27 @@ class VisionCropHintCommand extends VisionHelper {
.map((aspectRatio) => double.parse(aspectRatio))
.toList();

final requests = AnnotateImageRequests(requests: [
AnnotateImageRequest(
jsonImage: JsonImage(byteBuffer: imageFile.readAsBytesSync().buffer),
features: [Feature(type: AnnotationType.cropHints)],
imageContext: aspectRatios != null
? ImageContext(
cropHintsParams: CropHintsParams(aspectRatios: aspectRatios),
)
: null,
)
]);
final imageContext = aspectRatios != null
? ImageContext(
cropHintsParams: CropHintsParams(aspectRatios: aspectRatios),
)
: null;

if (pages != null) {
final annotatedResponses = await annotateFile(imageFile, pages: pages!);
final annotatedResponses = await annotateFile(
imageFile,
imageContext: imageContext,
pages: pages!,
);

print(annotatedResponses.responses);
} else {
final annotatedResponses =
await googleVision.annotate(requests: requests);
final annotatedResponses = await googleVision.image.cropHints(
JsonImage.fromFile(imageFile),
imageContext: imageContext,
);

print(annotatedResponses.responses);
print(annotatedResponses?.cropHints);
}
}
}
2 changes: 2 additions & 0 deletions packages/google_vision/lib/src/cmd/vision_helper_command.dart
@@ -58,12 +58,14 @@ abstract class VisionHelper extends Command {
Future<BatchAnnotateFilesResponse> annotateFile(
File file, {
String? features,
ImageContext? imageContext,
required List<int> pages,
}) async =>
googleVision.file.annotate(requests: [
AnnotateFileRequest(
inputConfig: InputConfig.fromFile(file),
features: getFeatures(features),
imageContext: imageContext,
pages: pages,
)
]);
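The `annotateFile` helper above wraps the new file API for multi-page documents. A standalone sketch of the same call, using only names that appear in this commit (`googleVision.file.annotate`, `AnnotateFileRequest`, `InputConfig.fromFile`, `pages`); the file name and feature choice are illustrative:

```dart
import 'dart:io';

import 'package:google_vision/google_vision.dart';

// Annotate selected pages of a multi-page document (e.g. a PDF)
// with the file API introduced in this commit. Assumes an already
// authenticated GoogleVision instance.
Future<void> annotatePdf(GoogleVision googleVision) async {
  final response = await googleVision.file.annotate(requests: [
    AnnotateFileRequest(
      inputConfig: InputConfig.fromFile(File('document.pdf')),
      features: [Feature(type: AnnotationType.documentTextDetection)],
      // Only these pages of the document are processed.
      pages: [1, 2],
    )
  ]);

  print(response.responses);
}
```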
14 changes: 4 additions & 10 deletions packages/google_vision/lib/src/cmd/vision_safe_search_command.dart
@@ -26,17 +26,11 @@ class VisionSafeSearchCommand extends VisionHelper {
globalResults!['credential-file'],
'https://www.googleapis.com/auth/cloud-vision');

final imageFile = File(argResults!['image-file']).readAsBytesSync();
final imageFile = File(argResults!['image-file']);

final requests = AnnotateImageRequests(requests: [
AnnotateImageRequest(
jsonImage: JsonImage(byteBuffer: imageFile.buffer),
features: [Feature(type: AnnotationType.safeSearchDetection)],
)
]);
final safeSearchDetection = await googleVision.image
.safeSearchDetection(JsonImage.fromFile(imageFile));

final annotatedResponses = await googleVision.annotate(requests: requests);

print(annotatedResponses.responses);
print(safeSearchDetection);
}
}
6 changes: 4 additions & 2 deletions packages/google_vision/lib/src/google_vision_file.dart
@@ -10,8 +10,10 @@ class GoogleVisionFile {
);

/// Run detection and annotation for a batch of requests.
Future<BatchAnnotateFilesResponse> annotate(
{required List<AnnotateFileRequest> requests, String? parent}) {
Future<BatchAnnotateFilesResponse> annotate({
required List<AnnotateFileRequest> requests,
String? parent,
}) {
googleVision.setAuthHeader();

final jsonRequest = <String, dynamic>{
39 changes: 33 additions & 6 deletions packages/google_vision/lib/src/google_vision_image.dart
@@ -12,9 +12,18 @@ class GoogleVisionImage {
/// Run detection and annotation for a batch of requests.
Future<BatchAnnotateImagesResponse> annotate({
required List<AnnotateImageRequest> requests,
String? parent,
}) {
googleVision.setAuthHeader();

final jsonRequest = <String, dynamic>{
'requests': requests,
};

if (parent != null) {
jsonRequest['parent'] = parent;
}

return client.annotate(
GoogleVision.contentType,
{'requests': requests},
@@ -25,16 +34,21 @@
Future<AnnotateImageResponse> detection(
JsonImage jsonImage,
AnnotationType annotationType, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await annotate(
requests: [
AnnotateImageRequest(jsonImage: jsonImage, features: [
Feature(
maxResults: maxResults,
type: annotationType,
),
])
AnnotateImageRequest(
jsonImage: jsonImage,
features: [
Feature(
maxResults: maxResults,
type: annotationType,
),
],
imageContext: imageContext,
)
],
);

@@ -44,11 +58,13 @@
/// Crop Hints suggests vertices for a crop region on an image.
Future<CropHintsAnnotation?> cropHints(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
jsonImage,
AnnotationType.cropHints,
imageContext: imageContext,
maxResults: maxResults,
);

@@ -61,6 +77,7 @@
/// handwriting in an image.
Future<FullTextAnnotation?> documentTextDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -77,6 +94,7 @@
/// head-wear.
Future<List<FaceAnnotation>> faceDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -92,6 +110,7 @@
/// as dominant color.
Future<ImagePropertiesAnnotation?> imageProperties(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -107,6 +126,7 @@
/// species, products, and more. Labels are returned in English only.
Future<List<EntityAnnotation>> labelDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -122,6 +142,7 @@
/// within an image.
Future<List<EntityAnnotation>> landmarkDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -136,6 +157,7 @@
/// Logo Detection detects popular product logos within an image.
Future<List<EntityAnnotation>> logoDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -159,6 +181,7 @@
/// Object localization identifies both significant and less-prominent objects in an image.
Future<List<LocalizedObjectAnnotation>> objectLocalization(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -173,6 +196,7 @@
/// Run Product Search.
Future<ProductSearchResults?> productSearch(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -191,6 +215,7 @@
/// on these fields.
Future<SafeSearchAnnotation?> safeSearchDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -207,6 +232,7 @@
/// extracted string, as well as individual words, and their bounding boxes.
Future<List<EntityAnnotation>> textDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
@@ -221,6 +247,7 @@
/// Web Detection detects Web references to an image.
Future<WebDetection?> webDetection(
JsonImage jsonImage, {
ImageContext? imageContext,
int maxResults = 10,
}) async {
final annotatedResponses = await detection(
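The convenience methods in this file all funnel into the lower-level `detection()` call, which now forwards the optional `imageContext`. A hedged sketch of calling it directly; the `languageHints` field is an assumption about the generated `ImageContext` model (it exists in the underlying Vision API), as is the `textAnnotations` getter on the response — verify both against the generated models:

```dart
import 'dart:io';

import 'package:google_vision/google_vision.dart';

// Direct use of detection() with the new imageContext parameter,
// assuming an already authenticated GoogleVision instance.
Future<void> detectText(GoogleVision googleVision) async {
  final response = await googleVision.image.detection(
    JsonImage.fromFile(File('receipt.jpg')),
    AnnotationType.textDetection,
    // Assumed field name — check the generated ImageContext model.
    imageContext: ImageContext(languageHints: ['en']),
    maxResults: 5,
  );

  print(response.textAnnotations);
}
```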
37 changes: 23 additions & 14 deletions packages/google_vision/lib/src/provider/files.g.dart
