Prepare to release v1.0.0 (#64)

* Add v1.0.0 release notes
* Add public API contract
* Rev version number to v1.0.0
* Move logged_subprocess() from build to plugin_helpers

---------

Signed-off-by: Jeremy Fowers <[email protected]>
onnx · Dec 6, 2023 · 6aae9b5
1 parent d44355d
Showing 8 changed files with 210 additions and 72 deletions.
diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@
 
 We are on a mission to understand and use as many models as possible while leveraging the right toolchain and AI hardware for the job in every scenario. 
 
-Evaluating a deep learning model with a familiar toolchain and hardware accelerator is pretty straightforward. Scaling these evaluations to get apples-to-applies insights across a landscape of millions of permutations of models, toolchains, and hardware targets is not straightforward. Not without help, anyways.
+Evaluating a deep learning model with a familiar toolchain and hardware accelerator is pretty straightforward. Scaling these evaluations to get apples-to-apples insights across a landscape of millions of permutations of models, toolchains, and hardware targets is not straightforward. Not without help, anyways.
 
 TurnkeyML is a *tools framework* that integrates models, toolchains, and hardware backends to make evaluation and actuation of this landscape as simple as turning a key.
 

diff --git a/docs/contribute.md b/docs/contribute.md
@@ -14,6 +14,7 @@ The guidelines document is organized as the following sections:
 - [Pull Requests](#pull-requests)
 - [Testing](#testing)
 - [Versioning](#versioning)
+- [Public APIs](#public-apis)
 
 
 ## Contributing a model
@@ -225,3 +226,46 @@ We don't have any fancy testing framework set up yet. If you want to run tests l
 ## Versioning
 
 We use semantic versioning, as described in [versioning.md](https://github.com/onnx/turnkeyml/blob/main/docs/versioning.md).
+
+## Public APIs
+
+The following public APIs are available for developers. The maintainers aspire to change these as infrequently as possible, and doing so will require an update to the package's major version number.
+
+- From the top-level `__init__.py`:
+    - `turnkeycli`: the `main()` function of the `turnkey` CLI
+    - `benchmark_files()`: the top-level API called by the CLI's `benchmark` command
+    - `build_model()`: API for building a model with a Sequence
+    - `load_state()`: API for loading the state of a previous build
+    - `turnkeyml.version`: The package version number
+- From the `run` module:
+    - The `BaseRT` class: abstract base class used in all runtime plugins
+- From the `common.filesystem` module:
+    - `get_available_builds()`: list the builds in a turnkey cache
+    - `make_cache_dir()`: create a turnkey cache
+    - `MODELS_DIR`: the location of turnkey's model corpus on disk
+    - `Stats`: handle for saving and reading evaluation statistics
+    - `Keys`: reserves keys in the evaluation statistics
+- From the `common.printing` module:
+    - `log_info()`: print an info statement in the style of the turnkey APIs/CLIs 
+    - `log_warning()`: print a warning statement in the style of the turnkey APIs/CLIs
+    - `log_error()`: print an error statement in the style of the turnkey APIs/CLIs
+ - From the `build.export` module:
+    - `onnx_dir()`: location on disk of a build's ONNX files
+    - `ExportPlaceholder(Stage)`: build Stage for exporting models to ONNX
+    - `OptimizeOnnxModel(Stage)`: build Stage for using ONNX Runtime to optimize an ONNX model
+    - `ConvertOnnxToFp16(Stage)`: build Stage for using ONNX ML Tools to downcast an ONNX model to fp16
+ - From the `build.stage` module:
+    - The `Sequence` class: ordered collection of build Stages that define a build flow
+    - The `Stage` class: abstract base class that is used to define a model-to-model transformation 
+ - From the `common.build` module:
+    - The `State` class: data structure that holds the inputs, outputs, and intermediate values for a Sequence 
+ - From the `common.exceptions` module:
+    - `StageError`: exception raised when something goes wrong during a Stage
+    - `ModelRuntimeError`: exception raised when something goes wrong running a model in hardware
+ - From the `run.plugin_helpers` module (everything):
+    - `get_python_path()`: returns the Python executable
+    - `run_subprocess()`: execute a command in a subprocess
+    - `logged_subprocess()`: execute a command in a subprocess while capturing all terminal outputs to a file
+    - `CondaError`: exception raised when something goes wrong in a Conda environment created by TurnkeyML 
+    - `SubprocessError`: exception raised when something goes wrong in a subprocess created by TurnkeyML
+    - `HardwareError`: exception raised when something goes wrong in hardware managed by TurnkeyML
diff --git a/docs/release_notes.md b/docs/release_notes.md
@@ -0,0 +1,82 @@
+# Release Notes
+
+This document tracks the major changes in each package release of TurnkeyML.
+
+We are tracking two types of major changes:
+ - New features that enhance the user and developer experience
+ - Breaking changes to the CLI or public APIs
+
+If you are creating the release notes for a new version, please see the [template](#template-version-majorminorpatch). Release notes should capture all of the significant changes since the last numbered package release.
+
+# Version 1.0.0
+
+This version focuses on paying down technical debt, and most of the changes are not visible to users. It removes cumbersome requirements for developers, removes unused features to streamline the codebase, and clarifies some API naming schemes.
+
+Users, however, will enjoy improved fidelity in their reporting telemetry thanks to the streamlined code.
+
+## Users
+
+### User Improvements
+
+Improvements to the information in `turnkey_stats.yaml` and report CSVs:
+
+ - Now reports all model labels, including, but not limited to, the model's OSS license.
+ - `build_status` and `benchmark_status` now accurately report the status of their respective toolchain phases.
+     - Previously, `benchmark_status` was a superset of the status of both build and benchmark.
+
+### User Breaking Changes
+
+None.
+
+## Developers
+
+### Developer Improvements
+
+ - Build success has been conceptually reworked for Stages/Sequences such that the `SetSuccess` Stage is no longer required at the end of every Sequence.
+   - Previously, `build_model()` would only return a `State` object if the `state.build_status == successful_build`, which in turn had to be manually set in a Stage.
+   - Now, if a Sequence finishes then the underlying toolflow will automatically set `state.build_status = successful_build` on your behalf.
+
+### Developer Breaking Changes
+
+ - The `benchmark_model()` API has been removed as there were no known users / use cases. Anyone who wants to run standalone benchmarking can still instantiate any `BaseRT` child class and call `BaseRT.benchmark()`.
+ - The APIs for saving and loading labels `.txt` files in the cache have been removed since no code was using those APIs. Labels are now saved into `turnkey_stats.yaml` instead.
+ - The `quantization_samples` argument to the `build_model()` API has been removed.
+ - The naming scheme of the members of `Stats` has been adjusted for consistency. It used to refer to both builds and benchmarks as "builds", whereas now it uses "evaluations" as a superset of the two.
+   - `Stats.add_build_stat()` is now `Stats.save_model_eval_stat()`.
+   - `Stats.add_build_sub_stat()` is now `Stats.save_model_eval_sub_stat()`.
+   - `Stats.stat_id` is now `Stats.evaluation_id`.
+   - The `builds` section of the stats/reports is now `evaluations`.
+   - `Stats.save_stat()` is now `Stats.save_model_stat()`.
+   - `Stats.build_stats` is now `Stats.evaluation_stats`.
+ - The `SetSuccess` build stage has been removed because build success has been reworked (see improvements).
+ - The `logged_subprocess()` API has been moved from the `common.build` module to the `run.plugin_helpers` module.
+
+# Version 0.3.0
+
+This version was used to initialize the repository. 
+
+# Template: Version Major.Minor.Patch
+
+Headline statement.
+
+
+
+## Users
+
+### User Improvements
+
+List of enhancements specific to users of the tools.
+
+### User Breaking Changes
+
+List of breaking changes specific to users of the tools.
+
+## Developers
+
+### Developer Improvements
+
+List of enhancements specific to developers who build on the tools.
+
+### Developer Breaking Changes
+
+List of breaking changes specific to developers who build on the tools.
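
Reviewer note: the `Stats` renames listed under the Version 1.0.0 developer breaking changes lend themselves to a mechanical migration. The sketch below is a hypothetical helper, not part of TurnkeyML: it copies the old-to-new names from the release notes and rewrites attribute accesses in a source string. The helper name `migrate_stats_calls` and the variable names in the usage example are assumptions, and the method signatures are not shown in this commit, so arguments are left untouched.

```python
import re

# Old -> new member names, copied from the Version 1.0.0
# "Developer Breaking Changes" list in docs/release_notes.md.
STATS_RENAMES = {
    "add_build_stat": "save_model_eval_stat",
    "add_build_sub_stat": "save_model_eval_sub_stat",
    "stat_id": "evaluation_id",
    "save_stat": "save_model_stat",
    "build_stats": "evaluation_stats",
}

# Match ".<old_name>" only when <old_name> is a whole identifier.
_PATTERN = re.compile(
    r"\.(" + "|".join(re.escape(name) for name in STATS_RENAMES) + r")\b"
)


def migrate_stats_calls(source: str) -> str:
    """Rewrite old Stats member names to their 1.0.0 names in a single pass."""
    return _PATTERN.sub(lambda m: "." + STATS_RENAMES[m.group(1)], source)


if __name__ == "__main__":
    # `stats`, `key`, and `value` are placeholder names for illustration.
    print(migrate_stats_calls("stats.add_build_stat(key, value)"))
```

This is a text-level rewrite, so it would also rename identically named attributes on objects that are not `Stats` instances; review the result before committing it.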
diff --git a/src/turnkeyml/common/build.py b/src/turnkeyml/common/build.py
@@ -18,7 +18,6 @@
 import sklearn.base
 import turnkeyml.common.exceptions as exp
 import turnkeyml.common.tf_helpers as tf_helpers
-import turnkeyml.run.plugin_helpers as plugin_helpers
 from turnkeyml.version import __version__ as turnkey_version
 
 
@@ -445,65 +444,6 @@ def flush(self):
         pass
 
 
-def logged_subprocess(
-    cmd: List[str],
-    cwd: str = os.getcwd(),
-    env: Optional[Dict] = None,
-    log_file_path: Optional[str] = None,
-    log_to_std_streams: bool = True,
-    log_to_file: bool = True,
-) -> None:
-    """
-    This function calls a subprocess and sends the logs to either a file, stdout/stderr, or both.
-
-    cmd             Command that will run o a sbprocess
-    cwd             Working directory from where the subprocess should run
-    env             Evironment to be used by the subprocess (useful for passing env vars)
-    log_file_path   Where logs will be stored
-    log_to_file     Whether or not to store the subprocess's stdout/stderr into a file
-    log_to_std      Whether or not to print subprocess's stdout/stderr to the screen
-    """
-    if env is None:
-        env = os.environ.copy()
-    if log_to_file and log_file_path is None:
-        raise ValueError("log_file_path must be set when log_to_file is True")
-
-    log_stdout = ""
-    log_stderr = ""
-    try:
-        proc = subprocess.run(
-            cmd,
-            check=True,
-            env=env,
-            capture_output=True,
-            cwd=cwd,
-        )
-    except Exception as e:  # pylint: disable=broad-except
-        log_stdout = e.stdout.decode("utf-8")  # pylint: disable=no-member
-        log_stderr = e.stderr.decode("utf-8")  # pylint: disable=no-member
-        raise plugin_helpers.CondaError(
-            f"Exception {e} encountered, \n\nstdout was: "
-            f"\n{log_stdout}\n\n and stderr was: \n{log_stderr}"
-        )
-    else:
-        log_stdout = proc.stdout.decode("utf-8")
-        log_stderr = proc.stderr.decode("utf-8")
-    finally:
-        if log_to_std_streams:
-            # Print log to stdout
-            # This might be useful when this subprocess is being logged externally
-            print(log_stdout, file=sys.stdout)
-            print(log_stderr, file=sys.stdout)
-        if log_to_file:
-            log = f"{log_stdout}\n{log_stderr}"
-            with open(
-                log_file_path,
-                "w",
-                encoding="utf-8",
-            ) as f:
-                f.write(log)
-
-
 def get_system_info():
     os_type = platform.system()
     info_dict = {}

diff --git a/src/turnkeyml/run/onnxrt/execute.py b/src/turnkeyml/run/onnxrt/execute.py
@@ -8,7 +8,6 @@
 import json
 from statistics import mean
 import platform
-import turnkeyml.common.build as build
 import turnkeyml.run.plugin_helpers as plugin_helpers
 
 ORT_VERSION = "1.15.1"
@@ -84,7 +83,7 @@ def execute_benchmark(
     ]
 
     # Execute command and log stdout/stderr
-    build.logged_subprocess(
+    plugin_helpers.logged_subprocess(
         cmd=cmd,
         cwd=os.path.dirname(output_dir),
         log_to_std_streams=False,

diff --git a/src/turnkeyml/run/plugin_helpers.py b/src/turnkeyml/run/plugin_helpers.py
@@ -1,6 +1,8 @@
 import subprocess
 import logging
 import os
+import sys
+from typing import List, Optional, Dict
 
 TIMEOUT = 900
 
@@ -72,3 +74,62 @@ def get_python_path(conda_env_name):
             f"An error occurred while getting Python path for {conda_env_name} environment"
             f"{e.stderr.decode()}"
         )
+
+
+def logged_subprocess(
+    cmd: List[str],
+    cwd: str = os.getcwd(),
+    env: Optional[Dict] = None,
+    log_file_path: Optional[str] = None,
+    log_to_std_streams: bool = True,
+    log_to_file: bool = True,
+) -> None:
+    """
+    This function calls a subprocess and sends the logs to either a file, stdout/stderr, or both.
+
+    cmd                 Command that will run in a subprocess
+    cwd                 Working directory from where the subprocess should run
+    env                 Environment to be used by the subprocess (useful for passing env vars)
+    log_file_path       Where logs will be stored
+    log_to_file         Whether or not to store the subprocess's stdout/stderr into a file
+    log_to_std_streams  Whether or not to print subprocess's stdout/stderr to the screen
+    """
+    if env is None:
+        env = os.environ.copy()
+    if log_to_file and log_file_path is None:
+        raise ValueError("log_file_path must be set when log_to_file is True")
+
+    log_stdout = ""
+    log_stderr = ""
+    try:
+        proc = subprocess.run(
+            cmd,
+            check=True,
+            env=env,
+            capture_output=True,
+            cwd=cwd,
+        )
+    except Exception as e:  # pylint: disable=broad-except
+        log_stdout = e.stdout.decode("utf-8")  # pylint: disable=no-member
+        log_stderr = e.stderr.decode("utf-8")  # pylint: disable=no-member
+        raise CondaError(
+            f"Exception {e} encountered, \n\nstdout was: "
+            f"\n{log_stdout}\n\n and stderr was: \n{log_stderr}"
+        )
+    else:
+        log_stdout = proc.stdout.decode("utf-8")
+        log_stderr = proc.stderr.decode("utf-8")
+    finally:
+        if log_to_std_streams:
+            # Print log to stdout
+            # This might be useful when this subprocess is being logged externally
+            print(log_stdout, file=sys.stdout)
+            print(log_stderr, file=sys.stdout)
+        if log_to_file:
+            log = f"{log_stdout}\n{log_stderr}"
+            with open(
+                log_file_path,
+                "w",
+                encoding="utf-8",
+            ) as f:
+                f.write(log)
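
Reviewer note: `logged_subprocess()` is moved here unchanged, so two behaviors carry over. The `cwd` default is computed once, when the module is imported, and the broad `except Exception` handler reads `e.stdout`, which exceptions such as `FileNotFoundError` (raised when the executable does not exist) do not have. The sketch below shows one possible hardening under those observations. It is a standalone illustration rather than the project's implementation: the name `run_and_log` is hypothetical, and it raises a plain `RuntimeError` instead of `CondaError` to stay self-contained.

```python
import subprocess
from typing import Dict, List, Optional


def run_and_log(  # hypothetical name, not part of TurnkeyML
    cmd: List[str],
    log_file_path: str,
    cwd: Optional[str] = None,
    env: Optional[Dict[str, str]] = None,
) -> None:
    """Run cmd, write its stdout/stderr to log_file_path, and raise on failure."""
    stdout, stderr = "", ""
    try:
        proc = subprocess.run(
            cmd,
            check=True,
            capture_output=True,
            cwd=cwd,  # None means the current directory at call time
            env=env,  # None means inherit the parent's environment
        )
        stdout = proc.stdout.decode("utf-8")
        stderr = proc.stderr.decode("utf-8")
    except subprocess.CalledProcessError as e:
        # Only CalledProcessError is guaranteed to carry the captured output.
        stdout = e.stdout.decode("utf-8")
        stderr = e.stderr.decode("utf-8")
        raise RuntimeError(f"{cmd} exited with code {e.returncode}") from e
    finally:
        # The log is written whether the command succeeded or failed.
        with open(log_file_path, "w", encoding="utf-8") as f:
            f.write(f"{stdout}\n{stderr}")
```

With this shape, a missing executable surfaces as the original `FileNotFoundError` rather than as an `AttributeError` from the handler, and the log file is still created (empty) by the `finally` block.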
diff --git a/src/turnkeyml/version.py b/src/turnkeyml/version.py
@@ -1 +1 @@
-__version__ = "0.4.0"
+__version__ = "1.0.0"
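
Reviewer note: the new "Public APIs" section in docs/contribute.md states that changing a public API requires a major version update, and this commit both moves `logged_subprocess()` and goes from `0.4.0` to `1.0.0`. A minimal sketch of how such a rule could be checked is shown below. The function names are hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_major_bump(old: str, new: str) -> bool:
    """Return True when `new` has a higher major number than `old`."""
    return parse_version(new)[0] > parse_version(old)[0]


if __name__ == "__main__":
    print(is_major_bump("0.4.0", "1.0.0"))
```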
diff --git a/test/unit.py b/test/unit.py
@@ -101,8 +101,8 @@ def test_003_subprocess_logger(self):
         traceback_error_cmd = f"raise ValueError('{traceback_error_msg}')"
 
         # Perform basic test (no exceptions inside logger)
-        cmd = ["python","-c",f"import sys\n{inside_stdout_cmd}\n{inside_sterr_cmd}"]
-        build.logged_subprocess(cmd=cmd, log_file_path=logfile_path)
+        cmd = ["python", "-c", f"import sys\n{inside_stdout_cmd}\n{inside_sterr_cmd}"]
+        plugin_helpers.logged_subprocess(cmd=cmd, log_file_path=logfile_path)
 
         # Make sure we captured everything we intended to capture
         with open(logfile_path, "r", encoding="utf-8") as file:
@@ -111,9 +111,13 @@ def test_003_subprocess_logger(self):
         assert inside_sterr_msg in log_contents
 
         # Perform test with exceptions inside the logger
-        cmd = ["python","-c",f"import sys\n{inside_stdout_cmd}\n{inside_sterr_cmd}\n{traceback_error_cmd}"]
+        cmd = [
+            "python",
+            "-c",
+            f"import sys\n{inside_stdout_cmd}\n{inside_sterr_cmd}\n{traceback_error_cmd}",
+        ]
         with self.assertRaises(plugin_helpers.CondaError):
-            build.logged_subprocess(cmd=cmd, log_file_path=logfile_path)
+            plugin_helpers.logged_subprocess(cmd=cmd, log_file_path=logfile_path)
 
         # Make sure we captured everything we intended to capture
         with open(logfile_path, "r", encoding="utf-8") as file:
@@ -126,16 +130,24 @@ def test_003_subprocess_logger(self):
         subprocess_env = os.environ.copy()
         expected_env_var_value = "Expected Value"
         subprocess_env["TEST_ENV_VAR"] = expected_env_var_value
-        cmd = ["python","-c",f'import os\nprint(os.environ["TEST_ENV_VAR"])']
-        build.logged_subprocess(cmd=cmd, log_file_path=logfile_path, env=subprocess_env)
+        cmd = ["python", "-c", f'import os\nprint(os.environ["TEST_ENV_VAR"])']
+        plugin_helpers.logged_subprocess(
+            cmd=cmd, log_file_path=logfile_path, env=subprocess_env
+        )
         with open(logfile_path, "r", encoding="utf-8") as file:
             log_contents = file.read()
         assert expected_env_var_value in log_contents
 
         # Test log_to_std_streams
-        cmd = ["python","-c",f'print("{outside_stdout_msg}")\nprint("{outside_stderr_msg}")']
+        cmd = [
+            "python",
+            "-c",
+            f'print("{outside_stdout_msg}")\nprint("{outside_stderr_msg}")',
+        ]
         with build.Logger("", logfile_path):
-            build.logged_subprocess(cmd=cmd, log_to_std_streams=True, log_to_file=False)
+            plugin_helpers.logged_subprocess(
+                cmd=cmd, log_to_std_streams=True, log_to_file=False
+            )
         with open(logfile_path, "r", encoding="utf-8") as file:
             log_contents = file.read()
         assert outside_stdout_msg in log_contents