Releases: snowflakedb/snowflake-ml-python
1.1.2
Bug Fixes
- Generic: Fix an issue where the stack trace was unexpectedly hidden by telemetry.
- Model Development: Execute model signature inference without materializing full dataframe in memory.
- Model Registry: Fix occasional 'snowflake-ml-python library does not exist' error when deploying to SPCS.
Behavior Changes
- Model Registry: When calling `predict` with a Snowpark DataFrame, either inferred or normalized column names are accepted.
- Model Registry: When logging a Snowpark ML Modeling model, sample input data or a manually provided signature will be ignored since they are not necessary.
New Features
- Model Development: SQL implementation of binary `precision_score` metric.
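For reference, binary precision is the fraction of predicted positives that are truly positive. The plain-Python sketch below only illustrates that definition. It is not the SQL implementation shipped in the library, and the helper name `binary_precision` and its zero-division convention are assumptions made for illustration.

```python
def binary_precision(y_true, y_pred, pos_label=1):
    """Precision = TP / (TP + FP) for a single positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == pos_label and t == pos_label)
    fp = sum(1 for t, p in pairs if p == pos_label and t != pos_label)
    predicted_positive = tp + fp
    # Assumed convention: return 0.0 when nothing was predicted positive.
    return tp / predicted_positive if predicted_positive else 0.0


y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
precision = binary_precision(y_true, y_pred)  # 2 true positives, 1 false positive
```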
1.1.1
Bug Fixes
- Model Registry: The `predict` target method on registered models is now compatible with unsupervised estimators.
- Model Development: Fix incorrect `confusion_matrix` results when the number of rows is not divisible by the batch size.
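To illustrate the kind of edge case behind the `confusion_matrix` fix, here is a minimal plain-Python sketch of batched accumulation that still counts a final, shorter batch. It is not the library's distributed implementation; `batched_confusion_matrix` and its parameters are hypothetical names used only for this example.

```python
def batched_confusion_matrix(y_true, y_pred, labels, batch_size):
    """Accumulate a confusion matrix batch by batch (rows = true, cols = predicted)."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    # range() with a step yields a last slice that may be shorter than
    # batch_size; slicing past the end is safe, so no rows are dropped.
    for start in range(0, len(y_true), batch_size):
        batch_true = y_true[start:start + batch_size]
        batch_pred = y_pred[start:start + batch_size]
        for t, p in zip(batch_true, batch_pred):
            matrix[index[t]][index[p]] += 1
    return matrix


# Five rows with batch_size=2 leaves a trailing batch of one row.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
result = batched_confusion_matrix(y_true, y_pred, labels=[0, 1], batch_size=2)
```

Whatever the batch size, the entries of the matrix should sum to the number of input rows, which makes a convenient sanity check.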
Behavior Changes
New Features
- Introduced the `passthrough_col` param in the Modeling API. This param is helpful in scenarios that require automatic `input_cols` inference but need to exclude specific columns, such as index columns, from training or inference.
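The idea can be sketched in plain Python: when input columns are inferred automatically, any passthrough columns are left out of the feature set. This is a simplified illustration under stated assumptions, not the Modeling API itself. `infer_input_cols` is a hypothetical helper, and its `passthrough_col` argument is named after the param in the note above.

```python
def infer_input_cols(all_cols, label_cols=(), passthrough_col=()):
    """Return the columns to use as features, keeping their original order."""
    # Label columns and passthrough columns are both excluded from the inputs.
    excluded = set(label_cols) | set(passthrough_col)
    return [col for col in all_cols if col not in excluded]


columns = ["ROW_ID", "AGE", "INCOME", "LABEL"]
inferred = infer_input_cols(
    columns,
    label_cols=["LABEL"],
    passthrough_col=["ROW_ID"],  # index-like column that should not be a feature
)
```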
1.1.0
Bug Fixes
- Model Registry: Fix pandas DataFrame input not handling the first row properly.
- Model Development: OrdinalEncoder and LabelEncoder output_columns do not need to be valid Snowflake identifiers. They would previously be excluded if the normalized name did not match the name specified in output_columns.
Behavior Changes
New Features
- Model Registry: Add support for invoking a public endpoint on an SPCS service, by providing an "enable_ingress" SPCS deployment option.
- Model Development: Add support for distributed HPO: GridSearchCV and RandomizedSearchCV execution will be distributed on multi-node warehouses.
1.0.12
Bug Fixes
- Model Registry: Fix a regression where container logs were not shown during model deployment to SPCS.
- Model Development: Enhance the column capacity of OrdinalEncoder.
- Model Registry: Fix unbound `batch_size` error when deploying a model other than a Hugging Face Pipeline or LLM with GPU on SPCS.
Behavior Changes
- Model Registry: Raise an early error when deploying to SPCS with a db/schema that starts with an underscore.
- Model Registry: The `conda-forge` channel is now automatically added to channel lists when deploying to SPCS.
- Model Registry: `relax_version` will not strip all version specifiers; instead it will relax the `==x.y.z` specifier to `>=x.y,<(x+1)`.
- Model Registry: A Python version with a different patch level but the same major and minor version will not result in a warning when loading the model via Model Registry, and will be considered for use when deploying to SPCS.
- Model Registry: When logging a `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` object, versions of locally installed libraries won't be picked up as dependencies of the model; instead, some pre-defined dependencies will be used to improve user experience.
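The `relax_version` rule described above (an exact `==x.y.z` pin becomes `>=x.y,<(x+1)`) can be illustrated with a small sketch. This is not the library's code. `relax_pinned_version` is a hypothetical helper, and leaving every other kind of specifier untouched is an assumption made for this example.

```python
import re

# Matches "name==x.y.z" with optional surrounding whitespace.
_PIN = re.compile(r"\s*([A-Za-z0-9_.\-]+)\s*==\s*(\d+)\.(\d+)\.(\d+)\s*")


def relax_pinned_version(requirement):
    """Relax 'name==x.y.z' to 'name>=x.y,<(x+1)'."""
    match = _PIN.fullmatch(requirement)
    if match is None:
        # Assumption: anything that is not an exact x.y.z pin is returned as-is.
        return requirement
    name, major, minor, _patch = match.groups()
    return f"{name}>={major}.{minor},<{int(major) + 1}"


relaxed = relax_pinned_version("numpy==1.23.5")
unchanged = relax_pinned_version("pandas>=1.0")
```

Under this rule the pin `numpy==1.23.5` becomes `numpy>=1.23,<2`, so any later release in the same major version still satisfies the requirement.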
New Features
- Model Registry: Enable best-effort SPCS job/service log streaming when logging level is set to INFO.
1.0.11
New Features
- Model Registry: Add log_artifact() public method.
- Model Development: Add support for `kneighbors`.
Behavior Changes
- Model Registry: Change log_model() argument from TrainingDataset to List of Artifact.
- Model Registry: Change get_training_dataset() to get_artifact().
Bug Fixes
- Model Development: Fix support for XGBoost and LightGBM models using SKLearn Grid Search and Randomized Search model selectors.
- Model Development: DecimalType is now supported as a DataType.
- Model Development: Fix metrics compatibility with Snowpark DataFrames that use Snowflake identifiers.
1.0.10
Behavior Changes
- Model Development: precision_score, recall_score, f1_score, fbeta_score, precision_recall_fscore_support, mean_absolute_error, mean_squared_error, and mean_absolute_percentage_error metric calculations are now distributed.
- Model Registry: `deploy` will now return `Deployment` for deployment information.
New Features
- Model Registry: When the model signature is auto-inferred, it will be printed to the log for reference.
- Model Registry: For SPCS deployment, `Deployment` details will contain `image_name`, `service_spec` and `service_function_sql`.
Bug Fixes
- Model Development: Fix an issue leading to UTF-8 decoding errors when using modeling modules on Windows.
- Model Development: Fix an issue where alias definitions caused `SnowparkSQLUnexpectedAliasException` in inference.
- Model Registry: Fix an issue where signature inference could be incorrect when using a Snowpark DataFrame as sample input.
- Model Registry: Fix overly strict data type validation when predicting. Now, for example, if you have an INT8 type feature in the signature and provide an INT64 dataframe whose values are all within the INT8 range, prediction will not fail.
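The relaxed validation in the last item can be pictured as a range check on the values rather than a check on the declared type. The sketch below is plain Python under stated assumptions, not the registry's validation code. `fits_int8` is a hypothetical helper, and the bounds are the standard signed 8-bit range.

```python
INT8_MIN, INT8_MAX = -128, 127  # standard signed 8-bit integer range


def fits_int8(values):
    """Return True if every value can be represented as a signed 8-bit integer."""
    return all(INT8_MIN <= value <= INT8_MAX for value in values)


# Values that arrive in a wider integer column but stay within the INT8 range.
within_range = fits_int8([1, -5, 127])
# A single out-of-range value is enough to fail the check.
out_of_range = fits_int8([1, 300])
```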
[1.0.9]
Behavior Changes
- Model Development: log_loss metric calculation is now distributed.
Bug Fixes
- Model Registry: Fix an issue where building images fails with specific docker setups.
- Model Registry: Fix an issue where the local ML library could not be embedded when the library is imported by `zipimport`.
- Model Registry: Fix out-of-date doc about the `platform` argument in the `deploy` function.
- Model Registry: Fix an issue where a GPU-trained PyTorch model could not be deployed to a platform where GPU is not available.
[1.0.8]
Bug Fixes
- Model Development: Ordinal encoder can be used with mixed input column types.
- Model Registry: Fix an issue where an incorrect docker executable is used when building images.
- Model Registry: Fix an issue where specifying the `token` argument when using `snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel` with `transformers < 4.32.0` is not effective.
- Model Registry: Fix an issue where an incorrect system function call is used when deploying to SPCS.
- Model Registry: Fix an issue when using a `transformers.pipeline` that does not have a `tokenizer`.
- Model Registry: Fix incorrectly-inferred image repository name during model deployment to SPCS.
- Model Registry: Fix GPU resource retention issue caused by failed or stuck previous deployments in SPCS.
[1.0.7]
Bug Fixes
- Model Development & Model Registry: Fix an error related to pandas.io.json.json_normalize.
[1.0.6]
New Features
- Model Registry: Added `create_if_not_exists` parameter in constructor.
- Model Registry: Added get_or_create_model_registry API.
- Model Registry: Added support for using GPU inference when deploying XGBoost (`xgboost.XGBModel` and `xgboost.Booster`), PyTorch (`torch.nn.Module` and `torch.jit.ScriptModule`) and TensorFlow (`tensorflow.Module` and `tensorflow.keras.Model`) models to Snowpark Container Services.
- Model Registry: When inferring model signature, `Sequence` of built-in types, `Sequence` of `numpy.ndarray`, `Sequence` of `torch.Tensor` and `Sequence` of `tensorflow.Tensor` can be used instead of only `List` of them.
- Model Registry: Added `get_training_dataset` API.
- Model Development: Size of metrics result can exceed previous 8MB limit.
- Model Registry: Added support for saving, loading, and deploying a HuggingFace pipeline object (`transformers.Pipeline`) and our wrapper for it (`snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel`). When the wrapper is used to specify configurations, the model for the pipeline will be loaded dynamically when deploying. Currently, the following tasks can be logged without manually specifying model signatures:
  - "conversational"
  - "fill-mask"
  - "question-answering"
  - "summarization"
  - "table-question-answering"
  - "text2text-generation"
  - "text-classification" (alias "sentiment-analysis" available)
  - "text-generation"
  - "token-classification" (alias "ner" available)
  - "translation"
  - "translation_xx_to_yy"
  - "zero-shot-classification"
Bug Fixes
- Model Development: Fixed a bug when using simple imputer with numpy >= 1.25.
- Model Development: Fixed a bug when inferring the type of label columns.
Behavior Changes
- Model Registry: `log_model()` now returns a `ModelReference` object instead of a model ID.
- Model Registry: When deploying a model with only 1 `target method`, the `target_method` argument can be omitted.
- Model Registry: When using a snowflake-ml-python version newer than what is available in the Snowflake Anaconda Channel, the `embed_local_ml_library` option will be set to `True` automatically if it is not already.
- Model Registry: When deploying a model to Snowpark Container Services and using GPU, the default value of num_workers will be 1.
- Model Registry: `keep_order` and `output_with_input_features` in the deploy options have been removed. Now the behavior is controlled by the type of the input when calling `model.predict()`. If the input is a `pandas.DataFrame`, the behavior will be the same as `keep_order=True` and `output_with_input_features=False` before. If the input is a `snowpark.DataFrame`, the behavior will be the same as `keep_order=False` and `output_with_input_features=True` before.
- Model Registry: When logging and deploying PyTorch (`torch.nn.Module` and `torch.jit.ScriptModule`) and TensorFlow (`tensorflow.Module` and `tensorflow.keras.Model`) models, we no longer accept models whose input is a list of tensors and whose output is a list of tensors. Instead, we now accept models whose input is 1 or more tensors as positional arguments, and whose output is a tensor or a tuple of tensors. The input and output dataframes when predicting remain the same as before; that is, every column is an array feature and contains a tensor.