sfc-gh-jdemlow/SPCSDataScience

Welcome to Snowpark Container Services (SPCS) for Data Science

SPCS Data Science Series

This repository is part of an evolving series on using SPCS to make life easier for data scientists and machine learning engineers. We’ll combine Snowflake’s native capabilities with SPCS to build templates that can be adapted to a range of projects. Once your work environment is set up in Snowflake, the next step is to build the data science workflow you need. As the series develops, we’ll explore and refine these workflows.

Part 1: Data Science Workload in Snowflake with SPCS

  • YouTube Walkthrough: Video
  • Medium Article: Link
  • GitHub Page: Link
  • GitHub Repository: Link

https://youtu.be/FaLgQCbQWjA?si=UCvwGO5U7wCo9_uU

Future Work

  1. Data Ingestion
    Data Ingestion Snowpark: a great article. Ideally, this repo will eventually become something similar, providing comprehensive data ingestion and MLOps capabilities.

  2. Feature Store & Data Preprocessing
    Snowflake is working on a feature store that’s currently in preview. Once available, we’ll demonstrate its potential in project workflows.

    • We’ll discuss both offline and batch feature stores.
    • Monitoring data drift is crucial: if the incoming data has shifted significantly from the data a model was trained on, the model may need to be retrained.
  3. Model Training and Model Registering
    Our goal is to highlight the power of SPCS, Snowflake ML, and Snowflake Cortex. We’ll demonstrate how these tools can integrate effectively.

  4. Model Inference
    We will explore various options for model inference in Snowflake.
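
The data drift monitoring mentioned under Feature Store & Data Preprocessing can be sketched with a simple statistic. The example below is a minimal illustration, not the approach this repository will necessarily take: it computes the Population Stability Index (PSI) between a training sample and a newer sample using only the Python standard library. The function name, the equal-width binning, and the synthetic Gaussian data are all assumptions made for illustration; nothing here calls Snowflake or SPCS.

```python
import math
import random


def population_stability_index(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges are equal-width over the baseline's range (an illustrative
    choice); values of `current` outside that range are clamped into the
    first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    expected = bin_fractions(baseline)
    actual = bin_fractions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Synthetic data standing in for a feature column (hypothetical values).
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]      # no drift
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]   # mean shifted by 1

psi_same = population_stability_index(train, same)
psi_shifted = population_stability_index(train, shifted)
print(f"PSI (same distribution):    {psi_same:.4f}")
print(f"PSI (shifted distribution): {psi_shifted:.4f}")
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift that may warrant retraining; these thresholds are conventions, not guarantees, and should be tuned per feature.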

Resources

About

This repository has been created to leverage SPCS
