204 changes: 204 additions & 0 deletions demo/notebooks/0_quickstart-readiness-check.ipynb
@@ -0,0 +1,204 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6ac5f2fd-0daf-43b9-83d5-30dfd2162f0e",
"metadata": {},
"source": [
"# Fraud Detection with LakeFS - Quick Start\n",
"\n",
"Welcome to your Jupyter notebook environment! This workspace is pre-configured for fraud detection experiments with LakeFS data versioning.\n",
"\n",
"## 🚀 Getting Started\n",
"\n",
"### 1. Check Your Environment\n",
"\n",
"Run this in a notebook cell to verify connections:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "de2f8065-6c4e-4ce0-9ccf-deb493ea5fa7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"LakeFS Endpoint: http://fraud-detection-lakefs\n",
"LakeFS Access Key: something\n",
"S3 Endpoint: None\n",
"S3 Access Key: None\n"
]
}
],
"source": [
"import os\n",
"\n",
"# LakeFS connection\n",
"print(f\"LakeFS Endpoint: {os.getenv('LAKECTL_SERVER_ENDPOINT_URL')}\")\n",
"print(f\"LakeFS Access Key: {os.getenv('LAKECTL_CREDENTIALS_ACCESS_KEY_ID')}\")\n",
"\n",
"# MinIO connection\n",
"print(f\"S3 Endpoint: {os.getenv('AWS_S3_ENDPOINT')}\")\n",
"print(f\"S3 Access Key: {os.getenv('AWS_ACCESS_KEY_ID')}\")"
]
},
{
"cell_type": "markdown",
"id": "923694c6-d3b3-4a61-ba6e-b3161888f9ac",
"metadata": {},
"source": [
"### 2. Install Required Packages"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7e3e96a-d795-4d48-82a3-dcfcc77d3520",
"metadata": {},
"outputs": [],
"source": [
"# Install LakeFS client\n",
"!pip install lakefs-client boto3 pandas scikit-learn\n",
"\n",
"# For ML workloads\n",
"!pip install tensorflow torch transformers"
]
},
{
"cell_type": "markdown",
"id": "c78c9930-6db4-4113-9bbf-81eeb384e72e",
"metadata": {},
"source": [
"### 3. Connect to LakeFS"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa962a99-7271-4c60-9872-7a10035548ff",
"metadata": {},
"outputs": [],
"source": [
"import lakefs_client\n",
"from lakefs_client.client import LakeFSClient\n",
"\n",
"# Configuration is already set via environment variables\n",
"configuration = lakefs_client.Configuration(\n",
" host=os.getenv('LAKECTL_SERVER_ENDPOINT_URL'),\n",
" username=os.getenv('LAKECTL_CREDENTIALS_ACCESS_KEY_ID'),\n",
" password=os.getenv('LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY')\n",
")\n",
"\n",
"client = LakeFSClient(configuration)\n",
"\n",
"# List repositories\n",
"repos = client.repositories.list_repositories()\n",
"print(f\"Available repositories: {[repo.id for repo in repos['results']]}\")"
]
},
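{
"cell_type": "markdown",
"id": "b1f0c3a2-5e7d-4b9a-9c21-3f6a8d2e4c10",
"metadata": {},
"source": [
"### 4. (Optional) Create an Experiment Branch\n",
"\n",
"The Tips section below recommends using LakeFS branches for experiments and committing data changes. The next cell is a minimal sketch of that workflow, not a required part of the readiness check: it assumes the repository listing above returned at least one repository and that its default branch is named `main`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c9d2e41-8f6b-4a07-b5d3-1e2a7c9f0b64",
"metadata": {},
"outputs": [],
"source": [
"from lakefs_client import models\n",
"from lakefs_client.exceptions import ApiException\n",
"\n",
"# Assumption: at least one repository exists and its default branch is 'main'\n",
"repo_id = repos['results'][0].id if repos['results'] else None\n",
"\n",
"if repo_id:\n",
"    try:\n",
"        # Create a scratch branch off 'main' for experiments\n",
"        client.branches.create_branch(\n",
"            repository=repo_id,\n",
"            branch_creation=models.BranchCreation(name='readiness-check', source='main')\n",
"        )\n",
"        print(f\"Created branch 'readiness-check' in repository '{repo_id}'\")\n",
"    except ApiException as e:\n",
"        # The branch may already exist from a previous run, or the source branch may have a different name\n",
"        print(f\"Branch not created: {e.status} {e.reason}\")\n",
"    # Data commits follow the same pattern via client.commits.commit() with a models.CommitCreation message\n",
"else:\n",
"    print(\"No repositories found - create one in the LakeFS UI first\")"
]
},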
{
"cell_type": "markdown",
"id": "6e59ee38-768b-4746-95d8-c856f9a36ce9",
"metadata": {},
"source": [
"## 📚 Available Notebooks\n",
"\n",
"Navigate to `demo/notebooks/` to find:\n",
"\n",
"1. **0_quickstart-readiness-check.ipynb** - Validate the environment before running the quickstart\n",
"2. **1_experiment_train_lakefs.ipynb** - Train models with LakeFS versioning\n",
"3. **2_save_model_lakefs.ipynb** - Save and version ML models\n",
"4. **3_rest_requests_multi_model_lakefs.ipynb** - Multi-model REST inference\n",
"5. **4_grpc_requests_multi_model_lakefs.ipynb** - High-performance gRPC inference\n",
"6. **5_rest_requests_single_model_lakefs.ipynb** - Single model deployment\n",
"7. **8_distributed_training_lakefs.ipynb** - Distributed training patterns\n",
"\n",
"## 🔧 Pre-configured Services\n",
"\n",
"| Service | Endpoint | Purpose |\n",
"|---------|----------|---------|\n",
"| LakeFS | http://lakefs:8000 | Data versioning & management |\n",
"| MinIO | http://minio:9000 | S3-compatible object storage |\n",
"\n",
"## 💡 Tips\n",
"\n",
"- **Save frequently** - Your work is persisted on the PVC\n",
"- **Use LakeFS branches** - Create branches for experiments\n",
"- **Version your data** - Commit data changes to LakeFS\n",
"- **Monitor resources** - Check memory/CPU in the status bar\n",
"\n",
"## 📖 Learn More\n",
"\n",
"- [LakeFS Documentation](https://docs.lakefs.io/)\n",
"- [Chart README](https://github.com/rh-aiservices-bu/Fraud-Detection-data-versioning-with-lakeFS/blob/main/deploy/helm/fraud-detection/README.md)\n",
"- [OpenShift AI Documentation](https://access.redhat.com/documentation/en-us/red_hat_openshift_ai/)\n",
"\n",
"## 🆘 Need Help?\n",
"\n",
"Check the environment variables:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4338431b-64ac-40e5-a1cb-d0c17777683b",
"metadata": {},
"outputs": [],
"source": [
"!env | grep -E '(LAKECTL|LAKEFS|AWS)' | sort"
]
},
{
"cell_type": "markdown",
"id": "24ccc387-53fa-4aab-812f-397f724b35d1",
"metadata": {},
"source": [
"Test connectivity:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bcb6543a-3ec7-4c87-a97a-c32df6e6971c",
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"response = requests.get(os.getenv('LAKECTL_SERVER_ENDPOINT_URL') + '/api/v1/healthcheck')\n",
"print(f\"LakeFS Status: {response.status_code}\")"
]
},
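{
"cell_type": "markdown",
"id": "9a4e7b36-2c15-4d88-a7f0-6b3d9e1c5a42",
"metadata": {},
"source": [
"The same kind of check can be run against MinIO. This is a sketch rather than a guaranteed-working call: it assumes the workbench also sets the standard `AWS_SECRET_ACCESS_KEY` variable (only `AWS_S3_ENDPOINT` and `AWS_ACCESS_KEY_ID` are printed above, and they may be unset in some environments) and uses the `boto3` package installed earlier."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f5b82d17-4e6a-49c3-8d20-7a1c3b9e6d58",
"metadata": {},
"outputs": [],
"source": [
"import boto3\n",
"\n",
"# Assumption: AWS_SECRET_ACCESS_KEY is set alongside the variables printed earlier\n",
"s3 = boto3.client(\n",
"    's3',\n",
"    endpoint_url=os.getenv('AWS_S3_ENDPOINT'),\n",
"    aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),\n",
"    aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY')\n",
")\n",
"\n",
"# Listing buckets is enough to confirm the endpoint and credentials work\n",
"buckets = s3.list_buckets()\n",
"print(f\"MinIO buckets: {[b['Name'] for b in buckets['Buckets']]}\")"
]
},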
{
"cell_type": "markdown",
"id": "7d7c4e57-e968-4797-b17a-65382fc8e5b9",
"metadata": {},
"source": [
"**Happy Experimenting! 🎉**"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.12",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}