Commit
Added notebooks for load data (#103)
* Create load-CSV-data-S3
* Added notebooks for Load data sections of UI
* Modified with suggested changes
* Modified with suggested changes
* Remove extra header

---------

Co-authored-by: chetan thote <[email protected]>
Co-authored-by: Kevin D Smith <[email protected]>
1 parent e2becae commit 2540ebf
Showing 5 changed files with 791 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
name="Chetan Thote" | ||
title="Product Team" | ||
image="singlestore" | ||
external=false |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
[meta] | ||
authors=["chetan-thote"] | ||
title="Sales Data Analysis Dataset From Amazon S3" | ||
description="""\ | ||
The Sales Data Analysis use case demonstrates how to use SingleStore's querying capabilities to analyze sales data stored in a CSV file.""" | ||
difficulty="beginner" | ||
tags=["starter", "loaddata", "s3"] | ||
lesson_areas=["Ingest"] | ||
icon="database" | ||
destinations=["spaces"] | ||
minimum_tier="free-shared" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,360 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "97f96c34-81a9-495a-a55d-c565695e87f0", | ||
"metadata": {}, | ||
"source": [ | ||
"<div id=\"singlestore-header\" style=\"display: flex; background-color: rgba(235, 249, 245, 0.25); padding: 5px;\">\n", | ||
" <div id=\"icon-image\" style=\"width: 90px; height: 90px;\">\n", | ||
" <img width=\"100%\" height=\"100%\" src=\"https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/database.png\" />\n", | ||
" </div>\n", | ||
" <div id=\"text\" style=\"padding: 5px; margin-left: 10px;\">\n", | ||
" <div id=\"badge\" style=\"display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%\">SingleStore Notebooks</div>\n", | ||
" <h1 style=\"font-weight: 500; margin: 8px 0 0 4px;\">Sales Data Analysis Dataset From Amazon S3</h1>\n", | ||
" </div>\n", | ||
"</div>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "612bd378-f145-42f1-b8ce-32557a4c00cd", | ||
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-warning\">\n", | ||
" <b class=\"fa fa-solid fa-exclamation-circle\"></b>\n", | ||
" <div>\n", | ||
" <p><b>Note</b></p>\n", | ||
" <p>This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace, navigate to <tt>Start</tt> using the left nav. You can also use your existing Standard or Premium workspace with this Notebook.</p>\n", | ||
" </div>\n", | ||
"</div>" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "481ce5ae-2ee0-4b63-b3f3-a4b53a5bc381", | ||
"metadata": {}, | ||
"source": [ | ||
"The Sales Data Analysis use case demonstrates how to use SingleStore's querying capabilities to analyze sales data stored in a CSV file. This demo showcases typical operations that businesses perform to gain insights from their sales data, such as calculating total sales, identifying top-selling products, and analyzing sales trends over time. By working through this example, new users will learn how to load CSV data into SingleStore, execute aggregate functions, and perform time-series analysis, which are essential skills for using SingleStore in a business intelligence context." | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "72fe6854-5b6e-4b79-a2d0-79bda0e18429", | ||
"metadata": {}, | ||
"source": [ | ||
"<h3>Demo Flow</h3>" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "5ed26ab8-1217-4fbd-be0c-4e7728314671", | ||
"metadata": {}, | ||
"source": [ | ||
"<img src=\"https://singlestoreloaddata.s3.ap-south-1.amazonaws.com/images/LoadDataCSV.png\" width=\"100%\" height=\"50%\"/>" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "46fb95a8-1402-4b97-b04a-560741f96181", | ||
"metadata": {}, | ||
"source": [ | ||
"## How to use this notebook" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "a701cd90-dd42-4a06-b7a1-e0a2132af558", | ||
"metadata": {}, | ||
"source": [ | ||
"<img src=\"https://singlestoreloaddata.s3.ap-south-1.amazonaws.com/images/notebookuse.gif\" width=\"75%\" height=\"50%\"/>" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "2d22fd53-2c18-40e5-bb38-6d8ebc06f1b8", | ||
"metadata": {}, | ||
"source": [ | ||
"## Create a database\n", | ||
"\n", | ||
"We need to create a database to work with in the following examples." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"id": "1624ccea-0c15-4048-ab2a-fe2178e5912a", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"shared_tier_check = %sql show variables like 'is_shared_tier'\n", | ||
"if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n", | ||
" %sql DROP DATABASE IF EXISTS SalesAnalysis;\n", | ||
" %sql CREATE DATABASE SalesAnalysis;" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "901e6ec1-2530-497a-857e-7973bb9714f1", | ||
"metadata": {}, | ||
"source": [ | ||
"<h3>Create Table</h3>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"id": "7ac4285d-0d2d-44ec-8b1e-eef7b4f9358c", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"CREATE TABLE `SalesData` (\n", | ||
" `Date` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", | ||
" `Store_ID` bigint(20) DEFAULT NULL,\n", | ||
" `ProductID` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", | ||
" `Product_Name` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", | ||
" `Product_Category` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n", | ||
" `Quantity_Sold` bigint(20) DEFAULT NULL,\n", | ||
" `Price` float DEFAULT NULL,\n", | ||
" `Total_Sales` float DEFAULT NULL\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "1de959eb-4f17-45d4-af74-42f45684d67b", | ||
"metadata": {}, | ||
"source": [ | ||
"<h3>Load Data Using Pipelines</h3>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"id": "84f592b8-a12e-41d8-bff0-fe96175992b9", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"CREATE PIPELINE SalesData_Pipeline AS\n", | ||
"LOAD DATA S3 's3://singlestoreloaddata/SalesData/sales_data.csv'\n", | ||
"CONFIG '{ \\\"region\\\": \\\"ap-south-1\\\" }'\n", | ||
"/*\n", | ||
"CREDENTIALS '{\"aws_access_key_id\": \"<access key id>\",\n", | ||
" \"aws_secret_access_key\": \"<access_secret_key>\"}'\n", | ||
" */\n", | ||
"INTO TABLE SalesData\n", | ||
"FIELDS TERMINATED BY ','\n", | ||
"LINES TERMINATED BY '\\r\\n'\n", | ||
"IGNORE 1 lines;\n", | ||
"\n", | ||
"\n", | ||
"START PIPELINE SalesData_Pipeline;" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"id": "352e340a-a613-4ec5-94a5-c4e1f3565757", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"SELECT * FROM SalesData LIMIT 10" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "4508d431-7683-4ac9-a4e8-d939c47dd1fc", | ||
"metadata": {}, | ||
"source": [ | ||
"<h3>Sample Queries</h3>\n", | ||
"\n", | ||
"The following analytical queries explore the loaded sales data." | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "55ac6134-976c-4f27-bc2b-140835b64f13", | ||
"metadata": {}, | ||
"source": [ | ||
"<b>Top-Selling Products</b>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"id": "d666c04b-ccb0-47cc-a1e7-efaa7a590d27", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"SELECT product_name, SUM(quantity_sold) AS total_quantity_sold FROM SalesData\n", | ||
" GROUP BY product_name ORDER BY total_quantity_sold DESC LIMIT 5;" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "87c36700-0db8-405f-97c0-e13a6a2ae0cb", | ||
"metadata": {}, | ||
"source": [ | ||
"<b>Sales Trends Over Time</b>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 6, | ||
"id": "b46d72c7-07a3-4e23-8fe4-c238b5517ef6", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"SELECT date, SUM(total_sales) AS total_sales FROM SalesData\n", | ||
"GROUP BY date ORDER BY date;" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "e6c232a1-acce-4d25-aebd-1a89aafba47d", | ||
"metadata": {}, | ||
"source": [ | ||
"<b>Total Sales by Store</b>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"id": "af571f6c-0145-4466-9ed7-000d37e4738f", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"SELECT Store_ID, SUM(total_sales) AS total_sales FROM SalesData\n", | ||
"GROUP BY Store_ID ORDER BY total_sales DESC limit 5;" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "9bf1d7f3-c636-4ac0-b2be-e48eaca747ef", | ||
"metadata": {}, | ||
"source": [ | ||
"<b>Sales Contribution by Product (Percentage)</b>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 8, | ||
"id": "5613b3e8-72d2-48dc-a7ae-47911df24cd2", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"SELECT product_name, SUM(total_sales) * 100.0 / (SELECT SUM(total_sales) FROM SalesData) AS sales_percentage FROM SalesData\n", | ||
" GROUP BY product_name ORDER BY sales_percentage DESC limit 5;" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "afed201d-d9f2-49cc-8a14-df35103abd4e", | ||
"metadata": {}, | ||
"source": [ | ||
"<b>Top Days with Highest Sales</b>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 9, | ||
"id": "7fd8d785-7861-4570-88b3-0185c2c9c298", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%sql\n", | ||
"SELECT date, SUM(total_sales) AS total_sales FROM SalesData\n", | ||
" GROUP BY date ORDER BY total_sales DESC LIMIT 5;" | ||
] | ||
}, | ||
{ | ||
"attachments": {}, | ||
"cell_type": "markdown", | ||
"id": "6738b6e4-5e8b-45db-b3dc-ebcb73bcf629", | ||
"metadata": {}, | ||
"source": [ | ||
"## Conclusion\n", | ||
"\n", | ||
"<div class=\"alert alert-block alert-warning\">\n", | ||
" <b class=\"fa fa-solid fa-exclamation-circle\"></b>\n", | ||
" <div>\n", | ||
" <p><b>Action Required</b></p>\n", | ||
" <p> If you created a new database in your Standard or Premium Workspace, you can drop the database by running the cell below. Note: this will not drop your database for Free Starter Workspaces. To drop a Free Starter Workspace, terminate the Workspace using the UI. </p>\n", | ||
" </div>\n", | ||
"</div>\n", | ||
"\n", | ||
"We have shown how to load data from Amazon S3 into SingleStoreDB using `Pipelines`. These techniques should enable you to\n", | ||
"integrate your Amazon S3 data with SingleStoreDB." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 10, | ||
"id": "d5053a52-5579-4fea-9594-5250f6fcc289", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"shared_tier_check = %sql show variables like 'is_shared_tier'\n", | ||
"if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n", | ||
" %sql DROP DATABASE IF EXISTS SalesAnalysis;" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "2dcc585a-43c2-4598-93bf-888143dd5e29", | ||
"metadata": {}, | ||
"source": [ | ||
"<div id=\"singlestore-footer\" style=\"background-color: rgba(194, 193, 199, 0.25); height:2px; margin-bottom:10px\"></div>\n", | ||
"<div><img src=\"https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png\" style=\"padding: 0px; margin: 0px; height: 24px\"/></div>" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"jupyterlab": { | ||
"notebooks": { | ||
"version_major": 6, | ||
"version_minor": 4 | ||
} | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.6" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
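
The notebook above guards both `CREATE DATABASE` and `DROP DATABASE` with the same check on the rows returned by `show variables like 'is_shared_tier'`. A minimal sketch of that condition as plain Python may make the intent easier to review. The function name `should_manage_database` is hypothetical and does not appear in the commit, and the assumption that each row is a `(variable_name, value)` pair is inferred from the `shared_tier_check[0][1]` indexing in the notebook.

```python
def should_manage_database(rows):
    """Return True when the notebook would create or drop SalesAnalysis.

    `rows` stands in for the result of
    `%sql show variables like 'is_shared_tier'`; each row is assumed
    to be a (variable_name, value) pair.
    """
    # An empty result means the variable is not defined at all; the
    # notebook treats that the same as "not on the shared tier".
    if not rows:
        return True
    # Mirrors `shared_tier_check[0][1] == 'OFF'` from the notebook:
    # only the value of the first returned row is inspected.
    return rows[0][1] == 'OFF'


if __name__ == "__main__":
    print(should_manage_database([]))                           # variable missing
    print(should_manage_database([("is_shared_tier", "OFF")]))  # flag is OFF
    print(should_manage_database([("is_shared_tier", "ON")]))   # flag is ON
```

Under this reading, the database statements are skipped whenever the variable reports `ON`, which matches the notebook's note that the cleanup cell does not drop the database on Free Starter Workspaces.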
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
[meta] | ||
authors=["chetan-thote"] | ||
title="Real-Time Event Monitoring Dataset From Kafka" | ||
description="""\ | ||
The Real-Time Event Monitoring use case illustrates how to use SingleStore's capabilities to process and analyze streaming data from a Kafka data source. | ||
""" | ||
difficulty="beginner" | ||
tags=["starter", "loaddata", "kafka"] | ||
lesson_areas=["Ingest"] | ||
icon="database" | ||
destinations=["spaces"] | ||
minimum_tier="free-shared" |