diff --git a/authors/chetan-thote.toml b/authors/chetan-thote.toml
new file mode 100644
index 00000000..e2519fd7
--- /dev/null
+++ b/authors/chetan-thote.toml
@@ -0,0 +1,4 @@
+name="Chetan Thote"
+title="Product Team"
+image="singlestore"
+external=false
diff --git a/notebooks/load-csv-data-s3/meta.toml b/notebooks/load-csv-data-s3/meta.toml
new file mode 100644
index 00000000..1be7078c
--- /dev/null
+++ b/notebooks/load-csv-data-s3/meta.toml
@@ -0,0 +1,11 @@
+[meta]
+authors=["chetan-thote"]
+title="Sales Data Analysis Dataset From Amazon S3"
+description="""\
+ The Sales Data Analysis use case demonstrates how to use SingleStore's querying capabilities to analyze sales data stored in a CSV file."""
+difficulty="beginner"
+tags=["starter", "loaddata", "s3"]
+lesson_areas=["Ingest"]
+icon="database"
+destinations=["spaces"]
+minimum_tier="free-shared"
diff --git a/notebooks/load-csv-data-s3/notebook.ipynb b/notebooks/load-csv-data-s3/notebook.ipynb
new file mode 100644
index 00000000..f570ff20
--- /dev/null
+++ b/notebooks/load-csv-data-s3/notebook.ipynb
@@ -0,0 +1,360 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "97f96c34-81a9-495a-a55d-c565695e87f0",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "612bd378-f145-42f1-b8ce-32557a4c00cd",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Note\n",
+ "\n",
+ "This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace, navigate to Start using the left nav. You can also use your existing Standard or Premium workspace with this notebook."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "481ce5ae-2ee0-4b63-b3f3-a4b53a5bc381",
+ "metadata": {},
+ "source": [
+ "The Sales Data Analysis use case demonstrates how to use SingleStore's querying capabilities to analyze sales data stored in a CSV file. This demo showcases typical operations that businesses perform to gain insights from their sales data, such as calculating total sales, identifying top-selling products, and analyzing sales trends over time. By working through this example, new users will learn how to load CSV data into SingleStore, execute aggregate functions, and perform time-series analysis, which are essential skills for using SingleStore in a business intelligence context."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "72fe6854-5b6e-4b79-a2d0-79bda0e18429",
+ "metadata": {},
+ "source": [
+ "Demo Flow"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "5ed26ab8-1217-4fbd-be0c-4e7728314671",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "46fb95a8-1402-4b97-b04a-560741f96181",
+ "metadata": {},
+ "source": [
+ "## How to use this notebook"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "a701cd90-dd42-4a06-b7a1-e0a2132af558",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "2d22fd53-2c18-40e5-bb38-6d8ebc06f1b8",
+ "metadata": {},
+ "source": [
+ "## Create a database\n",
+ "\n",
+ "We need to create a database to work with in the following examples."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "1624ccea-0c15-4048-ab2a-fe2178e5912a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "shared_tier_check = %sql show variables like 'is_shared_tier'\n",
+ "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n",
+ " %sql DROP DATABASE IF EXISTS SalesAnalysis;\n",
+ " %sql CREATE DATABASE SalesAnalysis;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "901e6ec1-2530-497a-857e-7973bb9714f1",
+ "metadata": {},
+ "source": [
+ "Create Table"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "7ac4285d-0d2d-44ec-8b1e-eef7b4f9358c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "CREATE TABLE `SalesData` (\n",
+ " `Date` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n",
+ " `Store_ID` bigint(20) DEFAULT NULL,\n",
+ " `ProductID` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n",
+ " `Product_Name` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n",
+ " `Product_Category` text CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,\n",
+ " `Quantity_Sold` bigint(20) DEFAULT NULL,\n",
+ " `Price` float DEFAULT NULL,\n",
+ " `Total_Sales` float DEFAULT NULL\n",
+ ")"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "1de959eb-4f17-45d4-af74-42f45684d67b",
+ "metadata": {},
+ "source": [
+ "Load Data Using Pipelines"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "84f592b8-a12e-41d8-bff0-fe96175992b9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "CREATE PIPELINE SalesData_Pipeline AS\n",
+ "LOAD DATA S3 's3://singlestoreloaddata/SalesData/sales_data.csv'\n",
+ "CONFIG '{ \\\"region\\\": \\\"ap-south-1\\\" }'\n",
+ "/*\n",
+ "CREDENTIALS '{\"aws_access_key_id\": \"\",\n",
+ " \"aws_secret_access_key\": \"\"}'\n",
+ " */\n",
+ "INTO TABLE SalesData\n",
+ "FIELDS TERMINATED BY ','\n",
+ "LINES TERMINATED BY '\\r\\n'\n",
+ "IGNORE 1 lines;\n",
+ "\n",
+ "\n",
+ "START PIPELINE SalesData_Pipeline;"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "352e340a-a613-4ec5-94a5-c4e1f3565757",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT * FROM SalesData LIMIT 10"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "4508d431-7683-4ac9-a4e8-d939c47dd1fc",
+ "metadata": {},
+ "source": [
+ "Sample Queries\n",
+ "\n",
+ "The queries below run a few analytical aggregations over the loaded sales data."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "55ac6134-976c-4f27-bc2b-140835b64f13",
+ "metadata": {},
+ "source": [
+ "Top-Selling Products"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "d666c04b-ccb0-47cc-a1e7-efaa7a590d27",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT product_name, SUM(quantity_sold) AS total_quantity_sold FROM SalesData\n",
+ " GROUP BY product_name ORDER BY total_quantity_sold DESC LIMIT 5;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "87c36700-0db8-405f-97c0-e13a6a2ae0cb",
+ "metadata": {},
+ "source": [
+ "Sales Trends Over Time"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "b46d72c7-07a3-4e23-8fe4-c238b5517ef6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT date, SUM(total_sales) AS total_sales FROM SalesData\n",
+ "GROUP BY date ORDER BY total_sales desc limit 5;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "e6c232a1-acce-4d25-aebd-1a89aafba47d",
+ "metadata": {},
+ "source": [
+ "Total Sales by Store"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "af571f6c-0145-4466-9ed7-000d37e4738f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT Store_ID, SUM(total_sales) AS total_sales FROM SalesData\n",
+ "GROUP BY Store_ID ORDER BY total_sales DESC limit 5;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "9bf1d7f3-c636-4ac0-b2be-e48eaca747ef",
+ "metadata": {},
+ "source": [
+ "Sales Contribution by Product (Percentage)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "5613b3e8-72d2-48dc-a7ae-47911df24cd2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT product_name, SUM(total_sales) * 100.0 / (SELECT SUM(total_sales) FROM SalesData) AS sales_percentage FROM SalesData\n",
+ " GROUP BY product_name ORDER BY sales_percentage DESC limit 5;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "afed201d-d9f2-49cc-8a14-df35103abd4e",
+ "metadata": {},
+ "source": [
+ "Top Days with Highest Sales"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "7fd8d785-7861-4570-88b3-0185c2c9c298",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT date, SUM(total_sales) AS total_sales FROM SalesData\n",
+ " GROUP BY date ORDER BY total_sales DESC LIMIT 5;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "6738b6e4-5e8b-45db-b3dc-ebcb73bcf629",
+ "metadata": {},
+ "source": [
+ "## Conclusion\n",
+ "\n",
+ "\n",
+ "Action Required\n",
+ "\n",
+ "If you created a new database in your Standard or Premium Workspace, you can drop the database by running the cell below. Note: this will not drop your database for Free Starter Workspaces. To drop a Free Starter Workspace, terminate the Workspace using the UI.\n",
+ "\n",
+ "We have shown how to load data from Amazon S3 into SingleStoreDB using `Pipelines`. These techniques should enable you to\n",
+ "integrate your Amazon S3 data with SingleStoreDB."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "d5053a52-5579-4fea-9594-5250f6fcc289",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "shared_tier_check = %sql show variables like 'is_shared_tier'\n",
+ "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n",
+ " %sql DROP DATABASE IF EXISTS SalesAnalysis;"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2dcc585a-43c2-4598-93bf-888143dd5e29",
+ "metadata": {},
+ "source": [
+ "\n",
+ ""
+ ]
+ }
+ ],
+ "metadata": {
+ "jupyterlab": {
+ "notebooks": {
+ "version_major": 6,
+ "version_minor": 4
+ }
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/notebooks/load-data-kakfa/meta.toml b/notebooks/load-data-kakfa/meta.toml
new file mode 100644
index 00000000..7d895bee
--- /dev/null
+++ b/notebooks/load-data-kakfa/meta.toml
@@ -0,0 +1,12 @@
+[meta]
+authors=["chetan-thote"]
+title="Real-Time Event Monitoring Dataset From Kafka"
+description="""\
+ The Real-Time Event Monitoring use case illustrates how to use SingleStore to process and analyze streaming data from a Kafka data source.
+ """
+difficulty="beginner"
+tags=["starter", "loaddata", "kafka"]
+lesson_areas=["Ingest"]
+icon="database"
+destinations=["spaces"]
+minimum_tier="free-shared"
diff --git a/notebooks/load-data-kakfa/notebook.ipynb b/notebooks/load-data-kakfa/notebook.ipynb
new file mode 100644
index 00000000..ac28ae67
--- /dev/null
+++ b/notebooks/load-data-kakfa/notebook.ipynb
@@ -0,0 +1,404 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "14762a67-4baa-493e-a182-89de7fcbbaf2",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "25c2b147-47cb-4755-8b8f-95c93cc9e35d",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Note\n",
+ "\n",
+ "This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace, navigate to Start using the left nav. You can also use your existing Standard or Premium workspace with this notebook."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "ee90231c-d301-4d3b-a72e-99cf5338f0f5",
+ "metadata": {},
+ "source": [
+ "Introduction"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "f6f20e3f-c17a-4a11-b394-3b02b8fb5d31",
+ "metadata": {},
+ "source": [
+ "The Real-Time Event Monitoring use case illustrates how to use SingleStore to process and analyze streaming data from a Kafka data source. This demo showcases the ability to ingest real-time events, such as application logs or user activities, and analyze them immediately to gain actionable insights. By working through this example, new users will learn how to set up a Kafka data pipeline, ingest streaming data into SingleStore, and execute real-time queries to monitor event types, track user activity patterns, and detect anomalies. This use case shows how SingleStore can provide timely and relevant information for decision-making in dynamic environments."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "2d209d08-ee22-4cdd-81be-51d1f742cb91",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "a7bdf2ca-0ca0-4a67-b860-0df79df38878",
+ "metadata": {},
+ "source": [
+ "## How to use this notebook"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "63d529ea-4f84-4ffe-9c93-691e787b5613",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "5f963a4f-0eb0-4282-bc2f-f8bf48eef971",
+ "metadata": {},
+ "source": [
+ "## Create a database\n",
+ "\n",
+ "We need to create a database to work with in the following examples."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "8ccfe96a-05e7-4547-9df9-97e4ed6b3998",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "shared_tier_check = %sql show variables like 'is_shared_tier'\n",
+ "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n",
+ " %sql DROP DATABASE IF EXISTS EventAnalysis;\n",
+ " %sql CREATE DATABASE EventAnalysis;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "a06e69b8-1e19-4ab6-b724-4bd32f235994",
+ "metadata": {},
+ "source": [
+ "\n",
+ "Action Required\n",
+ "\n",
+ "If you have a Free Starter Workspace deployed already, select the database from the drop-down menu at the top of this notebook. It updates the connection_url to connect to that database."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "8b5ffbab-62f7-4052-a415-c511b5deb7bf",
+ "metadata": {},
+ "source": [
+ "Create Table"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "f089b404-5907-4236-a05f-ad0e5bf8157a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "CREATE TABLE `eventsdata` (\n",
+ " `user_id` varchar(120) DEFAULT NULL,\n",
+ " `event_name` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n",
+ " `advertiser` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n",
+ " `campaign` varchar(110) DEFAULT NULL,\n",
+ " `gender` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n",
+ " `income` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n",
+ " `page_url` varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n",
+ " `region` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,\n",
+ " `country` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL\n",
+ ")"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "057f3cbf-7a49-4954-bd04-f8f42839dfc7",
+ "metadata": {},
+ "source": [
+ "Load Data Using Pipeline"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "7a7163c9-0ca5-40a9-b503-811376e1af2b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "CREATE PIPELINE `eventsdata`\n",
+ "AS LOAD DATA KAFKA 'public-kafka.memcompute.com:9092/ad_events'\n",
+ "BATCH_INTERVAL 2500\n",
+ "ENABLE OUT_OF_ORDER OPTIMIZATION\n",
+ "DISABLE OFFSETS METADATA GC\n",
+ "INTO TABLE `eventsdata`\n",
+ "FIELDS TERMINATED BY '\\t' ENCLOSED BY '' ESCAPED BY '\\\\'\n",
+ "LINES TERMINATED BY '\\n' STARTING BY ''\n",
+ "(\n",
+ " `user_id`,\n",
+ " `event_name`,\n",
+ " `advertiser`,\n",
+ " `campaign`,\n",
+ " `gender`,\n",
+ " `income`,\n",
+ " `page_url`,\n",
+ " `region`,\n",
+ " `country`\n",
+ ");\n",
+ "\n",
+ "START PIPELINE `eventsdata`;"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "0b75627d-684c-4900-bb3c-1ec539ac3671",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT COUNT(*) FROM `eventsdata`"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "15366453-7483-4e4f-a67f-439b66dfb4f4",
+ "metadata": {},
+ "source": [
+ "Sample Queries"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "94c011f2-2662-4c12-b70b-e6601ed7bdca",
+ "metadata": {},
+ "source": [
+ "Events by Country"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "3195c978-7356-45ba-8864-832f75ec90c7",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT events.country\n",
+ "AS `events.country`,\n",
+ "COUNT(events.country) AS 'events.countofevents'\n",
+ "FROM eventsdata AS events\n",
+ "GROUP BY 1 ORDER BY 2 DESC LIMIT 5;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "0a2d68aa-1ea4-49a0-9cbe-04030e754342",
+ "metadata": {},
+ "source": [
+ "Events by Top 5 Advertisers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "890ce930-ebbe-4415-861a-60820fbf631d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT\n",
+ " events.advertiser AS `events.advertiser`,\n",
+ " COUNT(*) AS `events.count`\n",
+ "FROM eventsdata AS events\n",
+ "WHERE\n",
+ " (events.advertiser LIKE '%Subway%' OR events.advertiser LIKE '%McDonalds%' OR events.advertiser LIKE '%Starbucks%' OR events.advertiser LIKE '%Dollar General%' OR events.advertiser LIKE '%YUM! Brands%')\n",
+ "GROUP BY 1\n",
+ "ORDER BY 2 DESC;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "094a0e46-fbd9-440b-843d-ba5736e48a51",
+ "metadata": {},
+ "source": [
+ "Ad visitors by gender and income"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "270a21bd-7166-4f01-9ee0-8f77cc263a30",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "SELECT * FROM (\n",
+ "SELECT *, DENSE_RANK() OVER (ORDER BY z___min_rank) as z___pivot_row_rank, RANK() OVER (PARTITION BY z__pivot_col_rank ORDER BY z___min_rank) as z__pivot_col_ordering, CASE WHEN z___min_rank = z___rank THEN 1 ELSE 0 END AS z__is_highest_ranked_cell FROM (\n",
+ "SELECT *, MIN(z___rank) OVER (PARTITION BY `events.income`) as z___min_rank FROM (\n",
+ "SELECT *, RANK() OVER (ORDER BY CASE WHEN z__pivot_col_rank=1 THEN (CASE WHEN `events.count` IS NOT NULL THEN 0 ELSE 1 END) ELSE 2 END, CASE WHEN z__pivot_col_rank=1 THEN `events.count` ELSE NULL END DESC, `events.count` DESC, z__pivot_col_rank, `events.income`) AS z___rank FROM (\n",
+ "SELECT *, DENSE_RANK() OVER (ORDER BY CASE WHEN `events.gender` IS NULL THEN 1 ELSE 0 END, `events.gender`) AS z__pivot_col_rank FROM (\n",
+ "SELECT\n",
+ " events.gender AS `events.gender`,\n",
+ " events.income AS `events.income`,\n",
+ " COUNT(*) AS `events.count`\n",
+ "FROM eventsdata AS events\n",
+ "WHERE\n",
+ " (events.income <> 'unknown' OR events.income IS NULL)\n",
+ "GROUP BY 1,2) ww\n",
+ ") bb WHERE z__pivot_col_rank <= 16384\n",
+ ") aa\n",
+ ") xx\n",
+ ") zz\n",
+ "WHERE (z__pivot_col_rank <= 50 OR z__is_highest_ranked_cell = 1) AND (z___pivot_row_rank <= 500 OR z__pivot_col_ordering = 1) ORDER BY z___pivot_row_rank;"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "8716cb1f-b1f4-4ec8-9f74-df48cc7b4154",
+ "metadata": {},
+ "source": [
+ "The pipeline keeps ingesting data from the Kafka topic for as long as it runs. Once enough data is loaded, you can stop the pipeline with the command below."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "35573b60-4d2c-4861-9fad-c53312993dd3",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "STOP PIPELINE eventsdata"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "30a9b5de-79d0-481c-99cb-7321cbad95d9",
+ "metadata": {},
+ "source": [
+ "## Conclusion\n",
+ "\n",
+ "\n",
+ "Action Required\n",
+ "\n",
+ "If you created a new database in your Standard or Premium Workspace, you can drop the database by running the cell below. Note: this will not drop your database for Free Starter Workspaces. To drop a Free Starter Workspace, terminate the Workspace using the UI.\n",
+ "\n",
+ "We have shown how to connect to Kafka using `Pipelines` and insert data into SingleStoreDB. These techniques should enable you to\n",
+ "integrate your Kafka topics with SingleStoreDB."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "ac2472f8-bca5-419a-82e4-0e39ea328522",
+ "metadata": {},
+ "source": [
+ "Drop the pipeline with the command below."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "7486de45-9c10-43c4-9f0d-2b9d68671b22",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%sql\n",
+ "DROP PIPELINE eventsdata"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "204475a5-9f22-4ec7-8a61-86e802c52055",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "shared_tier_check = %sql show variables like 'is_shared_tier'\n",
+ "if not shared_tier_check or shared_tier_check[0][1] == 'OFF':\n",
+ " %sql DROP DATABASE IF EXISTS EventAnalysis;"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "330a667f-19e3-4af8-97d7-1d9d28cfe002",
+ "metadata": {},
+ "source": [
+ "\n",
+ ""
+ ]
+ }
+ ],
+ "metadata": {
+ "jupyterlab": {
+ "notebooks": {
+ "version_major": 6,
+ "version_minor": 4
+ }
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}