Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite like API mode of chDB #283

Merged
merged 36 commits into from
Nov 23, 2024
Merged

SQLite like API mode of chDB #283

merged 36 commits into from
Nov 23, 2024

Conversation

auxten
Copy link
Member

@auxten auxten commented Nov 2, 2024

Fix #197

@auxten auxten marked this pull request as draft November 2, 2024 06:46
@auxten auxten changed the title Stateful mode v2 of chDB SQLite like API mode v2 of chDB Nov 8, 2024
@auxten auxten changed the title SQLite like API mode v2 of chDB SQLite like API mode of chDB Nov 8, 2024
@ruslandoga
Copy link

ruslandoga commented Nov 11, 2024

👋

(Since the PR was renamed to SQLite-like) I wonder if there are plans to allow reading the result blocks similar to sqlite3_column_* or DuckDB's duckdb_fetch_chunk and duckdb_data_chunk_get_vector? It would simplify the bindings as they wouldn't need to implement custom parsers.

I'm really looking forward to the new API!

@auxten auxten marked this pull request as ready for review November 22, 2024 05:52
@auxten
Copy link
Member Author

auxten commented Nov 23, 2024

Current progress

image

Todo

  • Re-impl the chdb.Session mode

About the chdb.connect

chdb.connect()

    Create a connection to chDB backgroud server.
    Only one open connection is allowed per process. Use `close` to close the connection.
    If called with the same connection string, the same connection object will be returned.
    You can use the connection object to create cursor object. `cursor` method will return a cursor object.

    Args:
        connection_string (str, optional): Connection string. Defaults to ":memory:".
        Aslo support file path like:
          - ":memory:" (for in-memory database)
          - "test.db" (for relative path)
          - "file:test.db" (same as above)
          - "/path/to/test.db" (for absolute path)
          - "file:/path/to/test.db" (same as above)
          - "file:test.db?param1=value1&param2=value2" (for relative path with query params)
          - "///path/to/test.db?param1=value1&param2=value2" (for absolute path)

        Connection string args handling:
          Connection string can contain query params like "file:test.db?param1=value1&param2=value2"
          "param1=value1" will be passed to ClickHouse engine as start up args.

          For more details, see `clickhouse local --help --verbose`
          Some special args handling:
            - "mode=ro" would be "--readonly=1" for clickhouse (read-only mode)

    Returns:
        Connection: Connection object

Examples

example of memory engine of SQLite like API:

# If you want best perf, the chdb.connect() API set will satisfy you
from chdb import connect

conn = connect(":memory:")
cursor = conn.cursor()
# Create a table
cursor.execute(
    """
    CREATE TABLE users (
        id Int32,
        name String,
        scores Array(UInt8)
    ) ENGINE = Memory
    """
)

# Insert test data
cursor.execute(
    """
    INSERT INTO users VALUES
    (1, 'Alice', [95, 87, 92]),
    (2, 'Bob', [88, 85, 90]),
    (3, 'Charlie', [91, 89, 94])
    """
)

# Test fetchone
cursor.execute("SELECT * FROM users WHERE id = 1")
row = cursor.fetchone()

More API see chdb.dbapi like:

import tempfile
import unittest
from chdb import dbapi

with tempfile.TemporaryDirectory() as tmpdirname:
    conn = dbapi.connect(tmpdirname)
    print(conn)
    cur = conn.cursor()
    # cur.execute("CREATE DATABASE IF NOT EXISTS test_db ENGINE = Atomic")
    # cur.execute("USE test_db")
    cur.execute(
        """
    CREATE TABLE rate (
        day Date,
        value Int64
    ) ENGINE = ReplacingMergeTree ORDER BY day"""
    )

    # Insert single value
    cur.execute("INSERT INTO rate VALUES (%s, %s)", ("2021-01-01", 24))
    # Insert multiple values
    cur.executemany(
        "INSERT INTO rate VALUES (%s, %s)",
        [("2021-01-02", 128), ("2021-01-03", 256)],
    )
    # Test executemany outside optimized INSERT/REPLACE path
    cur.executemany(
        "ALTER TABLE rate UPDATE value = %s WHERE day = %s",
        [(72, "2021-01-02"), (96, "2021-01-03")],
    )

    # Test fetchone
    cur.execute("SELECT value FROM rate ORDER BY day DESC LIMIT 2")
    row1 = cur.fetchone()
    self.assertEqual(row1, (96,))
    row2 = cur.fetchone()
    self.assertEqual(row2, (72,))
    row3 = cur.fetchone()
    self.assertIsNone(row3)

    # Test fetchmany
    cur.execute("SELECT value FROM rate ORDER BY day DESC")
    result_set1 = cur.fetchmany(2)
    self.assertEqual(result_set1, ((96,), (72,)))
    result_set2 = cur.fetchmany(1)
    self.assertEqual(result_set2, ((24,),))

    # Test fetchall
    cur.execute("SELECT value FROM rate ORDER BY day DESC")
    rows = cur.fetchall()
    self.assertEqual(rows, ((96,), (72,), (24,)))

    # Clean up
    cur.close()
    conn.close()

For more please refer to:

@auxten auxten merged commit 18d4aa4 into main Nov 23, 2024
12 checks passed
@auxten auxten mentioned this pull request Nov 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reimplement the session mode
2 participants