SQLite like API mode of chDB #283

auxten · 2024-11-02T06:46:10Z

ruslandoga · 2024-11-11T16:49:12Z

👋

(Since the PR was renamed to SQLite-like) I wonder if there are plans to allow reading the result blocks similar to sqlite3_column_* or DuckDB's duckdb_fetch_chunk and duckdb_data_chunk_get_vector? It would simplify the bindings as they wouldn't need to implement custom parsers.

I'm really looking forward to the new API!

auxten · 2024-11-23T10:19:19Z

Current progress

SQLite-like API with commonly used functions and supported connection strings
Stateful queries, with a long-lived ClickHouse engine instance bound to the connection
Python DB-API reimplemented on top of the new API. Both persistent and in-memory engines are supported. See: tests/test_dbapi.py, tests/test_dbapi_persistence.py
ClickHouse Memory engine support
- Which clickhouse engines dose chDB currently support and will support in the future #262
- Create table view/ import from pandas in a session #258 (comment)
Some performance improvement (~43%)

Todo

Reimplement the chdb.Session mode on top of the new API

About the chdb.connect

chdb.connect()

    Create a connection to the chDB background server.
    Only one open connection is allowed per process. Use `close` to close the connection.
    If called again with the same connection string, the same connection object is returned.
    Use the connection object's `cursor` method to create a cursor object.

    Args:
        connection_string (str, optional): Connection string. Defaults to ":memory:".
        Also supports file paths like:
          - ":memory:" (for in-memory database)
          - "test.db" (for relative path)
          - "file:test.db" (same as above)
          - "/path/to/test.db" (for absolute path)
          - "file:/path/to/test.db" (same as above)
          - "file:test.db?param1=value1&param2=value2" (for relative path with query params)
          - "///path/to/test.db?param1=value1&param2=value2" (for absolute path)

        Connection string args handling:
          The connection string can contain query params like "file:test.db?param1=value1&param2=value2"
          Each param such as "param1=value1" is passed to the ClickHouse engine as a startup arg.

          For more details, see `clickhouse local --help --verbose`
          Some special args handling:
            - "mode=ro" would be "--readonly=1" for clickhouse (read-only mode)

    Returns:
        Connection: Connection object
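
The connection-string rules in the docstring above can be sketched in plain Python. This is only an illustration, not chDB's actual parser: `parse_connection_string` is a hypothetical helper, and rendering each param as `--key=value` is an assumption about how startup args are formed. Only the `mode=ro` to `--readonly=1` mapping and the `:memory:` default come from the docstring and commit messages.

```python
from urllib.parse import parse_qsl


def parse_connection_string(conn_str=":memory:"):
    """Split a chDB-style connection string into (path, startup_args).

    Hypothetical helper for illustration; not chDB's real implementation.
    """
    if not conn_str:
        conn_str = ":memory:"  # an empty string falls back to in-memory

    # Separate the optional "?key=value&..." suffix from the path part.
    path, _, query = conn_str.partition("?")
    if path.startswith("file:"):
        path = path[len("file:"):]

    args = []
    # keep_blank_values=True keeps params that have no value, e.g. "?verbose".
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key == "mode" and value == "ro":
            args.append("--readonly=1")  # special case named in the docstring
        elif value:
            args.append(f"--{key}={value}")  # assumed flag format
        else:
            args.append(f"--{key}")
    return path, args


print(parse_connection_string("file:/path/to/test.db?mode=ro"))
```

With real chDB the whole string is passed directly to `chdb.connect()`; the helper only makes the path/params split explicit.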

Examples

Example of the memory engine with the SQLite-like API:

# For best performance, use the chdb.connect() API
from chdb import connect

conn = connect(":memory:")
cursor = conn.cursor()
# Create a table
cursor.execute(
    """
    CREATE TABLE users (
        id Int32,
        name String,
        scores Array(UInt8)
    ) ENGINE = Memory
    """
)

# Insert test data
cursor.execute(
    """
    INSERT INTO users VALUES
    (1, 'Alice', [95, 87, 92]),
    (2, 'Bob', [88, 85, 90]),
    (3, 'Charlie', [91, 89, 94])
    """
)

# Test fetchone
cursor.execute("SELECT * FROM users WHERE id = 1")
row = cursor.fetchone()
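
For comparison, here is the same connect/cursor/fetch flow using Python's built-in `sqlite3` module, whose shape the new API follows. It is a reference sketch of the pattern only, not chDB code: the column types and `?` placeholders are SQLite's, and which fetch methods the chDB cursor exposes should be checked against the tests in the PR.

```python
import sqlite3

# In-memory database, as with connect(":memory:") above.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

cursor.execute("CREATE TABLE users (id INTEGER, name TEXT)")
cursor.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "Charlie")],
)

cursor.execute("SELECT id, name FROM users ORDER BY id")
first = cursor.fetchone()      # one row as a tuple
next_one = cursor.fetchmany(1) # list with up to 1 row
rest = cursor.fetchall()       # all remaining rows

cursor.close()
conn.close()
```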

For more of the API, see chdb.dbapi, for example:

import tempfile
from chdb import dbapi

with tempfile.TemporaryDirectory() as tmpdirname:
    conn = dbapi.connect(tmpdirname)
    print(conn)
    cur = conn.cursor()
    # cur.execute("CREATE DATABASE IF NOT EXISTS test_db ENGINE = Atomic")
    # cur.execute("USE test_db")
    cur.execute(
        """
    CREATE TABLE rate (
        day Date,
        value Int64
    ) ENGINE = ReplacingMergeTree ORDER BY day"""
    )

    # Insert single value
    cur.execute("INSERT INTO rate VALUES (%s, %s)", ("2021-01-01", 24))
    # Insert multiple values
    cur.executemany(
        "INSERT INTO rate VALUES (%s, %s)",
        [("2021-01-02", 128), ("2021-01-03", 256)],
    )
    # Test executemany outside optimized INSERT/REPLACE path
    cur.executemany(
        "ALTER TABLE rate UPDATE value = %s WHERE day = %s",
        [(72, "2021-01-02"), (96, "2021-01-03")],
    )

    # Test fetchone
    cur.execute("SELECT value FROM rate ORDER BY day DESC LIMIT 2")
    row1 = cur.fetchone()
    assert row1 == (96,)
    row2 = cur.fetchone()
    assert row2 == (72,)
    row3 = cur.fetchone()
    assert row3 is None  # no rows left

    # Test fetchmany
    cur.execute("SELECT value FROM rate ORDER BY day DESC")
    result_set1 = cur.fetchmany(2)
    assert result_set1 == ((96,), (72,))
    result_set2 = cur.fetchmany(1)
    assert result_set2 == ((24,),)

    # Test fetchall
    cur.execute("SELECT value FROM rate ORDER BY day DESC")
    rows = cur.fetchall()
    assert rows == ((96,), (72,), (24,))

    # Clean up
    cur.close()
    conn.close()
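
The `%s` placeholders in the example are the DB-API "format" paramstyle. A minimal sketch of how such placeholders can be filled on the client side follows. It assumes client-side interpolation, which this thread does not confirm for chdb.dbapi; `escape_value` and `interpolate` are hypothetical names, and the escaping is simplified.

```python
from datetime import date


def escape_value(value):
    """Render one Python value as a SQL literal (simplified, illustrative)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, date):
        return f"'{value.isoformat()}'"
    # Strings: escape backslashes first, then single quotes.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{text}'"


def interpolate(query, args):
    """Substitute escaped literals into a query that uses %s placeholders."""
    return query % tuple(escape_value(a) for a in args)


print(interpolate("INSERT INTO rate VALUES (%s, %s)", ("2021-01-01", 24)))
```

Under this assumption, `executemany` outside an optimized INSERT path would amount to running one interpolated statement per parameter tuple.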

For more please refer to:

Fix result buf copy in query_stable_v2

84d099e

auxten marked this pull request as draft November 2, 2024 06:46

auxten added 11 commits November 2, 2024 08:13

Impl chdb_conn connect_chdb close_conn query_conn

50f33a6

Update .clang-tidy

d073583

Basically works

61b875c

Fix SCOPE_EXIT

6ca4dd6

Fix output format

7f21a4b

Fix chdb.h decl

d82a1a7

Minimal changes on ClientBase

744fa13

Fix get_error_msg name to getErrorMsg

aa94638

No exception on save default_database for now

6827908

Add parquet(arrow) dep for local

7de1e76

Use ArrowStream in cursor mode

5d578b8

auxten changed the title ~~Stateful mode v2 of chDB~~ SQLite like API mode v2 of chDB Nov 8, 2024

auxten changed the title ~~SQLite like API mode v2 of chDB~~ SQLite like API mode of chDB Nov 8, 2024

auxten added 2 commits November 11, 2024 10:25

Handle result vec in CH loop

6a1c0d8

Fix cursor_wrapper close

b259bec

auxten added 12 commits November 12, 2024 10:53

Fix exception handling

a4b8d9c

If conn_str empty, use :memory:

b8f44e3

Add pyarrow and pandas as dep

a1a8fdd

Fix close_conn

af9761e

Add sqlitelike API for python

cef9c1b

Add test_conn_cursor

47f78c9

Add trace utils

2365709

Add .cursorignore

1399019

Fix some pylint issue

ca08ea0

Fix lint

40e7143

Add keep_buf switch for local_result_v2

ab080f7

Use sqlitelike API in DBAPI

233c3b1

auxten added 4 commits November 21, 2024 13:08

Fix error in example

401255b

Fix unittest for DBAPI and SQLite like API

238e019

Fix lint

a82d028

Fix Python 3.8

0f1ae8c

auxten marked this pull request as ready for review November 22, 2024 05:52

auxten added 6 commits November 22, 2024 13:04

Handle parameters without values

ec37ebe

Add getQueryOutputSpan

4a61bbe

Add doc string for chdb.connect

7b293dd

Fix double free

da541d1

Fix null check

b5945e1

Test connect properties

7c3cdde

auxten merged commit 18d4aa4 into main Nov 23, 2024
12 checks passed

auxten mentioned this pull request Nov 23, 2024

Long living client #108

Closed
