Create table view/ import from pandas in a session #258
Creating a temporary table seems to work, but I cannot select from it.
>>> from chdb.session import Session
>>>
>>> db = Session()
>>> db.query("create database db")
>>> db.query("use db")
>>> x="create temporary table data (id UInt32, x UInt32) engine MergeTree order by id sample by id as select number+1 as id, randUniform(1, 100) as x from numbers(10000);"
>>> db.query(x)
>>> y='select avg(x) as "avg", round(quantile(0.95)(x), 2) as p95 from data sample 0.1;'
>>> db.query(y)
Code: 60. DB::Exception: Table db.data does not exist. (UNKNOWN_TABLE)
>>> x="create table data (id UInt32, x UInt32) engine MergeTree order by id sample by id as select number+1 as id, randUniform(1, 100) as x from numbers(10000);"
>>> db.query(x)
>>> db.query(y)
50.2891,95
Try the "Query on Pandas DataFrame" section of https://clickhouse.com/docs/en/chdb/install/python:
import chdb
import pandas as pd
df = pd.DataFrame(
{
"a": [1, 2, 3, 4, 5, 6],
"b": ["tom", "jerry", "auxten", "tom", "jerry", "auxten"],
}
)
chdb.query("SELECT b, sum(a) FROM Python(df) GROUP BY b ORDER BY b").show()
chdb.dataframe is going to be deprecated soon.
I still can't select from the table:
>>>
>>> chdb.query("create table a engine=Memory as SELECT b, sum(a) FROM Python(df) GROUP BY b ORDER BY b").show()
>>> chdb.query("select * from a").show()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/chdb/__init__.py", line 78, in query
raise ChdbError(res.error_message())
chdb.ChdbError: Code: 60. DB::Exception: Unknown table expression identifier 'a' in scope SELECT * FROM a. (UNKNOWN_TABLE)
When I use a session, the result is empty.
>>> from chdb.session import Session
>>> db = Session()
>>> db.query("create database db")
>>> db.query("use db")
>>> import chdb
>>> import pandas as pd
>>> df = pd.DataFrame(
... {
... "a": [1, 2, 3, 4, 5, 6],
... "b": ["tom", "jerry", "auxten", "tom", "jerry", "auxten"],
... }
... )
>>> db.query("create table a engine=Memory as SELECT b, sum(a) FROM Python(df) GROUP BY b ORDER BY b").show()
>>> db.query("select * from a").show()
engine=Log works; engine=Memory did not insert any rows.
>>> db.query("select count() from a")
0
>>> db.query("create table a1 engine=Log as SELECT b, sum(a) FROM Python(df) GROUP BY b ORDER BY b").show()
>>> db.query("select count() from a1")
3
>>> db.query("select * from a1")
"auxten",9
"jerry",7
"tom",5
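The workaround found in this comment (materialize the DataFrame into a Log-engine table inside a session, then query that table) can be sketched roughly as follows. This is only an illustration built from the calls already shown in this thread. The helper `ctas_from_dataframe` is a hypothetical name, not part of chDB, and the chDB part runs only if `chdb` and `pandas` are installed.

```python
import re

# Hypothetical helper (not a chDB API): builds the CREATE TABLE ... AS SELECT
# statement used in this thread. Names are restricted to plain identifiers
# because they are interpolated directly into the SQL text.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def ctas_from_dataframe(table: str, df_var: str, engine: str = "Log") -> str:
    """Return SQL that copies the DataFrame bound to `df_var` into `table`."""
    for name in (table, df_var, engine):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return f"CREATE TABLE {table} ENGINE = {engine} AS SELECT * FROM Python({df_var})"


# Log is the default here because, per this thread, engine=Memory left the
# table empty while engine=Log kept the rows.
sql = ctas_from_dataframe("a1", "df")

if __name__ == "__main__":
    try:
        import pandas as pd
        from chdb.session import Session
    except ImportError:
        print("chdb/pandas not installed; generated SQL:", sql)
    else:
        # Same sample data as in the comments above.
        df = pd.DataFrame(
            {
                "a": [1, 2, 3, 4, 5, 6],
                "b": ["tom", "jerry", "auxten", "tom", "jerry", "auxten"],
            }
        )
        db = Session()
        db.query("CREATE DATABASE IF NOT EXISTS db")
        db.query("USE db")
        db.query(sql)  # import the DataFrame once
        # Later queries in the same session can reuse the table.
        print(db.query("SELECT b, sum(a) FROM a1 GROUP BY b ORDER BY b"))
```

Keeping the DataFrame import as a single statement means the rest of the analysis can stay as plain SQL against the session.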
Hello! :) I am working on some feature engineering tasks and came across your great work, but I ran into some API limitations. The APIs I need are supported by ClickHouse and DuckDB but not by chDB, so I would like to discuss this.
Use case
When I use ClickHouse, my use case looks like this:
Describe the solution you'd like
In chDB, I tried the API for a whole day but could not find anything like ClickHouse's insert_df or DuckDB's duckdb.sql("INSERT INTO my_table BY NAME SELECT * FROM my_df"). I would like such an API so that I can import my data into the server and reuse my analysis queries. I do not know whether this would be an easy wrapper around the ClickHouse C++ library or would have to be implemented in C++. (By the way, ClickHouse's insert_df is not efficient, which is why I am trying chDB.)
Describe alternatives you've considered
I looked at the official example, where the SQL looks like this, but the problem is that my query may contain create temp table statements, which cannot be done with chdb.dataframe (it can only query, not create). Instead, I would like to insert the DataFrame into the database (logically, within a session) and then run many SQL statements in that session.
Additional context
Looking forward to your reply :)