psycopg is a popular PostgreSQL adapter for Python. Here is how you can connect to MyDuck Server using psycopg:
import psycopg
with psycopg.connect("dbname=postgres user=postgres host=127.0.0.1 port=5432", autocommit=True) as conn:
    with conn.cursor() as cur:
        ...
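The connection string passed to `psycopg.connect` is a space-separated list of `key=value` pairs in libpq style. As a minimal sketch, the hypothetical helper below (`build_conninfo` is not part of psycopg or MyDuck Server) assembles such a string from keyword arguments and quotes values that are empty or contain spaces, quotes, or backslashes. psycopg also provides `psycopg.conninfo.make_conninfo` for the same purpose, and `psycopg.connect` accepts these parameters directly as keyword arguments.

```python
def build_conninfo(**params):
    """Build a libpq-style 'key=value' connection string (illustrative helper)."""
    parts = []
    for key, value in params.items():
        if value is None:
            continue  # skip parameters that were not set
        text = str(value)
        # Backslashes and single quotes are escaped with a backslash.
        escaped = text.replace("\\", "\\\\").replace("'", "\\'")
        # Empty values and values with whitespace or escapes are single-quoted.
        if text == "" or escaped != text or any(ch.isspace() for ch in text):
            escaped = f"'{escaped}'"
        parts.append(f"{key}={escaped}")
    return " ".join(parts)


conninfo = build_conninfo(dbname="postgres", user="postgres", host="127.0.0.1", port=5432)
print(conninfo)
```

The resulting string can be passed to `psycopg.connect(conninfo, autocommit=True)` in place of the literal used above.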
The COPY command in PostgreSQL is a powerful tool for bulk data transfer. Here is how you can use it with the psycopg library to interact directly with MyDuck Server:
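# Write raw data in COPY text format (tab-separated fields, newline-terminated rows)
with cur.copy("COPY test.tb1 (id, num, data) FROM STDIN") as copy:
    copy.write(b"1\t100\taaa\n")

# Write one row at a time and let psycopg handle the formatting
with cur.copy("COPY test.tb1 (id, num, data) FROM STDIN") as copy:
    copy.write_row((1, 100, "aaa"))

# Read the raw data blocks
with cur.copy("COPY test.tb1 TO STDOUT") as copy:
    for block in copy:
        print(block)

# Read row by row; each row is a tuple
with cur.copy("COPY test.tb1 TO STDOUT") as copy:
    for row in copy.rows():
        print(row)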
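The raw bytes passed to `copy.write` above follow PostgreSQL's default COPY text format: fields are separated by tabs, each row ends with a newline, `\N` stands for NULL, and backslashes, tabs, newlines, and carriage returns inside a value are backslash-escaped. The sketch below shows one way to produce such a line by hand. `encode_copy_row` is a hypothetical helper, and the assumption that MyDuck Server treats every escape exactly as PostgreSQL does is not confirmed by this document. In most cases `copy.write_row` is the simpler choice.

```python
def encode_copy_row(values):
    """Encode one row as a line of PostgreSQL COPY text format (illustrative helper)."""
    fields = []
    for value in values:
        if value is None:
            fields.append("\\N")  # NULL marker in text format
            continue
        text = str(value)
        # Escape the backslash first so later escapes are not doubled.
        text = (
            text.replace("\\", "\\\\")
            .replace("\t", "\\t")
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        fields.append(text)
    # Tab-separated fields, newline-terminated row, encoded as UTF-8.
    return ("\t".join(fields) + "\n").encode("utf-8")


print(encode_copy_row((1, 100, "aaa")))
```

The returned bytes can be handed to `copy.write(...)` inside a `COPY ... FROM STDIN` block.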
2. Importing and Exporting Data in Arrow Format
The pyarrow package allows efficient data interchange between DataFrame libraries and MyDuck Server. Here is how to import and export data in Arrow format:
import io

import pandas as pd
import pyarrow as pa

# Create a pandas DataFrame
data = {
    'id': [1, 2, 3],
    'num': [100, 200, 300],
    'data': ['aaa', 'bbb', 'ccc']
}
df = pd.DataFrame(data)

# Convert the DataFrame to an Arrow Table
table = pa.Table.from_pandas(df)

# Serialize the table as an Arrow IPC stream and copy it into MyDuck Server
output_stream = io.BytesIO()
with pa.ipc.RecordBatchStreamWriter(output_stream, table.schema) as writer:
    writer.write_table(table)
with cur.copy("COPY test.tb1 FROM STDIN (FORMAT arrow)") as copy:
    copy.write(output_stream.getvalue())

# Copy the data back from MyDuck Server as an Arrow IPC stream
arrow_data = io.BytesIO()
with cur.copy("COPY test.tb1 TO STDOUT (FORMAT arrow)") as copy:
    for block in copy:
        arrow_data.write(block)

# Read the stream into an Arrow Table
with pa.ipc.open_stream(arrow_data.getvalue()) as reader:
    arrow_df = reader.read_all()
    print(arrow_df)

# Read the stream into a pandas DataFrame
with pa.ipc.open_stream(arrow_data.getvalue()) as reader:
    pandas_df = reader.read_pandas()
    print(pandas_df)
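The export loop above (iterate over the copy object and append each block to a buffer) is the same for every `COPY ... TO STDOUT` statement, so it can be wrapped in a small function. The following is a minimal sketch; `copy_to_bytes` is a hypothetical name, and the only thing it assumes about `cur` is that it behaves like a psycopg cursor whose `copy()` context manager yields bytes-like blocks.

```python
import io


def copy_to_bytes(cur, statement):
    """Run a COPY ... TO STDOUT statement and return the full payload as bytes."""
    buffer = io.BytesIO()
    with cur.copy(statement) as copy:
        for block in copy:
            # Each block is a bytes-like chunk of the output stream.
            buffer.write(block)
    return buffer.getvalue()
```

With this helper, the Arrow export becomes `pa.ipc.open_stream(copy_to_bytes(cur, "COPY test.tb1 TO STDOUT (FORMAT arrow)"))`. The whole result is buffered in memory, which is fine for small tables but may not suit very large ones.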
Polars is a fast DataFrame library that can work with Arrow data. Here is how to use Polars to convert Arrow tables or pandas DataFrames into Polars DataFrames:
import polars as pl
polars_df = pl.from_arrow(arrow_df)
polars_df = pl.from_pandas(pandas_df)
You can also retrieve query results from MyDuck Server as DataFrames using Arrow format. Here is an example:
# Copy query result to a Polars DataFrame
arrow_data = io.BytesIO()
with cur.copy("COPY (SELECT id, num * num AS num FROM test.tb1) TO STDOUT (FORMAT arrow)") as copy:
    for block in copy:
        arrow_data.write(block)
with pa.ipc.open_stream(arrow_data.getvalue()) as reader:
    arrow_table = reader.read_all()
    polars_df = pl.from_arrow(arrow_table)
    print(polars_df)
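The statement in the example above has a fixed shape: the query is wrapped in parentheses and followed by `TO STDOUT (FORMAT arrow)`. As a small sketch, the hypothetical helper below (`arrow_copy_statement` is not part of any library) builds that statement from an arbitrary query and drops a trailing semicolon, which would otherwise end up inside the parentheses. It does no validation or escaping, so it should only be given trusted SQL.

```python
def arrow_copy_statement(query):
    """Wrap a query in COPY (...) TO STDOUT (FORMAT arrow) (illustrative helper)."""
    # Remove surrounding whitespace and any trailing semicolon.
    inner = query.strip().rstrip(";").strip()
    if not inner:
        raise ValueError("query must not be empty")
    return f"COPY ({inner}) TO STDOUT (FORMAT arrow)"


print(arrow_copy_statement("SELECT id, num * num AS num FROM test.tb1;"))
```

The returned string can be passed to `cur.copy(...)` exactly like the literal statement used above.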