Row counts don't match row counts from using a JDBC connection or when using an apache adbc connection #2116

Open
matquant14 opened this issue Dec 10, 2024 · 3 comments
Assignees
Labels
status-triage_done Initial triage done, will be further handled by the driver team

Comments

@matquant14

matquant14 commented Dec 10, 2024

Python version

3.12.7

Operating system and processor architecture

Windows-10-10.0.19045-SP0

Installed packages

adbc-driver-manager==1.3.0
adbc-driver-snowflake==1.3.0
aiobotocore==2.15.2
aiohappyeyeballs==2.4.4
aiohttp==3.11.9
aioitertools==0.12.0
aiosignal==1.3.1
altair==5.5.0
altair_tiles==0.3.0
annotated-types==0.7.0
anyio==4.7.0
anywidget==0.9.13
appdirs==1.4.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arro3-core==0.4.2
arrow==1.3.0
arrow-odbc==8.0.2
asn1crypto==1.5.1
asttokens==3.0.0
async-lru==2.0.4
attrs==24.2.0
babel==2.16.0
beautifulsoup4==4.12.3
black==24.10.0
bleach==6.2.0
bokeh==3.6.2
BotCore==1.1.4
boto3==1.35.75
botocore==1.35.76
Bottleneck==1.4.2
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.4.0
click==8.1.7
cloudpickle==3.1.0
colorama==0.4.6
colorcet==3.1.0
comm==0.2.2
commonmark==0.9.1
connectorx==0.4.0
contourpy==1.3.1
cryptography==44.0.0
cycler==0.12.1
DatastreamNavigator==1.0.7
DatastreamPy==2.0.30
debugpy==1.8.9
decorator==5.1.1
defusedxml==0.7.1
et_xmlfile==2.0.0
executing==2.1.0
fastexcel==0.12.0
fastjsonschema==2.21.1
filelock==3.16.1
fonttools==4.55.2
fqdn==1.5.1
frozenlist==1.5.0
fsspec==2024.10.0
gevent==24.11.1
great-tables==0.14.0
greenlet==3.1.1
h11==0.14.0
helium==5.1.0
holoviews==1.20.0
html5lib==1.1
htmltools==0.6.0
httpcore==1.0.7
httpx==0.28.0
hvplot==0.11.1
idna==3.10
importlib_metadata==8.5.0
importlib_resources==6.4.5
ipykernel==6.29.5
ipython==8.30.0
ipywidgets==8.1.5
isoduration==20.11.0
jedi==0.19.2
Jinja2==3.1.4
jmespath==1.0.1
json5==0.10.0
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jupyter==1.1.1
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.3
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.3.2
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.13
kiwisolver==1.4.7
linkify-it-py==2.0.3
llvmlite==0.43.0
lseg-data==2.0.1
lxml==5.3.0
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.2
matplotlib==3.9.3
matplotlib-inline==0.1.7
mdit-py-plugins==0.4.2
mdurl==0.1.2
mercantile==1.2.1
mistune==3.0.2
multidict==6.1.0
mypy-extensions==1.0.0
narwhals==1.15.2
nbclient==0.10.1
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
notebook==7.3.1
notebook_shim==0.2.4
numba==0.60.0
numexpr==2.10.2
numpy==2.1.3
odfpy==1.4.1
opencv-python==4.10.0.84
openpyxl==3.1.5
outcome==1.3.0.post0
overrides==7.7.0
packaging==24.2
pandas==2.2.3
pandocfilters==1.5.1
panel==1.5.4
param==2.1.1
parso==0.8.4
pathspec==0.12.1
patsy==1.0.1
pendulum==3.0.0
pillow==11.0.0
pip-system-certs==4.0
platformdirs==4.3.6
plotly==5.24.1
polars==1.16.0
polars-business==0.3.21
polars-ols==0.3.5
polars-xdt==0.16.0
polars_ds==0.6.2
prometheus_client==0.21.1
prompt_toolkit==3.0.48
propcache==0.2.1
protobuf==5.28.3
psutil==6.1.0
psygnal==0.11.1
pure_eval==0.2.3
pyarrow==18.1.0
pycparser==2.22
pydantic==2.10.3
pydantic_core==2.27.1
pyee==11.1.0
Pygments==2.18.0
pyhumps==3.8.0
PyJWT==2.10.1
pyodbc==5.2.0
pyOpenSSL==24.3.0
pyparsing==3.2.0
pypdf==5.1.0
PySocks==1.7.1
pyTelegramBotAPI==4.23.0
python-calamine==0.3.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
pytz==2024.2
pyviz_comms==3.0.3
pywin32==308
pywinpty==2.0.14
pyxdg==0.28
pyxlsb==1.0.10
PyYAML==6.0.2
pyzmq==26.2.0
qtconsole==5.6.1
QtPy==2.4.2
referencing==0.35.1
requests==2.32.3
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds==5.1.0
rpds-py==0.22.3
ruff==0.8.1
s3fs==2024.10.0
s3transfer==0.10.4
scipy==1.14.1
seaborn==0.13.2
selenium==4.27.1
Send2Trash==1.8.3
setuptools==75.6.0
simplejson==3.19.3
six==1.17.0
sniffio==1.3.1
snowflake-connector-python==3.12.4
sortedcontainers==2.4.0
soupsieve==2.6
SQLAlchemy==2.0.36
sqlglot==25.33.0
sqlglotrs==0.3.0
stack-data==0.6.3
statsmodels==0.14.4
tabulate==0.9.0
tenacity==8.5.0
terminado==0.18.1
time-machine==2.16.0
tinycss2==1.4.0
tomlkit==0.13.2
tornado==6.4.2
traitlets==5.14.3
trio==0.27.0
trio-websocket==0.11.1
types-python-dateutil==2.9.0.20241003
typing_extensions==4.12.2
tzdata==2024.2
uc-micro-py==1.0.3
uri-template==1.3.0
urllib3==2.2.3
uv==0.5.7
vega-datasets==0.9.0
vegafusion==2.0.1
vegafusion-python-embed==1.6.9
vl-convert-python==1.6.1
watchdog==2.3.1
wcwidth==0.2.13
webcolors==24.11.1
webdriver-manager==4.0.2
webencodings==0.5.1
websocket-client==1.8.0
wheel==0.45.1
widgetsnbextension==4.0.13
wincertstore==0.2.1
wrapt==1.17.0
wsproto==1.2.0
wxPython==4.2.2
xarray==2024.11.0
xlrd==2.0.1
xlsx2csv==0.8.4
XlsxWriter==3.2.0
xyzservices==2024.9.0
yarl==1.18.3
zipp==3.21.0
zope.event==5.0
zope.interface==7.2

What did you do?

I am running a few variations of an ad hoc SQL query to compare results. When I connect using DataGrip, via JDBC, my SQL query returns 5652 rows. When I connect with Python, using the Snowflake connector, I get 5625 rows. When I connect with Python, using the Apache ADBC connection, I get results that match the JDBC output.

What did you expect to see?

Row counts match across connection types.

Can you set logging to DEBUG and collect the logs?

import logging

# Send DEBUG-level output from the Snowflake connector to the console.
for logger_name in ('snowflake.connector',):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()  # writes to sys.stderr by default
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
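
To collect the logs rather than only print them, the same logger can also write to a file. The following is a minimal sketch using only the standard library; the helper name `add_debug_file_handler` and the log path are hypothetical, while the logger name `snowflake.connector` and the format string come from the snippet above.

```python
import logging


def add_debug_file_handler(log_path, logger_name="snowflake.connector"):
    """Attach a DEBUG-level file handler to the named logger and return it."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    # FileHandler appends to log_path and creates the file if it is missing.
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - "
        "%(funcName)s() - %(levelname)s - %(message)s"
    ))
    logger.addHandler(handler)
    return handler
```

Calling `add_debug_file_handler("snowflake_debug.log")` before opening the connection should capture the connector's debug output in that file. Debug logs can contain account and query details, so they are better attached to a private support ticket than posted publicly.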
@sfc-gh-sghosh

sfc-gh-sghosh commented Dec 12, 2024

Hello @matquant14,

Thanks for the update.
Can you let us know the table name in Snowflake, along with the account details, DB, and schema?
Ideally, all Snowflake connectors should return the same number of rows.

Could you please check the following:

  1. In the Snowsight UI, run select count(*) against the table.
  2. Now do the same thing from a Python application using the Snowflake Python connector.
  3. Now do the same thing from a JDBC application using the Snowflake JDBC connector 3.21.0.
    https://repo1.maven.org/maven2/net/snowflake/snowflake-jdbc/3.21.0/snowflake-jdbc-3.21.0.jar
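
The count check in the steps above can be sketched for any Python DB-API connection, which covers both the Snowflake Python connector and the ADBC driver's DB-API interface. This is only an illustration: the helper name `count_rows` is hypothetical, the query passed in must not end with a semicolon, and `sqlite3` stands in for a real Snowflake connection so that the example runs on its own.

```python
import sqlite3


def count_rows(conn, sql):
    """Return how many rows `sql` produces, counted by the database itself."""
    cur = conn.cursor()
    try:
        # Wrap the query as a subquery so that arbitrary SELECTs,
        # including multi-join queries, can be counted the same way.
        cur.execute(f"SELECT COUNT(*) FROM ({sql}) AS q")
        return cur.fetchone()[0]
    finally:
        cur.close()


if __name__ == "__main__":
    # Stand-in database; replace with a real connection when reproducing.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, grp TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "a"), (3, "b")])
    print(count_rows(conn, "SELECT * FROM t WHERE grp = 'a'"))
```

Counting on the server side keeps the comparison independent of how each driver fetches and materializes result rows.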

Please let us know.
Regards,
Sujan

@sfc-gh-sghosh sfc-gh-sghosh added status-triage Issue is under initial triage and removed bug needs triage labels Dec 12, 2024
@matquant14
Author

Hi @sfc-gh-sghosh,

I'm running a complex SQL query with multiple joins; it does not target a single table. I'm using our enterprise account, so I would rather not share those details publicly. Is there a secure way to share them?

I did re-run everything using the Snowsight UI, the DataGrip IDE with JDBC 3.20.0, the Python Snowflake connector, and the Python ADBC connection. The results are inconsistent across the board, which has me even more concerned now:

┌───────────────────────────────────┬──────────┐
│ Connection Type                   │   Counts │
├───────────────────────────────────┼──────────┤
│ Snowsight UI                      │     5629 │
│ DataGrip JDBC 3.20.0              │     5625 │
│ Snowflake Python Connector 3.12.4 │     5625 │
│ ADBC Snowflake 1.3.0              │     5634 │
└───────────────────────────────────┴──────────┘
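
When the counts differ by only a handful of rows, as in the table above, it can help to see which rows differ rather than only how many. The following is a minimal sketch, assuming both result sets fit in memory and were fetched as sequences of rows; the helper name `diff_rows` is hypothetical. It compares rows as multisets, so duplicated rows are accounted for.

```python
from collections import Counter


def diff_rows(rows_a, rows_b):
    """Return (only_in_a, only_in_b) as Counters keyed by row tuple.

    Each Counter value is how many more times the row appears on that side.
    """
    counts_a = Counter(tuple(row) for row in rows_a)
    counts_b = Counter(tuple(row) for row in rows_b)
    # Counter subtraction keeps only positive differences.
    return counts_a - counts_b, counts_b - counts_a


if __name__ == "__main__":
    # Made-up rows for illustration only.
    from_connector = [(1, "x"), (2, "y"), (2, "y")]
    from_adbc = [(1, "x"), (2, "y"), (3, "z")]
    only_connector, only_adbc = diff_rows(from_connector, from_adbc)
    print(dict(only_connector))
    print(dict(only_adbc))
```

Drivers can return the same value as different Python types (for example, timestamps with or without time zone information), so values may need to be normalized before the comparison is meaningful.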

@sfc-gh-sghosh

Hello @matquant14,

Please open an official support ticket with account details and query IDs; we need those for investigation.
https://www.snowflake.com/en/support/

Regards,
Sujan
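
For the query IDs requested above: with the Snowflake Python connector, the cursor exposes the ID of the most recently executed statement as the `sfqid` attribute. A minimal sketch for recording it alongside the row count follows. The helper name `run_with_query_id` is hypothetical, and `getattr` is used so that cursors from other drivers, which may lack this attribute, yield `None` instead of raising.

```python
def run_with_query_id(cursor, sql):
    """Execute `sql` and return its query ID (if available) and row count."""
    cursor.execute(sql)
    rows = cursor.fetchall()
    return {
        # `sfqid` is specific to the Snowflake Python connector's cursor.
        "query_id": getattr(cursor, "sfqid", None),
        "row_count": len(rows),
    }
```

Recording the ID at the moment each count is taken makes it possible to match every number in a comparison to a specific execution in the account's query history.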

@sfc-gh-sghosh sfc-gh-sghosh added status-triage_done Initial triage done, will be further handled by the driver team and removed status-triage Issue is under initial triage labels Dec 13, 2024