Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

result set stream processing performance is horrible after switching to better-eval #528

Closed
odedpeer opened this issue Jun 12, 2023 · 12 comments
Assignees
Labels
bug Something isn't working

Comments

@odedpeer
Copy link

  1. What version of NodeJS driver are you using?

1.6.22

  1. What operating system and processor architecture are you using?

AWS Lambda

  1. What version of NodeJS are you using?

AWS Lambda Nodejs 14 runtime

  1. What are the component versions in the environment (npm list)?

5.Server version:

7.19.2

  1. What did you do?

I am running a query that returns 40,000 records, 12 of the column types are OBJECT.
The query_history view shows the query itself takes 5 seconds, but streaming and processing the result records takes more than 15 minutes

  1. What did you expect to see?

I expected the streaming and processing query to complete in 25 seconds, as it did for driver version 1.6.20
instead, streaming and processing the result records takes more than 15 minutes

  1. Can you set logging to DEBUG and collect the logs?

Yes.

  1. What is your Snowflake account identifier, if any? (Optional)

I pinpointed the reason to #465
the performance of better-eval ishorrible and makes processing the streamed records unusable.

it's easy to reproduce, create a table with 20 columns, where most of the columns are OBJECT or VARIANT, and 40,000 records, and query the table using the driver.

@odedpeer odedpeer added the bug Something isn't working label Jun 12, 2023
@odedpeer
Copy link
Author

you should also fix your automation.
I got a "failure" email after opening the issue https://github.com/snowflakedb/snowflake-connector-nodejs/actions/runs/5241922339/jobs/9464649534

@sfc-gh-dszmolka sfc-gh-dszmolka self-assigned this Jun 12, 2023
@sfc-gh-dszmolka sfc-gh-dszmolka added the status-triage Issue is under initial triage label Jun 12, 2023
@sfc-gh-dszmolka
Copy link
Collaborator

sfc-gh-dszmolka commented Jun 12, 2023

hi and thank you for raising this issue. we're aware and actively working towards the resolution. seems indeed related to better-eval, as a possible workaround you can revert changes done to column.js in the aforementioned PR (== go back to eval) , or just downgrade to 1.6.20.

since eval was used for many years and has been eliminated being a possible security gap, the long-term fix in the library for this issue might not come as easy as reverting better-eval to eval . but as mentioned, we're working on it and i'll keep this ticket updated.

edit: thank you for the notification about the automation too!

@sfc-gh-dszmolka sfc-gh-dszmolka added status-in_progress Issue is worked on by the driver team and removed status-triage Issue is under initial triage labels Jun 12, 2023
@odedpeer
Copy link
Author

@sfc-gh-dszmolka downgrading to 1.6.20 exposes us to a security risk as described in GHSA-h53w-7qw7-vh5c Snowflake NodeJS Driver vulnerable to Command Injection, which was patched in version 1.6.21

@sfc-gh-dszmolka
Copy link
Collaborator

unfortunately, PR465 also comes with 1.6.21 so at this moment apart from the manual modifications of the library, no other workaround or fix is available. will keep this issue updated.

@sfc-gh-dszmolka
Copy link
Collaborator

#536 under review for the fix, will update the Issue once the PR is

  • merged - then it can be installed and tested with npm i https://github.com/snowflakedb/snowflake-connector-nodejs.git (or similar with favoured package manager)
  • released - installable with regular methods of installing snowflake-sdk

@sfc-gh-dszmolka sfc-gh-dszmolka added status-pr_pending_merge A PR is made and is under review and removed status-in_progress Issue is worked on by the driver team labels Jun 15, 2023
@jucabot
Copy link

jucabot commented Jun 21, 2023

Same issue with Node 16 on Azure

@sfc-gh-dszmolka
Copy link
Collaborator

sfc-gh-dszmolka commented Jun 28, 2023

PR #536 merged and will be part of the next release (expecting towards second half of July).
It also needs a bit of clarification on usage, which will also go into the official documentation but it will take time, so as a heads-up here's some example on the usage. it now uses the following as a default:

  • vm for JSON parsing (pre-1.6.21: it was eval, 1.6.21-1.6.23: better-eval)
  • fast-xml-parser for XML parsing (unchanged since 1.6.21 which fixed the XML parsing)

so if you don't configure it, the above will be used as a default. however, the driver will now have the option of configuring either JSON or XML parser, or even both!

some examples:

  1. Configuring the JSON parser with eval (if you wish to use the pre-1.6.21 method)
const snowflake = require('snowflake-sdk');
snowflake.configure({
  jsonColumnVariantParser: rawColumnValue => eval("(" + rawColumnValue + ")")
})
  1. Configuring the XML parser with the fast-xml-parser module (this is also the default)
const snowflake = require('snowflake-sdk');
snowflake.configure({
  xmlColumnVariantParser: rawColumnValue => new (require("fast-xml-parser")).XMLParser().parse(rawColumnValue)
})
  1. Configuring the XML parser with the fast-xml-parser module and initialize it with some non-default configuration options (see the module's XMLparseOptions for details)
const snowflake = require('snowflake-sdk');
snowflake.configure({
  xmlColumnVariantParser: rawColumnValue => new (require("fast-xml-parser"))
    .XMLParser({ignoreAttributes: false, attributeNamePrefix: '@', attributesGroupName: null})
    .parse(rawColumnValue)
})
  1. Configuring the JSON parser with JSON.parse() and XML with fast-xml-parser (the latter is also the default)
const snowflake = require('snowflake-sdk');
snowflake.configure({
  jsonColumnVariantParser: rawColumnValue => JSON.parse(rawColumnValue),
  xmlColumnVariantParser: rawColumnValue => new (require("fast-xml-parser")).XMLParser().parse(rawColumnValue)
})

will update the issue again once the new version is released.

@sfc-gh-dszmolka sfc-gh-dszmolka added status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. and removed status-pr_pending_merge A PR is made and is under review labels Jun 28, 2023
@sfc-gh-dszmolka
Copy link
Collaborator

PR is released as part of the latest Snowflake Node.JS driver version 1.7.0. Closing this issue.

@sfc-gh-dszmolka sfc-gh-dszmolka removed the status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. label Jul 27, 2023
@haggholm
Copy link

The PR has a brief remark on performance:

Average execution time of testVariantLarge over 10 runs:

better-eval: 22.6s
vm: 4.4s

Is it known how this compares to the older versions using eval()?

@sfc-gh-dszmolka
Copy link
Collaborator

on the particular test, eval was around 10% faster than vm when running the same test. but this might even change with the dataset.
now the parser is configurable so if anyone wishes, they can configure eval again

@sfc-gh-dszmolka
Copy link
Collaborator

Releng team hit a bump in the release process which will be continued on Monday as expected. Reopening the issue and will only close when artifact confirmed to published to npm. Apologies for the inconvenience.

@sfc-gh-dszmolka sfc-gh-dszmolka added the status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. label Jul 28, 2023
@sfc-gh-dszmolka
Copy link
Collaborator

PR is released as part of the latest Snowflake Node.JS driver version 1.7.0, for real this time :) Visible on npm too. Closing this issue and thank you for bearing with us !

@sfc-gh-dszmolka sfc-gh-dszmolka removed the status-fixed_awaiting_release The issue has been fixed, its PR merged, and now awaiting the next release cycle of the connector. label Jul 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

6 participants