-
Notifications
You must be signed in to change notification settings - Fork 3
Snowflake Dialect and Testing #363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
knassre-bodo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left behind some initial feedback
knassre-bodo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more revisions
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
| When submitting a PR, you can control which CI tests run by adding special flags | ||
| to your **latest commit message**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's specify that all of these flags are case insensitive
README.md
Outdated
| When submitting a PR, you can control which CI tests run by adding special flags | ||
| to your **latest commit message**. | ||
|
|
||
| - To run **PyDough CI tests**, add: `[run CI]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specify that this one will not run tests from any sql database except sqlite
| @@ -0,0 +1,307 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also show another example where we create the connector object, THEN pass it in to Snowflake with connection=..., then run another test on that (& after that test show us doing something with the connector, like fetching the query text of the last query run through that connector).
Reply via ReviewNB
| @@ -0,0 +1,307 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: let's fix the formatting on the first line so lines is on the next line and is matched up with .PARTITION
Reply via ReviewNB
knassre-bodo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just some last nitpicks and cleanup revisions
| "validate_qualify_columns": False, | ||
| } | ||
| # Exclude Snowflake dialect to avoid some issues | ||
| # related to qualify and column decorrelation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it column de-correlation or just column name qualification? I ask because I'm pretty sure qualification happens somewhere else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I mixed things up.
updated it
| @pytest.fixture(scope="session") | ||
| def get_sf_sample_graph( | ||
| sf_sample_graph_path: str, | ||
| valid_sample_graph_names: set[str], | ||
| ) -> graph_fetcher: | ||
| """ | ||
| A function that takes in the name of a graph from the supported sample | ||
| Snowflake graph names and returns the metadata for that PyDough graph. | ||
| """ | ||
|
|
||
| @cache | ||
| def impl(name: str) -> GraphMetadata: | ||
| if name not in valid_sample_graph_names: | ||
| raise Exception(f"Unrecognized graph name '{name}'") | ||
| return pydough.parse_json_metadata_from_file( | ||
| file_path=sf_sample_graph_path, graph_name=name | ||
| ) | ||
|
|
||
| return impl | ||
|
|
||
|
|
||
| @pytest.fixture(scope="session") | ||
| def get_sf_defog_graphs() -> graph_fetcher: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge these: we can place the sample graphs & defog graphs for Snowflake in the same JSON file, and merge the fixtures since they are functions that take in the name of a graph and return that graph within that file. We can ignore the valid_sample_graph_names stuff since that is a mostly-redundant layer of error checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll address that in a followup PR. For now, I will keep them separate to match what we have for SQLite.
pydough/sqlglot/sqlglot_helpers.py
Outdated
| return expr.this if isinstance(expr, SQLGlotAlias) else expr | ||
|
|
||
|
|
||
| def is_boolean_expression(expr: SQLGlotExpression) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this being used anywhere? I can't find it. If not, let's delete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, old code
| def convert_get_part( | ||
| self, args: SQLGlotExpression, types: list[PyDoughType] | ||
| ) -> SQLGlotExpression: | ||
| return sqlglot_expressions.Anonymous(this="SPLIT_PART", expressions=args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can get rid of this if we are using PYDOP_TO_SNOWFLAKE_FUNC
| def convert_sum( | ||
| self, arg: SQLGlotExpression, types: list[PyDoughType] | ||
| ) -> SQLGlotExpression: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an idea: let's move this method to the base class, then have this be the override, that way we don't need to case on SUM in the subclasses' implementation of convert_call_to_sqlglot. We can also edit the mysql file accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated base and SF files.
MySQL doesn't have SUM variant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 I was thinking of PostgreSQL
| arg (SQLGlotExpression): The argument to the SUM function. | ||
| types (list[PyDoughType]): The types of the arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's include backticks for better tooltips, and exclude the types (since they are already in the type annotation).
| arg (SQLGlotExpression): The argument to the SUM function. | |
| types (list[PyDoughType]): The types of the arguments. | |
| `arg` The argument to the SUM function. | |
| `types`: The types of the arguments. |
| constant_propagation: bool = False, | ||
| dialect: DialectType = None, | ||
| max_depth: int | None = None, | ||
| ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add type hint for the return and throughout the function if it's necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy/paste from SqlGlot so won't update that
| Args: | ||
| expression: expression to simplify | ||
| constant_propagation: whether the constant propagation rule should be used | ||
| max_depth: Chains of Connectors (AND, OR, etc) exceeding `max_depth` will be skipped |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the dialect parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
| constant_propagation: whether the constant propagation rule should be used | ||
| max_depth: Chains of Connectors (AND, OR, etc) exceeding `max_depth` will be skipped | ||
| Returns: | ||
| sqlglot.Expression: simplified expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| sqlglot.Expression: simplified expression | |
| simplified expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
pydough/sqlglot/sqlglot_helpers.py
Outdated
| if isinstance(expr, boolean_types): | ||
| return True | ||
|
|
||
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify this just to:
| if isinstance(expr, boolean_types): | |
| return True | |
| return False | |
| return isinstance(expr, boolean_types) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip. However, whole function not needed so deleted that code.
| unit: DateTimeUnit, | ||
| ) -> SQLGlotExpression: | ||
| # Update argument type to fit datetime | ||
| dt_expr = self.handle_datetime_base_arg(args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type hint if required
resolves #296
All dialects testing passed, see here