The first commit switches the existing client to use the /v2/query endpoint. The second commit adds support for NDJSON streaming.
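For context, NDJSON is one standalone JSON document per line, which is what makes it suitable for streaming query results. A minimal sketch of consuming such a stream (standalone parsing only; the transport layer and the actual message schema are assumptions, not this client's API):

```python
import json


def iter_ndjson(lines):
    """Yield one parsed JSON object per non-empty line of an NDJSON stream."""
    for line in lines:
        line = line.strip()
        if line:  # NDJSON permits blank lines between records
            yield json.loads(line)


# Hypothetical response body: each line is a standalone JSON message.
body = '{"type": "row", "data": [1, 2]}\n{"type": "done"}\n'
messages = list(iter_ndjson(body.splitlines()))
```

In practice the lines would come from an iterator over the HTTP response body rather than a string, but the per-line parsing is the same.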
I went over a couple of solutions for making the switch to the new endpoint. Here are some requirements to keep in mind:
Behavior should not silently change. For instance, we can't silently have a min_timestamp set when existing usage did not set any.
Having existing usage of the client deprecated is fine, but disruption should ideally happen only once.
The first solution would be to introduce new methods, leaving the existing ones unchanged (still targeting /v1/query):
The existing methods would still have to be marked as deprecated, and it is also unclear how to name the new ones (query_json_v2() isn't ideal, because if/when we remove the old ones, we end up with this v2 naming in the method names).
The benefit is we can freely change the data structure to directly match the /v2/query result structure (e.g. no _transform_fields_for_backwards_compatibility(), etc).
The second solution is to adapt the existing methods, while still preserving backwards compatibility:
Omitting min_timestamp is deprecated, and will be required in the future (internally, we use 2020-01-01 as a min_timestamp).
The data structure is mapped to preserve backwards compatibility (_map_v2_result()). This remaps the columns field, and only renames data -> rows (so no concerns of memory/CPU usage client side).
Only concern: /v2/query's default limit went from 100 to 1000, and we rely on the remote default here. I feel it is ok to potentially return more results, but could this break existing usage? Could anyone be relying on the limit of 100 in some way?
Imo 2 is the best option, so this is the one I implemented.
For the query stream method, I'm wondering if a utility to filter (in a type-safe way?) message types would be useful. Maybe having users filter with if message['type'] == ... is enough.
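One shape such a utility could take: a generator that narrows the stream to one message type. The `QueryMessage` alias is hypothetical; a real implementation might use a TypedDict per message type to get static narrowing out of the filter.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical alias; could be a union of per-type TypedDicts for type safety.
QueryMessage = Dict[str, Any]


def filter_messages(stream: Iterable[QueryMessage],
                    message_type: str) -> Iterator[QueryMessage]:
    """Yield only the messages whose `type` field matches `message_type`."""
    for message in stream:
        if message["type"] == message_type:
            yield message


# The inline alternative, which may well be enough:
#   for message in stream:
#       if message["type"] == "row":
#           ...
```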
Adapt the current methods. We can leave them around longer that way.
Apart from query_json() (which we could also keep without deprecating), the existing methods I modified aren't meant to go away.
We add new methods / new clients that don't do (somewhat expensive) transformations.
How would the new ones be named? As per the PR body, I'm not a fan of v2 being in method names. Also, the data transformation added in this PR will not be expensive, mostly because the data isn't deeply modified; only the top-level key names are changed. In a major version, we could introduce a breaking change that renames the keys to match the /v2/query endpoint verbatim. It would be a one-off migration for users, and I don't think it will cause any issues, as the SDK is most likely not used by other libraries that would need to maintain compatibility with both.