-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: ensure no nulls in postgres. Fixes #13711 #14004
Conversation
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
/retest |
1 similar comment
/retest |
case string: | ||
ms, err := json.Marshal(v) | ||
if err != nil { | ||
return nil, err | ||
} | ||
tmp := string(ms) | ||
tmp = tmp[1:] | ||
tmp = tmp[:len(tmp)-1] | ||
m[k] = string(tmp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Horrible horrible hackiness
I have a couple of questions before we merge this. What are opinions on a serialize to []byte and then remove nulls approach instead? It seems saner. |
please don't remove the nulls, we use them in our bash script |
This doesn't remove nulls from workflows, it escapes to \u0000 when persisting to postgres. Your workflow wouldn't persist to postgres anyway if it had nulls. We can make this optional if you want ? |
just want to be able to run a workflow containing null character |
This fix doesn't break this so you got nothing to worry about there. |
AFAICT we don't have any automated tests for the PostgreSQL-specific logic in |
var workflow []byte | ||
var err error | ||
if r.dbType == Postgres { | ||
workflow, err = jsonMarshallRawStrings(wf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be good to ensure the new marshalled json is the same as the old way (with exception of the null char), could there be some if condition that compares them? otherwise there is risk that null character gets solved but other characters start being represented differently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PR has a test for this: #14011
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MasonM that is only for 1 tiny workflow, in the wild there are all kinds of workflows with various funky characters. i think a runtime check to ensure correctness is better than risk of a different/incorrect workflow being archived/retried
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tooptoop4 @MasonM sounds like maybe its better to serialize first and then remove null bytes instead of working at this layer? I was hoping to avoid that but perhaps that is indeed better.
I have another fix that works on the byte level and is able to restore the original workflow. I think I want that to go in instead. |
I abandoned the byte level fix in favour of this: #14030 Why? To properly do this byte level fix would be a ton of effort, but to add to this, it would be error prone as it relies on a lot of moving parts in the form of these field/offset pairs that track modifications. |
Motivation
Unfortunately postgres does not handle NULL bytes in the json, this is as a result of a non-compliant implementation of JSON that exists in postgres. See more details about this limitation here.
Modifications
This is arguably hacky solution, but this fix attempts to resolve this by mutating the workflow.
Verification
This has tests and additionally been verified by checking postgres.