Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6388] PsTableFunction throws NumberFormatException when the … #3777

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

asolimando
Copy link
Member

…'user' column has spaces

Copy link

sonarcloud bot commented Apr 27, 2024

@snuyanzin
Copy link
Contributor

snuyanzin commented Apr 28, 2024

I guess the problem here that other columns also could contain space separated values like e.g. comm

I tried this ./sqlsh select distinct \`user\` from ps for the branch for this PR
and the output is

6.9
13.3
0.0
0.1
1.0
0.2
1.1
0.3
1.2
69.7
39.5
0.4
0.5
0.6
0.7
108
4.4
0.8
1.8
35.8
0.9

while for main same query is

message+
serg
avahi
lp
kernoops
root
rtkit
syslog
systemd+
colord

after some checks I noticed that command could also contain space separated values and probably this is the reason...

Copy link
Member

@caicancai caicancai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some tests to illustrate the jira case?

@asolimando
Copy link
Member Author

@snuyanzin, you are right, both user and comm can contain spaces, I have looked up if any of the other fields could be used in an unambiguous manner as an anchor to understand when user is over, unfortunately it doesn't seem to be any.

What I came up with is to move user as first field, split by whitespace as before, and stop parsing user whenever a purely numeric token is found. For comm I am just composing all the tokens counting on the fact that the number of fields is known.

So at the moment we support spaces, but not when one of the tokens is purely numeric like root 123. I don't have better ideas, it seems already an improvement over the current implementation which doesn't properly support spaces neither in user nor in comm.

@caicancai I have added a test, I had to unpack a bit the current implementation to that aim, thanks for the reminder.

Eager to hear your thoughts, thanks for your comments!

Copy link

sonarcloud bot commented Jun 2, 2024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants