Line parsing strategy is incorrect in some cases #37

Open
marianogappa opened this issue Jul 29, 2018 · 2 comments

Comments

@marianogappa
Owner

Let's say the data is separated by commas. This input:

1,2,3
1,,3
1,2,3

Should be interpreted as 3 rows and 3 columns with all floats, the second row having a zero for its second column.

This is not what happens at the moment. Instead, the second row is ignored because it's interpreted as having only 2 columns. That only makes sense when the separator is a space, in which case consecutive spaces are collapsed into one as a preprocessing step.

This preprocessing should only be applied when the separator is a space.
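
To make the failure concrete, here is a minimal sketch of the behaviour described above. It is not the code in chart's format.go; `collapseAndSplit` is a hypothetical stand-in that squeezes runs of the separator before splitting, which is the assumption this example rests on.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseAndSplit is a hypothetical stand-in for the behaviour described
// above, not the actual code in format.go: runs of the separator are
// squeezed into a single one before the line is split.
func collapseAndSplit(line string, sep rune) []string {
	s := string(sep)
	// Repeat until no doubled separator is left, so runs of any length collapse.
	for strings.Contains(line, s+s) {
		line = strings.ReplaceAll(line, s+s, s)
	}
	return strings.Split(line, s)
}

func main() {
	for _, line := range []string{"1,2,3", "1,,3", "1,2,3"} {
		fields := collapseAndSplit(line, ',')
		fmt.Printf("%q -> %d fields\n", line, len(fields))
	}
}
```

Under that assumption the middle line comes out with two fields instead of three, so a parser expecting three columns would discard it.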

@Kuraio
Contributor

Kuraio commented Nov 11, 2018

What if the data is a mix of floats, dates and/or strings?

This case could work:

14.2	215	70	Nestle
16.4		70	Häagen-Dazs

What about:

14.2	215	70	Nestle
16.4	325	70

You would want to exclude this second line, right?

So the default value for "empty" floats is 0?
What about dates or strings?

@marianogappa
Owner Author

Sorry, I wasn't clear about the issue.

There's a difference in how people perceive separators, which I took into account when parsing lines: normally you'd have "[value][separator][value]...", but with spaces specifically you could have several separators in a row. Concretely, whenever chart sees several separators in a row, it treats them as a single one: https://github.com/marianogappa/chart/blob/master/format/format.go#L69 and https://github.com/marianogappa/chart/blob/master/format/format.go#L170

This seems correct when the separator is a space (is it?), but it's definitely wrong when it's, e.g., a comma.

Outstanding work on this issue: evaluate whether this "smart feature" is helpful at all, and whether it should be removed or applied only to spaces; write adequate tests for those cases (hopefully here: https://github.com/marianogappa/chart/blob/master/format/format_test.go); and actually implement the changes. * wink wink *
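
As a starting point for that work, here is a rough sketch of the "apply only to spaces" option together with a few cases taken from this thread. `splitLine` is a hypothetical helper, not part of chart's existing code, and the case table is only a suggestion for what tests might cover.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLine is a hypothetical helper (not part of chart's code). Runs of
// separators are collapsed only when the separator is a space; for any
// other separator, an empty field between two separators is preserved.
func splitLine(line string, sep rune) []string {
	if sep == ' ' {
		// FieldsFunc splits around runs of matching runes and never
		// returns empty strings, so repeated spaces act as one separator.
		return strings.FieldsFunc(line, func(r rune) bool { return r == ' ' })
	}
	return strings.Split(line, string(sep))
}

func main() {
	cases := []struct {
		line string
		sep  rune
		want int // expected number of fields
	}{
		{"1,,3", ',', 3},                      // empty middle value is kept
		{"1   2 3", ' ', 3},                   // repeated spaces collapse
		{"16.4\t\t70\tHäagen-Dazs", '\t', 4}, // empty tab-separated field is kept
		{"16.4\t325\t70", '\t', 3},            // genuinely short row stays short
	}
	for _, c := range cases {
		got := splitLine(c.line, c.sep)
		fmt.Printf("%q: got %d fields, want %d\n", c.line, len(got), c.want)
	}
}
```

With this approach a row that is actually missing a trailing column still has fewer fields than the others, so it could still be excluded by a column-count check. What an empty field should default to for floats, dates or strings is left open here.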
