-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Belt.Int.fromString silently throws away part of input #3732
Comments
@RasmusKlett looks reasonable, do you have a clean alternative in mind? |
I've tried a few versions, but haven't been able to maintain the performance of the builtin parseInt. I'm not sure how much we care about performance, but maybe someone smarter than me can help? I've tried simply wrapping the OCaml int_of_string, this has a 3x performance penalty on my machine (Excuse the reason syntax): let intFromString = str =>
switch (int_of_string(str)) {
| number => Some(number)
| exception (Failure(_msg)) => None
}; Note this does not allow leading whitespace - I'm not sure what the correct behavour here is? MDN suggests simply doing a regex validation. This performs slightly better than int_of_string, and can easily be made to allow leading whitespace. function filterInt(value) {
if (/^[-+]?(\d+)$/.test(value)) {
return Number(value);
} else {
return undefined;
}
} I've also tried implementing it in terms of an explicit for-loop over code points. This performs terribly for some reason. If this was native code, I would expect this to perform best, since it skips a lot of checks not necessary with static types, but that was definitely not the case. function codePointInt(value) {
var result = 0;
var length = value.length;
for (var i=0; i < length; i++) {
var code = value.codePointAt(i);
result += (code - 48) * 10**(length - 1 - i);
}
return result;
} So there are two questions here: Do we allow leading whitespace? And can we live with a 2-3x performance hit? If not, I think we need someone smarter than me to do some tuning. Here's the stuff I've benched, if anyone wants to try. |
I don't think performance is critical here, but not introducing additional deps is quite important. |
It will take me some time, I haven't tried committing to the project before. I will give it a shot next week. Unless someone disagrees, I will go ahead and use the Mozilla-suggested version of validating the input using a regex. This has the lowest performance hit, and it can allow leading whitespace. |
@RasmusKlett There is one thing annoying that in latest js, |
I agree the default should be more reliable than Perhaps |
another case happening to me: let x = Int.fromString("22 foo bar") // 22 |
The actual implementation of Belt.Int.fromString relies on the Javascript parseInt function [0]. That function parses Strings in a permissive manner. For instance, if the function encounters an alphabetical character, it returns the value up to that character: parseInt("123a456") // output: 123 The same applies for Belt.Float.fromString that relies on parseFloat [1] and have a similar interpretation. This commit proposes a stricter implementation of Int and Float parsing using the Javascript Number constructor [2]. The Belt.Int.fromString uses the more specific Math.trunc function [3], that parses string similarly to Number and then returns the Int part of that number. Relying only on the Number constructor keeps the implementation simple. However, there exists one last quirk while using it with an empty string. It returns the 0 value: Number("") // ouptput: 0 Consequently, Belt.Int.fromString("") will return 0. Tests for Belt.(Int/Float).fromString have been updated in order to reflect that. Fix rescript-lang#3732 [0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseInt [1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseFloat [2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/Number [3] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/trunc
The actual implementation of Belt.Int.fromString relies on the Javascript `parseInt` function [0]. That function parses strings in a permissive manner. For instance, if the function encounters an alphabetical character, it returns the value up to that character: parseInt("123a456") // output: 123 The same applies for Belt.Float.fromString that relies on `parseFloat` [1] and have a similar interpretation. This commit proposes a stricter implementation of Int and Float parsing using the Javascript `Number` constructor [2]. The Belt.Int.fromString uses the more specific `Math.trunc` function [3], that parses string similarly to `Number` and then returns the Int part of that number. Relying only on the `Number` constructor keeps the implementation simple. However, there exists one last quirk while using it with an empty string. It returns the `0` value: Number("") // ouptput: 0 Consequently, Belt.Int.fromString("") will return `0`. Tests for Belt.(Int/Float).fromString have been updated in order to reflect that. Fix rescript-lang#3732 [0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseInt [1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseFloat [2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/Number [3] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/trunc Signed-Off-By: Ronan-A. Cherrueau <[email protected]>
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
See discussion about this issue in rescript-lang/rescript-core#86 and proposed solution in rescript-lang/rescript-core#93 Probably better to fix this in |
I encountered some very surprising behaviour using Belt.Int.fromString. If given the string representation of a floating point number, it will silently ignore anything beyond the decimal point.
Looking at the source, it turns out fromString is simply wrapping the Javascript parseInt. This is a very quirky Javascript-esque function, and in my opinion does not deserve a place in Belt.
Examples of surprising behaviour:
I suggest that at least the first two should result in None, since the full string could not be parsed. I find it convenient that leading whitespace is automatically stripped, but some might prefer strict None behaviour in this case as well.
Please - let us not repeat the mistakes of Javascript just because it is convenient to implement.
The text was updated successfully, but these errors were encountered: