Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return type of Enumerable#sum and #product for union elements #15269

Open
straight-shoota opened this issue Dec 11, 2024 · 3 comments · May be fixed by #15308
Open

Return type of Enumerable#sum and #product for union elements #15269

straight-shoota opened this issue Dec 11, 2024 · 3 comments · May be fixed by #15308
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:numeric

Comments

@straight-shoota
Copy link
Member

Enumerable(T)#sum and #product usually use the element type (T) as the type of the sum/product unless specified.

This is usually a good default but raises the question what should happen when the element type is not a single type, but a union?
Turns out, the implementations picks the first type in the union (note: order of types in a union is lexicographic by the type name):

typeof([1_i8, 2_i16].sum) # => Int8

I think this is quite unexpected and can lead to surprising behaviour: For example [1_i16, 12345678_i32].sum raises an overflow error when the result would fit perfectly into Int32.

If the implementation picks one type from the union, I figure it should rather be the biggest type of the union. For number types this might be relatively easy, but not without issues (e.g. when combinging signed and unsigned types of the same width). And there would be no way this could work for custom types. I don't have a real use cases for math operations on unions outside of Number, but there is probably some with related values types.

In my opinion, it would probably be best to not automatically pick a type if there are multiple options. It's better to make this a compile time error instead of running into weird overflows at runtime because the sum type is smaller than expected.
It's trivial to specify the desired target type by passing an initial value in that type.
For example:

[1_i16, 12345678_i32].sum(0_i32) # => 12345679

Of course this would be a breaking change. But IMO that's acceptable because it corrects a confusing behaviour. This is a rarely used feature anyway, so impact should be minimal.

This repo doesn't seem to use this and test-ecosystem doesn't show anything breaking either: https://github.com/crystal-lang/test-ecosystem/actions/runs/12285549227/job/34286237286

This issue was originally reported at https://fosstodon.org/@[email protected]/113626739326812835

@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:numeric labels Dec 11, 2024
rvprasad added a commit to rvprasad/crystal that referenced this issue Dec 21, 2024
Using the first type of a union type as the type of the result of
`Enumerable#sum()` call can cause runtime failures.  A safer alternative
is to flag the use of union types with `Enumerable#sum()` and suggest
the use of `Enumerable#sum(initial)` with an initial value of the
expected type of the `sum` call.
@rvprasad
Copy link

rvprasad commented Dec 21, 2024

@straight-shoota I have a fix along with a couple of tests (see https://github.com/rvprasad/crystal) but I am facing one issue: how do I test for compiler errors in automated tests?

I tried using assert_macro_error but it did not work as expected, i.e., it expected a TypeException at runtime while Macros#raise does not trip the compiler.

@straight-shoota
Copy link
Member Author

straight-shoota commented Dec 21, 2024

assert_macro is specifically for testing errors in macro expansion. This error is triggered from a macro expression, but it happens in the context of semantic analysis of the non-macro code. The spec helper for this is assert_error.
You can use it like this:

assert_error <<-CRYSTAL, "does support Union types"
  require "prelude"
  [1_i8, 2_i16].sum
  CRYSTAL

rvprasad added a commit to rvprasad/crystal that referenced this issue Dec 22, 2024
Using the first type of a union type as the type of the result of
`Enumerable#sum()` call can cause runtime failures.  A safer alternative
is to flag the use of union types with `Enumerable#sum()` and suggest
the use of `Enumerable#sum(initial)` with an initial value of the
expected type of the `sum` call.
rvprasad added a commit to rvprasad/crystal that referenced this issue Dec 22, 2024
Using the first type of a union type as the type of the result of
`Enumerable#sum()` call can cause runtime failures.  A safer alternative
is to flag the use of union types with `Enumerable#sum()` and suggest
the use of `Enumerable#sum(initial)` with an initial value of the
expected type of the `sum` call.
@rvprasad
Copy link

Thanks. That worked!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib:numeric
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants