similar behavior #66

Open
cjprybol opened this issue Dec 5, 2017 · 2 comments

Comments

@cjprybol
Contributor

cjprybol commented Dec 5, 2017

DataArrays and CategoricalArrays special-case similar by initializing the values with missing, but the default behavior in Base is to allocate the array without initializing its elements, leaving #undef. How do we want to handle this?

julia> T = Union{Int, Missing}
Union{Int64, Missings.Missing}

julia> similar(CategoricalVector{T}(3))
3-element CategoricalArrays.CategoricalArray{Union{Int64, Missings.Missing},1,UInt32}:
 missing
 missing
 missing

julia> similar(DataArray(T, 3))
3-element DataArrays.DataArray{Int64,1}:
 missing
 missing
 missing

julia> similar(missings(T, 3))
3-element Array{Union{Int64, Missings.Missing},1}:
 #undef
 #undef
 #undef
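
One way to sidestep the difference in the meantime is to fill the result explicitly. The following is only a minimal sketch, assuming Julia 1.x (where Missing lives in Base); `similar_missing` is a hypothetical helper name, not something Missings.jl, DataArrays or CategoricalArrays provide.

```julia
# Hypothetical helper (not part of any package): like `similar`, but every
# element starts out as `missing`, whatever the element type is.
function similar_missing(A::AbstractArray{T}) where {T}
    # Allocate with an element type that can hold `missing`;
    # at this point the contents may be #undef or arbitrary.
    B = similar(A, Union{T, Missing})
    # Overwrite every slot so the result no longer depends on the element type.
    return fill!(B, missing)
end

x = similar_missing(["a", "b", "c"])  # non-isbits element type
y = similar_missing([1, 2, 3])        # isbits element type
```

Both `x` and `y` should then contain only missing, regardless of what `similar` alone would have left in the array.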

@nalimilan
Member

nalimilan commented Dec 5, 2017

Currently, on Julia 0.7, similar fills the array with missing for isbits element types, but leaves #undef for other types, which makes the behavior relatively complex to grasp:

julia> similar(missings(Int, 3))
3-element Array{Union{Missings.Missing, Int64},1}:
 missing
 missing
 missing

julia> similar(missings(String, 3))
3-element Array{Union{Missings.Missing, String},1}:
 #undef
 #undef
 #undef

I think it makes more sense to fill all arrays with missing, which could allow us to get rid of missings once we have a short syntax for Union{T, Missing}. After discussing this with @StefanKarpinski, a possible general rule would be to always fill uninitialized arrays with the instance of the first singleton type in the element type's Union (according to an internal order which doesn't correspond to what the user types).
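
To make the proposed rule concrete, here is a rough sketch of what "fill with the first singleton type" could look like in user code. It assumes a recent Julia 1.x and relies on the unexported internals `Base.uniontypes` and `Base.issingletontype`; `fill_first_singleton!` is a hypothetical name, and nothing here is claimed to match how Base would actually implement it.

```julia
# Hypothetical illustration of the rule discussed above: fill an array with
# the instance of the first singleton type found in its element type.
function fill_first_singleton!(A::AbstractArray{T}) where {T}
    # Base.uniontypes returns the Union's components in Julia's internal
    # order, which need not match the order the user typed.
    for S in Base.uniontypes(T)
        if Base.issingletontype(S)
            # A singleton type has exactly one instance, stored in `S.instance`.
            return fill!(A, S.instance)
        end
    end
    return A  # no singleton component: leave the array untouched
end

a = Vector{Union{String, Missing}}(undef, 3)
fill_first_singleton!(a)
```

For Union{String, Missing} the only singleton component is Missing, so `a` ends up filled with missing. For a Union with several singleton components (say Nothing and Missing), the result depends on the internal ordering mentioned above.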

Until this is settled (which should be discussed in Base), I'd say we should keep DataArrays and CategoricalArrays as they are. But indeed the inconsistency isn't great.

@nalimilan
Member

See JuliaLang/julia#24939 about array constructors in Base.
