Functional non-mutating methods. #46453
base: master
…rrallel behavior to `Base.setindex`.
base/array.jl
Outdated
function setindex(xs::AbstractArray, v, I...)
    @_propagate_inbounds_meta
    T = promote_type(eltype(xs), typeof(v))
This means `setindex(rand(5), ones(2), 3:4)` makes a `Vector{Any}`.
I see that the pirate methods in ArrayInterfaceCore don't seem to make this mistake. Do they have tests, of all the many weird ways to index arrays?
julia> @which Base.setindex(rand(5), ones(2), 3:4)
setindex(x::AbstractArray, v, i...)
@ ArrayInterfaceCore ~/.julia/packages/ArrayInterfaceCore/7kMjZ/src/ArrayInterfaceCore.jl:248
julia> Base.setindex(rand(5), ones(2), 3:4)
5-element Vector{Float64}:
0.9187440365487348
0.5816199534074197
1.0
1.0
0.1473879772541169
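The difference between the two results appears to come down to which type is passed to `promote_type`. A minimal check using only Base (the variable names are illustrative):

```julia
xs = rand(5)
v = ones(2)

# Promoting with the type of the whole value falls back to `Any`,
# because there is no promotion rule between a number and a vector.
T_whole = promote_type(eltype(xs), typeof(v))

# Promoting with the element type of the value keeps the result concrete.
T_elem = promote_type(eltype(xs), eltype(v))

println(T_whole)
println(T_elem)
```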
We do have tests but I thought that tkf's branch was discussed and agreed upon so I moved most of it over to what's done there.
How about a combined version that accounts for non-scalar indexing and accommodates new types:
function setindex(x::AbstractArray, v, i...)
    @_propagate_inbounds_meta
    inds = to_indices(x, i)
    if inds isa Tuple{Vararg{Integer}}
        T = promote_type(eltype(x), typeof(v))
    else
        T = promote_type(eltype(x), eltype(v))
    end
    y = similar(x, T)
    copy!(y, x)
    y[inds...] = v
    return y
end
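To try the proposal outside of Base, here is a standalone sketch. The name `setindex_sketch` is hypothetical, and the Base-internal `@_propagate_inbounds_meta` is left out, so bounds checks always run:

```julia
# Hypothetical standalone name; the PR defines this method as `Base.setindex`.
function setindex_sketch(x::AbstractArray, v, i...)
    inds = to_indices(x, i)
    # Scalar indexing stores `v` itself; non-scalar indexing stores its elements.
    T = if inds isa Tuple{Vararg{Integer}}
        promote_type(eltype(x), typeof(v))
    else
        promote_type(eltype(x), eltype(v))
    end
    y = similar(x, T)   # new array with a possibly widened element type
    copy!(y, x)
    y[inds...] = v
    return y
end

original = [1, 2, 3]
scalar_result = setindex_sketch(original, 2.5, 2)          # widens Int to Float64
range_result = setindex_sketch(zeros(5), ones(2), 3:4)     # stays Float64
```

The input array is never modified; only the copy `y` is written to.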
I added this definition along with the corresponding tests from ArrayInterface
@@ -38,7 +38,7 @@ get(f::Callable, t::Tuple, i::Integer) = i in 1:length(t) ? getindex(t, i) : f()
# returns new tuple; N.B.: becomes no-op if `i` is out-of-bounds

"""
    setindex(c::Tuple, v, i::Integer)
    setindex(x::Tuple, v, i::Integer)
Relatedly, note that this doesn't allow non-scalar indexing at all:
julia> getindex(Tuple("julia"), 3:5)
('l', 'i', 'a')
julia> Base.setindex(Tuple("julia"), ('l', 'i', 'o'), 3:5)
ERROR: MethodError:
And the pirate methods don't alter this.
I wasn't sure if we wanted to do multiple values at once for Tuple and NamedTuple but I personally have no problem with it.
If we are going to support setting multiple values, what do we do with repeated indices? `setindex!` just does another scalar `setindex!` at the repeated position when iterating through the new values. Do we just write it so the behavior appears the same to users, or do we throw an error for repeated values like `deleteat!` does?
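The two existing behaviors being compared can be checked directly with Base functions. This is only an illustration of current mutating semantics, not of anything the PR decides:

```julia
# Repeated indices with `setindex!`: later values overwrite earlier ones.
x = zeros(3)
x[[1, 2, 1]] = [4, 5, 6]

# Repeated indices with `deleteat!`: rejected with an error.
deleteat_threw = try
    deleteat!([10, 20, 30], [1, 1])
    false
catch err
    err isa ArgumentError
end
```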
Latest addition supports non-scalar `setindex` for both `Tuple` and `NamedTuple`. These probably could be more efficient, but it's a bit difficult to accomplish this without traits that could tell us if the indices have a known size at compile time to help do things in place (instead of acting on a temporary vector and then transforming to a tuple).
It may be good to get #46500 in first, to cover cases where it's more efficient to perform the mutating counterpart on a copy of the provided array.
@mbauman I saw that you reviewed the previous effort for
merge(nt, (; idx => v))
setindex(nt::NamedTuple, v, idx::Symbol) = merge(nt, (; idx => v))
function setindex(nt::NamedTuple, vs, idxs::AbstractVector{Symbol})
    merge(nt, (; [i => v for (i,v) in zip(idxs, vs)]...))
There might be a little footgun hidden here, as `zip` does not check lengths:
julia> zip([:a,:c], [1,2,3])
zip([:a, :c], [1, 2, 3])
julia> collect(zip([:a,:c], [1,2,3]))
2-element Vector{Tuple{Symbol, Int64}}:
(:a, 1)
(:c, 2)
So I think there should be a length check.
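One way the suggested length check could look is sketched below. The name `setindex_checked` and the choice of `DimensionMismatch` are assumptions for illustration, not what the PR adopted:

```julia
# Hypothetical variant of the NamedTuple method with an explicit length check.
function setindex_checked(nt::NamedTuple, vs, idxs::AbstractVector{Symbol})
    length(vs) == length(idxs) ||
        throw(DimensionMismatch("got $(length(idxs)) keys but $(length(vs)) values"))
    return merge(nt, (; [k => v for (k, v) in zip(idxs, vs)]...))
end

nt = (a = 1, b = 2, c = 3)
ok = setindex_checked(nt, [10, 30], [:a, :c])

# A surplus value is now reported instead of being silently dropped by `zip`.
mismatch_threw = try
    setindex_checked(nt, [1, 2, 3], [:a, :c])
    false
catch err
    err isa DimensionMismatch
end
```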
Good catch.
Also allows duplicates, which `getindex` does not. Is this intended?
julia> _setindex((x=1, y=2, z=3), [4, 5, 6, 7], [:x, :y, :x])
(x = 6, y = 5, z = 3)
julia> (x=1, y=2, z=3)[[:x, :y, :x]]
ERROR: duplicate field name
Only one test https://github.com/JuliaLang/julia/pull/46453/files#diff-a4ae685ea77cc80977dc4214e7dbb9d96fed55bdb918396d3453f840da0a695bR308
I figured this was more like the behavior of `setindex!`, where a value can be overwritten multiple times.
But get/set both allow repeats for arrays. I'm not sure whether `getindex` disallowing repeats here is a feature or a bug, either. Do you know?
For arrays, the types must match: `setindex!(rand(10), (3, 4), 5:6)` is an error, rather than iterating the tuple, because `getindex(rand(10), 5:6)` is a vector. Is there an argument for why this method should accept & iterate anything?
julia> rand(10)[5:6]
2-element Vector{Float64}:
0.46482934129714726
0.06656695848727301
julia> (x=1, y=2, z=3)[[:x, :y]]
(x = 1, y = 2)
But get/set both allow repeats for arrays. I'm not sure whether `getindex` disallowing repeats here is a feature or a bug, either. Do you know?
I think it's just because you can't have repeated fields in a `NamedTuple`, so `(x=1, y=2, z=3)[[:x, :y, :x]]` would have to make up a new symbol for the last `:x` in order for it to work.
For arrays, the types must match: `setindex!(rand(10), (3, 4), 5:6)` is an error, rather than iterating the tuple, because `getindex(rand(10), 5:6)` is a vector. Is there an argument for why this method should accept & iterate anything?
I'm not completely sure what rules should carry over. You also can't index arrays with anything but integers and arrays of integers. I don't know if this means that anytime `setindex` sets multiple values the collection should be an array, or if the type should match the destination.
Ok I guess the lack of repeats in getindex is indeed forced on you by the return type.
The others I don't know, I just think they need careful thought.
An alternative definition could be
function setindex(nt::NamedTuple, vs, idxs::Union{AbstractVector{Symbol},Tuple{Vararg{Symbol}}})
    merge(nt, NamedTuple{Tuple(idxs)}(vs))
end
which requires `vs` have a conversion to `Tuple` and matches the definition of `getindex` more.
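A standalone sketch of this alternative follows. The names `setindex_strict` and `throws` are hypothetical, and the explicit `Tuple(vs)` conversion is added here as an assumption to make the requirement on `vs` visible. Because the values go through the `NamedTuple` constructor, both length mismatches and duplicate keys should be rejected rather than silently accepted:

```julia
# Hypothetical name; mirrors the alternative definition suggested above.
function setindex_strict(nt::NamedTuple, vs,
                         idxs::Union{AbstractVector{Symbol},Tuple{Vararg{Symbol}}})
    return merge(nt, NamedTuple{Tuple(idxs)}(Tuple(vs)))
end

# Helper used only for this illustration: reports whether `f()` throws.
function throws(f)
    try
        f()
        return false
    catch
        return true
    end
end

nt = (x = 1, y = 2, z = 3)
updated = setindex_strict(nt, [10, 30], [:x, :z])
mismatch = throws(() -> setindex_strict(nt, [4, 5, 6], [:x, :y]))
duplicate = throws(() -> setindex_strict(nt, [4, 5, 6], [:x, :y, :x]))
```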
pinging @aplavin who might also be excited about this PR.
    delete(nt::NamedTuple, key::Integer)

Constructs a new `NamedTuple` with the field corresponding to `key` removed.
If no field corresponds to `key`, `nt` is returned.
This differs from the error for tuples with out-of-bounds index. The logic is that `setindex` lets you add arbitrary keys? But not for integers?
I was following the behavior of `delete`, which doesn't throw an error if `key` isn't found. `deleteat!` and `deleteat` do throw errors if the field/index isn't found.
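The contrast between the two conventions can be sketched as follows. `delete_sketch` is a hypothetical, symbol-keyed stand-in for the PR's `delete` and is not its actual implementation; the `deleteat!` call is existing Base behavior:

```julia
# Hypothetical helper: drop a field by name, returning `nt` unchanged
# (as an equal value) when the field does not exist.
function delete_sketch(nt::NamedTuple, key::Symbol)
    kept = filter(!=(key), keys(nt))
    return NamedTuple{kept}(nt)
end

nt = (a = 1, b = 2)
removed = delete_sketch(nt, :a)
untouched = delete_sketch(nt, :missing_key)   # no error for an absent key

# `deleteat!`, by contrast, throws when the index does not exist.
deleteat_threw = try
    deleteat!([1, 2, 3], 5)
    false
catch err
    err isa BoundsError
end
```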
What is the status of this? I would love to get this merged, especially
I'd love to get this finalized. Are there any issues that I still need to address here so we can move forward?
I don't see any open issues.
I tried to use established discussion from previous PRs, rather than start from scratch. Hopefully, that gets us closer to a final product here. It might be worth noting that I didn't incorporate these methods on
I also think it is good to keep this initial PR as minimal as possible. And I think some persistent nagging will be required to get this merged. I think there is no real technical or design issue why #33495 and other previous attempts did not get merged. Just that core devs have limited time and other priorities.
Same as how setindex!(A, X, inds...)
A[inds...] = X
Store values from array X within some subset of A as specified by inds. |
This is a very good point. We should probably document this. I think a key strength of julia is, that it allows both of the following:
In my view julia is not a language that compromises on composability and genericness in order to force you to write fast code. Mathematically it is not possible to have a data structure for which every operation is fast. But for writing generic code it is critical to support everything and not force the user to write fast code. |
It is already in use in base with
The point of adding this to Base is that this is already part of the API. The interface that I previously referenced being ill-defined is the dictionary interface, not this one.
Thank you for the correction, my mistake. I support adding more general methods to the existing
Thanks for the suggestion. Documentation now includes a performance note.
base/array.jl
Outdated
end
end
end
function setindex(t::Tuple, v, inds::AbstractVector{<:Integer})
For vectors with statically known size (e.g. `SVector`s), the following alternative performs better and without allocations:
function setindex(t::Tuple, v, inds::AbstractVector{<:Integer})
    Base.@_propagate_inbounds_meta
    foldl(ntuple(identity, length(inds)); init=t) do acc, i
        Base.setindex(acc, v[i], inds[i])
    end
end
Maybe this should be used instead? And/or, add a similar method with `inds::NTuple{Integer}`?
thanks for the snippet! it now replaces the old commented code.
Sorry, I must've gone crazy with that `fold`!
This works the same for static arrays, better for regular arrays, and even seems efficient for tuples:
function setindex(t::Tuple, v, inds::AbstractVector{<:Integer})
    ntuple(length(t)) do i
        ix = findfirst(==(i), inds)
        isnothing(ix) ? t[i] : v[ix]
    end
end
and for unitranges
Did we want to allow tuples of indices as `inds` here?
Yeah maybe best without tuples.
I don't think that's correct.
julia> let x = zeros(3)
           setindex!(x, [4,5,6], [1,2,1])
           x
       end
3-element Vector{Float64}:
 6.0
 5.0
 0.0
julia> function _setindex(t::Tuple, v, inds::AbstractVector{<:Integer})
           ntuple(length(t)) do i
               ix = findfirst(==(i), inds)
               isnothing(ix) ? t[i] : v[ix]
           end
       end
_setindex (generic function with 1 method)
julia> let x = Tuple(zeros(3))
           _setindex(x, [4,5,6], [1,2,1])
       end
(4, 5, 0.0)
So, `findlast` instead of `findfirst`? Or throw?
Not clear which is the best semantics.
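For comparison, here is a sketch of the `findlast` option, which would make repeated indices behave like the `setindex!` example above (the last write wins). The name `setindex_last` is hypothetical, and this is only one of the candidate semantics under discussion:

```julia
# Hypothetical variant: with repeated indices, the last matching value wins,
# mirroring what `setindex!` does on a mutable array.
function setindex_last(t::Tuple, v, inds::AbstractVector{<:Integer})
    return ntuple(length(t)) do i
        ix = findlast(==(i), inds)
        isnothing(ix) ? t[i] : v[ix]
    end
end

tuple_result = setindex_last(Tuple(zeros(3)), [4, 5, 6], [1, 2, 1])

# Reference behavior from the mutating counterpart.
reference = zeros(3)
reference[[1, 2, 1]] = [4, 5, 6]
```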
true
```
"""
function deleteat(src::AbstractVector, i::Integer)
`deleteat(src::AbstractVector, i) = deleteat!(copy(src), i)`?
Instead of all these methods.
Ah, I see - it may work with setindex-able, but not resizeable arrays as of now...
Unfortunate that so much duplication is needed.
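The limitation can be illustrated with a range, which cannot be resized. The sketch below builds the result by indexing instead of mutating a copy; `deleteat_sketch` is a hypothetical name and not the PR's implementation:

```julia
# Hypothetical helper: drop position `i` without needing a resizable copy.
function deleteat_sketch(src::AbstractVector, i::Integer)
    checkbounds(src, i)
    keep = [j for j in eachindex(src) if j != i]
    return src[keep]
end

from_range = deleteat_sketch(1:5, 2)

# The copy-then-mutate one-liner fails here, since a range copy is not resizable.
copy_then_mutate_failed = try
    deleteat!(copy(1:5), 2)
    false
catch
    true
end
```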
Yeah, it's a bit messy. I thought about putting `# TODO refactor when immutable arrays are in base` but I don't think we know how that's going to play out still. Also, I use "TODO" as a keyword in searches for my code bases, so I try not to pollute community projects with it unless it's clearly helpful for everyone.
@@ -381,8 +381,43 @@ julia> Base.setindex(nt, "a", :a)
(a = "a",)
```
"""
setindex(x::NamedTuple{names}, v, i::Integer) where {names} = NamedTuple{names}(setindex(values(x), v, i))
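A quick way to try this suggestion without extending Base is to give it a different name. `setindex_by_position` is hypothetical; it relies on the existing (unexported) `Base.setindex` method for tuples:

```julia
# Hypothetical name for the suggested method: replace the value at position `i`
# while keeping the field names.
setindex_by_position(x::NamedTuple{names}, v, i::Integer) where {names} =
    NamedTuple{names}(Base.setindex(values(x), v, i))

nt = (a = 1, b = 2)
replaced = setindex_by_position(nt, "one", 1)
```

The field types may change (here `a` goes from `Int` to `String`), while the names stay fixed.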
What's wrong with
It's discussed a bit here, but the gist of it is the order of the arguments is odd and is more of an artifact of translating Furthermore, the function means something different for dictionaries than arrays. It adds a new index/key for dicts and replaces an existing one for other collections.
Argument order: yes, it's strange in If there was a function
Yeah, and this behavior is already familiar to everyone using this function. Making an immutable counterpart with the same meaning is only natural.
I really doubt this is going to change in the future since the entire reason is to support splatting indices.
My point was that the similarity between the two functions when working with immutable functions is less clear given that Although I would prefer something that diverges from
Isn't the exact same concern (splatting indices) just as valid for
Nevertheless, many more people are familiar with Same as is done in this PR for
function delete(d::ImmutableDict{K,V}, key) where {K,V}
    if isdefined(d, :parent)
        if isequal(d.key, key)
            d.parent
        else
            ImmutableDict{K,V}(delete(d.parent, key), d.key, d.value)
        end
    else
        d
    end
end
This implementation is not quite correct, since I can currently construct an `ImmutableDict` where there are multiple entries for `key`, each shadowed.
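The shadowing issue can be reproduced with `Base.ImmutableDict` as it exists today. The helper `delete_all` below is a hypothetical sketch of one possible fix (keep recursing after a match), not the approach the PR settled on:

```julia
# Hypothetical helper: drop *every* entry for `key`, so that no shadowed
# value resurfaces after deletion.
function delete_all(d::Base.ImmutableDict{K,V}, key) where {K,V}
    isdefined(d, :parent) || return d      # reached the empty root
    rest = delete_all(d.parent, key)
    return isequal(d.key, key) ? rest : Base.ImmutableDict{K,V}(rest, d.key, d.value)
end

d = Base.ImmutableDict(:a => 1)
d = Base.ImmutableDict(d, :b => 10)
d = Base.ImmutableDict(d, :a => 2)     # shadows the first :a entry

visible = d[:a]                        # the newest entry is the visible one
shadowed = d.parent.parent[:a]         # the older entry is still stored
cleaned = delete_all(d, :a)
```

Stopping at the first match, as in the reviewed code, would return `d.parent`, in which the older `:a` entry becomes visible again.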
I suspect `delete` would be more easily done with adding a new key as a tombstone (with the same `key`, but `val` being `#undef`), rather than a bulk copy like this.
I suspect `delete` would be more easily done with adding a new key as a tombstone (with the same `key`, but `val` being `#undef`), rather than a bulk copy like this.
I really like that idea, but I think we would need to have a new type since we can't rely on `#undef` for bits types.
That's a good point. I've mostly been trying to capture what I can remember from the triage where it was discussed. There may or may not be some things I'm leaving out that are important. We probably need to wait until relevant parties chime in.
Should
I'm not against this but I fear I've already made the scope of this PR too big given the lack of consensus on many details herein.
@@ -95,6 +96,7 @@ Library changes
* `@time` now separates out % time spent recompiling invalidated methods ([#45015]).
* `eachslice` now works over multiple dimensions; `eachslice`, `eachrow` and `eachcol` return
  a `Slices` object, which allows dispatching to provide more efficient methods ([#32310]).
* The non-mutationg `Base.setindex` function now has `AbstractDict` support ([#46453]).
mutationg
Here I've implemented what I think are some of the more straightforward non-mutating counterparts to mutating methods.
I borrowed liberally from @tkf's PR #33495 to get `Base.setindex` working here.
Should also resolve type piracy in ArrayInterface on `Base.setindex` (JuliaArrays/ArrayInterface.jl#305).
Also implemented `delete`, `deleteat`, and `insert`, borrowing from work from ArrayInterface.
I also took a bit from #24836, but I thought `insert` was a clearer method name that can be used to accomplish what the non-mutating `push`/`pushfirst` methods do (and a bit more).
I think some issues with functional collections could really benefit from some basic tools like this (#34478, https://discourse.julialang.org/t/functional-implementation-of-collect/15177).
Currently `delete`, `deleteat`, and `insert` are in the export list, but if that's too intrusive I can change that.
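To illustrate how a single non-mutating `insert` could cover both `push` and `pushfirst`, here is a minimal sketch for 1-based vectors. The name `insert_sketch`, its argument order, and the `vcat`-based implementation are assumptions for illustration, not the PR's code:

```julia
# Hypothetical helper: return a new vector with `x` placed at position `i`.
function insert_sketch(src::AbstractVector, i::Integer, x)
    firstindex(src) <= i <= lastindex(src) + 1 || throw(BoundsError(src, i))
    # `vcat` allocates a fresh vector and promotes the element type if needed.
    return vcat(src[firstindex(src):i-1], [x], src[i:lastindex(src)])
end

v = [1, 2, 3]
like_pushfirst = insert_sketch(v, 1, 0)              # insert at the front
like_push = insert_sketch(v, lastindex(v) + 1, 4)    # insert past the end
widened = insert_sketch(v, 2, 1.5)                   # element type widens to Float64
```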