just for my learning... (please!) #89
Comments
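This is a (nested) parametric type definition, which is IMHO the most complicated part of Julia's type system. If you're not familiar with parametric types, I have opened a PR to try and clarify them in the manual; the linked issue might also be worth checking. Let's go through it from left to right (maintainers, please correct me if I'm wrong).
The Table type declares three things: the row types, a dummy "dimension", and the column types. It might first seem redundant to declare both row and column types, but it will be shown that this is necessary.
T <: NamedTuple, N, Data (...)
T reifies to a NamedTuple that "maps" column names to a type, thus defining the type of any single row. Let's take an example table: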
julia> t = Table(a = [1, 2, 3], b = [2.0, 4.0, 6.0])
Table with 2 columns and 3 rows:
a b
┌───────
1 │ 1 2.0
2 │ 2 4.0
3 │ 3 6.0
julia> typeof(t[1])
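NamedTuple{(:a, :b), Tuple{Int64, Float64}}
In this case T became NamedTuple{(:a, :b), Tuple{Int64, Float64}}. The <: is necessary in the definition, because type parameters are invariant.
The N always resolves to 1 (see next snippet), and is necessary only so that we can have Table <: AbstractArray, which means that tables inherit a bunch of nice methods. Basically, the Table is like a Vector of rows (recall that Vector{T} is an alias for Array{T,1}).
Now the fun part, the data itself:
julia> typeof(t)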
Table{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, 1, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}}
julia> typeof(t.a)
Vector{Int64} (alias for Array{Int64, 1})
julia> typeof(t.b)
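Vector{Float64} (alias for Array{Float64, 1})
The column-based data are stored in one big NamedTuple. The types of the column names themselves are not constrained (<:Any). Next, we have the type of the data columns, which is again parametric. In this case, Tuple{Vararg{AbstractArray{<:Any,N}}} resolved to Tuple{Vector{Int64}, Vector{Float64}}. We must use Vararg because the number of columns (i.e. Vectors) is not known until the table is constructed. The same dummy "dimension" parameter can be re-used, because it will also always be 1 (no such thing as a 2D column).
I hope this clarifies things. If you have any suggestions on how to improve the documentation for parametric types, let me know and I can maybe include them in my PR. In fact, this type definition could serve nicely as a showcase example...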
I've just read in #55 that it's actually a bit more complicated in practice: you can end up with |
May I also suggest changing the title to something more descriptive (like "Understanding the |
Wow. That is really complicated.
What does the syntax `T <: NamedTuple, N, Data (…)` say?
Does it mean that T is a subtype of each of the right-hand-side items?
Or does it imply nesting, as your explanation suggests:
NamedTuple has to include an N type and a Data type, which are themselves defined above the appearance of the T <: ...?
So much is folded into one statement that there is surely "magic" in how it works.
And what does the `(…)` signify? Does it refer to the column types within Data (the NamedTuple of vectors), which we don't want to be Any (well, I suppose that is allowed) but which can be any of the concrete types in an actual TypedTable, so that they all match this pattern and inherit the methods of the supertypes?
You’ve given a great explanation of how the mechanics of TypedTables fits into the type system to benefit from method dispatch for the various super types.
But the syntax of the T <: assertion remains a bit baffling. It is certainly compact, but I don't think that saving a handful of folks some typing (of the keyboard variety, not the object variety!) is a sound reason for pretty severe obscurity. Maybe your parametric types PR could address this. I'd say more typing (as long as it's not a bunch of boilerplate to simplify parsing) to achieve clarity is probably a worthwhile trade-off.
------ Original Message ------
From: "Leon"
To: "JuliaData/TypedTables.jl"
Cc: "Lewis Levin"; "Author"
Sent: 1/27/2022 5:55:31 AM
Subject: Re: [JuliaData/TypedTables.jl] just for my learning... (please!) (Issue #89)
This is a (nested) parametric type definition, which is IMHO the most complicated part of Julia's type system. If you're not familiar with parametric types, I have opened a PR (JuliaLang/julia#43891) to try and clarify them in the manual; the linked issue might also be worth checking.
Let's go through it from left to right (maintainers, please correct me if I'm wrong).
The Table type declares three things: the row types, a dummy "dimension", and the column types. It might first seem redundant to declare both row and column types, but it will be shown that this is necessary.
T <: NamedTuple, N, Data (...)
T reifies to a NamedTuple that "maps" column names to a type, thus defining the type of any single row. Let's take an example table:
julia> t = Table(a = [1, 2, 3], b = [2.0, 4.0, 6.0])
Table with 2 columns and 3 rows:
a b
┌───────
1 │ 1 2.0
2 │ 2 4.0
3 │ 3 6.0
julia> typeof(t[1])
NamedTuple{(:a, :b), Tuple{Int64, Float64}}
In this case T became NamedTuple{(:a, :b), Tuple{Int64, Float64}}. The <: is necessary in the definition, because type parameters are invariant (https://github.com/adigitoleo/julia/blob/docs-man-types/doc/src/manual/types.md?plain=1#L556-L561).
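To make the invariance point concrete, here is a small sketch using only Base Julia (no TypedTables needed). The name `Row` is just an illustrative alias introduced here; it is not something the package defines.

```julia
# Type parameters are invariant: Int <: Real does NOT carry over to containers.
Int <: Real                      # true
Vector{Int} <: Vector{Real}      # false

# A bounded parameter (<:) describes the whole family of types instead.
Vector{Int} <: Vector{<:Real}    # true

# The same holds for row types. `Row` is an illustrative alias, not package API.
Row = typeof((a = 1, b = 2.0))
Row == NamedTuple{(:a, :b), Tuple{Int, Float64}}   # true
Row <: NamedTuple                                  # true
Vector{Row} <: Vector{NamedTuple}                  # false (invariance again)
Vector{Row} <: Vector{<:NamedTuple}                # true
```

This is why a parameter written as `T <: NamedTuple` can match any concrete row type, whereas a parameter fixed to plain `NamedTuple` could not.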
The N always resolves to 1 (see next snippet), and is necessary only so that we can have Table <: AbstractArray, which means that tables inherit a bunch of nice methods (https://docs.julialang.org/en/v1/manual/interfaces/#man-interface-array). Basically, the Table is like a Vector of rows (recall that Vector{T} is an alias for Array{T,1}).
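As a rough illustration of what subtyping AbstractArray buys you, here is a minimal sketch in Base Julia. `RowList` is a hypothetical toy type made up for this example; it is not how TypedTables implements Table, it only shows the inheritance mechanism.

```julia
# Hypothetical toy type (NOT TypedTables code): a wrapper around a vector of
# rows that declares itself a one-dimensional AbstractArray.
struct RowList{T <: NamedTuple} <: AbstractArray{T, 1}
    rows::Vector{T}
end

# The methods the array interface asks for in a read-only array.
Base.size(r::RowList) = size(r.rows)
Base.getindex(r::RowList, i::Int) = r.rows[i]
Base.IndexStyle(::Type{<:RowList}) = IndexLinear()

r = RowList([(a = 1, b = 2.0), (a = 2, b = 4.0), (a = 3, b = 6.0)])

length(r)               # 3, inherited (derived from size)
first(r)                # (a = 1, b = 2.0), inherited (uses getindex)
[row.a for row in r]    # [1, 2, 3], iteration is inherited too
r isa AbstractVector    # true, AbstractVector{T} is AbstractArray{T, 1}
```

Only `size` and `getindex` (plus the optional `IndexStyle`) are written by hand; `length`, `first`, and iteration come from the generic AbstractArray fallbacks.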
Now the fun part, the data itself:
julia> typeof(t)
Table{NamedTuple{(:a, :b), Tuple{Int64, Float64}}, 1, NamedTuple{(:a, :b), Tuple{Vector{Int64}, Vector{Float64}}}}
julia> typeof(t.a)
Vector{Int64} (alias for Array{Int64, 1})
julia> typeof(t.b)
Vector{Float64} (alias for Array{Float64, 1})
The column-based data are stored in one big NamedTuple. The types of the column names themselves are not constrained (<:Any). Next, we have the type of the data column itself, which is again parametric. In this case, Tuple{Vararg{AbstractArray{<:Any,N}}} resolved to Tuple{Vector{Int64}, Vector{Float64}}. We must use Vararg because the number of columns (i.e. Vectors) is not known until the table is constructed. The same dummy "dimension" parameter can be re-used, because it will also always be 1 (no such thing as a 2D column).
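The bound on Data can be checked directly with `<:` in Base Julia. In this sketch, `Bound` and `Cols` are illustrative names introduced here, and N is fixed to 1 by hand, which is what it resolves to in the example table.

```julia
# The constraint on Data from the struct definition, with N fixed to 1.
# `Bound` and `Cols` are illustrative names, not part of TypedTables.
Bound = NamedTuple{<:Any, <:Tuple{Vararg{AbstractArray{<:Any, 1}}}}

# A named tuple of two column vectors satisfies it ...
Cols = typeof((a = [1, 2, 3], b = [2.0, 4.0, 6.0]))
Cols <: Bound                          # true

# ... and so does any other number of columns, thanks to Vararg.
typeof((a = [1, 2, 3],)) <: Bound      # true

# A matrix column is rejected, because its dimension is 2 rather than 1.
typeof((a = [1 2; 3 4],)) <: Bound     # false

# A scalar "column" is rejected, because Int is not an AbstractArray.
typeof((a = 1,)) <: Bound              # false
```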
I hope this clarifies things. If you have any suggestions on how to improve the documentation for parametric types, let me know and I can maybe include it in my PR. In fact, this type definition could serve nicely as a showcase example...
|
This is not Julia syntax, I just wrote it like that for brevity. The component
No, the
I agree that this is the most confusing part. Hopefully it makes more sense now?
My PR is only about changing documentation, and I doubt that changes to fundamental Julia syntax would be accepted at v1.7 of the language. *It could seem confusing that
What's going on here? It's neither abstract nor concrete? I have highlighted this in my changes, but if both of these return false, then we are dealing with a parametric composite type. Parametric types aren't concrete, because they represent a family of types, but they need not represent a family of abstract types. In this case, the |
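A short sketch of the "neither abstract nor concrete" situation, using Base types only. It assumes the two checks meant above are `isabstracttype` and `isconcretetype`; the thread text does not survive intact here, so treat that as a guess.

```julia
# Assumption: the two checks referred to are isabstracttype and isconcretetype.
# Vector (without a parameter) stands for the family Array{T, 1} where T.
isabstracttype(Vector)          # false, Array is declared as a concrete kind of type
isconcretetype(Vector)          # false, T has not been fixed yet
Vector isa UnionAll             # true, it represents a family of types

# Fixing the parameter gives a concrete member of the family.
isconcretetype(Vector{Int})     # true

# For comparison, a genuinely abstract parametric type.
isabstracttype(AbstractVector)  # true
```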
struct Table{T <: NamedTuple, N, Data <: NamedTuple{<:Any, <:Tuple{Vararg{AbstractArray{<:Any,N}}}}} <: AbstractArray{T, N}
What is the type qualifier for the struct saying?