Error in chaining two consecutive @count #43

Open
danlooo opened this issue Sep 4, 2023 · 2 comments

danlooo commented Sep 4, 2023

I want to count the number of groups that have the same number of elements. In R, this can be achieved with two consecutive calls to count:

library(dplyr)
df <- tibble(id=1:10, group = c(1,1,1,1,1,2,2,2,2,2))
df |>
  count(group) |>
  count(n)
#> Storing counts in `nn`, as `n` already present in input
#> ℹ Use `name = "new_name"` to pick a new name.
#> # A tibble: 1 × 2
#>       n    nn
#>   <int> <int>
#> 1     5     2

Indeed, there are 2 groups with 5 elements each. However, this cannot be achieved with Tidier.jl:

using Tidier
df = DataFrame(id=1:10,group=[1,1,1,1,1,2,2,2,2,2])
@chain df begin
    @count(group)
    @count(n)
end
# ERROR: ArgumentError: column :n in returned data frame is not equal to grouping key :n
# Stacktrace:
#  [1] _combine_prepare_norm(gd::DataFrames.GroupedDataFrame{DataFrame}, cs_vec::Vector{Any}, keepkeys::Bool, ungroup::Bool, copycols::Bool, keeprows::Bool, renamecols::Bool, threads::Bool)
#    @ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/groupeddataframe/splitapplycombine.jl:104
#  [2] _combine_prepare(gd::DataFrames.GroupedDataFrame{DataFrame}, ::Base.RefValue{Any}; keepkeys::Bool, ungroup::Bool, copycols::Bool, keeprows::Bool, renamecols::Bool, threads::Bool)
#    @ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/groupeddataframe/splitapplycombine.jl:52
#  [3] _combine_prepare
#    @ ~/.julia/packages/DataFrames/58MUJ/src/groupeddataframe/splitapplycombine.jl:26 [inlined]
#  [4] combine(gd::DataFrames.GroupedDataFrame{DataFrame}, args::Union{Regex, AbstractString, Function, Signed, Symbol, Unsigned, Pair, Type, DataAPI.All, Between, Cols, InvertedIndices.InvertedIndex, AbstractVecOrMat}; keepkeys::Bool, ungroup::Bool, renamecols::Bool, threads::Bool)
#    @ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/groupeddataframe/splitapplycombine.jl:861
#  [5] macro expansion
#    @ ~/.julia/packages/TidierData/hI0Pq/src/TidierData.jl:324 [inlined]
#  [6] top-level scope
#    @ REPL[3]:70

kdpsingh commented Sep 4, 2023

This is fixable. We need to check whether n (and nn, etc.) already exists as a column before deciding on the name of the new variable that holds the count.
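
For illustration, the name check described above could look roughly like the sketch below. It is not TidierData.jl's actual implementation: `pick_count_name` is a hypothetical helper, and the counting is done with plain DataFrames.jl (`groupby`, `combine`, `nrow`) rather than `@count`. It assumes the dplyr-style convention of appending "n" until the name is unused (n, nn, nnn, ...). The error in the report appears to come from the second count trying to create a column `:n` while `:n` is also the grouping key, so picking a free name avoids the clash.

```julia
using DataFrames

# Hypothetical helper (not part of TidierData.jl): start from `base`
# and keep appending "n" until the name is not among the existing columns.
function pick_count_name(existing; base = "n")
    name = base
    while name in existing
        name *= "n"
    end
    return name
end

df = DataFrame(id = 1:10, group = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2])

# First count: rows per group. No column "n" exists yet, so "n" is used.
n1 = pick_count_name(names(df))
counts = combine(groupby(df, :group), nrow => Symbol(n1))

# Second count: number of groups per group size. "n" is now taken,
# so the helper falls back to "nn", as dplyr does.
n2 = pick_count_name(names(counts))
result = combine(groupby(counts, Symbol(n1)), nrow => Symbol(n2))
```

Until a fix lands, the same two `combine` calls with explicit column names (`nrow => :n`, then `nrow => :nn`) can serve as a workaround for the chained `@count`.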

kdpsingh transferred this issue from TidierOrg/Tidier.jl on Sep 4, 2023

kdpsingh commented Sep 4, 2023

Moved this issue to the TidierData.jl package because Tidier.jl is a meta-package.
