Skip to content

BigQuery pipe syntax #1604

@MichaelChirico

Description

@MichaelChirico

BigQuery now supports a highly tidy-ish piping syntax:

https://cloud.google.com/bigquery/docs/reference/standard-sql/pipe-syntax

FROM Produce
|> WHERE
    item != 'bananas'
    AND category IN ('fruit', 'nut')
|> AGGREGATE COUNT(*) AS num_items, SUM(sales) AS total_sales
   GROUP BY item
|> ORDER BY item DESC;

Obviously for simple cases this makes the {dbplyr}-SQL translation pretty trivial:

conn |>
  tbl("Produce") |>
  filter(item != "bananas", category %in% c("fruit", "nut")) |>
  summarize(num_items = n(), total_sales = sum(sales, na.rm=TRUE), .by="item") |>
  arrange(desc(item))

What is the right way to go about supporting this translation?

Would this be something only worthwhile implementing in {bigrquery}? Does it make sense to approach it as piecemeal changing how individual verbs are translated, or would it make more sense to implement it in {dbplyr} itself as a different "mode" of translation -- like we have the option to render using sql_options(cte=TRUE/FALSE) to use CTEs or subqueries, we could have a 3rd option to render with pipes?

I'm happy to help with the implementation; starting here to consult on what you think the best approach is.

If this is something that {bigrquery} should be wholly responsible for, that's also fine.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions