Repartition browser gene model release Hail Tables #1643

@gtiao

Hi team,

Thank you so much for releasing the browser gene model Hail Table! It's so helpful to have all the relevant transcript and coding sequence annotations pulled together and harmonized in one place.

I wanted to note that for a relatively small dataset (~60k rows), the table has a lot of partitions (~2k), which works out to roughly 30 rows per partition and makes even simple, small joins on the table run relatively slowly. It's not prohibitively slow, but it is a noticeable inefficiency, so I was wondering whether the number of partitions in the release was a deliberate design decision, or whether it is something you could adjust.
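For reference, here is a minimal sketch of a user-side workaround: read the released table, coalesce it to fewer partitions, and write a local copy to join against. The helper names (`target_partitions`, `coalesce_gene_table`), the source and destination paths, and the 5,000-rows-per-partition default are all assumptions for illustration, not values from the release; `naive_coalesce` is used because it merges adjacent partitions without a shuffle.

```python
import math


def target_partitions(n_rows: int, rows_per_partition: int = 5_000) -> int:
    """Hypothetical helper: pick a partition count from a row count.

    The 5,000 default is an arbitrary illustrative value, not a Hail
    or gnomAD recommendation.
    """
    if n_rows < 0 or rows_per_partition <= 0:
        raise ValueError("n_rows must be >= 0 and rows_per_partition > 0")
    # Always keep at least one partition, even for an empty table.
    return max(1, math.ceil(n_rows / rows_per_partition))


def coalesce_gene_table(src_path: str, dst_path: str) -> None:
    """Hypothetical helper: write a copy of a Hail Table with fewer partitions."""
    # Imported here so target_partitions() can be used without Hail installed.
    import hail as hl

    ht = hl.read_table(src_path)
    # Never ask for more partitions than the table already has.
    n_new = min(ht.n_partitions(), target_partitions(ht.count()))
    # naive_coalesce combines adjacent partitions and avoids a shuffle.
    ht = ht.naive_coalesce(n_new)
    ht.write(dst_path, overwrite=True)
```

With these assumed defaults, a table of about 60k rows would be coalesced to 12 partitions; the written copy can then be loaded with `hl.read_table(dst_path)` and used in joins.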

Thank you!

Grace
