directory based sharding #2

Open
rungwe opened this issue Apr 19, 2024 · 3 comments

Comments

@rungwe
Collaborator

rungwe commented Apr 19, 2024

Hi there, what is the best way to implement directory-based sharding, where the sharding key is a string or enum that represents a category? In a multi-tenant environment, for example, the key could be the name of an organisation. The library currently seems to support only numerical range-based sharding.
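
To make the request concrete, here is a minimal sketch of what a directory lookup could look like: a plain table from a string key (such as an organisation name) to a shard name. Everything here (`ShardDirectory`, `register`, `resolve`, the tenant and shard names) is hypothetical and for illustration only; none of it is part of the typeorm-sharding-repository API.

```typescript
// Hypothetical sketch: a directory that maps string keys to shard names.
// None of these names come from typeorm-sharding-repository.
type ShardName = string;

class ShardDirectory {
    private readonly entries = new Map<string, ShardName>();

    // Record which shard holds the data for a given key (e.g. a tenant name).
    register(key: string, shard: ShardName): this {
        this.entries.set(key, shard);
        return this;
    }

    // Look up the shard for a key; fail loudly when the key is unknown.
    resolve(key: string): ShardName {
        const shard = this.entries.get(key);
        if (shard === undefined) {
            throw new Error(`No shard registered for key "${key}"`);
        }
        return shard;
    }
}

// Example directory: several tenants may share one shard.
const directory = new ShardDirectory()
    .register('acme', 'shard_a')
    .register('globex', 'shard_b')
    .register('initech', 'shard_a');
```

Unlike a numerical range, the directory has no ordering, so every key has to be registered explicitly before data for it can be routed.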

@kibae
Owner

kibae commented Apr 19, 2024

Hi, @rungwe :)
I tried adding functional sharding rules for various sharding keys. However, by allowing any value within an entity to be used in the sharding rules, I ended up making methods like findOneById unusable.
#3
https://github.com/kibae/typeorm-sharding-repository/pull/3/files#diff-10165311b50fce469f6f3fcf946526f533abea482c8d56026b8986778cede872

Also, I discovered that when the key of an entity is updated, it necessitates moving the data to another shard. I haven't implemented this feature, and it might be considered a flaw.

Sharding requires you to determine which shard the data is in, so you need to use the key as the sharding key. Are you interested in using a string type of key? I wonder if I overengineered this.

Anyway, please take a look and let me know. Feel free to send a PR (feature/function-sharding).
Thank you.

@rungwe
Collaborator Author

rungwe commented Apr 22, 2024

Hi @kibae

Thanks for the prompt response. It is indeed an interesting problem. I have checked out the repo to play around with it and figure out the extent of the challenge.

In my rough implementation, I ended up modifying findOneById and findByIds to take an extra parameter for the sharding key, in an attempt to make them usable. On the bright side, these two methods have been deprecated in the TypeORM library itself for a while. From that point of view, I guess we are giving these methods too much attention, which is adding complexity?

findByIds(ids: any[], shardingKey?: string): Promise<Entity[]>;

Another idea, which would make these two methods fully compliant with the original TypeORM BaseEntity, is a full scan across all shards. There are of course performance implications. However, I have noticed that the other findBy methods already do a full scan, so perhaps we are indeed over-engineering findOneById and findByIds.
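
The two options above can be combined in one function: route to a single shard when a sharding key is supplied, and fall back to a full scan of every shard when it is not. The following is only a sketch under assumptions. `ShardRepo`, `InMemoryShard`, `findByIdsSharded`, and `resolveShard` are hypothetical stand-ins, and the in-memory shards replace real database connections.

```typescript
// Hypothetical sketch: optional sharding key with a full-scan fallback.
// None of these names come from typeorm-sharding-repository or TypeORM.
interface Row {
    id: number;
    tenant: string;
}

// Stand-in for a per-shard repository.
interface ShardRepo {
    findByIds(ids: number[]): Promise<Row[]>;
}

// In-memory shard used only to keep the sketch self-contained.
class InMemoryShard implements ShardRepo {
    constructor(private readonly rows: Row[]) {}

    async findByIds(ids: number[]): Promise<Row[]> {
        return this.rows.filter((row) => ids.indexOf(row.id) !== -1);
    }
}

async function findByIdsSharded(
    shards: Map<string, ShardRepo>,
    resolveShard: (key: string) => string,
    ids: number[],
    shardingKey?: string,
): Promise<Row[]> {
    if (shardingKey !== undefined) {
        // Key supplied: query only the shard that the key resolves to.
        const shard = shards.get(resolveShard(shardingKey));
        if (shard === undefined) {
            throw new Error(`No shard for key "${shardingKey}"`);
        }
        return shard.findByIds(ids);
    }
    // No key: query every shard in parallel and merge the results.
    const perShard = await Promise.all(
        Array.from(shards.values()).map((shard) => shard.findByIds(ids)),
    );
    return perShard.reduce<Row[]>((all, rows) => all.concat(rows), []);
}

// Example data: two shards, assumed to hold globally unique ids.
const shards = new Map<string, ShardRepo>([
    ['shard_a', new InMemoryShard([{ id: 1, tenant: 'acme' }, { id: 2, tenant: 'acme' }])],
    ['shard_b', new InMemoryShard([{ id: 3, tenant: 'globex' }])],
]);
const resolveShard = (key: string): string => (key === 'acme' ? 'shard_a' : 'shard_b');
```

Note that the full-scan branch only returns unambiguous results if ids are unique across shards; if each shard generates its own ids, the same id could come back from more than one shard.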

On the change-of-key aspect, I wouldn’t worry about it. Sometimes simplicity is better, and a primary key rarely changes. The library shouldn’t promote such bad practices.

I have also noticed that we default to the last shard if the key cannot be resolved to any shard. As part of the shard configuration, I would propose a default field, to allow users to be explicit about that behaviour.
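
As a rough illustration of that proposal, the configuration could carry an optional default, and resolution could throw when a key is unknown and no default was configured. `DirectoryShardingConfig`, `defaultShard`, and `resolveShardName` are hypothetical names for this sketch, not existing options of the library.

```typescript
// Hypothetical sketch of an explicit default shard in the configuration.
// These names are not existing options of typeorm-sharding-repository.
interface DirectoryShardingConfig {
    // Maps a sharding key (e.g. an organisation name) to a shard name.
    directory: Record<string, string>;
    // Shard to use when the key is not listed; omit it to make unknown keys an error.
    defaultShard?: string;
}

function resolveShardName(config: DirectoryShardingConfig, key: string): string {
    // Only consider keys the user actually listed, not inherited properties.
    const listed = Object.prototype.hasOwnProperty.call(config.directory, key)
        ? config.directory[key]
        : undefined;
    if (listed !== undefined) {
        return listed;
    }
    if (config.defaultShard !== undefined) {
        return config.defaultShard;
    }
    throw new Error(`Key "${key}" has no shard and no default shard is configured`);
}

const withDefault: DirectoryShardingConfig = {
    directory: { acme: 'shard_a', globex: 'shard_b' },
    defaultShard: 'shard_misc',
};
const withoutDefault: DirectoryShardingConfig = {
    directory: { acme: 'shard_a' },
};
```

With this shape, falling back to some shard is something the user opts into, rather than an implicit "last shard" rule.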

Let me know your thoughts on the above.

On another note, I am using NestJS. I was thinking of forking this library and converting it into a NestJS module. I don’t know if you had such intentions? Otherwise, I will assume you would like to keep it pure.

@kibae
Owner

kibae commented Apr 23, 2024

Conversation continued from #4

2 participants