This repository has been archived by the owner on Jan 9, 2024. It is now read-only.

Cannot create a read from replicas pipeline. #400

Open
duke-cliff opened this issue Oct 1, 2020 · 10 comments

@duke-cliff

https://github.com/Grokzen/redis-py-cluster/blob/master/rediscluster/pipeline.py#L197

This should pick up self.read_from_replicas.

Also, when we create the pipeline, we should be able to set read_from_replicas from:
https://github.com/Grokzen/redis-py-cluster/blob/master/rediscluster/client.py#L445-L464

Why do we need it?
We often need to read multiple keys at once (spread randomly across different nodes), and the Redis CPU load is significantly different (around 60%) when using a pipeline vs. mget (which is executed as a GET loop in cluster mode).
But due to the above bug, we cannot use a pipeline to read from replicas.
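
For illustration, a minimal sketch of the workload described above, assuming a local test cluster (the client names are redis-py-cluster's public API; the keys and the address are placeholders):

```python
from rediscluster import RedisCluster

# placeholder startup node for a local test cluster
rc = RedisCluster(startup_nodes=[{"host": "127.0.0.1", "port": "7000"}],
                  read_from_replicas=True)

keys = ["user:1", "user:2", "user:3"]  # keys hashing to different nodes

# mget in cluster mode is executed as one GET per key, serially:
values = rc.mget(keys)

# a pipeline batches the commands per node, which is far cheaper on Redis
# CPU, but because of the bug above its reads still go to the masters even
# though read_from_replicas is set on the client:
pipe = rc.pipeline()
for key in keys:
    pipe.get(key)
values = pipe.execute()
```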

@Grokzen (Owner) commented Oct 2, 2020

Good catch @duke-cliff. Yes, I am aware that pipelines do not support reading from replicas today; that is something I want to sort out and fix for 3.0.0 later on. There are even two different readonly/read_from_replica connection pools that are themselves buggy and need to be sorted out. Pipelines in general are often a lower priority than the main code path through the client. Also note that the logic would grow a fair bit in complexity to properly route commands in several new situations.

Just as an example: say we have one master and two replicas per master node. If readonly mode is enabled and a slave node is down, should we route reads to the master or to the other slave? Routing also depends on the command: SET-style commands must be routed to a master, while GET-style commands that can be offloaded in a pipeline can be routed differently. And you need to track all of this through MOVED and other cluster-specific errors where commands need to be rerouted for some reason.
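
A minimal sketch of that routing decision (READ_COMMANDS, master_for_slot, replicas_for_slot and is_up are illustrative stand-ins, not the library's actual internals):

```python
import random

# illustrative and deliberately not exhaustive
READ_COMMANDS = {"GET", "MGET", "EXISTS", "TTL", "HGET", "SMEMBERS"}

def route(command, slot, nodes, read_from_replicas):
    master = nodes.master_for_slot(slot)  # hypothetical helper
    if not read_from_replicas or command.upper() not in READ_COMMANDS:
        return master  # writes (and reads with the feature off) hit the master
    live_replicas = [r for r in nodes.replicas_for_slot(slot) if r.is_up]
    # if every replica for the slot is down, fall back to the master
    # rather than failing the read
    return random.choice(live_replicas) if live_replicas else master
```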

@duke-cliff (Author)

Yes, I understand the complexity.

To me, readonly does not mean replicas only; it should route reads across all masters and replicas. You can find similar logic in go-redis, which supports cluster mode as well:
https://github.com/go-redis/redis/blob/b965d69fc9defa439a46d8178b60fc1d44f8fe29/cluster.go#L584

In terms of how the pipeline should decide which node to use: if you want to do something simple, I think we can assume that if the pipeline contains any write commands, it goes to masters only.
This could even be set by the caller, as I suggested above, via a parameter such as cluster.pipeline(readonly=True).
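
A minimal sketch of that heuristic (the command set and the shape of the command stack are illustrative):

```python
READONLY_COMMANDS = {"GET", "MGET", "EXISTS", "TTL", "HGET", "SMEMBERS"}

def pipeline_may_use_replicas(command_stack):
    """command_stack: queued commands as (name, *args) tuples."""
    # a single write command anywhere in the stack forces the whole
    # pipeline onto the masters
    return all(cmd[0].upper() in READONLY_COMMANDS for cmd in command_stack)
```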

If you want something smarter, you can also read the go-redis logic, which routes commands entirely based on whether or not they are read-only:
https://github.com/go-redis/redis/blob/eb28cb64264d03218e5aa8f35eb66a4e4c4ba60b/cluster.go#L1124

@kmorrison

Hi @Grokzen, I work with @duke-cliff. We're wondering if you would accept a patch in the form of:

```diff
diff --git a/rediscluster/client.py b/rediscluster/client.py
index cb8f59d..413a797 100644
--- a/rediscluster/client.py
+++ b/rediscluster/client.py
@@ -442,7 +442,7 @@ class RedisCluster(Redis):
         """
         return ClusterPubSub(self.connection_pool, **kwargs)

-    def pipeline(self, transaction=None, shard_hint=None):
+    def pipeline(self, transaction=None, shard_hint=None, read_from_replicas=False):
         """
         Cluster impl:
             Pipelines do not work in cluster mode the same way they do in normal mode.
@@ -461,6 +461,7 @@ class RedisCluster(Redis):
             result_callbacks=self.result_callbacks,
             response_callbacks=self.response_callbacks,
             cluster_down_retry_attempts=self.cluster_down_retry_attempts,
+            read_from_replicas=read_from_replicas,
         )

     def transaction(self, *args, **kwargs):
diff --git a/rediscluster/pipeline.py b/rediscluster/pipeline.py
index 1b7065f..4d2272e 100644
--- a/rediscluster/pipeline.py
+++ b/rediscluster/pipeline.py
@@ -194,7 +194,7 @@ class ClusterPipeline(RedisCluster):
             # refer to our internal node -> slot table that tells us where a given
             # command should route to.
             slot = self._determine_slot(*c.args)
-            node = self.connection_pool.get_node_by_slot(slot)
+            node = self.connection_pool.get_node_by_slot(slot, self.read_from_replicas)

             # little hack to make sure the node name is populated. probably could clean this up.
             self.connection_pool.nodes.set_node_name(node)
```

or would you be willing to collaborate on a PR that starts from that as a basis? I'm a little unclear on the etiquette of just opening a PR against your repo, but perhaps that would be the more straightforward approach; I thought I'd ask first.
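
With a patch along those lines applied, usage would look roughly like this (a sketch; the read_from_replicas argument to pipeline() is the proposed addition, and the cluster address is a placeholder):

```python
from rediscluster import RedisCluster

rc = RedisCluster(startup_nodes=[{"host": "127.0.0.1", "port": "7000"}])
pipe = rc.pipeline(read_from_replicas=True)  # proposed parameter
pipe.get("foo")
pipe.get("bar")
results = pipe.execute()  # the GETs may now be served by replica nodes
```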

@FranGM (Contributor) commented Apr 19, 2021

Noticed that #402 was created for this. Is there any way we could help make this happen?
We'd love to have the ability to send read commands to replicas within pipelines.

@Grokzen (Owner) commented Apr 19, 2021

@FranGM Check out #406; it was duke's second attempt to implement this, but I am not super happy with the approach of going back to gevent for the parallel parts. We used gevent back in the day, before 72squared figured out a much cleaner and simpler approach that does not really require gevent.

That PR currently has that issue. It also does some major rework of the pipeline code, which is almost too much to change in one PR, and I am not really sure about all the changes, as I did not develop the current version of the pipeline code and am not fully up to speed on its current state.

So if you want to help with something, either look at that PR and possibly scale the changes back, if that is even possible, or start a new set of work that makes a more minimal intervention to add this feature; that would greatly increase the chances of it being added quickly.

One more reason I want to keep the changes minimal is what I wrote in my last comment above: I am going to take a completely different approach to handling the cluster in 3.0.0, so I do not really want to tear out or rewrite everything now just to rip it out again soon for a totally new implementation.

@FranGM (Contributor) commented Apr 20, 2021

Thanks, that's a very clear explanation. I'll try to provide a simpler patch and then we can go from there!

@FranGM (Contributor) commented Apr 20, 2021

@Grokzen I went ahead and created #450 with the simplest implementation I could come up with that still seems to work as advertised. Let me know if that's something along the lines of what you would be happy with!

Side-question: do you have any rough timeline for 3.0.0? We'd definitely like to see if we can help with more optimizations but don't really want to pester you about code you're heavily rewriting.

@Grokzen (Owner) commented Apr 24, 2021

@duke-cliff Note that even with PR #450 merged now, there is still work to be done, as your other PR has many more changes and fixes in it that can still be useful. The merged PR just fixes the issue in the short term; users should be aware that it can break and cause inconsistencies during hash-slot migrations and failover scenarios, so for now the user has to consider this and ensure their code handles it.

I will keep this issue open until we sort out your PR @duke-cliff, but for anyone else who finds this issue and sees it open: part of the problem has been fixed, merged, and shipped in the 2.1.3 release. Just be aware that it is a stopgap solution to be used with care, as it can still be buggy.

@FranGM (Contributor) commented May 18, 2021

@Grokzen wondering when you plan to release 2.1.3? We have a release of our service coming up, and I'd love to include 2.1.3 instead of 2.1.2 if at all possible.

@Grokzen (Owner) commented May 30, 2021

@FranGM Oh lol, I prepared the release with the changelog and all and thought I had pushed the build files out to PyPI -_- but apparently I forgot that last step... I will do it right now, sorry about that :)
