Provide a way to force the driver to reconnect #2909

Open
tishun opened this issue Jul 9, 2024 Discussed in #2870 · 1 comment
Labels
status: waiting-for-feedback We need additional information before we can continue
Comments

@tishun
Collaborator

tishun commented Jul 9, 2024

Discussed in #2870

Originally posted by e-ts June 3, 2024
Can I reconnect to the node used when I catch a RedisCommandTimeoutException for a command to a Redis Cluster?

We are having a problem where the old master does not respond for 10 seconds after a FAILOVER is issued to its replica. TCP packets with new requests still get acked during these 10 seconds. As the connection is clearly not dead, Lettuce keeps sending new commands to the old master. Eventually, it receives all the MOVED responses at once, but this is too late for us.

For our specific problem, it would be better if Lettuce reconnected to the node on command timeout as the bug only seems to affect a single TCP socket. A command on a new socket will get an immediate MOVED response, allowing Lettuce to continue on the master.

I guess it could be tricky to get this right as all the requests in flight will time out at different times and we probably do not want to reconnect for each timeout.
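
To illustrate the concern about not reconnecting for each timeout, here is a minimal sketch of a rate-limiting gate, assuming the application catches `RedisCommandTimeoutException` itself and decides when to reconnect. `ReconnectGate` and `tryAcquire` are hypothetical names, not part of Lettuce, and the sketch deliberately leaves out the reconnect itself, since a driver hook for that is what this issue asks for.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not a Lettuce class): lets at most one caller through
 * per time window, so a burst of command timeouts triggers a single reconnect.
 */
final class ReconnectGate {
    // Sentinel meaning "no reconnect has been granted yet".
    private static final long NEVER = Long.MIN_VALUE;

    private final long minIntervalNanos;
    private final AtomicLong lastGranted = new AtomicLong(NEVER);

    ReconnectGate(long minIntervalNanos) {
        this.minIntervalNanos = minIntervalNanos;
    }

    /**
     * Returns true if the caller should reconnect now, false if another
     * timeout already triggered a reconnect within the window.
     * The timestamp is passed in (e.g. System.nanoTime()) to keep this testable.
     */
    boolean tryAcquire(long nowNanos) {
        while (true) {
            long last = lastGranted.get();
            if (last != NEVER && nowNanos - last < minIntervalNanos) {
                return false; // still inside the window, suppress this one
            }
            // Only one thread wins the CAS; losers re-read and get suppressed.
            if (lastGranted.compareAndSet(last, nowNanos)) {
                return true;
            }
        }
    }
}
```

In a timeout handler, the caller would check `gate.tryAcquire(System.nanoTime())` and only reconnect when it returns `true`. The window length is an assumption to tune; it would need to be at least as long as the spread between the first and last in-flight command timing out.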

Of course, we are trying to get the underlying problem with Redis resolved too (see #2572), but a workaround like this would still be useful until that gets fixed.

I have checked the wiki, GitHub issues, and GitHub Discussions and found #2082, which is similar, but in that case the TCP packets do not get acked, which leads to a different solution.

I tried setting an absurdly low periodic refresh interval of a few hundred milliseconds, but that does not seem to help. That might be a bug, but I have not looked into it yet.

@tishun tishun modified the milestones: 7.x, Backlog Jul 9, 2024
@tishun tishun added the status: help-wanted An issue that a contributor can help us with label Jul 9, 2024
@tishun tishun added status: blocked An issue that is blocked on an external project change status: waiting-for-feedback We need additional information before we can continue and removed status: help-wanted An issue that a contributor can help us with status: blocked An issue that is blocked on an external project change labels Jul 17, 2024
@tishun
Collaborator Author

tishun commented Jul 17, 2024

Suggested a solution in the discussion, waiting for user feedback.
