Load balancer must check if the server in the provided pool is online before sending a request #64

Open
SiNONiMiTY opened this issue Feb 21, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@SiNONiMiTY

SiNONiMiTY commented Feb 21, 2023

As the title says.

I am hitting this in a scenario where I provide two URLs for a single subgraph, as an array:

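const Fastify = require('fastify')
// Assumption: the gateway plugin is loaded from the @mercuriusjs/gateway package
const mercuriusGateway = require('@mercuriusjs/gateway')

const gateway = Fastify()
gateway.register(mercuriusGateway, {
  gateway: {
    services: [
      {
        name: 'user',
        // Two upstream URLs for the same subgraph; requests are balanced across them
        url: [
          'http://endpoint1:4001/graphql',
          'http://endpoint2:4001/graphql'
        ],
        schema: 'type Query { id: ID }'
      }
    ]
  }
})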

endpoint2 is intentionally taken down and only endpoint1 is working. However,
when sending queries to the gateway, I occasionally receive ECONNREFUSED errors for endpoint2.

The load balancing mechanism should first check whether the host is reachable (for example with a test ping) before sending a request.
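
For illustration, the kind of reachability check proposed here could be sketched as a plain TCP connect probe using Node's built-in `net` module. The names `isReachable` and `filterReachable` are hypothetical and are not part of mercurius or undici. A probe like this only reports whether a connection could be opened at that moment, so it would not catch an upstream that goes away afterwards.

```javascript
'use strict'
const net = require('net')

// Resolves true if a TCP connection to host:port opens within timeoutMs.
function isReachable (host, port, timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port })
    const finish = (ok) => {
      socket.destroy()
      resolve(ok)
    }
    socket.setTimeout(timeoutMs)
    socket.once('connect', () => finish(true))
    socket.once('timeout', () => finish(false))
    socket.once('error', () => finish(false)) // e.g. ECONNREFUSED
  })
}

// Hypothetical helper: keep only the URLs whose host accepts a connection.
async function filterReachable (urls) {
  const checks = await Promise.all(urls.map((u) => {
    const { hostname, port, protocol } = new URL(u)
    const fallbackPort = protocol === 'https:' ? 443 : 80
    return isReachable(hostname, Number(port) || fallbackPort)
  }))
  return urls.filter((_, i) => checks[i])
}
```

One possible use would be to run `filterReachable` over the `url` array before registering the gateway, which only covers upstreams that are already offline at startup.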

@mcollina
Contributor

Unfortunately it's a bit more complex than sending a "ping", as those errors come from existing sockets that are truncated.

How are you shutting down your upstream servers? Are they closing gracefully or are they crashing?

@SiNONiMiTY
Author

> Unfortunately it's a bit more complex than sending a "ping", as those errors come from existing sockets that are truncated.
>
> How are you shutting down your upstream servers? Are they closing gracefully or are they crashing?

I am starting the gateway with only one of the two provided subgraph endpoints online.

@mcollina
Contributor

Thanks, that helps!

I think there is a bug in undici's BalancedPool: it routes requests to an upstream even if it could not connect to it, and it does not retry or send the request elsewhere when the connection fails. Things stabilize over time because of the BalancedPool algorithm, so only a small number of requests would fail.

The bad news is that I don't have time right now to fix it there.

mcollina added the bug (Something isn't working) label Feb 21, 2023
@SiNONiMiTY
Author

> Thanks, that helps!
>
> I think there is a bug in undici's BalancedPool: it routes requests to an upstream even if it could not connect to it, and it does not retry or send the request elsewhere when the connection fails. Things stabilize over time because of the BalancedPool algorithm, so only a small number of requests would fail.
>
> The bad news is that I don't have time right now to fix it there.

Yes! I noticed that the balancing algorithm eventually only selects the online server after sending some requests.
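
The behaviour discussed in this thread (skip an upstream that refused the connection, and send the request elsewhere instead of failing) can be sketched roughly as below. This is not undici's BalancedPool algorithm and not a fix for it. `UpstreamTracker`, `sendWithFailover`, the cooldown value, and the `send` callback are all hypothetical names and assumptions chosen for illustration.

```javascript
'use strict'

// Hypothetical helper, not part of mercurius or undici: remembers which
// upstream URLs recently refused a connection and skips them for a while.
class UpstreamTracker {
  constructor (urls, cooldownMs = 5000) {
    this.urls = urls
    this.cooldownMs = cooldownMs // assumed value, tune as needed
    this.downUntil = new Map() // url -> timestamp until which it is skipped
    this.next = 0
  }

  // Round-robin over upstreams that are not inside their cooldown window.
  pick (now = Date.now()) {
    for (let i = 0; i < this.urls.length; i++) {
      const url = this.urls[(this.next + i) % this.urls.length]
      if ((this.downUntil.get(url) || 0) <= now) {
        this.next = (this.next + i + 1) % this.urls.length
        return url
      }
    }
    return null // every upstream is currently marked down
  }

  markDown (url, now = Date.now()) {
    this.downUntil.set(url, now + this.cooldownMs)
  }
}

// Tries upstreams until one accepts the connection. `send` is any function
// that takes a URL and returns a promise (for example a fetch wrapper).
async function sendWithFailover (tracker, send) {
  let lastError = new Error('no upstream available')
  for (let attempt = 0; attempt < tracker.urls.length; attempt++) {
    const url = tracker.pick()
    if (url === null) break
    try {
      return await send(url)
    } catch (err) {
      // Some clients attach the socket error code to err.cause instead.
      const code = err.code || (err.cause && err.cause.code)
      if (code !== 'ECONNREFUSED') throw err
      tracker.markDown(url)
      lastError = err
    }
  }
  throw lastError
}
```

Only ECONNREFUSED is retried here because a refused connection means the request never reached an upstream. Errors from sockets that were already open, as mentioned earlier in the thread, are a different case, and a mutation may already have been processed when they occur.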

2 participants