-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNMG ANN #231
SNMG ANN #231
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Victor for the PR, it looks really great! I have just a few smaller comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few additional comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are almost there, but the matrix extent type has to be fixed: we should distinguish the neighbor index type (IdxT, usually int64), and the mdspan extent type which is always int64. In a previous review I have marked such changes in ann_mp.cuh
, but actually the public API in ann_mg.hpp
has also a few places where this need to be fixed
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
CI error here seemed to have been a connection error:
|
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/merge |
This PR implements a distributed (single-node-multiple-GPUs) implementation of ANN indexes. It allows to build, extend and search an index on multiple GPUs.
Before building the index, the user has to choose between two modes :
Sharding mode : The index dataset is split, each GPU trains its own index with its respective share of the dataset. This is intended to both increase the search throughput and the maximal size of the index.
Index duplication mode : The index is built once on a GPU and then copied over to others. Alternatively, the index dataset is sent to each GPU to be built there. This intended to increase the search throughput.
SNMG indexes can be serialized and de-serialized. Local models can also be deserialized and deployed in index duplication mode.
Migrated from rapidsai/raft#1993