CAGRA tech debt: distance descriptor and workspace memory #436

Open
wants to merge 2 commits into
base: branch-24.12

achirkin
Contributor

This PR introduces two changes:

  1. Refactor `dataset_descriptor_host` so that it is passed and cached by value, while its state lives in a thread-safe object held by a shared pointer. Previously, the descriptor host itself was stored in a shared pointer in the LRU cache and passed by reference; as a result, cache eviction could in theory destroy it while references to it were still in use.
  2. Make all CAGRA algo implementations allocate their temporary buffers from the workspace resource. As of now, only the SINGLE_CTA algo does this; the PR extends the change to MULTI_CTA and MULTI_KERNEL.
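
To illustrate the lifetime issue behind change 1, here is a minimal standard-library sketch, not the actual cuVS code. The names `descriptor_state`, `descriptor_host`, `descriptor_cache`, and `handle_survives_eviction` are hypothetical, and a plain map stands in for the LRU cache. The handle is copied by value and shares ownership of its state, so evicting the cache entry does not invalidate a copy that a caller still holds.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the descriptor's internal state.
struct descriptor_state {
  std::mutex mtx;    // guards the fields below
  bool ready = false;
  int payload = 0;   // placeholder for whatever the real descriptor holds
};

// Hypothetical by-value handle: copies are cheap and share ownership of the state.
class descriptor_host {
 public:
  explicit descriptor_host(int payload) : state_{std::make_shared<descriptor_state>()}
  {
    state_->payload = payload;
  }

  // Thread-safe access to the shared state.
  int get() const
  {
    std::lock_guard<std::mutex> lock(state_->mtx);
    state_->ready = true;
    return state_->payload;
  }

  // Number of handles currently sharing the state.
  long owners() const { return state_.use_count(); }

 private:
  std::shared_ptr<descriptor_state> state_;
};

// Toy cache (no LRU policy): stores handles by value and hands out copies.
class descriptor_cache {
 public:
  descriptor_host get_or_create(const std::string& key, int payload)
  {
    std::lock_guard<std::mutex> lock(mtx_);
    auto it = entries_.find(key);
    if (it == entries_.end()) { it = entries_.emplace(key, descriptor_host{payload}).first; }
    return it->second;  // a copy, not a reference into the cache
  }

  void evict(const std::string& key)
  {
    std::lock_guard<std::mutex> lock(mtx_);
    entries_.erase(key);
  }

 private:
  std::mutex mtx_;
  std::unordered_map<std::string, descriptor_host> entries_;
};

// The caller's copy stays valid after the cache drops its own entry.
inline bool handle_survives_eviction()
{
  descriptor_cache cache;
  descriptor_host desc = cache.get_or_create("dataset", 42);
  cache.evict("dataset");
  return desc.owners() == 1 && desc.get() == 42;
}
```

Had `get_or_create` returned a reference into the cache instead, the same `evict` call would leave the caller with a dangling reference, which is the failure mode described in item 1.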

Both changes are required for effective use of stream-ordered dynamic batching (#261): change 1 fixes crashes and change 2 fixes thread-blocking behavior.
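
As a rough host-side analogy for change 2, the sketch below uses C++17 `std::pmr` rather than the device memory resources that CAGRA actually uses. The names `counting_resource` and `search_step` and the two helper functions are hypothetical. It assumes, which the description above does not state, that the benefit of a workspace comes from serving repeated scratch allocations out of a pool instead of going to the underlying allocator on every call.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical upstream resource that counts how often it is asked for memory.
class counting_resource : public std::pmr::memory_resource {
 public:
  std::size_t upstream_calls = 0;

 private:
  void* do_allocate(std::size_t bytes, std::size_t align) override
  {
    ++upstream_calls;
    return std::pmr::new_delete_resource()->allocate(bytes, align);
  }
  void do_deallocate(void* p, std::size_t bytes, std::size_t align) override
  {
    std::pmr::new_delete_resource()->deallocate(p, bytes, align);
  }
  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override
  {
    return this == &other;
  }
};

// Hypothetical search step that needs a temporary buffer on every call.
inline float search_step(std::pmr::memory_resource* mr, std::size_t n)
{
  std::pmr::vector<float> scratch(n, 1.0f, mr);  // temporary buffer taken from `mr`
  float sum = 0.0f;
  for (float v : scratch) { sum += v; }
  return sum;
}

// Every call allocates directly from the upstream resource.
inline std::size_t upstream_calls_without_workspace(int n_queries)
{
  counting_resource upstream;
  for (int i = 0; i < n_queries; ++i) { search_step(&upstream, 64); }
  return upstream.upstream_calls;
}

// A pool placed in front of the upstream resource reuses freed blocks.
inline std::size_t upstream_calls_with_workspace(int n_queries)
{
  counting_resource upstream;
  {
    std::pmr::unsynchronized_pool_resource workspace(&upstream);
    for (int i = 0; i < n_queries; ++i) { search_step(&workspace, 64); }
  }  // the pool returns its memory to `upstream` here
  return upstream.upstream_calls;
}
```

In this toy setup the pooled variant reaches the upstream resource far less often than once per query; the exact count depends on the standard library implementation.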

@achirkin added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels on Oct 30, 2024
@achirkin requested a review from a team as a code owner on October 30, 2024 15:48
@github-actions bot added the cpp label on Oct 30, 2024
@achirkin self-assigned this on Oct 30, 2024
Projects
Status: In Progress