Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce vecmem::metal, main branch (2024.10.27.) #300

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

krasznaa
Copy link
Member

@krasznaa krasznaa commented Oct 27, 2024

What should one do on a weekend if not write some useless code... 🤔

I wanted to see for a while now whether Metal was the least bit compatible with how this project works. And after a weekend of goofing around, the answer is eeehhh... 😕

Metal is not a C(++) language/library to begin with, but an Objective-C one. Apple wrote a wrapper itself for calling Metal library functions from C++ code directly, but I didn't get the impression from the documentation that I found, that it would be particularly supported. So I went for interfacing with Metal by adding Objective-C++ source files to this repository. 😮

Similar to SYCL/Vulkan/etc., Metal would like to handle all its memory allocations through language specific buffer objects, which would control the lifetime of those allocations. Which is not quite how memory resources work. But it's also not completely unusable with memory resources. 🤔 The memory resource "just" needs to manage buffer objects itself.

This is what I did as a demonstration in vecmem::metal::shared_memory_resource. Which, on a very basic level, does seem to work on my M1 Mac Mini. 🎉

[zsh][Thranduil]:build > ./bin/vecmem_test_metal
[vecmem] metal/src/utils/device_wrapper.mm:37 Created an "owning wrapper" around device: Apple M1
[vecmem] metal/src/utils/device_wrapper.mm:37 Created an "owning wrapper" around device: Apple M1
[==========] Running 4 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from metal_basic_memory_resource_tests/memory_resource_test_basic
[ RUN      ] metal_basic_memory_resource_tests/memory_resource_test_basic.allocations/shared_resource
[       OK ] metal_basic_memory_resource_tests/memory_resource_test_basic.allocations/shared_resource (0 ms)
[----------] 1 test from metal_basic_memory_resource_tests/memory_resource_test_basic (0 ms total)

[----------] 3 tests from metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible
[ RUN      ] metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible.int_value/shared_resource
[       OK ] metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible.int_value/shared_resource (0 ms)
[ RUN      ] metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible.double_value/shared_resource
[       OK ] metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible.double_value/shared_resource (0 ms)
[ RUN      ] metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible.custom_value/shared_resource
[       OK ] metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible.custom_value/shared_resource (0 ms)
[----------] 3 tests from metal_host_accessible_memory_resource_tests/memory_resource_test_host_accessible (0 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 2 test suites ran. (0 ms total)
[  PASSED  ] 4 tests.
[zsh][Thranduil]:build >

But allocating memory by itself is of course not much of a thing. 🤔 And to actually implement Metal shaders that would do some computation, is a whole other level of complexity. 😢 See:

I actually want to come back to this during future weekends, and goof some more, but of course none of this seems actually usable to us. Building "Metal shader libraries" is a bit of an involved process, which really only works with XCode at the moment. Though with that generator, CMake can be used for it like:

set_source_files_properties("add.metal"
    PROPERTIES
    LANGUAGE METAL
)

(Without having to write code like what we have for SYCL and HIP. Which others have done already, see: https://dpogue.ca/articles/cmake-metal.html)

So yeah, pretty silly, but I thought I would entertain you guys with it nevertheless. 😄

As it turns out, requesting the Cpp language explicitly from
clang-format didn't actually do anything in practice.
It provides just a shared memory resource for now, with the plumbing
necessary to use that resource from C++ code.
Updated one of the tests to not try to perform 0 sized allocations
anymore. Doing so is ill-defined for all memory resources, but
Metal really-really doesn't like it.
@krasznaa krasznaa added the enhancement New feature or request label Oct 27, 2024
Copy link
Member

@paulgessinger paulgessinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat. One could probably also do it by bridging with Swift, but this is probably even more hairy.

assert(m_data);
auto it = m_data->m_buffers.find(p);
assert(it != m_data->m_buffers.end());
m_data->m_buffers.erase(it);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The m_buffers array just contains the pointer right? Does id<MTLBuffer> support RAII in this way, or do you need to do any other cleanup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My own experience with this is limited, but from what I understand, Apple uses ARC these days.

But I've indeed not quite made sure that it would be active in this case. 🤔 I think ARC is in effect all the time unless it's turned off, but at the same time I've also seen @autoreleasepool in some of the examples. But I think that's not needed for the garbage collection... 🤔 (I believe that's a slightly different thing.)

I'll need to check this in action, but I think one just needs to make id objects go out of scope, and the runtime should take care of deleting them eventually.

// managed properly.
m_managedDevice =
std::make_unique<details::opaque_device>(*(parent.m_managedDevice));
m_device = (__bridge void*)m_managedDevice->m_device;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does __bridge do here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on very shaky ground here. 🤔 I was trying to find out from the internets how to do the equivalent of what's happening in vecmem::sycl::queue_wrapper. Where a void* pointer is used to point at a cl::sycl::queue object in memory.

The compiler was happy to let me convert an id<MTLDevice> "object" to a void* pointer, but I was not sure how to "safely" do the reverse. Since the reinterpret_cast that we do in vecmem::sycl::queue_wrapper is not really an option here. (I think...)

After some looking around, I found: https://clang.llvm.org/docs/AutomaticReferenceCounting.html#bridged-casts Tried to use it, and the code compiled without warnings, and ran successfully. 🤔 So I went with this formalism. However it's absolutely true that this code might be doing something silly at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants