This repository has been archived by the owner on Jan 6, 2024. It is now read-only.
For many GPUs (IoT devices using unified memory models, Intel iGPUs, and GPUs with pinned memory), zero-copy buffers are really useful. The examples currently use malloc(), which typically only guarantees 8- or 16-byte alignment. For zero-copy, Intel requires 4096-byte alignment, ARM usually requires 64 bytes, etc. Unfortunately Windows uses different aligned-allocation functions than Linux, but these Stack Overflow answers give nice examples of how to #ifdef wrap a mimic of posix_memalign: https://stackoverflow.com/a/33696858 and https://stackoverflow.com/a/38291021
To make life easy for people, you could create a function that queries CL_DEVICE_MEM_BASE_ADDR_ALIGN (note that OpenCL reports this value in bits, not bytes) and uses it to align host buffers correctly, with a corresponding free call to abstract the Windows/Linux differences.
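Something along these lines might work as a starting point. It is only a sketch: the names `aligned_host_alloc` and `aligned_host_free` are made up for illustration and are not part of OpenCL or of this repository, and the size is rounded up to a multiple of the alignment on the assumption that the driver prefers whole aligned blocks.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign under strict -std modes */
#endif

#include <stddef.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>  /* _aligned_malloc, _aligned_free */
#endif

/*
 * Hypothetical helper (name is an assumption, not an existing API).
 * align_bits is the value obtained from the device, e.g.:
 *
 *   cl_uint align_bits;
 *   clGetDeviceInfo(device, CL_DEVICE_MEM_BASE_ADDR_ALIGN,
 *                   sizeof(align_bits), &align_bits, NULL);
 *
 * OpenCL reports it in bits, so it is converted to bytes here.
 * Returns NULL on failure.
 */
void *aligned_host_alloc(size_t size, unsigned int align_bits)
{
    size_t alignment = (size_t)align_bits / 8;  /* bits -> bytes */

    /* posix_memalign needs at least pointer-sized alignment. */
    if (alignment < sizeof(void *)) {
        alignment = sizeof(void *);
    }
    /* Both platform allocators require a power-of-two alignment. */
    if ((alignment & (alignment - 1)) != 0) {
        return NULL;
    }

    /* Round the size up to a whole number of aligned blocks. */
    size_t padded = (size + alignment - 1) / alignment * alignment;
    if (padded == 0) {
        padded = alignment;
    }

#ifdef _WIN32
    return _aligned_malloc(padded, alignment);
#else
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, padded) != 0) {
        return NULL;
    }
    return ptr;
#endif
}

/* Matching free: memory from _aligned_malloc must not go to free(). */
void aligned_host_free(void *ptr)
{
#ifdef _WIN32
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

The returned pointer could then be passed to clCreateBuffer with CL_MEM_USE_HOST_PTR in place of the malloc() buffers in the examples. Whether the driver actually avoids the copy still depends on the device and its own alignment and size rules.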