-
Notifications
You must be signed in to change notification settings - Fork 45
[KA] support unified memory API #394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Your PR requires formatting changes to meet the project's style guidelines. Click here to view the suggested changes.diff --git a/lib/cl/state.jl b/lib/cl/state.jl
index bed42dc..da99fe9 100644
--- a/lib/cl/state.jl
+++ b/lib/cl/state.jl
@@ -9,7 +9,7 @@ function clear_task_local_storage!()
delete!(task_local_storage(), :CLPlatform)
delete!(task_local_storage(), :CLQueue)
delete!(task_local_storage(), :CLMemoryBackend)
- delete!(task_local_storage(), :CLUnifiedMemoryBackend)
+ return delete!(task_local_storage(), :CLUnifiedMemoryBackend)
end
@@ -164,7 +164,7 @@ struct SVMBackend <: AbstractMemoryBackend end
struct USMBackend <: AbstractMemoryBackend end
struct BufferBackend <: AbstractMemoryBackend end
-function supported_memory_backends(dev::Device; unified=false)
+function supported_memory_backends(dev::Device; unified = false)
backends = AbstractMemoryBackend[]
# unified shared memory is the first choice, as it gives us separate host and device
@@ -197,7 +197,7 @@ function supported_memory_backends(dev::Device; unified=false)
return backends
end
-function default_memory_backend(dev::Device; unified=false)
+function default_memory_backend(dev::Device; unified = false)
supported_backends = supported_memory_backends(dev; unified)
isempty(supported_backends) && return nothing
@@ -234,7 +234,7 @@ end
function unified_memory_backend()
return get!(task_local_storage(), :CLUnifiedMemoryBackend) do
dev = device()
- backend = default_memory_backend(dev; unified=true)
+ backend = default_memory_backend(dev; unified = true)
if backend === nothing
error("Device $(dev) does not support any of the available unified memory backends")
end
diff --git a/src/OpenCLKernels.jl b/src/OpenCLKernels.jl
index e06102c..9891360 100644
--- a/src/OpenCLKernels.jl
+++ b/src/OpenCLKernels.jl
@@ -17,7 +17,7 @@ export OpenCLBackend
struct OpenCLBackend <: KA.GPU
end
-function KA.allocate(::OpenCLBackend, ::Type{T}, dims::Tuple; unified::Bool = false) where T
+function KA.allocate(::OpenCLBackend, ::Type{T}, dims::Tuple; unified::Bool = false) where {T}
if unified
memory_backend = cl.unified_memory_backend()
if memory_backend === cl.USMBackend()
@@ -32,7 +32,7 @@ function KA.allocate(::OpenCLBackend, ::Type{T}, dims::Tuple; unified::Bool = fa
end
end
-KA.supports_unified(::OpenCLBackend) = cl.default_memory_backend(cl.device(); unified=true) !== nothing
+KA.supports_unified(::OpenCLBackend) = cl.default_memory_backend(cl.device(); unified = true) !== nothing
KA.get_backend(::CLArray) = OpenCLBackend()
# TODO should be non-blocking |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #394 +/- ##
==========================================
+ Coverage 80.22% 80.27% +0.05%
==========================================
Files 12 12
Lines 723 730 +7
==========================================
+ Hits 580 586 +6
- Misses 143 144 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Lines 96 to 105 in 53f450a
|
|
Like this? |
849e5cd to
fbe88ca
Compare
|
Do we need a separate |
|
The issue I saw with that is that if we have no USM support but support for BDA, we'll always pick |
|
Yeah, I guess. Although it's definitely an edge case, e.g., I don't know how well we support copies from unified BDA memory to Buffers (to support the unified <-> device copies). |
fbe88ca to
adca066
Compare
|
Merge? |
adca066 to
4d32ac2
Compare
maleadt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah go ahead. I'm not entirely happy with the design, but this is internal anyway.
No description provided.