-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenCL/clfft Integration including CI #26
base: master
Are you sure you want to change the base?
OpenCL/clfft Integration including CI #26
Conversation
pretty awesome, thank you! @Flamefire whenever you have the time, feel free to have a look! :) |
The travis files and testOpenCL.cpp have been updated. If you want to have this PR as a single commit, let me know and I squash the commits. There is another use case which has not been shown here yet. When you want to execute the FFT on a CPU or GPU, you can specify the context target by an enum: enum class ContextDevice {
GPU=CL_DEVICE_TYPE_GPU,
CPU=CL_DEVICE_TYPE_CPU,
ACCELERATOR=CL_DEVICE_TYPE_ACCELERATOR
}; So you could request an OpenCL context for CPU and one for GPU using Context = LiFFT::libraries::clFFT::policies::ContextLocal<true>;
Context context_cpu(ContextDevice::CPU);
Context context_gpu(ContextDevice::GPU);
// ... If there is no GPU, OpenCL uses $ ./test/Test --run_test=OpenCL/TestClFFTR2CInplaceTwoArch*
Running 2 test cases...
1: "ClFFT Informations","Device","Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz", <snip>
2: "ClFFT Informations","Device","Tesla P100-PCIE-16GB", <snip>
TwoArch Sync: Time = 101.061
TwoArch ASync: Time = 88.4254
*** No errors detected I know it is not an exact proof that both FFTs were running concurrently, but it shows the workflow of sync and async architecture specific contexts and it is on the ToDo to play around with liFFT and threaded environments. |
@Flamefire if you are interested to take a look at the implementation of clfft or want to merge it feel free to jump in :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just got around reviewing this. Looks great to me except one place with a trait. Please provide a short explanation and at least rename the trait has_type
to something meaningful. Maybe something like shown here would be more readable? Note that there is a void_t
already defined in the code so C++11 compatibility is ok.
|
||
// SFINAE test if T has type member | ||
template <typename T> | ||
class has_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have trouble understanding the usage of this. I assume this trait is supposed to return true iff a member isComplex
exists, that is constructible from a int?
Then what is the reasoning in using it as used in IntegralTypeImpl
below? If I read that correctly then IntegralTypeImpl<Foo>
returns Foo
for every type Foo
that is either an int or float or simply does not have a isComplex
member which is true for pretty much any class. Wouldn't that make it pointless?
@Flamefire you are absolutely right, this is too messy code and edit: CI failed, I look into this (uh but it did work on my system :P) |
From the comment there seems to be a misunderstanding on what Oh and while I'm on that comment: You don't need the |
thanks for your detailed feedback. I try to summarize the next steps: the goal is to decouple FFT data from FFT executor. FFT properties exist on both sides and become validated at compile-time and that's what we also want for clFFT backend of course. template<typename T, typename TValueType, unsigned T_numDims>
struct LibHandle : public T {
using type = TValueType;
using IdxType = types::Vec< T_numDims, size_t >;
protected:
IdxType m_extents;
}; I cannot derive from cl_mem cldata;
// .. this cldata will contain 2D floats ..
LibHandle<cl_mem,float,2> handle = LiFFT::mem::wrapLibHandle<float>(cldata, extents);
// or just: auto handle = LiFFT::mem::wrapLibHandle<float>(cldata, extents);
// .. implicit conversion to base type cl_mem is also possible
// things like copy does not work as there is no accessor defined in handle
// LiFFT::policies::copy(handle, baseR2CInput); Now the FFT part: auto in_handle = LiFFT::mem::wrapLibHandle<float>(in_cldata, extents);
auto out_handle = LiFFT::mem::wrapLibHandle<Complex<float>>(out_cldata, extents);
using FFT_TYPE = LiFFT::FFT_2D_R2C<float>;
// add the FFT properties to the handle in a wrapper
auto in_wrapped = FFT_TYPE::wrapInputLibHandle(in_handle);
auto out_wrapped = FFT_TYPE::wrapOutputLibHandle(out_handle);
// make FFT based on wrappers' FFT properties
auto fft = LiFFT::makeFFTInQueue<ClFFTContextAPI>(in_wrapped, out_wrapped,
context);
// execute
fft(in_wrapped, out_wrapped, context); Not checked all the possible conflicts under the hood, but what do you think? |
Yes sounds great. Maybe pack in strides too but have them be available as default params? Not sure if this is requires/usefull, just plain pointers may have strides which are checked (I think) by the accessors. |
Hi,
the liFFT interface is extended to support context including queue management with an option for asynchronous functionality.
This is realized by the clfft client and there are several possibilities how one can use liFFT in conjunction with clfft. Some examples to show you the API changes and usage of clfft.
ContextLocal
(RAII),ContextGlobal
(Singleton) andContextWrapper
(wrap raw OpenCL context, device and queue).FFT_LibPtrWrapper
is added to liFFT to handle non-accessible device/library pointers and is aFFT_DataWrapperBase
I hope it provides a usable design now, where we can build on it.
When you have some time, please review and play around with the code :)