CUDA / OpenCL port?

Dear Forum,

Perhaps this has been asked before, but has there been any effort to port parts of LibRaw to CUDA / OpenCL? For 100 MB / 101-megapixel files, even a good modern CPU seems to need several seconds to process a single image. E.g., for a Fuji GFX frame (101 MP RAF), I see a top-end Threadripper using about 4 cores for 4-6 seconds. Since part of the processing (de-Bayering?) seems to benefit a lot from OpenMP, I suspect CUDA / OpenCL could help even more. There is some non-zero cost to moving the raw image data to the GPU, but that could be worth it if the overall processing were 5x-10x faster.

Is anybody aware of any CUDA / OpenCL efforts, or of ports of some parts of LibRaw to CUDA/OpenCL-accelerated code?

Happy New Year!

The library is intended (primarily) for decoding/decompressing RAW data and RAW metadata. Compressed RAW data is laid out in such a way that highly parallel computing brings no gain. The exception is some bulk operations, such as tone conversion through a curve/LUT, but these gain nothing either: transferring the data to and from the accelerator is slower than doing the LUT conversion directly on the CPU, especially on today's multi-core CPUs.
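To make the LUT point concrete, here is a minimal sketch of such a bulk conversion on the CPU. It is not LibRaw code: `apply_lut` and `make_half_lut` are hypothetical helpers, and 16-bit samples in a flat buffer are an assumption. Each sample costs one table read and one write, so the loop is bound by memory bandwidth; for a 101-megapixel frame at 2 bytes per sample that is roughly 200 MB, which an accelerator would have to receive and send back before any gain could show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a LibRaw function): maps every 16-bit sample
// through a 65536-entry lookup table, in place.
void apply_lut(std::vector<std::uint16_t>& samples,
               const std::vector<std::uint16_t>& lut)
{
    assert(lut.size() == 65536);
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(samples.size());
    // Every sample is independent of the others, so the loop can be split
    // across cores when the code is built with OpenMP enabled.
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (std::ptrdiff_t i = 0; i < n; ++i)
        samples[i] = lut[samples[i]];
}

// Illustrative curve only: halves every value. A real tone curve would
// fill the table from the curve definition instead.
std::vector<std::uint16_t> make_half_lut()
{
    std::vector<std::uint16_t> lut(65536);
    for (std::size_t v = 0; v < lut.size(); ++v)
        lut[v] = static_cast<std::uint16_t>(v / 2);
    return lut;
}
```

Calling `apply_lut(samples, make_half_lut())` on a decoded buffer is all it takes; timing that call against a host-to-device and device-to-host copy of the same buffer is a fair way to check the claim on your own hardware.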

Postprocessing can certainly gain a lot from using parallel computing, but postprocessing is not the task of the library and the corresponding code is included there only as a minimal and imperfect example.
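As an illustration of why postprocessing parallelizes well, below is a minimal sketch of one demosaicing step: bilinear interpolation of the green channel. It is not the code shipped in LibRaw; `interpolate_green` is a hypothetical function, and an RGGB mosaic stored row by row without padding is an assumption. Every output pixel depends only on a small, fixed neighborhood of the input, so pixels can be computed in any order, one per CPU thread or GPU thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fills a full-resolution green plane from a Bayer
// mosaic. Assumes an RGGB layout, where green sites are exactly those
// with (x + y) odd, and a tightly packed width * height buffer.
std::vector<std::uint16_t> interpolate_green(
    const std::vector<std::uint16_t>& raw, int width, int height)
{
    std::vector<std::uint16_t> green(raw.size());
    const int dx[4] = {-1, 1, 0, 0};
    const int dy[4] = {0, 0, -1, 1};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * width + x;
            if ((x + y) % 2 == 1) {
                green[idx] = raw[idx];  // green site: copy the measurement
                continue;
            }
            // Red or blue site: its four direct neighbors are all green.
            // Average the ones that fall inside the image.
            unsigned sum = 0, count = 0;
            for (int k = 0; k < 4; ++k) {
                const int nx = x + dx[k];
                const int ny = y + dy[k];
                if (nx < 0 || ny < 0 || nx >= width || ny >= height)
                    continue;
                sum += raw[static_cast<std::size_t>(ny) * width + nx];
                ++count;
            }
            green[idx] = static_cast<std::uint16_t>(count ? sum / count : 0);
        }
    }
    return green;
}
```

The input would be the decoded mosaic (for example, what LibRaw's unpack() produces), with the interpolation itself living in your own application. Production demosaicing uses far better algorithms than bilinear averaging, but they share this local, order-independent structure.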

However, you can try to do parallel decompression of RAW data. If the result obtained is many times faster than on the CPU and you agree to provide your code for our library, we will gladly accept this contribution.
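For anyone attempting this, the typical obstacle can be sketched as follows. The scheme here is a deliberately simplified, hypothetical one (each sample stored as a difference from the previous sample in its row, with the predictor reset to 0 at the start of every row); it does not describe any specific camera format, and `decode_row` / `decode_rows` are illustrative names. Within a row each sample needs the previous result, so that loop is sequential; only units that restart the predictor, such as the rows in this sketch, can be decoded independently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical differential scheme: diffs[i] is the difference between
// sample i and sample i - 1 of the same row; the first sample is stored
// as a difference from 0.
std::vector<std::uint16_t> decode_row(const std::vector<std::int32_t>& diffs)
{
    std::vector<std::uint16_t> out(diffs.size());
    std::int32_t prev = 0;
    for (std::size_t i = 0; i < diffs.size(); ++i) {
        prev += diffs[i];  // needs the previous sample: sequential by nature
        out[i] = static_cast<std::uint16_t>(prev);
    }
    return out;
}

// Rows do not depend on each other in this scheme, so this outer loop is
// the only place where the work could be spread across threads.
std::vector<std::vector<std::uint16_t>> decode_rows(
    const std::vector<std::vector<std::int32_t>>& rows)
{
    std::vector<std::vector<std::uint16_t>> image;
    image.reserve(rows.size());
    for (const auto& row : rows)
        image.push_back(decode_row(row));
    return image;
}
```

In real files the differences are usually also packed with variable-length codes, so a decoder cannot know where an independent unit starts in the bitstream unless the format stores those offsets. How much parallelism is available therefore depends entirely on the format.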

Everything related to postprocessing (debayering and so on) should be done outside of LibRaw; we do not plan to extend this part of our library.

-- Alex Tutubalin @LibRaw LLC