GPU-accelerated image processing using Quasar

Overview

GPU-accelerated image/video processing

Today’s algorithms (e.g., image/video processing, hyperspectral sensor data, …) operate on huge amounts of data. For many algorithms, good computational performance is indispensable for use in practical applications. These applications often target a wide variety of devices, such as desktop PCs, tablets, smartphones and mini PCs.

Modern GPUs offer speedups of 10x-200x for highly parallel processing tasks, but their main disadvantage is the difficulty of programming. Programming a GPU properly requires extensive, in-depth knowledge of the details of the GPU, and the development effort is usually high. As a result, GPUs are not easily used for research purposes, e.g., for devising and testing new algorithms. When CPUs and GPUs of different types and models are combined, development and debugging become even more complex.

One of our concerns is that training a developer (in academia, sometimes a.k.a. “Ph.D. student”) in CUDA/OpenCL or related environments easily takes one year of effort, time that cannot be spent on other, more useful tasks (e.g., research, algorithm design). Moreover, the code cannot easily be shared with other people, since they must go through the same learning process. There exist some shortcuts, such as GPU libraries, but these do not offer full flexibility, and in many cases GPU kernels still need to be written manually, which brings back the shortcomings mentioned above.

A solution…

Quasar is a novel development environment that aims to solve the above problems by:

reducing the complexity of heterogeneous CPU/GPU programming to that of programming in a high-level language such as Matlab or Python. Using Quasar, heterogeneous programming can be done straightforwardly in a (mostly) hardware-agnostic manner. No sophisticated programming knowledge is required and the barrier to entry is low (read: a few days to a few weeks).

providing an IDE with an integrated debugger and profiling tools to ease development even further. The debugger allows the programmer to pause the program at any time during execution, display tooltips, inspect variables and even modify the program on the fly.

Real-time video processing for autonomous vehicles using Quasar

handling all difficult and time-consuming aspects of heterogeneous system development through the cooperation of a compiler and a runtime system. Many aspects of compilation, code generation and the runtime can still be controlled by the programmer. Quasar’s goal is to generate code that is on a par with hand-written and hand-tuned CUDA/OpenCL code, and it has already achieved this goal in a number of cases.

Screenshot of the Quasar Redshift IDE

Some notable features are:

  • code generator support for multiple back-ends (CUDA, OpenCL) and many available open source/commercial C++ compilers. Works with NVIDIA, AMD and Intel GPUs. Also works without a GPU (uses OpenMP for multi-threading).

  • platform-agnostic: Windows, Linux (Mac OS extension planned). Architectures: x86/x64 and ARM.

  • run-time system performing device management, automatic memory management, memory transfers, heterogeneous load-balancing and scheduling.

  • various compilation techniques, such as automatic loop parallelization, generation of parallel reduction algorithms, shared memory caching, stencil code optimization, branch divergence reduction, automatic code specialization, …

  • easy use of multiple GPUs and various GPU features (texture memory, constant memory, concurrent streams, dynamic parallelism) with little or even no changes to the program code.

  • integration of various libraries, such as OpenGL (visualization), cuFFT (fast Fourier transforms on GPU), cuBLAS (Basic Linear Algebra) and cuDNN (deep learning on GPU).
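To make one of the compilation techniques listed above more concrete, the following is a minimal sketch of the general pattern behind a parallel reduction. It is written in plain Python rather than Quasar, it is not the code that the Quasar compiler generates, and the function name `tree_reduce` is a hypothetical name chosen for this illustration. The sketch only shows why a reduction can be parallelized: values are combined pairwise, level by level, and all pairs within one level are independent of each other.

```python
def tree_reduce(values, combine):
    """Reduce a sequence pairwise, level by level (tree-shaped reduction).

    Illustrative sketch only: this runs sequentially on the CPU. On a GPU,
    the pairs within one level could each be handled by a separate thread,
    because they do not depend on each other. `combine` is assumed to be
    associative (e.g. addition, max, string concatenation).
    """
    level = list(values)
    if not level:
        raise ValueError("cannot reduce an empty sequence")

    while len(level) > 1:
        # Combine neighbouring pairs; each pair is independent of the others.
        next_level = [
            combine(level[i], level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        # An odd element at the end is carried over to the next level unchanged.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level

    return level[0]


if __name__ == "__main__":
    pixels = [3, 9, 2, 8, 5]
    print(tree_reduce(pixels, lambda a, b: a + b))  # sum of all values
    print(tree_reduce(pixels, max))                 # largest value
```

With n input values, the loop runs for roughly log2(n) levels instead of n - 1 sequential steps, which is the property that makes reductions such as sums, minima and maxima suitable for highly parallel hardware.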

Example real-time applications of Quasar:

  • Real-time SLAM using Quasar

  • Real-time video processing using Quasar

Quasar is currently used intensively by the Image Processing Research group of Ghent University as well as by some other Flemish research groups (for example at the Flemish institute for biotechnology). Quasar is powered by Ghent University and imec.

Quasar is available for testing on request (go to “try Quasar”).

Contact: dr. ir. Bart Goossens

Publications

Main reference:

  • B. Goossens, J. De Vylder and W. Philips, “Quasar - a New Heterogeneous Programming Framework for Image and Video Processing Algorithms on CPU and GPU,” IEEE Int. Conf. on Image Processing (ICIP2014), Oct. 27-30, 2014, Paris, France, p. 2183-2185.

Application papers (realized with Quasar):

  • Donné, S., De Vylder, J., Goossens, B., & Philips, W. (2016). MATE: Machine Learning for Adaptive Calibration Template Detection. Sensors, 16(11), 1858.

  • Roels, J., Aelterman, J., De Vylder, J., Lippens, S., Luong, H. Q., Guérin, C. J., & Philips, W. (2016). Image Degradation in Microscopic Images: Avoidance, Artifacts, and Solutions. In Focus on Bio-Image Informatics (pp. 41-67). Springer International Publishing.

  • J. De Vylder, D. Van Haerenborgh, J. Roels and B. Goossens, “Quasar tutorial: High-level programming of Heterogeneous Hardware,” HiPEAC 2016, Jan. 18-20, 2016, Prague.

  • J. De Vylder, S. Donné, D. Van Haerenborgh and B. Goossens, “Real-time Machine Vision with GPU-acceleration using Quasar,” IS&T Electronic Imaging, Feb. 14-18, 2016, San Francisco, CA, USA.

  • B. Goossens, S. Donné, J. Aelterman, J. De Vylder, D. Van Haerenborgh and W. Philips, “Real-time depth estimation and view interpolation using Quasar,” IS&T Electronic Imaging, Feb. 14-18, 2016, San Francisco, CA, USA.

  • D. Van Haerenborgh, J. De Vylder and B. Goossens, “Quasar: rapid prototyping for image/video processing on heterogeneous hardware,” Int. Conf. Acoust. Speech and Signal Proc. (ICASSP), Mar. 20-25, 2016, Shanghai, China.

  • M. Vlaminck, H. Luong, H. Vu, P. Veelaert and W. Philips. “Indoor assistance for visually impaired people using a RGB-D camera.” 2016 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI). IEEE, 2016.

  • Luong, H. Q., Vlaminck, M., Goeman, W., & Philips, W. (2016, July). Consistent ICP for the registration of sparse and inhomogeneous point clouds. In Communications and Electronics (ICCE), 2016 IEEE Sixth International Conference on (pp. 262-267). IEEE.

  • Donné, S., Aelterman, J., Goossens, B., & Philips, W. (2015, October). Fast and Robust Variational Optical Flow for High-Resolution Images Using SLIC Superpixels. In International Conference on Advanced Concepts for Intelligent Vision Systems (pp. 205-216). Springer International Publishing.

  • Roels, J., De Vylder, J., Saeys, Y., Goossens, B., & Philips, W. (2016, October). Decreasing Time Consumption of Microscopy Image Segmentation Through Parallel Processing on the GPU. In International Conference on Advanced Concepts for Intelligent Vision Systems (pp. 147-159). Springer International Publishing.

  • B. Goossens, J. De Vylder, S. Donné and W. Philips, “Demo: Quasar - a New Programming Framework for Real-Time Image/Video Processing on GPU and CPU,” in Ninth International Conference on Distributed Smart Cameras (ICDSC 2015), Seville, Spain, Sept. 8-11, 2015, p. 205-206.
