Wednesday, 11 April 2012

Linux/Kubuntu - Qt + Cuda + QtCreator

"Cuda Toolkit 4.1 and Qt4 on Linux"

It is time for another tutorial. Today we want to set up Qt + CUDA on Linux/Kubuntu. It is recommended to install Qt from the repositories, but in the 3rd section I also wrote down the instructions for compiling Qt from scratch. If you find any oddities, just leave a comment (this isn't the best HowTo, I am still learning).

1. GPU Drivers + CUDA Toolkit 4.1 + CUDA SDK
2. QtCreator and Qt4 Example Project with CUDA
3. Compile Qt4 from scratch (optional)
4. Tools - Debugging and Profiling


My System:
- Kubuntu 11.10 (Oneiric Ocelot), 32-bit
- g++ v4.5, gcc v4.5 (4.6 and up are not supported by CUDA Toolkit 4.1, see here)
- Qt 4.7.4 (dev-tools, qtcreator)

- Cuda Toolkit 4.1 + SDK (I took the downloads for Ubuntu 11.04, 32-bit)
- qt4_mandelbrot (Qt4 example project with Cuda kernel)


1. Get the NVidia / CUDA stuff installed:

I've got here:

1.1 Drivers
First of all we update the NVIDIA drivers. I had trouble with the drivers I downloaded myself (videos showed up with a blue tint), so I installed them from the x-updates PPA instead, which did it for me:

sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
sudo apt-get update
sudo apt-get install nvidia-current

I uninstalled all the NVIDIA drivers beforehand, but maybe you don't have to. I made a backup of my /etc/X11/xorg.conf, switched to a text console (Ctrl+Alt+F1) and stopped the X server with sudo stop kdm. Then I purged my old NVIDIA packages, ran the commands from above and finally ran sudo start kdm to get back to the desktop. I hope you won't need to do that.

Now I have NVidia driver version 295.40.

(Refs: hecticgeek)

1.2 Toolkit
Just run:
sudo ./

Edit your .bashrc:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib:$LD_LIBRARY_PATH
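
If you script your setup, the two lines above can be added in a way that does not duplicate them when the script runs again. A minimal sketch for a POSIX shell; the helper name append_once is made up for this example and is not part of the CUDA installer:

```shell
# append_once FILE LINE (hypothetical helper): add LINE to FILE unless an
# identical line is already there, so running it again changes nothing.
append_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Usage (single quotes keep $PATH literal, so it is expanded when the
# shell starts, not when the line is written):
# append_once ~/.bashrc 'export PATH=/usr/local/cuda/bin:$PATH'
# append_once ~/.bashrc 'export LD_LIBRARY_PATH=/usr/local/cuda/lib:$LD_LIBRARY_PATH'
```

Open a new terminal afterwards (or run source ~/.bashrc); with the default install location, which nvcc should then point into /usr/local/cuda/bin.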

To install the gpu computing sdk, just run as user:

The examples are not built yet. You have to run make in ~/NVIDIA_GPU_Computing_SDK, and in most cases errors will occur.

Error Messages:

"unsupported GNU version! gcc 4.6 and up are not supported!"
Install gcc-4.5 and g++-4.5 as a second compiler and symlink them into /usr/local/cuda/bin:
sudo ln -s /usr/bin/gcc-4.5 /usr/local/cuda/bin/gcc
sudo ln -s /usr/bin/g++-4.5 /usr/local/cuda/bin/g++
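
To see in advance whether a given compiler will trip that check, you can compare its version against the 4.6 limit from the error message. A small sketch for a POSIX shell; gcc_supported is a hypothetical helper name, and the limit is the one for Toolkit 4.1 only:

```shell
# gcc_supported VERSION (hypothetical helper): succeed if VERSION, e.g. "4.5.4",
# is older than 4.6, the limit named in the nvcc error for Toolkit 4.1.
gcc_supported() {
    major=${1%%.*}          # "4.5.4" -> "4"
    rest=${1#*.}            # "4.5.4" -> "5.4"
    minor=${rest%%.*}       # "5.4"   -> "5"
    [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 6 ]; }
}

# Check the compiler behind the symlink in the CUDA bin directory, if present.
cc=/usr/local/cuda/bin/gcc
if [ -x "$cc" ]; then
    if gcc_supported "$("$cc" -dumpversion)"; then
        echo "$cc is fine for Toolkit 4.1"
    else
        echo "$cc is too new for Toolkit 4.1"
    fi
fi
```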

rendercheck_gl.cpp:(.text+0x119b): undefined reference to 'gluErrorString'
cannot find -lcuda
Lib paths are messed up in and (see here). Path to nvidia_current is missing. I have uploaded a patched version.

Download Patch

Maybe you have to edit the path to your nvidia driver where libcuda and others are located. Edit in that downloaded patch the files and at the beginning:
# ### patch for finding libcuda
NVIDIA_CURRENT = /usr/lib/nvidia-current/

There are still some errors in freeImageInteropNPP in CUDALibraries, but I don't care about them for the moment. The examples can be compiled in NVIDIA_GPU_Computing_SDK/C/ directly. I ran deviceQuery to check whether the installation was successful.
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release$ ./deviceQuery

If this works for you, then everything with CUDA is fine. If you get "./deviceQuery: error while loading shared libraries: cannot open shared object file: No such file or directory", then you forgot to add the CUDA libs to LD_LIBRARY_PATH.
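
A quick way to check for that mistake is to test whether the CUDA library directory is an entry of LD_LIBRARY_PATH. A sketch for a POSIX shell; path_contains is a hypothetical helper, and it compares entries literally (so /usr/local/cuda/lib/ with a trailing slash counts as a different entry):

```shell
# path_contains LIST DIR (hypothetical helper): succeed if DIR is one of the
# colon-separated entries of LIST.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if path_contains "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib; then
    echo "CUDA libs are on LD_LIBRARY_PATH"
else
    echo "add /usr/local/cuda/lib to LD_LIBRARY_PATH (see section 1.2)"
fi
```

In addition, ldd ./deviceQuery lists every shared library the binary needs and marks the ones it cannot resolve as "not found".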


2. Qt4 Projects with CUDA on Linux

Start your QtCreator and create a new project (or download qt4_mandelbrot as example). My structure is:
qt4_mandelbrot - source code
qt4_mandelbrot/obj - object code including cuda object code
qt4_mandelbrot/bin - binaries

// I've disabled the shadow build option in the qt4 project settings, because I configured the directories in the .pro file on my own.

For the CUDA settings in the .pro file I refer to this site. You can also have a look at the qt4_mandelbrot project.

In QtCreator you need to set the build environment. Go to Projects and add a new variable "LD_LIBRARY_PATH" with the value "/usr/local/cuda/lib" to the Build Environment. Check that the Run Environment has this setting as well.
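
Whatever the .pro file looks like in detail, for every .cu file the build has to call nvcc to produce an object file in obj/ that is then linked with the Qt objects. A sketch of that step as plain shell, so you can try it outside of qmake; the file name kernel.cu, the _cuda.o suffix and both helper names are assumptions for this example, not taken from qt4_mandelbrot:

```shell
# cuda_obj_path SRC [OBJDIR] (hypothetical helper): map a CUDA source file to
# its object file, e.g. kernel.cu -> obj/kernel_cuda.o.
cuda_obj_path() {
    base=$(basename "$1" .cu)
    printf '%s/%s_cuda.o\n' "${2:-obj}" "$base"
}

# nvcc_compile_cmd SRC [OBJDIR] (hypothetical helper): print the nvcc call
# that compiles SRC into that object file (-c: compile only, do not link).
nvcc_compile_cmd() {
    printf 'nvcc -c %s -o %s\n' "$1" "$(cuda_obj_path "$1" "$2")"
}

nvcc_compile_cmd kernel.cu obj
# prints: nvcc -c kernel.cu -o obj/kernel_cuda.o
```

When linking the final binary by hand, the CUDA runtime typically has to be added with -L/usr/local/cuda/lib -lcudart next to the Qt libraries.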

Press Ctrl+B or Ctrl+R to build/run the project.


3. Qt4 - compile it from scratch (optional)

- Qt Environment (Download Page)
- - Qt Sourcecode (tar.gz) (v4.8.1)
- - Qt Creator (32bit Binary, v2.4.1)

untar source code:
tar -xf qt-everywhere-opensource-src-4.8.1.tar.gz

configure Qt for compilation ("-release", "-shared" are actually default)
./configure -release -shared -no-qt3support -no-webkit -optimized-qmake -no-multimedia -no-phonon -nomake examples -nomake demos

compile and install (takes a while; I run two jobs at once because I have two CPUs, and only the install step needs root):
make -j2
sudo make install

install Qt Creator (make the installer executable, then run it)
chmod +x qt-creator-linux-x86-opensource-2.4.1.bin
./qt-creator-linux-x86-opensource-2.4.1.bin

start Qt Creator and set the path to qmake (Qt Creator - Tools - Options: Qt Versions). With the default install prefix this is /usr/local/Trolltech/Qt-4.8.1/bin/qmake.

Now you can try a qt application created by the project wizard of Qt Creator. Press Ctrl+R to compile and run.

To get access to Qt on shell/console, you have to extend the PATH variable. Edit .profile and .bashrc (home directory) and add:
export PATH=/usr/local/Trolltech/Qt-4.8.1/bin/:$PATH

To uninstall everything again:
Qt: cd /pathto/qt_source and sudo make uninstall
QtCreator: cd /pathto/qtcreator/ and run ./uninstall


4. Tools - Debugging and Profiling
For debugging you can try DDD (sudo apt-get install ddd):
ddd --debugger cuda-gdb ./application

If you have only a single GPU, you can't debug kernels locally yet. NSight 2.2 enables single-GPU debugging again (but NSight is only for Windows/Visual Studio), so I hope the upcoming Toolkit 4.2 will do this for Linux as well.

For information on memory leaks and access errors just use cuda-memcheck (see manual ;) ).

To profile the GPU side you can use the NVIDIA profiler (nvvp). Just run nvvp and open your binary in a new session.

You can profile the host side with gprof or oprofile. gprof needs the compiler flag -pg, which is also available in nvcc. Just have a look at the .pro file that comes with this tutorial's example project. To get a profile you have to run the binary, which creates a gmon.out for the gprof profiler. Now you can generate a readable output:
gprof ./qt4_mandelbrot > output.txt
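
The run-then-report sequence can be wrapped so that a missing gmon.out is noticed right away. A sketch for a POSIX shell; gprof_report is a hypothetical helper name, and it assumes the binary writes gmon.out into the current directory, which is what gprof-instrumented programs do by default:

```shell
# gprof_report BINARY [OUTFILE] (hypothetical helper): run BINARY, then turn
# the gmon.out it leaves behind into a text report. Returns 2 when no
# gmon.out appears, which usually means the binary was not built with -pg.
gprof_report() {
    bin=$1
    out=${2:-output.txt}
    rm -f gmon.out                      # do not report on a stale profile
    "$bin" || return 1
    if [ ! -f gmon.out ]; then
        echo "no gmon.out: was $bin compiled and linked with -pg?" >&2
        return 2
    fi
    gprof "$bin" gmon.out > "$out"
}

# Usage, from the directory that contains the binary:
# gprof_report ./qt4_mandelbrot output.txt
```

Note that gmon.out is only written when the program exits normally, so quit the application through its regular exit path instead of killing it.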

The alternative is oprofile (sudo apt-get install oprofile oprofile-gui):
"OProfile is a system-wide profiler for Linux systems, capable of profiling all running code at low overhead. [...] OProfile leverages the hardware performance counters of the CPU [...]" (oprofile website)

So oprofile offers CPU- and hardware-based profiling, which gprof doesn't. For usage I just refer to this site. If you want to read more about the sampling methods and the advantages of oprofile, read this site.