vcio2 programming guide


Use the GPU with vcio2

Open /dev/vcio2 with open in read/write mode.
Allocate an appropriate amount of GPU memory with mmap.
Place the shader code, the uniforms and the control blocks into the allocated GPU memory. Don't forget that the GPU only accepts bus addresses, not pointers into the application's address space.
Although the GPU can access all physical memory, there is no reasonable way to use application memory directly: it may be fragmented in physical memory, and the GPU cannot use scatter-gather lists.
Run your code on the GPU with IOCTL_EXEC_QPU.
Use an appropriate timeout. This will recover from GPU crashes in most cases.
Repeat the last two steps as needed.
Close the device when you don't need it anymore. This releases all GPU memory allocated by this handle.
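
The first, second and last steps above can be sketched with plain POSIX calls. This is a minimal sketch, not the sample program: the struct and function names are hypothetical, and the mmap flags (MAP_SHARED, offset 0) are assumptions. The IOCTL_EXEC_QPU step is left out on purpose, because its request structure is defined in vcio2.h and described in the API reference.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* One open device plus one block of memory mapped from it (hypothetical). */
struct gpu_session {
    int    fd;    /* descriptor from open(), -1 if not open */
    void  *mem;   /* application-side view of the block, NULL if unmapped */
    size_t size;  /* size of the block in bytes */
};

/* Open the device in read/write mode and map `size` bytes from it.
 * Returns 0 on success, -1 on failure with errno set by open() or mmap().
 * Assumption: MAP_SHARED with offset 0 is what the driver expects. */
int gpu_session_begin(struct gpu_session *s, const char *path, size_t size)
{
    assert(s != NULL && path != NULL && size > 0);
    s->fd = -1;
    s->mem = NULL;
    s->size = 0;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) {
        int saved = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    s->fd = fd;
    s->mem = mem;
    s->size = size;
    return 0;
}

/* Unmap the block and close the device. Safe to call on a failed session. */
void gpu_session_end(struct gpu_session *s)
{
    assert(s != NULL);
    if (s->mem != NULL)
        munmap(s->mem, s->size);
    if (s->fd >= 0)
        close(s->fd);
    s->fd = -1;
    s->mem = NULL;
    s->size = 0;
}
```

A caller would use gpu_session_begin(&s, "/dev/vcio2", size), fill s.mem, run the GPU code as often as needed, and finish with gpu_session_end(&s).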

See the sample program for further details.

See also vc4asm for a powerful and free QPU macro assembler.

Migrating from hello_fft style mailbox

If you have an application that uses the old vcio driver, e.g. hello_fft, take the following steps to migrate to vcio2:

Copy include/soc/bcm2835/vcio2.h to your source files.
Replace the files mailbox.h and mailbox.c with the ones from the sample/porting directory, or with the ones from sample/hybrid if the application should also run on installations without the vcio2 driver.
Adjust the include path of vcio2.h in mailbox.c.
Add an additional first parameter with the device handle returned from mbox_open to any call to mapmem.
Compile and run.
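
The change to the mapmem calls might look like the following sketch. The bodies of mbox_open and mapmem below are stand-ins so that the snippet compiles on its own; they are not the real implementations, which come from the mailbox.c of sample/porting or sample/hybrid. The helper map_block and the parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins only: NOT the real implementations from mailbox.c. */
static unsigned char fake_block[4096];

static int mbox_open(void)
{
    return 3; /* pretend device handle */
}

static void *mapmem(int file_desc, unsigned base, unsigned size)
{
    (void)base; /* unused by the stand-in */
    if (file_desc < 0 || size > sizeof fake_block)
        return NULL;
    return fake_block;
}

/* Hypothetical helper that shows the migrated call site. */
void *map_block(int mb, unsigned base, unsigned size)
{
    assert(size > 0);
    /* Before migration (old mailbox API):  mapmem(base, size)      */
    /* After migration: the device handle from mbox_open goes first. */
    return mapmem(mb, base, size);
}
```

In practice this means that the handle returned by mbox_open has to be kept around and passed to every place that maps memory.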

Now your application should use the new kernel driver. It no longer requires root privileges to run if you grant access to /dev/vcio2.
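
Whether access has been granted can be checked before opening the device. The following is a small sketch using the POSIX access call; the function name is hypothetical.

```c
#include <assert.h>
#include <unistd.h>

/* Returns 1 if the calling user may read and write `path`, else 0. */
int device_is_accessible(const char *path)
{
    assert(path != NULL);
    return access(path, R_OK | W_OK) == 0;
}
```

For example, device_is_accessible("/dev/vcio2") lets an application print a helpful message about missing permissions instead of failing later in open.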

Some applications (including hello_fft) access the VideoCore IV control registers directly for faster access. This is not supported by vcio2 and never will be. It is strongly recommended not to do so, because it causes serious race conditions in a multitasking environment. You need to refactor the code to always use the vcio2 driver for GPU access.

Using vcio2 concurrently

The vcio2 device may be opened as often as you like. The only limiting factor is the amount of GPU memory available.

Calls to IOCTL_EXEC_QPU are strictly serialized. So no two applications can run code on the GPU simultaneously. However, they can run code alternately.

You should not use the /dev/vcio device or a character device of the built-in vcio driver concurrently with /dev/vcio2. Calls to these devices are not understood by vcio2 and may seriously interfere with /dev/vcio2 access. For example, an application using /dev/vcio might switch off the power of the GPU while another application is running GPU code. Future versions of vcio2 might lock the other devices while /dev/vcio2 is open at least once.
This restriction does not apply to calls that are independent of each other, e.g. reading the GPU memory size or even GPU memory allocations. The vcio2 driver and the vcio driver share a common mutex for this purpose. Only indirect dependencies are not tracked.

Performance counters

vcio2 manages access to the V3D performance counters for all device users. There is, however, no way to synchronize access to the hardware registers with other applications or drivers that access these counters. That is, as soon as any application enables performance counters via IOCTL_SET_V3D_PERF_COUNT, the counters must not be accessed in any other way until the feature is disabled again.

There are no more restrictions that I know of.

Sample application

In the sample folder there is a simple sample application that uses the vcio2 device in different variants, each in its own folder. The computations are not meaningful: it applies all available QPU operators to a set of constants.

Variants

native: This is the simplest way to use the vcio2 driver. This is recommended for new developments that depend on vcio2.
porting: This sample application uses an API close to the mailbox API of hello_fft. It only routes the function calls to the corresponding IOCTLs of vcio2.
hybrid: This implementation first checks whether /dev/vcio2 is available. If it is present, the behavior is the same as in variant 'porting'. If not, the code falls back to the built-in /dev/vcio driver.
You should use this pattern only if you want to ensure that your application does not strictly depend on vcio2. There are several drawbacks, however. First of all, without vcio2 the application always requires root privileges.
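
The check done by the hybrid variant can be sketched as follows. This is not the code from sample/hybrid: the enum and function names are hypothetical, and opening the fallback device with O_RDWR is an assumption.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Which driver ended up being used. */
enum gpu_backend {
    GPU_BACKEND_NONE  = -1,  /* neither device could be opened */
    GPU_BACKEND_VCIO2 = 0,   /* preferred device, e.g. /dev/vcio2 */
    GPU_BACKEND_VCIO  = 1    /* fallback device, e.g. /dev/vcio */
};

/* Try the preferred device first, then the fallback.
 * Returns an open descriptor or -1, and reports the choice in *backend. */
int open_gpu_device(const char *preferred, const char *fallback,
                    enum gpu_backend *backend)
{
    assert(preferred != NULL && fallback != NULL && backend != NULL);

    int fd = open(preferred, O_RDWR);
    if (fd >= 0) {
        *backend = GPU_BACKEND_VCIO2;
        return fd;
    }

    fd = open(fallback, O_RDWR);
    if (fd >= 0) {
        *backend = GPU_BACKEND_VCIO;
        return fd;
    }

    *backend = GPU_BACKEND_NONE;
    return -1;
}
```

A call such as open_gpu_device("/dev/vcio2", "/dev/vcio", &backend) keeps the decision in one place, so the rest of the code only has to branch on backend where the two drivers differ.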

How to build and run

Enter one of the sample directories.
Type make
Run ./smitest

Now the test application should print some pages of results to the console. Of course, you need to install the vcio2 driver first. If you have chosen the hybrid directory, the application may run without the vcio2 driver installed as well, but in this case you must run it as root.