SetupFlags

io_uring_setup() flags

Values

Value    Meaning
NONE = 0

No flags set

IOPOLL = 1U << 0

IORING_SETUP_IOPOLL

Perform busy-waiting for an I/O completion, as opposed to getting notifications via an asynchronous IRQ (Interrupt Request). The file system (if any) and block device must support polling for this to work. Busy-waiting provides lower latency, but may consume more CPU resources than interrupt-driven I/O. Currently, this feature is usable only on a file descriptor opened with the O_DIRECT flag. When a read or write is submitted to a polled context, the application must poll for completions on the CQ ring by calling io_uring_enter(2). It is illegal to mix polled and non-polled I/O on an io_uring instance.

SQPOLL = 1U << 1

IORING_SETUP_SQPOLL

When this flag is specified, a kernel thread is created to perform submission queue polling. An io_uring instance configured in this way enables an application to issue I/O without ever context switching into the kernel. By filling in new submission queue entries and watching for completions on the completion queue, the application can submit and reap I/Os without making a single system call. If the kernel thread is idle for more than sq_thread_idle milliseconds, it will set the IORING_SQ_NEED_WAKEUP bit in the flags field of the struct io_sq_ring. When this happens, the application must call io_uring_enter(2) to wake the kernel thread. If I/O is kept busy, the kernel thread will never sleep. An application making use of this feature will need to guard the io_uring_enter(2) call with the following code sequence:

// Ensure that the wakeup flag is read after the tail pointer has been written.
smp_mb();
// Enter the kernel only if the poll thread has gone to sleep.
if (*sq_ring->flags & IORING_SQ_NEED_WAKEUP)
    io_uring_enter(fd, 0, 0, IORING_ENTER_SQ_WAKEUP, NULL);

where sq_ring is a submission queue ring set up using the struct io_sqring_offsets described below.
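In portable userspace C, the same guard might be written with C11 atomics. This is a minimal sketch, not the man page's code: sq_thread_needs_wakeup and SQ_NEED_WAKEUP_BIT are hypothetical names (the bit is redefined locally with the same value as IORING_SQ_NEED_WAKEUP so the sketch builds without kernel headers), and the pointer is assumed to address the flags word of the mapped SQ ring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as IORING_SQ_NEED_WAKEUP in <linux/io_uring.h>; renamed here
 * (hypothetical name) so the sketch compiles without kernel headers. */
#define SQ_NEED_WAKEUP_BIT (1U << 0)

/* Hypothetical helper: call after the new SQ tail has been stored.
 * Returns true when the application should call io_uring_enter(2)
 * with IORING_ENTER_SQ_WAKEUP. */
static bool sq_thread_needs_wakeup(_Atomic unsigned *sq_flags)
{
    assert(sq_flags != NULL);
    /* Full fence: the tail store must be ordered before the flags load. */
    atomic_thread_fence(memory_order_seq_cst);
    return (atomic_load_explicit(sq_flags, memory_order_relaxed)
            & SQ_NEED_WAKEUP_BIT) != 0;
}
```

A caller would publish the tail, call this helper, and enter the kernel only when it returns true, so the common busy path makes no system call.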

To successfully use this feature, the application must register a set of files to be used for I/O through io_uring_register(2) using the IORING_REGISTER_FILES opcode. Failure to do so will result in submitted I/O failing with EBADF.

SQ_AFF = 1U << 2

IORING_SETUP_SQ_AFF

If this flag is specified, the poll thread will be bound to the CPU given in the sq_thread_cpu field of the struct io_uring_params. This flag is only meaningful when IORING_SETUP_SQPOLL is also specified.
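The dependency between the two flags can be sketched as a small check. The bit values are copied from this table; the names SETUP_SQPOLL, SETUP_SQ_AFF, sq_aff_has_effect and pinned_sqpoll_flags are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values copied from the table in this entry (hypothetical names). */
enum {
    SETUP_SQPOLL = 1U << 1,
    SETUP_SQ_AFF = 1U << 2
};

/* SQ_AFF only does something when SQPOLL is requested as well. */
static bool sq_aff_has_effect(unsigned flags)
{
    return (flags & SETUP_SQ_AFF) != 0 && (flags & SETUP_SQPOLL) != 0;
}

/* Flags for a poll thread pinned to the CPU named in sq_thread_cpu. */
static unsigned pinned_sqpoll_flags(void)
{
    unsigned flags = SETUP_SQPOLL | SETUP_SQ_AFF;
    assert(sq_aff_has_effect(flags));
    return flags;
}
```

An application would place the returned value in the flags field of struct io_uring_params and set sq_thread_cpu before calling io_uring_setup().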

CQSIZE = 1U << 3

IORING_SETUP_CQSIZE

Create the completion queue with struct io_uring_params.cq_entries entries. The value must be greater than entries (the submission queue size passed to io_uring_setup()), and may be rounded up to the next power of two.

Note: Available from Linux 5.5
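To anticipate the completion queue size that results from a given cq_entries request, the rounding can be reproduced in userspace. This is a rough sketch of "round up to the next power of two" only; roundup_pow2 is a hypothetical helper, not a kernel or liburing function, and it does not model any other validation the kernel performs.

```c
#include <assert.h>

/* Hypothetical helper: round n up to the next power of two.
 * A value that is already a power of two is returned unchanged. */
static unsigned roundup_pow2(unsigned n)
{
    /* Reject 0 and values whose rounded result would not fit in 32 bits. */
    assert(n > 0 && n <= (1U << 31));
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}
```

For example, a request for 100 completion entries would be expected to yield 128, while a request for 4096 stays at 4096. The size actually chosen is reported back in struct io_uring_params after setup.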

CLAMP = 1U << 4

IORING_SETUP_CLAMP

Some applications like to start with a small ring size and ramp up as needed. This is difficult to do because the maximum ring size is not advertised.

If IORING_SETUP_CLAMP is set and the requested SQ or CQ ring size exceeds the supported maximum, the size is clamped to the maximum instead of io_uring_setup() returning -EINVAL. Because the chosen ring sizes are returned after setup, no further changes are needed on the application side. io_uring already adjusts ring sizes in other cases, for example when the application does not ask for a power-of-two size.

Note: Available from Linux 5.6
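The difference between clamping and failing can be sketched as follows. This illustrates the behaviour described above and is not the kernel's implementation: pick_entries is a hypothetical helper, and the limit is passed in by the caller because the real maxima are kernel-internal and may vary between versions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: decide the ring size for one queue.
 * max_entries is an assumed limit supplied by the caller.
 * Returns 0 and stores the size in *out, or -EINVAL on rejection. */
static int pick_entries(unsigned requested, unsigned max_entries,
                        bool clamp, unsigned *out)
{
    assert(out != NULL);
    if (requested == 0)
        return -EINVAL;
    if (requested > max_entries) {
        if (!clamp)
            return -EINVAL;      /* without the flag: oversized request fails */
        requested = max_entries; /* with the flag: clamp to the maximum */
    }
    *out = requested;
    return 0;
}
```

With an assumed limit of 32768, a request for 100000 entries yields 32768 when clamp is true and -EINVAL when it is false, which is why an application that sets the flag should read the sizes back from struct io_uring_params rather than assume its request was honoured.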

ATTACH_WQ = 1U << 5

IORING_SETUP_ATTACH_WQ

If IORING_SETUP_ATTACH_WQ is set, wq_fd in io_uring_params must be a valid io_uring file descriptor; the io-wq of that instance will be shared with the newly created io_uring instance. If the flag is set but the io-wq cannot be shared, setup fails.

This allows creation of "sibling" io_urings, where we prefer to keep the SQ/CQ private, but want to share the async backend to minimize the amount of overhead associated with having multiple rings that belong to the same backend.

Note: Available from Linux 5.6
