SubmissionEntryFlags

sqe->flags

Values

Value    Meaning
NONE = 0
FIXED_FILE = 1U << 0

IOSQE_FIXED_FILE: use fixed fileset

When this flag is specified, fd is an index into the files array registered with the io_uring instance (see the IORING_REGISTER_FILES section of the io_uring_register(2) man page).

IO_DRAIN = 1U << 1

IOSQE_IO_DRAIN: issue after inflight IO

If a request is marked with IO_DRAIN, then previous commands must complete before this one is issued. Subsequent requests are not started until the drain has completed.

Note: available from Linux 5.2
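The ordering rule for IO_DRAIN can be pictured with a small model. This is only a sketch of the documented semantics, not kernel code: `demo_req` and `demo_may_issue` are hypothetical names, and the model assumes requests are stored in submission order.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_DRAIN_BIT (1U << 1)  /* same bit as IO_DRAIN above */

/* Hypothetical record of one submitted request. */
struct demo_req {
    unsigned flags;
    bool completed;
};

/* Returns true if request `i` may be issued under the drain rule. */
static bool demo_may_issue(const struct demo_req *reqs, size_t i)
{
    for (size_t k = 0; k < i; k++) {
        if (reqs[k].completed)
            continue;
        /* An unfinished earlier drain request holds back everything after it. */
        if (reqs[k].flags & DEMO_DRAIN_BIT)
            return false;
        /* A drain request waits for every earlier request to complete. */
        if (reqs[i].flags & DEMO_DRAIN_BIT)
            return false;
    }
    return true;
}
```

In this model, a drain request placed between two ordinary requests acts as a barrier: it waits for the first, and the third waits for it.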

ASYNC = 1U << 4

IOSQE_ASYNC

By default, io_uring attempts every submission inline whenever possible. For larger copies this can take a long time, even if the data is fully cached. Setting IOSQE_ASYNC on an SQE ensures that the request is always issued asynchronously instead.

Note: available from Linux 5.6

BUFFER_SELECT = 1U << 5

IOSQE_BUFFER_SELECT

A server process with a large number of pending socket connections generally uses epoll to wait for activity. When a socket is ready for reading (or writing), the task can select a buffer and issue a recv/send on the given fd.

With fast (non-async thread) support in io_uring, a task can instead keep a large number of reads or writes pending. Those requests need buffers to back their data, and if the number of connections is high enough, preallocating buffers for all possible connections is infeasible.

With IORING_OP_PROVIDE_BUFFERS, an application can register buffers to use for any request. The request then sets IOSQE_BUFFER_SELECT in the sqe, and a given group ID in sqe->buf_group. When the fd becomes ready, a free buffer from the specified group is selected. If none are available, the request is terminated with -ENOBUFS. If successful, the CQE on completion will contain the buffer ID chosen in the cqe->flags member, encoded as:

(buffer_id << IORING_CQE_BUFFER_SHIFT) | IORING_CQE_F_BUFFER;
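Reversing that encoding on the completion side might look like the following sketch. The `DEMO_` constants repeat the values of IORING_CQE_F_BUFFER and IORING_CQE_BUFFER_SHIFT from `<linux/io_uring.h>` so that the snippet compiles on its own; `demo_cqe_buffer_id` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as IORING_CQE_F_BUFFER and IORING_CQE_BUFFER_SHIFT. */
#define DEMO_CQE_F_BUFFER     (1U << 0)
#define DEMO_CQE_BUFFER_SHIFT 16

/* Extracts the selected buffer ID from cqe->flags.
   Returns false if the CQE does not carry a buffer ID. */
static bool demo_cqe_buffer_id(uint32_t cqe_flags, uint16_t *buffer_id)
{
    if (!(cqe_flags & DEMO_CQE_F_BUFFER))
        return false;
    *buffer_id = (uint16_t)(cqe_flags >> DEMO_CQE_BUFFER_SHIFT);
    return true;
}
```

Checking the flag bit first matters because a buffer ID of 0 is valid, so the shifted value alone cannot tell "buffer 0" apart from "no buffer".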

Once a buffer has been consumed by a request, it is no longer available and must be registered again with IORING_OP_PROVIDE_BUFFERS.

Requests need to support this feature. For now, IORING_OP_READ and IORING_OP_RECV support it. Support is checked at SQE submission; if the flag is set on an unsupported request, a CQE with res == -EOPNOTSUPP is posted.
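An application will typically want to tell the two buffer-selection failures apart when reaping completions. A minimal sketch, assuming `res` is the value read from cqe->res; the `demo_outcome` enum and `demo_classify` function are hypothetical names:

```c
#include <errno.h>

enum demo_outcome {
    DEMO_OK,           /* request succeeded; res is the byte count */
    DEMO_NO_BUFFERS,   /* no free buffer in the group (-ENOBUFS) */
    DEMO_UNSUPPORTED,  /* opcode cannot use buffer selection (-EOPNOTSUPP) */
    DEMO_OTHER_ERROR   /* any other negative errno */
};

/* Maps a cqe->res value to one of the outcomes above. */
static enum demo_outcome demo_classify(int res)
{
    if (res >= 0)
        return DEMO_OK;
    if (res == -ENOBUFS)
        return DEMO_NO_BUFFERS;
    if (res == -EOPNOTSUPP)
        return DEMO_UNSUPPORTED;
    return DEMO_OTHER_ERROR;
}
```

On DEMO_NO_BUFFERS, one reasonable reaction is to provide more buffers to the group and resubmit; DEMO_UNSUPPORTED usually points to a programming error rather than a transient condition.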

Note: available from Linux 5.7
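Because the values are distinct bits, several flags can be combined on one SQE. The sketch below mirrors the bit values listed above and sets two of them on a stand-in structure. `demo_sqe` and `demo_prep_select` are hypothetical and model only the fields mentioned in this section (fd, flags, buf_group); real code would fill in a `struct io_uring_sqe`.

```c
#include <stdint.h>

/* Bit values copied from the Values list; the DEMO_ names are illustrative. */
enum {
    DEMO_NONE          = 0,
    DEMO_FIXED_FILE    = 1U << 0,
    DEMO_IO_DRAIN      = 1U << 1,
    DEMO_ASYNC         = 1U << 4,
    DEMO_BUFFER_SELECT = 1U << 5
};

/* Hypothetical stand-in for the SQE fields this sketch touches. */
struct demo_sqe {
    int32_t  fd;         /* descriptor, or file index with FIXED_FILE */
    uint8_t  flags;      /* sqe->flags */
    uint16_t buf_group;  /* sqe->buf_group */
};

/* Treats `file_index` as an index into the registered files and asks
   for a buffer to be selected from `group` when the request runs. */
static void demo_prep_select(struct demo_sqe *sqe, int32_t file_index,
                             uint16_t group)
{
    sqe->fd = file_index;
    sqe->buf_group = group;
    sqe->flags = (uint8_t)(DEMO_FIXED_FILE | DEMO_BUFFER_SELECT);
}
```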
