# OpenBMC in-kernel MCTP

Author: Jeremy Kerr `<jk@codeconstruct.com.au>`

Please refer to the [MCTP Overview](mctp.md) document for general MCTP design
description, background and requirements.

This document describes a kernel-based implementation of MCTP infrastructure,
providing a sockets-based API for MCTP communication within an OpenBMC-based
platform.

# Requirements for a kernel implementation

- The MCTP messaging API should be an obvious application of the existing POSIX
  socket interface.

- Configuration should be simple for a straightforward MCTP endpoint: a single
  network with a single local endpoint id (EID).

- Infrastructure should be flexible enough to allow for more complex MCTP
  networks, allowing:

  - each MCTP network (as defined by section 3.2.31 of DSP0236) to consist of
    multiple local physical interfaces, and/or multiple EIDs;

  - multiple distinct (i.e., non-bridged) networks, possibly containing
    duplicated EIDs between networks;

  - multiple local EIDs on a single interface, and

  - customisable routing/bridging configurations within a network.

# Proposed Design

The design contains several components:

- An interface for userspace applications to send and receive MCTP messages: A
  mapping of the sockets API to MCTP usage

- Infrastructure for control and configuration of the MCTP network(s),
  consisting of a configuration utility, and a kernel messaging facility for
  this utility to use.

- Kernel drivers for physical interface bindings.

In general, the kernel components cover the transport functionality of MCTP,
such as message assembly/disassembly, packet forwarding, and physical interface
implementations.

Higher-level protocols (such as PLDM) are implemented in userspace, through the
introduced socket API. This also includes the majority of the MCTP Control
Protocol implementation (DSP0236, section 11) - MCTP endpoints will typically
have a specific process to request and respond to control protocol messages.
However, the kernel will include a small subset of control protocol code to
allow very simple endpoints, with static EID allocations, to run without this
process. MCTP endpoints that require more than just single-endpoint
functionality (bus owners, bridges, etc), and/or dynamic EID allocation, would
include the control message protocol process.

A new driver is introduced to handle each physical interface binding. These
drivers expose the appropriate `struct net_device` to handle transmission and
reception of MCTP packets on their associated hardware channels. Under Linux,
the namespace for these interfaces is separate from other network interfaces -
such as those for ethernet.

## Structure: interfaces & networks

The kernel models the local MCTP topology through two items: interfaces and
networks.

An interface (or "link") is an instance of an MCTP physical transport binding
(as defined by DSP0236, section 3.2.47), likely connected to a specific hardware
device. This is represented as a `struct net_device`, and has a user-visible
name and index (`ifindex`). Non-hardware-attached interfaces are permitted, to
allow local loopback and/or virtual interfaces.

A network defines a unique address space for MCTP endpoints by endpoint-ID
(described by DSP0236, section 3.2.31). A network has a user-visible identifier
to allow references from userspace. Route definitions are specific to one
network.

Each interface is associated with exactly one network. A network may be
associated with one or more interfaces.

If multiple networks are present, each may contain EIDs that are also present on
other networks.

## Sockets API

### Protocol definitions

We define a new address family (and corresponding protocol family) for MCTP:

```c
#define AF_MCTP /* TBD */
#define PF_MCTP AF_MCTP
```

MCTP sockets are created with the `socket()` syscall, specifying `AF_MCTP` as
the domain. Currently, only a `SOCK_DGRAM` socket type is defined.

```c
int sd = socket(AF_MCTP, SOCK_DGRAM, 0);
```

The only (current) value for the `protocol` argument is 0. Future protocol
implementations may be added later.

MCTP sockets opened with a protocol value of 0 will communicate directly at the
transport layer; message buffers received by the application will consist of
message data from reassembled MCTP packets, and will include the full message
including message type byte and optional message integrity check (IC).
Individual packet headers are not included; they may be accessible through a
future `SOCK_RAW` socket type.

As with all socket address families, source and destination addresses are
specified with a new `sockaddr` type:

```c
/* Defined first, since struct sockaddr_mctp embeds it by value */
struct mctp_addr {
    uint8_t             s_addr;
};

struct sockaddr_mctp {
    sa_family_t         smctp_family; /* = AF_MCTP */
    int                 smctp_network;
    struct mctp_addr    smctp_addr;
    uint8_t             smctp_type;
    uint8_t             smctp_tag;
};

/* MCTP network values */
#define MCTP_NET_ANY        0

/* MCTP EID values */
#define MCTP_ADDR_ANY       0xff
#define MCTP_ADDR_BCAST     0xff

/* MCTP type values. Only the least-significant 7 bits of
 * smctp_type are used for matching; the specification defines
 * the type to be 7 bits.
 */
#define MCTP_TYPE_MASK      0x7f

/* MCTP tag definitions; used for smctp_tag field of sockaddr_mctp */
/* MCTP-spec-defined fields */
#define MCTP_TAG_MASK       0x07
#define MCTP_TAG_OWNER      0x08
/* Others: reserved */

/* Helpers */
#define MCTP_TAG_RSP(x)     ((x) & MCTP_TAG_MASK) /* response to a request: clear TO, keep value */
```

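As a small illustration of how the tag constants above combine, the sketch below splits an `smctp_tag` value into its tag-owner bit and 3-bit tag value, and forms the tag for a reply. The constants are repeated from the definitions above so that the fragment compiles on its own; the helper names `mctp_tag_is_owner()`, `mctp_tag_value()` and `mctp_tag_make()` are hypothetical and are not part of the proposed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Repeated from the definitions above */
#define MCTP_TAG_MASK   0x07
#define MCTP_TAG_OWNER  0x08
#define MCTP_TAG_RSP(x) ((x) & MCTP_TAG_MASK)

/* Hypothetical helper: is the tag-owner (TO) bit set? */
static bool mctp_tag_is_owner(uint8_t tag)
{
    return (tag & MCTP_TAG_OWNER) != 0;
}

/* Hypothetical helper: extract the 3-bit tag value */
static uint8_t mctp_tag_value(uint8_t tag)
{
    return tag & MCTP_TAG_MASK;
}

/* Hypothetical helper: combine a TO flag and a 3-bit tag value */
static uint8_t mctp_tag_make(bool owner, uint8_t value)
{
    assert(value <= MCTP_TAG_MASK);
    return (uint8_t)((owner ? MCTP_TAG_OWNER : 0) | value);
}
```

A request carrying tag value 5 would therefore use `0x0d` in `smctp_tag`, and `MCTP_TAG_RSP()` maps that to `0x05` for the matching response.
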
### Syscall behaviour

The following sections describe the MCTP-specific behaviours of the standard
socket system calls. These behaviours have been chosen to map closely to the
existing sockets APIs.

#### `bind()`: set local socket address

Sockets that receive incoming request packets will bind to a local address,
using the `bind()` syscall.

```c
struct sockaddr_mctp addr;

addr.smctp_family = AF_MCTP;
addr.smctp_network = MCTP_NET_ANY;
addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
addr.smctp_type = MCTP_TYPE_PLDM;
addr.smctp_tag = MCTP_TAG_OWNER;

int rc = bind(sd, (struct sockaddr *)&addr, sizeof(addr));
```

This establishes the local address of the socket. Incoming MCTP messages that
match the network, address, and message type will be received by this socket.
The reference to 'incoming' is important here; a bound socket will only receive
messages with the TO bit set, to indicate an incoming request message, rather
than a response.

The `smctp_tag` value will configure the tags accepted from the remote side of
this socket. Given the above, the only valid value is `MCTP_TAG_OWNER`, which
will result in remotely "owned" tags being routed to this socket. Since
`MCTP_TAG_OWNER` is set, the 3 least-significant bits of `smctp_tag` are not
used; callers must set them to zero. See the
[Tag behaviour for transmitted messages](#tag-behaviour-for-transmitted-messages)
section for more details. If the `MCTP_TAG_OWNER` bit is not set, `bind()` will
fail with an errno of `EINVAL`.

A `smctp_network` value of `MCTP_NET_ANY` will configure the socket to receive
incoming packets from any locally-connected network. A specific network value
will cause the socket to only receive incoming messages from that network.

The `smctp_addr` field specifies a local address to bind to. A value of
`MCTP_ADDR_ANY` configures the socket to receive messages addressed to any local
destination EID.

The `smctp_type` field specifies which message types to receive. Only the lower
7 bits of the type are matched on incoming messages (i.e., the most-significant
IC bit is not part of the match). This results in the socket receiving packets
with and without a message integrity check footer.

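The matching rules for a bound socket can be summarised in a few lines of C. This is only a sketch of the behaviour described above, not the kernel's implementation; `struct mctp_bind` and `mctp_bind_matches()` are hypothetical names, and the constants are repeated from the protocol definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MCTP_NET_ANY    0
#define MCTP_ADDR_ANY   0xff
#define MCTP_TYPE_MASK  0x7f

/* Hypothetical record of the address a socket passed to bind() */
struct mctp_bind {
    int     net;
    uint8_t eid;
    uint8_t type;
};

/* Would an incoming message be delivered to this bound socket?
 * msg_type is the first byte of the message, including the IC bit;
 * to is the tag-owner bit from the packet header. */
static bool mctp_bind_matches(const struct mctp_bind *b, int net,
                              uint8_t dest_eid, uint8_t msg_type, bool to)
{
    assert(b != NULL);
    if (!to)
        return false; /* responses are not delivered via bind() */
    if (b->net != MCTP_NET_ANY && b->net != net)
        return false;
    if (b->eid != MCTP_ADDR_ANY && b->eid != dest_eid)
        return false;
    /* only the low 7 bits take part in the type match */
    return ((b->type ^ msg_type) & MCTP_TYPE_MASK) == 0;
}
```

With a socket bound to type `0x01`, messages whose first byte is `0x01` or `0x81` (the same type with the IC bit set) both match, while `0x02` does not.
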
#### `connect()`: set remote socket address

A socket's remote address may be set with the `connect()` syscall:

```c
struct sockaddr_mctp addr;
int rc;

addr.smctp_family = AF_MCTP;
addr.smctp_network = MCTP_NET_ANY;
addr.smctp_addr.s_addr = 8;
addr.smctp_tag = MCTP_TAG_OWNER;
addr.smctp_type = MCTP_TYPE_PLDM;

rc = connect(sd, (struct sockaddr *)&addr, sizeof(addr));
```

This establishes the remote address of a socket, used for future message
transmission. Like other `SOCK_DGRAM` behaviour, this does not generate any MCTP
traffic directly, but just sets the default destination for messages sent from
this socket.

The `smctp_network` field may specify a locally-attached network, or the value
`MCTP_NET_ANY`, in which case the kernel will select a suitable MCTP network.
This is guaranteed to work for single-network configurations, but may require
additional routing definitions for endpoints attached to multiple distinct
networks. See the [Addressing](#addressing) section for details.

The `smctp_addr` field specifies a remote EID. This may be `MCTP_ADDR_BCAST`,
the MCTP broadcast EID (0xff).

The `smctp_type` field specifies the type field of messages transferred over
this socket.

The `smctp_tag` value will configure the tag used for the local side of this
socket. The only valid value is `MCTP_TAG_OWNER`, which will result in an
"owned" tag being allocated for this socket. The tag will remain allocated for
all future outgoing messages, until either the socket is closed, or `connect()`
is called again. If a tag cannot be allocated, `connect()` will report an error,
with an errno value of `EAGAIN`. See the
[Tag behaviour for transmitted messages](#tag-behaviour-for-transmitted-messages)
section for more details. If the `MCTP_TAG_OWNER` bit is not set, `connect()`
will fail with an errno of `EINVAL`.

Requesters which connect to a single responder will typically use `connect()` to
specify the peer address and tag for future outgoing messages.

#### `sendto()`, `sendmsg()`, `send()` & `write()`: transmit an MCTP message

An MCTP message is transmitted using one of the `sendto()`, `sendmsg()`,
`send()` or `write()` syscalls. Using `sendto()` as the primary example:

```c
struct sockaddr_mctp addr;
char buf[14];
ssize_t len;

/* set message destination */
addr.smctp_family = AF_MCTP;
addr.smctp_network = MCTP_NET_ANY;
addr.smctp_addr.s_addr = 8;
addr.smctp_tag = MCTP_TAG_OWNER;
addr.smctp_type = MCTP_TYPE_ECHO;

/* arbitrary message to send, with message-type header */
buf[0] = MCTP_TYPE_ECHO;
memcpy(buf + 1, "hello, world!", sizeof(buf) - 1);

len = sendto(sd, buf, sizeof(buf), 0,
             (struct sockaddr *)&addr, sizeof(addr));
```

The address argument is treated the same way as for `connect()`: the network and
address fields define the remote address to send to. If `smctp_tag` has the
`MCTP_TAG_OWNER` bit set, the kernel will ignore any tag value bits (those
covered by `MCTP_TAG_MASK`), and generate a tag value suitable for the
destination EID. If `MCTP_TAG_OWNER` is not set, the message will be sent with
the tag value as specified. If a tag value cannot be allocated, the system call
will report an errno of `EAGAIN`.

The application must provide the message type byte as the first byte of the
message buffer passed to `sendto()`. If a message integrity check is to be
included in the transmitted message, it must also be provided in the message
buffer, and the most-significant bit of the message type byte must be 1.

If the first byte of the message does not match the message type value, then the
system call will return an error of `EPROTO`.

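The type-byte check described above might look like the following sketch. It assumes that the comparison ignores the IC bit, in the same way that `bind()` matching does, and that an empty buffer is also rejected with `EPROTO`; neither detail is stated explicitly above. `mctp_check_type_byte()` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MCTP_TYPE_MASK 0x7f

/* Returns 0 if the buffer's leading type byte agrees with smctp_type,
 * or -EPROTO otherwise. Assumption: the IC bit (0x80) is not compared. */
static int mctp_check_type_byte(const uint8_t *buf, size_t len,
                                uint8_t smctp_type)
{
    assert(buf != NULL);
    if (len < 1)
        return -EPROTO; /* assumed: no room for a type byte */
    if ((buf[0] ^ smctp_type) & MCTP_TYPE_MASK)
        return -EPROTO;
    return 0;
}
```
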
The `send()` and `write()` system calls behave in a similar way, but do not
specify a remote address. Therefore, `connect()` must be called beforehand; if
not, these calls will return an error of `EDESTADDRREQ` (Destination address
required).

Using `sendto()` or `sendmsg()` on a connected socket may override the remote
socket address specified in `connect()`. The `connect()` address and tag will
remain associated with the socket, for future unaddressed sends. The tag
allocated through a call to `sendto()` or `sendmsg()` on a connected socket is
subject to the same invalidation logic as on an unconnected socket: it is
expired either by timeout or by a subsequent `sendto()`.

The `sendmsg()` system call allows a more compact argument interface, and the
message buffer to be specified as a scatter-gather list. At present no ancillary
message types (used for the `msg_control` data passed to `sendmsg()`) are
defined.

Transmitting a message on an unconnected socket with `MCTP_TAG_OWNER` specified
will cause an allocation of a tag, if no valid tag is already allocated for that
destination. The (destination-eid,tag) tuple acts as an implicit local socket
address, to allow the socket to receive responses to this outgoing message. If
any previous allocation has been performed (for a different remote EID), that
allocation is lost. This tag behaviour can be controlled through the
`MCTP_TAG_CONTROL` socket option.

Sockets will only receive responses to requests they have sent (with TO=1) and
may only respond (with TO=0) to requests they have received.

#### `recvfrom()`, `recvmsg()`, `recv()` & `read()`: receive an MCTP message

An MCTP message can be received by an application using one of the `recvfrom()`,
`recvmsg()`, `recv()` or `read()` system calls. Using `recvfrom()` as the
primary example:

```c
struct sockaddr_mctp addr;
socklen_t addrlen;
char buf[14];
ssize_t len;

addrlen = sizeof(addr);

len = recvfrom(sd, buf, sizeof(buf), 0,
               (struct sockaddr *)&addr, &addrlen);

/* We can expect addr to describe an MCTP address */
assert(addrlen >= sizeof(addr));
assert(addr.smctp_family == AF_MCTP);

printf("received %zd bytes from remote EID %d\n", len, addr.smctp_addr.s_addr);
```

The address argument to `recvfrom()` and `recvmsg()` is populated with the
remote address of the incoming message, including tag value (this will be needed
in order to reply to the message).

The first byte of the message buffer will contain the message type byte. If an
integrity check follows the message, it will be included in the received buffer.

The `recv()` and `read()` system calls behave in a similar way, but do not
provide a remote address to the application. Therefore, these are only useful if
the remote address is already known, or the message does not require a reply.

As with the send calls, sockets will only receive responses to requests they
have sent (TO=1) and may only respond (TO=0) to requests they have received.

#### `getsockname()` & `getpeername()`: query local/remote socket address

The `getsockname()` system call returns the `struct sockaddr_mctp` value for the
local side of this socket, and `getpeername()` for the remote (i.e., that used
in a `connect()`). Since the tag value is a property of the remote address,
`getpeername()` may be used to retrieve a kernel-allocated tag value.

Calling `getpeername()` on an unconnected socket will result in an error of
`ENOTCONN`.

#### Socket options

The following socket options are defined for MCTP sockets:

##### `MCTP_ADDR_EXT`: Use extended addressing information in sendmsg/recvmsg

Enabling this socket option allows an application to specify extended addressing
information on transmitted packets, and access the same on received packets.

When the `MCTP_ADDR_EXT` socket option is enabled, the application may specify
an expanded `struct sockaddr` to the `recvfrom()` and `sendto()` system calls.
This is defined as:

```c
struct sockaddr_mctp_ext {
    /* fields exactly match struct sockaddr_mctp */
    sa_family_t         smctp_family; /* = AF_MCTP */
    int                 smctp_network;
    struct mctp_addr    smctp_addr;
    uint8_t             smctp_type;
    uint8_t             smctp_tag;
    /* extended addressing */
    int                 smctp_ifindex;
    uint8_t             smctp_halen;
    unsigned char       smctp_haddr[/* TBD */];
};
```

If the `addrlen` specified to `sendto()` or `recvfrom()` is sufficient to
contain this larger structure, then the extended addressing fields are consumed
/ populated respectively.

##### `MCTP_TAG_CONTROL`: manage outgoing tag allocation behaviour

The set/getsockopt argument is a `mctp_tagctl` structure:

```c
struct mctp_tagctl {
    bool            retain;
    struct timespec timeout;
};
```

This allows an application to control the behaviour of allocated tags for
non-connected sockets when transferring messages to multiple different
destinations (i.e., where a `struct sockaddr_mctp` is provided for individual
messages, and the `smctp_addr` destination for those sockets may vary across
calls).

The `retain` flag indicates to the kernel that the socket should not release tag
allocations when a message is sent to a new destination EID. This causes the
socket to continue to receive incoming messages to the old (dest,tag) tuple, in
addition to the new tuple.

The `timeout` value specifies a maximum amount of time to retain tag values.
This should be based on the reply timeout for any upper-level protocol.

The kernel may reject a request to set values that would cause excessive tag
allocation by this socket. The kernel may also reject subsequent tag-allocation
requests (through send or connect syscalls) which would cause excessive tags to
be consumed by the socket, even though the tag control settings were accepted in
the setsockopt operation.

Changing the default tag control behaviour should only be required when:

- the socket is sending messages with TO=1 (i.e., is a requester); and
- messages are sent to multiple different destination EIDs from the one socket.

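As a rough illustration, an application could populate the tag-control argument from an upper-layer reply timeout as sketched below. The helper `mctp_tagctl_from_ms()` is hypothetical, and because this document does not define the socket option level, the `setsockopt()` call itself is shown only in a comment with a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Repeated from the definition above */
struct mctp_tagctl {
    bool            retain;
    struct timespec timeout;
};

/* Hypothetical helper: build a tag-control value from a reply timeout
 * expressed in milliseconds. */
static struct mctp_tagctl mctp_tagctl_from_ms(bool retain, long timeout_ms)
{
    struct mctp_tagctl ctl = { 0 };

    assert(timeout_ms >= 0);
    ctl.retain = retain;
    ctl.timeout.tv_sec = timeout_ms / 1000;
    ctl.timeout.tv_nsec = (timeout_ms % 1000) * 1000000L;
    return ctl;
}

/* Possible usage; <level> is a placeholder, as the option level is
 * not defined in this document:
 *
 *   struct mctp_tagctl ctl = mctp_tagctl_from_ms(true, 2500);
 *   rc = setsockopt(sd, <level>, MCTP_TAG_CONTROL, &ctl, sizeof(ctl));
 */
```
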
#### Syscalls not implemented

The following system calls are not implemented for MCTP, primarily as they are
not used in `SOCK_DGRAM`-type sockets:

- `listen()`
- `accept()`
- `ioctl()`
- `shutdown()`
- `mmap()`

### Userspace examples

These examples cover three general use-cases:

- **requester**: sends requests to a particular (EID, type) target, and receives
  responses to those packets

  This is similar to a typical UDP client

- **responder**: receives all locally-addressed messages of a specific
  message-type, and responds to the requester immediately.

  This is similar to a typical UDP server

- **controller**: a specific service for a bus owner; may send broadcast
  messages, manage EID allocations, update local MCTP stack state. Will need
  low-level packet data.

  This is similar to a DHCP server.

#### Requester

"Client"-side implementation to send requests to a responder, and receive a
response. This uses a (fictitious) message type of `MCTP_TYPE_ECHO`.

```c
#include <assert.h>
#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

int main() {
    struct sockaddr_mctp addr;
    socklen_t addrlen;
    struct {
        uint8_t type;
        uint8_t data[14];
    } msg;
    int sd, rc;

    sd = socket(AF_MCTP, SOCK_DGRAM, 0);
    if (sd < 0)
        err(EXIT_FAILURE, "socket");

    addr.smctp_family = AF_MCTP;
    addr.smctp_network = MCTP_NET_ANY; /* any network */
    addr.smctp_addr.s_addr = 9;        /* remote eid 9 */
    addr.smctp_tag = MCTP_TAG_OWNER;   /* kernel will allocate an owned tag */
    addr.smctp_type = MCTP_TYPE_ECHO;  /* fictitious message type */
    addrlen = sizeof(addr);

    /* set message type and payload */
    msg.type = MCTP_TYPE_ECHO;
    strncpy((char *)msg.data, "hello, world!", sizeof(msg.data));

    /* send message */
    rc = sendto(sd, &msg, sizeof(msg), 0,
                (struct sockaddr *)&addr, addrlen);

    if (rc < 0)
        err(EXIT_FAILURE, "sendto");

    /* Receive reply. This will block until a reply arrives,
     * which may never happen. Actual code would need a timeout
     * here. */
    rc = recvfrom(sd, &msg, sizeof(msg), 0,
                  (struct sockaddr *)&addr, &addrlen);
    if (rc < 0)
        err(EXIT_FAILURE, "recvfrom");

    assert(msg.type == MCTP_TYPE_ECHO);
    /* ensure we're nul-terminated */
    msg.data[sizeof(msg.data)-1] = '\0';

    printf("reply: %s\n", (char *)msg.data);

    return EXIT_SUCCESS;
}
```

#### Responder

"Server"-side implementation to receive requests and respond. Like the client,
this uses a (fictitious) message type of `MCTP_TYPE_ECHO` in the
`struct sockaddr_mctp`; only messages matching this type will be received.

```c
#include <assert.h>
#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

int main() {
    struct sockaddr_mctp addr;
    socklen_t addrlen;
    int sd, rc;

    sd = socket(AF_MCTP, SOCK_DGRAM, 0);
    if (sd < 0)
        err(EXIT_FAILURE, "socket");

    addr.smctp_family = AF_MCTP;
    addr.smctp_network = MCTP_NET_ANY; /* any network */
    addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
    addr.smctp_type = MCTP_TYPE_ECHO;
    addr.smctp_tag = MCTP_TAG_OWNER;
    addrlen = sizeof(addr);

    rc = bind(sd, (struct sockaddr *)&addr, addrlen);
    if (rc)
        err(EXIT_FAILURE, "bind");

    for (;;) {
        struct {
            uint8_t type;
            uint8_t data[14];
        } msg;

        /* recvfrom() updates addrlen, so reset it for each call */
        addrlen = sizeof(addr);

        rc = recvfrom(sd, &msg, sizeof(msg), 0,
                      (struct sockaddr *)&addr, &addrlen);
        if (rc < 0)
            err(EXIT_FAILURE, "recvfrom");
        if (rc < 1) {
            warnx("not enough data for a message type");
            continue;
        }

        assert(addrlen == sizeof(addr));
        assert(msg.type == MCTP_TYPE_ECHO);

        printf("%d bytes from EID %d\n", rc, addr.smctp_addr.s_addr);

        /* Reply to requester; this macro just clears the TO-bit.
         * Other addr fields will describe the remote endpoint,
         * so use those as-is.
         */
        addr.smctp_tag = MCTP_TAG_RSP(addr.smctp_tag);

        rc = sendto(sd, &msg, rc, 0,
                    (struct sockaddr *)&addr, addrlen);
        if (rc < 0)
            err(EXIT_FAILURE, "sendto");
    }

    return EXIT_SUCCESS;
}
```

#### Broadcast request

Sends a request to a broadcast EID, and receives (unicast) replies. This is a
typical control protocol pattern.

```c
#include <assert.h>
#include <err.h>
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <time.h>

int main() {
    struct sockaddr_mctp txaddr, rxaddr;
    struct timespec start, cur;
    struct pollfd pollfds[1];
    socklen_t addrlen;
    uint8_t buf[2];
    int sd, rc, timeout;

    sd = socket(AF_MCTP, SOCK_DGRAM, 0);
    if (sd < 0)
        err(EXIT_FAILURE, "socket");

    /* destination address setup */
    txaddr.smctp_family = AF_MCTP;
    txaddr.smctp_network = 1; /* specific network required for broadcast */
    txaddr.smctp_addr.s_addr = MCTP_ADDR_BCAST; /* broadcast dest */
    txaddr.smctp_type = MCTP_TYPE_CONTROL;
    txaddr.smctp_tag = MCTP_TAG_OWNER;

    buf[0] = MCTP_TYPE_CONTROL;
    buf[1] = 'a';

    /* We're doing a sendto() to a broadcast address here. If we were
     * sending more than one broadcast message, we'd be better off
     * doing connect(); sendto();, in order to retain the tag
     * reservation across all transmitted messages. However, since this
     * is a single transmit, that makes no difference in this
     * particular case.
     */
    rc = sendto(sd, buf, 2, 0, (struct sockaddr *)&txaddr,
                sizeof(txaddr));
    if (rc < 0)
        err(EXIT_FAILURE, "sendto");

    /* Set up poll behaviour, and record our starting time for
     * reply timeouts */
    pollfds[0].fd = sd;
    pollfds[0].events = POLLIN;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        /* Calculate the amount of time left for replies.
         * calculate_timeout() is an application-provided helper. */
        clock_gettime(CLOCK_MONOTONIC, &cur);
        timeout = calculate_timeout(&start, &cur, 1000);

        rc = poll(pollfds, 1, timeout);
        if (rc < 0)
            err(EXIT_FAILURE, "poll");

        /* timeout receiving a reply? */
        if (rc == 0)
            break;

        /* sanity check that we have a message to receive */
        if (!(pollfds[0].revents & POLLIN))
            break;

        addrlen = sizeof(rxaddr);

        rc = recvfrom(sd, buf, sizeof(buf), 0, (struct sockaddr *)&rxaddr,
                      &addrlen);
        if (rc < 0)
            err(EXIT_FAILURE, "recvfrom");

        assert(addrlen >= sizeof(rxaddr));
        assert(rxaddr.smctp_family == AF_MCTP);

        printf("response from EID %d\n", rxaddr.smctp_addr.s_addr);
    }

    return EXIT_SUCCESS;
}
```

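The broadcast example calls `calculate_timeout()` without showing it. One possible implementation is sketched here, assuming the third argument is the total reply window in milliseconds and that the return value is the number of milliseconds remaining, in a form suitable for passing to `poll()`.

```c
#include <assert.h>
#include <time.h>

/* Milliseconds remaining of a total_ms budget that began at *start,
 * given the current time *cur. Never negative, so the result can be
 * passed directly to poll(). */
static int calculate_timeout(const struct timespec *start,
                             const struct timespec *cur, int total_ms)
{
    long long elapsed_ms;

    assert(start && cur && total_ms >= 0);
    elapsed_ms = (long long)(cur->tv_sec - start->tv_sec) * 1000
               + (cur->tv_nsec - start->tv_nsec) / 1000000;
    if (elapsed_ms >= total_ms)
        return 0;
    if (elapsed_ms < 0)
        elapsed_ms = 0; /* clock did not advance; use the full budget */
    return total_ms - (int)elapsed_ms;
}
```

Returning 0 once the window has closed makes `poll()` return immediately, so the receive loop drains any queued replies and then exits through its timeout branch.
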
### Implementation notes

#### Addressing

Transmitted messages (through `sendto()` and related system calls) specify their
destination via the `smctp_network` and `smctp_addr` fields of a
`struct sockaddr_mctp`.

The `smctp_addr` field maps directly to the destination endpoint's EID.

The `smctp_network` field specifies a locally defined network identifier. To
simplify situations where there is only one network defined, the special value
`MCTP_NET_ANY` is allowed. This will allow the kernel to select a specific
network for transmission.

This selection is entirely user-configured; one specific network may be defined
as the system default, in which case it will be used for all message
transmission where `MCTP_NET_ANY` is used as the destination network.

In particular, the destination EID is never used to select a destination
network.

MCTP responders should use the EID and network values of an incoming request to
specify the destination for any responses.

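The network selection rule can be sketched as a small function. This is illustrative only: `mctp_select_net()` is a hypothetical name, network identifiers are assumed to be non-negative, and the error returned when no default network is configured (`ENETUNREACH` here) is an assumption rather than something this document specifies.

```c
#include <assert.h>
#include <errno.h>

#define MCTP_NET_ANY 0

/* Hypothetical: resolve the smctp_network of an outgoing message.
 * default_net is the user-configured system default network, or
 * MCTP_NET_ANY if none has been configured. The destination EID is
 * deliberately not an input to this decision. */
static int mctp_select_net(int requested_net, int default_net)
{
    assert(requested_net >= 0 && default_net >= 0);
    if (requested_net != MCTP_NET_ANY)
        return requested_net;   /* caller named a network explicitly */
    if (default_net != MCTP_NET_ANY)
        return default_net;     /* fall back to the configured default */
    return -ENETUNREACH;        /* assumed error value */
}
```
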
#### Bridging/routing

The network and interface structure allows multiple interfaces to share a common
network. By default, packets are not forwarded between interfaces.

A network can be configured for "forwarding" mode. In this mode, packets may be
forwarded if their destination EID is non-local, and matches a route for another
interface on the same network.

As per DSP0236, packet reassembly does not occur during the forwarding process.
If the packet is larger than the MTU for the destination interface/route, then
the packet is dropped.

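The forwarding rules above amount to a short decision procedure, sketched below. The names `enum mctp_fwd`, `struct mctp_route_info` and `mctp_forward_decision()` are hypothetical, and the sketch assumes that a non-local packet which cannot be forwarded is simply dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mctp_fwd { MCTP_FWD_LOCAL, MCTP_FWD_FORWARD, MCTP_FWD_DROP };

/* Hypothetical summary of a route lookup on the packet's own network */
struct mctp_route_info {
    bool   dest_is_local; /* destination EID is one of our own */
    bool   have_route;    /* a route exists via another interface */
    size_t mtu;           /* MTU of that route/interface */
};

static enum mctp_fwd mctp_forward_decision(bool net_forwarding,
                                           const struct mctp_route_info *rt,
                                           size_t pkt_len)
{
    assert(rt != NULL);
    if (rt->dest_is_local)
        return MCTP_FWD_LOCAL;
    if (!net_forwarding || !rt->have_route)
        return MCTP_FWD_DROP;
    /* no reassembly or re-fragmentation while forwarding */
    if (pkt_len > rt->mtu)
        return MCTP_FWD_DROP;
    return MCTP_FWD_FORWARD;
}
```
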
#### Tag behaviour for transmitted messages
|
||
|
|
|
||
|
|
On every message sent with the tag-owner bit set ("TO" in DSP0236), the kernel
|
||
|
|
must allocate a tag that will uniquely identify responses over a (destination
|
||
|
|
EID, source EID, tag-owner, tag) tuple. The tag value is 3 bits in size.
|
||
|
|
|
||
|
|
To allow this, a `sendto()` with the `MCTP_TAG_OWNER` bit set in the `smctp_tag`
|
||
|
|
field will cause the kernel to allocate a unique tag for subsequent replies from
|
||
|
|
that specific remote EID.
|
||
|
|
|
||
|
|
This allocation will expire when any of the following occur:
|
||
|
|
|
||
|
|
- the socket is closed
|
||
|
|
- a new message is sent to a new destination EID
|
||
|
|
- an implementation-defined timeout expires
|
||
|
|
|
||
|
|
Because the "tag space" is limited, it may not be possible for the kernel to
|
||
|
|
allocate a unique tag for the outgoing message. In this case, the `sendto()`
|
||
|
|
call will fail with errno `EAGAIN`. This is analogous to the UDP behaviour when
|
||
|
|
a local port cannot be allocated for an outgoing message.
|
||
|
|
|
||
|
|
The implementation-defined timeout value shall be chosen to reasonably cover
standard reply timeouts. If necessary, this timeout may be modified through the
`MCTP_TAG_CONTROL` socket option.

Applications that expect to perform an ongoing message exchange with a
particular destination address may use the `connect()` call to set a persistent
remote address. In this case, the tag will be allocated during `connect()`, and
remain reserved for this socket until any of the following occur:

- the socket is closed
- the remote address is changed through another call to `connect()`.

In particular, calling `sendto()` with a different address does not release the
tag reservation.

Broadcast messages are particularly onerous for tag reservations. When a message
is transmitted with TO=1 and dest=0xff (the broadcast EID), the kernel must
reserve the tag across the entire range of possible EIDs. Therefore, a
particular tag value must be currently unused across all EIDs to allow a
`sendto()` to a broadcast address. Additionally, this reservation is not cleared
when a reply is received, as there may be multiple replies to a broadcast.

For this reason, applications wanting to send to the broadcast address should
use the `connect()` system call to reserve a tag, and guarantee its availability
for future message transmission. Note that this will remove the tag value from
use with _any other EID_. Sending to the broadcast address should be avoided; we
expect few applications will need this functionality.

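To illustrate why broadcast reservations are costly, the following sketch
reserves a tag only if it is unused for every EID, and then marks it as in use
everywhere. It is a simplified model rather than the kernel's data structure:
`broadcast_tag_reserve()` and the per-EID bitmask array are hypothetical, and a
single local source EID is assumed.

```c
#include <errno.h>
#include <stdint.h>

#define MCTP_TAG_COUNT 8   /* 3-bit tag field: values 0..7 */
#define MCTP_EID_COUNT 256 /* EIDs are 8 bits wide */

/* in_use[eid] is a bitmask of the tags currently reserved towards that EID.
 * Returns the reserved tag (0..7), or -EAGAIN if no tag value is free
 * across all EIDs. */
int broadcast_tag_reserve(uint8_t in_use[MCTP_EID_COUNT])
{
    uint8_t any = 0;

    /* Collect every tag value that is reserved for at least one EID. */
    for (int eid = 0; eid < MCTP_EID_COUNT; eid++)
        any |= in_use[eid];

    for (int tag = 0; tag < MCTP_TAG_COUNT; tag++) {
        uint8_t bit = (uint8_t)(1u << tag);

        if (any & bit)
            continue;

        /* The tag is free everywhere: reserve it for all EIDs. */
        for (int eid = 0; eid < MCTP_EID_COUNT; eid++)
            in_use[eid] |= bit;
        return tag;
    }
    return -EAGAIN;
}
```

A single endpoint holding all eight tags is enough to make the broadcast
reservation fail in this model, which is why reserving early via `connect()` is
the suggested approach.
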
#### MCTP Control Protocol implementation

Aside from the Resolve Endpoint ID message, the MCTP control protocol
implementation would exist as a userspace process, `mctpd`. This process is
responsible for responding to incoming control protocol messages, performing any
dynamic EID allocations (for bus owner devices) and maintaining the MCTP route
table (for bridging devices).

This process would create a socket bound to the type `MCTP_TYPE_CONTROL`, with
the `MCTP_ADDR_EXT` socket option enabled in order to access physical addressing
data on incoming control protocol requests. It would interact with the kernel's
route table via a netlink interface - the same as that implemented for the
[Utility and configuration interfaces](#utility-and-configuration-interfaces).

### Neighbour and routing implementation

The packet-transmission behaviour of the MCTP infrastructure relies on a single
routing table to look up both route and neighbour information. Entries in this
table are of the format:

| EID range | interface | physical address | metric | MTU | flags | expiry |
| --------- | --------- | ---------------- | ------ | --- | ----- | ------ |

This table can be updated from two sources:

- From userspace, via a netlink interface (see the
  [Utility and configuration interfaces](#utility-and-configuration-interfaces)
  section).

- Directly within the kernel, when basic neighbour information is discovered.
  Kernel-originated routes are marked as such in the flags field, and have a
  maximum validity age, indicated by the expiry field.

Kernel-discovered routing information can originate from two sources:

- physical-to-EID mappings discovered through received packets

- explicit endpoint physical-address resolution requests

When a packet is to be transmitted to an EID that does not have an entry in the
routing table, the kernel may attempt to resolve the physical address of that
endpoint using the Resolve Endpoint ID command of the MCTP Control Protocol
(section 12.9 of DSP0236). The response message will be used to add a
kernel-originated route into the routing table.

This is the only kernel-internal usage of MCTP Control Protocol messages.

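As an illustration of the combined route and neighbour table, the sketch below
models one entry per table row and a lookup by destination EID. The struct
layout, the `ROUTE_F_KERNEL` flag value and the `route_lookup()` helper are all
hypothetical. The rule that the lowest metric wins among matching entries is
also an assumption, since this document does not define how the metric is
applied.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define ROUTE_F_KERNEL 0x01 /* hypothetical flag: kernel-originated entry */

/* One row of the table: EID range, interface, physical address, metric,
 * MTU, flags and expiry. */
struct mctp_route_entry {
    uint8_t eid_min, eid_max; /* inclusive EID range */
    int ifindex;              /* outgoing interface */
    uint8_t hwaddr[8];        /* physical address (size is an assumption) */
    uint8_t hwaddr_len;
    unsigned int metric;
    unsigned int mtu;
    unsigned int flags;
    time_t expiry;            /* only checked for kernel-originated entries */
};

/* Returns the best unexpired entry covering eid, or NULL if none exists.
 * A NULL result is the point where the kernel may attempt resolution. */
const struct mctp_route_entry *
route_lookup(const struct mctp_route_entry *tbl, size_t n, uint8_t eid,
             time_t now)
{
    const struct mctp_route_entry *best = NULL;

    for (size_t i = 0; i < n; i++) {
        const struct mctp_route_entry *r = &tbl[i];

        if (eid < r->eid_min || eid > r->eid_max)
            continue;
        /* Kernel-originated entries have a maximum validity age. */
        if ((r->flags & ROUTE_F_KERNEL) && r->expiry <= now)
            continue;
        if (!best || r->metric < best->metric)
            best = r;
    }
    return best;
}
```

Entries added from userspace carry no `ROUTE_F_KERNEL` flag in this model, so
they never expire; kernel-discovered entries drop out of the lookup once `now`
reaches their expiry time.
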
## Utility and configuration interfaces

A small utility will be developed to control the state of the kernel MCTP stack.
This will be similar in design to the 'iproute2' tools, which perform a similar
function for the IPv4 and IPv6 protocols.

The utility will be invoked as `mctp`, and provide subcommands for managing
different aspects of the kernel stack.

### `mctp link`: manage interfaces

```sh
mctp link set <link> <up|down>
mctp link set <link> network <network-id>
mctp link set <link> mtu <mtu>
mctp link set <link> bus-owner <hwaddr>
```

### `mctp network`: manage networks

```sh
mctp network create <network-id>
mctp network set <network-id> forwarding <on|off>
mctp network set <network-id> default [<true|false>]
mctp network delete <network-id>
```

### `mctp address`: manage local EID assignments

```sh
mctp address add <eid> dev <link>
mctp address del <eid> dev <link>
```

### `mctp route`: manage routing tables

```sh
mctp route add net <network-id> eid <eid|eid-range> via <link> [hwaddr <addr>] [mtu <mtu>] [metric <metric>]
mctp route del net <network-id> eid <eid|eid-range> via <link> [hwaddr <addr>] [mtu <mtu>] [metric <metric>]
mctp route show [net <network-id>]
```

### `mctp stat`: query socket status

```sh
mctp stat
```

A set of netlink message formats will be defined to support these control
functions.

# Design points & alternatives considered

## Including message-type byte in send/receive buffers

This design specifies that message buffers passed to the kernel in send syscalls
and from the kernel in receive syscalls will have the message type byte as the
first byte of the buffer. This corresponds to the definition of an MCTP message
payload in DSP0236.

This somewhat duplicates the type data provided in `struct sockaddr_mctp`; it's
superficially possible for the kernel to prepend this byte on send, and remove
it on receive.

However, the exact format of the MCTP message payload is not precisely defined
by the specification. In particular, any message integrity check data (which
would also need to be appended / stripped in conjunction with the type byte) is
defined by the specification for each message type, not by DSP0236. The kernel
would need knowledge of all protocols in order to correctly deconstruct the
payload data.

Therefore, we transfer the message payload as-is to userspace, without any
modification by the kernel.

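From the application's side, this buffer layout means the type byte is written
and read by userspace. The helpers below are a small sketch of that convention;
`mctp_msg_build()` and `mctp_msg_type()` are hypothetical names, not part of any
proposed API. The mask in `mctp_msg_type()` reflects the DSP0236 layout in which
the top bit of the first byte is the integrity check (IC) flag and the low seven
bits are the message type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes the type byte followed by the type-specific body into out.
 * Returns the total message length, or 0 if out is too small. */
size_t mctp_msg_build(uint8_t *out, size_t out_len, uint8_t type_byte,
                      const uint8_t *body, size_t body_len)
{
    if (out_len == 0 || body_len > out_len - 1)
        return 0;

    out[0] = type_byte; /* first byte of the buffer, as passed to send */
    if (body_len > 0)
        memcpy(out + 1, body, body_len);
    return body_len + 1;
}

/* Extracts the 7-bit message type from a received buffer. */
uint8_t mctp_msg_type(const uint8_t *msg)
{
    return msg[0] & 0x7f;
}
```

Any integrity check data stays in the body exactly as the type-specific
protocol defines it; neither helper attempts to interpret it.
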
## MCTP message-type specification: using `sockaddr_mctp.smctp_type` rather than protocol

This design specifies message-types to be passed in the `smctp_type` field of
`struct sockaddr_mctp`. An alternative would be to pass it in the `protocol`
argument of the `socket()` system call:

```c
int socket(int domain /* = AF_MCTP */, int type /* = SOCK_DGRAM */, int protocol);
```

The `smctp_type` implementation was chosen as it better matches the "addressing"
model of the message type; sockets are bound to an incoming message type,
similar to the IP protocol's model of binding UDP sockets to a local port
number.

There is no kernel behaviour that depends on the specific type (particularly
given the design choice above), so the protocol argument is not a good fit
here.

Future additions that perform protocol-specific message handling, and so alter
the send/receive buffer format, may use a new protocol argument.

## Networks referenced by index rather than UUID

This design proposes referencing networks by an integer index. The MCTP standard
does optionally associate an RFC4122 UUID with a network; it would be possible
to use this UUID where we pass a network identifier.

This approach does not incorporate knowledge of network UUIDs in the kernel.
Given that the Get Network ID message in the MCTP Control Protocol is
implemented entirely in userspace, the kernel does not need to be aware of
network UUIDs, and requiring UUIDs for network references (for example, the
`smctp_network` field of `struct sockaddr_mctp`, as type `uuid_t`) would
complicate assignment.

Instead, an integer index is used, in a similar fashion to the integer index
used to reference `struct net_device`s elsewhere in the network stack.

## Tag behaviour alternatives

We considered _several_ different designs for the tag handling behaviour. A
brief overview of the more-feasible of those, and why they were rejected:

### Each socket is allocated a unique tag value on creation

We could allocate a tag for each socket on creation, and use that value when a
tag is required. This, however:

- needlessly consumes a tag on non-tag-owning sockets (ie, those which send with
  TO=0 - responders); and

- limits us to 8 sockets per network.

### Tags only used for message packetisation / reassembly

An alternative would be to completely dissociate tag allocation from sockets,
and only allocate a tag for the (short-lived) task of packetising a message and
sending those packets. Tags would be released when the last packet has been
sent.

However, this removes any facility to correlate responses with the correct
socket, which is the purpose of the TO bit in DSP0236. In order for the sending
application to receive the response, we would either need to:

- limit the system to one socket of each message type (which, for example,
  precludes running a requester and a responder of the same type); or

- forward all incoming messages of a specific message-type to all sockets
  listening on that type, making it trivial to eavesdrop on MCTP data of other
  applications.

### Allocate a tag for one request/response pair

Another alternative would be to allocate a tag on each outgoing TO=1 message,
and then release that allocation after the incoming response to that tag (TO=0)
is observed.

However, MCTP protocols exist that do not have a 1:1 mapping of responses to
requests - more than one response may be valid for a given request message. For
example, in response to a request, an NVMe-MI implementation may send an
in-progress reply before the final reply. In this case, we would release the tag
after the first response is received, and then have no way to correlate the
second message with the socket.

Similarly, broadcast MCTP request messages may have multiple replies from
multiple endpoints, meaning we cannot release the tag allocation on the first
reply.