openbmc_docs/designs/mctp/mctp-userspace.md
2024-12-23 14:53:31 +08:00

# OpenBMC platform communication channel: MCTP & PLDM in userspace

Author: Jeremy Kerr <jk@ozlabs.org> <jk>

Please refer to the [MCTP Overview](mctp.md) document for general MCTP design
description, background and requirements.

This document describes a userspace implementation of MCTP infrastructure,
allowing a straightforward mechanism of supporting MCTP messaging within an
OpenBMC system.
## Proposed Design
The MCTP core specification just provides the packetisation, routing and
addressing mechanisms. The actual transmit/receive of those packets is up to the
hardware binding of the MCTP transport.

For OpenBMC, we would introduce an MCTP daemon, which implements the transport
over a configurable hardware channel (e.g., Serial UART, I2C or PCIe), and
provides a socket-based interface for other processes to send and receive
complete MCTP messages. This daemon is responsible for the packetisation and
routing of MCTP messages from external endpoints, and for handling the
forwarding of these messages to and from individual handler applications. This
includes handling local MCTP-stack configuration, like local EID assignments.

This daemon has a few components:

1. the core MCTP stack
2. one or more binding implementations (e.g., MCTP-over-serial), which interact
   with the hardware channel(s).
3. an interface to handler applications over a unix-domain socket.

The proposed implementation here is to produce an MCTP "library" which provides
the packetisation and routing functions, between:

- an "upper" messaging transmit/receive interface, for tx/rx of a full message
  to a specific endpoint (i.e., (1) above)
- a "lower" hardware binding for transmit/receive of individual packets,
  providing a method for the core to tx/rx each packet to hardware, and
  defining the parameters of the common packetisation code (i.e., (2) above).

The lower interface would be plugged in to one of a number of hardware-specific
binding implementations. Most of these would be included in the library source
tree, but others can be plugged in too, perhaps where the physical layer
implementation does not make sense to include in the platform-agnostic library.
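
As a rough illustration of the upper/lower split described above, the following
C sketch shows a core that accepts a whole message and hands individual packets
to a pluggable binding. All identifiers here (`sketch_binding`,
`sketch_message_tx`, `sketch_counter_tx`) are hypothetical and are not the
library's API. The four-byte packet header layout follows the MCTP base
specification; the tag-owner bit is left clear for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: a sketch, not the library's real API. */
#define SKETCH_HDR_SIZE      4    /* MCTP transport header is four bytes */
#define SKETCH_PAYLOAD_LIMIT 256  /* arbitrary cap chosen for this sketch */
#define SKETCH_FLAG_SOM      0x80 /* start-of-message bit */
#define SKETCH_FLAG_EOM      0x40 /* end-of-message bit */

/* "Lower" interface: one binding per hardware channel. */
struct sketch_binding {
    size_t payload_max; /* packetisation parameter set by the binding */
    int (*tx)(struct sketch_binding *b, const uint8_t *pkt, size_t len);
    void *priv;         /* binding-specific state */
};

/* "Upper" interface: send one complete message; the core packetises it. */
int sketch_message_tx(struct sketch_binding *b, uint8_t src, uint8_t dest,
                      uint8_t tag, const uint8_t *msg, size_t len)
{
    uint8_t pkt[SKETCH_HDR_SIZE + SKETCH_PAYLOAD_LIMIT];
    size_t off = 0;
    uint8_t seq = 0;

    if (len == 0 || b->payload_max == 0 ||
        b->payload_max > SKETCH_PAYLOAD_LIMIT)
        return -1;

    while (off < len) {
        size_t n = len - off;
        uint8_t flags;

        if (n > b->payload_max)
            n = b->payload_max;

        /* packet sequence in bits 5:4, message tag in bits 2:0 */
        flags = (uint8_t)(((seq & 0x3) << 4) | (tag & 0x7));
        if (off == 0)
            flags |= SKETCH_FLAG_SOM;
        if (off + n == len)
            flags |= SKETCH_FLAG_EOM;

        pkt[0] = 0x01; /* header version */
        pkt[1] = dest; /* destination EID */
        pkt[2] = src;  /* source EID */
        pkt[3] = flags;
        memcpy(pkt + SKETCH_HDR_SIZE, msg + off, n);

        if (b->tx(b, pkt, SKETCH_HDR_SIZE + n) < 0)
            return -1;

        off += n;
        seq++;
    }
    return 0;
}

/* Example binding that only counts traffic instead of touching hardware. */
struct sketch_counter {
    unsigned int packets;
    size_t payload_bytes;
    uint8_t last_flags;
};

static int sketch_counter_tx(struct sketch_binding *b, const uint8_t *pkt,
                             size_t len)
{
    struct sketch_counter *c = b->priv;

    c->packets++;
    c->payload_bytes += len - SKETCH_HDR_SIZE;
    c->last_flags = pkt[3];
    return 0;
}
```

A real binding would replace `sketch_counter_tx` with code that writes the
packet to its hardware channel; the core code above would not need to change.
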

The reason for a library is to allow the same MCTP implementation to be used in
both OpenBMC and host firmware; the library should be bidirectional. To allow
this, the library would be written in portable C (structured in a way that can
be compiled as `extern "C"` in C++ codebases), and be configurable to suit those
runtime environments (for example, POSIX IO may not be available on all
platforms; we should be able to compile the library to suit). The licence for
the library should also allow this re-use; a dual Apache & GPLv2+ licence may be
best.

These "lower" binding implementations may have very different methods of
transferring packets to the physical layer. For example, a serial binding
implementation for running on a Linux environment may be implemented through
read()/write() syscalls to a PTY device. An I2C binding for use in low-level
host firmware environments may interact directly with hardware registers to
perform packet transfers.
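
For the Linux serial case, the transmit path of such a binding could be as
small as the sketch below, which pushes one already-framed packet to a file
descriptor with write(). The function name `sketch_fd_tx` is hypothetical, and
the serial framing itself (framing bytes, escaping, checksum) is deliberately
left out; this only illustrates the syscall-based transfer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one packet to fd, retrying on short writes
 * and on EINTR. Returns 0 once all bytes are written, -1 on error.
 */
int sketch_fd_tx(int fd, const uint8_t *pkt, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, pkt + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before any data: try again */
            return -1;
        }
        if (n == 0)
            return -1;    /* no progress; avoid spinning forever */

        done += (size_t)n;
    }
    return 0;
}
```

A firmware I2C binding would provide the same kind of entry point, but its body
would program hardware registers rather than call into an operating system.
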

The application-specific handlers implement the actual functionality provided
over the MCTP channel, and connect to the central daemon over a UNIX domain
socket. Each of these would register with the MCTP daemon to receive MCTP
messages of a certain type, and would transmit MCTP messages of that same type.

The daemon's sockets to these handlers are configured for non-blocking IO, to
allow the daemon to be decoupled from any blocking behaviour of handlers. The
daemon would use a message queue to enable message reception/transmission to a
blocked handler, but this would be of a limited size. Handlers whose sockets
exceed this queue would be disconnected from the daemon.
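
The queueing policy above could be sketched as follows. This models only the
decision logic, not the socket IO: `socket_writable` stands in for whatever the
daemon learns from poll() or an `EAGAIN` result, and the names and the queue
limit of 4 are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_MAX 4 /* assumed limit; the text only says "limited" */

enum sketch_result { SKETCH_SENT, SKETCH_QUEUED, SKETCH_DROPPED };

/* Hypothetical per-handler state kept by the daemon. */
struct sketch_client {
    size_t pending; /* messages queued while the socket was not writable */
    bool connected;
};

/* Decide what happens to one message destined for this handler. */
enum sketch_result sketch_deliver(struct sketch_client *c,
                                  bool socket_writable)
{
    if (!c->connected)
        return SKETCH_DROPPED;

    if (socket_writable) {
        /* A real daemon would first send the queued messages, in order. */
        c->pending = 0;
        return SKETCH_SENT;
    }

    if (c->pending >= SKETCH_QUEUE_MAX) {
        /* Queue exhausted: disconnect the handler, never block the daemon. */
        c->connected = false;
        c->pending = 0;
        return SKETCH_DROPPED;
    }

    c->pending++;
    return SKETCH_QUEUED;
}
```

The point of the design is that a slow or stuck handler costs the daemon at
most a bounded amount of memory, and never stalls delivery to other handlers.
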

One design intention of the multiplexer daemon is to allow a future kernel-based
MCTP implementation without requiring major structural changes to handler
applications. The socket-based interface facilitates this, as the unix-domain
socket interface could be fairly easily swapped out with a new kernel-based
socket type.

MCTP is intended to be an optional component of OpenBMC. Platforms using OpenBMC
are free to adopt it as they see fit.
### Demultiplexer daemon interface
MCTP handlers (i.e., clients of the demultiplexer) connect using a unix-domain
socket, at the abstract socket address:

```
\0mctp-demux
```

The socket type used should be `SOCK_SEQPACKET`.

Once connected, the client sends a single-byte message, indicating what type of
MCTP messages should be forwarded to the client. Types must be greater than
zero.

Subsequent messages sent over the socket are MCTP messages sent/received by the
demultiplexer, that match the specified MCTP message type. Clients should use
the send/recv syscalls to interact with the socket.

Each message has a fixed small header:

```
uint8_t eid
```

For messages coming from the demux daemon, this indicates the source EID of the
received MCTP message. For messages going to the demux daemon, this indicates
the destination EID.

The rest of the message data is the complete MCTP message, including MCTP
message type field.

The daemon does not provide a facility for clients to specify or retrieve values
for the tag field in individual MCTP packets.
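
A minimal client for the interface described above might look like the
following sketch. It assumes a Linux system (abstract socket addresses are
Linux-specific); the function names and the `SKETCH_MSG_MAX` buffer limit are
hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#define SKETCH_MSG_MAX 4096 /* arbitrary limit chosen for this sketch */

/* Abstract socket name from the text: a leading NUL, then "mctp-demux". */
static const char sketch_sockname[] = "\0mctp-demux";

/* Fill in the abstract address; returns the length to pass to connect(). */
socklen_t sketch_demux_addr(struct sockaddr_un *addr)
{
    size_t namelen = sizeof(sketch_sockname) - 1; /* drop the C terminator */

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, sketch_sockname, namelen);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + namelen);
}

/* Connect and register for one MCTP message type. Returns an fd or -1. */
int sketch_demux_connect(uint8_t msg_type)
{
    struct sockaddr_un addr;
    socklen_t len = sketch_demux_addr(&addr);
    int fd;

    if (msg_type == 0) /* types must be greater than zero */
        return -1;

    fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
    if (fd < 0)
        return -1;

    if (connect(fd, (struct sockaddr *)&addr, len) < 0 ||
        send(fd, &msg_type, 1, 0) != 1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one MCTP message: a destination EID byte, then the message. */
ssize_t sketch_demux_send(int fd, uint8_t dest_eid, const uint8_t *msg,
                          size_t len)
{
    uint8_t buf[1 + SKETCH_MSG_MAX];

    if (len > SKETCH_MSG_MAX)
        return -1;
    buf[0] = dest_eid;
    memcpy(buf + 1, msg, len);
    return send(fd, buf, 1 + len, 0);
}

/* Receive one MCTP message; stores the source EID, returns message length. */
ssize_t sketch_demux_recv(int fd, uint8_t *src_eid, uint8_t *msg, size_t max)
{
    uint8_t buf[1 + SKETCH_MSG_MAX];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);

    if (n < 1 || (size_t)(n - 1) > max)
        return -1;
    *src_eid = buf[0];
    memcpy(msg, buf + 1, (size_t)(n - 1));
    return n - 1;
}
```

A PLDM handler, for example, might call `sketch_demux_connect(1)`, since PLDM
uses MCTP message type 1, and then loop on `sketch_demux_recv()`. Because the
socket is `SOCK_SEQPACKET`, each send or recv call carries exactly one message,
so no length field is needed in the header.
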
## Alternatives Considered
In terms of an MCTP daemon structure, an alternative is to have the MCTP
implementation contained within a single process, using the libmctp API directly
for passing messages from the core code to application-level handlers. The
drawback of this approach is that this single process needs to implement all
possible functionality that is available over MCTP, which may be quite a
disjoint set. This would likely lead to unnecessary restrictions on the
implementation of those application-level handlers (programming language,
frameworks used, etc). Also, this single-process approach would likely need more
significant modifications if/when MCTP protocol support is moved to the kernel.

The interface between the demultiplexer daemon and clients is currently defined
as a socket-based interface. However, an alternative here would be to pass MCTP
messages over dbus instead. The reason for the choice of sockets rather than
dbus is that the former allows a direct transition to a kernel-based socket API
when suitable.
## Testing
For the core MCTP library, we are able to run tests in complete isolation (I
have already been able to run a prototype MCTP stack through the afl fuzzer) to
ensure that the core transport protocol works.

For MCTP hardware bindings, we would develop channel-specific tests that would
be run in CI on both host and BMC.

For the OpenBMC MCTP daemon implementation, testing models would depend on the
structure we adopt in the design section.