219 lines
9.0 KiB
Markdown
219 lines
9.0 KiB
Markdown
|
|
### NVMe-MI over SMBus
|
||
|
|
|
||
|
|
Author: Tony Lee <tony.lee@quantatw.com>
|
||
|
|
|
||
|
|
Created: 3-8-2019
|
||
|
|
|
||
|
|
#### Problem Description
|
||
|
|
|
||
|
|
Currently, OpenBMC does not support NVMe drive information. NVMe-MI
|
||
|
|
specification defines a command that can read the NVMe drive information via
|
||
|
|
SMBus directly. The NVMe drive can provide its information or status, like
|
||
|
|
vendor ID, temperature, etc. The aim of this proposal is to allow users to
|
||
|
|
monitor NVMe drives so appropriate action can be taken.
|
||
|
|
|
||
|
|
#### Background and References
|
||
|
|
|
||
|
|
NVMe-MI specification defines a command called
|
||
|
|
`NVM Express Basic Management Command` that can read the NVMe drives information
|
||
|
|
via SMBus directly. [1]. This command uses SMBus Block Read protocol specified
|
||
|
|
by the SMBus specification. [2].
|
||
|
|
|
||
|
|
For our purpose is retrieve NVMe drives information, therefore, using NVM
|
||
|
|
Express Basic Management Command where describe in NVMe-MI specification to
|
||
|
|
communicate with NVMe drives. According to different platforms, temperature
|
||
|
|
sensor, present status, LED and power sequence will be customized.
|
||
|
|
|
||
|
|
[1] NVM Express Management Interface Revision 1.0a April 8, 2017 in Appendix A.
|
||
|
|
(https://nvmexpress.org/wp-content/uploads/NVM_Express_Management_Interface_1_0a_2017.04.08_-_gold.pdf)
|
||
|
|
[2] System Management Bus (SMBus) Specification Version 3.0 20 Dec 2014
|
||
|
|
(http://smbus.org/specs/SMBus_3_0_20141220.pdf)
|
||
|
|
|
||
|
|
#### Requirements
|
||
|
|
|
||
|
|
The implementation should:
|
||
|
|
|
||
|
|
- Provide a daemon to monitor NVMe drives. Parameters to be monitored are Status
|
||
|
|
Flags, SMART Warnings, Temperature, Percentage Drive Life Used, Vendor ID, and
|
||
|
|
Serial Number.
|
||
|
|
- Provide a D-bus interface to allow other services to access data.
|
||
|
|
- Capability of communication over hardware channel I2C to NVMe drives.
|
||
|
|
- Ability to turn the fault LED on/off for each drive by SmartWarnings if the
|
||
|
|
object path of fault LED is defined in the configuration file.
|
||
|
|
|
||
|
|
#### Proposed Design
|
||
|
|
|
||
|
|
Create a D-bus service "xyz.openbmc_project.nvme.manager" with object paths for
|
||
|
|
each NVMe sensor: "/xyz/openbmc_project/sensors/temperature/nvme0",
|
||
|
|
"/xyz/openbmc_project/sensors/temperature/nvme1", etc. There is a JSON
|
||
|
|
configuration file for drive index, bus ID, and the fault LED object path for
|
||
|
|
each drive. For example,
|
||
|
|
|
||
|
|
```json
|
||
|
|
{
|
||
|
|
"NvmeDriveIndex": 0,
|
||
|
|
"NVMeDriveBusID": 16,
|
||
|
|
"NVMeDriveFaultLEDGroupPath": "/xyz/openbmc_project/led/groups/led_u2_0_fault",
|
||
|
|
"NVMeDrivePresentPin": 148,
|
||
|
|
"NVMeDrivePwrGoodPin": 161
|
||
|
|
},
|
||
|
|
{
|
||
|
|
"NvmeDriveIndex": 1,
|
||
|
|
"NVMeDriveBusID": 17,
|
||
|
|
"NVMeDriveFaultLEDGroupPath": "/xyz/openbmc_project/led/groups/led_u2_0_fault",
|
||
|
|
"NVMeDrivePresentPin": 149,
|
||
|
|
"NVMeDrivePwrGoodPin": 162
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
Structure like:
|
||
|
|
|
||
|
|
Under the D-bus named "xyz.openbmc_project.nvme.manager":
|
||
|
|
|
||
|
|
```
|
||
|
|
/xyz/openbmc_project
|
||
|
|
└─/xyz/openbmc_project/sensors
|
||
|
|
└─/xyz/openbmc_project/sensors/temperature/nvme0
|
||
|
|
```
|
||
|
|
|
||
|
|
/xyz/openbmc_project/sensors/temperature/nvme0 Which implements:
|
||
|
|
|
||
|
|
- xyz.openbmc_project.Sensor.Value
|
||
|
|
- xyz.openbmc_project.Sensor.Threshold.Warning
|
||
|
|
- xyz.openbmc_project.Sensor.Threshold.Critical
|
||
|
|
|
||
|
|
Under the D-bus named "xyz.openbmc_project.Inventory.Manager":
|
||
|
|
|
||
|
|
```
|
||
|
|
/xyz/openbmc_project
|
||
|
|
└─/xyz/openbmc_project/inventory
|
||
|
|
└─/xyz/openbmc_project/inventory/system
|
||
|
|
└─/xyz/openbmc_project/inventory/system/chassis
|
||
|
|
└─/xyz/openbmc_project/inventory/system/chassis/motherboard
|
||
|
|
└─/xyz/openbmc_project/inventory/system/chassis/motherboard/nvme0
|
||
|
|
```
|
||
|
|
|
||
|
|
/xyz/openbmc_project/inventory/system/chassis/motherboard/nvme0 Which
|
||
|
|
implements:
|
||
|
|
|
||
|
|
- xyz.openbmc_project.Inventory.Item
|
||
|
|
- xyz.openbmc_project.Inventory.Decorator.Asset
|
||
|
|
- xyz.openbmc_project.Nvme.Status
|
||
|
|
|
||
|
|
Interface `xyz.openbmc_project.Sensor.Value`, it's for hwmon to monitor
|
||
|
|
temperature and with the following properties:
|
||
|
|
|
||
|
|
| Property | Type | Description |
|
||
|
|
| -------- | ------ | -------------------- |
|
||
|
|
| MaxValue | int64 | Sensor maximum value |
|
||
|
|
| MinValue | int64 | Sensor minimum value |
|
||
|
|
| Scale | int64 | Sensor value scale |
|
||
|
|
| Unit | string | Sensor unit |
|
||
|
|
| Value | int64 | Sensor value |
|
||
|
|
|
||
|
|
Interface `xyz.openbmc_project.Nvme.Status` with the following properties:
|
||
|
|
|
||
|
|
| Property | Type | Description |
|
||
|
|
| ----------------- | ------ | -------------------------------------------- |
|
||
|
|
| SmartWarnings | string | Indicates smart warnings for the state |
|
||
|
|
| StatusFlags | string | Indicates the status of the drives |
|
||
|
|
| DriveLifeUsed | string | A vendor specific estimate of the percentage |
|
||
|
|
| TemperatureFault | bool | If warning type about temperature happened |
|
||
|
|
| BackupdrivesFault | bool | If warning type about backup drives happened |
|
||
|
|
| CapacityFault | bool | If warning type about capacity happened |
|
||
|
|
| DegradesFault | bool | If warning type about degrades happened |
|
||
|
|
| MediaFault | bool | If warning type about media happened |
|
||
|
|
|
||
|
|
Interface `xyz.openbmc_project.Inventory.Item` with the following properties:
|
||
|
|
|
||
|
|
| Property | Type | Description |
|
||
|
|
| ---------- | ------ | ----------------------------------- |
|
||
|
|
| PrettyName | string | The human readable name of the item |
|
||
|
|
| Present | bool | Whether or not the item is present |
|
||
|
|
|
||
|
|
Interface `xyz.openbmc_project.Inventory.Decorator.Asset` with the following
|
||
|
|
properties:
|
||
|
|
|
||
|
|
| Property | Type | Description |
|
||
|
|
| ------------ | ------ | ------------------------------------------------- |
|
||
|
|
| PartNumber | string | The item part number, typically a stocking number |
|
||
|
|
| SerialNumber | string | The item serial number |
|
||
|
|
| Manufacturer | string | The item manufacturer |
|
||
|
|
| BuildDate | bool | The date of item manufacture in YYYYMMDD format |
|
||
|
|
| Model | bool | The model of the item |
|
||
|
|
|
||
|
|
##### xyz.openbmc_project.nvme.manager.service
|
||
|
|
|
||
|
|
This service has several steps:
|
||
|
|
|
||
|
|
1. It will register a D-bus called `xyz.openbmc_project.nvme.manager`
|
||
|
|
description above.
|
||
|
|
2. Obtain the drive index, bus ID, GPIO present pin, power good pin and fault
|
||
|
|
LED object path from the json file mentioned above.
|
||
|
|
3. Each cycle will do following steps:
|
||
|
|
1. Check if the present pin of target drive is true, if true, means drive
|
||
|
|
exists and go to next step. If not, means drive does not exists and remove
|
||
|
|
object path from D-bus by drive index.
|
||
|
|
2. Check if the power good pin of target drive is true, if true means drive
|
||
|
|
is ready then create object path by drive index and go to next step. If
|
||
|
|
not, means drive power abnormal, turn on fault LED and log in journal.
|
||
|
|
3. Send a NVMe-MI command via SMBus Block Read protocol by bus ID of target
|
||
|
|
drive to get data. Data get from NVMe drives are "Status Flags", "SMART
|
||
|
|
Warnings", "Temperature", "Percentage Drive Life Used", "Vendor ID", and
|
||
|
|
"Serial Number".
|
||
|
|
4. The data will be set to the properties in D-bus.
|
||
|
|
|
||
|
|
This service will run automatically and look up NVMe drives every second.
|
||
|
|
|
||
|
|
##### Fault LED
|
||
|
|
|
||
|
|
When the value obtained from the command corresponds to one of the warning
|
||
|
|
types, it will trigger the fault LED of corresponding device and issue events.
|
||
|
|
|
||
|
|
##### Add SEL related to NVMe
|
||
|
|
|
||
|
|
The events `TemperatureFault`, `BackupdrivesFault`, `CapacityFault`,
|
||
|
|
`DegradesFault` and `MediaFault` will be generated for the NVMe errors.
|
||
|
|
|
||
|
|
- Temperature Fault log : when the property `TemperatureFault` set to true
|
||
|
|
- Backupdrives Fault log : when the property `BackupdrivesFault` set to true
|
||
|
|
- Capacity Fault log : when the property `CapacityFault` set to true
|
||
|
|
- Degrades Fault log : when the property `DegradesFault` set to true
|
||
|
|
- Media Fault log: when the property `MediaFault` set to true
|
||
|
|
|
||
|
|
#### Alternatives Considered
|
||
|
|
|
||
|
|
NVMe-MI specification defines multiple commands that can communicate with NVMe
|
||
|
|
drives over MCTP protocol. The NVMe-MI over MCTP has the following key
|
||
|
|
capabilities:
|
||
|
|
|
||
|
|
- Discover drives that are present and learn capabilities of each drives.
|
||
|
|
- Store data about the host environment enabling a Management Controller to
|
||
|
|
query the data later.
|
||
|
|
- A standard format for VPD and defined mechanisms to read/write VPD contents.
|
||
|
|
- Inventorying, configuring and monitoring.
|
||
|
|
|
||
|
|
For monitoring NVMe drives, using NVM Express Basic Management Command over
|
||
|
|
SMBus directly is much simpler than NVMe-MI over MCTP protocol.
|
||
|
|
|
||
|
|
#### Impacts
|
||
|
|
|
||
|
|
This application is monitoring NVMe drives via SMbus and set values to D-bus.
|
||
|
|
The impacts should be small in the system.
|
||
|
|
|
||
|
|
#### Testing
|
||
|
|
|
||
|
|
This implementation is to use NVMe-MI-Basic command over SMBus and then set the
|
||
|
|
response data to D-bus. Testing will send SMBus command to the drives to get the
|
||
|
|
information and compare with the properties in D-bus to make sure they are the
|
||
|
|
same. The testing can be performed on different NVMe drives by different
|
||
|
|
manufacturers. For example: Intel P4500/P4600 and Micron 9200 Max/Pro.
|
||
|
|
|
||
|
|
Unit tests will test by function:
|
||
|
|
|
||
|
|
- It tests the length of responded data is as same as design in the function of
|
||
|
|
getting NVMe information.
|
||
|
|
- It tests the function of setting values to D-bus is as same as design.
|
||
|
|
- It tests the function of turn the corresponding LED ON/OFF by different
|
||
|
|
Smartwarnings values.
|