# BMC Health Monitor

Author: Vijay Khemka <vijaykhemka@fb.com>, Sui Chen <suichen@google.com>, Jagpal
Singh Gill <paligill@gmail.com>

Created: 2020-05-04

## Problem Description

The problem is to monitor the health of a system with a BMC so that we have some
means to make sure the BMC is working correctly. Users can get the required
metrics data instantly, as per the configuration. The set of monitored metrics
may include CPU and memory utilization, uptime, free disk space, I2C bus stats,
and so on. Actions can be taken based on monitoring data to correct the BMC’s
state.

For this purpose, there may exist a metric producer (the subject of discussion
of this document) and a metric consumer (a program that makes use of health
monitoring data, which may run on the BMC or on the host). They perform the
following tasks:

1. Configuration, where the user specifies what and how to collect, thresholds,
   etc.
2. Metric collection, similar to what the read routine in phosphor-hwmon-readd
   does.
3. Metric staging. When metrics are collected, they will be ready to be read
   anytime in accessible forms like D-Bus objects or raw files for use with
   consumer programs. Because of this staging step, the consumer does not need
   to poll and wait.
4. Data transfer, where the consumer program obtains the metrics from the BMC by
   in-band or out-of-band methods.
5. The consumer program may take certain actions based on the metrics collected.

Among those tasks, 1), 2), and 3) are the producer’s responsibility. 4) is
accomplished by both the producer and consumer. 5) is up to the consumer.

We realize there is some overlap between sensors and health monitoring in terms
of design rationale and existing infrastructure. But there are also quite a few
differences between sensors and metrics:

1. Sensor data originate from hardware, while most metrics may be obtained
   through software. For this reason, there may be more commonalities between
   metrics on all kinds of BMCs than sensors on BMCs, and we might not need the
   hardware discovery process or build-time, hardware-specific configuration for
   most health metrics.
2. Most sensors are instantaneous readings, while metrics might accumulate over
   time, such as “uptime”. For those metrics, we might want to do calculations
   that do not apply to sensor readings.
3. Metrics can represent device attributes which don't change, for example,
   total system memory, which is constant. In contrast, the primary intention
   of sensors is to sense the change in attributes and represent that
   variability.
4. Metrics are expressed in native units such as bytes for memory. The sensors
   infrastructure doesn't adhere to this, and the community has rejected the
   proposal to add bytes as a sensor unit.

Based on the above, it doesn't seem reasonable to use sensors for representing
metrics data.

## Background and References

References: dbus-monitor

## Requirements

The metric producer should provide:

- A daemon to periodically collect various health metrics and expose them on
  D-Bus.
- A D-Bus interface to allow other services, like Redfish and IPMI, to access
  its data.
- Capability to configure health monitoring for a wide variety of metrics, such
  as Memory Utilization, CPU Utilization, Reboot Statistics, etc.
- Capability to provide granular details for various metric types, for example:
  - Memory Utilization - Free Memory, Shared Memory, Buffered & Cached Memory,
    etc.
  - CPU Utilization - Userspace CPU Utilization, Kernelspace CPU Utilization,
    etc.
  - Reboot Statistics - Normal reboot count, Reboot count with failures, etc.
- Capability to take action as configured when a value crosses its threshold.
- Optionally, maintain a certain amount of historical data.
- Optionally, log critical / warning messages.

The metric consumer may be written in various different ways. No matter how the
consumer is implemented, it should be able to obtain the health metrics from the
producer through a set of interfaces.

The metric consumer is not in the scope of this document.

## Proposed Design

The metric producer is a daemon running on the BMC that performs the required
tasks and meets the requirements above. As described above, it is responsible
for:

1. Configuration
2. Metric collection
3. Metric staging & dispersal

For 1) Configuration, the daemon will have a default in-code configuration. A
platform may supply a configuration file if it wants to override specific
default attributes. The format of the JSON configuration file is as follows:

```json
{
  "kernel": {
    "Frequency": 1,
    "Window_size": 120,
    "Type": "CPU",
    "Threshold": {
      "Critical": {
        "Value": 90.0,
        "Log": true,
        "Target": "reboot.target"
      },
      "Warning": {
        "Value": 80.0,
        "Log": false,
        "Target": "systemd unit file"
      }
    }
  },
  "available": {
    "Frequency": 1,
    "Window_size": 120,
    "Type": "Memory",
    "Threshold": {
      "Critical": {
        "Value": 90.0,
        "Log": true,
        "Target": "reboot.target"
      }
    }
  }
}
```

- Frequency: The interval, in seconds, at which the data is collected.
- Window_size: The number of samples that are averaged, so that the reported
  usage reflects sustained load rather than a momentary spike.
- Log: A boolean value that controls whether an alert is logged. This field is
  optional; the default is 'true' for critical and 'false' for warning.
- Target: The systemd target unit that will be started once the value crosses
  its threshold. This field is optional.
- Type: The type of configuration entry. Possible values are Memory, CPU,
  Reboot, and Storage.

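The interaction between `Window_size`, the threshold `Value`, and the `Log`
defaults can be illustrated with a minimal Python sketch. It is not the daemon's
implementation: the class name `WindowedThreshold` is hypothetical, "crossing"
is assumed to mean that the windowed average is at or above `Value`, the average
is taken over the samples seen so far until the window fills, and the check is
reported on every sample rather than only on the transition.

```python
from collections import deque

# Defaults stated in the text: "Log" is true for Critical and false for Warning.
DEFAULT_LOG = {"Critical": True, "Warning": False}


class WindowedThreshold:
    """Hypothetical helper: average recent samples and compare to thresholds."""

    def __init__(self, entry):
        # `entry` is one metric block from the JSON configuration, e.g. "kernel".
        self.samples = deque(maxlen=entry["Window_size"])
        self.thresholds = entry.get("Threshold", {})

    def add_sample(self, value):
        """Record a sample; return (level, log, target) for each level reached."""
        self.samples.append(value)
        average = sum(self.samples) / len(self.samples)
        reached = []
        for level, cfg in self.thresholds.items():
            # Assumption: a threshold is reached when the average is >= Value.
            if average >= cfg["Value"]:
                log = cfg.get("Log", DEFAULT_LOG.get(level, False))
                reached.append((level, log, cfg.get("Target")))
        return reached
```

With a window of three samples, a single reading of 100 after two readings of 50
averages to about 66.7 and reaches no threshold, which is the spike-smoothing
behavior that `Window_size` is meant to provide.
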
For 2) Metric collection, this will be done by running certain functions within
the daemon, as opposed to launching external programs and shell scripts. This is
due to performance and security considerations.

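As an illustration of in-process collection, the following Python sketch parses
the text of `/proc/meminfo` directly instead of spawning a tool such as `free`.
The function names are hypothetical, and the mapping from kernel fields to
metric names (`Shmem` for shared, `Buffers` plus `Cached` for buffered and
cached) is an assumption made for this example, not something the design
specifies.

```python
def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields, with kB values converted to bytes."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue  # Skip blank or malformed lines.
        value = int(parts[0])
        # Most fields carry a "kB" suffix, which the kernel uses for 1024 bytes.
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def memory_metrics(info):
    """Map parsed fields to metric names (the mapping is an assumption)."""
    return {
        "total": info["MemTotal"],
        "free": info["MemFree"],
        "available": info["MemAvailable"],
        "shared": info["Shmem"],
        "buffered_and_cached": info["Buffers"] + info["Cached"],
    }


def read_memory_metrics(path="/proc/meminfo"):
    """Read and parse the file in-process; no external program is launched."""
    with open(path) as f:
        return memory_metrics(parse_meminfo(f.read()))
```

Keeping the parser separate from the file read makes it possible to unit test
the parsing against a fixed sample string.
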
For 3) Metric staging & dispersal, the daemon creates a D-Bus service named
"xyz.openbmc_project.HealthManager". The design proposes new
[Metrics D-Bus interfaces](https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/64914).

| Interface Name                       | Purpose                                                                        | Required/Optional |
| :----------------------------------- | :----------------------------------------------------------------------------- | :---------------- |
| xyz.openbmc_project.Metric.Value     | Interface to represent value for Metrics.                                      | Required          |
| xyz.openbmc_project.Metric.Reset     | Interface to reset persistent Metrics counters.                                | Optional          |
| xyz.openbmc_project.Common.Threshold | Interface to represent Metric thresholds and signals for threshold violations. | Optional          |
| xyz.openbmc_project.Time.EpochTime   | Interface to indicate when the metric was collected.                           | Optional          |

Each metric will be exposed on a specific object path, and the above interfaces
will be implemented at these paths.

```
/xyz/openbmc_project
|- /xyz/openbmc_project/metric/bmc/memory/total
|- /xyz/openbmc_project/metric/bmc/memory/free
|- /xyz/openbmc_project/metric/bmc/memory/available
|- /xyz/openbmc_project/metric/bmc/memory/shared
|- /xyz/openbmc_project/metric/bmc/memory/buffered_and_cached
|- /xyz/openbmc_project/metric/bmc/cpu/user
|- /xyz/openbmc_project/metric/bmc/cpu/kernel
|- /xyz/openbmc_project/metric/bmc/reboot/count
|- /xyz/openbmc_project/metric/bmc/reboot/count_with_failure
```

Servers for Metrics Data

| Metric Type        | Interface Server        | Info Source                                            |
| :----------------- | :---------------------- | :----------------------------------------------------- |
| Memory Utilization | phosphor-health-manager | /proc/meminfo                                          |
| CPU Utilization    | phosphor-health-manager | /proc/stat                                             |
| Reboot Statistics  | phosphor-state-manager  | Persistent counters incremented based on reboot status |

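Because `/proc/stat` reports cumulative counters, CPU utilization has to be
derived from the difference between two samples. The Python sketch below shows
one way to do that under stated assumptions: the function names are
hypothetical, "user" is taken from the `user` column only and "kernel" from the
`system` column only, and the guest columns are ignored because the kernel
already includes guest time in `user`.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate "cpu" counters from /proc/stat as a list of ints."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            # Columns: user nice system idle iowait irq softirq steal.
            return [int(v) for v in parts[1:9]]
    raise ValueError("no aggregate cpu line found")


def cpu_utilization(prev_text, curr_text):
    """Return user and kernel utilization in percent between two samples."""
    prev = parse_cpu_line(prev_text)
    curr = parse_cpu_line(curr_text)
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    if total <= 0:
        # No time elapsed between the samples (or counters went backwards).
        return {"user": 0.0, "kernel": 0.0}
    return {
        "user": 100.0 * deltas[0] / total,
        "kernel": 100.0 * deltas[2] / total,
    }
```

For example, if the `user` counter advances by 50 and the `system` counter by
20 out of 200 total ticks between two samples, the result is 25% user and 10%
kernel utilization.
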
Multiple devices of the same type:

In case there are multiple devices of the same type, the D-Bus path can be
extended to add context about **"which device"**. For example:

```
/xyz/openbmc_project/metric/device-0/memory/total
/xyz/openbmc_project/metric/device-1/memory/total
...
```

These paths can be hosted by different daemons. For example, pldmd can host
D-Bus paths for BICs if the master BMC uses PLDM to communicate with the BICs.
The Value interface for each metric would need to be associated with the
appropriate system inventory item.

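A consumer that sees paths from several devices needs to recover the device
context from the path itself. The Python sketch below assumes the layout
`<root>/<device>/<category>/<name>` seen in the examples above; the helper
names are hypothetical and are not part of any OpenBMC library.

```python
METRIC_ROOT = "/xyz/openbmc_project/metric"


def split_metric_path(path):
    """Split a metric object path into (device, category, name)."""
    prefix = METRIC_ROOT + "/"
    if not path.startswith(prefix):
        raise ValueError(f"not a metric path: {path}")
    parts = path[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <device>/<category>/<name>: {path}")
    return tuple(parts)


def group_by_device(paths):
    """Group metric paths by the device element of the path."""
    grouped = {}
    for path in paths:
        device, category, name = split_metric_path(path)
        grouped.setdefault(device, []).append(f"{category}/{name}")
    return grouped
```
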
## Alternatives Considered

We have tried doing health monitoring completely within the IPMI Blob framework.
In comparison, having the metric collection part as a separate daemon is better
for supporting more interfaces.

We have also tried doing the metric collection task by running an external
binary as well as a shell script. It turns out that running a shell script is
too slow, while running an external program might raise security concerns (in
that the third-party program would need to be verified to be safe).

Collectd: Collectd provides multiple plugins that gather a wide variety of
metrics from various sources, and it provides mechanisms to store them in
different ways. For exposing these metrics on D-Bus, a Collectd C plugin can be
written.

Pros:

- Off-the-shelf tool with support for a lot of metrics.

Cons:

- Due to its support for a wide variety of systems (Linux, Solaris, OpenBSD,
  MacOSX, AIX, etc.) and applications, the amount of code for each Collectd
  plugin is pretty significant. Given the amount of functionality needed for
  OpenBMC, Collectd seems heavyweight. The majority of the
  phosphor-health-monitor code will be around exposing the metrics on D-Bus,
  which will also be needed for a Collectd plugin. Hence, directly reading from
  /proc/<fileX> seems more lightweight, as code already exists for it.
- Collectd has minimal support for threshold monitoring and doesn't allow
  starting systemd services on threshold violations.

## Future Enhancements

Extend the Metrics D-Bus interface for:

- Storage
- Inodes
- Port/Network Statistics
- BMC Daemon Statistics

## Impacts

Most of what the Health Monitoring Daemon does is collect metrics and update
D-Bus objects. The impact of the daemon itself should be small.

The proposed design changes the D-Bus interface from Sensors to Metrics, so the
following daemons would need to be refactored/updated to account for the
interface change:

- [BMCWeb](https://github.com/openbmc/bmcweb/blob/master/redfish-core/lib/manager_diagnostic_data.hpp)
- [phosphor-host-ipmid](https://grok.openbmc.org/xref/openbmc/openbmc/meta-quanta/meta-s6q/recipes-phosphor/configuration/s6q-yaml-config/ipmi-sensors.yaml?r=e4f3792f#82)

## Organizational

### Does this design require a new repository?

No, changes will go into phosphor-health-monitor.

### Which repositories are expected to be modified to execute this design?

- phosphor-health-monitor
- phosphor-state-manager
- BMCWeb
- phosphor-host-ipmid

## Testing

### Unit Testing

To verify the daemon is functioning correctly, monitor the D-Bus traffic
generated by the daemon and the metric values from the daemon’s D-Bus objects.
Automated unit testing will be covered via GTest.

### Integration Testing

Manual end-to-end testing can be performed via a Redfish GET for
ManagerDiagnosticData. Automated end-to-end testing will be covered using
openbmc-test-automation. To verify the performance aspect, we can stress-test
the daemon’s D-Bus interfaces to make sure the interfaces do not cause a high
overhead.

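A manual end-to-end check could include a quick sanity pass over the decoded
ManagerDiagnosticData response. The Python sketch below is only an illustration:
the function name is hypothetical, the URI assumes the BMC manager is exposed
under the ID `bmc`, and the property names are taken from the Redfish
ManagerDiagnosticData schema as understood here, so they should be confirmed
against the schema version that BMCWeb implements.

```python
# Assumed URI: the BMC manager is exposed under the ID "bmc".
MANAGER_DIAGNOSTIC_DATA_URI = "/redfish/v1/Managers/bmc/ManagerDiagnosticData"


def check_diagnostic_data(payload):
    """Return a list of problems found in a decoded ManagerDiagnosticData body."""
    problems = []

    memory = payload.get("MemoryStatistics", {})
    total = memory.get("TotalBytes")
    for key in ("FreeBytes", "AvailableBytes", "SharedBytes", "BuffersAndCacheBytes"):
        value = memory.get(key)
        if value is None:
            continue  # Property not reported; nothing to check.
        if value < 0 or (total is not None and value > total):
            problems.append(f"MemoryStatistics.{key} out of range: {value}")

    cpu = payload.get("ProcessorStatistics", {})
    for key in ("UserPercent", "KernelPercent"):
        value = cpu.get(key)
        if value is not None and not 0.0 <= value <= 100.0:
            problems.append(f"ProcessorStatistics.{key} out of range: {value}")

    return problems
```

Here `payload` would be the JSON body of the GET response decoded into a
dictionary; an empty list means none of the checked values looked implausible.
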