Articles and tutorials in the “Edge of the Stack” series cover fundamental programming issues and concerns that might not come up when dealing with OpenStack directly, but are certainly relevant to the OpenStack ecosystem. For example, bare metal provisioning requires the ability to programmatically turn a machine on or off, which is handled using IPMI.
When computer networks grew large, managing many types of hardware produced by different vendors became painful. The tools that manufacturers provided were mutually incompatible. This led to the development of a standard for managing servers in a uniform way. This widely adopted standard was named the Intelligent Platform Management Interface (IPMI).
IPMI provides a well-defined hardware interface between the management software and the managed platform. The standard also describes the hardware architecture so that most hardware can be managed with the same set of tools. It provides an inventory of components, monitoring for a large number of system parameters, event logging, and a set of management tasks such as accessing the system BIOS and changing the platform’s power state. It also allows remote access to the console, among other features.
IPMI hardware is integrated into most modern server system boards. It can have its own LAN or serial port or share one with the managed system. The IPMI board does not depend on the platform CPU, BIOS, OS, or power state.
Several implementations of the IPMI protocol are included in Linux distributions. This allows an OpenStack bare metal instance to monitor and manage the underlying hardware, as well as to expose a management interface via the Nova API. In this scenario, IPMI makes it possible to detect hardware problems early and thus to increase the robustness of the cloud by taking measures proactively.
IPMI was developed by Intel, HP, Dell, and NEC. Version 1 was presented at the Intel Developer Forum in the fall of 1998. By that time, many of the current features had already been described.
The next major update came with version 1.5, published in 2001. This version extended the messaging and alerting capabilities of the system, allowing it to send alerts via a LAN or serial interface, such as a modem. Since version 1.5, IPMI can also send alerts as SNMP traps (Platform Event Trap, PET) and supports complex alert processing rules with Platform Event Filtering (PEF) and Alert Policies. Another feature introduced was boot options management.
The specification for version 2.0 was published in 2004. The main updates were enhanced security and payloads. The new security features include encrypted communication with IPMI, access control privileges, and enhanced authentication. Payloads make it possible to transfer other types of data, not only IPMI messages, over an IPMI session. Payloads also made it possible to introduce Serial-over-LAN access in the same version of the specification.
The current specification is version 2.0, revision 1.1, published in October 2013. Its biggest change is the introduction of IPv6 support for IPMI-over-LAN.
The central part of an IPMI board is the Baseboard Management Controller (BMC). The BMC is a specialized microcontroller that manages the interface between the platform hardware and the rest of the world.
An IPMI board consists of a set of modules that are connected to the BMC via a number of interfaces such as SMBus, I2C, or memory-mapped I/O ports. The modules can be as simple as a single temperature sensor or quite complex and managed by their own controllers (Satellite MCs), like the IPMI module of a blade server installed in a chassis.
Figure 1: IPMI Structure
For communication between the BMC and a Satellite MC, IPMI uses the Intelligent Platform Management Bus (IPMB) protocol, which is based on the I2C bus. Several BMCs can also be connected to each other. In this case they use a variant of IPMB called the Intelligent Chassis Management Bus (ICMB), which also supports RS-485 and CAN interfaces.
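To make the IPMB framing more concrete, here is a minimal Python sketch of how a request frame could be assembled. It assumes the layout described in the IPMB specification: a two-byte connection header and the rest of the message, each followed by an 8-bit two's-complement checksum. The function names and the requester address 0x82 are illustrative assumptions and do not come from any library.

```python
def ipmb_checksum(data: bytes) -> int:
    """8-bit two's-complement checksum: data plus checksum sums to 0 mod 256."""
    return (-sum(data)) & 0xFF


def build_ipmb_request(rs_addr, netfn, rs_lun, rq_addr, rq_seq, rq_lun, cmd, data=b""):
    """Assemble an IPMB request frame (a sketch, not a full implementation).

    rs_addr / rq_addr are the responder and requester slave addresses,
    netfn is the 6-bit network function, and the LUNs are 2-bit values.
    """
    header = bytes([rs_addr, (netfn << 2) | rs_lun])
    body = bytes([rq_addr, (rq_seq << 2) | rq_lun, cmd]) + data
    return header + bytes([ipmb_checksum(header)]) + body + bytes([ipmb_checksum(body)])


# Get Device ID (NetFn App = 0x06, command 0x01) addressed to a BMC at 0x20.
# The requester address 0x82 is a made-up value for illustration.
frame = build_ipmb_request(0x20, 0x06, 0, 0x82, 0, 0, 0x01)
print(frame.hex())
```

A receiver can validate a frame by checking that the header bytes plus the first checksum, and the remaining bytes plus the second checksum, each sum to zero modulo 256.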
Besides the BMC, the IPMI subsystem includes non-volatile memory, which stores Field Replaceable Unit (FRU) information, Sensor Data Records (SDR), and the System Event Log (SEL).
The FRU information storage contains data used primarily for inventory. The required minimum is the part or version number of a component. Usually the FRU data contains much more information, including the product name, manufacturer, description, and so on.
The SDRs hold the list of hardware sensors as well as their types and values. IPMI supports a wide range of sensor types: threshold, gauge, and digital, to name a few. All Sensor Data Records are kept in the SDR Repository.
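As an illustration of how a threshold sensor described by an SDR might be interpreted, the following Python sketch converts a raw reading into engineering units and compares it with thresholds. It assumes the linear conversion formula from the IPMI specification, y = (M·x + B·10^K1)·10^K2, and ignores non-linear sensors. The function names, factor values, and thresholds are hypothetical.

```python
def convert_reading(raw, m, b, b_exp, r_exp):
    """Linear SDR conversion: y = (M * raw + B * 10**K1) * 10**K2.

    m and b are the multiplier and offset from the SDR; b_exp (K1) and
    r_exp (K2) are the signed exponents stored next to them.
    """
    return (m * raw + b * 10 ** b_exp) * 10 ** r_exp


def threshold_status(value, upper_non_critical, upper_critical):
    """Classify a converted reading against two upper thresholds."""
    if value >= upper_critical:
        return "critical"
    if value >= upper_non_critical:
        return "warning"
    return "ok"


# Hypothetical temperature sensor: raw byte 100, M=2, B=5, K1=1, K2=-1.
celsius = convert_reading(100, m=2, b=5, b_exp=1, r_exp=-1)
print(celsius, threshold_status(celsius, upper_non_critical=80.0, upper_critical=90.0))
```

In practice the factors and thresholds would be read from the sensor's record in the SDR Repository rather than hard-coded.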
Another piece of information stored in non-volatile memory is the SEL. The log data is collected from every component of the IPMI module, processed according to the Alert Policy, and stored until explicitly cleared. Event handling can be configured to take actions without human interaction, such as power cycling the hardware or sending alerts in the form of SNMP traps or SMS messages.
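To show what a single log entry looks like, here is a hedged Python sketch that decodes one 16-byte SEL record. It assumes the standard system event record layout from the IPMI specification (record type 0x02, multi-byte fields stored least significant byte first). The `SelRecord` type, the function name, and the sample values are hypothetical.

```python
import struct
from collections import namedtuple

SelRecord = namedtuple(
    "SelRecord",
    "record_id record_type timestamp generator_id evm_rev "
    "sensor_type sensor_number asserted event_type event_data",
)


def parse_sel_record(raw: bytes) -> SelRecord:
    """Decode one 16-byte system event record (a sketch, not a full parser)."""
    if len(raw) != 16:
        raise ValueError("a SEL record is exactly 16 bytes long")
    # Little-endian: id (2), type (1), timestamp (4), generator (2),
    # EvMRev (1), sensor type (1), sensor number (1), dir/type (1), data (3).
    rid, rtype, ts, gen, evm, stype, snum, dir_type, data = struct.unpack(
        "<HBIHBBBB3s", raw
    )
    return SelRecord(
        rid, rtype, ts, gen, evm, stype, snum,
        asserted=not (dir_type & 0x80),  # bit 7 set means deassertion
        event_type=dir_type & 0x7F,
        event_data=data,
    )


# Made-up sample: a threshold event (type 0x01) from temperature sensor 0x30.
sample = struct.pack(
    "<HBIHBBBB3s", 1, 0x02, 1382313600, 0x0020, 0x04, 0x01, 0x30, 0x01, b"\x59\x00\x00"
)
print(parse_sel_record(sample))
```

A monitoring script could apply such a decoder to raw records fetched from the BMC and then filter on sensor type or assertion state before raising alerts.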
IPMI was designed with extensibility in mind. OEMs can add their own commands, custom sensor types not described in the specification, and private protocols. Private protocols are widely used in vendor management software to provide remote access to the GUI of running hosts.
Hardware vendors had been providing management tools for their platforms even before IPMI was created. These tools, such as Intel AMT, Dell iDRAC, HP iLO, and others, are now built on top of IPMI. They provide full support for the vendor’s own hardware, which often contains proprietary extensions such as a virtual HDD or CD-ROM drive, but they don’t support the features of other vendors. The management software can be provided free of charge with the hardware (possibly with somewhat limited functionality) or purchased separately.
There are also several open source tools for accessing and configuring IPMI hardware. They have some subtle differences in features, OS support, and intended audience, but all are generally functional and mature. Detailed comparison can be found here. The common disadvantage of these tools is the lack of support for some vendor extensions, such as GUI access to running servers. On the other hand, they make it possible to manage heterogeneous hardware at the same time, since they support some non-standard extensions from several vendors. These tools are also command-line based, so it is possible to script complex management scenarios that run without user intervention.
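As a sketch of such scripting, the following Python snippet wraps ipmitool, one of the open source command-line tools, assuming it is installed on the management host. The helper names are hypothetical. The `-I`, `-H`, `-U`, and `-E` options are standard ipmitool options, where `-E` makes ipmitool read the password from the `IPMI_PASSWORD` environment variable so that it does not appear on the command line.

```python
import os
import subprocess


def ipmitool_command(host, user, *args, interface="lanplus"):
    """Build the argument list for one ipmitool call over the LAN."""
    return ["ipmitool", "-I", interface, "-H", host, "-U", user, "-E", *args]


def run_ipmitool(host, user, password, *args):
    """Run ipmitool and return its standard output as text.

    Raises subprocess.CalledProcessError if ipmitool exits with an error.
    """
    env = dict(os.environ, IPMI_PASSWORD=password)
    result = subprocess.run(
        ipmitool_command(host, user, *args),
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example command line for a made-up BMC address and user name.
print(" ".join(ipmitool_command("192.0.2.10", "admin", "chassis", "power", "status")))
```

Looping over a list of BMC addresses and calling `run_ipmitool(host, user, password, "chassis", "power", "status")` for each one would then report the power state of a whole rack without user intervention. Subcommands such as `sdr list`, `sel list`, and `fru print` can be passed in the same way.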
Since IPMI provides almost physical access to the managed hardware and the number of servers can be very large, access to it should be carefully protected. Unfortunately, security is often neglected, and it is quite common to find thousands of servers protected with the same password.
IPMI’s own protection mechanisms are extensive: it supports strong encryption when transmitting data over a LAN, user privilege separation, and VLANs. However, the default security configuration is usually extremely permissive, and implementations are not error-free. For example, a vulnerability was discovered some time ago that made it possible to gain administrator access to the BMC with any password. Fortunately, this issue can be easily fixed by modifying the list of allowed ciphers.
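The issue above is commonly reported as the "cipher suite 0" problem, and a check for it can be automated. The Python sketch below inspects text in the style of `ipmitool lan print` output, assuming it contains a `Cipher Suite Priv Max` line whose characters give the maximum privilege for cipher suites 0, 1, 2, and so on, with `X` meaning the suite is disabled. The function name is hypothetical, and the exact output format may differ between ipmitool versions and BMCs.

```python
import re


def cipher_zero_enabled(lan_print_output: str) -> bool:
    """Return True if cipher suite 0 appears to be allowed on the channel.

    Assumes a line such as 'Cipher Suite Priv Max   : XaaaXXaaaXXaaXX',
    where the first character describes cipher suite 0.
    """
    match = re.search(
        r"^\s*Cipher Suite Priv Max\s*:\s*(\S+)", lan_print_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'Cipher Suite Priv Max' line found")
    privs = match.group(1)
    # Only accept the privilege letters this sketch knows about.
    if not set(privs) <= set("XcuoaO"):
        raise ValueError("unrecognized privilege list: " + privs)
    return privs[0] != "X"


# Made-up sample outputs for illustration.
permissive = "IP Address : 192.0.2.10\nCipher Suite Priv Max   : aaaaXXaaaXXaaXX\n"
hardened = "IP Address : 192.0.2.10\nCipher Suite Priv Max   : XaaaXXaaaXXaaXX\n"
print(cipher_zero_enabled(permissive), cipher_zero_enabled(hardened))
```

If the check reports that suite 0 is enabled, the privilege list can typically be changed with ipmitool's `lan set <channel> cipher_privs` subcommand, although the details depend on the BMC.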
Moreover, another security issue was discovered in the IPMI specification itself. It cannot be fixed without changing the standard, so it has not been fixed yet. In short, the standard requires the BMC to send a password hash for the user before the client authenticates, which makes it possible to recover the password of any user without much effort. This document provides detailed information, along with ways to check whether a system is vulnerable. Also, IPMI security best practices contains some tips for securing IPMI infrastructure.
A rich set of supported features, such as remote power management, BIOS configuration, and a vast number of sensors for monitoring hardware, together with a standard way to access these features and a set of software implementations, makes IPMI very helpful in maintaining and monitoring a large number of servers from a single point. Despite some security weaknesses when exposed to an untrusted network, IPMI can, with proper setup, be considered a robust and mature management technology.
For more information about IPMI, please refer to: