Scope
The package domain manages the entire CPU socket, including all processor cores and non-core components like the last-level cache, integrated GPU, and memory controller. The power limit and energy consumption reported at this level cover the whole CPU package, including Power Plane 0, Power Plane 1, and the DRAM domain.
RAPL reports energy at the granularity of the entire CPU package, the processor cores, the integrated GPU, and DRAM. It does not provide per-core energy consumption data; per-core figures must be derived from the RAPL data by other means. [1]
Summary
Running Average Power Limit (RAPL) is a feature introduced by Intel in their Sandy Bridge processors, which provides a way to monitor and control the energy consumption of various components within the processor package. It allows for real-time measurement and enforcement of power limits, helping to optimise power usage and thermal management.
RAPL works by dividing the processor package into different power domains or “planes,” each representing a specific component or set of components. These domains typically include the whole package (Package domain), the CPU cores (Power Plane 0, also called the Core domain), integrated graphics (Power Plane 1, also called the GT domain), and DRAM memory (DRAM domain). RAPL provides hardware counters and interfaces to read the energy consumption and set power limits for each domain. [2]
The energy consumption is measured by RAPL in terms of “energy units,” which are specific to the processor model. For example, Sandy Bridge processors use units of 15.3 microjoules (μJ), while Haswell and Skylake processors use units of 61 μJ. These counters are updated approximately every millisecond, allowing for fine-grained power consumption monitoring. [3]
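The unit sizes quoted above are powers of two: 15.3 μJ is 1/2^16 J and 61 μJ is 1/2^14 J. The following minimal sketch shows how a raw counter value could be converted to joules, assuming the exponent is stored in bits 12:8 of the MSR_RAPL_POWER_UNIT register as described in Intel's documentation. The function names and the two sample register values are illustrative and are not taken from the cited sources.

```python
def extract_energy_exponent(power_unit_msr: int) -> int:
    """Return the energy status unit exponent (assumed to sit in bits 12:8)."""
    return (power_unit_msr >> 8) & 0x1F


def energy_unit_joules(exponent: int) -> float:
    """Joules represented by one counter increment: 1 / 2**exponent."""
    return 1.0 / (1 << exponent)


def raw_to_joules(raw_count: int, exponent: int) -> float:
    """Convert a raw energy counter reading to joules."""
    return raw_count * energy_unit_joules(exponent)


# Illustrative register values (assumptions, not measured on real hardware):
# exponent 16 corresponds to the ~15.3 uJ unit, exponent 14 to the ~61 uJ unit.
SAMPLE_MSR_EXPONENT_16 = 0xA1003
SAMPLE_MSR_EXPONENT_14 = 0xA0E03
```

On a real system the register value would have to be read from the hardware first, which typically requires elevated privileges.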
One of the key features of RAPL is the ability to set power limits for each domain. These limits can be enforced by the processor’s hardware, which will throttle the performance of the corresponding components to stay within the specified power budget. This feature is particularly useful in scenarios where power consumption needs to be capped, such as in data centres or mobile devices, to manage thermal dissipation and battery life. [2]
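On Linux, the power limits described above are usually exposed through the powercap sysfs interface. The sketch below reads the limit and time window of one constraint. It assumes the standard powercap file names (constraint_N_power_limit_uw and constraint_N_time_window_us); the zone directory passed in, and the function name itself, are hypothetical and depend on the system.

```python
from pathlib import Path


def read_power_constraint(zone_dir: str, constraint: int = 0) -> dict:
    """Read one power constraint of a powercap zone.

    zone_dir is the directory of a RAPL zone, for example
    /sys/class/powercap/intel-rapl:0 (assumed location, varies by system).
    """
    zone = Path(zone_dir)
    limit_uw = int((zone / f"constraint_{constraint}_power_limit_uw").read_text().strip())
    window_us = int((zone / f"constraint_{constraint}_time_window_us").read_text().strip())
    return {
        "power_limit_w": limit_uw / 1_000_000,   # microwatts -> watts
        "time_window_s": window_us / 1_000_000,  # microseconds -> seconds
    }


# Example call on a machine that exposes the interface:
#   read_power_constraint("/sys/class/powercap/intel-rapl:0")
```

Changing a limit would mean writing a new microwatt value to the same power limit file, which normally requires root privileges and should be done with care.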
RAPL provides a more accurate and fine-grained way to measure power consumption compared to other methods like IPMI (Intelligent Platform Management Interface). However, it does have some limitations, such as the potential for counter overflow due to the 32-bit register size, non-atomic updates of registers, and the lack of individual core-level measurements. Additionally, in virtualised environments like cloud instances, the RAPL readings may be intercepted or modified by the hypervisor, potentially affecting their accuracy. [3]
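The counter overflow mentioned above can be handled in software as long as the counter is sampled often enough that it wraps at most once between two readings. A minimal sketch under that assumption follows; the function names are illustrative.

```python
def counter_delta(prev_raw: int, curr_raw: int, bits: int = 32) -> int:
    """Counter increments between two samples, tolerating a single wraparound."""
    return (curr_raw - prev_raw) % (1 << bits)


def average_power_watts(prev_raw: int, curr_raw: int,
                        joules_per_unit: float, interval_s: float) -> float:
    """Average power over the sampling interval, in watts."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return counter_delta(prev_raw, curr_raw) * joules_per_unit / interval_s
```

As a rough guide, a 32-bit counter with a 61 μJ unit covers about 262 kJ, so at a sustained 100 W it would wrap after roughly 44 minutes; with a 15.3 μJ unit the same load would wrap it after about 11 minutes. Sampling every few seconds leaves a wide margin in both cases.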
Despite these limitations, RAPL has become a widely used power measurement and optimisation tool, particularly in energy-efficient computing. It enables developers and researchers to analyse the power consumption of their software and hardware, identify bottlenecks, and explore optimisation techniques to improve energy efficiency. [4]
References
[1] https://stackoverflow.com/questions/67925368/how-does-intels-rapl-estimate-the-power-consumption
[2] https://www.devsustainability.com/p/paper-notes-rapl-in-action
[3] https://powerapi.org/reference/formulas/rapl/
[5] https://github.com/mhirki/rapl-tools
[6] https://firefox-source-docs.mozilla.org/performance/tools_power_rapl.html
[7] https://luiscruz.github.io/2021/07/20/measuring-energy.html
[8] https://github.com/hubblo-org/scaphandre/blob/main/README.md
[9] https://www.devsustainability.com/p/paper-notes-rapl-in-action
Relevance for EXIGENCE
RAPL can be used to attribute energy consumption to compute jobs or application sessions. Doing so requires additional logic that processes the RAPL data and combines it with information on which jobs and sessions use a power-consuming resource, how they use it, and for how long. The following tools (the list may not be exhaustive) use the Running Average Power Limit (RAPL) interface provided by Intel processors and process the data it provides:
- RAPL-tools: This is a collection of tools for experimenting with RAPL, including AppPowerMeter for measuring the energy and power consumption of an application, and PowerMonitor for monitoring system-wide CPU power. [5]
- Mozilla’s RAPL tool: This is a command-line utility in the Mozilla codebase that periodically reads and prints all available Intel RAPL power estimates. [6]
- Intel PowerLog: This is a command-line tool that comes bundled with the Intel Power Gadget software. It uses RAPL to log power and energy consumption data to a CSV file while running a specified command. [7]
- Likwid: This is a Linux-based tool that uses the RAPL interface to fetch energy and power measurements from different domains of an Intel CPU. [7]
- CodeCarbon: This Python library collects energy data for Python code by using RAPL under the hood (for Intel devices). [7]
- Scaphandre: Written in Rust, it relies on the RAPL energy counters under the hood. It can expose metrics through Prometheus and several other performance monitoring and display tools. [8]
- Kepler: Allows monitoring of power consumption at the Pod level in a Kubernetes cluster. It uses raw energy measurement data from RAPL, SPECPower estimations, hardware sensor monitor data, and information from GPUs.
Additionally, the paper “RAPL In Action” [9] discusses the advantages and limitations of using RAPL for measuring CPU power consumption. It mentions that many energy profilers, including PowerLog, Power Gadget, Powerstat, and PowerTop, internally use RAPL for Intel devices.
For EXIGENCE, RAPL and the tools listed above are directly relevant to energy measurement.
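To illustrate the attribution step described at the start of this section, the sketch below splits a package-level energy reading across jobs in proportion to the CPU time each job used during the measurement interval. Proportional CPU-time sharing is an assumed, deliberately simple model chosen for illustration. It is not the method of any tool listed above, and it ignores idle power and non-CPU components.

```python
def attribute_energy(package_joules: float,
                     cpu_seconds_by_job: dict[str, float]) -> dict[str, float]:
    """Split package energy across jobs by their share of CPU time.

    Assumption: energy scales linearly with CPU time, which is only
    a rough approximation of real behaviour.
    """
    total_cpu = sum(cpu_seconds_by_job.values())
    if total_cpu == 0:
        # No job ran in this interval, so nothing is attributed.
        return {job: 0.0 for job in cpu_seconds_by_job}
    return {
        job: package_joules * cpu_s / total_cpu
        for job, cpu_s in cpu_seconds_by_job.items()
    }
```

For example, with 100 J measured at package level and two hypothetical jobs that used 3 s and 1 s of CPU time, the model assigns 75 J and 25 J respectively.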