# A Simplified Method for Instruction Level Energy Estimation for Embedded System

V. A. Kulkarni and G. R. Udupi

*Abstract*—Embedded systems are used extensively in all spheres of life. Size, cost and power are the major issues in the design and marketing of these products. In an embedded system, the processor has to perform a given task repeatedly. Power optimization needs to be achieved not only at the hardware level but also at the software level, because software contributes substantially to the overall power consumption of an embedded system. In this paper, a simplified method for instruction level energy estimation is presented for the ARM Cortex-M4 processor. The estimates are compared with measurements on micro benchmark programs. For a micro benchmark program that mixes instructions from different groups, the error in the estimated energy consumption is less than 2% in magnitude.

*Index Terms*—Current Measurement; Embedded System; Energy Estimation; Instruction Set.

## I. INTRODUCTION

Embedded systems are used in a variety of applications such as cell phones, PDAs, medical equipment and many more. The processor embedded in them performs a task repeatedly. The majority of these devices are battery powered, so energy consumption becomes an important design parameter. Lower energy consumption not only increases the time between successive recharges but also results in less heat dissipation. A formula based on the Arrhenius law suggests that the life expectancy of a component decreases by 50% for every 10°C increase in temperature. Thus, reducing a component's operating temperature by the same amount (by consuming less energy) doubles its life expectancy [1]. Lower power consumption reduces heat dissipation, which allows low-cost packaging and cooling methods and increases device reliability. Since software is responsible for a large portion of the system energy consumption, accurate energy estimation of software is necessary for optimization of system energy [2].

The two main methods of embedded software energy estimation are measurement based and simulation based. In the simulation based approach, a simulation model of the target hardware is used. Its major drawbacks are that simulation models are not available for all hardware modules and, where they are available, their cost is very high. Measurement based methods use data obtained by conducting experiments on the target platform. Their advantage is high accuracy.

Power consumption models of processor software can be categorized as low-level models and high-level models. Low-level models are also called hardware models. The levels of abstraction in this category are: circuit/transistor level, logic gate level, RT level and architectural level. High-level models, in contrast, use instructions and functional units from the software perspective, and electrical knowledge of the architecture is not required. The existing high-level power estimation models can be classified as Instruction Level Power Analysis (ILPA) and Functional Level Power Analysis (FLPA). In ILPA, a power consumption model is associated with instructions or instruction pairs. The power consumed by a program running on the processor can be estimated using the model. The large number of experiments required to obtain the model is its major drawback.

In FLPA, the processor is separated into functional blocks. Several measurements or simulations are conducted to obtain mathematical functions. Through these mathematical functions, the power consumption of every block is characterized [3].

In ILPA, each instruction is assigned a base cost, which is the energy consumed while executing that instruction. When two instructions are executed in sequence, the total energy is more than (and in some cases less than) the sum of the base costs of the two instructions. This difference is called the inter-instruction cost. The total energy consumption of a program can be calculated by adding the base costs and inter-instruction costs of all its instructions. Other energy-sensitive factors must also be taken into account.
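The ILPA summation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tool: the function name `estimate_program_energy`, the instruction mnemonics chosen, and all cost values are hypothetical placeholders rather than measured data.

```python
from typing import Dict, List, Tuple


def estimate_program_energy(
    program: List[str],
    base_cost: Dict[str, float],
    inter_cost: Dict[Tuple[str, str], float],
) -> float:
    """Return the sum of base costs plus inter-instruction costs (nJ)."""
    total = sum(base_cost[op] for op in program)
    # Add the overhead of every adjacent instruction pair in program order.
    for prev, curr in zip(program, program[1:]):
        # Pairs without an entry contribute nothing; entries may be negative.
        total += inter_cost.get((prev, curr), 0.0)
    return total


# Hypothetical costs in nJ, for illustration only (not measured values).
BASE = {"ADD": 1.0, "MUL": 1.5, "LDR": 2.0}
INTER = {("ADD", "MUL"): 0.25, ("MUL", "LDR"): -0.125}

energy = estimate_program_energy(["ADD", "MUL", "LDR"], BASE, INTER)
print(f"Estimated energy: {energy:.3f} nJ")
```

The pair table is keyed by ordered pairs, so the sketch does not assume that the cost of (A, B) equals the cost of (B, A).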

In this paper, a simplified approach to software energy estimation for the ARM Cortex-M4 processor is presented. The rest of the paper is organized as follows. Section II reviews the related work. Section III explains the measurement details and the method adopted for measurement. The experimental results are presented in Section IV and validated in Section V. Section VI concludes the paper.

## II. RELATED WORKS

Tiwari *et al.* [2] estimated average power using a standard off-the-shelf, dual-slope integrating digital multimeter (DMM). The voltage measured across a small known resistance while an instruction executes gives the current drawn by that instruction. A loop containing repeated instances of the same instruction is executed to minimize the effect of branching and to obtain a stable reading. A power reduction of up to 40% is obtained by rewriting the code.

Chang *et al.* [4] used a more complex circuit for cycle-accurate measurement, based on charge transfer using switched capacitors. Voltages $V_1$ and $V_2$ are measured across a previously charged capacitor at the beginning and end of instruction execution, respectively. The instruction energy is the drop in the energy stored in the capacitor, $E = \frac{1}{2} C (V_1^2 - V_2^2)$. The approach is validated against the DMM method, and the measurement errors are found not to exceed 2-3%. An energy consumption model for the ARM7 processor is obtained.

Published on May 30, 2017.

V. A. Kulkarni is with AMGOI, India (e-mail: vak@amgoi.edu.in). Dr. G. R. Udupi is with GIT, India (e-mail: grudupi@git.edu).

DOI: http://dx.doi.org/10.24018/ejers.2017.2.5.359

Kavvadias *et al.* [5] used a current mirror circuit with BJTs and a high frequency digital storage oscilloscope to measure instantaneous current. This arrangement eliminates the voltage fluctuations caused by a series resistor inserted in the processor power line, and considerably increases the resolution of the measurements. The inter-instruction cost is found to be 5% to 15% of the base cost. Lack of symmetry and negative values of the inter-instruction cost are observed. The error is found to be up to 1.5%.

Konstantakos *et al.* [6] considered the number of execution cycles instead of the type of instructions executed. Energy is computed by a polynomial expression. A deviation of 0.5% between estimated and measured energy is observed for a system consisting of memory, a microcontroller and an ADC.

Bazzaz *et al.* [7] modeled the energy consumption of the CPU, Flash memory, SRAM and the memory controller. Since the inter-instruction energy cost is about 5% of the base instruction cost, detailed estimation of the inter-instruction cost is not carried out, which simplifies the model. A digitizing oscilloscope is used to read the voltage difference across a precision resistor placed between the power supply and the core supply pin of the processor. The model has 38 parameters, which makes it difficult to use. The energy model for the ARM7TDMI uses 60 specialized tests to estimate the coefficients of each energy-sensitive factor. The MiBench benchmark suite is used to validate the results on different embedded applications. An estimation error of less than 6% is reported.

Wang *et al.* [8] considered the average power of a program instead of individual instructions. The inter-instruction effect is not considered, which saves many calculations and measurements. Power measurements are carried out with a 0.51 Ω series resistor between the power supply and the CPU. A digitizing oscilloscope with a sample rate of 2 GHz is used to measure the instantaneous power. For the base cost, a loop consisting of 2000 instructions (8 KB in size) is considered. Six benchmarks are used to validate the model, which shows a maximum estimation error of -8.28% and an absolute estimation error of 4.88%.

Bogdanov [9] treated the microprocessor as a black box. Current is measured with a shunt resistor and a differential amplifier. A look-up table of the target ISA with energy costs is implemented. The energy consumed by each instruction is calculated as the average current consumption $I_{dd}$ multiplied by its execution time and by the supply voltage $V_{dd}$. To get a stable reading, each instruction is repeated 1326 times. With the energy model simulations, a relative error of 5% is achieved. However, an absolute error of 20-30% is measured because of the simple and basic method used for the measurements.

## III. MEASUREMENT METHOD

Electrical energy is the electrical power converted over a given time. Since power may change continuously during an interval, the energy over that interval is the integral of the power. The unit of power is the watt and the unit of time is the second, so the unit of electrical energy is the watt-second, or joule. Instantaneous power is the product of instantaneous voltage and instantaneous current, $P(t) = V(t) \cdot I(t)$. Electrical energy can therefore be measured by measuring voltage and current over a known period of time. In most embedded systems the supply voltage is constant, so energy measurement reduces to current and time measurement.
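With a constant supply voltage, the energy integral reduces to the supply voltage times the integral of the current, which a sampled measurement approximates by a sum. The sketch below assumes uniformly spaced current samples; the function name and the sample values (a constant 10 mA for 1 s at a 5 µs sample period, with a 3.3 V supply) are illustrative inputs, not measured data.

```python
import math


def energy_joules(current_samples_a, v_dd, sample_period_s):
    """Rectangle-rule approximation of E = V_dd * integral of I(t) dt."""
    # fsum keeps rounding error small when adding many samples.
    return v_dd * math.fsum(current_samples_a) * sample_period_s


# Illustrative input: a constant 10 mA drawn for 1 s, sampled every 5 us.
samples = [0.010] * 200_000
energy = energy_joules(samples, v_dd=3.3, sample_period_s=5e-6)
print(f"{energy * 1e3:.1f} mJ")  # 33.0 mJ for these illustrative inputs
```

For a constant current the sum collapses to $E = V_{dd} \cdot I \cdot t$, which is the form used when an average current is measured over a fixed interval.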

An ARM Cortex-M4 based microcontroller is used for the experiments. The ARM Cortex-M4 is a 32-bit core with a 3-stage pipeline and Harvard architecture. An on-board current measurement circuit is used for energy calculation; it increases the accuracy of the measurements and overcomes many of the limitations of current measurement reported in the literature. The circuit consists of a MAX9634T current monitor chip and a 12-bit ADC sampling at 50 ksps to 200 ksps. The MAX9634 multiplies the sense voltage by 25 to provide a voltage range suitable for the ADC to measure. A sample rate of 200 ksps (5 µs period) is chosen for all measurements. The average current over a period of 1 second is used for energy calculation.
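As a rough illustration of this measurement chain, the sketch below converts raw 12-bit ADC codes into supply current and averages them. The gain of 25 comes from the text; the ADC reference voltage `V_REF` and the sense resistor value `R_SENSE` are not stated in the paper and are assumed here purely for illustration, as are the function names.

```python
GAIN = 25.0       # sense-voltage gain of the current monitor (from the text)
ADC_BITS = 12     # ADC resolution (from the text)
V_REF = 3.3       # ASSUMED ADC reference voltage in volts
R_SENSE = 1.0     # ASSUMED sense resistor value in ohms


def adc_code_to_current(code, v_ref=V_REF, r_sense=R_SENSE, gain=GAIN):
    """Convert one raw ADC code into the sensed current in amperes."""
    full_scale = 2**ADC_BITS - 1
    if not 0 <= code <= full_scale:
        raise ValueError("ADC code out of range")
    v_out = code / full_scale * v_ref      # amplifier output voltage
    return v_out / (gain * r_sense)        # undo the gain, apply Ohm's law


def average_current(codes):
    """Average current in amperes over a block of ADC samples."""
    return sum(adc_code_to_current(c) for c in codes) / len(codes)


# Under the assumed values, full scale corresponds to 3.3 / 25 = 0.132 A.
print(f"{adc_code_to_current(4095) * 1e3:.1f} mA")
```

With real hardware, `V_REF` and `R_SENSE` would be taken from the board schematic rather than from these placeholder values.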

To find the base cost, each instruction is executed 1000 times in a loop. This minimizes the effect of the "BL loop" instruction on the base cost. Calculating the inter-instruction cost involves a large number of measurements, given by $n(n-1)/2$, where $n$ is the number of instructions in the Instruction Set Architecture (ISA). For a microcontroller with 100 instructions, 4950 combinations must be measured to find the inter-instruction cost. This volume of measurement is tedious and time consuming. To overcome this problem, some researchers used NOP to find the inter-instruction cost, i.e. NOP is executed with the target instruction. This reduces the number of inter-instruction measurements to $n$, which saves time and resources. In some cases the inter-instruction cost is less than 5% and is therefore neglected [7]. In other cases it is found to be between 14% and 48% [10]. The total energy is taken as the sum of the static energy (the overall energy consumption of the platform with the core and other peripherals in the idle state), the base energy, the inter-instruction energy and the penalty due to resource constraints. Our experiments show that all costs other than the base cost together amount to 20%. This 20% is accounted for in the estimated energy, which simplifies the estimation process to a great extent.
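The measurement counts and the simplified estimate can be sketched as follows. The pair count follows directly from the formula in the text. The text does not state exactly how the 20% is applied, so the sketch assumes it is a fixed fraction added on top of the summed base energy; that interpretation, the function names and the example energies are assumptions for illustration.

```python
def pairwise_measurements(n):
    """Number of instruction pairs to measure for an ISA of n instructions."""
    return n * (n - 1) // 2


def simplified_estimate(base_energies_nj, overhead_fraction=0.20):
    """Base energy plus a fixed overhead covering all non-base costs.

    ASSUMPTION: the 20% figure is applied as a fraction of the base energy.
    """
    base = sum(base_energies_nj)
    return base * (1.0 + overhead_fraction)


print(pairwise_measurements(100))   # 4950 pairs for a 100-instruction ISA
# Hypothetical base energies in nJ, not measured values.
print(f"{simplified_estimate([10.0, 20.0]):.1f} nJ")
```

The NOP-based shortcut mentioned above would need only `n` measurements, one per instruction paired with NOP, instead of `pairwise_measurements(n)`.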

The Cortex-M4 instruction set can be divided into 9 groups: memory access, data processing, multiply and divide, saturating, packing and unpacking, bit field, branch and control, floating point, and miscellaneous instructions [11].

## IV. EXPERIMENTAL RESULTS

The base cost of an instruction is determined by executing, in an infinite loop, a block containing 1000 instances of the same instruction (to nullify the effect of the jump instruction). Experiments are carried out for the majority of the instructions of the ISA. $V_{dd}$ is 3.3 V and the cycle time is 0.083 µs (12 MHz). Figures 1 to 5 show the base cost for different groups of instructions. In all these figures, the y axis indicates current in mA.
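Since the figures report base cost as current, a per-instruction energy can be derived from the supply voltage and cycle time quoted above. The following sketch assumes the energy of one instruction is the average current times $V_{dd}$ times its execution time; the 10 mA current and the cycle counts in the example are hypothetical, not values read from the figures.

```python
V_DD = 3.3          # supply voltage in volts (from the text)
F_CLK_HZ = 12e6     # clock frequency, i.e. a cycle time of about 0.083 us


def instruction_energy_nj(avg_current_ma, cycles=1):
    """Energy of one instruction in nJ: E = I * V_dd * (cycles / f_clk)."""
    exec_time_s = cycles / F_CLK_HZ
    return (avg_current_ma * 1e-3) * V_DD * exec_time_s * 1e9


# Hypothetical example: 10 mA average current, single-cycle instruction.
print(f"{instruction_energy_nj(10.0):.2f} nJ")
```

Multi-cycle instructions are handled by passing their cycle count, so the same current reading yields a proportionally larger energy.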





Fig. 2. Base Cost of Data Processing Instructions



Fig. 3. Base Cost of Saturating, Packing & Bitfield Instructions.



Fig. 4. Base Cost of Multiply & Divide Instructions.



Fig. 5. Base Cost of Memory Access & Miscellaneous Instructions

## V. VALIDATION

To validate the results, two sets of experiments are conducted:

- i. In the first set, four programs are executed, each containing instructions from a single group (memory access, miscellaneous, multiply & divide, and data processing). For each program, the actual energy consumed is measured, the energy consumption is estimated and the % error is calculated. The only positive error, which is also the smallest in magnitude (0.18%), occurs for the program containing data processing instructions. The results are shown in Fig. 6.
- ii. In the second set, a program with instructions belonging to different groups is executed. The composition of the different groups in this micro benchmark program is shown in Fig. 7 and the results are given in Table I.



Fig. 6. Energy estimation for instructions of same group



Fig. 7. Composition of different group of instructions.

TABLE I: ENERGY ESTIMATION

| E<sub>estimated</sub> (nJ) | E<sub>actual</sub> (nJ) | % error  |
|----------------------------|-------------------------|----------|
| 36.08684                   | 36.6181                 | -1.45082 |

The error for the second micro benchmark program, which consists of instructions belonging to different groups, is found to be -1.45082%. In the first case, where all instructions of a program belong to the same group, the error is found to be -3.33% (memory access), -5.2% (miscellaneous), -0.86% (multiply & divide) and 0.18% (data processing). These results support the proposed estimation and measurement methodology.
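The percentage error can be reproduced from the two energies in Table I. The sketch assumes the error is defined as the estimated value minus the actual value, relative to the actual value, which matches the sign and size of the reported figure; the function name is illustrative.

```python
def percent_error(estimated, actual):
    """Signed estimation error relative to the measured value, in percent."""
    return (estimated - actual) / actual * 100.0


# Values taken from Table I (both in nJ).
error = percent_error(36.08684, 36.6181)
print(f"{error:.2f} %")
```

A negative result means the model underestimates the measured energy, which is the case for the mixed-group benchmark and for three of the four single-group programs.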

## VI. CONCLUSION AND FUTURE WORK

A simplified method for instruction level energy estimation for embedded systems is presented. With certain approximations established by experiments, the tedious and time consuming process of inter-instruction cost calculation is avoided. The error in estimation is found to be -1.45082% when executing the micro benchmark program, so a high accuracy is achieved. Future work could focus on covering all instructions of the ISA and on validating the results with different benchmarks covering diverse application areas.

#### REFERENCES

- [1] R. Ge, et al., "Performance-constrained Distributed DVS Scheduling for Scientific Applications on Power-aware Clusters," presented at the Proceedings of the 2005 ACM/IEEE conference on Supercomputing, 2005.
- [2] V. Tiwari, S. Malik, and A. Wolfe, "Power analysis of embedded Software: A first step towards software power minimization," IEEE Trans. VLSI Systems, vol. 2, no. 4, pp. 437–445, Dec. 1994.
- [3] V.A.Kulkarni and G.R.Udupi, "Instruction Level Power Consumption Estimation – Issues and Review", Journal of Multidisciplinary Engineering Science and Technology (JMEST) ISSN: 2458-9403 Vol. 4 Issue 2, February – 2017, pp 6776-6781.
- [4] Nachyuck Chang, Kwanho Kim, and Hyung Gyu Lee, "Cycle-Accurate Energy Measurement and Characterization with a Case Study of the ARM7TDMI", IEEE Trans. VLSI Systems, vol. 10, no. 2, April 2002, pp. 146-154.
- [5] Nikolaos Kavvadias, Periklis Neofotistos, Spiridon Nikolaidis, C. A. Kosmatopoulos, and Theodore Laopoulos, "Measurements Analysis of the Software-Related Power Consumption in Microprocessors", IEEE Trans. Instrum. Measurement, VOL. 53, NO. 4, AUGUST 2004, pp. 1106-1112.
- [6] V. Konstantakos, A. Chatzigeorgiou, S. Nikolaidis, T. Laopoulos, "Energy consumption estimation in embedded systems", IEEE Trans. Instrum. Measurement, VOL. 57, NO. 4, APRIL 2008 pp. 797–804.
- [7] Mostafa Bazzaz, Mohammad Salehi and Alireza Ejlali, "An Accurate Instruction-Level Energy Estimation Model and Tool for Embedded Systems", IEEE Trans. Instrum. Measurement, VOL. 62, NO. 7, JULY 2013, pp. 1927-1934.
- [8] Wang, Wei and Zwolinski, Mark (2014) "An improved instruction-level power model for ARM11 microprocessor". In, High Performance Energy Efficient Embedded Systems (HIP3ES), Berlin, DE, 23 Jan 2013. 7pp.
- [9] Lubomir Bogdanov, "Look-up Table-Based Microprocessor Energy Model", International Scientific Conference on Engineering, Technologies and Systems TECHSYS 2016, Technical University – Sofia, Plovdiv branch, 26 – 28 May 2016, Plovdiv, Bulgaria.
- [10] Momcilo V. Krunic, Miroslav V. Popovic, Vlado M. Krunic, Nenad B. Cetic, "Energy Consumption Estimation for Embedded Applications", ELEKTRONIKA IR ELEKTROTECHNIKA, ISSN 1392-1215, VOL. 22, NO. 3, 2016, pp. 44-49.
- [11] Cortex-M4 Devices, Generic User Guide, ARM Limited, 2010.
- [12] V. A. Kulkarni and G. R. Udupi, "Software Power Measurement of ARM Processor Based Embedded System", EJERS, European Journal of Engineering Research and Science Vol. 1, No. 5, November 2016, pp.5-9.
- [13] T. Chou and K. Roy, "Accurate Estimation of Power Dissipation in CMOS Sequential Circuits," IEEE Transaction VLSI Systems, vol. 4, pp. 369–380, September 1996.
- [14] F. N. Najm, "A Survey of Power Estimation Techniques in VLSI Circuits," IEEE Transactions on VLSI Systems, vol. 2, pp. 446–455, 1994.
- [15] Theodore Laopoulos, Periklis Neofotistos, C. A. Kosmatopoulos, and Spiridon Nikolaidis, "Measurement of Current Variations for the Estimation of Software-Related Power Consumption", IEEE Trans. Instrum. Measurement, VOL. 52, NO. 4, AUGUST 2003, pp. 1206-1212.
- [16] N. Kavvadias, P. Neofotistos, S. Nikolaidis, K. Kosmatopoulos, and T. Laopoulos, "Measurements analysis of the software-related power consumption of microprocessors," IEEE Trans. Instrum. Measurement, vol. 53, no. 4, pp. 1106–1112, Aug. 2004.



**V. A. Kulkarni** obtained an M.Tech in VLSI Design and Embedded Systems. He is pursuing a Ph.D. at Visvesvaraya Technological University, Belagavi, Karnataka, India. He is currently working as Associate Professor and Dean (Academics) at AMGOI, Maharashtra, India. He has published research articles in reputed international peer-reviewed journals and has 20 years of teaching experience.



**Dr. G. R. Udupi** is working as Professor in the Department of Electronics and Communication Engineering, GIT, Belagavi, India. He has 33 years of teaching experience. His areas of interest include fuzzy logic, ANN, biomedical instrumentation and power quality. He has more than 30 international journal papers to his credit. He has supervised the completion of 2 PhDs and is guiding 6 research scholars. He is a member of IEEE, BMSI, IETE and IE(I).