Chapter 5 Bare-metal firmware build and boot process (CMSIS Core(M))
5.1 Introduction
The following chapter examines a bare-metal firmware example, which is based on the Cortex Microcontroller Software Interface Standard (CMSIS). CMSIS is a “vendor-independent hardware abstraction layer for microcontrollers that are based on Arm Cortex processors.” Chances are high that you are going to encounter CMSIS in your embedded research projects, and even if not, the overall concepts shown here should be largely the same.
Since the linker file is crucial for understanding the firmware layout, we are going to look into the linker file and the CMSIS source code in parallel and examine how they work together to form the firmware. We are examining a secure world firmware, but again: the concepts for non-secure and secure firmware are the same; the non-secure side only lacks the non-secure callable region.
Topics and concepts covered in this chapter:
- The Boot process
- Initialization of the RAM layout (.data, .bss)
- Setup of the stack and heap
- Setup of the non-secure callable region
The following example is based on an STM32L5 SoC; however, it is only used to have a descriptive, real-world example with real memory addresses to explain the concepts. Everything covered here is very similar for other systems and vendors.
5.2 Memory segments
In this chapter we will walk through an actual linker script, which is used in example-stm32-p1. The linker files for non-secure and secure world are pretty similar, so we’ll leave the detailed examination of the non-secure linker script as an exercise for the reader and only examine the secure world linker script.
Let’s assume that we have decided to use the following memory map for our firmware:
Type | Segment name | Security attribute | From..to | Section overview |
---|---|---|---|---|
Flash | ROM_NSC | NSC | 0x0C03 E000 - 0x0C03 FFFF | Explored in chapter 5.2.3 |
Flash | ROM | S | 0x0C00 0000 - 0x0C03 DFFF | Explored in chapter 5.2.1 |
SRAM | RAM | S | 0x3000 0000 - 0x3001 7FFF | Explored in chapter 5.2.2 |
The system memory map is predefined at a high level by Arm for M-profile architectures: see chapter 3.5. The further separation of the memory map into secure / non-secure / NSC depends on SoC design and configuration. An example in which bit[28] of the address is used to define the security attribution of the respective memory address is shown in 4.5.
Let’s start our examination of the secure world linker script on the very top. The following code block shows the beginning of the script:
→ example-stm32-p1(/P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
/* Entry Point */
ENTRY(Reset_Handler)
/* Highest address of the user mode stack */
_estack = ORIGIN(RAM) + LENGTH(RAM); /* end of "RAM" Ram type memory */
_Min_Heap_Size = 0x200 ; /* required amount of heap */
_Min_Stack_Size = 0x400 ; /* required amount of stack */
/* Memories definition */
MEMORY
{
RAM (xrw) : ORIGIN = 0x30000000, LENGTH = 96K /* Memory is divided. Actual start is 0x30000000 and actual length is 256K */
ROM (rx) : ORIGIN = 0x0C000000, LENGTH = 248K /* Memory is divided. Actual start is 0x0C000000 and actual length is 512K */
ROM_NSC (rx) : ORIGIN = 0x0C03E000, LENGTH = 8K /* Non-Secure Call-able region */
}
The ENTRY command sets the entry point of the executable object file, which is the location where program execution starts. In M-profile cores, execution starts at the address that is loaded into PC from the reset vector. In the present example the vector table, and with it the reset vector, is placed at the start of ROM at 0x0C00.0000.
The assignment to _estack defines a symbol and calculates the address the initial stack pointer will point to. Loading the initial stack pointer into SP is the very first thing the core does after coming out of reset. The initial stack pointer value is the first entry of the vector table, directly before the reset vector.
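The arithmetic behind `_estack` and the region boundaries of the memory map can be checked with a few lines of C. This is only a sketch of the calculation the linker performs; the function names `estack` and `region_last_byte` are made up for illustration, the numeric values are taken from the MEMORY command above.

```c
#include <stdint.h>

/* Values copied from the MEMORY command of the secure linker script. */
#define RAM_ORIGIN 0x30000000u
#define RAM_LENGTH (96u * 1024u)

/* Mirrors `_estack = ORIGIN(RAM) + LENGTH(RAM);` (hypothetical helper). */
uint32_t estack(uint32_t origin, uint32_t length)
{
    return origin + length;
}

/* Address of the last byte that still belongs to a region (hypothetical helper). */
uint32_t region_last_byte(uint32_t origin, uint32_t length)
{
    return origin + length - 1u;
}
```

With these values `estack(RAM_ORIGIN, RAM_LENGTH)` evaluates to 0x30018000, which is one byte past the last RAM address 0x30017FFF. That is fine for a stack that decrements SP before storing: the first pushed word lands at 0x30017FFC.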
In table 5.1 we defined three memory regions (segments) into which code and data should be placed. The MEMORY command defines these three segments. Chapter 5.2.1, chapter 5.2.2 and chapter 5.2.3 each describe one of the three segments by showing which output sections are written into the particular segment and what their purpose is.
- Chapter 5.2.1 examines the segment “ROM.”
- Chapter 5.2.2 examines the segment “RAM.”
- Chapter 5.2.3 examines the segment “ROM_NSC.”
Let’s quickly recap the relation between an object file’s input sections, output sections and segments:
5.2.1 ROM segment
The following output sections are written into the ROM segment:
Output section | Explanation |
---|---|
.isr_vector | Vector table. Important part of the boot process. See chapter 5.3. |
.text | Combines the following input sections: .text (compiled and assembled code); .glue_7 and .glue_7t (see chapter 2.12); .init and .fini (libc-related functions, see chapter 5.3). |
.rodata | Constants in the code. |
.ARM and .ARM.extab | Information for unwinding the stack when an exception is thrown in C++. For C projects these should be empty. |
.preinit_array, .init_array, .fini_array | Libc-related functions. See chapter 5.3 for details. |
.data | Initial values of initialized objects with static storage duration. Explored in chapter 5.3. |
5.2.2 RAM segment
The following output sections are written into the RAM segment:
Section | Explanation |
---|---|
.data | Details discussed in chapter 5.3 and chapter 5.6 |
._user_heap_stack | Details discussed in chapter 5.6 |
.bss | Details discussed in chapter 5.3 |
5.2.3 ROM_NSC segment
The following output sections are written into the ROM_NSC segment:
Section | Explanation |
---|---|
.gnu.sgstubs | See chapter 5.4 |
5.3 ROM segment and Boot Process
Terms explained in this chapter:
- .word
- .weak
- Load Memory Address (LMA)
- Virtual Memory Address (VMA)
We defined the address where the system starts booting to be 0x0C00.0000, which is the start address of the ROM segment. Since the linker script is processed from top to bottom, the very first output section the linker writes into the ROM segment is .isr_vector:
→ example-stm32-p1(P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
SECTIONS
{
/* The startup code into "ROM" Rom type memory */
.isr_vector :
{
. = ALIGN(8);
KEEP(*(.isr_vector)) /* Startup code */
. = ALIGN(8);
} >ROM
The section .isr_vector is defined in the CMSIS startup file named startup_<device>.s:
→ example-stm32-p1(P1_TZEN/Secure/Core/Startup/startup_stm32l552zetxq.s)
.section .isr_vector,"a",%progbits
.type g_pfnVectors, %object
.size g_pfnVectors, .-g_pfnVectors
g_pfnVectors:
.word _estack
.word Reset_Handler
.word NMI_Handler
→ example-stm32-p1(P1_TZEN/Secure/Core/Startup/startup_stm32l552zetxq.s)
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
.weak HardFault_Handler
.thumb_set HardFault_Handler,Default_Handler
More on Assembler Terms:
- Chapter 2.4.1: Assembler Terms
The label g_pfnVectors: marks the beginning of the exception vector list, which consists of .word <exception vector symbol name> directives. Each .word directive emits a 32-bit value, here the address of the named handler, into the vector table. In startup_<device>.s each of the exception handlers is declared as a .weak symbol and aliased to Default_Handler, which makes sure that Default_Handler is only used when there is no other symbol with the same name defined in any of the other linked object files. Hence, to implement an exception handler you just have to define a function with the name of the particular exception vector. This overrides the weakly bound Default_Handler. If an exception handler is not implemented, Default_Handler is used.
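The weak-symbol mechanism can also be expressed in C. The following is a minimal sketch, assuming a GCC or Clang toolchain that supports `__attribute__((weak))`; the startup file itself uses the assembler directives `.weak` and `.thumb_set` instead, and the counter `default_hits` as well as the table `vectors_excerpt` exist only for this illustration.

```c
/* Counts how often the fallback handler ran (illustration only). */
int default_hits;

void Default_Handler(void)
{
    default_hits++;
}

/* Weak definition: if any other object file defines a (strong) function
 * named NMI_Handler, the linker picks that one and drops this fallback. */
__attribute__((weak)) void NMI_Handler(void)
{
    Default_Handler();
}

/* Excerpt of a vector table: every entry is just the address of a handler. */
typedef void (*vector_t)(void);
const vector_t vectors_excerpt[] = { NMI_Handler };
```

As long as no other translation unit provides its own `NMI_Handler`, calling `vectors_excerpt[0]()` ends up in `Default_Handler`.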
The Reset_Handler, for example, is defined in the startup_<device>.s file (see below).
_estack, the first value in the vector table, is the initial value of the main stack pointer. This symbol was defined at the very beginning of the linker script. See chapter 5.2.
Now that we know how the exception vectors are set up, let’s look at what happens on reset. The core is set up to read its vector table from 0x0C00 0000. On reset, the core first loads the first entry of the vector table (vector_table[0]), the initial main stack pointer, into the SP register. Then it loads the second entry (vector_table[1]), the address of Reset_Handler, into PC and starts executing there.
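The reset sequence can be modelled in a few lines of C. This is a simplified model of the hardware behaviour, not code that runs on the target; the struct `core_regs` and the function `reset_fetch` are hypothetical names chosen for this sketch.

```c
#include <stdint.h>

/* The two registers that the core initializes from the vector table. */
struct core_regs {
    uint32_t sp;
    uint32_t pc;
};

/* Model of the reset sequence: entry 0 becomes the main stack pointer,
 * entry 1 (the reset vector) becomes the program counter. */
struct core_regs reset_fetch(const uint32_t *vector_table)
{
    struct core_regs regs;
    regs.sp = vector_table[0];
    /* Bit 0 of a vector marks Thumb state and is not part of the address. */
    regs.pc = vector_table[1] & ~1u;
    return regs;
}
```

For a table starting with `{ 0x30018000, 0x0C000201 }` (the second value is an invented Reset_Handler address), the model yields SP = 0x30018000 and PC = 0x0C000200.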
The core register VTOR can be used to relocate the vector table from its default to a new base address. Armv8-M cores support this feature, but in Armv6-M VTOR is optional; on cores without it, the vector table is fixed at 0x0000.0000 and can’t be moved. Some SoC designers work around that by implementing a remapping feature in their bootloader.
→ example-stm32-p1(P1_TZEN/Secure/Core/Startup/startup_stm32l552zetxq.s)
.section .text.Reset_Handler
.weak Reset_Handler
.type Reset_Handler, %function
Reset_Handler:
ldr sp, =_estack /* set stack pointer */
/* Copy the data segment initializers from flash to SRAM */
movs r1, #0
b LoopCopyDataInit
CopyDataInit:
ldr r3, =_sidata
ldr r3, [r3, r1]
str r3, [r0, r1]
adds r1, r1, #4
LoopCopyDataInit:
ldr r0, =_sdata
ldr r3, =_edata
adds r2, r0, r1
cmp r2, r3
bcc CopyDataInit
ldr r2, =_sbss
b LoopFillZerobss
/* Zero fill the bss segment. */
FillZerobss:
movs r3, #0
str r3, [r2], #4
LoopFillZerobss:
ldr r3, = _ebss
cmp r2, r3
bcc FillZerobss
/* Call the clock system intitialization function.*/
bl SystemInit
/* Call static constructors */
bl __libc_init_array
/* Call the application's entry point.*/
bl main
LoopForever:
b LoopForever
.size Reset_Handler, .-Reset_Handler
Reset_Handler starts and sets up the .data and .bss sections:
- copy the .data section from flash into SRAM (CopyDataInit: and LoopCopyDataInit:)
- initialize .bss with zeros in SRAM
The symbol _sidata holds the address in flash where .data is written to when the system is flashed. The symbols _sdata and _edata, on the other hand, hold the start and end address of the region in SRAM into which .data is copied by the Reset_Handler in the startup code. The labels CopyDataInit: and LoopCopyDataInit: mark the routines which copy the contents of .data from flash into SRAM.
→ example-stm32-p1(P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
/* Used by the startup to initialize data */
_sidata = LOADADDR(.data);
/* Initialized data sections into "RAM" Ram type memory */
.data :
{
. = ALIGN(8);
_sdata = .; /* create a global symbol at data start */
*(.data) /* .data sections */
*(.data*) /* .data* sections */
. = ALIGN(8);
_edata = .; /* define a global symbol at data end */
} >RAM AT> ROM
In many systems there are different types of memory, most often flash and RAM. Firmware is programmed into flash, and during reset the system is initialized and some data is copied from flash into RAM, e.g. the .data section. The linker supports this by distinguishing between a Virtual Memory Address (VMA) and a Load Memory Address (LMA). The VMA is the address a section has when the program runs, i.e. where its data is copied to; the LMA is the address the section is loaded from, i.e. where it is stored in the image. In systems where the whole ELF file is loaded into RAM, LMA and VMA do not differ, which is the case on your personal computer at home, for example.
By specifying >RAM AT> ROM, we define that .data has its VMA in RAM and its LMA in ROM, i.e. it is copied from ROM into RAM. In the linker script, LOADADDR() returns the LMA of a section. In our example, the symbol _sidata is set to the LMA of .data, and the startup code in the reset handler uses it as the address from which to read .data.
→ example-stm32-p1(P1_TZEN/Secure/Core/Startup/startup_stm32l552zetxq.s)
b LoopCopyDataInit
CopyDataInit:
ldr r3, =_sidata
ldr r3, [r3, r1]
str r3, [r0, r1]
adds r1, r1, #4
LoopCopyDataInit:
ldr r0, =_sdata
ldr r3, =_edata
adds r2, r0, r1
cmp r2, r3
bcc CopyDataInit
The two routines load R3 with the LMA (_sidata) and R0 with the VMA (_sdata) of .data. CopyDataInit: loads one word at a time from the LMA plus the offset in R1 and stores it at the VMA plus the same offset in RAM. R1 holds the byte offset of the next word and is incremented by 4 after each copy; the loop finishes as soon as _sdata + R1 reaches _edata, i.e. when all words are copied.
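For readers who prefer C over assembly, the copy loop can be sketched as follows. This is an equivalent written for illustration, not the code used by the startup file; the function name `copy_data_init` is made up, and its parameters stand in for the linker symbols _sidata, _sdata and _edata.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of .data from their load address (LMA, flash)
 * to their run-time address (VMA, SRAM), one 32-bit word at a time. */
void copy_data_init(const uint32_t *sidata, uint32_t *sdata, uint32_t *edata)
{
    size_t words = (size_t)(edata - sdata);

    /* The index i corresponds to the byte offset in R1 divided by 4. */
    for (size_t i = 0; i < words; i++)
        sdata[i] = sidata[i];   /* ldr r3, [r3, r1] ; str r3, [r0, r1] */
}
```

Like the assembly, the sketch relies on .data being word aligned, which the `. = ALIGN(8);` statements in the linker script guarantee.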
Next up is the .bss section:
→ example-stm32-p1(P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
/* Uninitialized data section into "RAM" Ram type memory */
. = ALIGN(8);
.bss :
{
/* This is used by the startup in order to initialize the .bss section */
_sbss = .; /* define a global symbol at bss start */
__bss_start__ = _sbss;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(8);
_ebss = .; /* define a global symbol at bss end */
__bss_end__ = _ebss;
} >RAM
We already learned in chapter 2 that .bss does not occupy space in the firmware image. All variables in .bss are initialized with zero at startup, as you can see for yourself in the routines labeled FillZerobss and LoopFillZerobss in the startup code:
→ example-stm32-p1(P1_TZEN/Secure/Core/Startup/startup_stm32l552zetxq.s)
ldr r2, =_sbss
b LoopFillZerobss
/* Zero fill the bss segment. */
FillZerobss:
movs r3, #0
str r3, [r2], #4
LoopFillZerobss:
ldr r3, = _ebss
cmp r2, r3
bcc FillZerobss
The startup code uses the symbols provided by the linker script, which define the start (_sbss) and end address (_ebss) of the output section .bss, to initialize the section in RAM with zero. Nothing is loaded from flash.
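The zero-fill loop translates to C just as directly. Again this is only an illustrative equivalent; `zero_bss` is a hypothetical name and its parameters stand in for the linker symbols _sbss and _ebss.

```c
#include <stdint.h>

/* Clear every word between the start and the end of .bss. */
void zero_bss(uint32_t *sbss, uint32_t *ebss)
{
    for (uint32_t *p = sbss; p < ebss; p++)   /* cmp r2, r3 ; bcc FillZerobss */
        *p = 0u;                              /* str r3, [r2], #4 */
}
```

The end pointer is exclusive: the word at `ebss` itself is not touched, which matches the `bcc` (unsigned lower) comparison in the assembly.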
After initializing .data and .bss, Reset_Handler calls SystemInit, which initializes the SAU (details in chapter 7.2) and VTOR for the NS world. Before calling main, __libc_init_array is called, which in turn calls all libc initialization functions located in the sections .preinit_array, .init_array and .init. These were also written by the linker into the ROM segment:
→ example-stm32-p1(example-stm32-p1/P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
.preinit_array :
{
. = ALIGN(8);
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP (*(.preinit_array*))
PROVIDE_HIDDEN (__preinit_array_end = .);
. = ALIGN(8);
} >ROM
.init_array :
{
. = ALIGN(8);
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array*))
PROVIDE_HIDDEN (__init_array_end = .);
. = ALIGN(8);
} >ROM
The symbols defined here, e.g. __init_array_start, are used by the C library (newlib, init.c) to calculate the number of entries in the array:
#ifdef HAVE_INITFINI_ARRAY
/* These magic symbols are provided by the linker. */
extern void (*__preinit_array_start []) (void) __attribute__((weak));
extern void (*__preinit_array_end []) (void) __attribute__((weak));
extern void (*__init_array_start []) (void) __attribute__((weak));
extern void (*__init_array_end []) (void) __attribute__((weak));
extern void _init (void);
/* Iterate over all the init routines. */
void
__libc_init_array (void)
{
  size_t count;
  size_t i;

  count = __preinit_array_end - __preinit_array_start;
  for (i = 0; i < count; i++)
    __preinit_array_start[i] ();

  _init ();

  count = __init_array_end - __init_array_start;
  for (i = 0; i < count; i++)
    __init_array_start[i] ();
}
#endif
Each address placed into the sections .preinit_array and .init_array is called via __preinit_array_start[i]() and __init_array_start[i](), respectively.
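The pattern of walking an array of function pointers that is delimited by a start and an end symbol can be tried out on a host machine. The sketch below uses an ordinary C array instead of linker-provided symbols; all names in it (`run_init_array`, `ctor_a`, `ctor_b`, `init_order`, `init_calls`) are invented for this example.

```c
#include <stddef.h>

typedef void (*init_fn)(void);

/* Call every function pointer in [start, end), in ascending order. */
void run_init_array(init_fn *start, init_fn *end)
{
    size_t count = (size_t)(end - start);

    for (size_t i = 0; i < count; i++)
        start[i]();
}

/* Two stand-ins for initialization functions that record their call order. */
int init_order[2];
int init_calls;

void ctor_a(void) { init_order[init_calls++] = 1; }
void ctor_b(void) { init_order[init_calls++] = 2; }
```

Given `init_fn table[] = { ctor_a, ctor_b };`, the call `run_init_array(table, table + 2)` runs `ctor_a` first and `ctor_b` second, just as the entries of .init_array are executed in the order the linker placed them.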
A graphical summary of this chapter:
5.4 Non-Secure Callable segment
Terms explained in this chapter:
- SG veneer
- CMSE import library
This chapter applies only to secure world firmware. As discussed in chapter 4, the security state (secure or non-secure) is determined by the memory region the processor executes from. When the core executes code from non-secure memory, the state is NS. When the processor executes code from a region that is attributed as secure, it runs in secure state. The security attribution of memory regions is defined by the IDAU and SAU.
A quick recap on the transition from NS to S. Three requirements must be met for a successful transition to secure world:
- the first instruction of the transition must be an SG instruction
- the processor must be in NS state when SG is executed
- the SG instruction must be located in the NSC region, which is the only region allowed to hold SG instructions
The following listing shows how the NSC region is defined in the linker script.
→ example-stm32-p1(P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
.gnu.sgstubs :
{
. = ALIGN(8);
*(.gnu.sgstubs*) /* Secure Gateway stubs */
. = ALIGN(8);
} >ROM_NSC
The output section .gnu.sgstubs is written into the ROM_NSC segment. The input sections .gnu.sgstubs.* contain the secure gateway veneers (SG veneers) generated by the linker. Let’s go through the process using a very simple example and assume you defined the following function in your secure world application:
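The original listing is not reproduced here, so the following is only a guess at what a minimal nsc-test.c could look like. The function name add10(int) and the attribute cmse_nonsecure_entry are taken from the surrounding text; the function body and the macro `NSC_ENTRY` are assumptions. The preprocessor guard merely allows the file to be compiled on a host without CMSE support as well.

```c
/* Sketch of nsc-test.c. Bit 1 of __ARM_FEATURE_CMSE is set when the
 * compiler targets the secure state (-mcmse). */
#if defined(__ARM_FEATURE_CMSE) && (__ARM_FEATURE_CMSE & 2)
#include <arm_cmse.h>
#define NSC_ENTRY __attribute__((cmse_nonsecure_entry))
#else
#define NSC_ENTRY /* host build: the attribute is not available */
#endif

/* Secure function that is meant to be callable from the non-secure world.
 * The body is assumed; any small function would do for this experiment. */
NSC_ENTRY int add10(int x)
{
    return x + 10;
}
```

When built with -mcmse, the attribute is what makes the compiler emit the additional __acle_se_add10 symbol.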
Compile the example with: arm-none-eabi-gcc -o nsc-test.o -mcpu=cortex-m33 --specs=nosys.specs -mcmse -mthumb -c nsc-test.c
To make a function in S world reachable from NS world, you must add the function attribute cmse_nonsecure_entry to it. The compiler then adds two global symbols for the function add10(int) to the symbol table (readelf -s nsc-test.o):
8: 00000001 34 FUNC GLOBAL DEFAULT 1 add10
9: 00000001 0 FUNC GLOBAL DEFAULT 1 __acle_se_add10
Using the following simple linker script we link nsc-test.o: arm-none-eabi-ld -o nsc-test --cmse-implib --out-implib=./nsclib.o -T gnustubs.ld nsc-test.o
gnustubs.ld:
MEMORY {
ROM_NSC (rx) : ORIGIN = 0x0C03E000, LENGTH = 8K
}
SECTIONS {
.gnu.sgstubs :
{
. = ALIGN(8);
*(.gnu.sgstubs*)
} >ROM_NSC
}
When the linker recognizes two symbols at the same address, one named $name and the other prefixed as __acle_se_$name, it creates a new input section named .gnu.sgstubs with an SG veneer for each such function. Obviously the shown linker script is far from complete, and the linker will use a lot of default values for the memory regions and output sections necessary. Normally GCC uses a default linker script, but when building a CMSE-enabled firmware we need to define the location of the non-secure callable segment explicitly to successfully build the firmware. In the example, the final binary will have the SG veneers placed in the ROM_NSC segment.
The parameters --cmse-implib and --out-implib=./nsclib.o create an import library named nsclib.o. This file contains a symbol for add10(int) without any code, pointing to the SG veneer in ROM_NSC:
Symbol table '.symtab' contains 2 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 0c03e041 8 FUNC GLOBAL DEFAULT ABS add10
You can now use nsclib.o when linking object files for the NS world. The linker resolves calls to add10() to the address of the SG veneer (0x0c03e041) in the NSC segment. Having an import library for the secure world firmware allows the non-secure firmware to be built completely independently of the secure world firmware. The developers of the secure world just pass the import library to the developers of the non-secure world; the toolchain used to develop the non-secure firmware does not even need to be CMSE aware.
Let’s quickly have a look at the disassembly of a secure function call. In the following example toggle_led_secure() is the secure function, which is called from the non-secure main(). The linker not only added an SG veneer in NSC, but also a long-branch veneer, leading to the call flow:
- main() (NS) calls __toggle_led_secure_veneer() (NS)
- __toggle_led_secure_veneer() (NS) calls toggle_led_secure() (NSC)
- toggle_led_secure() (NSC) branches to the implementation __acle_se_toggle_led_secure() (secure) (not shown in the following code example)
arm-none-eabi-objdump --disassemble=main P1_TZEN_NonSecure.elf
Disassembly of section .text:
08040238 <main>:
8040238: b580 push {r7, lr}
804023a: af00 add r7, sp, #0
804023c: f000 f9af bl 804059e <HAL_Init>
8040240: f000 f890 bl 8040364 <MX_GPIO_Init>
8040244: f000 f80c bl 8040260 <MX_ADC1_Init>
8040248: f000 f884 bl 8040354 <MX_RTC_Init>
804024c: f44f 7100 mov.w r1, #512 ; 0x200
8040250: 4802 ldr r0, [pc, #8] ; (804025c <main+0x24>)
8040252: f001 fb97 bl 8041984 <HAL_GPIO_TogglePin>
8040256: f002 f8c3 bl 80423e0 <__toggle_led_secure_veneer>
804025a: e7f7 b.n 804024c <main+0x14>
804025c: 42020000 .word 0x42020000
In main(), before jumping into NSC to the SG veneer, the linker inserted an additional veneer at __toggle_led_secure_veneer(). From previous chapters we know why:
- The SG veneer for toggle_led_secure() is at 0x0c03.e009.
- The bl __toggle_led_secure_veneer() in NS main() is at 0x0804.0256.
- That results in a relative jump of 0x03FF.DDB3 (0x0c03.e009 - 0x0804.0256).
- bl in Armv8-M allows offsets from -16777216 to 16777214 (0x00FF.FFFE).
- Since the jump is too far (0x03FF.DDB3 > 0x00FF.FFFE), a long-branch veneer (__toggle_led_secure_veneer()) is inserted.
- The SG veneer’s address is stored in the literal pool (0x80423ec) and loaded from there in __toggle_led_secure_veneer().
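The range check from the list above can be reproduced with a short C sketch. It follows the simplified arithmetic used in the text (target address minus address of the bl instruction) and ignores details such as the PC offset and the Thumb bit, which do not change the outcome here. The helper names `branch_distance` and `bl_in_range` are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset range of the Thumb bl instruction as quoted in the text. */
#define BL_MIN_OFFSET (-16777216)
#define BL_MAX_OFFSET 16777214

/* Distance as computed in the text: target minus address of the bl. */
int64_t branch_distance(uint32_t from, uint32_t to)
{
    return (int64_t)to - (int64_t)from;
}

/* True if a single bl instruction can reach the target. */
bool bl_in_range(uint32_t from, uint32_t to)
{
    int64_t distance = branch_distance(from, to);
    return distance >= BL_MIN_OFFSET && distance <= BL_MAX_OFFSET;
}
```

For the bl at 0x08040256, the SG veneer at 0x0C03E009 is out of range, while the long-branch veneer at 0x080423E0 is only 0x218A bytes away and therefore reachable.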
arm-none-eabi-objdump --disassemble=__toggle_led_secure_veneer P1_TZEN_NonSecure.elf
Disassembly of section .text:
080423e0 <__toggle_led_secure_veneer>:
80423e0: b401 push {r0}
80423e2: 4802 ldr r0, [pc, #8] ; (80423ec <__toggle_led_secure_veneer+0xc>)
80423e4: 4684 mov ip, r0
80423e6: bc01 pop {r0}
80423e8: 4760 bx ip
80423ea: bf00 nop
80423ec: 0c03e009 .word 0x0c03e009
The NS world firmware was linked against the import library, so the call to toggle_led_secure() in NS world was resolved to the address of the SG veneer at 0x0c03.e008 in NSC:
arm-none-eabi-objdump --disassemble=toggle_led_secure P1_TZEN_Secure.elf
P1_TZEN_Secure.elf: file format elf32-littlearm
0c03e008 <toggle_led_secure>:
c03e008: e97f e97f sg
c03e00c: f7c2 b942 b.w c000294 <__acle_se_toggle_led_secure>
The SG veneer switches to S world (SG) and then branches to the actual implementation of toggle_led_secure() at __acle_se_toggle_led_secure():
arm-none-eabi-objdump --disassemble=__acle_se_toggle_led_secure P1_TZEN_Secure.elf
Disassembly of section .text:
0c000294 <__acle_se_toggle_led_secure>:
c000294: b580 push {r7, lr}
c000296: af00 add r7, sp, #0
c000298: 2180 movs r1, #128 ; 0x80
c00029a: 481d ldr r0, [pc, #116] ; (c000310 <__acle_se_toggle_led_secure+0x7c>)
c00029c: f000 ffea bl c001274 <HAL_GPIO_TogglePin>
c0002a0: bf00 nop
c0002a2: 46bd mov sp, r7
c0002a4: e8bd 4080 ldmia.w sp!, {r7, lr}
c0002a8: 4670 mov r0, lr
c0002aa: 4671 mov r1, lr
c0002ac: 4672 mov r2, lr
c0002ae: 4673 mov r3, lr
c0002b0: eeb7 0a00 vmov.f32 s0, #112 ; 0x3f800000 1.0
c0002b4: eef7 0a00 vmov.f32 s1, #112 ; 0x3f800000 1.0
c0002b8: eeb7 1a00 vmov.f32 s2, #112 ; 0x3f800000 1.0
c0002bc: eef7 1a00 vmov.f32 s3, #112 ; 0x3f800000 1.0
c0002c0: eeb7 2a00 vmov.f32 s4, #112 ; 0x3f800000 1.0
c0002c4: eef7 2a00 vmov.f32 s5, #112 ; 0x3f800000 1.0
c0002c8: eeb7 3a00 vmov.f32 s6, #112 ; 0x3f800000 1.0
c0002cc: eef7 3a00 vmov.f32 s7, #112 ; 0x3f800000 1.0
c0002d0: eeb7 4a00 vmov.f32 s8, #112 ; 0x3f800000 1.0
c0002d4: eef7 4a00 vmov.f32 s9, #112 ; 0x3f800000 1.0
c0002d8: eeb7 5a00 vmov.f32 s10, #112 ; 0x3f800000 1.0
c0002dc: eef7 5a00 vmov.f32 s11, #112 ; 0x3f800000 1.0
c0002e0: eeb7 6a00 vmov.f32 s12, #112 ; 0x3f800000 1.0
c0002e4: eef7 6a00 vmov.f32 s13, #112 ; 0x3f800000 1.0
c0002e8: eeb7 7a00 vmov.f32 s14, #112 ; 0x3f800000 1.0
c0002ec: eef7 7a00 vmov.f32 s15, #112 ; 0x3f800000 1.0
c0002f0: f38e 8c00 msr CPSR_fs, lr
c0002f4: b410 push {r4}
c0002f6: eef1 ca10 vmrs ip, fpscr
c0002fa: f64f 7460 movw r4, #65376 ; 0xff60
c0002fe: f6c0 74ff movt r4, #4095 ; 0xfff
c000302: ea0c 0c04 and.w ip, ip, r4
c000306: eee1 ca10 vmsr fpscr, ip
c00030a: bc10 pop {r4}
c00030c: 46f4 mov ip, lr
c00030e: 4774 bxns lr
c000310: 52020800 .word 0x52020800
5.5 Calling non-secure world from secure world
This section is here as reference. The linker does not need to generate special regions for S to NS transitions. Everything you need to know is described in chapter 4.6.2.
5.6 Heap and Stack
From the linker file we can also learn a lot about the RAM layout of the device, for example where the heap and the stack will be located. The output sections listed in table 5.3 are written sequentially into the RAM segment, in the order they were defined in the linker script:
Section | Explanation |
---|---|
.data | See chapter 5.3 and chapter 5.6 |
.bss | See chapter 5.3 |
._user_heap_stack | See chapter 5.6 |
The output section ._user_heap_stack sets up the heap and stack regions. Three important symbols are defined at the beginning of the linker script:
- _estack
- _Min_Heap_Size
- _Min_Stack_Size
→ example-stm32-p1(example-stm32-p1/P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
/* Highest address of the user mode stack */
_estack = ORIGIN(RAM) + LENGTH(RAM); /* end of "RAM" Ram type memory */
_Min_Heap_Size = 0x200 ; /* required amount of heap */
_Min_Stack_Size = 0x400 ; /* required amount of stack */
/* Memories definition */
MEMORY
{
RAM (xrw) : ORIGIN = 0x30000000, LENGTH = 96K /* Memory is divided. Actual start is 0x30000000 and actual length is 256K */
ROM (rx) : ORIGIN = 0x0C000000, LENGTH = 248K /* Memory is divided. Actual start is 0x0C000000 and actual length is 512K */
ROM_NSC (rx) : ORIGIN = 0x0C03E000, LENGTH = 8K /* Non-Secure Call-able region */
}
They are used to set the size of the output section ._user_heap_stack:
→ example-stm32-p1(P1_TZEN/Secure/STM32L552ZETXQ_FLASH.ld)
/* User_heap_stack section, used to check that there is enough "RAM" Ram type memory left */
._user_heap_stack :
{
. = ALIGN(8);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(8);
} >RAM
The location counter . in a linker script always contains the current output location. When you advance the location counter, as done here, the location moves and reserves free space in the output section.
The final layout of the RAM segment:
- The segment starts at 0x3000.0000, as defined in the MEMORY command of the linker script.
- First .data, then .bss is written into RAM.
- Then the location counter is moved forward by _Min_Heap_Size and _Min_Stack_Size.
This results in the RAM layout shown in figure 5.3.
If the sum of the sizes of .data, .bss, stack and heap exceeds the LENGTH of RAM, the linker reports an error.
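The bookkeeping the linker does for the RAM segment can be imitated in C. The sketch below advances a location counter over .data, .bss and ._user_heap_stack and compares the result with the length of RAM. The function names are hypothetical, and the section sizes used in the usage note are merely illustrative values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round v up to the next multiple of 8, like `. = ALIGN(8);`. */
uint32_t align8(uint32_t v)
{
    return (v + 7u) & ~7u;
}

/* Bytes consumed in RAM, following the order .data, .bss, ._user_heap_stack. */
uint32_t ram_used(uint32_t data_size, uint32_t bss_size,
                  uint32_t min_heap, uint32_t min_stack)
{
    uint32_t dot = 0;                         /* location counter, relative to ORIGIN(RAM) */
    dot = align8(dot + data_size);            /* .data */
    dot = align8(dot + bss_size);             /* .bss */
    dot = align8(dot + min_heap + min_stack); /* ._user_heap_stack */
    return dot;
}

/* The check behind the linker's overflow report for a memory region. */
bool ram_fits(uint32_t used, uint32_t ram_length)
{
    return used <= ram_length;
}
```

With 0x10 bytes of .data, 0x58 bytes of .bss, and the 0x200 and 0x400 bytes requested for heap and stack, `ram_used()` returns 0x668, far below the 96 KB available.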
The symbol _estack is set to the end of RAM, 0x3001 8000 (0x3000 0000 + 96 KB), which is one byte past the highest RAM address 0x3001 7FFF. The stack grows downwards as you put data on it, towards the heap. The beginning of the heap is defined by the symbol end (or _end). These symbols are provided to the C library and used to initialize the heap. When memory on the heap is allocated, malloc() calls the system call sbrk() to acquire additional space on the heap:
→ example-stm32-p1(P1_TZEN/Secure/Core/Src/sysmem.c)
register char * stack_ptr asm("sp");
/* Functions */
/**
_sbrk
Increase program data space. Malloc and related functions depend on this
**/
caddr_t _sbrk(int incr)
{
extern char end asm("end");
static char *heap_end;
char *prev_heap_end;
if (heap_end == 0)
heap_end = &end;
prev_heap_end = heap_end;
if (heap_end + incr > stack_ptr)
{
errno = ENOMEM;
return (caddr_t) -1;
}
heap_end += incr;
return (caddr_t) prev_heap_end;
}
_sbrk() uses the provided symbol end to initialize the lowest address of the heap when called for the first time. To check that enough memory is available, i.e. that stack and heap do not collide, it verifies that the current end of the heap plus the requested number of bytes does not exceed the current stack pointer; otherwise it fails with ENOMEM.
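The behaviour of _sbrk() can be explored on a host machine with a small model. In this sketch a static array plays the role of the RAM between end and the stack pointer; `ARENA_SIZE`, `arena`, `stack_limit` and `sim_sbrk` are invented names, and the fixed limit is a simplification, since on the target the limit is the live value of SP.

```c
#include <errno.h>
#include <stddef.h>

#define ARENA_SIZE 64   /* hypothetical, tiny "RAM" for the experiment */

static char arena[ARENA_SIZE];
static char *heap_end = arena;                   /* plays the role of &end */
static char *stack_limit = arena + ARENA_SIZE;   /* plays the role of SP */

/* Grow (or shrink) the heap by incr bytes and return its previous end,
 * or (void *)-1 with errno set to ENOMEM if the request does not fit. */
void *sim_sbrk(int incr)
{
    char *prev_heap_end = heap_end;

    /* Compare distances instead of pointers to stay within the array. */
    if (incr > stack_limit - heap_end || incr < arena - heap_end) {
        errno = ENOMEM;
        return (void *)-1;
    }
    heap_end += incr;
    return prev_heap_end;
}
```

Two successive requests of 16 bytes return adjacent blocks; a following request of 64 bytes fails, because only 32 bytes are left below the limit.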
5.7 ELF to .bin
When flashing an SoC we do not write an ELF file into flash; instead, an additional step of firmware image creation happens. How the final firmware image is built from an ELF file depends on the target, but for STM32L5 it is simply:
arm-none-eabi-objcopy -O binary "P1_TZEN_Secure.elf" "P1_TZEN_Secure.bin"
objcopy makes a memory dump of the ELF file. It starts at the load memory address (PhysAddr) of the lowest segment and copies the contents of the ELF segments into the .bin file, starting at file offset 0x0. It also fills gaps between segments with zeros. For example, the S world ELF:
arm-none-eabi-readelf -l P1_TZEN_Secure.elf
(program headers and section to segment mapping shown)
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x010000 0x0c000000 0x0c000000 0x03270 0x03270 RWE 0x10000
LOAD 0x020000 0x30000000 0x0c003270 0x00010 0x00010 RW 0x10000
LOAD 0x02e000 0x0c03e000 0x0c03e000 0x00020 0x00020 R E 0x10000
LOAD 0x030010 0x30000010 0x30000010 0x00000 0x00658 RW 0x10000
Section to Segment mapping:
Segment Sections...
00 .isr_vector .text .rodata .init_array .fini_array
01 .data
02 .gnu.sgstubs
03 .bss ._user_heap_stack
- The bin file starts with the contents of segment 00 (0x0C00.0000).
- It is directly followed by the contents of segment 01, which starts at file offset 0x0000.3270 (0x0c00.3270 - 0x0c00.0000).
- The space between segment 01 and 02 is filled with zeros up to file offset 0x0003.e000, where segment 02 starts.
- Then the contents of segment 02 follow.
- Segment 03 has a FileSiz of 0 and therefore no content. Nothing of it is written into the bin file.
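The offsets in the list above follow from simple arithmetic on the PhysAddr and FileSiz columns of the program headers. The following C sketch reproduces that calculation; it is a model of the layout rule described in the text, not of objcopy’s implementation, and the names `load_segment`, `bin_offset` and `bin_size` are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* The two program header fields that matter for the flat binary. */
struct load_segment {
    uint32_t paddr;    /* PhysAddr (load memory address) */
    uint32_t filesz;   /* FileSiz  (bytes present in the file) */
};

/* File offset of a segment inside the .bin image. */
uint32_t bin_offset(uint32_t paddr, uint32_t lowest_paddr)
{
    return paddr - lowest_paddr;
}

/* Total size of the .bin image: from the lowest load address to the end
 * of the highest segment. Segments without file content are skipped. */
uint32_t bin_size(const struct load_segment *segs, size_t n)
{
    uint32_t lowest = UINT32_MAX;
    uint32_t highest_end = 0;

    for (size_t i = 0; i < n; i++) {
        if (segs[i].filesz == 0)
            continue;
        if (segs[i].paddr < lowest)
            lowest = segs[i].paddr;
        if (segs[i].paddr + segs[i].filesz > highest_end)
            highest_end = segs[i].paddr + segs[i].filesz;
    }
    return highest_end > lowest ? highest_end - lowest : 0;
}
```

Fed with the four LOAD segments shown above, `bin_size()` yields 0x3e020 bytes: 0x3e000 bytes up to the start of .gnu.sgstubs plus the 0x20 bytes of the SG veneers.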
The final binary blob is written into the STM32L5 flash at address 0x0C00.0000. When the device starts booting from 0x0c00.0000, all offsets in the binary fit and the firmware runs successfully.