arm-trusted-firmware/bl2u/bl2u.ld.S

135 lines
3.4 KiB
ArmAsm
Raw Normal View History

/*
Introduce unified API to zero memory Introduce zeromem_dczva function on AArch64 that can handle unaligned addresses and make use of DC ZVA instruction to zero a whole block at a time. This zeroing takes place directly in the cache to speed it up without doing external memory access. Remove the zeromem16 function on AArch64 and replace it with an alias to zeromem. This zeromem16 function is now deprecated. Remove the 16-bytes alignment constraint on __BSS_START__ in firmware-design.md as it is now not mandatory anymore (it used to comply with zeromem16 requirements). Change the 16-bytes alignment constraints in SP min's linker script to a 8-bytes alignment constraint as the AArch32 zeromem implementation is now more efficient on 8-bytes aligned addresses. Introduce zero_normalmem and zeromem helpers in platform agnostic header that are implemented this way: * AArch32: * zero_normalmem: zero using usual data access * zeromem: alias for zero_normalmem * AArch64: * zero_normalmem: zero normal memory using DC ZVA instruction (needs MMU enabled) * zeromem: zero using usual data access Usage guidelines: in most cases, zero_normalmem should be preferred. There are 2 scenarios where zeromem (or memset) must be used instead: * Code that must run with MMU disabled (which means all memory is considered device memory for data accesses). * Code that fills device memory with null bytes. Optionally, the following rule can be applied if performance is important: * Code zeroing small areas (few bytes) that are not secrets should use memset to take advantage of compiler optimizations. Note: Code zeroing security-related critical information should use zero_normalmem/zeromem instead of memset to avoid removal by compilers' optimizations in some cases or misbehaving versions of GCC. Fixes ARM-software/tf-issues#408 Change-Id: Iafd9663fc1070413c3e1904e54091cf60effaa82 Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
2016-12-02 13:51:54 +00:00
* Copyright (c) 2015-2017, ARM Limited and Contributors. All rights reserved.
*
* SPDX-License-Identifier: BSD-3-Clause
*/
#include <platform_def.h>
OUTPUT_FORMAT(PLATFORM_LINKER_FORMAT)
OUTPUT_ARCH(PLATFORM_LINKER_ARCH)
ENTRY(bl2u_entrypoint)
MEMORY {
RAM (rwx): ORIGIN = BL2U_BASE, LENGTH = BL2U_LIMIT - BL2U_BASE
}
SECTIONS
{
. = BL2U_BASE;
ASSERT(. == ALIGN(4096),
"BL2U_BASE address is not aligned on a page boundary.")
Introduce SEPARATE_CODE_AND_RODATA build flag At the moment, all BL images share a similar memory layout: they start with their code section, followed by their read-only data section. The two sections are contiguous in memory. Therefore, the end of the code section and the beginning of the read-only data one might share a memory page. This forces both to be mapped with the same memory attributes. As the code needs to be executable, this means that the read-only data stored on the same memory page as the code are executable as well. This could potentially be exploited as part of a security attack. This patch introduces a new build flag called SEPARATE_CODE_AND_RODATA, which isolates the code and read-only data on separate memory pages. This in turn allows independent control of the access permissions for the code and read-only data. This has an impact on memory footprint, as padding bytes need to be introduced between the code and read-only data to ensure the segragation of the two. To limit the memory cost, the memory layout of the read-only section has been changed in this case. - When SEPARATE_CODE_AND_RODATA=0, the layout is unchanged, i.e. the read-only section still looks like this (padding omitted): | ... | +-------------------+ | Exception vectors | +-------------------+ | Read-only data | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script provides the limits of the whole read-only section. - When SEPARATE_CODE_AND_RODATA=1, the exception vectors and read-only data are swapped, such that the code and exception vectors are contiguous, followed by the read-only data. This gives the following new layout (padding omitted): | ... | +-------------------+ | Read-only data | +-------------------+ | Exception vectors | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script now exports 2 sets of addresses instead: the limits of the code and the limits of the read-only data. Refer to the Firmware Design guide for more details. This provides platform code with a finer-grained view of the image layout and allows it to map these 2 regions with the appropriate access permissions. Note that SEPARATE_CODE_AND_RODATA applies to all BL images. Change-Id: I936cf80164f6b66b6ad52b8edacadc532c935a49
2016-07-08 14:37:40 +01:00
#if SEPARATE_CODE_AND_RODATA
.text . : {
__TEXT_START__ = .;
*bl2u_entrypoint.o(.text*)
*(.text*)
*(.vectors)
. = NEXT(4096);
__TEXT_END__ = .;
} >RAM
.rodata . : {
__RODATA_START__ = .;
*(.rodata*)
. = NEXT(4096);
__RODATA_END__ = .;
} >RAM
#else
ro . : {
__RO_START__ = .;
*bl2u_entrypoint.o(.text*)
*(.text*)
*(.rodata*)
*(.vectors)
__RO_END_UNALIGNED__ = .;
/*
* Memory page(s) mapped to this section will be marked as
* read-only, executable. No RW data from the next section must
* creep in. Ensure the rest of the current memory page is unused.
*/
. = NEXT(4096);
__RO_END__ = .;
} >RAM
Introduce SEPARATE_CODE_AND_RODATA build flag At the moment, all BL images share a similar memory layout: they start with their code section, followed by their read-only data section. The two sections are contiguous in memory. Therefore, the end of the code section and the beginning of the read-only data one might share a memory page. This forces both to be mapped with the same memory attributes. As the code needs to be executable, this means that the read-only data stored on the same memory page as the code are executable as well. This could potentially be exploited as part of a security attack. This patch introduces a new build flag called SEPARATE_CODE_AND_RODATA, which isolates the code and read-only data on separate memory pages. This in turn allows independent control of the access permissions for the code and read-only data. This has an impact on memory footprint, as padding bytes need to be introduced between the code and read-only data to ensure the segragation of the two. To limit the memory cost, the memory layout of the read-only section has been changed in this case. - When SEPARATE_CODE_AND_RODATA=0, the layout is unchanged, i.e. the read-only section still looks like this (padding omitted): | ... | +-------------------+ | Exception vectors | +-------------------+ | Read-only data | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script provides the limits of the whole read-only section. - When SEPARATE_CODE_AND_RODATA=1, the exception vectors and read-only data are swapped, such that the code and exception vectors are contiguous, followed by the read-only data. This gives the following new layout (padding omitted): | ... | +-------------------+ | Read-only data | +-------------------+ | Exception vectors | +-------------------+ | Code | +-------------------+ BLx_BASE In this case, the linker script now exports 2 sets of addresses instead: the limits of the code and the limits of the read-only data. Refer to the Firmware Design guide for more details. This provides platform code with a finer-grained view of the image layout and allows it to map these 2 regions with the appropriate access permissions. Note that SEPARATE_CODE_AND_RODATA applies to all BL images. Change-Id: I936cf80164f6b66b6ad52b8edacadc532c935a49
2016-07-08 14:37:40 +01:00
#endif
/*
* Define a linker symbol to mark start of the RW memory area for this
* image.
*/
__RW_START__ = . ;
/*
* .data must be placed at a lower address than the stacks if the stack
* protector is enabled. Alternatively, the .data.stack_protector_canary
* section can be placed independently of the main .data section.
*/
.data . : {
__DATA_START__ = .;
*(.data*)
__DATA_END__ = .;
} >RAM
stacks (NOLOAD) : {
__STACKS_START__ = .;
*(tzfw_normal_stacks)
__STACKS_END__ = .;
} >RAM
/*
* The .bss section gets initialised to 0 at runtime.
Introduce unified API to zero memory Introduce zeromem_dczva function on AArch64 that can handle unaligned addresses and make use of DC ZVA instruction to zero a whole block at a time. This zeroing takes place directly in the cache to speed it up without doing external memory access. Remove the zeromem16 function on AArch64 and replace it with an alias to zeromem. This zeromem16 function is now deprecated. Remove the 16-bytes alignment constraint on __BSS_START__ in firmware-design.md as it is now not mandatory anymore (it used to comply with zeromem16 requirements). Change the 16-bytes alignment constraints in SP min's linker script to a 8-bytes alignment constraint as the AArch32 zeromem implementation is now more efficient on 8-bytes aligned addresses. Introduce zero_normalmem and zeromem helpers in platform agnostic header that are implemented this way: * AArch32: * zero_normalmem: zero using usual data access * zeromem: alias for zero_normalmem * AArch64: * zero_normalmem: zero normal memory using DC ZVA instruction (needs MMU enabled) * zeromem: zero using usual data access Usage guidelines: in most cases, zero_normalmem should be preferred. There are 2 scenarios where zeromem (or memset) must be used instead: * Code that must run with MMU disabled (which means all memory is considered device memory for data accesses). * Code that fills device memory with null bytes. Optionally, the following rule can be applied if performance is important: * Code zeroing small areas (few bytes) that are not secrets should use memset to take advantage of compiler optimizations. Note: Code zeroing security-related critical information should use zero_normalmem/zeromem instead of memset to avoid removal by compilers' optimizations in some cases or misbehaving versions of GCC. Fixes ARM-software/tf-issues#408 Change-Id: Iafd9663fc1070413c3e1904e54091cf60effaa82 Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
2016-12-02 13:51:54 +00:00
* Its base address should be 16-byte aligned for better performance of the
* zero-initialization code.
*/
.bss : ALIGN(16) {
__BSS_START__ = .;
*(SORT_BY_ALIGNMENT(.bss*))
*(COMMON)
__BSS_END__ = .;
} >RAM
/*
* The xlat_table section is for full, aligned page tables (4K).
* Removing them from .bss avoids forcing 4K alignment on
* the .bss section and eliminates the unecessary zero init
*/
xlat_table (NOLOAD) : {
*(xlat_table)
} >RAM
#if USE_COHERENT_MEM
/*
* The base address of the coherent memory section must be page-aligned (4K)
* to guarantee that the coherent data are stored on their own pages and
* are not mixed with normal data. This is required to set up the correct
* memory attributes for the coherent data page tables.
*/
coherent_ram (NOLOAD) : ALIGN(4096) {
__COHERENT_RAM_START__ = .;
*(tzfw_coherent_mem)
__COHERENT_RAM_END_UNALIGNED__ = .;
/*
* Memory page(s) mapped to this section will be marked
* as device memory. No other unexpected data must creep in.
* Ensure the rest of the current memory page is unused.
*/
. = NEXT(4096);
__COHERENT_RAM_END__ = .;
} >RAM
#endif
/*
* Define a linker symbol to mark end of the RW memory area for this
* image.
*/
__RW_END__ = .;
__BL2U_END__ = .;
__BSS_SIZE__ = SIZEOF(.bss);
ASSERT(. <= BL2U_LIMIT, "BL2U image has exceeded its limit.")
}