I frequently work with ESP32 microcontrollers (MCU), both at my day job and in my free time. These devices are very flexible, but like any microcontroller they are quite resource constrained in comparison to anything equivalent to or larger than a single-board computer.
There are a few different variants of the ESP32 these days, but one of the newer and most popular is the ESP32-S3. Compared to the ESP32-S2, it offers a dual-core 32-bit Xtensa processor, support for Bluetooth 5.0, and more on-chip SRAM and ROM (full comparison can be found here). Both of these SoC’s, as well as others from Espressif, allow for integrating external RAM and flash, making it possible to run larger, more complex firmware.
At first thought, it may seem as though it would be desirable for microcontrollers to use dynamic random access memory (DRAM) as it is more dense than static random access memory (SRAM), and thus cheaper. However, as the name suggests, DRAM must be constantly refreshed, which leads to higher power consumption when idle. The refresh behavior of DRAM also makes it more complex to integrate into a system. Given that microcontrollers typically only require a small amount of RAM, SRAM is generally used for on-chip memory.
The ESP32-S3 has 512 KB of on-chip SRAM, but is capable of mapping up to 32 MB of external RAM into its memory address space. In order to make this larger external RAM exhibit similar properties to the on-chip RAM, pseudostatic RAM (PSRAM) is used. PSRAM is technically DRAM, but the external chip contains its own refresh circuitry, allowing it to appear to the MCU like SRAM that is communicated with via a serial peripheral interface (SPI) bus.
esp-idf
provides a few different
mechanisms for dynamically allocating in external RAM, which are enabled via the
CONFIG_SPIRAM_USE
options.
CONFIG_SPIRAM_USE_MEMMAP
: maps external RAM into the virtual address space, but leaves all access and management to the programmer.CONFIG_SPIRAM_USE_CAPS_ALLOC
: maps external RAM into the virtual address space, and makes it accessible to the capability-based heap memory allocator. This allows the programmer to utilize the memory by specifyingMALLOC_CAP_SPIRAM
explicitly withheap_caps_malloc()
.CONFIG_SPIRAM_USE_MALLOC
: does the same asCONFIG_SPIRAM_USE_CAPS_ALLOC
, but also allows for a standardmalloc()
call to allocate in external RAM.
I was recently working on a project where I wanted to use a buffer that exceeded
the size of the on-chip RAM. The buffer would be reused and would be needed
throughout the duration of execution. I started out with using the capability
allocator with MALLOC_CAP_SPIRAM
to dynamically allocate the memory for the
buffer early in the program’s execution. The following is a toy program
demonstrating the functionality.
#include "esp_heap_caps.h"
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
static uint8_t *buf = NULL;
void app_main(void) {
printf("Let's allocate some external memory!\n");
buf = heap_caps_malloc(1000000, MALLOC_CAP_SPIRAM);
assert(buf);
printf("Address: %p\n", (void *)buf);
}
Attempting to allocate the 1 MB buffer in on-chip memory would fail as it alone exceeds the 512 KB capacity. Specifying that external memory should be used results in successful allocation. Flashing and running this program results in the following output.
I (1204) main_task: Started on CPU0
I (1214) main_task: Calling app_main()
Let's allocate some external memory!
Address: 0x3c030b4c
I (1214) main_task: Returned from app_main()
If we look at the sections of the ELF file from our build, we can see that the
virtual address of the allocated memory (0x3c030b4c
) is slightly offset from
the .ext_ram.bss
section address, which currently has a size of zero.
$ readelf -S build/esp_psram_static.elf | head -27
There are 95 section headers, starting at offset 0x3d2ea4:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .rtc.text NOBITS 600fe000 03e000 000010 00 WA 0 0 1
[ 2] .rtc.force_fast PROGBITS 600fe010 03d5f3 000000 00 W 0 0 1
[ 3] .rtc_noinit PROGBITS 50000000 03d5f3 000000 00 W 0 0 1
[ 4] .rtc.force_slow PROGBITS 50000000 03d5f3 000000 00 W 0 0 1
[ 5] .rtc_reserved NOBITS 600fffe8 03dfe8 000018 00 WA 0 0 8
[ 6] .iram0.vectors PROGBITS 40374000 01a000 000403 00 AX 0 0 4
[ 7] .iram0.text PROGBITS 40374404 01a404 00c4d7 00 AX 0 0 4
[ 8] .dram0.dummy NOBITS 3fc88000 00e000 008900 00 WA 0 0 1
[ 9] .dram0.data PROGBITS 3fc90900 016900 00275c 00 WA 0 0 16
[10] .noinit NOBITS 3fc9305c 000000 000000 00 WA 0 0 1
[11] .dram0.bss NOBITS 3fc93060 01905c 000900 00 WA 0 0 8
[12] .flash.text PROGBITS 42000020 027020 0165d3 00 AX 0 0 4
[13] .flash_rodat[...] NOBITS 3c000020 001020 020000 00 WA 0 0 1
[14] .flash.appdesc PROGBITS 3c020020 001020 000100 00 A 0 0 16
[15] .flash.rodata PROGBITS 3c020120 001120 00c59c 00 WA 0 0 16
[16] .flash.rodat[...] PROGBITS 3c02c6bc 03d5f3 000000 00 W 0 0 1
[17] .ext_ram.dummy NOBITS 3c000020 001020 02ffe0 00 WA 0 0 1
[18] .ext_ram.bss PROGBITS 3c030000 03d5f3 000000 00 W 0 0 1
[19] .iram0.text_end NOBITS 403808db 0268db 000025 00 WA 0 0 1
[20] .iram0.data PROGBITS 40380900 03d5f3 000000 00 W 0 0 1
[21] .iram0.bss PROGBITS 40380900 03d5f3 000000 00 W 0 0 1
[22] .dram0.heap_start PROGBITS 3fc93960 03d5f3 000000 00 W 0 0 1
dram
andiram
refer to “data RAM” and “instruction RAM”. They describe how a memory region is being used, not the technology of the underlying physical memory component. Data RAM is not to be confused with dynamic random access memory.
The starting virtual address was assigned by the memory management unit (MMU) on PSRAM initialization by identifying the largest block of free virtual address space, then reserving a block with size equivalent to the minimum of the largest block size and the available PSRAM capacity.
esp-idf/components/esp_psram/esp_psram.c
(link)
//----------------------------------Map the PSRAM physical range to MMU-----------------------------//
/**
* @note 2
* Similarly to @note 1, we expect HW DBUS memory to be consecutive.
*
* If situation is worse in the future (memory region isn't consecutive), we need to put these logics into chip-specific files
*/
size_t total_mapped_size = 0;
size_t size_to_map = 0;
size_t byte_aligned_size = 0;
ret = esp_mmu_map_get_max_consecutive_free_block_size(MMU_MEM_CAP_READ | MMU_MEM_CAP_WRITE | MMU_MEM_CAP_8BIT | MMU_MEM_CAP_32BIT, MMU_TARGET_PSRAM0, &byte_aligned_size);
assert(ret == ESP_OK);
size_to_map = MIN(byte_aligned_size, psram_available_size);
const void *v_start_8bit_aligned = NULL;
ret = esp_mmu_map_reserve_block_with_caps(size_to_map, MMU_MEM_CAP_READ | MMU_MEM_CAP_WRITE | MMU_MEM_CAP_8BIT | MMU_MEM_CAP_32BIT, MMU_TARGET_PSRAM0, &v_start_8bit_aligned);
assert(ret == ESP_OK);
This block is subsequently added to the heap allocator when
CONFIG_SPIRAM_USE_CAPS_ALLOC=y
.
esp-idf/components/esp_psram/esp_psram.c
(link)
esp_err_t esp_psram_extram_add_to_heap_allocator(void)
{
esp_err_t ret = ESP_FAIL;
uint32_t byte_aligned_caps[] = {MALLOC_CAP_SPIRAM | MALLOC_CAP_DEFAULT, 0, MALLOC_CAP_8BIT | MALLOC_CAP_32BIT};
ret = heap_caps_add_region_with_caps(byte_aligned_caps,
s_psram_ctx.regions_to_heap[PSRAM_MEM_8BIT_ALIGNED].vaddr_start,
s_psram_ctx.regions_to_heap[PSRAM_MEM_8BIT_ALIGNED].vaddr_end);
if (ret != ESP_OK) {
return ret;
}
if (s_psram_ctx.regions_to_heap[PSRAM_MEM_32BIT_ALIGNED].size) {
assert(s_psram_ctx.regions_to_heap[PSRAM_MEM_32BIT_ALIGNED].vaddr_start);
uint32_t word_aligned_caps[] = {MALLOC_CAP_SPIRAM | MALLOC_CAP_DEFAULT, 0, MALLOC_CAP_32BIT};
ret = heap_caps_add_region_with_caps(word_aligned_caps,
s_psram_ctx.regions_to_heap[PSRAM_MEM_32BIT_ALIGNED].vaddr_start,
s_psram_ctx.regions_to_heap[PSRAM_MEM_32BIT_ALIGNED].vaddr_end);
if (ret != ESP_OK) {
return ret;
}
}
ESP_EARLY_LOGI(TAG, "Adding pool of %dK of PSRAM memory to heap allocator",
(s_psram_ctx.regions_to_heap[PSRAM_MEM_8BIT_ALIGNED].size + s_psram_ctx.regions_to_heap[PSRAM_MEM_32BIT_ALIGNED].size) / 1024);
return ESP_OK;
}
We can see this in action by observing the logs that preceded our entry to the
main_task
.
I (1174) esp_psram: Adding pool of 8192K of PSRAM memory to heap allocator
However, in this use-case the buffer only needs to be initialized to zero and we
aren’t going to reclaim the memory. Ideally we could avoid the need to
dynamically allocate at all, and instead specify the size of the buffer that we
need at compile time. Fortunately, esp-idf
allows for use of the
EXT_RAM_BSS_ATTR
macro when CONFIG_SPIRAM_ALLOW_BSS_SEG_EXTERNAL_MEMORY=y
.
EXT_RAM_BSS_ATTR
expands to __attribute__((section(".ext_ram.bss")))
, which
indicates that a variable should be placed in the .bss
(zero-initialized
memory) section of the .ext_ram
(external RAM).
#include "esp_heap_caps.h"
#include <stdint.h>
#include <stdio.h>
static EXT_RAM_BSS_ATTR uint8_t buf[1000000];
void app_main(void) {
printf("Let's allocate some external memory!\n");
printf("Address: %p\n", (void *)&buf);
}
Building and running now yields the following output.
I (1275) main_task: Started on CPU0
I (1285) main_task: Calling app_main()
Let's allocate some external memory!
Address: 0x3c030000
I (1285) main_task: Returned from app_main()
If we take a look at the ELF section headers, we’ll see that the .ext_ram.bss
no longer has a size of zero, and the address matches that of the statically
allocated buffer (0x3c030000
). The size (0x0f4240
) matches that of our
1000000
byte buffer.
$ readelf -S build/esp_psram_static.elf | head -27
There are 95 section headers, starting at offset 0x3d2c48:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .rtc.text NOBITS 600fe000 03e000 000010 00 WA 0 0 1
[ 2] .rtc.force_fast PROGBITS 600fe010 03d5c3 000000 00 W 0 0 1
[ 3] .rtc_noinit PROGBITS 50000000 03d5c3 000000 00 W 0 0 1
[ 4] .rtc.force_slow PROGBITS 50000000 03d5c3 000000 00 W 0 0 1
[ 5] .rtc_reserved NOBITS 600fffe8 03dfe8 000018 00 WA 0 0 8
[ 6] .iram0.vectors PROGBITS 40374000 01a000 000403 00 AX 0 0 4
[ 7] .iram0.text PROGBITS 40374404 01a404 00c4d7 00 AX 0 0 4
[ 8] .dram0.dummy NOBITS 3fc88000 00e000 008900 00 WA 0 0 1
[ 9] .dram0.data PROGBITS 3fc90900 016900 00275c 00 WA 0 0 16
[10] .noinit NOBITS 3fc9305c 000000 000000 00 WA 0 0 1
[11] .dram0.bss NOBITS 3fc93060 01905c 0008f8 00 WA 0 0 8
[12] .flash.text PROGBITS 42000020 027020 0165a3 00 AX 0 0 4
[13] .flash_rodat[...] NOBITS 3c000020 001020 020000 00 WA 0 0 1
[14] .flash.appdesc PROGBITS 3c020020 001020 000100 00 A 0 0 16
[15] .flash.rodata PROGBITS 3c020120 001120 00c57c 00 WA 0 0 16
[16] .flash.rodat[...] PROGBITS 3c02c69c 03d5c3 000000 00 W 0 0 1
[17] .ext_ram.dummy NOBITS 3c000020 001020 02ffe0 00 WA 0 0 1
[18] .ext_ram.bss NOBITS 3c030000 00e000 0f4240 00 WA 0 0 1
[19] .iram0.text_end NOBITS 403808db 0268db 000025 00 WA 0 0 1
[20] .iram0.data PROGBITS 40380900 03d5c3 000000 00 W 0 0 1
[21] .iram0.bss PROGBITS 40380900 03d5c3 000000 00 W 0 0 1
[22] .dram0.heap_start PROGBITS 3fc93958 03d5c3 000000 00 W 0 0 1
While statically allocating is not always the right option, using the technique described in this post to target external memory allows for larger static allocation than would be possible if only targeting on-chip RAM.