Kevin Cuzner's Personal Blog

Electronics, Embedded Systems, and Software are my breakfast, lunch, and dinner.


Bootloader for ARM Cortex-M0: No VTOR

Nov 13, 2018

In my most recent project I selected an ARM Cortex-M0 microcontroller (the STM32F042). I soon realized that there is a key architectural piece missing from the Cortex-M0 which the M0+ does have: the vector table offset register (VTOR).

I want to talk about how I overcame the lack of a VTOR to write a USB bootloader which supports a semi-safe fallback mode. The source for this post can be found here (look in the "bootloader" folder):

https://github.com/kcuzner/midi-fader/tree/master/firmware

What is the VTOR?

Near the heart of the ARM Cortex is the NVIC, or Nested Vectored Interrupt Controller. This is used for prioritizing peripheral interrupts (I2C byte received, USB transaction complete, etc.) and core signals (hard fault, system timer tick, etc.) while managing the code which is executed in response. The NVIC works by using a lookup table at a specific location to determine what code to execute. As an example, the interrupt table for the STM32F042 looks something like this:

When an interrupt occurs, the NVIC will examine this table, read the handler address from it, push some special information onto the stack (the exception frame), and then execute the handler. This exact sequence is fairly complex, but here are some resources if you're interested in learning more:

For any program meant to run on an ARM Cortex processor there'll be some assembly (or maybe C) that looks like this (this one was provided by ST's CMSIS implementation for the STM32F042):

  .section .isr_vector,"a",%progbits
  .type g_pfnVectors, %object
  .size g_pfnVectors, .-g_pfnVectors

g_pfnVectors:
  .word  _estack
  .word  Reset_Handler
  .word  NMI_Handler
  .word  HardFault_Handler
  .word  0
  .word  0
  .word  0
  .word  0
  .word  0
  .word  0
  .word  0
  .word  SVC_Handler
  .word  0
  .word  0
  .word  PendSV_Handler
  .word  SysTick_Handler
  .word  WWDG_IRQHandler                   /* Window WatchDog              */
  .word  PVD_VDDIO2_IRQHandler             /* PVD and VDDIO2 through EXTI Line detect */
  .word  RTC_IRQHandler                    /* RTC through the EXTI line    */
  .word  FLASH_IRQHandler                  /* FLASH                        */
  .word  RCC_CRS_IRQHandler                /* RCC and CRS                  */
  .word  EXTI0_1_IRQHandler                /* EXTI Line 0 and 1            */
...

Then in my linker script I have the "SECTIONS" portion start out like this:

SECTIONS
{
    /* General code */
    .text :
    {
        _flash_start = .;
        . = ALIGN(4);
        /* At beginning of flash is:
         *
         * Required:
         * 0x0000 Initial stack pointer
         * 0x0004 Reset Handler
         *
         * Optional:
         * 0x0008 and beyond: NVIC ISR Table
         */
        KEEP(*(.isr_vector))
        . = ALIGN(4);
        *(.text)
        *(.text*)
        *(.glue_7)
        *(.glue_7t)

        /* C startup support */
        /* TODO: Convert to -nostartfiles for maximum DIY */
        *(.eh_frame)
        KEEP(*(.init))
        KEEP(*(.fini))
    } > FLASH
...

The assembly snippet creates the table for the NVIC (g_pfnVectors in this example) and assigns it to the ".isr_vector" section. The linker script then locates this section right at the beginning of the flash (the "KEEP(*(.isr_vector))" right at the beginning after some variable declarations). When the program is compiled, what I end up with is something that looks like this (this is an assembly dump of the beginning of one of my binaries):

Disassembly of section .text:

08000000 <_flash_start>:
 8000000:    20001800        andcs   r1, r0, r0, lsl #16
 8000004:    08001701        stmdaeq r0, {r0, r8, r9, sl, ip}
 8000008:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800000c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000010:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000014:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000018:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800001c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000020:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000024:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000028:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800002c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000030:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000034:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000038:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800003c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000040:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000044:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000048:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800004c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000050:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000054:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000058:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800005c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000060:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000064:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000068:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800006c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000070:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000074:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000078:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800007c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000080:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000084:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000088:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800008c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000090:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000094:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 8000098:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 800009c:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000a0:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000a4:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000a8:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000ac:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000b0:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000b4:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000b8:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}
 80000bc:    080005af        stmdaeq r0, {r0, r1, r2, r3, r5, r7, r8, sl}

080000c0 <bootloader_tick>:
 80000c0:    4a0d            ldr     r2, [pc, #52]   ; (80000f8 <bootloader_tick+0x38>)
 80000c2:    2300            movs    r3, #0
 80000c4:    0011            movs    r1, r2
 80000c6:    b570            push    {r4, r5, r6, lr}
 80000c8:    4c0c            ldr     r4, [pc, #48]   ; (80000fc <bootloader_tick+0x3c>)
...

For the first several 32-bit words I have created a bunch of function pointers which make up the table that the NVIC will read. After that table, the actual code starts.

So, what is the VTOR? In some ARM Cortex architectures (I know at least the ARM Cortex-M0+, ARM Cortex-M3, and ARM Cortex-M4 support this) there is a register located at address 0xE000ED08 called the "Vector Table Offset Register". It holds an address whose 7 least-significant bits must be zero (so the table is 128-byte aligned) and which points to the location of this interrupt vector table. On boot this register contains 0x00000000, so when power comes up, the handler whose address lives at 0x00000004 is executed to handle the reset. Later on, the program might modify the VTOR so that it points at some other location in memory, say 0x08008000. After that point, the NVIC will look up the address of each handler relative to that address. So if an SVCall exception (exception number 11) occurred, the NVIC would read the word at 0x0800802C (0x08008000 + 11 * 4) to determine the address of the handler to call.
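To make the address arithmetic concrete, here is a minimal host-side sketch in C of the lookup described above. The helper name vector_entry_address and the EXC_* names are hypothetical and exist only for illustration (they are not part of CMSIS or the project source); the exception numbers themselves are the standard Cortex-M ones.

```c
#include <assert.h>
#include <stdint.h>

/* Each vector table entry is one 32-bit word. */
#define VECTOR_ENTRY_SIZE 4u

/* Standard Cortex-M exception numbers (names are illustrative). */
enum {
    EXC_RESET     = 1,
    EXC_NMI       = 2,
    EXC_HARDFAULT = 3,
    EXC_SVCALL    = 11,
    EXC_PENDSV    = 14,
    EXC_SYSTICK   = 15,
    EXC_IRQ0      = 16 /* first peripheral interrupt */
};

/*
 * Hypothetical helper: returns the address of the table word that is
 * read for a given exception number when the table base is `vtor`.
 */
static uint32_t vector_entry_address(uint32_t vtor, uint32_t exception_number)
{
    /* The 7 least-significant bits of the VTOR must be zero. */
    assert((vtor & 0x7Fu) == 0u);
    return vtor + exception_number * VECTOR_ENTRY_SIZE;
}
```

With the table base at 0x08008000, vector_entry_address(0x08008000, EXC_SVCALL) evaluates to 0x0800802C, and with the reset value of 0x00000000 the reset handler entry sits at 0x00000004.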

One thing you may have noticed at this point is that my assembly dump earlier had everything living relative to address 0x08000000. However, I said that the VTOR's reset value was 0x00000000. So, how does the STM32's ARM core know where to find the table? All STM32s I've seen so far implement a "boot remapping" feature which uses the physical "BOOT0" pin to map the flash (which starts at 0x08000000) onto the memory space starting at 0x00000000 like so (may vary slightly by STM32):

Some STM32s have support for extra modes like mapping the SRAM (address 0x20000000) onto 0x00000000. So although the VTOR's default value is 0x00000000, since the STM32 is remapping 0x08000000 into that space the ARM Cortex core sees the contents of the flash when it loads information from locations relative to 0x00000000 if the BOOT0 pin is tied low.

Bootloaders and the VTOR

At this point we can talk about how bootloaders would use the VTOR. In my last post on the subject, I didn't really talk extensively about interrupts beyond mentioning that the VTOR is overwritten as part of the process of jumping to the user program. This is done so that once the bootloader has transferred execution to the user program, any interrupts that fire are directed to the handlers defined by the user program. Ideally, the user program doesn't even need to worry about the fact that it's running in a boot-loaded manner.

On a microcontroller with a separate bootloader and user program the flash is partitioned into two segments: the bootloader, which always lives right at the beginning of flash so that the STM32 boots into it, and the user program, which lives much further down in the flash. I usually put my user programs at around the 8KB mark since the (inefficient and clumsy) hobbyist bootloaders I write tend to use just a little over 4K of the flash. When the bootloader runs it performs the following sequence:

  1. Determine if a user program exists. If the user program does not exist, start running the main bootloader program and abort this sequence.
  2. Disable interrupts (important!)
  3. Set the VTOR register to the start of the user program (which just so happens to be the location of the user program's vector table, since the table lives right at the beginning of the flash image of the program).
  4. Read the address of the stack pointer from the first word of the user program.
  5. Read the reset handler address from the second word of the user program.
  6. Set the stack pointer and jump to the reset handler.

So long as the user program doesn't go and mess with the VTOR, any interrupts that occur after the user program re-enables interrupts will cause the NVIC to use the user program's table to determine where the handlers are. Isn't that awesome?
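The sequence above can be sketched in C roughly as follows. This is a sketch under stated assumptions, not the code from my other bootloader: the names boot_target, user_program_present, read_boot_target, and jump_to_user_program are hypothetical, and the "does a user program exist" test shown here (checking for erased flash) is just one plausible stand-in for step 1. The part that touches the VTOR only compiles for an ARM target and has not been tested on hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ERASED_FLASH_WORD 0xFFFFFFFFu /* erased NOR flash reads as all ones */

struct boot_target {
    uint32_t sp; /* word 0 of the user vector table: initial stack pointer */
    uint32_t pc; /* word 1 of the user vector table: reset handler address */
};
static_assert(sizeof(struct boot_target) == 8, "two 32-bit words expected");

/* Step 1 (assumed check): treat an erased vector table as "no program". */
static bool user_program_present(const uint32_t *table)
{
    return table[0] != ERASED_FLASH_WORD && table[1] != ERASED_FLASH_WORD;
}

/* Steps 4 and 5: pull the stack pointer and reset handler out of the table. */
static struct boot_target read_boot_target(const uint32_t *table)
{
    struct boot_target t = { table[0], table[1] };
    return t;
}

#if defined(__arm__) && defined(__thumb__)
#define SCB_VTOR (*(volatile uint32_t *)0xE000ED08u)

/* Steps 2, 3 and 6, for a core that actually has a VTOR. */
void jump_to_user_program(uint32_t user_base)
{
    struct boot_target t = read_boot_target((const uint32_t *)user_base);

    __asm__ volatile ("cpsid i" ::: "memory"); /* step 2: disable interrupts */
    SCB_VTOR = user_base;                       /* step 3: relocate the table */
    __asm__ volatile ("msr msp, %0\n\t"         /* step 6: set the stack...   */
                      "bx %1\n\t"               /* ...and jump to reset       */
                      : /* no output */
                      : "r" (t.sp), "r" (t.pc)
                      : "memory");
    for (;;) { }
}
#endif
```

The two host-testable helpers only read words from an array, so they can be exercised against a copy of a firmware image on a PC before anything is flashed.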

There is one step that the user program has to do, however. It needs to properly offset all of its addresses in the flash. As I mentioned in my previous post about bootloaders this is pretty easy to do in the linker script by just tricking it into thinking that the flash starts at the beginning of the user program partition (example on a 32K microcontroller):

_flash_origin = 0x08002000;
_flash_length = 24K;

MEMORY
{
    FLASH (RX) : ORIGIN = _flash_origin, LENGTH = _flash_length
    RAM (W!RX)  : ORIGIN = 0x20000000, LENGTH = 6K
}

The user program is now tricked into thinking that flash starts at 0x08002000 and is only 24K. We can see that this was successful if we take a look at the beginning of the disassembly of a compiled program:

Disassembly of section .text:

08002000 <_flash_start>:
 8002000:    20001800        andcs   r1, r0, r0, lsl #16
 8002004:    08004141        stmdaeq r0, {r0, r6, r8, lr}
 8002008:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 800200c:    08003c29        stmdaeq r0, {r0, r3, r5, sl, fp, ip, sp}
    ...
 800202c:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
    ...
 8002038:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 800203c:    08002f05        stmdaeq r0, {r0, r2, r8, r9, sl, fp, sp}
 8002040:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002044:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002048:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 800204c:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002050:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002054:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002058:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 800205c:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002060:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002064:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002068:    08002e07        stmdaeq r0, {r0, r1, r2, r9, sl, fp, sp}
 800206c:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002070:    08002c51        stmdaeq r0, {r0, r4, r6, sl, fp, sp}
 8002074:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002078:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 800207c:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002080:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
    ...
 800208c:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002090:    00000000        andeq   r0, r0, r0
 8002094:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 8002098:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 800209c:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 80020a0:    00000000        andeq   r0, r0, r0
 80020a4:    08002e05        stmdaeq r0, {r0, r2, r9, sl, fp, sp}
 80020a8:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 80020ac:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 80020b0:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 80020b4:    00000000        andeq   r0, r0, r0
 80020b8:    080041c1        stmdaeq r0, {r0, r6, r7, r8, lr}
 80020bc:    08003919        stmdaeq r0, {r0, r3, r4, r8, fp, ip, sp}

080020c0 <configuration_begin_request>:
 80020c0:    b513            push    {r0, r1, r4, lr}
 80020c2:    4668            mov     r0, sp
 80020c4:    0002            movs    r2, r0
...

All the addresses now live at or above 0x08002000, the start of the user partition. Now all the bootloader has to do is set the VTOR to 0x08002000 and this user program will execute normally, interrupts and all.
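If you want to check this mechanically rather than by eyeballing a disassembly, a small host-side test along these lines could do it. This is only a sketch: the function names are hypothetical, and the partition bounds are simply the values from the example linker script above. It relies on two facts: handler addresses on a Cortex-M have their lowest bit set (the Thumb bit), and unused table slots are zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Partition bounds taken from the example linker script. */
#define USER_FLASH_ORIGIN 0x08002000u
#define USER_FLASH_LENGTH (24u * 1024u)

static_assert(USER_FLASH_ORIGIN % 4u == 0u, "vector table must be word-aligned");

/* True if `handler` is a Thumb address inside the user partition. */
static bool handler_in_user_flash(uint32_t handler)
{
    uint32_t addr = handler & ~1u; /* strip the Thumb bit */
    return (handler & 1u) != 0u
        && addr >= USER_FLASH_ORIGIN
        && addr < USER_FLASH_ORIGIN + USER_FLASH_LENGTH;
}

/*
 * Checks every handler entry of a vector table. Word 0 is the initial
 * stack pointer (an SRAM address), so it is skipped; zero entries are
 * unused slots and are ignored.
 */
static bool vector_table_is_relocated(const uint32_t *table, size_t words)
{
    for (size_t i = 1; i < words; i++) {
        if (table[i] != 0u && !handler_in_user_flash(table[i])) {
            return false;
        }
    }
    return true;
}
```

Fed the first few words of the relocated dump (0x20001800, 0x08004141, 0x080041c1, 0x08003c29) this check passes, while the table from the earlier non-relocated dump (reset handler at 0x08001701) fails it.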

Dealing with an absent VTOR

After I purchased the microcontroller for my project (an STM32F042) I discovered that it was a Cortex-M0 and did not have a VTOR. This was a rather unpleasant surprise and now I know that the M0 sucks compared to the M0+. Nonetheless, I was able to overcome this with a fairly simple software shim and that's what I want to share.

There are two main issues that the VTOR addresses:

  • Determining the address of an interrupt when it isn't relative to 0x00000000.
  • Forwarding execution of the interrupt routine to that custom address.

Since I don't have a VTOR all of my interrupts will be executed from the bootloader by default. This is obviously unacceptable since things like a USB interrupt occurring would cause the user program to suddenly revert back to being the bootloader program (and probably into some undefined state since the SRAM would be all different).

To address the first problem, I had to make some changes to my bootloader and to the user program:

  1. I designated a certain area of SRAM in the bootloader program as holding data that will be valid while the processor is running.
  2. The user program's linker script had its SRAM startpoint moved beyond this reserved section.

I implemented this with these linker script memory modifications:

Bootloader linker script:

_flash_origin = 0x08000000;
_flash_length = 32K;

MEMORY
{
    FLASH (RX) : ORIGIN = _flash_origin, LENGTH = 8K
    RAM_RSVD (W!RX) : ORIGIN = 0x20000000, LENGTH = 256
    RAM (W!RX)  : ORIGIN = 0x20000100, LENGTH = 6K - 256
}

Device linker script:

_flash_origin = 0x08002000;
_flash_length = 24K;

MEMORY
{
    FLASH (RX) : ORIGIN = _flash_origin, LENGTH = _flash_length
    RAM (W!RX)  : ORIGIN = 0x20000100, LENGTH = 6K - 256
}

And this section addition in the bootloader linker script:

...
    .boot_data :
    {
        *(.rsvd.data)
        *(.rsvd.data*)
    } > RAM_RSVD
...

Now I have some reserved memory that the user program won't touch. I use this area to store a pseudo-VTOR:

/**
 * Places a symbol into the reserved RAM section. This RAM is not
 * initialized and must be manually initialized before use.
 */
#define RSVD_SECTION ".rsvd.data,\"aw\",%nobits//"
#define _RSVD __attribute__((used, section(RSVD_SECTION)))

static volatile _RSVD uint32_t bootloader_vtor;

extern uint32_t *g_pfnVectors;

void bootloader_init(void)
{
    bootloader_vtor = (uint32_t)(&g_pfnVectors);
...

When the bootloader starts it will set this "bootloader_vtor" variable to the location of the bootloader's vector table (the "extern uint32_t *g_pfnVectors" is linked to that table defined in assembly earlier).

Then, if the bootloader determines that the user program exists it overwrites bootloader_vtor with the following:

void bootloader_init(void)
{
...
    uint32_t user_vtor_value = 0;
...load the user value...
    //if the prog_start field is set and there are no entry bits set in the CSR (or the magic code is programmed appropriately), start the user program
    if (user_vtor_value &&
            (!reset_entry || (magic == BOOTLOADER_MAGIC_SKIP)))
    {
...housekeeping before we jump to the user program...
        __disable_irq();

        uint32_t *user_vtor = (uint32_t *)user_vtor_value;
        uint32_t sp = user_vtor[0];
        uint32_t pc = user_vtor[1];
        bootloader_vtor = user_vtor_value;
        __asm__ __volatile__("mov sp,%0\n\t"
                "bx %1\n\t"
                : /* no output */
                : "r" (sp), "r" (pc)
                : "sp");
        while (1) { }
    }
}

Ok, so that solves the issue of "where do the user's interrupts live". The next issue is actually jumping to those. Turns out, that's not a hard problem to solve now. A quick change to the interrupt handlers makes short work of that:

/**
 * Entry point for all exceptions which passes off execution to the appropriate
 * handler. This adds some non-trivial overhead, but it does tail-call the
 * handler and I think it's about as minimal as you can get for emulating the
 * VTOR.
 */
void __attribute__((naked)) Bootloader_IRQHandler(void)
{
    __asm__ volatile (
            " ldr r0,=bootloader_vtor\n" // Read the fake VTOR into r0
            " ldr r0,[r0]\n"
            " ldr r1,=0xE000ED04\n" // Prepare to read the ICSR
            " ldr r1,[r1]\n" // Load the ICSR
            " mov r2,#63\n"  // Prepare to mask SCB_ICSR_VECTACTIVE (6 bits, Cortex-M0)
            " and r1, r2\n"  // Mask the ICSR, r1 now contains the vector number
            " lsl r1, #2\n"  // Multiply vector number by sizeof(function pointer)
            " add r0, r1\n"  // Apply the offset to the table base
            " ldr r0,[r0]\n" // Read the function pointer value
            " bx r0\n" // Aaaannd branch!
            );
}

What this does is determine which interrupt number is executing, multiply that number by 4, add it to bootloader_vtor, load the handler address stored at that location, and jump to it. This does naively what the VTOR does from the perspective of a program. This routine does stomp all over r0, r1, and r2, but since those registers are part of the ARM exception frame, the original values have already been pushed onto the stack by the hardware. Since we haven't modified the stack at all (no pushes or pops here), the actual interrupt handler should be none the wiser that something happened before it (and it shouldn't care what's in r0, r1, and r2 either).
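For anyone who finds the assembly hard to follow, here is the same lookup modeled in plain C so it can be run on a PC. It is a sketch of the idea rather than something that runs in the bootloader: the function name emulated_vtor_lookup is hypothetical, the table is passed in as an array of 32-bit words instead of being read from flash, and the ICSR value is an argument instead of a read from 0xE000ED04. The assert documents an assumption the assembly makes silently, namely that the active vector number never exceeds the 48 entries in the STM32F042 table.

```c
#include <assert.h>
#include <stdint.h>

#define ICSR_VECTACTIVE_MASK 0x3Fu /* 6-bit active vector number on the Cortex-M0 */
#define VECTOR_COUNT         48u   /* 16 core exceptions + 32 peripheral interrupts */

/*
 * Model of Bootloader_IRQHandler: given the table the fake VTOR points at
 * and the current ICSR value, return the handler address that would be
 * branched to.
 */
static uint32_t emulated_vtor_lookup(const uint32_t *table, uint32_t icsr)
{
    /* "and r1, r2": keep only the active vector number. */
    uint32_t vector = icsr & ICSR_VECTACTIVE_MASK;

    /* Assumption: hardware only reports vectors that exist in the table. */
    assert(vector < VECTOR_COUNT);

    /* "lsl r1, #2", "add r0, r1", "ldr r0,[r0]": index the word table. */
    return table[vector];
}
```

The mask matters because the ICSR carries more than the active vector number. For example, an ICSR value with the interrupt-pending flag (bit 22) and a pending vector field (bits 17:12) set still resolves to vector 15 (SysTick) once masked, so the lookup returns whatever word sits at offset 0x3C of the table.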

The bootloader also gets a rather non-trivial change to its interrupt vector table:

/******************************************************************************
*
* The minimal vector table for a Cortex M0.  Note that the proper constructs
* must be placed on this to ensure that it ends up at physical address
* 0x0000.0000.
*
******************************************************************************/
  .section .isr_vector,"a",%progbits
  .word  _estack
  .word  Reset_Handler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler
  .word  Bootloader_IRQHandler  /* Window WatchDog */
  .word  Bootloader_IRQHandler  /* PVD and VDDIO2 through EXTI Line detect */
  .word  Bootloader_IRQHandler  /* RTC through the EXTI line */
  .word  Bootloader_IRQHandler  /* FLASH */
  .word  Bootloader_IRQHandler  /* RCC and CRS */
  .word  Bootloader_IRQHandler  /* EXTI Line 0 and 1 */
  .word  Bootloader_IRQHandler  /* EXTI Line 2 and 3 */
  .word  Bootloader_IRQHandler  /* EXTI Line 4 to 15 */
  .word  Bootloader_IRQHandler  /* TSC */
  .word  Bootloader_IRQHandler  /* DMA1 Channel 1 */
  .word  Bootloader_IRQHandler  /* DMA1 Channel 2 and Channel 3 */
  .word  Bootloader_IRQHandler  /* DMA1 Channel 4 and Channel 5 */
  .word  Bootloader_IRQHandler  /* ADC1 */
  .word  Bootloader_IRQHandler  /* TIM1 Break, Update, Trigger and Commutation */
  .word  Bootloader_IRQHandler  /* TIM1 Capture Compare */
  .word  Bootloader_IRQHandler  /* TIM2 */
  .word  Bootloader_IRQHandler  /* TIM3 */
  .word  Bootloader_IRQHandler  /* Reserved */
  .word  Bootloader_IRQHandler  /* Reserved */
  .word  Bootloader_IRQHandler  /* TIM14 */
  .word  Bootloader_IRQHandler  /* Reserved */
  .word  Bootloader_IRQHandler  /* TIM16 */
  .word  Bootloader_IRQHandler  /* TIM17 */
  .word  Bootloader_IRQHandler  /* I2C1 */
  .word  Bootloader_IRQHandler  /* Reserved */
  .word  Bootloader_IRQHandler  /* SPI1 */
  .word  Bootloader_IRQHandler  /* SPI2 */
  .word  Bootloader_IRQHandler  /* USART1 */
  .word  Bootloader_IRQHandler  /* USART2 */
  .word  Bootloader_IRQHandler  /* Reserved */
  .word  Bootloader_IRQHandler  /* CEC and CAN */
  .word  Bootloader_IRQHandler  /* USB */

All the interrupts point to this new Bootloader_IRQHandler except Reset. We now have another problem: what about the interrupts for when we actually need to execute the bootloader program instead of the user program? Well, that's fairly simple now. We just move the g_pfnVectors table so that it is just like any other table:

/**
 * Default vector table local to the bootloader. This is used by the
 * emulated VTOR functionality to actually dispatch interrupts. It must
 * be word-aligned since "ldr" is used to access it.
 */
  .section .text.LocalVectors,"a",%progbits
  .type g_pfnVectors, %object
  .size g_pfnVectors, .-g_pfnVectors
  .align 4

g_pfnVectors:
  .word  _estack
  .word  Reset_Handler
  .word  NMI_Handler
  .word  HardFault_Handler
  .word  0
  .word  0
  .word  0
  .word  0
  .word  0
  .word  0
  .word  0
  .word  SVC_Handler
  .word  0
  .word  0
  .word  PendSV_Handler
  .word  SysTick_Handler
  .word  WWDG_IRQHandler                   /* Window WatchDog              */
  .word  PVD_VDDIO2_IRQHandler             /* PVD and VDDIO2 through EXTI Line detect */
  .word  RTC_IRQHandler                    /* RTC through the EXTI line    */
  .word  FLASH_IRQHandler                  /* FLASH                        */
  .word  RCC_CRS_IRQHandler                /* RCC and CRS                  */
  .word  EXTI0_1_IRQHandler                /* EXTI Line 0 and 1            */
  .word  EXTI2_3_IRQHandler                /* EXTI Line 2 and 3            */
...

I placed it in its own section for fun, but you'll see that it now lives in ".text". This means that it ends up in flash just like any other read only variable would and I don't really care where it ends up. I suppose I could also have put it into the "rodata" section and that would probably be more correct, but it hasn't caused a problem yet. Anyway, as we saw during bootloader_init the address of the bootloader's g_pfnVectors is loaded into bootloader_vtor and if there's no user program it will remain there.

With those two pieces together, we have effectively emulated the VTOR functionality. There are a few corner cases that this doesn't handle very well (such as exceptions before the bootloader_vtor value is initialized) which likely result in Hard Faults, but I haven't encountered an issue there yet.

Debugging the user program

With my other bootloader which relied on the VTOR, the presence of the bootloader was not only transparent to the user program, it was also transparent to the debugger. If I needed to run a stack trace during an interrupt or exception, it knew the names of all the symbols it would find in the trace. But now that we've mixed together the bootloader and user program, things are less straightforward since the elf file from the user program won't have any knowledge of the code executed by the bootloader.

While I didn't overcome this issue completely and stack traces can be a little awkward if they are interrupted at just the right time, I did manage to massage gdb enough to make it somewhat usable:

gdb -ex "target remote localhost:3333" -ex "add-symbol-file ./path/to/my/bootloader.elf 0x08000000" ./path/to/my/user/program.elf

The "add-symbol-file" directive points gdb towards my bootloader's elf file and informs it about any symbols it might find if we just so happen to break while inside the bootloader's program space. It also knows about the names of symbols inside the bootloader's reserved SRAM space.

Conclusion

Here we've seen how the VTOR works, why it's useful to bootloaders, and one way to overcome the issue of not having a VTOR in certain architectures like the Cortex-M0. If you have any questions or comments, feel free to leave a comment on this post. This isn't the most robust way of fixing the problem, but for my hacking around it works just fine. I only hope that this post is useful and maybe sparks some idea with someone who is trying to overcome a similar problem.

