STM32 Architecture: Difference between revisions

From bibbleWiki
=Introduction=
=Introduction=
This is the page for all things STM32. Currently working with a Nucleo F302R8
My knowledge is very small, no not just in general, but on this subject in computers. This is probably where I regret not having a degree. But here goes I am going to try and understand enough of the diagram from the STM32F0xx Cortex-M0 to be dangerous.
=Setting Up=
[[File:STM32F0xx Cortex-M0 System Overview.png| 400px]]<br>
This was a trial and a half and here in case some others struggle and find this help.<br>
I am again looking at Intermation and their course which is Computer Organization and design.
==STM32CubeIDE (Eclipse)==
=NVIC and EXTI=
My goal was to use eclipse because this is what is used in the videos. So with ubuntu 23.04 install went to install this with eclipse. But the eclipse version requires python 2.7 which is no longer available.
Not fully on board with this but the NVIC (Nested Vectored Interrupt Controller) is a interrupt controller connected to the CPU. From one of the docs (STM32G4) it lists its features as
==VS Code==
*102 interrupt sources,
Luckily STM32 had brought out an extensions for my preferred solution VS Code. Installed the extension and went about installing the 3 other products it mentioned.
*16 programmable priority levels,
*STM32CubeMX
*Low-latency exception and interrupt handling,
*STMCUFinder
*Automatic nesting,
*stm32cubeclt_1.12.1
*Power management control.
Started it up but the import project button did nothing at all so I assumed it must need STM32CubeIDE. So went back to trying to install libpython2.7 and found https://askubuntu.com/questions/101591/how-do-i-install-the-latest-python-2-7-x-or-3-x-on-ubuntu. Unfortunately whatever I did must have move ld.so  or something serious so had to re-install. But the good news was I could retry the STM32 extension. Having a new install I tried the STM32 Extension and it did indeed say could not find STM32CubeIDE. I documented this on the STM32 forum under https://community.st.com/s/question/0D53W00002IMDFZSA5/import-project-in-vs-code-ubuntu-2404
In the lesson I was doing this came up because of the EXTI (EXTernal Interrupt/Event) controller which is connected to the NVIC. When using CubeMX you can configure handlers for the GPIO pin which connects to the EXTI on the NVIC. In my case there are 28 lines on the EXTI<br>
==STM32CubeIDE Attempt 2==
[[File:NVIC EXTI.png|400px]]
The was an additional install for STM32CubeIDE for vanilla linux. So I downloaded this and installed it. But on start up it failed with an error org.eclipse.swt.internal.C::strlen. <br>
=Pending Request Register=
<br>
When we press a button it is flagged in the pending request register shown above. We can get the address of the register from the manual. In the case of the STM32F302R8 we can first find the EXTI in the manual<br>
But the next morning googling I found https://github.com/adoptium/adoptium-support/issues/785 and the solution to getting it to work.
[[File:STM32F302R8 EXTI.png|400px]]<br>
<syntaxhighlight lang="bash">
This is probably more about navigating the documentation than the detail but here is the EXTI_PR1 document. After all the software is easy<br>
mkdir /tmp/SWT-GDBusServer
[[File:EXTI PR Register.png|400px]]<br>
</syntaxhighlight>
So the EXTI occupies 0x4001 0400 - 0x4001 07FF, and EXTI_PR1 is at offset 0x14, so its address is 0x4001 0414.<br>
So may find solution was
They were very keen to stress that it is the programmers (so old fashioned) job to clear the bit in the PR when done. Using the CubeMX this is what is generated for you via macros.<br>
*Ubuntu 23.04
[[File:STM32 EXTI Handler.png|400px]]<br>
*en.st-stm32cubeide_1.12.1_16088_20230420_1057_amd64.sh.zip
=STM32 Header Files=
*en.ST-MCU-FinderLin_v5-0-0.zip
Briefly, ARM have a thing called CMSIS. Vendors follow these guidelines and share common macros etc.
*en.stm32cubemx-lin-v6-8-1.zip
==Volatile Keyword==
=Debugging=
Looking at the headers at lot of the headers specify volatile. This forces the compiler to always read the value and not optimize out. With an optimizer the value of p in the code below is not updated and remains in the first loop if the volatile keyword is not used.
==Getting Started==
<syntaxhighlight lang='c'>
Well now have all of the bits installed. Next it is time to start debugging
 
==Update nucleo f302r8==
#include <stdint.h>
This required an Update to the firmware. Google is your friend. Downloaded en.stsw-link007-v3-12-3.zip. This contained the udev rules which you can install using dpkg. Don't forget to reload rules with
 
<syntaxhighlight lang="bash">
 
sudo udevadm control --reload-rules
#define SRAM_ADDRESS1  0x20000004U
sudo udevadm trigger
 
</syntaxhighlight>
int main(void)
Next under the stsw-link007/AllPlatforms
<syntaxhighlight lang="bash">
sudo java -jar STLinkUpgrade.jar
</syntaxhighlight>
Hopefully this all goes well
==Launch Settings==
This was surprisingly easy once I had all the other things in place. Here is my file.
<syntaxhighlight lang="json">
{
{
  "version": "0.2.0",
  "configurations": [


    {
  uint32_t value = 0;
      "name": "Debug via ST-Link",
  uint32_t volatile *p = (uint32_t *) SRAM_ADDRESS1;
      "cwd": "${workspaceRoot}",
 
      "type": "cortex-debug",
    while(1)
      "executable": "./build/debug/build/Test6.elf",
  {
      "request": "launch",
  value = *p;
      "servertype": "stlink",
  if(value) break;
      "device": "STM32F302R8",
      "interface": "swd",
  }
      "runToEntryPoint": "main",
 
      "svdFile": "STM32F302.svd",
  while(1);
      "v1": false,
 
      "showDevDebugOutput": "both",
   return 0;
      "armToolchainPath": "/opt/st/stm32cubeclt_1.12.1/GNU-tools-for-STM32/bin"
  }, 
   ]
}
}
</syntaxhighlight>
</syntaxhighlight>
I used the following files for help
=GPIO and Ports=
*Marus on Github https://github.com/Marus/cortex-debug/blob/master/debug_attributes.md
==Resetting Ports==
*MaJerle on Github https://github.com/MaJerle/stm32-cube-cmake-vscode/blob/main/README.md
Again for documentation most of the STM32 boards will list the peripheral and have the register as the last entry. When you look at the ports some of the reset value might not be neccessarily 0x000 0000. For the STM32F302R8 they were<br>
=Blinky=
'''Address offset:0x00'''<br>
Did take me a while to read the documentation. Especially around finding the LEDs on the board. Starting a new app provided the proper view. Maybe it is something I need to learn. Also lost a bit of time putting the code in the main rather than in the while loop further down. So here is the setting of GPIO pins STM32 style.
*Reset value: 0xA800 0000 for port A
<syntaxhighlight lang="c++">
*Reset value: 0x0000 0280 for port B
  /* Infinite loop */
*Reset value: 0x0000 0000 for other ports
  /* USER CODE BEGIN WHILE */
Each GPIO should have a pullup resistor. This ensures pins are not floating,neither positive or negative, which will happen due to residual voltage. The pullup resistor value can be found in the documentation searching for Rₚᵤ. or Weak Pull-up.<br>
  while (1)
==GPIO Modes==
  {
A bit was said about this and the importance of using pullup resistors. The open drain setting was brought up with I2C so may come back to this.
     /* USER CODE END WHILE */
*Input
*Output
**Push/Pull (0 or 1)
**Open Drain (0 or floating)
==Speed (Output Only)==
We can set the speed of the output using the OSPEEDRy. There are two bits for each port this effect the rising time and falling time. You have to refer to the datasheet (separate from reference manual) to understand the different available speeds. Search for OSPEEDS. I will be very happy if I ever need this. The speed are based on the voltage and clock capacitance.<br>
[[File:GPIO Speeds.png|600px]]<br>
<br>
A use case I have heard for setting these speeds is bit-banging which currently I do not understand but believe you could fake an interface by this technique.<br>
The slew rate is defined as the maximum rate of output voltage change per unit time. It is denoted by the letter S. The slew rate helps us to identify the amplitude and maximum input frequency suitable to an operational amplifier (OP amp) such that the output is not significantly distorted.<br>
==Alternate Function Mapping==
There are 16 different alternate functions pins can be used for. For STM you can generally see this on the pinout when googling but he datasheet also holds a table Alternate Function Mapping showing which pins support what. These can be configured using the Alternate Function Register High (AFRH) and Alternate Function Register Low (AFRL)
=Other Stuff=
==Memory Hierarchy==
There is a Hierarchy
*Registers
*Cache (L1 Local, L2 Shared)
*Main Memory (RAM)
*Long Term Storage (Hard Disks, Tapes etc)
==Types Of Memory==
*DRAM Dynamic RAM uses capacitors, slower, cheaper, requires refreshing
*SRAM Static RAM uses transistors, faster<br>
[[File:SRAM vs DRAM.jpg|500px]]<br>
 
==Types Of Addressing Memory==
These are the types of addressing.
*Random Address (RAM)
*Direct Addressing (HDD)
*Sequential Addressing (Tapes)
*Associative Addressing (Caching)
Speed and Cost Per Bit change goes down as performance goes down.
==Cache Memory==
===Introduction===
Terms
*Miss Element was not found
*Hit Element was found
*Hit Rate Percentage of time that element was found
 
Effective Access Time =
     Hit Rate * Time from Cache  + (1 - Hit Rate) * Time from Memory


      // HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_13);
Looking at examples, it was the hit rate that the tutor wanted us to focus on as the other parts would not change.<br>
      // HAL_Delay(500);
<br>
      if(HAL_GPIO_ReadPin(GPIOC,GPIO_PIN_13) == GPIO_PIN_RESET) {
For loops indicate to the compiler that we will probably access data more than once and maybe its neighbours.
        HAL_GPIO_WritePin(GPIOB, GPIO_PIN_13, GPIO_PIN_SET);
===Types of Associative Cache===
        HAL_Delay(500);
There are 3 types of associative addressing cache algorithms
        HAL_GPIO_WritePin(GPIOB, GPIO_PIN_13, GPIO_PIN_RESET);
*Fully Associative
      }
*Direct Mapping
*Set Associative
Each of the approaches uses tag value approach where the tag is part of the address and the value is organized to allow to find the element


    /* USER CODE BEGIN 3 */
===Fully Associative===
  }
For this approach the Full address is split between the Block ID and a number of bits known as the Word ID. The Block ID becomes the Tag. And the value is divided into slots based on the size of the word ID.
  /* USER CODE END 3 */
*1-bits 2 elements are stored slot-0, slot-1
</syntaxhighlight>
*2-bits 4 elements are stored slot-0, slot-1, slot-2,-slot-3
=Memory=
This is shown below<br>
Almost forgot to do this. So we now have this working in VS Code. Some brief reminders of some basics with regard to memory. So here is some simple code to copy some data from flash to the SRAM. On the board I have SRAM starts at 0x20000000.  
[[File:Fully Associative.jpg|300px]]<br>
<syntaxhighlight lang="c++">
The values are replaced in the case using a replacement alogorthm such as
char const myData[] = "I love Programming";
*FIFO First in First Out
#define BASE_ADDRESS_OF_SRAM 0x20000000
*LRU Least Recently Used. Replaced Block least used
*LFU Least Frequently Used. Replaced Block least frequently used
*Random Just pick one
This approach has the least chance of thrashing but is expensive and slow.
===Direct Mapping===
For this approach we use the more of the address to store a line ID.  
The full address is now split into
*Tag first part of address
*Line ID, which line of the cache to store data in
*Word ID, as before the slot for the data in the value
This does not need a replacement algorithm and is therefore fast and cheap. But given the line IDs means data can be replaced often it is prone to thrashing.
===Set Associative===
This is similar to the above example where instead of a Line ID, a Set ID is stored. I.E. the cache has n rows size of set ID. Shown below is a line ID of 2-bits so each row has two slots. For this approach we do need a replacement algorithm. This approach is used by raspberry PI and many manufacturers<br>
[[File:Set Asscociative.jpg|300px]]<br>
===Flags in Cache===
Along with tag and value there are flags associated with a row. They are
*Type Data or Instructrion
*Valid Whether valid
*Lock Lock flag
*Dirty bit - Identifies a line of data that has been written to but not been updated


void foo2() {
==Memory Device Interface==
  for(int i = 0; i < sizeof(myData); i++) {
Here was shown how the circuit might work with D Flip-Flops and a clock line. Added was an address decoders to allow the device to select the right clock line. Three other signals are required, a write, a read and a chip select.<br>
    *((uint8_t*) BASE_ADDRESS_OF_SRAM +i ) = myData[i];
[[File:MemoryInterface.jpeg]]<br>
  }
Ironically he went on to show this device which I reckon is the one I am using for the 6502.<br>  
}
[[File:32k Memory.png |200px]]<br>
</syntaxhighlight>
==Chip Select==
With eclipse you can add a memory window and set the format to ASCII and then go to the location to see the copy.
When we look at an MCU there is a memory map which shows where the peripherals are on the device. Each device has a range of memory used to operate it.
[[File:Memory debug Eclipse.png | 400px]]<br>
There are address lines within the mcu which are connected to the device. In the example below there are 20 address lines and the Graphic Card is located between 0xE0000 and 0xFFFFF. Putting this number into binary shows that address lines A19-A17 is the CS (Chip Select) and address line A16-A00 is the graphic card. Setting A19-A17 to binary 111 effectively means you are using the address lines for the graphics card.<br>
For VS Code I have struggled to get this going with hex 0x20000000 but for decimal 536870912 this works fine. <br>
[[File:Chip Select.jpeg|600px]]<br>
[[File:Memory View VS Code.png | 400px]]<br>
In order to operate the correct memory device from the processor you use the correct chip select.<br>
By adding &myData you can view the address in the memory view too. We can see here that myData is stored at 0x8002864. I have used the fred variable to demonstrate this. You can see the actual address is the first address in the memory view in yellow 0x8002864 however the display in white starts at 0x8002860.<br>
To maybe see a real world example here is windows showing us the memory map range for a device.<br>
[[File:Memory View VS Code2.png | 400px]]<br>
[[File:Chip Select Real World.jpeg|700px]]<br>

Latest revision as of 22:52, 5 February 2025

Introduction

My knowledge is very small; no, not just in general, but on this subject in computers. This is probably where I regret not having a degree. But here goes: I am going to try to understand enough of the diagram from the STM32F0xx Cortex-M0 to be dangerous.
I am again following Intermation and their course, Computer Organization and Design.

NVIC and EXTI

Not fully on board with this, but the NVIC (Nested Vectored Interrupt Controller) is an interrupt controller connected to the CPU. One of the docs (for the STM32G4) lists its features as

  • 102 interrupt sources,
  • 16 programmable priority levels,
  • Low-latency exception and interrupt handling,
  • Automatic nesting,
  • Power management control.

In the lesson I was doing, this came up because of the EXTI (EXTernal Interrupt/Event) controller, which is connected to the NVIC. When using CubeMX you can configure handlers for the GPIO pin, which connects through the EXTI to the NVIC. In my case there are 28 lines on the EXTI.

Pending Request Register

When we press a button it is flagged in the pending request register shown above. We can get the address of the register from the manual. In the case of the STM32F302R8 we can first find the EXTI in the manual

This is probably more about navigating the documentation than the detail, but here is the EXTI_PR1 documentation. After all, the software is easy.

So the EXTI occupies 0x4001 0400 - 0x4001 07FF, and EXTI_PR1 is at offset 0x14, so its address is 0x4001 0414.
They were very keen to stress that it is the programmer's (so old-fashioned) job to clear the bit in the PR when done. When using CubeMX, this is generated for you via macros.
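
The address arithmetic and the "clear the bit when done" step can be sketched in C. This is a minimal sketch, not the code CubeMX generates: the macro names and the helper exti_clear_pending are made up for illustration, and the register pointer is passed in so the function can also be tried on an ordinary variable. On the real part a pending bit is cleared by writing 1 to it, which is why the helper writes a single-bit mask rather than doing a read-modify-write.

```c
#include <stdint.h>

/* Values quoted from the reference manual in the text above. */
#define EXTI_BASE        0x40010400UL
#define EXTI_PR1_OFFSET  0x14UL
#define EXTI_PR1_ADDR    (EXTI_BASE + EXTI_PR1_OFFSET)   /* 0x40010414 */

/* Hypothetical helper: acknowledge one EXTI line.
 * Writing 1 to a pending bit clears it; writing 0 leaves a bit alone,
 * so only the requested line is affected. */
static inline void exti_clear_pending(volatile uint32_t *pr, unsigned line)
{
    *pr = (uint32_t)1u << line;
}
```

On the target, a handler for a button on line 13 (an assumed line number) would end with exti_clear_pending((volatile uint32_t *)EXTI_PR1_ADDR, 13).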

STM32 Header Files

Briefly, ARM have a thing called CMSIS. Vendors follow these guidelines and share common macros etc.

Volatile Keyword

Looking at the headers, a lot of them specify volatile. This forces the compiler to read the value from memory on every access instead of optimizing the read away. Without the volatile keyword, an optimizer may read *p only once, so the code below never sees the value change and stays in the first loop.

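#include <stdint.h>

#define SRAM_ADDRESS1   0x20000004U

int main(void)
{
  uint32_t value = 0;
  /* volatile: every *p must be a real read from memory */
  uint32_t volatile *p = (uint32_t volatile *) SRAM_ADDRESS1;

  /* First loop: wait until something outside this code writes a non-zero value */
  while(1)
  {
    value = *p;
    if(value) break;
  }

  /* Second loop: only reached once the value has changed */
  while(1);

  return 0;
}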

GPIO and Ports

Resetting Ports

Again on documentation: for most STM32 boards the manual describes each peripheral and lists its registers as the last entry. When you look at the ports, some of the reset values are not necessarily 0x0000 0000. For the STM32F302R8 they were
Address offset: 0x00

  • Reset value: 0xA800 0000 for port A
  • Reset value: 0x0000 0280 for port B
  • Reset value: 0x0000 0000 for other ports
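
To see why ports A and B do not reset to zero, the values can be decoded pin by pin. This is a sketch under an assumption: that the register at offset 0x00 is the port mode register with a 2-bit field per pin (0 input, 1 output, 2 alternate function, 3 analog). The helper name gpio_mode_of is made up for illustration.

```c
#include <stdint.h>

/* Return the 2-bit mode field of one pin (0..15) from a mode-register value.
 * Assumed encoding: 0 = input, 1 = output, 2 = alternate function, 3 = analog. */
static inline unsigned gpio_mode_of(uint32_t moder, unsigned pin)
{
    return (unsigned)((moder >> (2u * pin)) & 0x3u);
}
```

With the reset values listed above, gpio_mode_of(0xA8000000, 13) gives 2, as do pins 14 and 15 of port A and pins 3 and 4 of port B (0x00000280), while every other pin gives 0. Under that assumption, those pins start in alternate-function mode, which would fit the debug pins needing to work straight after reset.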

Each GPIO input should have a pull-up (or pull-down) resistor. This ensures the pin is not left floating, neither high nor low, which can otherwise happen due to residual voltage. The pull-up resistor value can be found in the documentation by searching for Rₚᵤ or Weak Pull-up.

GPIO Modes

A bit was said about this and the importance of using pull-up resistors. The open drain setting was brought up with I2C, so I may come back to this.

  • Input
  • Output
    • Push/Pull (0 or 1)
    • Open Drain (0 or floating)

Speed (Output Only)

We can set the speed of the output using the OSPEEDRy field. There are two bits for each pin, and they affect the rise time and fall time. You have to refer to the datasheet (separate from the reference manual) to understand the different available speeds. Search for OSPEEDR. I will be very happy if I ever need this. The speeds depend on the supply voltage and the load capacitance.
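
Setting a 2-bit field like this is a read-modify-write: clear the two bits for the pin, then OR in the new value. A minimal sketch follows. The helper gpio_with_speed is hypothetical, and what each speed value means must be taken from the datasheet.

```c
#include <stdint.h>

/* Hypothetical helper: return `reg` with the 2-bit speed field of `pin`
 * (0..15) replaced by `speed` (0..3). Other pins are left untouched. */
static inline uint32_t gpio_with_speed(uint32_t reg, unsigned pin, unsigned speed)
{
    uint32_t shift = 2u * pin;
    reg &= ~((uint32_t)0x3u << shift);           /* clear the old field */
    reg |= ((uint32_t)(speed & 0x3u)) << shift;  /* insert the new one  */
    return reg;
}
```

On the target the result would be written back to the speed register of the port, for example reg = gpio_with_speed(reg, 13, 3).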


A use case I have heard for setting these speeds is bit-banging. I do not fully understand it yet, but I believe you can fake an interface in software by toggling the pins yourself.
The slew rate is defined as the maximum rate of output voltage change per unit time. It is denoted by the letter S. The slew rate helps us to identify the amplitude and maximum input frequency suitable for an operational amplifier (op-amp) such that the output is not significantly distorted.

Alternate Function Mapping

There are 16 different alternate functions that pins can be used for. For STM you can generally see this on the pinout when googling, but the datasheet also holds a table, Alternate Function Mapping, showing which pins support what. These can be configured using the Alternate Function Register High (AFRH) and Alternate Function Register Low (AFRL).
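
Sixteen functions means a 4-bit field per pin, so a 16-pin port needs two 32-bit registers. A sketch of selecting a function, assuming pins 0 to 7 live in AFRL and pins 8 to 15 in AFRH; the helper gpio_set_af and the two-element array standing in for the registers are made up for illustration.

```c
#include <stdint.h>

/* afr[0] models AFRL (pins 0..7), afr[1] models AFRH (pins 8..15).
 * Each pin owns a 4-bit field, hence 16 possible alternate functions. */
static inline void gpio_set_af(volatile uint32_t afr[2], unsigned pin, unsigned af)
{
    unsigned idx   = pin >> 3;            /* 0 -> AFRL, 1 -> AFRH          */
    unsigned shift = (pin & 0x7u) * 4u;   /* position inside that register */
    uint32_t v = afr[idx];
    v &= ~((uint32_t)0xFu << shift);      /* clear the old selection */
    v |= (uint32_t)(af & 0xFu) << shift;  /* insert the new one      */
    afr[idx] = v;
}
```

For example, gpio_set_af(afr, 9, 7) puts 7 into bits 7:4 of the high register and leaves the low register alone. Which function number maps to which peripheral comes from the Alternate Function Mapping table.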

Other Stuff

Memory Hierarchy

There is a Hierarchy

  • Registers
  • Cache (L1 Local, L2 Shared)
  • Main Memory (RAM)
  • Long Term Storage (Hard Disks, Tapes etc)

Types Of Memory

  • DRAM: Dynamic RAM, uses capacitors; slower, cheaper, requires refreshing
  • SRAM: Static RAM, uses transistors; faster


Types Of Addressing Memory

These are the types of addressing.

  • Random Address (RAM)
  • Direct Addressing (HDD)
  • Sequential Addressing (Tapes)
  • Associative Addressing (Caching)

Speed and cost per bit both go down as performance goes down.

Cache Memory

Introduction

Terms

  • Miss: the element was not found
  • Hit: the element was found
  • Hit Rate: percentage of time that the element was found

Effective Access Time =
   Hit Rate * Time from Cache  + (1 - Hit Rate) * Time from Memory

Looking at examples, it was the hit rate that the tutor wanted us to focus on as the other parts would not change.
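
A quick way to get a feel for how much the hit rate matters is to compute the effective access time as hit rate times cache time plus miss rate times memory time. The sketch below does just that; the function name and the timings in the note afterwards are made-up illustration values, not figures from the course.

```c
/* Effective access time for a single cache level.
 * hit_rate is a fraction between 0.0 and 1.0; times are in any one unit. */
static inline double effective_access_time(double hit_rate,
                                           double t_cache,
                                           double t_memory)
{
    return hit_rate * t_cache + (1.0 - hit_rate) * t_memory;
}
```

With an assumed 10 ns cache and 100 ns memory, a hit rate of 0.9 gives about 19 ns and a hit rate of 0.99 gives about 10.9 ns, so a modest gain in hit rate nearly halves the average access time.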

Loops are a good hint that we will probably access the same data more than once, and maybe its neighbours too, which is exactly what a cache takes advantage of.

Types of Associative Cache

There are 3 types of cache addressing schemes

  • Fully Associative
  • Direct Mapping
  • Set Associative

Each of the approaches uses a tag/value scheme, where the tag is part of the address and the value is organized so that the element can be found.

Fully Associative

For this approach the full address is split between the Block ID and a number of bits known as the Word ID. The Block ID becomes the tag, and the value is divided into slots based on the size of the Word ID.

  • 1 bit: 2 elements are stored (slot-0, slot-1)
  • 2 bits: 4 elements are stored (slot-0, slot-1, slot-2, slot-3)

This is shown below

When the cache is full, the values are replaced using a replacement algorithm such as

  • FIFO: First In First Out
  • LRU: Least Recently Used. Replaces the block that has gone unused the longest
  • LFU: Least Frequently Used. Replaces the block that has been used least often
  • Random: just pick one

This approach has the least chance of thrashing but is expensive and slow.
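
As an illustration of one replacement policy, here is a minimal LRU sketch for a tiny fully associative cache. Everything in it is hypothetical (the 4-line size, the type names, the counter used as a timestamp), and real hardware does this with dedicated logic rather than a loop. It only tracks tags, not the stored data.

```c
#include <stdint.h>

#define NUM_LINES 4u   /* hypothetical, deliberately tiny */

typedef struct {
    uint32_t tag;        /* Block ID held in this line       */
    uint32_t last_used;  /* value of the clock at last access */
    int      valid;      /* 0 = line is empty                 */
} cache_line_t;

typedef struct {
    cache_line_t line[NUM_LINES];
    uint32_t     clock;  /* counts accesses, used as a timestamp */
} fa_cache_t;

/* Look up `tag`. Returns 1 on a hit, 0 on a miss. On a miss the tag is
 * loaded into an empty line if there is one, otherwise it replaces the
 * least recently used line. */
static int fa_cache_access(fa_cache_t *c, uint32_t tag)
{
    unsigned victim = 0;
    c->clock++;

    /* Fully associative: the tag may be in any line, so check them all. */
    for (unsigned i = 0; i < NUM_LINES; i++) {
        if (c->line[i].valid && c->line[i].tag == tag) {
            c->line[i].last_used = c->clock;
            return 1;
        }
    }

    /* Miss: prefer an empty line, else the one with the oldest timestamp. */
    for (unsigned i = 0; i < NUM_LINES; i++) {
        if (!c->line[i].valid) { victim = i; break; }
        if (c->line[i].last_used < c->line[victim].last_used) victim = i;
    }
    c->line[victim].tag = tag;
    c->line[victim].valid = 1;
    c->line[victim].last_used = c->clock;
    return 0;
}
```

Starting from a zero-initialized fa_cache_t, accessing tags 1, 2, 3, 4 and then 1 again ends with a hit. A following access to tag 5 evicts tag 2, because tag 1 was refreshed by that hit.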

Direct Mapping

For this approach we use more of the address to store a Line ID. The full address is now split into

  • Tag first part of address
  • Line ID, which line of the cache to store data in
  • Word ID, as before the slot for the data in the value

This does not need a replacement algorithm and is therefore fast and cheap. But because many addresses share the same Line ID, data can be replaced often, so it is prone to thrashing.
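
The split can be sketched with shifts and masks. The geometry here is an assumption chosen only for illustration (2 Word ID bits, 3 Line ID bits, the rest is the tag), and the names are made up.

```c
#include <stdint.h>

/* Hypothetical geometry: 2 Word ID bits (4 slots per line) and
 * 3 Line ID bits (8 lines). The remaining upper bits form the tag. */
#define WORD_BITS 2u
#define LINE_BITS 3u

typedef struct {
    uint32_t tag;
    uint32_t line;
    uint32_t word;
} dm_fields_t;

/* Split an address into tag, Line ID and Word ID. */
static inline dm_fields_t dm_split(uint32_t addr)
{
    dm_fields_t f;
    f.word = addr & ((1u << WORD_BITS) - 1u);
    f.line = (addr >> WORD_BITS) & ((1u << LINE_BITS) - 1u);
    f.tag  = addr >> (WORD_BITS + LINE_BITS);
    return f;
}
```

With this geometry, 0x20000004 and 0x20000024 both land on line 1 but carry different tags, so alternating between them evicts the line every time. That is the thrashing case.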

Set Associative

This is similar to direct mapping, except that a Set ID is stored instead of a Line ID, i.e. the number of rows in the cache is determined by the size of the Set ID. Shown below is a line ID of 2-bits so each row has two slots. For this approach we do need a replacement algorithm. This approach is used by the Raspberry Pi and many manufacturers.

Flags in Cache

Along with tag and value there are flags associated with a row. They are

  • Type: data or instruction
  • Valid: whether the row holds valid data
  • Lock: lock flag
  • Dirty bit: identifies a line of data that has been written to in the cache but not yet updated in memory

Memory Device Interface

Here it was shown how the circuit might work with D flip-flops and a clock line. An address decoder was added to allow the device to select the right clock line. Three other signals are required: a write, a read and a chip select.

Ironically he went on to show this device which I reckon is the one I am using for the 6502.

Chip Select

When we look at an MCU there is a memory map which shows where the peripherals are on the device. Each device has a range of memory used to operate it. There are address lines within the MCU which are connected to the device. In the example below there are 20 address lines and the graphics card is located between 0xE0000 and 0xFFFFF. Putting these numbers into binary shows that address lines A19-A17 are the CS (Chip Select) and address lines A16-A00 address locations inside the graphics card. Setting A19-A17 to binary 111 effectively means you are using the address lines for the graphics card.
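
The decoding in this example can be written as a shift and a mask. In hardware this is an address decoder, not software, so the sketch below only mirrors the logic, and the macro and function names are made up.

```c
#include <stdint.h>

/* 20-bit address bus from the example: A19..A17 select the device,
 * A16..A0 address a location inside it. */
#define CS_SHIFT     17u
#define CS_MASK      0x7u
#define CS_GRAPHICS  0x7u   /* binary 111 */

/* Non-zero when the address falls inside the graphics card range. */
static inline int graphics_selected(uint32_t addr)
{
    return ((addr >> CS_SHIFT) & CS_MASK) == CS_GRAPHICS;
}

/* The part of the address the selected device itself sees (A16..A0). */
static inline uint32_t device_offset(uint32_t addr)
{
    return addr & ((1u << CS_SHIFT) - 1u);
}
```

Every address from 0xE0000 to 0xFFFFF has A19-A17 equal to 111, so graphics_selected is true across the whole range, while 0xDFFFF (binary 110 in the top three bits) selects something else.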

In order to operate the correct memory device from the processor, you use the correct chip select.
For a real-world example, here is Windows showing us the memory map range for a device.