STM32 Architecture: Difference between revisions

From bibbleWiki
(173 intermediate revisions by the same user not shown)
=Introduction=
This is the page for all things STM32. Currently working with a Nucleo F302R8
My knowledge is very small, not just in general but on this subject in computers. This is probably where I regret not having a degree. But here goes: I am going to try to understand enough of the diagram from the STM32F0xx Cortex-M0 to be dangerous.
=Setting Up=
[[File:STM32F0xx Cortex-M0 System Overview.png| 400px]]<br>
This was a trial and a half and here in case some others struggle and find this help.<br>
I am again looking at Intermation and their course, which is Computer Organization and Design.
==STM32CubeIDE (Eclipse)==
=NVIC and EXTI=
My goal was to use Eclipse because this is what is used in the videos. So, with an Ubuntu 23.04 install, I went to install it. But the Eclipse version requires Python 2.7, which is no longer available.
Not fully on board with this yet, but the NVIC (Nested Vectored Interrupt Controller) is an interrupt controller connected to the CPU. One of the docs (STM32G4) lists its features as:
==VS Code==
*102 interrupt sources,
Luckily STM32 had brought out an extension for my preferred solution, VS Code. I installed the extension and went about installing the 3 other products it mentioned.
*16 programmable priority levels,
*STM32CubeMX
*Low-latency exception and interrupt handling,
*STMCUFinder
*Automatic nesting,
*stm32cubeclt_1.12.1
*Power management control.
I started it up, but the import project button did nothing at all, so I assumed it must need STM32CubeIDE. So I went back to trying to install libpython2.7 and found https://askubuntu.com/questions/101591/how-do-i-install-the-latest-python-2-7-x-or-3-x-on-ubuntu. Unfortunately, whatever I did must have moved ld.so or something equally serious, so I had to re-install. But the good news was I could retry the STM32 extension. On the new install I tried the STM32 extension and it did indeed say it could not find STM32CubeIDE. I documented this on the STM32 forum under https://community.st.com/s/question/0D53W00002IMDFZSA5/import-project-in-vs-code-ubuntu-2404
In the lesson I was doing, this came up because of the EXTI (EXTernal Interrupt/Event) controller, which is connected to the NVIC. When using CubeMX you can configure handlers for a GPIO pin, which connects through the EXTI to the NVIC. In my case there are 28 lines on the EXTI.<br>
==STM32CubeIDE Attempt 2==
[[File:NVIC EXTI.png|400px]]
There was an additional install of STM32CubeIDE for vanilla Linux. So I downloaded this and installed it. But on start up it failed with an error in org.eclipse.swt.internal.C::strlen. <br>
=Pending Request Register=
When we press a button it is flagged in the pending request register shown above. We can get the address of the register from the manual. In the case of the STM32F302R8 we can first find the EXTI in the manual<br>
[[File:STM32F302R8 EXTI.png|400px]]<br>
This is probably more about navigating the documentation than the detail, but here is the EXTI_PR1 documentation. After all, the software is easy.<br>
[[File:EXTI PR Register.png|400px]]<br>
So the address range of the EXTI is 0x4001 0400 - 0x4001 07FF, and when we look for EXTI_PR1 it is at offset 0x14, so its address is 0x4001 0414.<br>
They were very keen to stress that it is the programmer's (so old fashioned) job to clear the bit in the PR when done. When using CubeMX, this is generated for you via macros.<br>
[[File:STM32 EXTI Handler.png|400px]]<br>
=STM32 Header Files=
Briefly, ARM have a thing called CMSIS (Cortex Microcontroller Software Interface Standard). Vendors follow these guidelines and share common macros etc.
==Volatile Keyword==
Looking at the headers, a lot of them specify volatile. This forces the compiler to always read the value from memory and not optimize the read out. With optimization enabled and without the volatile keyword, the compiler may read *p only once in the code below, so value is never updated and execution remains in the first loop.
<syntaxhighlight lang='c'>
 
#include <stdint.h>
 
 
#define SRAM_ADDRESS1  0x20000004U
 
int main(void)
{
 
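  uint32_t value = 0;
  /* volatile: every read of *p must really access memory */
  uint32_t volatile *p = (uint32_t volatile *) SRAM_ADDRESS1;

  /* Wait until something (e.g. the debugger) writes a non-zero value */
  while(1)
  {
    value = *p;
    if(value) break;
  }

  /* Park here once the value has changed */
  while(1);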
 
  return 0;
}
</syntaxhighlight>
=GPIO and Ports=
==Resetting Ports==
Again on documentation: the docs for most of the STM32 boards list each peripheral and have the registers as the last entry. When you look at the ports, some of the reset values are not necessarily 0x0000 0000. For the STM32F302R8 they were<br>
'''Address offset: 0x00'''<br>
*Reset value: 0xA800 0000 for port A
*Reset value: 0x0000 0280 for port B
*Reset value: 0x0000 0000 for other ports
Each GPIO input should have a pull-up resistor. This ensures pins are not floating, neither positive nor negative, which can happen due to residual voltage. The pull-up resistor value can be found in the documentation by searching for Rₚᵤ or Weak Pull-up.<br>
==GPIO Modes==
A bit was said about this and the importance of using pullup resistors. The open drain setting was brought up with I2C so may come back to this.
*Input
*Output
**Push/Pull (0 or 1)
**Open Drain (0 or floating)
==Speed (Output Only)==
We can set the speed of the output using OSPEEDRy. There are two bits for each pin; these affect the rise time and fall time. You have to refer to the datasheet (separate from the reference manual) to understand the different available speeds. Search for OSPEEDS. I will be very happy if I ever need this. The speeds depend on the voltage and the load capacitance.<br>
[[File:GPIO Speeds.png|600px]]<br>
<br>
<br>
But the next morning googling I found https://github.com/adoptium/adoptium-support/issues/785 and the solution to getting it to work.
A use case I have heard of for setting these speeds is bit-banging, which I do not currently understand, but I believe you can fake an interface with this technique.<br>
<syntaxhighlight lang="bash">
The slew rate is defined as the maximum rate of output voltage change per unit time. It is denoted by the letter S. The slew rate helps us to identify the amplitude and maximum input frequency suitable to an operational amplifier (OP amp) such that the output is not significantly distorted.<br>
mkdir /tmp/SWT-GDBusServer
==Alternate Function Mapping==
</syntaxhighlight>
There are 16 different alternate functions a pin can be used for. For STM you can generally see this on the pinout when googling, but the datasheet also holds a table, Alternate Function Mapping, showing which pins support what. These can be configured using the Alternate Function Register High (AFRH) and Alternate Function Register Low (AFRL).
So my final solution was
=Other Stuff=
*Ubuntu 23.04
==Memory Hierarchy==
*en.st-stm32cubeide_1.12.1_16088_20230420_1057_amd64.sh.zip
There is a Hierarchy
*en.ST-MCU-FinderLin_v5-0-0.zip
*Registers
*en.stm32cubemx-lin-v6-8-1.zip
*Cache (L1 Local, L2 Shared)
=Debugging=
*Main Memory (RAM)
==Getting Started==
*Long Term Storage (Hard Disks, Tapes etc)
Well now have all of the bits installed. Next it is time to start debugging
==Types Of Memory==
==Update nucleo f302r8==
*DRAM Dynamic RAM uses capacitors, slower, cheaper, requires refreshing
This required an Update to the firmware. Google is your friend. Downloaded en.stsw-link007-v3-12-3.zip. This contained the udev rules which you can install using dpkg. Don't forget to reload rules with
*SRAM Static RAM uses transistors, faster<br>
<syntaxhighlight lang="bash">
[[File:SRAM vs DRAM.jpg|500px]]<br>
sudo udevadm control --reload-rules
 
sudo udevadm trigger
==Types Of Addressing Memory==
</syntaxhighlight>
These are the types of addressing.
Next under the stsw-link007/AllPlatforms
*Random Address (RAM)
<syntaxhighlight lang="bash">
*Direct Addressing (HDD)
sudo java -jar STLinkUpgrade.jar
*Sequential Addressing (Tapes)
</syntaxhighlight>
*Associative Addressing (Caching)
Hopefully this all goes well
Speed and cost per bit go together: cost per bit goes down as performance goes down.
==Cache Memory==
===Introduction===
Terms
*Miss Element was not found
*Hit Element was found
*Hit Rate Percentage of time that element was found
 
Effective Access Time =
    Hit Rate * Time from Cache + (1 - Hit Rate) * Time from Memory
 
Looking at examples, it was the hit rate that the tutor wanted us to focus on as the other parts would not change.<br>
<br>
Loops tell us that a program will probably access the same data more than once, and maybe its neighbours, which is what a cache takes advantage of.
===Types of Associative Cache===
There are 3 types of associative addressing cache algorithms
*Fully Associative
*Direct Mapping
*Set Associative
Each of the approaches uses a tag/value approach, where the tag is part of the address and the value is organized so that the element can be found.
 
===Fully Associative===
For this approach the Full address is split between the Block ID and a number of bits known as the Word ID. The Block ID becomes the Tag. And the value is divided into slots based on the size of the word ID.
*1 bit: 2 elements are stored in slot-0, slot-1
*2 bits: 4 elements are stored in slot-0, slot-1, slot-2, slot-3
This is shown below<br>
[[File:Fully Associative.jpg|300px]]<br>
The values are replaced, when the cache is full, using a replacement algorithm such as
*FIFO First In First Out
*LRU Least Recently Used. Replaces the block that was used least recently
*LFU Least Frequently Used. Replaces the block that is used least frequently
*Random Just pick one
This approach has the least chance of thrashing but is expensive and slow.
===Direct Mapping===
For this approach we use more of the address to store a Line ID.
The full address is now split into
*Tag first part of address
*Line ID, which line of the cache to store data in
*Word ID, as before the slot for the data in the value
This does not need a replacement algorithm and is therefore fast and cheap. But because many addresses share the same Line ID, data can be replaced often, so it is prone to thrashing.
===Set Associative===
This is similar to direct mapping, except that a Set ID is stored instead of a Line ID, i.e. the number of rows in the cache is determined by the size of the Set ID. Shown below is a line ID of 2 bits, so each row has two slots. Because a block can go in more than one slot of its set, this approach does need a replacement algorithm. This approach is used by the Raspberry Pi and many manufacturers.<br>
[[File:Set Asscociative.jpg|300px]]<br>
===Flags in Cache===
Along with tag and value there are flags associated with a row. They are
*Type: Data or Instruction
*Valid: Whether the row holds valid data
*Lock: Lock flag
*Dirty bit: Identifies a line of data that has been written to in the cache but not yet written back to main memory
 
==Memory Device Interface==
Here it was shown how the circuit might work with D flip-flops and a clock line. An address decoder was added to allow the device to select the right clock line. Three other signals are required: a write, a read and a chip select.<br>
[[File:MemoryInterface.jpeg]]<br>
Ironically he went on to show this device which I reckon is the one I am using for the 6502.<br>
[[File:32k Memory.png |200px]]<br>
==Chip Select==
When we look at an MCU there is a memory map which shows where the peripherals are on the device. Each device has a range of memory used to operate it.
There are address lines within the MCU which are connected to the device. In the example below there are 20 address lines and the graphics card is located between 0xE0000 and 0xFFFFF. Putting these numbers into binary shows that address lines A19-A17 are the CS (Chip Select) and address lines A16-A00 address the graphics card itself. Setting A19-A17 to binary 111 effectively means you are using the address lines for the graphics card.<br>
[[File:Chip Select.jpeg|600px]]<br>
In order to operate the correct memory device from the processor you use the correct chip select.<br>
For a real world example, here is Windows showing us the memory map range for a device.<br>
[[File:Chip Select Real World.jpeg|700px]]<br>

Latest revision as of 22:52, 5 February 2025

Introduction

My knowledge is very small, not just in general but on this subject in computers. This is probably where I regret not having a degree. But here goes: I am going to try to understand enough of the diagram from the STM32F0xx Cortex-M0 to be dangerous.
I am again looking at Intermation and their course, which is Computer Organization and Design.

NVIC and EXTI

Not fully on board with this yet, but the NVIC (Nested Vectored Interrupt Controller) is an interrupt controller connected to the CPU. One of the docs (STM32G4) lists its features as:

  • 102 interrupt sources,
  • 16 programmable priority levels,
  • Low-latency exception and interrupt handling,
  • Automatic nesting,
  • Power management control.

In the lesson I was doing, this came up because of the EXTI (EXTernal Interrupt/Event) controller, which is connected to the NVIC. When using CubeMX you can configure handlers for a GPIO pin, which connects through the EXTI to the NVIC. In my case there are 28 lines on the EXTI.

Pending Request Register

When we press a button it is flagged in the pending request register shown above. We can get the address of the register from the manual. In the case of the STM32F302R8 we can first find the EXTI in the manual

This is probably more about navigating the documentation than the detail, but here is the EXTI_PR1 documentation. After all, the software is easy.

So the address range of the EXTI is 0x4001 0400 - 0x4001 07FF, and when we look for EXTI_PR1 it is at offset 0x14, so its address is 0x4001 0414.
They were very keen to stress that it is the programmer's (so old fashioned) job to clear the bit in the PR when done. When using CubeMX, this is generated for you via macros.
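
The address arithmetic and the clearing step can be sketched in C. This is only a sketch: the function names are made up for illustration, and it assumes the pending bits are of the write-1-to-clear kind, which is how I read the register description. The register is passed in as a pointer so the same code can be tried on a PC against an ordinary variable.

```c
#include <stdint.h>

/* Values taken from the text above: EXTI block base and EXTI_PR1 offset. */
#define EXTI_BASE        0x40010400UL
#define EXTI_PR1_OFFSET  0x14UL

/* Address of EXTI_PR1 = base address + register offset. */
uint32_t exti_pr1_address(void)
{
    return (uint32_t)(EXTI_BASE + EXTI_PR1_OFFSET);
}

/* Bit mask for one EXTI line (hypothetical helper, line 0..31). */
uint32_t exti_line_mask(unsigned line)
{
    return (uint32_t)1u << line;
}

/* Clear one pending line by writing a 1 to its bit
 * (assumption: the bit is write-1-to-clear). */
void exti_clear_pending(volatile uint32_t *pr, unsigned line)
{
    *pr = exti_line_mask(line);
}
```

On the board the call would look something like exti_clear_pending((volatile uint32_t *)0x40010414, 13) for a button on line 13 (the line number is an assumption). Note the plain assignment rather than |=: with write-1-to-clear bits, a read-modify-write would also write back every other pending bit and clear those too.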

STM32 Header Files

Briefly, ARM have a thing called CMSIS (Cortex Microcontroller Software Interface Standard). Vendors follow these guidelines and share common macros etc.

Volatile Keyword

Looking at the headers, a lot of them specify volatile. This forces the compiler to always read the value from memory and not optimize the read out. With optimization enabled and without the volatile keyword, the compiler may read *p only once in the code below, so value is never updated and execution remains in the first loop.

#include <stdint.h>


#define SRAM_ADDRESS1   0x20000004U

int main(void)
{

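  uint32_t value = 0;
  /* volatile: every read of *p must really access memory */
  uint32_t volatile *p = (uint32_t volatile *) SRAM_ADDRESS1;

  /* Wait until something (e.g. the debugger) writes a non-zero value */
  while(1)
  {
    value = *p;
    if(value) break;
  }

  /* Park here once the value has changed */
  while(1);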

  return 0;
}

GPIO and Ports

Resetting Ports

Again on documentation: the docs for most of the STM32 boards list each peripheral and have the registers as the last entry. When you look at the ports, some of the reset values are not necessarily 0x0000 0000. For the STM32F302R8 they were
Address offset: 0x00

  • Reset value: 0xA800 0000 for port A
  • Reset value: 0x0000 0280 for port B
  • Reset value: 0x0000 0000 for other ports
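
To see what those non-zero reset values mean, here is a small sketch that decodes them. It assumes the register at offset 0x00 is the mode register (GPIOx_MODER) with two bits per pin; the enum and function names are my own.

```c
#include <stdint.h>

/* The four encodings of a 2-bit mode field (assumed MODER layout). */
enum gpio_mode {
    GPIO_INPUT     = 0,
    GPIO_OUTPUT    = 1,
    GPIO_ALTERNATE = 2,
    GPIO_ANALOG    = 3
};

/* Extract the 2-bit mode field of one pin (0..15) from a MODER value. */
enum gpio_mode gpio_mode_of(uint32_t moder, unsigned pin)
{
    return (enum gpio_mode)((moder >> (pin * 2u)) & 0x3u);
}
```

Decoding 0xA800 0000 this way puts pins 13, 14 and 15 of port A in alternate function mode, and 0x0000 0280 does the same for pins 3 and 4 of port B. Presumably these are the debug pins, which would explain why they do not reset to plain inputs.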

Each GPIO input should have a pull-up resistor. This ensures pins are not floating, neither positive nor negative, which can happen due to residual voltage. The pull-up resistor value can be found in the documentation by searching for Rₚᵤ or Weak Pull-up.
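
Enabling the internal weak pull-up is a read-modify-write of a two-bit field. A minimal sketch, assuming the pull-up/pull-down register (GPIOx_PUPDR) uses two bits per pin with 01 for pull-up and 10 for pull-down; the names are hypothetical and the register is passed as a pointer so it can be tested off-target.

```c
#include <stdint.h>

/* Assumed encodings of a 2-bit PUPDR field. */
enum gpio_pull {
    GPIO_NOPULL   = 0,
    GPIO_PULLUP   = 1,
    GPIO_PULLDOWN = 2
};

/* Set the pull configuration of one pin (0..15) without touching the others. */
void gpio_set_pull(volatile uint32_t *pupdr, unsigned pin, enum gpio_pull pull)
{
    uint32_t v = *pupdr;
    v &= ~((uint32_t)0x3u << (pin * 2u));   /* clear the pin's two bits */
    v |= (uint32_t)pull << (pin * 2u);      /* write the new setting */
    *pupdr = v;
}
```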

GPIO Modes

A bit was said about this and the importance of using pull-up resistors. The open drain setting was brought up with I2C, so I may come back to this.

  • Input
  • Output
    • Push/Pull (0 or 1)
    • Open Drain (0 or floating)
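
The difference between the two output types can be modelled in a few lines. This is a toy model, not driver code, and all the names are made up. It shows why open drain needs a pull-up: writing a 1 only releases the line, and something else has to bring it high.

```c
#include <stdbool.h>

enum level { LEVEL_LOW = 0, LEVEL_HIGH = 1, LEVEL_FLOATING = 2 };

/* Push/pull: the pin actively drives both 0 and 1. */
enum level push_pull_out(bool bit)
{
    return bit ? LEVEL_HIGH : LEVEL_LOW;
}

/* Open drain: the pin can only pull the line to 0.
 * Writing 1 releases the line, so its level depends on a pull-up. */
enum level open_drain_out(bool bit, bool has_pullup)
{
    if (!bit)
        return LEVEL_LOW;
    return has_pullup ? LEVEL_HIGH : LEVEL_FLOATING;
}
```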

Speed (Output Only)

We can set the speed of the output using OSPEEDRy. There are two bits for each pin; these affect the rise time and fall time. You have to refer to the datasheet (separate from the reference manual) to understand the different available speeds. Search for OSPEEDS. I will be very happy if I ever need this. The speeds depend on the voltage and the load capacitance.


A use case I have heard of for setting these speeds is bit-banging, which I do not currently understand, but I believe you can fake an interface with this technique.
The slew rate is defined as the maximum rate of output voltage change per unit time. It is denoted by the letter S. The slew rate helps us to identify the amplitude and maximum input frequency suitable to an operational amplifier (OP amp) such that the output is not significantly distorted.

Alternate Function Mapping

There are 16 different alternate functions a pin can be used for. For STM you can generally see this on the pinout when googling, but the datasheet also holds a table, Alternate Function Mapping, showing which pins support what. These can be configured using the Alternate Function Register High (AFRH) and Alternate Function Register Low (AFRL).
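
Sixteen functions means four bits per pin, so sixteen pins need two 32-bit registers. A sketch of selecting a function, assuming AFRL covers pins 0-7 and AFRH pins 8-15; the function name and the two-element array standing in for the register pair are my own.

```c
#include <stdint.h>

/* afr[0] stands in for AFRL (pins 0-7), afr[1] for AFRH (pins 8-15). */
void gpio_set_alternate(volatile uint32_t afr[2], unsigned pin, unsigned af)
{
    unsigned index = pin / 8u;           /* which register */
    unsigned shift = (pin % 8u) * 4u;    /* 4 bits per pin */
    uint32_t v = afr[index];

    v &= ~((uint32_t)0xFu << shift);     /* clear the old function */
    v |= ((uint32_t)af & 0xFu) << shift; /* select AF0..AF15 */
    afr[index] = v;
}
```

Which AF number gives which peripheral still has to come from the Alternate Function Mapping table in the datasheet.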

Other Stuff

Memory Hierarchy

There is a Hierarchy

  • Registers
  • Cache (L1 Local, L2 Shared)
  • Main Memory (RAM)
  • Long Term Storage (Hard Disks, Tapes etc)

Types Of Memory

  • DRAM Dynamic RAM uses capacitors, slower, cheaper, requires refreshing
  • SRAM Static RAM uses transistors, faster


Types Of Addressing Memory

These are the types of addressing.

  • Random Address (RAM)
  • Direct Addressing (HDD)
  • Sequential Addressing (Tapes)
  • Associative Addressing (Caching)

Speed and cost per bit go together: cost per bit goes down as performance goes down.

Cache Memory

Introduction

Terms

  • Miss Element was not found
  • Hit Element was found
  • Hit Rate Percentage of time that element was found

Effective Access Time =
   Hit Rate * Time from Cache + (1 - Hit Rate) * Time from Memory

Looking at examples, it was the hit rate that the tutor wanted us to focus on as the other parts would not change.
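
Reading the effective access time as hit rate times cache time plus miss rate times memory time, it is a one-line function. The timings in the note below are invented for illustration, not taken from the course.

```c
/* Effective access time: weighted average of cache and memory access times.
 * hit_rate is a fraction between 0.0 and 1.0; the times share one unit. */
double effective_access_time(double hit_rate, double t_cache, double t_memory)
{
    return hit_rate * t_cache + (1.0 - hit_rate) * t_memory;
}
```

With a 10 ns cache and 100 ns memory, a 90% hit rate gives 0.9 * 10 + 0.1 * 100 = 19 ns, while 95% gives 14.5 ns. That is why the hit rate is the part worth focusing on.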

Loops tell us that a program will probably access the same data more than once, and maybe its neighbours, which is what a cache takes advantage of.

Types of Associative Cache

There are 3 types of associative addressing cache algorithms

  • Fully Associative
  • Direct Mapping
  • Set Associative

Each of the approaches uses a tag/value approach, where the tag is part of the address and the value is organized so that the element can be found.

Fully Associative

For this approach the Full address is split between the Block ID and a number of bits known as the Word ID. The Block ID becomes the Tag. And the value is divided into slots based on the size of the word ID.

  • 1 bit: 2 elements are stored in slot-0, slot-1
  • 2 bits: 4 elements are stored in slot-0, slot-1, slot-2, slot-3

This is shown below

The values are replaced, when the cache is full, using a replacement algorithm such as

  • FIFO First In First Out
  • LRU Least Recently Used. Replaces the block that was used least recently
  • LFU Least Frequently Used. Replaces the block that is used least frequently
  • Random Just pick one

This approach has the least chance of thrashing but is expensive and slow.
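
A toy fully associative cache makes the tag search and the replacement step concrete. This is a sketch under assumptions: a 2-bit Word ID, four rows and FIFO replacement are arbitrary choices of mine, and only the tags are tracked, not the data.

```c
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 2u   /* assumed: 2-bit Word ID, so 4 slots per block */
#define NUM_ROWS  4u   /* assumed: a tiny cache with 4 rows */

struct fa_cache {
    uint32_t tag[NUM_ROWS];    /* Block ID stored as the tag */
    bool     valid[NUM_ROWS];
    unsigned next;             /* FIFO pointer: next row to replace */
};

/* Returns true on a hit. On a miss the block is loaded,
 * replacing the oldest row (FIFO). */
bool fa_access(struct fa_cache *c, uint32_t address)
{
    uint32_t tag = address >> WORD_BITS;   /* drop the Word ID */

    /* Any row may hold any block, so every tag has to be compared. */
    for (unsigned i = 0; i < NUM_ROWS; i++) {
        if (c->valid[i] && c->tag[i] == tag)
            return true;
    }

    c->tag[c->next] = tag;
    c->valid[c->next] = true;
    c->next = (c->next + 1u) % NUM_ROWS;
    return false;
}
```

The loop over every row is the expensive part. Hardware does all the comparisons in parallel, which costs a comparator per row.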

Direct Mapping

For this approach we use more of the address to store a Line ID. The full address is now split into

  • Tag first part of address
  • Line ID, which line of the cache to store data in
  • Word ID, as before the slot for the data in the value

This does not need a replacement algorithm and is therefore fast and cheap. But because many addresses share the same Line ID, data can be replaced often, so it is prone to thrashing.
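
Splitting an address into the three fields is just shifting and masking. In this sketch the field widths (2-bit Word ID, 3-bit Line ID) are assumptions picked to keep the numbers small.

```c
#include <stdint.h>

#define WORD_BITS 2u   /* assumed: 4 words per line */
#define LINE_BITS 3u   /* assumed: 8 lines in the cache */

struct dm_fields {
    uint32_t tag;    /* first part of the address */
    uint32_t line;   /* which line of the cache to use */
    uint32_t word;   /* slot within the line */
};

/* Split an address into tag, Line ID and Word ID. */
struct dm_fields dm_split(uint32_t address)
{
    struct dm_fields f;
    f.word = address & ((1u << WORD_BITS) - 1u);
    f.line = (address >> WORD_BITS) & ((1u << LINE_BITS) - 1u);
    f.tag  = address >> (WORD_BITS + LINE_BITS);
    return f;
}
```

With these widths, addresses 0x0C and 0x2C both land on line 3 but with tags 0 and 1. A loop that alternates between them evicts the other one every time, which is the thrashing case.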

Set Associative

This is similar to direct mapping, except that a Set ID is stored instead of a Line ID, i.e. the number of rows in the cache is determined by the size of the Set ID. Shown below is a line ID of 2 bits, so each row has two slots. Because a block can go in more than one slot of its set, this approach does need a replacement algorithm. This approach is used by the Raspberry Pi and many manufacturers.

Flags in Cache

Along with tag and value there are flags associated with a row. They are

  • Type: Data or Instruction
  • Valid: Whether the row holds valid data
  • Lock: Lock flag
  • Dirty bit: Identifies a line of data that has been written to in the cache but not yet written back to main memory
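
One way to picture a row with its flags is a C struct with bit-fields. The layout, the four-word value and the helper names are all assumptions for illustration; a real cache keeps this in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

struct cache_row {
    uint32_t tag;
    uint32_t value[4];             /* assumed: 4 words per row */
    unsigned is_instruction : 1;   /* Type: 1 = instruction, 0 = data */
    unsigned valid : 1;            /* row holds real data */
    unsigned lock : 1;             /* row must not be replaced */
    unsigned dirty : 1;            /* written to, not yet in main memory */
};

/* Writing to the cached copy marks the row dirty. */
void cache_row_write(struct cache_row *row, unsigned word, uint32_t data)
{
    row->value[word & 3u] = data;
    row->dirty = 1;
}

/* A valid, dirty row must be written back before it is replaced. */
bool cache_row_needs_writeback(const struct cache_row *row)
{
    return row->valid && row->dirty;
}
```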

Memory Device Interface

Here it was shown how the circuit might work with D flip-flops and a clock line. An address decoder was added to allow the device to select the right clock line. Three other signals are required: a write, a read and a chip select.

Ironically he went on to show this device which I reckon is the one I am using for the 6502.

Chip Select

When we look at an MCU there is a memory map which shows where the peripherals are on the device. Each device has a range of memory used to operate it. There are address lines within the MCU which are connected to the device. In the example below there are 20 address lines and the graphics card is located between 0xE0000 and 0xFFFFF. Putting these numbers into binary shows that address lines A19-A17 are the CS (Chip Select) and address lines A16-A00 address the graphics card itself. Setting A19-A17 to binary 111 effectively means you are using the address lines for the graphics card.
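
The same decoding can be written out in C to check the numbers. The macro and function names are mine; the bit positions and the 0xE0000 - 0xFFFFF range come from the example above.

```c
#include <stdbool.h>
#include <stdint.h>

#define CS_SHIFT     17u    /* A16..A0 go to the device itself */
#define CS_MASK      0x7u   /* A19..A17 form the chip select */
#define GRAPHICS_CS  0x7u   /* binary 111 selects the graphics card */

/* The three chip select bits of a 20-bit address. */
unsigned chip_select_bits(uint32_t address)
{
    return (address >> CS_SHIFT) & CS_MASK;
}

/* True when the address falls in 0xE0000..0xFFFFF. */
bool selects_graphics(uint32_t address)
{
    return chip_select_bits(address) == GRAPHICS_CS;
}

/* The part of the address the selected device sees (A16..A0). */
uint32_t device_offset(uint32_t address)
{
    return address & ((1u << CS_SHIFT) - 1u);
}
```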

In order to operate the correct memory device from the processor you use the correct chip select.
For a real world example, here is Windows showing us the memory map range for a device.