nicocvn / cppreg Goto Github PK

A C++11 header-only library for MMIO registers

Home Page: https://nicocvn.github.io/cppreg/

License: Other

CMake 2.25% C 0.53% C++ 97.22%

cpp cmsis mcu register

cppreg's Introduction

cppreg

Description

cppreg is a header-only C++11 library to facilitate the manipulation of MMIO registers (i.e., memory-mapped I/O registers) in embedded devices. The idea is to provide a way to write expressive code and minimize the likelihood of ill-defined expressions when dealing with hardware registers on an MCU. The current features are:

expressive syntax which shows the intent of the code when dealing with registers and fields,
efficiency and performance on par with traditional C implementations (e.g. CMSIS C code) when at least some compiler optimizations are enabled,
emphasis on ensuring the assembly is the same if not better than CMSIS versions,
field access policies (e.g. read-only vs read-write) detect ill-defined access at compile-time,
compile-time detection of overflow,
register memory can easily be mocked up so that testing is possible.
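
As a rough illustration of the compile-time checks listed above, here is a minimal host-side sketch. All names (Field, read_only, read_write, mock_register) are hypothetical and only mimic the idea; this is not the cppreg API, and a plain variable stands in for the register memory.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical sketch, NOT the cppreg API.
// A plain variable mocks the register memory so this runs on a host machine.
static std::uint32_t mock_register = 0;

// Access policies are modelled as empty tag types.
struct read_only {};
struct read_write {};

template <unsigned Width, unsigned Offset, typename Policy>
struct Field {
    static_assert(Width > 0 && Width + Offset <= 32,
                  "field must fit in a 32-bit register");

    // The mask is computed once, at compile time.
    static constexpr std::uint32_t mask =
        static_cast<std::uint32_t>(((std::uint64_t{1} << Width) - 1u) << Offset);

    static std::uint32_t read() { return (mock_register & mask) >> Offset; }

    // Template form: the value is a compile-time constant, so both a write to
    // a read-only field and an overflowing value are rejected by the compiler.
    template <std::uint32_t Value>
    static void write() {
        static_assert(std::is_same<Policy, read_write>::value,
                      "field is not writable");
        static_assert((Value & ~(mask >> Offset)) == 0u,
                      "value overflows the field");
        mock_register = (mock_register & ~mask) | (Value << Offset);
    }
};

using Mode = Field<2, 4, read_write>;   // 2-bit field at bits [5:4]
using Status = Field<1, 0, read_only>;  // 1-bit field at bit 0

// Mode::write<0x4>();    // would not compile: value overflows the 2-bit field
// Status::write<0x1>();  // would not compile: field is not writable
```

With this sketch, Mode::write<0x2>() only touches bits 4 and 5 of the mocked register, and the two commented-out calls fail at compile time rather than at runtime.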

For a short introduction and how-to see the quick start guide. A more complete and detailed documentation is available here.

The features provided by cppreg come with no overhead or performance penalty compared to traditional low-level C approaches. We give here an example comparing the assembly generated by a CMSIS-like implementation versus a cppreg-based one.

Requirements

cppreg is designed to be usable on virtually any hardware that satisfies the following requirements:

MMIO register sizes are integral numbers of bytes (e.g., 8 bits, 16 bits, ...),
registers are properly aligned: an N-bit register is aligned on an N-bit boundary.

GCC (4.8 and above) and Clang (3.3 and above) are supported and it is expected that any other C++11-compliant compiler should work (see the quick start guide for recommended compiler settings).

Manifest

This project started when looking at this type of C code:

// Now we enable the PLL as source for MCGCLKOUT.
MCG->C6 |= (1u << MCG_C6_PLLS_SHIFT);

// Wait for the MCG to use the PLL as source clock.
while ((MCG->S & MCG_S_PLLST_MASK) == 0)
    __NOP();

This piece of code is part of the clock setup on a flavor of the K64F MCU. MCG is a peripheral and MCG->C6 and MCG->S are registers containing some fields which are required to be set to specific values to properly clock the MCU. Some of the issues with such code are:

the intent of the code is poorly expressed, and it requires at least the MCU data sheet or reference manual to be somewhat deciphered/understood,
since the offsets and masks are known at compile time, it is error prone and somewhat tedious that the code has to re-implement shifting and masking operations,
the code syntax itself is extremely error prone; for example, forget the | in |= and you are most likely going to spend some time debugging the code,
there is no safety at all, that is, you might overflow the field, or you might try to write to a read-only field and no one will tell you (not the compiler, not the linker, and at runtime this could fail in various ways with little, if any, indication of where the error is coming from).

It does not have to be this way, and C++11 brings a lot of features and concepts that make it possible to achieve the same goal while clearly expressing the intent and being aware of any ill-formed instructions. Some will argue this will come at the cost of a massive performance hit, but this is actually not always the case (and more often than not, a C++ implementation can be very efficient; see Ken Smith's paper and the example below).

This project has been inspired by the following previous works:

cppreg's Issues

Add grouping registers together, like how CMSIS maps a struct of types to a base address

Issue

As seen in #5, there is a discrepancy between CMSIS and cppreg, which was found to be due to the CMSIS style letting the compiler know about groupings of registers. This allows the compiler, when reading from and writing to memory, to use relative memory operations like this:

       C++(Cppreg)                    CMSIS          vs    CPPREG
int a[0] = SomeField0::read();     LD R0, R10 [#0]       LD R0, R10
int a[1] = SomeField1::read();     LD R1, R10 [#4]       LD R1, R11
int a[2] = SomeField2::read();     LD R2, R10 [#8]       LD R2, R12

Since cppreg doesn't have that information explicitly stated, the compiler is stuck assuming that the registers are totally unrelated and is unable to get the address via offsets from a base address. I guess the optimization process is unable to do that kind of extensive inspection for memory addresses (though it seems it can do so for immediates, as shown in the sidenote of #6).

Solution

Add a way to group registers together. I do not have a syntax example off the top of my head, but as @sendyne-nclauvelin mentioned in #5 this would require a non-trivial amount of API rework, so before implementing, let's see what other potential API pain points there are.

Finalize RegisterPack implementation

(This issue is a continuation of the discussions and suggestions in #7)

Before merging the register pack implementation the following items need to be decided:

Implement an enumeration type for register width (@hak8or) to limit the possibility of typo/error and enforce supported widths.
RegisterPack takes as template argument the size of the pack in bytes; we keep it this way or we decide that the size should be in bits.
PackedRegister takes as template argument the offset with respect to the pack base in bits; we keep it this way or we decide that the offset should be in bytes.
Naming: should we change RegisterPack and PackedRegister to something else (Peripheral and PeripheralRegister, Device and DeviceRegister) ...
Alignment checking is currently done in a "naive" way; this should be revised using a safer implementation and the limitations should be documented (in a pack, a register of size N bits can only be defined if the pack base is aligned on an N-bit boundary and if the register address is at an offset that is a multiple of N),
Update documentation accordingly.
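
To make the first and fifth items more concrete, here is a small sketch of a width enumeration combined with a compile-time alignment check. The enumerator names and the is_aligned helper are assumptions for illustration, not the final cppreg API; the offset is taken in bits, matching the current PackedRegister convention described above.

```cpp
#include <cstdint>

// Hypothetical sketch; names and signatures are assumptions, not the cppreg API.
// Restricting widths to an enumeration rules out typos such as a 12-bit register.
enum class RegBitSize : std::uint8_t { b8 = 8, b16 = 16, b32 = 32, b64 = 64 };

// A register of N bits is accepted only if the pack base is aligned on an
// N-bit boundary and its offset from the base (in bits) is a multiple of N.
constexpr bool is_aligned(std::uintptr_t pack_base,
                          std::uint32_t bit_offset,
                          RegBitSize size) {
    return (pack_base % (static_cast<std::uint32_t>(size) / 8u) == 0u) &&
           (bit_offset % static_cast<std::uint32_t>(size) == 0u);
}

// The base address below is only an example value.
static_assert(is_aligned(0x40006400u, 32u * 4u, RegBitSize::b32),
              "fifth 32-bit register of the pack is aligned");
static_assert(!is_aligned(0x40006400u, 8u, RegBitSize::b32),
              "an 8-bit offset is not a multiple of 32 bits");
static_assert(!is_aligned(0x40006402u, 0u, RegBitSize::b32),
              "the base is not aligned on 32 bits");
```

A check of this shape could sit in a static_assert inside the packed register template, so that a misaligned definition is rejected when the register type is instantiated.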

Documentation should explain how to access whole register memory

This issue is created following the suggestion in #14.

The current documentation does not mention the ro_mem_device() and rw_mem_device() functions which can be used to access the memory of a register type. These functions can be useful in some cases (for example, see #14 about write to toggle fields update).

Update documentation to reflect changes from API revision

Due to #9 documentation needs to be updated (in particular for merge write operations).

Fails to compile for PowerPC GCC 4.8.5

As of 5cf8a85 it seems we fail to compile on PowerPC GCC 4.8.5 according to GodBolt as shown here.

with the following compiler error:

<source>:138:40: error: redeclaration 'cppreg::Shadow<Register, true>::use_shadow' differs in 'constexpr'
     const bool Shadow<Register, true>::use_shadow;
                                        ^
<source>:133:37: error: from previous declaration 'cppreg::Shadow<Register, true>::use_shadow'
         constexpr static const bool use_shadow = true;
                                     ^
<source>:138:40: error: declaration of 'constexpr const bool cppreg::Shadow<Register, true>::use_shadow' outside of class is not definition [-fpermissive]
     const bool Shadow<Register, true>::use_shadow;
                                        ^
Compiler returned: 1
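
One possible way to satisfy this compiler, assuming the error comes only from the mismatch in constexpr between the in-class declaration and the out-of-class definition, is to repeat constexpr on the definition. The sketch below is a simplified stand-in reconstructed from the error message, not the actual cppreg source.

```cpp
// Simplified stand-in for the Shadow trait named in the error message.
template <typename Register, bool UseShadow>
struct Shadow {
    constexpr static const bool use_shadow = false;
};

template <typename Register>
struct Shadow<Register, true> {
    constexpr static const bool use_shadow = true;
};

// Out-of-class definitions (needed in C++11 when the member is odr-used).
// Repeating `constexpr` keeps them consistent with the in-class declarations;
// the initializer stays inside the class.
template <typename Register, bool UseShadow>
constexpr const bool Shadow<Register, UseShadow>::use_shadow;

template <typename Register>
constexpr const bool Shadow<Register, true>::use_shadow;
```

From C++17 onward these out-of-class definitions are redundant (constexpr static members are implicitly inline) but still accepted.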

Add a list of requirements/limitations to the documentation

This is currently missing from the documentation and should include:

supported compilers: GCC/Clang ... should work on other C++11 compilers but no guarantee,
supported hardware: requires that the minimal register size is 8 bits; designed for Cortex-M{0+,3,4,7} originally, but nothing is specific to them, so this should be generic enough to run on virtually any MCU.

Failed to access to single-bit field

Hello!

First off -- great lib! Pure enjoyment using this approach -- code becomes small and easy to read and juggle with.

Recently I encountered a very strange problem -- I couldn't work with a single-bit field when programming an STM32. Details below.

I'm working with an STM32F042F6. The problem arises when I try to activate the filter banks -- to do that I have to clear a single bit, FINIT. Here is the definition of this particular part:

#pragma once

#include <cppreg.h>

using namespace cppreg;

struct Peripheral {
  struct bxCAN : RegisterPack<0x4000'6400, 1024> {
    // Filter initialization mode
    using FINIT =
        Field<PackedRegister<bxCAN, RegBitSize::b32, 0x200>, 1, 0, read_write>;
  };
};

Here is a screenshot of what that register contains:

Everywhere else the lib works flawlessly! Only here did I have to hand-write the access to this particular register.

void can_t::init_filters_() {
  // Strangest bug! Not working FINIT::set/clear();
  // Well, let's get dirty!
  uint32_t &finit = *((uint32_t *)(0x4000'6600));
  finit |= 0b1;
  
  // ... magic ...

  // Turn all filters initialization -> active
  finit &= ~0b1;
  while (finit & 0b1) __nop();
}

I found it at the last while() loop, where I check for successful activation of the filters. If I use cppreg's access to the FINIT bit field, it does nothing and the bit never changes. It always reads as '1' (the bit is always raised and cannot be cleared).

No one got hurt, except my pride. And a bit of my personal time.

Did I miss some caveat?
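
One thing worth checking, assuming PackedRegister interprets its offset in bits (as stated in the RegisterPack issue above), is whether 0x200 was meant as a byte offset. The helper below is hypothetical and only reproduces the address arithmetic; it is not part of cppreg.

```cpp
#include <cstdint>

// Hypothetical helper, not part of cppreg: address of a register inside a
// pack when the offset is expressed in BITS relative to the pack base.
constexpr std::uintptr_t packed_address(std::uintptr_t pack_base,
                                        std::uint32_t bit_offset) {
    return pack_base + bit_offset / 8u;
}

// Pack base taken from the RegisterPack definition in the report.
constexpr std::uintptr_t bxcan_base = 0x40006400u;

// An offset of 0x200 read as bits lands only 0x40 bytes past the base...
static_assert(packed_address(bxcan_base, 0x200u) == 0x40006440u,
              "0x200 bits = 0x40 bytes");

// ...while the hand-written access targets base + 0x200 bytes,
// which corresponds to a bit offset of 0x200 * 8.
static_assert(packed_address(bxcan_base, 0x200u * 8u) == 0x40006600u,
              "0x200 bytes = 0x1000 bits");
```

Under that assumption, the Field defined in the report would address 0x40006440 rather than 0x40006600, which would explain why the hand-written access behaves differently.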

Toggle-only access policy support?

I'm working on refactoring a small USB stack for STM32 to avoid its dependencies (CMSIS, HAL etc) and have my own svd->cppreg header generation flow which is working great.
However, the STM32 USB registers, specifically the endpoint registers, have a number of fields which are toggle-only (i.e., read is permitted, write 1 to flip) mixed in with normal read-write fields.

Would it be feasible to devise an access policy which changes the way cppreg attempts to maintain the value of a field?

Writing the existing value back, as I presume the existing implementation would do (to try to preserve the value of the toggle fields) when I'm trying to write to a normal field in the same register, will cause all the toggles to flip if they are currently set.

Happy to entertain other suggestions too, of course. CMSIS handles this just fine by forcing all toggle field values to 0 during read-modify-write, but to do the same I'd have to be able to read the entire packed register at once, then do a merge_write to all fields, and packed registers don't support read().
Looking over the documentation it seems that perhaps a modification of the shadow register functionality could achieve this, too?
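
The read-modify-write described above (force every toggle bit to 0 so that it is left untouched) can be sketched as a pure function. The helper name and the masks used in the checks are made up for illustration; they are not cppreg API and not actual STM32 bit positions.

```cpp
#include <cstdint>

// Hypothetical helper, not cppreg API. Computes the word to store when
// updating one normal field of a register that also contains
// write-1-to-toggle bits: toggle bits are written as 0 (no flip), the target
// field takes the new value, and all other bits keep their current value.
constexpr std::uint32_t rmw_keep_toggles(std::uint32_t current,
                                         std::uint32_t toggle_mask,
                                         std::uint32_t field_mask,
                                         unsigned field_offset,
                                         std::uint32_t value) {
    return ((current & ~toggle_mask) & ~field_mask) |
           ((value << field_offset) & field_mask);
}

// Made-up layout: toggle bits at 0x3030, a normal 4-bit field at bits [3:0].
// The toggle bits are currently set (0x3030) and must be written back as 0.
static_assert(rmw_keep_toggles(0x3035u, 0x3030u, 0x000Fu, 0u, 0xAu) == 0x000Au,
              "toggle bits are cleared in the written word");
```

A toggle-aware access policy could apply the same masking internally, provided it knows the combined toggle mask of the register.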

Very minor Performance issue in comparison to CMSIS

When working on #1 using the following code, I spotted the following:

Differences

First up is the read-modify-write for the MODER register, which takes 2 more instructions in cppreg than in CMSIS on Cortex-M0+. This seems to be because in cppreg the address of the register is built via multiple immediates, while CMSIS re-uses the MODER address when accessing BSRR.
Secondly, when writing a value to BSRR, cppreg does a simpler str r1, [r3] instead of the potentially more expensive str r2, [r3, #24] instruction under CMSIS. The Cortex-M0+ Technical Reference Manual says there is no difference, but this may be different for M7 and more complicated architectures.

Potential Cause

I feel the first and second differences are somewhat related because in CMSIS the compiler is informed that MODER and BSRR are related via offsets from a base pointer, while in cppreg they look totally unrelated. This results in the compiler having to "rebuild" the address twice for cppreg, once from immediates for MODER and once from a hardcoded address stored in the .text section. Furthermore, it seems the two-instruction difference is also due to the masking and applying of the value.

Solution

I view this as being caused by the architecture limitations of Cortex M and the design of cppreg.

Regarding the architecture, if you compile this for x86-64 or non-Thumb ARM then you see the issue goes away (the assembly is identical). I think this is because immediates in those ISAs can be large, since the instructions themselves are allowed to be large too. This results in all stores/loads being done via full immediates instead of an immediate and offset or shifting an immediate to build the address.
Regarding cppreg, you cannot specify that multiple registers are just an offset from each other instead of totally unrelated areas (CMSIS does this by placing a huge struct on the address). I do not see an easy way to give the compiler that sort of information either.
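
For reference, the CMSIS-style struct overlay mentioned above looks roughly like the sketch below. The member list assumes an STM32-like GPIO layout and is illustrative only; the point is that the 24-byte offset seen in str r2, [r3, #24] is visible to the compiler through the struct.

```cpp
#include <cstddef>
#include <cstdint>

// CMSIS-style overlay, assuming an STM32-like GPIO layout (illustrative).
struct GPIO_TypeDef {
    volatile std::uint32_t MODER;    // offset 0x00
    volatile std::uint32_t OTYPER;   // offset 0x04
    volatile std::uint32_t OSPEEDR;  // offset 0x08
    volatile std::uint32_t PUPDR;    // offset 0x0C
    volatile std::uint32_t IDR;      // offset 0x10
    volatile std::uint32_t ODR;      // offset 0x14
    volatile std::uint32_t BSRR;     // offset 0x18
};

static_assert(offsetof(GPIO_TypeDef, BSRR) == 24,
              "BSRR sits 24 bytes after MODER");

// Both accesses go through the same base pointer, so the compiler may keep
// the base in one register and address BSRR as [base, #24].
inline void set_output_and_raise(GPIO_TypeDef* gpio) {
    gpio->MODER = (gpio->MODER & ~0x3u) | 0x1u;  // pin 0 as output
    gpio->BSRR = 0x1u;                           // set pin 0
}
```

On target the pointer would be a fixed peripheral address cast to GPIO_TypeDef*; taking it as a parameter here keeps the sketch runnable on a host.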

Real World Implications

To be frank, this difference in assembly is small from a performance standpoint, very small. Ideally the reason for the discrepancy can be verified/found, with cppreg adopting the smaller of the two. But writing to a register is very rarely a bottleneck unless you are bit banging, in which case you should probably start to seriously consider assembly instead.

Document cppreg (zero) impact on performance and code size.

The main goal is to provide a document (e.g., Performance.md) where we show assembly outputs for various levels of optimization (and possibly compilers) to illustrate that cppreg does not affect runtime performance or code size.

This requires:

design a code example with a cppreg implementation and a traditional implementation (à la CMSIS)
produce compiler outputs (the target is GCC ARM but others could be added) for Og, O1, O2, O3, Os (although O[1-3] will most likely be identical)
write Performance.md

We could put the various materials in a benchmark directory to not pollute the main one.

Revise access policies implementation for better performance.

Access policies should be revised to leverage the fact that most data are actually known at compile time (and when using the template form of write all data are actually known at compile time).

This issue was created following the discussion in #5.

Include link to github page in description

Almost all other projects include a link in their project description; I suggest we follow that route too, since there is no other way to find the GitHub page.

Writes to read_write fields which fill the size of the register still create a read

Scenario

I was working on an example with a UART, which involves writing to a (usually) 8 bit register that has one field that is also 8 bits large (fills the size of the register). This field is both read and write (same register to send data to the UART TX FIFO and to get data out of a UART RX FIFO). The code is on godbolt here

// CppReg example
struct UART {
    static constexpr uintptr_t UART0_BASE = 0x48000000;

    struct FIFO : cppreg::Register<UART0_BASE + 0, 8u> {
        using Data = cppreg::Field<FIFO, 8u, 0u, cppreg::read_write>;
    };
};

void UART_Test_CPPREG(void){
    UART::FIFO::Data::write<0x21>();
    UART::FIFO::Data::write<0x11>();
    UART::FIFO::Data::write<0x90>();
    UART::FIFO::Data::write<0xFF>();
    UART::FIFO::Data::write<0x01>();
    UART::FIFO::Data::write<0x02>();
    UART::FIFO::Data::write<0x03>();
}

// CMSIS example
#define __IO volatile
typedef struct {
    __IO uint8_t FIFO;
} UART_TypeDef;

#define UART ((UART_TypeDef *) 0x48000000)

void GPIO_Test_CMSIS(void){
    UART->FIFO = 0x21;
    UART->FIFO = 0x11;
    UART->FIFO = 0x90;
    UART->FIFO = 0xFF;
    UART->FIFO = 0x01;
    UART->FIFO = 0x02;
    UART->FIFO = 0x03;
}

Expected behavior.

The CMSIS example behaves as expected: we get the immediate into a register and write it to memory. Interestingly, the immediates are obtained by adding to or subtracting from the previous immediate instead of just using a mov register, #number instruction. I would have thought an ADD register, #number would have taken longer, but for Cortex-M0/M0+ (ARMv6-M) they are the same (single cycle). Surprisingly enough, ARM doesn't say what the cycle count is for ARMv7-M, so maybe it's different there, hence the discrepancy.

Anyway, I am expecting that the cppreg version will look the same or very similar, but instead I am seeing a load inserted before each store. I consider this to be critical because for some peripherals, a read may cause the peripheral to change its flow of execution and do something else, effectively putting the peripheral in an unknown state.

GPIO_Test_CMSIS():     UART_Test_CPPREG():
  mov r3, #1207959552    mov r3, #1207959552
  movs r2, #33           ldrb r2, [r3] @ zero_extendqisi2
  strb r2, [r3]          movs r2, #33
  movs r2, #17           strb r2, [r3]
  strb r2, [r3]          ldrb r2, [r3] @ zero_extendqisi2
  movs r2, #144          movs r2, #17
  strb r2, [r3]          strb r2, [r3]
  movs r2, #255          ldrb r2, [r3] @ zero_extendqisi2
  strb r2, [r3]          movs r2, #144
  movs r2, #1            strb r2, [r3]
  strb r2, [r3]          ldrb r2, [r3] @ zero_extendqisi2
  movs r2, #2            movs r2, #255
  strb r2, [r3]          strb r2, [r3]
  movs r2, #3            ldrb r2, [r3] @ zero_extendqisi2
  strb r2, [r3]          movs r2, #1
  bx lr                  strb r2, [r3]
                         ldrb r2, [r3] @ zero_extendqisi2
                         movs r2, #2
                         strb r2, [r3]
                         ldrb r2, [r3] @ zero_extendqisi2
                         movs r2, #3
                         strb r2, [r3]
                         bx lr

Solution

This brings two potential ways to fix this. First I was thinking to just put in a check in the ::write() that, if the width of the field is equal to the width of the register then perform the write like in the write_only policy. This should be an easy way to fix the issue.
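
The first idea can be sketched with tag dispatch on a compile-time flag. Everything below (MockReg, Field, write_impl) is hypothetical and not the cppreg implementation; the mock register counts reads so the difference between the two paths is observable on a host.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical mock of an 8-bit register that counts reads (not cppreg API).
struct MockReg {
    static std::uint8_t value;
    static unsigned reads;
    static std::uint8_t load() { ++reads; return value; }
    static void store(std::uint8_t v) { value = v; }
};
std::uint8_t MockReg::value = 0;
unsigned MockReg::reads = 0;

template <typename Reg, unsigned Width, unsigned Offset>
struct Field {
    static_assert(Width > 0 && Width + Offset <= 8, "field must fit in 8 bits");

    static constexpr std::uint8_t mask =
        static_cast<std::uint8_t>(((1u << Width) - 1u) << Offset);

    // True when the field covers the whole register.
    static constexpr bool fills_register = (Width == 8u && Offset == 0u);

    template <std::uint8_t Value>
    static void write() {
        write_impl<Value>(std::integral_constant<bool, fills_register>{});
    }

private:
    // Field fills the register: plain store, no read is generated.
    template <std::uint8_t Value>
    static void write_impl(std::true_type) { Reg::store(Value); }

    // Partial field: read-modify-write to preserve the other bits.
    template <std::uint8_t Value>
    static void write_impl(std::false_type) {
        Reg::store(static_cast<std::uint8_t>(
            (Reg::load() & ~mask) | ((Value << Offset) & mask)));
    }
};
```

With Field<MockReg, 8, 0>, a write leaves the read counter unchanged, while Field<MockReg, 4, 0> performs exactly one read per write.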

But then I noticed that there are some situations in which there are writes to a register field during which the state of the other fields does not need to be kept. This is probably going into fairly fancy enhancement territory, which wouldn't be trivial to implement, but ~~possibly adding an optional argument to the creation of a field that lets you specify if you want the contents to be maintained in between writes to other fields~~. Just noticed this means you cannot change which field to ignore during writes. Instead, maybe for chained writes it would be possible to have a chained write accept either an integral or a null value. If the null value is present, then it does not attempt to retain that field's contents during the write chain.

For example.

// CppReg example
struct UART {
    static constexpr uintptr_t UART0_BASE = 0x48000000;

    struct FIFO : cppreg::Register<UART0_BASE + 0, 8u> {
        using Data = cppreg::Field<FIFO, 7u, 0u, cppreg::read_write>;

        // Set to have UART start dumping FIFO contents. Remains set until done.
        // If a write to this register occurs with this bit set while the bit is already set
        // then undefined behavior can occur (stops current byte transmission and skips to next one).
        using Start = cppreg::Field<FIFO, 1u, 7u, cppreg::read_write>;
    };
};

UART::FIFO::Data::write<0x05>();
UART::FIFO::Data::write<0x15>();
UART::FIFO::merge_write<Start>(1)::with<Data>(std::ignore);
// or ideally
UART::FIFO::merge_write<Start>(1)::with<Data, std::ignore>();

Copy/paste error in register pack example

Channel0 in Field<Channel0, ...> should be Channel0/1/2/3
https://github.com/sendyne/cppreg/blob/42bd134067b361f45b75ceb83e857c64923fd3fe/API.md?plain=1#L154-L165
