Rust and Vendor SDKs (II)

As in the previous installment of this article, we’ll take another look at vendor SDKs and their integration with Rust. Today we’ll look at ST’s HAL & Cube ecosystem. Rust has had quite robust support for ST’s chips (in fact, the STM32 HAL implementation is probably the most mature HAL available for embedded Rust), but there’s still a lot to be said for ST’s tooling. For example, I absolutely adore the way we get to visually configure clocks in Cube:

IMHO you can’t do much better than this. Compare that to what you’ll have to do to configure clocks using the STM32 PAC:

#![no_std]
#![no_main]

// Imports
use cortex_m_rt::entry;
use panic_halt as _;
use stm32f401_pac as pac;

#[entry]
fn main() -> ! {
    // Setup handler for device peripherals
    let dp = pac::Peripherals::take().unwrap();

    // Enable HSE Clock
    dp.RCC.cr.write(|w| w.hseon().set_bit());

    // Wait for HSE clock to become ready
    while dp.RCC.cr.read().hserdy().bit_is_clear() {}

    // Configure PCLK1 Prescaler
    dp.RCC.cfgr.write(|w| unsafe { w.ppre1().bits(0b100) });

    // Configure PLL M
    dp.RCC.pllcfgr.write(|w| {
        w.pllm5()
            .bit(false)
            .pllm4()
            .bit(false)
            .pllm3()
            .bit(true)
            .pllm2()
            .bit(false)
            .pllm1()
            .bit(false)
            .pllm0()
            .bit(false)
    });

    // Configure PLL N (modify, so the PLL M bits set above are kept)
    dp.RCC.pllcfgr.modify(|_, w| {
        w.plln8()
            .set_bit()
            .plln7()
            .clear_bit()
            .plln6()
            .set_bit()
            .plln5()
            .clear_bit()
            .plln4()
            .set_bit()
            .plln3()
            .clear_bit()
            .plln2()
            .clear_bit()
            .plln1()
            .clear_bit()
            .plln0()
            .clear_bit()
    });

    // Configure PLL P (modify, so the PLL M and N bits are kept)
    dp.RCC
        .pllcfgr
        .modify(|_, w| w.pllp0().bit(true).pllp1().bit(false));

    // Enable PLL (modify, so HSEON stays set)
    dp.RCC.cr.modify(|_, w| w.pllon().set_bit());

    // Wait for PLL to become ready
    while dp.RCC.cr.read().pllrdy().bit_is_clear() {}

    // Select PLL as System Clock Source (modify, so the prescaler is kept)
    dp.RCC.cfgr.modify(|_, w| w.sw1().set_bit().sw0().clear_bit());

    // Wait for PLL to be selected as System Clock Source
    while !(dp.RCC.cfgr.read().sws1().bit_is_set() && dp.RCC.cfgr.read().sws0().bit_is_clear()) {}

    //Enable Clock to GPIOA
    dp.RCC.ahb1enr.write(|w| w.gpioaen().set_bit());

    //Configure PA5 as Output
    dp.GPIOA.moder.write(|w| unsafe { w.moder5().bits(0b01) });

    // Set PA5 Output to High signalling end of configuration
    dp.GPIOA.odr.write(|w| w.odr5().set_bit());

    loop {}
}

(Source: https://dev.to/apollolabsbin/stm32f4-embedded-rust-at-the-pac-system-clock-configuration-39j1).

The above is probably a somewhat extreme example; then again, it wouldn’t look all that different when written in C. That being said, I don’t like the Rust version at all, for several reasons:

  • It is very convoluted and hard to understand.
  • The use of lambdas to write to a register makes this even harder to understand.
  • The code does too little to deal with the actual domain (clocks & frequencies) and talks too much about bits and registers.
  • Looking at the code, I will need the datasheet to figure out which frequencies end up being generated for which part of the clock tree.
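To make the last point concrete, here is a minimal sketch of the arithmetic hidden in those register writes. The divider encodings are the ones used by the STM32F4 RCC; the function names are made up for illustration, and the 8 MHz PLL input is an assumption, since the listing never states which frequency feeds the PLL.

```rust
/// Decode the 2-bit PLLP field: 0b00 -> /2, 0b01 -> /4, 0b10 -> /6, 0b11 -> /8.
fn pllp_divider(pllp_bits: u8) -> u32 {
    2 * (u32::from(pllp_bits & 0b11) + 1)
}

/// Decode a 3-bit APB prescaler field (PPRE1/PPRE2):
/// 0xx -> /1, 100 -> /2, 101 -> /4, 110 -> /8, 111 -> /16.
fn apb_divider(ppre_bits: u8) -> u32 {
    if (ppre_bits & 0b100) == 0 {
        1
    } else {
        2 << (ppre_bits & 0b011)
    }
}

/// Main PLL output used as system clock: (input / M) * N / P.
/// `input_hz` is whatever clock feeds the PLL (assumed here, not read back).
fn pll_sysclk_hz(input_hz: u32, pllm: u32, plln: u32, pllp_bits: u8) -> u32 {
    let vco_in = input_hz / pllm; // VCO input after the M divider
    let vco_out = vco_in * plln; // VCO output after the N multiplier
    vco_out / pllp_divider(pllp_bits)
}
```

With the bit patterns from the listing (PLLM = 0b001000 = 8, PLLN = 0b101010000 = 336, PLLP = 0b01, PPRE1 = 0b100) and the assumed 8 MHz input, `pll_sysclk_hz(8_000_000, 8, 336, 0b01)` works out to 84 MHz, and dividing that by `apb_divider(0b100)` gives 42 MHz for PCLK1. None of this is visible in the register-level code itself.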

Yuck. Okay, but that’s the very low-level part, the stuff we get if no HAL implementation is available, and we wanted to look at HALs. So, another example, this time using the actual Rust HAL for the STM32F4 (stm32f4xx-hal):

let dp = pac::Peripherals::take().unwrap();
let rcc = dp.RCC.constrain();
let clocks = rcc
    .cfgr
    .use_hse(8.MHz())
    .sysclk(168.MHz())
    .pclk1(24.MHz())
    .i2s_clk(86.MHz())
    .require_pll48clk()
    .freeze();
// Test that the I2S clock is suitable for 48000kHz audio.
assert!(clocks.i2s_clk().unwrap() == 48.MHz().into());

(taken from https://docs.rs/stm32f4xx-hal/latest/stm32f4xx_hal/rcc/index.html)

This is arguably a lot better in that it captures the language of the domain and is very compact. I also like how the designers used a fluent builder interface here. However – at least for me – this is still somewhat confusing:

  • The call to i2s_clk gets told to use 86 MHz.
  • But apparently we should end up with the i2s_clk being 48 MHz in the assert. There’s obviously a prescaler here somewhere, but that is sadly not visible.
  • Lastly, and this is a nitpick, I know: naming the method that actually configures the clock with the settings made before “freeze” is IMHO a bit dubious and not self-explanatory (the documentation helps).
  • The same goes for the “constrain” method, although here the documentation is not very helpful, as it just states “Constrains the RCC peripheral so it plays nicely with the other abstractions”. This is about as helpful as “the ‘doStuff’ function does stuff”.
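One way to read “constrain” and “freeze” is as the two ends of a consuming builder: the first takes the raw peripheral away so nothing else can touch it, the second consumes the configuration and hands back a read-only result. The following is a hypothetical sketch of that pattern only. It is not the stm32f4xx-hal implementation; every type, field, and default value in it is invented for illustration.

```rust
/// Hypothetical stand-in for a raw clock-control peripheral.
struct RawRcc;

/// Configuration that is still open for changes.
struct ClockConfig {
    hse_hz: Option<u32>,
    sysclk_hz: Option<u32>,
}

/// Read-only result; it can be copied around but no longer changed.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Clocks {
    sysclk_hz: u32,
}

impl RawRcc {
    /// Consumes the raw peripheral, so only the builder can configure it.
    fn constrain(self) -> ClockConfig {
        ClockConfig { hse_hz: None, sysclk_hz: None }
    }
}

impl ClockConfig {
    fn use_hse(mut self, hz: u32) -> Self {
        self.hse_hz = Some(hz);
        self
    }

    fn sysclk(mut self, hz: u32) -> Self {
        self.sysclk_hz = Some(hz);
        self
    }

    /// Consumes the configuration. A real HAL would program the hardware
    /// here; this sketch only records the requested frequency.
    fn freeze(self) -> Clocks {
        // Made-up fallback: run from the oscillator if no sysclk was requested.
        let fallback = self.hse_hz.unwrap_or(16_000_000);
        Clocks { sysclk_hz: self.sysclk_hz.unwrap_or(fallback) }
    }
}
```

In this sketch, `RawRcc.constrain().use_hse(8_000_000).sysclk(84_000_000).freeze()` yields a `Clocks` value, and because each step takes `self` by value, the compiler rejects any attempt to reconfigure after `freeze`. That property is arguably what the two names are hinting at, even if they do not say so.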

Again, this is nitpicking: something I would probably not let pass in a code review, but not a serious problem. Still, I like the obviousness of Cube better. If we were to look at how peripherals are configured in Rust, the story would be similar, i.e. you can use a pure Rust approach. However, there is a lot to be said for ST’s existing C HAL (among other things, that it is pretty mature), so how do we get it to play nicely with a Rust application?

The story is – as expected – pretty much the same as it is with TI’s MSPM0, i.e. we’ll generate a bunch of bindings using the build script. There’s not all that much to it. If you’ve followed the MSP example, you’ll know exactly what to do.

The Build

We’ll start in CubeMX and generate a makefile project. I used a random Nucleo board I had lying around as a base and added a configuration for the onboard “LPUART” peripheral. I won’t bother you with the details, as that is STM32 101.

Generating the code will leave us with the following directory structure:

We’ll use the – by now – well-known standard approach and build the Rust part of our application as a static library, so our Cargo.toml file looks as follows:

[package]
name = "rustm32"
version = "0.1.0"
edition = "2021"

[lib]
crate-type=["staticlib"]

[dependencies]

To start executing Rust code as early as possible, we’ll add a simple entry point in our lib.rs file:

#![no_std]

// Unmangled name plus C ABI, so the C side can declare and call it.
#[no_mangle]
pub extern "C" fn rs_main() -> ! {
    loop {}
}

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

Compiling this for the thumbv7em-none-eabihf target will produce a .a file in the target directory:

To complete the first step, all that remains is setting up the code generated by CubeMX to call into Rust, and to tell the makefile to link our library.

First, we edit main.c as follows:

/* Private user code ---------------------------------------------------------*/
/* USER CODE BEGIN 0 */
extern void rs_main(void);
/* USER CODE END 0 */

/**
  * @brief  The application entry point.
  * @retval int
  */
int main(void)
{
  /* USER CODE BEGIN 1 */
  rs_main();
  /* USER CODE END 1 */

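Since we annotated rs_main in lib.rs with the “no_mangle” attribute, it will be available as a plain C symbol and can be declared as an extern function and called from C as shown above. Note that USER CODE section 1 runs before HAL_Init() and the generated peripheral init functions; as soon as rs_main uses a peripheral, the call has to move behind those, e.g. into USER CODE section 2.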
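The same mechanism works for functions that take arguments and return values. As a minimal sketch, here is a hypothetical helper (rs_checksum is not part of the project above) that C code could call through a matching prototype:

```rust
/// Hypothetical helper exported to C. A matching C prototype would be:
///     uint8_t rs_checksum(const uint8_t *data, size_t len);
/// Written for edition 2021, as in the Cargo.toml above; the 2024 edition
/// spells the attribute #[unsafe(no_mangle)].
#[no_mangle]
pub extern "C" fn rs_checksum(data: *const u8, len: usize) -> u8 {
    // C callers may pass NULL; treat that as an empty buffer.
    if data.is_null() {
        return 0;
    }
    // SAFETY: the caller promises that `data` points to `len` readable bytes.
    let bytes = unsafe { core::slice::from_raw_parts(data, len) };
    // Simple wrapping sum over all bytes.
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}
```

The `extern "C"` part fixes the calling convention, while `no_mangle` fixes the symbol name; the C side needs both to hold in order to link against and call the function reliably.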

Finally, we add the library to our makefile:

# libraries
LIBS = -lc -lm -lnosys -Wl,-Bstatic E:/code/ruststm32nohal/rustm32/target/thumbv7em-none-eabihf/debug/librustm32.a
LIBDIR = 

The above is obviously super dirty and not advised for production use. Ideally we’d use a CMake-based build, but we want to talk about the ST HAL in this post…

Building everything we have right now should yield an executable ELF binary. The final missing bit is to generate the actual bindings to the ST HAL. We’ll use a decidedly simple build script for that (it needs bindgen listed under [build-dependencies] in Cargo.toml):

use std::path::PathBuf;

fn main() {
    // Generate Rust bindings for everything reachable from wrapper.h.
    let bindings = bindgen::Builder::default()
        // Emit core:: paths instead of std::, since the library is no_std.
        .use_core()
        .detect_include_paths(true)
        // Include paths and defines for the generated STM32WB project.
        .clang_arg("-I./Core/Inc")
        .clang_arg("-I./Drivers/STM32WBxx_HAL_Driver/Inc")
        .clang_arg("-I./Drivers/CMSIS/Device/ST/STM32WBxx/Include")
        .clang_arg("-I./Drivers/CMSIS/Include")
        .clang_arg("-DUSE_HAL_DRIVER")
        .clang_arg("-DSTM32WB55xx")
        .header("./Core/Inc/wrapper.h")
        .generate()
        .expect("Unable to generate bindings");

    // Write the bindings into the source tree so lib.rs can include them.
    bindings
        .write_to_file(PathBuf::from("./src/bindings.rs"))
        .expect("Couldn't write bindings!");
}

I created “wrapper.h” for this purpose; it only includes “main.h”. The reason for this is to separate binding generation from the regular code. Rebuilding the Rust part of the project will now yield “bindings.rs”, which is just a regular Rust module. We can now call into the HAL easily:

#![no_std]

#[allow(clippy::all)]
mod bindings {
    include!("bindings.rs");
}
pub use bindings::*;

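extern "C" {
    // Defined by the CubeMX-generated main.c.
    pub static mut hlpuart1: bindings::UART_HandleTypeDef;
}

// Unmangled name plus C ABI, so main.c can call it.
#[no_mangle]
pub extern "C" fn rs_main() -> ! {
    // The length is taken from the slice, so it cannot drift from the text.
    const MSG: &[u8] = b"Hello World!\r\n";

    unsafe {
        // addr_of_mut! yields a raw pointer without creating a reference
        // to the mutable static.
        bindings::HAL_UART_Transmit(
            core::ptr::addr_of_mut!(hlpuart1),
            MSG.as_ptr(),
            MSG.len() as u16,
            1000,
        );
    }
    loop {}
}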

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

There’s a couple of things to note:

  • We added the generated bindings using an “include!” macro. The whole construct should keep clippy somewhat quiet. Otherwise we will get a load of warnings, which is obviously annoying.
  • We made the handle to “hlpuart1” available via an extern “C” block. This works pretty much the same as the “extern” keyword in C does, i.e. it will create a link-time dependency on a symbol named “hlpuart1”. That symbol was generated by CubeMX for us in main.c.
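Calling the raw binding directly means repeating the unsafe block and the length bookkeeping at every call site. A thin safe wrapper can hide both. The sketch below is an assumption-laden illustration: the mock_hal module is a stand-in for the bindgen output so the code also runs on a host, the status codes are modelled as plain u32 constants with the numeric values of the ST HAL (HAL_OK = 0, HAL_BUSY = 2, HAL_TIMEOUT = 3), and uart_write and UartError are invented names. The generated bindings may name and type these items differently.

```rust
// Stand-ins for what bindgen generates from the ST HAL headers, so the sketch
// runs on a host. On target, drop this module and use `bindings` instead.
#[allow(non_snake_case, non_camel_case_types, dead_code)]
mod mock_hal {
    pub const HAL_OK: u32 = 0;
    pub const HAL_BUSY: u32 = 2;
    pub const HAL_TIMEOUT: u32 = 3;

    /// Records transmitted bytes instead of driving a peripheral.
    pub struct UART_HandleTypeDef {
        pub sent: [u8; 64],
        pub len: usize,
    }

    pub unsafe extern "C" fn HAL_UART_Transmit(
        huart: *mut UART_HandleTypeDef,
        data: *const u8,
        size: u16,
        _timeout: u32,
    ) -> u32 {
        // SAFETY: the caller passes a valid handle and `size` readable bytes.
        let (h, src) =
            unsafe { (&mut *huart, core::slice::from_raw_parts(data, size as usize)) };
        if h.len + src.len() > h.sent.len() {
            return HAL_BUSY; // mock buffer full
        }
        h.sent[h.len..h.len + src.len()].copy_from_slice(src);
        h.len += src.len();
        HAL_OK
    }
}

use mock_hal::{HAL_UART_Transmit, UART_HandleTypeDef, HAL_BUSY, HAL_OK, HAL_TIMEOUT};

/// Error type for the wrapper (hypothetical, not part of the ST HAL).
#[derive(Debug, PartialEq)]
pub enum UartError {
    Busy,
    Timeout,
    Other(u32),
}

/// Sends `data` in chunks that fit the HAL's 16-bit size parameter and maps
/// the returned status code to a Result.
pub fn uart_write(
    handle: &mut UART_HandleTypeDef,
    data: &[u8],
    timeout_ms: u32,
) -> Result<(), UartError> {
    let raw: *mut UART_HandleTypeDef = handle;
    for chunk in data.chunks(u16::MAX as usize) {
        // SAFETY: `raw` comes from a live &mut, and the pointer/length pair
        // describes exactly the bytes of `chunk`.
        let status =
            unsafe { HAL_UART_Transmit(raw, chunk.as_ptr(), chunk.len() as u16, timeout_ms) };
        match status {
            HAL_OK => {}
            HAL_BUSY => return Err(UartError::Busy),
            HAL_TIMEOUT => return Err(UartError::Timeout),
            other => return Err(UartError::Other(other)),
        }
    }
    Ok(())
}
```

With a wrapper like this, the only unsafe code left at the call site is obtaining the mutable reference to the handle that CubeMX defines in main.c. Everything after that is ordinary Rust with ordinary error handling.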

Wrap Up, Part 2

As expected, compiling for the STM32 did not prove too challenging. We started off with a C SDK and added a Rust library on top of it. The benefit compared to using the embedded HAL is that we get to use CubeMX. With the above setup we can just load the .ioc file in Cube, make changes to the chip configuration, and have those changes available in the application right away (adding new peripherals will obviously require us to add new “extern” definitions as in the example). The build script we needed to generate the bindings is a lot simpler than what was needed for the MSPM0 (and it was a lot simpler to figure out what was needed). It is also trivially easy to get a useful build by wrapping all this up into a CMake script that controls both the generated makefile and cargo.

Now, should you use this approach for Rust projects that target ST chips, or should you rather use the embedded HAL? It depends. If you are targeting an ST chip that has no HAL implementation available, this approach is probably your best bet, as it is IMHO better than using a PAC and only controlling registers (this is the case for the STM32WB line of MCUs we used in this example). Also, if you need more sophisticated access to the MCU peripherals than the HAL provides (e.g. if you want timeouts when reading from UARTs), the presented approach will suit you better than the embedded HAL. And lastly, if you, like me, actually like the ease with which CubeMX allows us to configure peripherals, this approach is obviously also for you. If you need an easily portable app and don’t want to create a HAL that is closely matched to your needs, the embedded HAL is probably a good choice.

In the end I’ll rate the ST HAL 5/5.
