Wishful Coding

Didn't you ever wish your computer understood you?

SimServer: Tight Integration with Decoupled Simulators

I am working on Mosaic, a modern, open source schematic entry and simulation program for IC design. With a strong focus on simulation, I want to offer deep integration with the simulator, but also be able to run it on a beefy server and shield my program from simulator crashes. To this end, I have developed an RPC abstraction for interfacing with simulators remotely.

Here is a demo of a short Python script that uses Pandas, Matplotlib, and Cap’n Proto to run a CMOS netlist on Ngspice and Xyce and a behavioural Verilog equivalent on CXXRTL, allowing for easy verification of the transistor implementation.

You can see that the behavioural simulation is nearly instantaneously, while the spice results stream in much slower because they are doing a fully analog transistor simulation. You can see there is a bit of overshoot at the edges, and zooming in on that, you can see minor differences between the analog simulators because Xyce is using an adaptive timestep.

close up of overshoot

Now let’s take a step back and take a look at the design and implementation of this system. There are several reasons why I chose for a simulation server.

  • Ease of installation. Xyce is notoriously hard to install and only works on Linux as far as I know. An RPC protocol allows Xyce to run in a Docker container.
  • Performance. My laptop might not be the best place to run the simulation. An RPC protol allows the simulator to run on a beefy server, while running my user interface locally for a snappy experience.
  • Integration. Running a simulator in batch mode provides no indication of progress and requires setting up and parsing output files. An RPC protocol allows for direct, streaming access to simulation results.
  • Stability. It would not be the first time I’ve seen Ngspice segfault, and I’d hate for it to take the user interface along with it. An RPC protocol allows the same tight integration as its C API without linking the simulator into the GUI.

For the RPC library I settled on Cap’n Proto, but the next question is, what does the actual API look like? Ngspice has quite an extensive API, but the same can’t be said for Xyce and CXXRTL. So I could offer the lowest common denominator API of “load files, run, read results”, but one of my main goals was deep integration, so this is unsatisfactory. What I ended up doing is define small interfaces that expose a single functionality, and use multiple inheritance to assemble simulator implementations.

So I currently have 3 implementations of the run interface, and on top of that Ngspice implements the tran, op, and ac interfaces, with more to follow. I hope that in the future JuliaSpice will be a simulator that provides even deeper integration.

Please check out the code, and let me know your thoughts: github.com/NyanCAD/SimServer (How to expose simulator configuration and other functionality? Can we do remote cosimulation? Any other interesting usecases?)

Meanwhile, here is a demo of the example Python client running a transient and AC simulation on my VPS.

# on my VPS
docker pull pepijndevos/ngspicesimserver:latest
sudo docker run -d -p 5923:5923 pepijndevos/ngspicesimserver:latest
# in the examples folder
python ../client.py ngspice myvps:5923 rc.sp tran 1e-6 2e-3 0 ac 10 1 1e5

transient result AC result

Pepijn de Vos

Switching Continuously Variable Transmission

What if you took a boost converter and converted it to the rotational mechanical domain? Switching CVT!

At the University of Twente, they teach Bond Graphs, a modelling system for multi-domain systems that is just perfect for this job. Unlike domaing-specific systems or block diagrams, Bond Graphs model a system as abstract connections of power. Power is what you get when you multiply an effort with a flow. The two examples we’re interested is voltage × current and force × velocity, or to be exact, angular momentum × angular velocity.

Here is a schematic of a boost converter (source). It goes from a high voltage (effort, force) to a low voltage, but from a high current (flow, velocity) to a low current. It works by charging the inductor by shorting it to ground, and then discharging it via the diode into the capacitor.

boost converter

The classic example of model equivalence is that an electrical inductor-capacitor-resistor system behaves equivalent to a mechanical mass-spring-damper system. In the rotational domain, the equivalent of a switch is a clutch, and the equivalent of a diode is a ratchet. So we have all we need to convert the system! Step one is making the bond graph from the electrical system.

boost converter bond graph

Quick Bond Graph primer if you’re too lazy to read the Wikipedia page. Se is a source of effort. R, I, and C are generalized resistance, inertance, and compliance. mR is a modulated resistance I used for the switch/clutch. D is a diode/ratchet that I just made up. 0 junctions are sum of flows, equal effort. 1 junctions are sum of effort, equal flow. An ideal electrical net has equal voltage (effort), and a sum of currents, but a mechanical joint has an equal velocity (flow), but a sum of forces. With that in mind, we can convert the bond graph to the mechanical system.

mechanical boost converter

I’m not sure if those are even remotely sane mechanical symbols, so I added labels just in case. The motor spins up a flywheel, and then when the clutch engages it winds up the spring. Then when the clutch is disengaged, the ratched keeps the spring wound up, driving the output while the motor can once more spin up the flywheel.

It works exactly analogous to the boost converter, and also suffers from the same problems. Most ciritically, switching/clutching losses. I imagine applying PWM to your clutch will at best wear it down quickly, and maybe just make it go up in smoke. Like with a MOSFET, the transition period where there is a nonzero effort and flow on the clutch, there is power loss and heat.

Anyway, I decided to build it in LEGO to see if it’d work. I used a high-speed ungeared motor that can drive absolutely no load at all, and connected it with a 1:1 gear ratio to the wheels with only a flywheel, clutch, ratchet, and spring inbetween. This proves that there is actually power conversion going on!

lego mechanical boost converter

If you get rich making cars with this CVT system, please spare me a few coins. If you burn out your clutch… I told you so ;)

Pepijn de Vos

A Rust HAL for your LiteX FPGA SoC

ULX3S demo

FPGAs are amazing in their versatility, but can be a real chore when you have to map out a giant state machine just to talk to some chip over SPI. For such cases, nothing beats just downloading an Arduino library and quickly hacking some example code. Or would there be a way to combine the versatility of an FPGA with the ease of Arduino libraries? That is the question I want to explore in this post.

Of course you can use an f32c softcore on your FPGA as an Arduino, but that’s a precompiled core, and basically doesn’t give you the ability to use your FPGA powers. Or you can build your own SoC with custom HDL components, but then you’re back to bare-metal programming.

Unless you can tap into an existing library ecosystem by writing a hardware abstraction layer for your SoC. And that is exactly what I’ve done by writing a Rust embedded HAL crate that works for any LiteX SoC!

LiteX allows you to assemble a SoC by connecting various components to a common Wishbone bus. It supports various RISC-V CPU’s (and more), and has a library of useful components such as GPIO and SPI, but also USB and Ethernet. These all get memory-mapped and can be accessed via the Wishbone bus by the CPU and other components.

The amazing thing is that LiteX can generate an SVD file for the SoC, which contains all the registers of the components you added to the SoC. This means that you can use svd2rust to compile this SVD file into a peripheral access crate.

This PAC crate abstracts away memory addresses, and since the peripherals themselves are reusable components, it is possible to build a generic HAL crate on top of it that supports a certain LiteX peripheral in any SoC that uses it. Once the embedded HAL traits are implemented, you can use these LiteX peripherals with every existing Rust crate.

The first step is to install LiteX. Due to a linker bug in Rust 1.45, I used the 1.46 beta. I’m also installing into a virtualenv to keep my system clean. While we’re going to use Rust, gcc is still needed for compiling the LiteX BIOS and for some objcopy action.

#rustup default beta
virtualenv env
source env/bin/activate
wget https://raw.githubusercontent.com/enjoy-digital/litex/master/litex_setup.py
chmod +x litex_setup.py
./litex_setup.py init install
./litex_setup.py gcc
export PATH=$PATH:$(echo $PWD/riscv64-*/bin/)

Now we need to make some decisions about which FPGA board and CPU we’re going to use. I’m going to be using my ULX3S, but LiteX supports many FPGA boards out of the box, and others can of course be added. For the CPU we have to pay careful attention to match it with an architecture that Rust supports. For example Vexrisc supports the im feature set by default, which is not a supported Rust target, but it also supports an i and imac variant, both of which Rust supports. PicoRV32 only supports i or im, so can only be used in combination with the Rust i target.

So let’s go ahead and make one of those. I’m going with the Vexrisc imac variant, but on a small iCE40 you might want to try the PicoRV32 (or even Serv) to save some space. Of course substitute the correct FPGA and SDRAM module on your board.


cd litex-boards/litex_boards/targets
python ulx3s.py --cpu-type vexriscv --cpu-variant imac --csr-data-width 32 --device LFE5U-85F --sdram-module AS4C32M16 --csr-svd ulx3s.svd --build --load
rustup target add riscv32imac-unknown-none-elf


python ulx3s.py --cpu-type picorv32 --cpu-variant minimal --csr-data-width 32 --device LFE5U-85F --sdram-module AS4C32M16 --csr-svd ulx3s.svd --build --load
rustup target add riscv32i-unknown-none-elf

Most parameters should be obvious. The --csr-data-width 32 parameter sets the register width, which I’m told will be the default in the future, and saves a bunch of bit shifting later on. --csr-svd ulx3s.svd tells LiteX to generate an SVD file for your SoC. You can omit --build and --load and manually do these steps by going to the build/ulx3s/gateware/ folder and running build_ulx3s.sh. I also prefer to use the awesome openFPGALoader rather than the funky ujprog with a sweet openFPGALoader --board ulx3s ulx3s.bit.

Now it is time to generate the PAC crate with svd2rust. This crate is completely unique to your SoC, so there is no point in sharing it. As long as the HAL crate can find it you’re good. Follow these instructions to create a Cargo.toml with the right dependencies. In my experience you may want to update the version numbers a bit. I had to use the latest riscv and riscv-rt to make stuff work, but keep the other versions to not break the PAC crate.

cargo new --lib litex-pac
cd litex-pac/src
svd2rust -i ulx3s.svd --target riscv
cd ..
vim Cargo.toml

Now we can use these instructions to create our first Rust app that uses the PAC crate. I pushed my finished example to this repo. First create the app as usual, and add dependencies. You can refer to the PAC crate as follows.

litex-pac = { path = "../litex-pac", features = ["rt"]}

Then you need to create a linker script that tells the Rust compiler where to put stuff. Luckily LiteX generated the important parts for us, and we only have to define the correct REGION_ALIAS expressions. Since we will be using the BIOS, all our code will get loaded in main_ram, so I set all my aliases to that. It is possible to load code in other regions, but my attempts to put the stack in SRAM failed horribly when the stack grew too large, so better start with something safe and then experiment.


Next, you need to actually tell the compiler about your architecture and linker scripts. This is done with the .cargo/config file. This should match the Rust target you installed, so be mindful if you are not using imac. Note the regions.ld file that LiteX generated, we’ll get to that in the next step.

rustflags = [
  "-C", "link-arg=-Tregions.ld",
  "-C", "link-arg=-Tmemory.x",
  "-C", "link-arg=-Tlink.x",

target = "riscv32imac-unknown-none-elf"

The final step before jumping in with the Rust programming is writing a build.rs file that copies the linker scripts to the correct location for the compiler to find them. I mostly used the example provided in the instructions, but added a section to copy the LiteX file. export BUILD_DIR to the location where you generated the LiteX SoC.

    let mut f = File::create(&dest_path.join("regions.ld"))
        .expect("Could not create file");
    f.write_all(include_bytes!(concat!(env!("BUILD_DIR"), "/software/include/generated/regions.ld")))
        .expect("Could not write file");

That’s it. Now the code you compile will actually get linked correctly. I found these iCEBreaker LiteX examples very useful to get started. This code will actually run with minimal adjustment on our SoC, and is a good start to get a feel for how the PAC crate works. Another helpful command is to run cargo doc --open in the PAC crate to see the generated documentation.

To actually upload the code, you have to convert the binary first.

cargo build --release
cd /target/riscv32imac-unknown-none-elf/release
riscv64-unknown-elf-objcopy litex-example -O binary litex-example.bin
litex_term --kernel litex-example.bin /dev/ttyUSB0

From here we “just” need to implement HAL traits on top of the PAC to be able to use almost any embedded library in the Rust ecosystem. However, one challenge is that the peripherals and their names are not exactly set in stone. The way that I solved it is that the HAL crate only exports macros that generate HAL trait implementations. This way your SoC can have 10 SPI cores and you just have to call the spi macro to generate a HAL for them. I uploaded the code in this repo.

Of course so far we’ve only used the default SoC defined for the ULX3S. The real proof is if we can add a peripheral, write a HAL layer for it, and then use an existing library with it. I decided to add an SPI peripheral for the OLED screen. First I added the following pin definition

    ("oled_spi", 0,
        Subsignal("clk",  Pins("P4")),
        Subsignal("mosi", Pins("P3")),
    ("oled_ctl", 0,
        Subsignal("dc",   Pins("P1")),
        Subsignal("resn", Pins("P2")),
        Subsignal("csn",  Pins("N2")),

and then the peripheral itself

    def add_oled(self):
        pads = self.platform.request("oled_spi")
        pads.miso = Signal()
        self.submodules.oled_spi = SPIMaster(pads, 8, self.sys_clk_freq, 8e6)

        self.submodules.oled_ctl = GPIOOut(self.platform.request("oled_ctl"))

This change has actually been accepted upstream, so now you can just add the --add-oled command line option and you get a brand new SoC with an SPI controller for the OLED display. Once the PAC is generated again and the FullDuplex trait has been implemented for it, it is simply a matter of adding the SSD1306 or SDD1331 crate, and copy-pasting some example code. Just as easy as an Arduino, but on your own custom SoC!

Pepijn de Vos