What if you took a boost converter and converted it to the rotational mechanical domain? Switching CVT!
At the University of Twente, they teach Bond Graphs, a modelling system for multi-domain systems that is just perfect for this job. Unlike domaing-specific systems or block diagrams, Bond Graphs model a system as abstract connections of power. Power is what you get when you multiply an effort with a flow. The two examples we’re interested is voltage × current and force × velocity, or to be exact, angular momentum × angular velocity.
Here is a schematic of a boost converter (source). It goes from a high voltage (effort, force) to a low voltage, but from a high current (flow, velocity) to a low current. It works by charging the inductor by shorting it to ground, and then discharging it via the diode into the capacitor.
The classic example of model equivalence is that an electrical inductor-capacitor-resistor system behaves equivalent to a mechanical mass-spring-damper system. In the rotational domain, the equivalent of a switch is a clutch, and the equivalent of a diode is a ratchet. So we have all we need to convert the system! Step one is making the bond graph from the electrical system.
Quick Bond Graph primer if you’re too lazy to read the Wikipedia page. Se is a source of effort. R, I, and C are generalized resistance, inertance, and compliance. mR is a modulated resistance I used for the switch/clutch. D is a diode/ratchet that I just made up. 0 junctions are sum of flows, equal effort. 1 junctions are sum of effort, equal flow. An ideal electrical net has equal voltage (effort), and a sum of currents, but a mechanical joint has an equal velocity (flow), but a sum of forces. With that in mind, we can convert the bond graph to the mechanical system.
I’m not sure if those are even remotely sane mechanical symbols, so I added labels just in case. The motor spins up a flywheel, and then when the clutch engages it winds up the spring. Then when the clutch is disengaged, the ratched keeps the spring wound up, driving the output while the motor can once more spin up the flywheel.
It works exactly analogous to the boost converter, and also suffers from the same problems. Most ciritically, switching/clutching losses. I imagine applying PWM to your clutch will at best wear it down quickly, and maybe just make it go up in smoke. Like with a MOSFET, the transition period where there is a nonzero effort and flow on the clutch, there is power loss and heat.
Anyway, I decided to build it in LEGO to see if it’d work. I used a high-speed ungeared motor that can drive absolutely no load at all, and connected it with a 1:1 gear ratio to the wheels with only a flywheel, clutch, ratchet, and spring inbetween. This proves that there is actually power conversion going on!
If you get rich making cars with this CVT system, please spare me a few coins. If you burn out your clutch… I told you so ;)
FPGAs are amazing in their versatility, but can be a real chore when you have to map out a giant state machine just to talk to some chip over SPI. For such cases, nothing beats just downloading an Arduino library and quickly hacking some example code. Or would there be a way to combine the versatility of an FPGA with the ease of Arduino libraries? That is the question I want to explore in this post.
Of course you can use an f32c softcore on your FPGA as an Arduino, but that’s a precompiled core, and basically doesn’t give you the ability to use your FPGA powers. Or you can build your own SoC with custom HDL components, but then you’re back to bare-metal programming.
Unless you can tap into an existing library ecosystem by writing a hardware abstraction layer for your SoC. And that is exactly what I’ve done by writing a Rust embedded HAL crate that works for any LiteX SoC!
LiteX allows you to assemble a SoC by connecting various components to a common Wishbone bus. It supports various RISC-V CPU’s (and more), and has a library of useful components such as GPIO and SPI, but also USB and Ethernet. These all get memory-mapped and can be accessed via the Wishbone bus by the CPU and other components.
The amazing thing is that LiteX can generate an SVD file for the SoC, which contains all the registers of the components you added to the SoC. This means that you can use svd2rust to compile this SVD file into a peripheral access crate.
This PAC crate abstracts away memory addresses, and since the peripherals themselves are reusable components, it is possible to build a generic HAL crate on top of it that supports a certain LiteX peripheral in any SoC that uses it. Once the embedded HAL traits are implemented, you can use these LiteX peripherals with every existing Rust crate.
The first step is to install LiteX I’m installing into a virtualenv to keep my system clean. While we’re going to use Rust, gcc is still needed for compiling the LiteX BIOS and for some objcopy action.
Now we need to make some decisions about which FPGA board and CPU we’re going to use. I’m going to be using my ULX3S, but LiteX supports many FPGA boards out of the box, and others can of course be added. For the CPU we have to pay careful attention to match it with an architecture that Rust supports. For example Vexrisc supports the im feature set by default, which is not a supported Rust target, but it also supports an i and imac variant, both of which Rust supports. PicoRV32 only supports i or im, so can only be used in combination with the Rust i target.
So let’s go ahead and make one of those. I’m going with the Vexrisc imac variant, but on a small iCE40 you might want to try the PicoRV32 (or even Serv) to save some space. Of course substitute the correct FPGA and SDRAM module on your board.
Most parameters should be obvious. --csr-svd ulx3s.svd tells LiteX to generate an SVD file for your SoC. You can omit --build and --load and manually do these steps by going to the build/ulx3s/gateware/ folder and running build_ulx3s.sh. I also prefer to use the awesome openFPGALoader rather than the funky ujprog with a sweet openFPGALoader --board ulx3s ulx3s.bit.
Now it is time to generate the PAC crate with svd2rust. This crate is completely unique to your SoC, so there is no point in sharing it. As long as the HAL crate can find it you’re good. Follow these instructions to create a Cargo.toml with the right dependencies. In my experience you may want to update the version numbers a bit. I had to use the latest riscv and riscv-rt to make stuff work, but keep the other versions to not break the PAC crate.
cargo new --lib litex-pac
cd litex-pac/src
svd2rust -i ulx3s.svd --target riscv
cd ..
vim Cargo.toml
Now we can use these instructions to create our first Rust app that uses the PAC crate. I pushed my finished example to this repo. First create the app as usual, and add dependencies. You can refer to the PAC crate as follows.
litex-pac={path="../litex-pac",features=["rt"]}
Then you need to create a linker script that tells the Rust compiler where to put stuff. Luckily LiteX generated the important parts for us, and we only have to define the correct REGION_ALIAS expressions. Since we will be using the BIOS, all our code will get loaded in main_ram, so I set all my aliases to that. It is possible to load code in other regions, but my attempts to put the stack in SRAM failed horribly when the stack grew too large, so better start with something safe and then experiment.
Next, you need to actually tell the compiler about your architecture and linker scripts. This is done with the .cargo/config file. This should match the Rust target you installed, so be mindful if you are not using imac. Note the regions.ld file that LiteX generated, we’ll get to that in the next step.
Then create flash.sh with the following content, which allows cargo run to upload the binary.
#!/usr/bin/env bashset-e# Create bin file
riscv64-elf-objcopy $1-O binary $1.bin
# Program FPGA
litex_term --kernel litex-example.bin /dev/ttyUSB0
The final step before jumping in with the Rust programming is writing a build.rs file that copies the linker scripts to the correct location for the compiler to find them. I mostly used the example provided in the instructions, but added a section to copy the LiteX file. export BUILD_DIR to the location where you generated the LiteX SoC.
letmutf=File::create(&dest_path.join("regions.ld")).expect("Could not create file");f.write_all(include_bytes!(concat!(env!("BUILD_DIR"),"/software/include/generated/regions.ld"))).expect("Could not write file");
That’s it. Now the code you compile will actually get linked correctly. I found these iCEBreaker LiteX examples very useful to get started. This code will actually run with minimal adjustment on our SoC, and is a good start to get a feel for how the PAC crate works. Another helpful command is to run cargo doc --open in the PAC crate to see the generated documentation.
You can now use the runner script to convert the ELF executable to a raw binary and upload it. Don’t forget to press the reset button after uploading or nothing will happen.
cargo run --release
From here we “just” need to implement HAL traits on top of the PAC to be able to use almost any embedded library in the Rust ecosystem. However, one challenge is that the peripherals and their names are not exactly set in stone. The way that I solved it is that the HAL crate only exports macros that generate HAL trait implementations. This way your SoC can have 10 SPI cores and you just have to call the spi macro to generate a HAL for them. I uploaded the code in this repo.
Of course so far we’ve only used the default SoC defined for the ULX3S. The real proof is if we can add a peripheral, write a HAL layer for it, and then use an existing library with it. I decided to add an SPI peripheral for the OLED screen. First I added the following pin definition
This change has actually been accepted upstream, so now you can just add the --add-oled command line option and you get a brand new SoC with an SPI controller for the OLED display. Once the PAC is generated again and the FullDuplex trait has been implemented for it, it is simply a matter of adding the SSD1306 or SDD1331 crate, and copy-pasting some example code. Just as easy as an Arduino, but on your own custom SoC!
I believe in the importance of open source synthesis, and think it’s important that open source tools support both Verilog and VHDL. Even though my GSoC proposal to add VHDL support to Yosys was rejected, I’ve still been contributing small bits and pieces to GHDL and its Yosys plugin.
This week we reached what I think is an important milestone: I was able to synthesize my VHDL CPU and then formally verify the ALU of it using completely open source tools. (and then synthesize it to an FPGA which is not supported by open source tools yet) There is a lot to unpack here, so let’s jump in.
Yosys, Nextpnr, SymbiYosys, GHDL, ghdlsynth-beta
Yosys is an open source synthesis tool that is quickly gaining momentum and supporting more and more FPGAs. Yosys currently supports Verilog, and turns that into various low-level netlist representations.
Nextpnr is a place-and-rout tool, which takes a netlist and turns it into a bitstream for any of the supported FPGA types. These bitstream formats are not publicly documented, so this is a huge reverse-engineering effort.
SymbiYosys is a tool based around Yosys and various SAT solvers to let you do formal verification on your code. More on formal verification later. But important to know is that it works on the netlists generated by Yosys.
GHDL is an open source VHDL simulator, and as far as I know, one of its kind. VHDL is notoriously hard to parse, so many other open source attempts at VHDL simulation and synthesis have faltered. Work is underway to add synthesis to GHDL.
And last but not least, ghdlsynth-beta is a plugin for Yosys that converts the synthesis format of GHDL to the intermediate representation of Yosys, allowing it to be synthesized to various netlist formats and used for FPGA, ASIC, formal verification, and many other uses. It is currently a separate repository, but the goal is to eventually upstream it into Yosys.
Formal Verification
I think formal verification sounds harder and more scary than it is. An alternative description is property testing with a SAT solver. Think Quickcheck, not Coq. This is much simpler and less formal than using a proof assistent.
Basically you describe properties about your code, and SymbiYosys compiles your code an properties to a netlist and from a netlist to Boolean logic. A SAT solver is then used to find inputs to your code that (dis)satisfy the properties you described. This does not “prove” that your code is correct, but it proves that it satisfies the properties you defined.
In hardware description languages you describe properties by assertions and assumptions. An assumption constrains what the SAT solver can consider as valid inputs to your program, and assertions are things you believe to be true about your code.
The powerful thing about formal verification is that it considers all valid inputs at every step, and not just the happy case you might test in simulation. It will find so many edge cases it’s not even funny. Once you get the hang of it, it’s actually less work than writing a testbench. Just a few assertions in your code and the bugs come flying at you.
If you want to learn more about formal verification, Dan Gisselquist has a large number of articles and tutorials about it, mainly using Verilog.
Installation
To play along at home, you need to install a fair number of programs, so better get some of your favourite hot beverage.
At this point you should be able to run ghdl --synth foo.vhd -e foo which will output a VHDL representation of the synthesized netlist. You should be able to run yosys -m ghdl and use the Yosys command ghdl foo.vhd -e foo to obtain a Yosys netlist which you can then show, dump, synth, or even write_verilog.
Verifying a bit-serial ALU
To demonstrate how formal verification works and why it is so powerful, I want to walk you through the verification of the ALU of my CPU.
I’m implementing a bit-serial architecture, which means that my ALU operates on one bit at a time, producing one output bit and a carry. The carry out is then the carry in to the next bit. The logic that produces the output and the carry depends on a 3-bit opcode.
process(opcode,a,b,ci)begincaseopcodeiswhen"000"=>-- addy<=axorbxorci;-- outputco<=(aandb)or(aandci)or(bandci);-- carrycr<='0';-- carry reset value-- [...]endcase;endprocess;process(clk,rst_n)beginif(rising_edge(clk))thenif(rst_n='0')thenci<=cr;-- reset the carryelseci<=co;-- copy carry out to carry inendif;endif;endprocess;
Important to note is the carry reset value. For addition, the first bit is added without carry, but for subtraction the carry is 1 because -a = (not a) + 1, and similarly for other different opcodes. So when in reset, the ALU sets the carry in to the reset value corresponding to the current opcode.
So now onward to the verification part. Since VHDL only has assert and none of the SystemVerilog goodies, Property Specification Language is used. (that link contains a good tutorial) PSL not only provides restrict, assume, and cover, but also allows you to express preconditions and sequences.
To make my life easier, I want to specify that I want to restrict valid sequences to those where the design starts in reset, processes 8 bits, back to reset, and repeat, so that the reset will look like 011111111011111111...
restrict { {rst_n = '0'; (rst_n = '1')[*8]}[+]};
Then, I want to specify that when the ALU is active, the opcode will stay constant. Else you’ll just get nonsense.
Note that I did not define any clock or inputs. Just limiting the reset and opcode is sufficient. With those assumptions in place, we can assert what the output should look like. I shift the inputs and outputs into 8-bit registers, and then when the ALU goes into reset, we can verify the output. For example, if the opcode is “000”, the output should be the sum of the two inputs.
And now all that’s left to do is write the SymbiYosys script and run it. The script just specifies how to compile the files and the settings for the SAT solver. Note that -fpsl is required for reading --psl code in comments, or --std=08 to use VHDL-2008 which supports PSL as part of the core language.
[options]
mode bmc
depth 20
[engines]
smtbmc z3
[script]
ghdl --std=08 alu.vhd -e alu
prep -top alu
[files]
alu.vhd
To load the GHDL plugin, SymbiYosys has to be run as follows:
Oh no! We have a bug! Let’s open the trace to see what went wrong.
gtkwave alu/engine_0/trace.vcd
So we’re doing a subtraction, and according to my math 29-150=-121 but the ALU output is -122, so we’re off by one. A little head-scratching later, we can see the problem: On the first cycle of the subtraction the carry in is zero rather than one! Why? Because on the previous clock cycle the instruction was exclusive or, which reset the carry in to zero.
Note that this bug would never show up if you did a test bench that executes a fixed instruction from reset. But the SAT solver managed to find a specific sequence of opcodes that cause the carry to be wrong. Awesome.
So how do we fix it? There are two ways. The first is to change the code to asynchronously determine the carry in. The second is to write you code so the opcode is stable before the ALU comes out of reset, which ended up using less logic. In this case we can change the opcode assumption to
Note that we used the thin arrow |-> rather than the fat arrow |=> now. The fat arrow triggers after the precondition has been met, while the thin arrow overlaps with the end of the precondition. So now we’re saying that when reset became inactive, the opcode is the same as it was while the device was in reset. Let’s try again.
It should be said that all of this is very experimental and you are therefore likely to run into bugs and missing features. I would say that at this point it is feasible to write new code and work around GHDL’s current limitations (or fix them!), but running large existing codebases is unlikely to be successful. (but very much the goal!)
When you run into errors, the first step is to find out if it is a bug in the plugin or GHDL itself.
If you see Unsupported(1): instance X of Y. this means the plugin does not know how to translate a GHDL netlist item to Yosys. These are usually pretty easy to fix. See this pull request for an example. Good to know: Id_Sub is defined in ghdlsynth_gates.h which is generated from netlists-gates.ads. module->addSub is defined in rtlil.h.
If you just see ERROR: vhdl import failed. this likely means GHDL crashed. Run GHDL outside Yosys (ghdl --synth) to see the actual error. Usually it’ll show something like some_package: cannot handle IIR_KIND_SOMETHING (mycode.vhd:26:8) which means that some_pacakge in the src/synth part of GHDL can’t handle some language construct yet. This can be anything from a missing operator to whole language constructs, and the fix can be anything for copy-pasting another operator to a serious project. See this pull request for an example on how to add a missing operator.
If it’s not obvious what is going on, it’s time to break out gdb. It’s important to know that in the GHDL repo there is a .gdbinit that you can source inside gdb. This enables catching exceptions and includes utilities for printing IIR values. If you want to debug inside Yosys, it is helpful to first run the program without breakpoints so all the libraries are loaded and gdb understands there is Ada code involved. Thensource .gdbinit, set breakpoints, and run again. (note: GHDL/Yosys command line arguments are passed to run and not gdb)