At the end of my last post I promised I would have another non-determinism (AKA, race condition) example from recent real-life experience. Here it comes.
Before I show you any code I want to explain how this race condition was introduced. We had a signal in an interface that needed to be widened. We had a function in some simulation-only code that looked at part of that signal and didn't care about the new bits that were added. The engineer who widened the signal decided not to change the function and instead added a new variable and assigned (using the assign keyword) the bits of interest from the newly widened signal to this new variable. He then passed this new variable to the original function in place of the newly widened signal. Seems reasonable, right? Well, after he made that change some tests started failing, and after some digging it began to look like a race condition, but it wasn't obvious where the race was coming from. The problem was that assign statement, because assign creates yet another process. The newly widened signal was given a value in one process and the assigned variable got updated in a separate process. There was now a race between those two values (the newly widened one and the assigned one) to get to the third process that consumed them (the process that includes the previously mentioned function).
Got all that? Here's the boiled-down code that illustrates what was going on:
    module top;
      reg [3:0] foo;
      wire      foo_0;
      reg       ready;

      // The continuous assignment is its own process
      assign foo_0 = foo;

      // Producer: updates foo and ready
      initial begin
        #10;
        foo = 'h0;
        ready = 0;
        #10;
        foo = 'hf;
        ready = 1;
        #10;
        foo = 'h0;
        ready = 0;
      end

      // Consumer: reads foo and foo_0 whenever ready changes
      initial begin
        forever begin
          @(ready);
          $display("foo: 'b%0b", foo);
          $display("foo_0: 'b%0b", foo_0);
          $display("------------");
        end
      end
    endmodule
And of course, an EDA Playground version that you can play with. See the differences with different simulators? It was interesting that GPL Cver and Veriwell give the same results as the simulator I use at work (Cadence): the value of foo_0 is always a step behind the value of foo. Also interesting: if I change the wire to a reg, the Cadence simulator gives the same results as ModelSim and Icarus (plain Verilog won't allow that to be a reg, which is why I made it a wire for EDA Playground). That is not the behavior I was getting in the larger production testbench code, where foo_0 was a step behind. Non-determinism in the flesh.
The solution was to get rid of the new variable (foo_0) and the assign and just pass foo to the function. Modifying the function to deal with a wider foo wasn't too difficult, and the race between foo and foo_0 was eliminated. Easy, right? Trust me, fixing the problem wasn't the hard part. Identifying the root cause was much, much harder.
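Here's a minimal sketch of what that fix looks like on the boiled-down example. The function name low_bit is a hypothetical stand-in for the real simulation-only function (which I haven't shown), and I'm assuming the bit of interest is bit 0 of foo.

```verilog
module top;
  reg [3:0] foo;
  reg       ready;

  // Hypothetical stand-in for the simulation-only function.
  // It now accepts the full-width signal and picks out the
  // bit it cares about itself.
  function low_bit;
    input [3:0] value;
    begin
      low_bit = value[0];
    end
  endfunction

  // Producer: foo is assigned before ready in the same process
  initial begin
    #10;
    foo = 'h0;
    ready = 0;
    #10;
    foo = 'hf;
    ready = 1;
    #10;
    foo = 'h0;
    ready = 0;
  end

  // Consumer: no intermediate assign, so only one value (foo)
  // travels from the producer to this process
  initial begin
    forever begin
      @(ready);
      $display("foo: 'b%0b", foo);
      $display("foo[0]: 'b%0b", low_bit(foo));
      $display("------------");
    end
  end
endmodule
```

With the continuous assignment gone there is no third process sitting between the producer and the consumer, so the bit the function looks at always comes from the same foo that gets printed.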
I'm understanding Jan Decaluwe's concerns with Verilog more and more as I encounter and dig into these race conditions. Before reading his blog entries on Sigasi, I think I believed concurrency and non-determinism had to go hand in hand. My embedded software background and experience with real-life concurrency was a large contributor to that opinion: isn't Verilog just modeling concurrency accurately when it's non-deterministic? Jan's point, I believe, is that the pain of this "accurate" modeling doesn't really buy you anything, and because processes in Verilog spring up and multiply almost (OK, definitely) without you noticing, it can be very difficult to have them communicate reliably. In software like C you know exactly when you are creating another thread, but in Verilog it is not so obvious. RTL Verilog has it easy because you can follow a basic (if over-restrictive) guideline (synchronize to the same clock and use non-blocking assignments in clocked processes), but in higher-level simulation-only code I don't know of an easy guideline you can follow. I guess the guideline is this: learn what creates a process and think about values racing from one process to another.
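To make "learn what creates a process" a little more concrete, here is a rough sketch that labels the process-creating constructs I'd look for first. The module and signal names (process_census, data, go, low, inv) are made up for illustration, and the list is not exhaustive.

```verilog
module process_census;
  reg  [3:0] data;
  reg        go;
  wire       low;
  wire       inv;

  // Process: a continuous assignment
  assign low = data[0];

  // Process: a gate primitive instance
  not u_inv (inv, data[0]);

  // Process: an always block. It reads low and inv, which are
  // produced by the two processes above, and data, which is
  // produced by the initial block below.
  always @(go)
    $display("low=%b inv=%b data=%b", low, inv, data);

  // Process: an initial block
  initial begin
    data = 4'h0;
    go   = 0;
    #10;
    data = 4'hf;
    go   = 1;
    // Each branch of a fork-join runs as its own process too
    fork
      #5 data = 4'h3;
      #5 go   = 0;
    join
  end
endmodule
```

Every value the always block reads crosses a process boundary to get there, and each of those crossings is a place to stop and ask whether the value could arrive before or after the event that wakes the reader up.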