Avoiding Verilog's Non-determinism, Part 2
At the end of my last post I promised I would have another non-determinism (AKA, race condition) example from recent real-life experience. Here it comes.
Before I show you any code I want to explain how this race condition was introduced. We had a signal in an interface that needed to be widened. We had a function in some simulation-only code that looked at part of that signal and didn't care about the new bits that were added. The engineer who widened the signal decided not to change the function and instead added a new variable and assigned (using the assign keyword) the bits of interest from the newly widened signal to this new variable. He then passed this new variable to the function in place of the newly widened signal. Seems reasonable, right? Well, after he made that change some tests started failing, and after some digging it began to look like a race condition, but it wasn't obvious where the race was coming from. The problem was that assign statement, because assign creates yet another process. The newly widened signal was given a value in one process and the assigned variable got updated in a separate process. There was now a race between those two values (the newly widened one and the assigned one) to get to the third process that consumed them (the process that includes the previously mentioned function).
Got all that? Here's the boiled-down code that illustrates what was going on:
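module top;
  reg [3:0] foo;
  wire      foo_0;
  reg       ready;

  // Continuous assignment: evaluated in its own process
  assign foo_0 = foo[0];

  // Producer process: updates foo, then ready
  initial begin
    #10;
    foo = 'h0;
    ready = 0;
    #10;
    foo = 'hf;
    ready = 1;
    #10;
    foo = 'h0;
    ready = 0;
  end

  // Consumer process: samples both values whenever ready changes
  initial begin
    forever begin
      @(ready);
      $display("foo[0]: 'b%0b", foo[0]);
      $display("foo_0: 'b%0b", foo_0);
      $display("------------");
    end
  end
endmodule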
And of course, there's an EDA Playground version that you can play with. See the differences between simulators? It was interesting that GPL Cver and Veriwell give the same results as the simulator I use at work (Cadence): the value of foo_0 is always a step behind the value of foo[0]. It was also interesting that if I change the wire to a reg, Cadence gives the same results as Modelsim and Icarus (plain Verilog won't allow that to be a reg, which is why I made it a wire for EDA Playground). That was not the behavior I was getting in the larger production testbench code, where foo_0 was a step behind. Non-determinism in the flesh.
The solution was to get rid of the new variable (foo_0) and the assign and just pass foo to the function. Modifying the function to deal with a wider foo wasn't too difficult, and the race between foo and foo_0 was eliminated. Easy, right? Trust me, fixing the problem wasn't the hard part. Identifying the root cause was much, much harder.
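As a rough sketch of what that fix looks like in the boiled-down example: the function below (bit_of_interest is a made-up stand-in, not the real production function) takes the whole of foo and picks out the bit it cares about itself. With no assign in between, foo is written before ready in the same process, so the consumer should see the current value no matter which simulator runs it.

```verilog
module top;
  reg [3:0] foo;
  reg       ready;

  // Hypothetical stand-in for the simulation-only function:
  // it accepts the full-width signal and ignores the bits it doesn't need.
  function bit_of_interest;
    input [3:0] value;
    begin
      bit_of_interest = value[0];
    end
  endfunction

  // Producer process: foo is always assigned before ready
  initial begin
    #10;
    foo = 'h0;
    ready = 0;
    #10;
    foo = 'hf;
    ready = 1;
    #10;
    foo = 'h0;
    ready = 0;
  end

  // Consumer process: reads foo directly, no intermediate wire
  initial begin
    forever begin
      @(ready);
      $display("foo[0]: 'b%0b", bit_of_interest(foo));
      $display("------------");
    end
  end
endmodule
```

Only two processes are left, and the value the consumer reads is written by the same process that wakes it up, so there is nothing left to race.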
I'm understanding Jan Decaluwe's concerns with Verilog more and more as I encounter and dig into these race conditions. Before reading his blog entries on Sigasi, I think I assumed concurrency and non-determinism had to go hand in hand. My embedded software background and experience with real-life concurrency was a large contributor to that opinion: isn't Verilog just modeling concurrency accurately when it's non-deterministic? Jan's point, I believe, is that the pain of this "accurate" modeling doesn't really buy you anything, and because processes in Verilog spring up and multiply almost (ok, definitely) without you noticing, it can be very difficult to have them communicate reliably. In a software language like C you know exactly when you are creating another thread, but in Verilog it is not so obvious. RTL Verilog has it easy because you can follow a basic (if over-restrictive) guideline (synchronize to the same clock and use non-blocking assignments in clocked processes), but in higher-level simulation-only code I don't know of an easy guideline you can follow. I guess the guideline is this: learn what creates a process and think about values racing from one process to another.
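To make that last guideline a little more concrete, here is a small sketch that labels each construct that creates a process. The module and signal names (process_census, clk, d, q, q_0, q_0_n) are invented for illustration. It also follows the RTL guideline mentioned above: the register is written with a non-blocking assignment in a clocked process, so the other process that reads it on the same clock edge should see the pre-edge value regardless of the order the simulator picks.

```verilog
module process_census;
  reg        clk;
  reg  [3:0] d;
  reg  [3:0] q;
  wire       q_0;
  wire       q_0_n;

  // Process 1: a continuous assignment is a process of its own
  assign q_0 = q[0];

  // Process 2: so is every primitive (gate) instance
  not u_inv (q_0_n, q_0);

  // Process 3: an initial block (stimulus, then end of simulation)
  initial begin
    clk = 0;
    d   = 4'h0;
    #12 d = 4'hf;
    #20 $finish;
  end

  // Process 4: an always block (free-running clock, period 10)
  always #5 clk = ~clk;

  // Process 5: clocked always block using a non-blocking assignment
  always @(posedge clk) q <= d;

  // Process 6: a second clocked block that reads q on the same edge.
  // Because q is updated with <=, this block sees the value q held
  // before the edge, whichever of processes 5 and 6 runs first.
  always @(posedge clk)
    $display("t=%0t q=%h q_0=%b q_0_n=%b", $time, q, q_0, q_0_n);
endmodule
```

That is six processes in about twenty lines, and only two of them (the initial block and arguably the clock generator) look like something you would call a "thread" at first glance.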
Comments
Here's the equivalent in VHDL and note that I only use signal assignments:
use std.textio.all;

entity top is
end entity;

architecture behave of top is
  signal foo:   bit_vector(3 downto 0);
  signal foo_0: bit;
  signal ready: bit;
begin
  -- equivalent of assign
  foo_0 <= foo(0);

  initial1: process
  begin
    wait for 10 ns;
    foo <= (others=>'0');
    ready <= '0';
    wait for 10 ns;
    foo <= (others=>'1');
    ready <= '1';
    wait for 10 ns;
    foo <= (others=>'0');
    ready <= '0';
    wait;
  end process initial1;

  initial2: process(ready)
    variable L: line;
  begin
    write(L, string'("foo[0]:"));
    write(L, foo(0));
    write(L, string'(" foo_0:"));
    write(L, foo_0);
    writeline(OUTPUT,L);
    write(L, string'(" -----------"));
    writeline(OUTPUT,L);
  end process initial2;
end behave;
Here's the output from Questa:
# Loading std.standard
# Loading std.textio(body)
# Loading work.top(behave)#1
VSIM 1> run -a
# foo[0]:0 foo_0:0
# -----------
# foo[0]:1 foo_0:0
# -----------
# foo[0]:0 foo_0:1
# -----------
Paul.
http://www.edaplayground.com/x/3ak
Thanks for putting that together.
With a different simulator, the original modified test bench might not have failed. The code would have become fragile, but no one would know yet.