Thursday, February 27, 2014

Avoiding Verilog's Non-determinism, Part 2

At the end of my last post I promised I would have another non-determinism (AKA, race condition) example from recent real-life experience. Here it comes.

Before I show you any code I want to explain how this race condition was introduced. We had a signal in an interface that needed to be widened. A function in some simulation-only code looked at part of that signal and didn't care about the new bits that were added. The engineer who widened the signal decided not to change the function. Instead he added a new variable, assigned (using the assign keyword) the bits of interest from the newly widened signal to it, and passed this new variable to the function in place of the original signal. Seems reasonable, right? Well, after he made that change some tests started failing. After some digging it began to look like a race condition, but it wasn't obvious where the race was coming from. The problem was that assign statement, because assign creates yet another process. The widened signal was given a value in one process and the assigned variable got updated in a separate process. There was now a race between those two values (the widened signal and the assigned copy) to get to the third process that consumed them (the process that calls the previously mentioned function).

Got all that? Here's the boiled-down code that illustrates what was going on:

module top;
   reg [3:0] foo;
   wire foo_0;
   reg ready;
   
   assign foo_0 = foo[0];

   initial begin
      #10;
      foo = 'h0;
      ready = 0;
      #10;
      foo = 'hf;
      ready = 1;
      #10;
      foo = 'h0;
      ready = 0;
   end

   initial begin
      forever begin
         @(ready);
         $display("foo[0]: 'b%0b", foo[0]);
         $display("foo_0:  'b%0b", foo_0);
         $display("------------");
      end
   end
endmodule

And of course, an EDA Playground version that you can play with. See the differences with different simulators? It was interesting that GPL Cver and Veriwell give the same results as the simulator I use at work (Cadence): the value of foo_0 is always a step behind the value of foo[0]. Also interesting: if I change the wire to a reg, Cadence changes and gives the same results as ModelSim and Icarus (plain Verilog won't allow a continuous assign to drive a reg, which is why I made it a wire for EDA Playground). That was not the behavior I was getting in the larger production testbench code, where foo_0 was a step behind. Non-determinism in the flesh.

The solution was to get rid of the new variable (foo_0) and the assign and just pass foo to the function. Modifying the function to deal with a wider foo wasn't too difficult, and the race between foo and foo_0 was eliminated. Easy, right? Trust me, fixing the problem wasn't the hard part. Identifying the root cause was much, much harder.
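Here is a boiled-down sketch of that kind of fix, applied to the example above. The function name check_bit0 is hypothetical; it only stands in for the real simulation-only function, which isn't shown in this post. The point is that the consumer reads foo directly, so there is no second process sitting between the writer and the reader.

```verilog
module top;
   reg [3:0] foo;
   reg       ready;

   // Hypothetical stand-in for the simulation-only function.
   // It takes the full-width signal and ignores the bits it
   // does not care about.
   function check_bit0;
      input [3:0] value;
      begin
         check_bit0 = value[0];
      end
   endfunction

   // Writer process: foo is always assigned before ready,
   // in the same sequential block.
   initial begin
      #10;
      foo = 'h0;
      ready = 0;
      #10;
      foo = 'hf;
      ready = 1;
      #10;
      foo = 'h0;
      ready = 0;
   end

   // Reader process: by the time ready changes, foo already
   // holds its new value, and no assign sits in between.
   initial begin
      forever begin
         @(ready);
         $display("foo[0]: 'b%0b", check_bit0(foo));
         $display("------------");
      end
   end
endmodule
```

Because foo and ready are written with blocking assignments in one begin/end block, their order is fixed, and the reader should see the current foo[0] on every simulator.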

I'm understanding Jan Decaluwe's concerns with Verilog more and more as I encounter and dig into these race conditions. Before reading his blog entries on Sigasi, I think I believed concurrency and non-determinism had to go hand in hand. My embedded software background and experience with real-life concurrency was a large contributor to that opinion: isn't Verilog just modeling concurrency accurately when it's non-deterministic? Jan's point, I believe, is that the pain of this "accurate" modeling doesn't really buy you anything, and because processes in Verilog spring up and multiply almost (ok, definitely) without you noticing, it can be very difficult to have them communicate reliably. In a software language like C you know exactly when you are creating another thread, but in Verilog it is not so obvious. RTL Verilog has it easy because you can follow a basic (if over-restrictive) guideline (synchronize to the same clock and use non-blocking assignments in clocked processes), but in higher-level simulation-only code I don't know of an easy guideline you can follow. I guess the guideline is this: learn what creates a process and think about values racing from one process to another.
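For reference, the RTL guideline mentioned above can be sketched like this. The module and signal names (nba_sketch, count, sampled) are made up for illustration and are not from the production code. Two clocked processes communicate through a non-blocking assignment, so the consumer always samples the value count held before the clock edge, whichever process the simulator happens to run first.

```verilog
module nba_sketch;
   reg       clk;
   reg [3:0] count;
   reg [3:0] sampled;

   initial begin
      clk     = 0;
      count   = 0;
      sampled = 0;
      #45 $finish;
   end

   // Free-running clock, first rising edge at time 5.
   always #5 clk = ~clk;

   // Process 1 (producer): the non-blocking assignment defers
   // the update of count until all right-hand sides triggered
   // by this edge have been evaluated.
   always @(posedge clk)
      count <= count + 1;

   // Process 2 (consumer): reads the pre-edge value of count,
   // regardless of which always block is scheduled first.
   always @(posedge clk)
      sampled <= count;

   // Report on the falling edge, well away from the updates.
   always @(negedge clk)
      $display("t=%0t count=%0d sampled=%0d", $time, count, sampled);
endmodule
```

With blocking assignments (=) in the two clocked blocks, the value of sampled would depend on which block ran first. With non-blocking assignments, sampled should trail count by exactly one on every simulator.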

4 comments:

Paul Marriott said...

This is not a race condition, it's just an example of poor coding where you have different path delays.

Here's the equivalent in VHDL and note that I only use signal assignments:

use std.textio.all;
entity top is
end entity;
architecture behave of top is
signal foo: bit_vector(3 downto 0);
signal foo_0: bit;
signal ready: bit;

begin
-- equivalent of assign
foo_0 <= foo(0);

initial1: process
begin
wait for 10 ns;
foo <= (others=>'0');
ready <= '0';
wait for 10 ns;
foo <= (others=>'1');
ready <= '1';
wait for 10 ns;
foo <= (others=>'0');
ready <= '0';
wait;
end process initial1;

initial2: process(ready)
variable L: line;
begin
write(L, string'("foo[0]:"));
write(L, foo(0));
write(L, string'(" foo_0:"));
write(L, foo_0);
writeline(OUTPUT,L);
write(L, string'(" -----------"));
writeline(OUTPUT,L);
end process initial2;
end behave;

Here's the output from questa:

# Loading std.standard
# Loading std.textio(body)
# Loading work.top(behave)#1
VSIM 1> run -a
# foo[0]:0 foo_0:0
# -----------
# foo[0]:1 foo_0:0
# -----------
# foo[0]:0 foo_0:1
# -----------

Paul.

Paul Marriott said...

Sorry, I don't know how to format the code to preserve the indentation - I couldn't use the <pre> or <code> tags.

Paul.

Bryan said...

Paul, here's the link to your EDA Playground version of the VHDL code (which is formatted nicely):

http://www.edaplayground.com/x/3ak

Thanks for putting that together.

Jan Decaluwe said...

Saying that the VHDL is equivalent is wrong and misses the crucial point: VHDL guarantees the delta cycle but Verilog does not.

With a different simulator, the original modified test bench might not have failed. The code would have become fragile, but no one would know yet.