Thursday, February 27, 2014

Avoiding Verilog's Non-determinism, Part 2

At the end of my last post I promised I would have another non-determinism (AKA, race condition) example from recent real-life experience. Here it comes.

Before I show you any code I want to explain how this race condition was introduced. We had a signal in an interface that needed to be widened. We had a function in some simulation-only code that looked at part of that signal and didn't care about the new bits that were added. The engineer who widened the signal decided not to change the function. Instead he added a new variable and assigned (using the assign keyword) the bits of interest from the newly widened signal to it. He then passed this new variable to the function in place of the widened signal. Seems reasonable, right? Well, after he made that change some tests started failing, and after some digging it began to look like a race condition, but it wasn't obvious where the race was coming from. The problem was that assign statement, because assign creates yet another process. The widened signal was given a value in one process and the assigned variable got updated in a separate process. There was now a race between those two values (the widened signal and the assigned copy) to get to the third process that consumed them (the process that calls the previously mentioned function).

Got all that? Here's the boiled-down code that illustrates what was going on:

module top;
   reg [3:0] foo;
   wire foo_0;
   reg ready;
   
   assign foo_0 = foo[0];

   initial begin
      #10;
      foo = 'h0;
      ready = 0;
      #10;
      foo = 'hf;
      ready = 1;
      #10;
      foo = 'h0;
      ready = 0;
   end

   initial begin
      forever begin
         @(ready);
         $display("foo[0]: 'b%0b", foo[0]);
         $display("foo_0:  'b%0b", foo_0);
         $display("------------");
      end
   end
endmodule

And of course, an EDA Playground version that you can play with. See the differences with different simulators? It was interesting that GPL Cver and Veriwell give the same results as the simulator I use at work (Cadence): the value of foo_0 is always a step behind the value of foo[0]. Also interesting: if I change the wire to a reg, Cadence changes and gives the same results as Modelsim and Icarus (plain Verilog won't allow that to be a reg, which is why I made it a wire for EDA Playground). That was not the behavior I was getting in the larger production testbench code, where foo_0 was a step behind. Non-determinism in the flesh.

The solution was to get rid of the new variable (foo_0) and the assign and just pass foo to the function. Modifying the function to deal with a wider foo wasn't too difficult, and the race between foo and foo_0 was eliminated. Easy, right? Trust me, fixing the problem wasn't the hard part. Identifying the root cause was much, much harder.
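
As a rough sketch of that fix applied to the boiled-down example, here is the same module with foo_0 and the assign removed. The function low_bit is a hypothetical stand-in for the real simulation-only function (not shown in this post); it takes the whole widened foo and looks only at the bit it cares about.

```verilog
module top;
   reg [3:0] foo;
   reg ready;

   // Hypothetical stand-in for the simulation-only function.
   // It accepts the full-width signal and ignores the bits it
   // does not care about.
   function low_bit;
      input [3:0] value;
      begin
         low_bit = value[0];
      end
   endfunction

   // Producer: foo is written before ready in the same process.
   initial begin
      #10;
      foo = 'h0;
      ready = 0;
      #10;
      foo = 'hf;
      ready = 1;
      #10;
      foo = 'h0;
      ready = 0;
   end

   // Consumer: reads foo directly, so no extra process sits
   // between the producer and this read.
   initial begin
      forever begin
         @(ready);
         $display("foo[0]: 'b%0b", low_bit(foo));
         $display("------------");
      end
   end
endmodule
```

With only two processes left, the value of foo is already in place each time the change on ready wakes up the consumer.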

I'm understanding Jan Decaluwe's concerns with Verilog more and more as I encounter and dig into these race conditions. Before reading his blog entries on Sigasi I thought concurrency and non-determinism had to go hand in hand. My embedded software background and experience with real-life concurrency was a large contributor to that opinion: isn't Verilog just modeling concurrency accurately when it's non-deterministic? Jan's point, I believe, is that the pain of this "accurate" modeling doesn't really buy you anything, and because processes in Verilog spring up and multiply almost (ok, definitely) without you noticing, it can be very difficult to have them communicate reliably. In software like C you know exactly when you are creating another thread, but in Verilog it is not so obvious. RTL Verilog has it easy because you can follow a basic (if over-restrictive) guideline (synchronize to the same clock and use non-blocking assignments in clocked processes), but in higher-level simulation-only code I don't know of an easy guideline you can follow. I guess the guideline is this: learn what creates a process and think about values racing from one process to another.
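
To make the RTL guideline concrete, here is a minimal sketch. The module and signal names are made up for illustration and are not from our production code. Each always block is its own process, but because both are synchronized to the same clock and use non-blocking assignments, the order in which the simulator runs them should not matter.

```verilog
module rtl_guideline (
   input  wire       clk,
   input  wire [3:0] d,
   output reg  [3:0] q1,
   output reg  [3:0] q2
);
   // Process 1: first register stage.
   always @(posedge clk) begin
      q1 <= d;
   end

   // Process 2: second register stage. It reads q1, which process 1
   // writes. With non-blocking assignments the right-hand sides are
   // sampled before any of the updates are applied, so q2 gets the
   // old value of q1 whichever process is evaluated first.
   always @(posedge clk) begin
      q2 <= q1;
   end
endmodule
```

If the two assignments used = instead of <=, the value q2 receives would depend on which always block the simulator happened to run first, which is exactly the kind of race described above.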

Wednesday, February 26, 2014

Avoiding Verilog's Non-determinism, Part 1

In my last post we looked at some example code that showed off Verilog's non-determinism. Here it is again (you can actually run it on multiple simulators here on EDA Playground):

module top;
   reg ready;
   integer result;

   initial begin
      #10;
      ready <= 1;
      result <= 5;
   end

   initial begin
      @(posedge ready);
      if(result == 5) begin
       $display("result was ready");
      end
      else begin
       $display("result was not ready");
      end
   end   
endmodule

Just to review from last time, the problem is that sometimes the @(posedge ready) will trigger before result has the value 5 and sometimes it will trigger after result has the value 5. We have called this non-determinism, but a more common term for it is race condition. There is a race between the values of ready and result making it to that second process (the second initial block). If result is updated first (wins the race) then everything runs as the writer of the code intended. If ready is updated first (wins the race) then result will not actually be ready when the second process reads it.

Now the question is, is there a way to write this code so that there is no race condition? Well, first of all I surveyed my body of work on simulation-only code and didn't find very many uses of non-blocking assignments like that. The common advice in the Verilog world is to use non-blocking assignments in clocked always blocks, not in "procedural" code like this. If we change the above to use blocking instead of non-blocking assignments, does that fix the problem? Here's what the new first initial block looks like:

   initial begin
      #10;
      ready = 1;
      result = 5;
   end

You can try it on EDA Playground and see that it still behaves the same as it did before, except with GPL Cver. With non-blocking assignments you get "result was not ready" with Cver, and with blocking assignments you get "result was ready." That doesn't give me a lot of warm fuzzy feelings though. In fact, looking at that code makes me feel worse. If I'm thinking procedurally it looks totally backwards to set ready to one before assigning the value to result. My instinct would be to write the first initial block like this:

   initial begin
      #10;
      result = 5;
      ready = 1;
   end

Is that better for avoiding race conditions? If I take the explanation for why race-conditions exist in Verilog from Jan Decaluwe's VHDL's Crown Jewel post at face value, I think it actually is. That post explains that right after the first assignment (signal value update, if we use Jan's wording) in the first initial block Verilog could decide to trigger the second process (the second initial block). That case causes problems in the original code because the first assignment is to ready and result doesn't yet have its updated value. With the assignments re-ordered as above even if the second initial block is activated after the first assignment it will not try to read the value of result. It will just block waiting for a posedge ready (which will happen next). Race condition: eliminated. Here is the full fixed code example on EDA Playground.
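
For reference, the complete module with the assignments in that order might look like the sketch below. It is the original example with blocking assignments and result written before ready; nothing else is changed.

```verilog
module top;
   reg ready;
   integer result;

   // Producer: write the data first, then raise the flag.
   initial begin
      #10;
      result = 5;
      ready = 1;
   end

   // Consumer: if it is activated between the two assignments above
   // it is still blocked on ready, so it cannot read a stale result.
   initial begin
      @(posedge ready);
      if (result == 5) begin
         $display("result was ready");
      end
      else begin
         $display("result was not ready");
      end
   end
endmodule
```

Whichever point the simulator picks to switch processes, result already holds 5 by the time ready changes, so this version should print "result was ready".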

Strangely enough, I spent the day yesterday debugging and fixing a race condition in our production testbench code here at work. It was very different from this one, so don't get too confident after reading this single blog post. I was able to boil the problem from yesterday down into another small example and so my next post will show off that code and how I eliminated that particular race.

UPDATE: As promised another example of a race condition.

Saturday, February 22, 2014

Is Verilog's Non-determinism Really a Problem?

A series of blog posts by Jan Decaluwe criticizes Verilog for being "non-deterministic" and therefore fundamentally broken.

This one has a code example that illustrates the fact. During a lively Twitter conversation about it I fleshed out the example and put it on EDA Playground so we could all view it and, even better, run it with all the simulators that EDA Playground provides and see what actually happens. Click here to see and run the example code yourself.

If you run the example you'll see that it behaves the same for all the simulators available on EDA Playground, except one. Jan explains that they are all compliant with the Verilog specification because the specification allows for either behavior. Jan expertly explains in this post how that can be.

Despite these clear examples and explanations I'm left with the feeling of: why should I care? Apparently a lot of other users of the Verilog language have the same feeling as me. I tried to see Jan's point and ask some honest questions in comments and on Twitter and I learned some more. My first question is, if we were to try and synthesize his example Verilog code, what kind of hardware would we get? Wouldn't it be non-deterministic in exactly the same way as the Verilog code? And therefore wouldn't the non-deterministic Verilog be an accurate model and not a sign of Verilog's brokenness? The tweets I got in reply agreed that nobody would design hardware like this and so that's a non-issue.

In other words, if you are writing your Verilog in normal RTL style then the non-determinism is not a problem. When writing Verilog that will not be synthesized (simulation-only code) people rightfully abandon the restrictions of RTL. As Chris pointed out, people could then fall into the trap of writing code like this example. I believe that's true, so let's look more closely at this example and see what's going on.

Each initial block is a process executing concurrently. The first process assigns a value to result and signals the second process that result can be read by assigning a value of one to ready. The second process blocks until the value of ready changes from zero to one. It then immediately reads the value of result. Now, I have some extensive experience writing embedded C-code with multiple threads and processes. In that world you would be insane to synchronize two threads using simple shared variables like this. That's because, similar to Verilog, you can't predict when your two threads will be scheduled and run by the OS, when interrupts will occur, and so forth. Instead you would use a synchronization construct provided by the operating system such as a semaphore or mailbox. So again I ask, why do we care about this Verilog non-determinism? Isn't it just the same as in other software environments?

I think the answer to my own question might be, no it's not the same in plain Verilog. Sure, SystemVerilog added semaphores and mailboxes (for just this reason, I assume) but plain Verilog does not have those. Using shared variables is the only way to synchronize and share information between processes (really?). If I'm not wrong then that is indeed a problem for those who want to write Verilog code at a higher level of abstraction than RTL. In fact, I'm starting to wonder about the body of verification code that my team has written at work. Do we have any cases of code like this that could suffer from Verilog's non-deterministic behavior? We are using SystemVerilog and the UVM with its TLM interfaces that give you safe ways to communicate between processes so probably not, but I can imagine where someone could be tempted to work outside the nice safe structure of the UVM.
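
One plain-Verilog construct that may be relevant to that "(really?)" is the named event (event, ->, and @). The sketch below is only an assumption about how the example could be restructured, not something taken from Jan's posts. The name result_valid is made up, and result is still a shared variable, so the event only takes over the signaling role that ready played.

```verilog
module top;
   integer result;
   event result_valid;   // hypothetical name, used only for signaling

   // Producer: write the data, then trigger the event.
   initial begin
      #10;
      result = 5;
      -> result_valid;
   end

   // Consumer: starts waiting on the event at time zero, well before
   // the producer triggers it at time 10.
   initial begin
      @(result_valid);
      if (result == 5) begin
         $display("result was ready");
      end
      else begin
         $display("result was not ready");
      end
   end
endmodule
```

Because result is assigned before the event is triggered in the same process, the read should see 5. A trigger that fires before the other process has reached its @ is simply missed, though, so this is not a general replacement for a semaphore or mailbox.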

I'm hopeful that others will read this and chime in with any needed clarifications, corrections, and help. I have some ideas for modifying the example code to make it safer that I will explore in a separate blog entry. Stay tuned.

UPDATE: I have written the follow-on post that shows the fix for this particular code example.

Friday, August 2, 2013

SVUnit Upgrade

Being a relatively early adopter of svunit (for unit testing SystemVerilog code), I had a fair amount of code written to work with the early 0.X versions of svunit. The maintainers of svunit have made some good progress and are now on version 2.3 (as of writing this). My old tests don't work with the new version of the framework, but I figured out how to update them. Just in case anyone else is in the same predicament, I will share the steps I took to fix things:

  • In the *_unit_test.sv file:
    1. remove typedef class c_<UUT>_unit_test
    2. keep the module <UUT>_unit_test declaration, but delete everything in the module except for the string name… and any interface declarations you may have added
    3. delete the c_<UUT>_unit_test class declaration
    4. add svunit_testcase svunit_ut; under the string name…
    5. Now that this is a module and not a class, tasks and functions declared in here might need to have the automatic keyword added to the declaration in order to behave the same
    6. change function new… to function void build();
    7. change super.new(name); to svunit_ut = new(name);
    8. inside task setup, change super.setup(); to svunit_ut.setup();
    9. inside task teardown, change super.teardown(); to svunit_ut.teardown();
    10. remove (<testname>) from all `SVTEST_END macros (it no longer takes an argument)
    11. change last line of file from endclass to endmodule

That should be it.

Friday, June 28, 2013

Git Branches Are Not Branches

Git branches have confused me (someone who uses mercurial a lot and git a little) for a while, and I have finally realized why. The problem is that git branch is a poorly chosen name for what these things really are. You see, all the changeset history in git is stored as a Directed Acyclic Graph (DAG). The code history might be simple and linear, which will make the DAG have a simple path like so (o's are nodes in the graph, called changesets, -'s are references from one node to another, with time progressing from left to right):

o-o-o-o-o

Or the code history and corresponding DAG could be more complicated:

                 o-o-o
                /     
    o-o-o     o-o-o-o-o
   /     \   /         \
o-o-o-o-o-o-o-o-o-o-----o-o

Most English language speakers would agree that those parts of the DAG (code history) where a node has two children (representing two parallel lines of development) are called branches. The above example has four branches in the history, four branches in the DAG, right? The confusion with git branches, however, is that the above diagram may actually represent a git repository with only one git branch, and the diagram above that with the linear history could represent a git repository with any number of git branches. A git branch is not a branch in the DAG representation of the changeset history.

This is possible because a git branch is actually just a label attached to a changeset. It's just a name associated with a node in the DAG, and you can add labels to any node you want. You can also delete these labels any time you want. I believe the git developers chose to use the term branch for these labels because the labels are primarily used to keep track of DAG branches, but in practice the overloading of the term causes a lot of confusion. When a git user says he's deleting a branch, he's really just deleting the label on the branch in the DAG. When a git user shows you a linear history like in the first diagram and then starts talking about the branches contained in that history, he's really just talking about the different labels applied to various changesets in that history.

Labels such as these are very common in computer programs and there are a number of common English terms that convey a much clearer picture of their function and purpose: label, tag, pointer, and bookmark come to mind. There are pages and pages of explanation on the internet that try to clarify what git branches are and what you can and can't do with them, when, I believe, using a better name would alleviate the need for most of that. Personally, I now just say label or tag or bookmark in my head whenever I read branch in a git context and things are much less confusing.

I hope that helps someone besides me who is learning git. Next week I'll talk about how the git index is nothing like an index :-)

(By the way, if you have a choice of which to use, mercurial works about the same as git and has better names for things)