Wednesday, April 30, 2014

A Quick Look at svlib

I just took a quick look at svlib from Verilab.  Very cool.  It's a library for SystemVerilog that gives you file globbing, regular expressions, a better string class, simple ini config file parsing (with yaml support promised for the future!), and more.  It was announced back in March and it took me this long to get around to reading about it.  Hopefully it doesn't take me that long to actually try it out :-)

They welcome feedback so brace yourself, here it comes.  First of all it's open source (Apache license) which is excellent.  It's open source and it has documentation.  Amazing!  :-)  It is not currently developed openly though.  Could we get a GitHub, Bitbucket, or SourceForge project going?  Our industry (design verification) desperately needs to admit and recognize that we are software developers.  I mean no, we are verifiers!  Bug finders!  It just so happens that writing software is the primary technique we use to verify designs and find bugs (hence the need for svlib).  We need to get better at writing software.  Open source projects are a great way for us to help each other develop those skills.  The paper and presentation talk about the trade-offs and design considerations that the svlib authors weighed as they wrote it.  How much better would it be for all of us to be able to see and participate in the discussions that led to the particular design of something like svlib?

Second item of feedback.  Recommending people do an import svlib_pkg::* is no surprise, but it's still bad.  C++ and Python programmers long ago realized how bad their equivalents are: using namespace and from import *.  We SystemVerilog programmers need to realize this too.  Brian Hunter makes the case for SystemVerilog in his seminal Namespaces, Build Order, and Chickens video.  You can see the reasoning for C++ and Python all over the internet.  As I have made this argument people have pointed out how awkward the alternative svlib_pkg::foo looks in your code.  A big thing that would help with that is to drop the _pkg suffix.  We don't need that suffix, it's like doing this. svlib::foo is not that bad and clearly shows exactly where foo came from.
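
To make the namespace point concrete, here is a minimal sketch.  The package strutil and its function shout are hypothetical names made up for this illustration; they are not part of svlib.  It only shows what the call site looks like with a short package name and no wildcard import.

```systemverilog
// Hypothetical package for illustration only (not svlib).
// Note the name has no _pkg suffix.
package strutil;
   // Return the upper-cased string with a "!" appended.
   function automatic string shout(string s);
      return {s.toupper(), "!"};
   endfunction
endpackage

module top;
   // Explicit import: the one name we use is listed up front.
   // A wildcard "import strutil::*;" would hide where shout came from.
   import strutil::shout;

   initial begin
      // Fully qualified call: the origin of shout is visible at the call site.
      $display("%s", strutil::shout("hello"));
      // Call through the explicit import above.
      $display("%s", shout("world"));
   end
endmodule
```

Either form tells the reader where shout lives; the wildcard import is the only one that doesn't.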

Third item of feedback.  It was bold and probably justified to put this all together in a single library named svlib.  It probably simplified some things and expedited getting the code working and out the door.  Those are good things.  Long-term though, I think we'll be better served if this were split into multiple smaller libraries.  Maybe one for regexes alone, one for os/system interactions, one for ini parsing, and so forth.  Python and its libraries are good examples to follow here.  If you make the project open I promise to help out with this where I can and I'm sure others would too.

Those concerns aside, svlib is a great thing to happen to the SystemVerilog community and hopefully just the start of better things to come.  Collaboration and sharing of libraries and tools like this will help our entire industry grow and progress.

Thursday, February 27, 2014

Avoiding Verilog's Non-determinism, Part 2

At the end of my last post I promised I would have another non-determinism (AKA, race condition) example from recent real-life experience. Here it comes.

Before I show you any code I want to explain how this race condition was introduced. We had a signal in an interface that needed to be widened. We had a function in some simulation-only code that looked at part of that signal and didn't care about the new bits that were added. The engineer who widened the signal decided not to change the function and instead added a new variable and assigned (using the assign keyword) the bits of interest from the newly widened signal to this new variable. He then passed this new variable to the original function in place of the original newly-widened one. Seems reasonable, right? Well, after he made that change some tests started failing and after some digging it began to look like a race condition, but it wasn't obvious where the race was coming from. The problem was that assign statement, because assign creates yet another process. The original newly widened signal was given a value in one process and the assigned variable got updated in a separate process. There was now a race between those two values (the newly-widened one and the assigned one) to get to the third process that consumed them (the process that includes the previously mentioned function).

Got all that? Here's the boiled-down code that illustrates what was going on:

module top;
   reg [3:0] foo;
   wire foo_0;
   reg ready;
   
   assign foo_0 = foo[0];

   initial begin
      #10;
      foo = 'h0;
      ready = 0;
      #10;
      foo = 'hf;
      ready = 1;
      #10;
      foo = 'h0;
      ready = 0;
   end

   initial begin
      forever begin
         @(ready);
         $display("foo[0]: 'b%0b", foo[0]);
         $display("foo_0:  'b%0b", foo_0);
         $display("------------");
      end
   end
endmodule

And of course, there's an EDA Playground version that you can play with. See the differences with different simulators? It was interesting that GPL Cver and Veriwell give the same results as the simulator I use at work (Cadence): the value of foo_0 is always a step behind the value of foo[0]. It's also interesting that if I change the wire to a reg, Cadence changes and gives the same results as Modelsim and Icarus (plain Verilog won't allow that to be a reg, which is why I made it a wire for EDA Playground). That was not the behavior I was getting in the larger production testbench code (foo_0 was a step behind there). Non-determinism in the flesh.

The solution was to get rid of the new variable (foo_0) and the assign and just pass the foo to the function. Modifying the function to deal with a wider foo wasn't too difficult and the race between foo and foo_0 was eliminated. Easy, right? Trust me, fixing the problem wasn't the hard part. Identifying the root cause was much, much harder.
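
Here is a rough sketch of what that fix looks like in the boiled-down example. The function name low_bit is made up for this illustration (the real function was different); the point is only that the consuming process reads foo directly, so there is no extra assign process in between.

```systemverilog
module top;
   reg [3:0] foo;
   reg ready;

   // Hypothetical simulation-only helper: it takes the whole widened
   // signal and picks out the one bit it cares about.
   function low_bit;
      input [3:0] value;
      begin
         low_bit = value[0];
      end
   endfunction

   // Same stimulus as before: foo is always written before ready,
   // and both are written by this single process.
   initial begin
      #10;
      foo = 'h0;
      ready = 0;
      #10;
      foo = 'hf;
      ready = 1;
      #10;
      foo = 'h0;
      ready = 0;
   end

   // The consumer wakes on ready and reads foo itself.
   // No foo_0 and no assign, so no third process to race with.
   initial begin
      forever begin
         @(ready);
         $display("foo[0]: 'b%0b", low_bit(foo));
         $display("------------");
      end
   end
endmodule
```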

I'm understanding Jan Decaluwe's concerns with Verilog more and more as I encounter and dig into these race conditions. I think before reading his blog entries on Sigasi I thought concurrency and non-determinism had to go hand in hand. My embedded software background and experience with real-life concurrency was a large contributor to that opinion: isn't Verilog just modeling concurrency accurately when it's non-deterministic? Jan's point, I believe, is that the pain of this "accurate" modeling doesn't really buy you anything, and because processes in Verilog spring up and multiply almost (ok, definitely) without you noticing, it can be very difficult to have them communicate reliably. In software like C you know exactly when you are creating another thread but in Verilog it is not so obvious. RTL Verilog has it easy because you can follow a basic (if over-restrictive) guideline (synchronize to the same clock and use non-blocking assignments in clocked processes), but in higher-level simulation-only code I don't know of an easy guideline you can follow. I guess the guideline is this: learn what creates a process and think about values racing from one process to another.

Wednesday, February 26, 2014

Avoiding Verilog's Non-determinism, Part 1

In my last post we looked at some example code that showed off Verilog's non-determinism. Here it is again (you can actually run it on multiple simulators here on EDA Playground):

module top;
   reg ready;
   integer result;

   initial begin
      #10;
      ready <= 1;
      result <= 5;
   end

   initial begin
      @(posedge ready);
      if(result == 5) begin
       $display("result was ready");
      end
      else begin
       $display("result was not ready");
      end
   end   
endmodule

Just to review from last time, the problem is that sometimes the @(posedge ready) will trigger before result has the value 5 and sometimes it will trigger after result has the value 5. We have called this non-determinism but a more common term for it is race condition. There is a race between the values of ready and result making it to that second process (the second initial block). If result is updated first (wins the race) then everything runs as the writer of the code intended. If ready is updated first (wins the race) then result will not actually be ready when the writer of the code expected it to be.

Now the question is, is there a way to write this code so that there is no race condition? Well, first of all I surveyed my body of work on simulation-only code and didn't find very many uses of non-blocking assignments like that. The common advice in the Verilog world is to use non-blocking assignments in clocked always blocks, not in "procedural" code like this. If we change the above to use blocking instead of non-blocking assignments, does that fix the problem? Here's what the new first initial block looks like:

   initial begin
      #10;
      ready = 1;
      result = 5;
   end

You can try it on EDA Playground and see that it still behaves the same as it did before, except with GPL Cver. With non-blocking assignments you get "result was not ready" with Cver and now you get "result was ready." That doesn't give me a lot of warm fuzzy feelings though. In fact, looking at that code makes me feel worse. If I'm thinking procedurally it looks totally backwards to set ready to one before assigning the value to result. My instinct would be to write the first initial block like this:

   initial begin
      #10;
      result = 5;
      ready = 1;
   end

Is that better for avoiding race conditions? If I take the explanation for why race-conditions exist in Verilog from Jan Decaluwe's VHDL's Crown Jewel post at face value, I think it actually is. That post explains that right after the first assignment (signal value update, if we use Jan's wording) in the first initial block Verilog could decide to trigger the second process (the second initial block). That case causes problems in the original code because the first assignment is to ready and result doesn't yet have its updated value. With the assignments re-ordered as above even if the second initial block is activated after the first assignment it will not try to read the value of result. It will just block waiting for a posedge ready (which will happen next). Race condition: eliminated. Here is the full fixed code example on EDA Playground.
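
For reference, here is the reordering described above sketched as a complete module. Nothing new is introduced; it is just the original example with blocking assignments and result written before ready. With that ordering the second process can only wake up after result already holds 5, so it should take the "result was ready" branch.

```systemverilog
module top;
   reg ready;
   integer result;

   // Producer: write the data first, then raise the flag.
   initial begin
      #10;
      result = 5;
      ready = 1;
   end

   // Consumer: blocks until ready rises, then reads result.
   // By the time ready changes, result has already been assigned.
   initial begin
      @(posedge ready);
      if (result == 5) begin
         $display("result was ready");
      end
      else begin
         $display("result was not ready");
      end
   end
endmodule
```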

Strangely enough, I spent the day yesterday debugging and fixing a race condition in our production testbench code here at work. It was very different from this one, so don't get too confident after reading this single blog post. I was able to boil the problem from yesterday down into another small example and so my next post will show off that code and how I eliminated that particular race.

UPDATE: As promised, another example of a race condition.

Saturday, February 22, 2014

Is Verilog's Non-determinism Really a Problem?

A series of blog posts by Jan Decaluwe criticize Verilog for being "non-deterministic" and therefore fundamentally broken.

This one has a code example that illustrates the fact. During a lively Twitter conversation about it I fleshed out the example and put it on EDA Playground so we could all view it and, even better, run it with all the simulators that EDA Playground provides and see what actually happens. Click here to see and run the example code yourself.

If you run the example you'll see that it behaves the same for all the simulators available on EDA Playground, except one. Jan explains that they are all compliant with the Verilog specification because the specification allows for either behavior. Jan expertly explains in this post how that can be.

Despite these clear examples and explanations I'm left wondering: why should I care? Apparently a lot of other users of the Verilog language have the same feeling as me. I tried to see Jan's point and ask some honest questions in comments and on Twitter and I learned some more. My first question is, if we were to try and synthesize his example Verilog code, what kind of hardware would we get? Wouldn't it be non-deterministic in exactly the same way as the Verilog code? And therefore wouldn't the non-deterministic Verilog be an accurate model and not a sign of Verilog's brokenness? The tweets I got in reply agreed that nobody would design hardware like this and so that's a non-issue.

In other words, if you are writing your Verilog in normal RTL style then the non-determinism is not a problem. When writing Verilog that will not be synthesized (simulation-only code) people rightfully abandon the restrictions of RTL. As Chris pointed out, people could then fall into the trap of writing code like this example. I believe that's true, so let's look more closely at this example and see what's going on.

Each initial block is a process executing concurrently. The first process assigns a value to result and signals the second process that result can be read by assigning a value of one to ready. The second process blocks until the value of ready changes from zero to one. It then immediately reads the value of result. Now, I have some extensive experience writing embedded C-code with multiple threads and processes. In that world you would be insane to synchronize two threads using simple shared variables like this. That's because, similar to Verilog, you can't predict when your two threads will be scheduled and run by the OS, when interrupts will occur, and so forth. Instead you would use a synchronization construct provided by the operating system such as a semaphore or mailbox. So again I ask, why do we care about this Verilog non-determinism? Isn't it just the same as in other software environments?

I think the answer to my own question might be, no it's not the same in plain Verilog. Sure, SystemVerilog added semaphores and mailboxes (for just this reason, I assume) but plain Verilog does not have those. Using shared variables is the only way to synchronize and share information between processes (really?). If I'm not wrong then that is indeed a problem for those who want to write Verilog code at a higher level of abstraction than RTL. In fact, I'm starting to wonder about the body of verification code that my team has written at work. Do we have any cases of code like this that could suffer from Verilog's non-deterministic behavior? We are using SystemVerilog and the UVM with its TLM interfaces that give you safe ways to communicate between processes so probably not, but I can imagine where someone could be tempted to work outside the nice safe structure of the UVM.
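
For comparison, here is a minimal sketch of how the same hand-off might look with a SystemVerilog mailbox instead of a shared variable plus a ready flag. This is my own illustration, not code from Jan's posts, and the name result_mbx is made up. The value and the "it's ready" notification travel together, so there is nothing left to race.

```systemverilog
module top;
   // Typed mailbox that carries the result from producer to consumer.
   mailbox #(int) result_mbx = new();

   // Producer: hand over the value once it is available.
   initial begin
      #10;
      result_mbx.put(5);
   end

   // Consumer: get() blocks until a value has been put in the mailbox,
   // so the value it returns is always one the producer actually sent.
   initial begin
      int result;
      result_mbx.get(result);
      if (result == 5) begin
         $display("result was ready");
      end
      else begin
         $display("result was not ready");
      end
   end
endmodule
```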

I'm hopeful that others will read this and chime in with any needed clarifications, corrections, and help. I have some ideas for modifying the example code to make it safer that I will explore in a separate blog entry. Stay tuned.

UPDATE: I have written the follow-on post that shows the fix for this particular code example.

Friday, August 2, 2013

SVUnit Upgrade

Being a relatively early adopter of svunit (for unit testing SystemVerilog code), I had a fair amount of code written to work with the early 0.X versions of svunit. The maintainers of svunit have made some good progress and are now on version 2.3 (as of writing this). My old tests don't work with the new version of the framework, but I figured out how to update them. Just in case anyone else is in the same predicament, I will share the steps I took to fix things:

  • In the *_unit_test.sv file:
    1. remove typedef class c_<UUT>_unit_test
    2. keep the module <UUT>_unit_test declaration, but delete everything in the module except for the string name… and any interface declarations you may have added
    3. delete the c_<UUT>_unit_test class declaration
    4. add svunit_testcase svunit_ut; under the string name…
    5. Now that this is a module and not a class, tasks and functions declared in here might need to have the automatic keyword added to the declaration in order to behave the same
    6. change function new… to function void build();
    7. change super.new(name); to svunit_ut = new(name);
    8. inside task setup, change super.setup(); to svunit_ut.setup();
    9. inside task teardown, change super.teardown(); to svunit_ut.teardown();
    10. remove (<testname>) from all `SVTEST_END macros (it no longer takes an argument)
    11. change last line of file from endclass to endmodule

That should be it.
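
Put together, a converted file ends up looking roughly like the skeleton below. Treat it as a sketch: my_uut and my_first_test are placeholder names, and the include line, the import, and the SVUNIT_TESTS_BEGIN, SVUNIT_TESTS_END, SVTEST and FAIL_IF macros are written from my understanding of the svunit 2.x template rather than from the steps above, so check them against the template your version of svunit generates.

```systemverilog
// Assumed boilerplate from the svunit 2.x template; verify for your version.
`include "svunit_defines.svh"
import svunit_pkg::*;

// Steps 2 and 11: this is a module now, not a class.
module my_uut_unit_test;

   string name = "my_uut_ut";

   // Step 4: the testcase object lives inside the module.
   svunit_testcase svunit_ut;

   // Steps 6 and 7: build() replaces the old constructor.
   function void build();
      svunit_ut = new(name);
   endfunction

   // Step 8
   task setup();
      svunit_ut.setup();
   endtask

   // Step 9
   task teardown();
      svunit_ut.teardown();
   endtask

   `SVUNIT_TESTS_BEGIN

   `SVTEST(my_first_test)
      // Placeholder check; replace with real checks on the UUT.
      `FAIL_IF(1 + 1 != 2);
   // Step 10: no argument here anymore.
   `SVTEST_END

   `SVUNIT_TESTS_END

endmodule
```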