My point has been consistent and straightforward: SystemC offers little benefit over Verilog/VHDL/SystemVerilog for the expression of control logic (and, probably, is even a step backward for complex control designs; with SystemC, the description will necessarily be just as low-level and manual, but it will add an additional translation step away from RTL).
But, in this week's EETimes, Gabe Moretti quotes Gary Smith as saying that "Now we have ESL synthesizers from Forte Design Systems and Cadence that target both the Algorithmic and Control logic domains. Because of that we are moving away from the three domains (algorithmic, processor/memory, and control logic) view of ESL to a more traditional look at the methodology. The ESL methodology is indeed maturing."
I'd love to understand how things have matured in the control area -- as I'm not sure that much has beyond the messaging. Another piece this week might provide some clues:
John Sanguinetti of Forte has a contributed article for EDADesignLine called "Abstraction and Control-Dominated Hardware Designs". Anyone considering SystemC for control logic should take a critical look at the example provided, particularly as it's the only example supporting the thesis. A cursory look at the example would indicate that there's a good code saving with the SystemC implementation -- a closer look highlights that this is a very special example, and one that doesn't particularly support the notion that SystemC provides unique abstraction for control logic. Here are some things I noticed:
- The code illustrated is a basic sequential set of steps, with no flow control, no conflicts with shared resources, no conflicts with other FSMs in the system -- sure, it's "control" code, but it's hardly representative of what makes control complex.
- The article mentions that this is illustrating a cooperating FSM (because it "cooperates" with a memory, I guess). Well, this is a pretty special (not to mention convenient) case of a "cooperating" FSM: the interactions are deterministic and stepwise and mirror images of each other. This is hardly an illustration of cooperating FSMs that have any usefully interesting interactions.
- Isn't this showing abstraction? It definitely is -- and, in general, it's powerful to be able to hide code in a library and reuse it. It's especially elegant to overload operators as illustrated in this example. In SystemVerilog RTL, I don't believe you would be able to do this type of overloading and have it synthesize. That said, I believe you could do almost the exact same thing -- and provide almost identical succinctness -- by using SystemVerilog tasks (I'm not an expert at SystemVerilog RTL -- so I'm not sure I'm using the right terminology). Basically, the point is that there's nothing about RTL that precludes this type of abstraction -- and I believe you could closely mimic it. Does that make SystemVerilog ESL?
- The code that's "eliminated" in this example has to be written elsewhere in a class library -- so, for a single use, there's no code succinctness.
- What does the code in the library look like? Is it much better than what's illustrated in the RTL in this example? Or, is it RTL-like SystemC code put away in a library so that you write it once instead of at every instance?
- What if the memory subsystem behaved differently every time? For example, what if it completed transactions in differing numbers of cycles? What would that code look like?
- How flexible is this code -- and how prone to error is it for an engineer to use, especially now that it's been hidden away? For example, in this design, it's assumed that only one process accesses the memory at any one time -- who guarantees this and how? Is it only usable in situations where you know it's the only thing accessing memory -- or accessing it "at that time"? What's required to use it if you need to worry about other processes potentially accessing the memory at the same time? If multiple processes try to access the memory simultaneously, what would happen? I presume you'd get a bug, unless the library accounts for this (see the next point).
- What would the code look like in the library if it had to account for multiple processes needing to access the single memory resource at the same time? Would the synthesis work for this particular case where the memory access is abstracted with []?
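To make the questions above concrete, here is a rough sketch in plain C++ (not SystemC, and not the code from the article) of what a library wrapper that overloads [] might hide. Everything in it is an assumption made for illustration: the names SimpleMemory and MemoryPort, the request/tick handshake, and the rule that latency depends on the address. The wrapper copes with variable latency by simply waiting, but its only answer to a second requester is to fail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical cycle-level memory model; names and protocol are illustrative.
class SimpleMemory {
public:
    explicit SimpleMemory(std::size_t words) : data_(words, 0) {}

    void poke(std::size_t addr, std::uint32_t value) { data_.at(addr) = value; }

    // Start a read. Returns false if another read is still in flight.
    bool request(std::size_t addr) {
        if (busy_) return false;
        busy_ = true;
        addr_ = addr;
        // Assumed latency rule, purely for illustration: 1 to 3 wait cycles.
        wait_ = 1 + static_cast<unsigned>(addr % 3);
        return true;
    }

    // Advance one clock. Returns true on the cycle the data becomes valid.
    bool tick(std::uint32_t& out) {
        if (!busy_) return false;
        if (wait_ > 0) { --wait_; return false; }
        out = data_.at(addr_);
        busy_ = false;
        return true;
    }

private:
    std::vector<std::uint32_t> data_;
    bool busy_ = false;
    std::size_t addr_ = 0;
    unsigned wait_ = 0;
};

// Library-side wrapper: operator[] hides the handshake from the caller.
class MemoryPort {
public:
    explicit MemoryPort(SimpleMemory& mem) : mem_(mem) {}

    std::uint32_t operator[](std::size_t addr) {
        // The wrapper silently assumes it is the only requester.
        if (!mem_.request(addr)) {
            throw std::logic_error("memory busy: single-requester assumption violated");
        }
        std::uint32_t value = 0;
        do { ++cycles_; } while (!mem_.tick(value));  // wait out any latency
        return value;
    }

    unsigned cycles() const { return cycles_; }

private:
    SimpleMemory& mem_;
    unsigned cycles_ = 0;
};

// User-side code: three reads that look like ordinary array accesses.
std::uint32_t sum_three(MemoryPort& port) {
    std::uint32_t a = port[0];
    std::uint32_t b = port[1];
    std::uint32_t c = port[2];
    return a + b + c;
}

// Small driver showing the succinct call site.
std::uint32_t demo_sum() {
    SimpleMemory mem(8);
    mem.poke(0, 10);
    mem.poke(1, 20);
    mem.poke(2, 30);
    MemoryPort port(mem);
    std::uint32_t total = sum_three(port);
    assert(total == 60);
    return total;
}
```

The call site in sum_three is as short as one could wish, but the handshake, the wait loop, and the single-requester assumption have only moved into MemoryPort. Supporting a second requester would mean adding arbitration there, which is exactly the code this sketch leaves out.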
These more complex interactions are one important area where atomic transactions offer a profoundly better abstraction than RTL, making complex concurrency dramatically simpler to express, easier to change, and much more. Abstraction for control is about addressing and improving shared resource management, arbitration, scheduling, flow control, etc.
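For readers unfamiliar with the idea, here is a toy model in plain C++ of what I mean by atomic transactions: each rule is a guard plus an action, and a rule fires only when its guard holds, with its action running as one indivisible step. The Rule struct, the run_cycle scheduler, and the two-entry FIFO are all hypothetical names for this sketch, and the one-rule-at-a-time loop is a simplification, not a description of how any real tool schedules hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// A rule is a guard (may it fire?) plus an action (what it does atomically).
struct Rule {
    std::function<bool()> guard;
    std::function<void()> action;
};

// Toy scheduler: in each "cycle", fire every rule whose guard holds, one at
// a time, so each action sees a consistent state. Returns how many fired.
std::size_t run_cycle(const std::vector<Rule>& rules) {
    std::size_t fired = 0;
    for (const Rule& r : rules) {
        if (r.guard()) {
            r.action();
            ++fired;
        }
    }
    return fired;
}

// A producer and a consumer share a small FIFO. Flow control lives entirely
// in the guards; neither rule knows the other exists.
int demo_transfer() {
    std::deque<int> fifo;
    const std::size_t depth = 2;  // assumed FIFO depth
    int next = 0;                 // next value to produce
    int total = 0;                // sum of consumed values

    std::vector<Rule> rules = {
        // Producer: enqueue while there is room and work left.
        { [&] { return fifo.size() < depth && next < 5; },
          [&] { fifo.push_back(next++); } },
        // Consumer: dequeue whenever something is available.
        { [&] { return !fifo.empty(); },
          [&] { total += fifo.front(); fifo.pop_front(); } },
    };

    while (run_cycle(rules) > 0) {
        // Keep stepping until no rule can fire.
    }
    assert(fifo.empty());
    return total;
}
```

Adding a third rule that also touches the FIFO would not require editing the two existing rules, which is the ease-of-change property I'm pointing to.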
I'm not saying that SystemC can't be used to express control logic -- and I'm not suggesting that SystemC doesn't provide benefits over C/C++. It does -- and I'm sure there are many times where a dataflow implementation benefits from finer grained control and concurrency expressiveness. But, these aren't the fundamental questions.
The fundamental question is: what types of designs benefit from synthesizable design with SystemC versus Verilog/VHDL/SystemVerilog? And, how do they benefit? Abstraction buys you little if the quality of the results is not acceptable. Abstraction also buys you little if it improves the 20%, not the 80%. SystemC buys you little when there's little abstraction.
2 comments:
In the context of designing SoCs, I don't see a need for one unifying design language: SoC designers are not going to start designing their CPUs in C, SystemC or SystemVerilog – those will be purchased as IP from the usual suspects. They won’t design SRAMs in these languages – those will be generated via SRAM generators. The critical thing is that designers are able to create or acquire each part efficiently, model all of these pieces together, and, when the parts are integrated, not lose the benefits gained in design and verification.
Okay, then which IP should be designed in which language? I'll argue that we should go for the highest level of abstraction that provides good quality of results because that will deliver the lowest design and silicon cost.
I'm focused on the design of the algorithmic IP, which is typically the heart of an SoC for consumer markets and where companies are differentiating in power, performance, area and features. This is a prime target for high level tools because it often has design goals and proprietary content that preclude using off-the-shelf or configurable IP.
For this type of IP, I think C/C++ is ideal. It offers a simple programming model that is familiar to many designers, a high level of abstraction to manage complexity, and both excellent productivity and QoR, provided that the synthesis tool has the features to deliver good results. Many people will be surprised at the extent to which we can already deliver good QoR for large, abstract C applications.
To start from C or C++, we need to specify the behavior in a sequential, deterministic way. If we approach system design bottom-up, we run into problems when we encounter control-oriented parts of the design, for example, memory arbitration. The solution to this is in the synthesis tool. If the tool can instantiate memory arbitration based on inference or user specification, then we can retain a high level of abstraction in the source, but get the desired behavior in the hardware.
To write a sequential source in the arbitration case, the tool must be capable of designing at a high level so that the arbitration is encapsulated. We can write a video codec in C at the high level even though the corresponding HW will be parallel and may even be non-deterministic at the micro-level (e.g. re-ordering of DMA, local memory access, access to processing resources, etc.).
If the tool has this capability, as Synfora's PICO does, we can use very simple concepts (e.g., accesses to an array from separate procedures or loop nests) and the tool can instantiate very complex and interesting hardware such as arbitrated memories.
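As a rough illustration of the idea, the sketch below shows the kind of sequential source I mean (two loop nests in separate procedures touching one array), followed by a toy round-robin arbiter that models the decision hardware would have to make if both procedures ran concurrently against a single memory port. All names here are hypothetical, the code is plain C++ for readability, and it should not be read as PICO's input format or as the hardware PICO generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWords = 64;  // assumed buffer size for the sketch

// Sequential source, first loop nest: fill the shared array.
void fill_scaled(std::array<int, kWords>& buf) {
    for (std::size_t i = 0; i < kWords; ++i) {
        buf[i] = static_cast<int>(i) * 2;
    }
}

// Sequential source, second loop nest: read the same array.
int sum_all(const std::array<int, kWords>& buf) {
    int total = 0;
    for (std::size_t i = 0; i < kWords; ++i) {
        total += buf[i];
    }
    return total;
}

// The source says nothing about time, ports, or arbitration.
int run_sequential() {
    std::array<int, kWords> buf{};
    fill_scaled(buf);
    return sum_all(buf);
}

// Toy model of a two-requester round-robin arbiter: the sort of logic that
// has to exist somewhere once both loop nests share one memory port.
struct RoundRobinArbiter {
    int last = 1;  // start so that requester 0 wins the first tie

    // Returns the granted requester (0 or 1), or -1 if nobody is requesting.
    int grant(bool req0, bool req1) {
        const bool req[2] = {req0, req1};
        for (int k = 1; k <= 2; ++k) {
            const int candidate = (last + k) % 2;  // start after last winner
            if (req[candidate]) {
                last = candidate;
                return candidate;
            }
        }
        return -1;
    }
};

// Small driver: ties alternate between the two requesters.
bool demo_fairness() {
    RoundRobinArbiter arb;
    const int first = arb.grant(true, true);
    const int second = arb.grant(true, true);
    assert(first != second);
    return first == 0 && second == 1;
}
```

The point of the contrast is where the arbiter lives: in the approach described here it is instantiated by the tool from inference or user specification, so nothing like RoundRobinArbiter would appear in the designer's source.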
Arbitration is just one example, though. The goal is really to have simple and well-known concepts for many interesting hardware behaviors. There are many such examples: Multi-threaded hardware behavior, flow control between different blocks, memory mapping, stalling for a variable latency process (e.g., DRAM access), timing control of overlapping/parallel processes, etc.
The more the source code has to specify these factors (for example, as in SystemVerilog, in SystemC using hardware modeling features, or in similar languages) the more we lose abstraction and the corresponding benefits – we move back toward RTL, maybe in different syntax. In some cases (like SystemC), the approach reflects a desire to do “everything” in one language assuming that the compiler cannot handle all this with a high level of abstraction. The result is a mix of high level synthesis and hardware control written in pseudo-RTL. If we can retain a high level of abstraction for more of the system design, we gain more benefits from the automated design and verification.
Synfora has adopted this different approach based on a sophisticated compiler technology and a flexible hardware template that reflects the real types of hardware that people are designing by hand. The result is that the tool gives the architect a rich set of design, exploration and control options to build hardware with good quality of results and a high level of abstraction, with no specification of time or resources in the source.
Craig Gleason
Director of Customer Engagements
Synfora
Craig,
Thank you for the thoughtful commentary.
I agree wholeheartedly with your notion that "we should go for the highest level of abstraction that provides good quality of results because that will deliver the lowest design and silicon cost". I assume by "good quality of results" you mean optimal QoR -- as silicon cost and power are huge considerations for many chip markets, especially for consumer chip designs.
However, I think there are two implicit assumptions in your statement that I'd question:
1. That there are only unique best-of-breed languages for each design type – that is, that languages providing both optimal Quality of Results and optimal productivity only do so for very specific, and different, design types. If there were a single language that delivered across the spectrum, wouldn’t that be preferable and offer benefits that a niche-specific language couldn’t?
2. That C/C++ provides consistently optimal QoR in algorithmic implementations. I’m hardly convinced of that. I’ve seen recent comparisons where production-level C/C++-based solutions were 11%, 42% and +100% larger than equivalent designs done with BSV. Perhaps we’ve gotten to the point where consumer electronics companies are okay with this type of added cost, but I doubt it. I have no doubt that there are designs where C/C++ can do quite well – but, I don’t believe that this is consistently true, especially with more sophisticated, larger algorithms (though I am sure there are exceptions).
One can argue with the level of abstraction of BSV, of course. In the 42% case above, the whole BSV implementation was done with less effort than it took for just the timing closure effort of the comparable design done with a leading C/C++ toolset.