Gary Smith EDA Consulting in Electronic Design
Gary Smith EDA (GSEDA) is the leading provider of market intelligence and advisory services for the global Electronic Design Automation (EDA), Electronic System Level (ESL) design, and related technology markets.
Optimize despite Amdahl’s Law
Jem Davies and I have been having a running conversation on Domain Specific, or Application Specific, microprocessors. There is probably a set definition for each term, but as they seem to be merging I’ll continue to call them Application Specific for now. Jem’s last blog (Embedded and Desktops – Similarities and Differences) reminded me of an Industry Note (our term for blog – it’s free) I was going to write last month. As expected, my notes were lost in one of the piles on my desk.
Amdahl’s Law and Parallel Computing
I was at Synopsys, eighteen months or so ago, in a product review. They were introducing a parallel version of HSPICE. I replied that the performance estimates they gave were probably unobtainable. Amdahl’s Law came up, and I pointed out that optimizing HSPICE would probably give them a maximum speedup of four times. To reach ten times or more, they would have to rewrite HSPICE from the algorithm down.
I’d love to take credit for that but, as with most of what I pass on, it actually came from someone else, in this case Gene Amdahl himself. Patrick Madden invited Gene to talk at ICCAD 2007, and after the talk I was fortunate enough to be on a panel with Gene on the topic of parallel computing. At the end of the panel someone from the audience asked Gene how he could get around Amdahl’s Law. Gene’s answer was: rewrite the algorithm.
So I was at Synopsys this September looking at the new version of HSPICE. Sure enough, they had completely rewritten it, and they were getting speedups of ten times or more in many applications.
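The arithmetic behind those figures is Amdahl’s Law: if a fraction p of the run time can be parallelized across n processors, the speedup is 1 / ((1 - p) + p / n), which can never exceed 1 / (1 - p). A minimal Python sketch follows. The 75% and 90% parallel fractions are illustrative assumptions chosen to line up with the four times and ten times figures; they are not measurements of HSPICE, and the function names are made up for this example.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's Law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def max_speedup(parallel_fraction):
    """Upper bound on speedup as the processor count grows without limit."""
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    # Hypothetical parallel fractions, not measured HSPICE data.
    for p in (0.75, 0.90):
        print(f"p={p:.2f}: 8 cores -> {amdahl_speedup(p, 8):.2f}x, "
              f"limit -> {max_speedup(p):.1f}x")
```

If only 75% of the run time is parallel, the ceiling is four times no matter how many cores are added. Getting to ten times needs roughly 90% of the work to be parallel, and shrinking the serial part that far generally means changing the algorithm rather than tuning the existing code.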
What’s Next?
Now that all of the parallel computing hype has calmed down, and the quick fixes have proven to be busts, we can concentrate on what EDA developers do best: come up with brilliant algorithms. No magic green button, just plain old hard work. At least it’s the work the EDA tool developers really enjoy. That then passes the challenge back to the hardware guys. We can no longer live with the sub-optimal microprocessors of the past.
That takes us to the work being done by Kurt Keutzer and David Sheffield at Berkeley. To give you a little background, the EDA vendors were approached by GPU suppliers promoting the GPU as the answer to EDA’s parallel processing problems. It didn’t quite work out the way they planned. Although some EDA applications showed amazing results, OPC being the prime example, most EDA applications didn’t accelerate at all. It turns out that most EDA applications, certainly many of the really important ones, are Graph based. That’s quite a different animal from the embarrassingly parallel algorithms the GPU is used to working with. On the other hand OPC, being “optical”, plays right to the strengths of the GPU.
Kurt and David have found that Graph algorithms and Backtrack, Branch and Bound are the most pervasive computational patterns in EDA. Now they are looking for a processor architecture that best fits. That, by the way, is why I call them applications processors; there is more than one computational area that needs to be optimized.
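One way to see why graph workloads resist the GPU style of parallelism is to compare a per-element computation with a graph traversal. In the hypothetical Python sketch below, `scale_pixels` touches each element independently, so the work can be split across any number of cores, while `bfs_levels` can only assign a node's level after one of its predecessors has been visited, and it follows whatever neighbors the data supplies. The function names and the tiny example netlist are invented for illustration and are not taken from the Berkeley work.

```python
from collections import deque


def scale_pixels(pixels, gain):
    """Embarrassingly parallel: each output depends on one input only."""
    return [p * gain for p in pixels]


def bfs_levels(graph, source):
    """Breadth-first search: a node's level depends on its predecessor,
    and the memory accesses follow the graph's irregular structure."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in level:
                level[neighbor] = level[node] + 1
                queue.append(neighbor)
    return level


if __name__ == "__main__":
    # Hypothetical four-node netlist stored as an adjacency list.
    netlist = {"in": ["g1", "g2"], "g1": ["out"], "g2": ["out"], "out": []}
    print(scale_pixels([1, 2, 3], 2))
    print(bfs_levels(netlist, "in"))
```

The first function maps cleanly onto thousands of identical threads. The second has data-dependent control flow and scattered memory access, which is the kind of pattern a processor tuned for EDA would need to handle well.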
Concurrent Memory
Actually the biggest bottleneck is in the memory area. We thought we might have an answer with Transactional Memory, but it proved to be either too slow or too power hungry when run at high speed. The best thing the semiconductor industry can do for us today is to come up with a workable solution for a Concurrent Memory.