Tuesday, May 5, 2020

Static Timing Analysis free essay sample

With designs growing more complex by the day and transistor geometries shrinking, almost every functional domain in an SoC design team is finding signoff harder, and Static Timing Analysis (STA) timing closure is no exception. Timing closure has always been a critical part of SoC design, and lower technology nodes have only compounded the challenges for STA teams. As the VLSI industry has moved to lower technology nodes, shrinking transistor sizes and interconnect lengths have changed the ratio of cell delay to interconnect delay. This leads to the requirement of signing off the SoC at multiple corners. With timing signoff at multiple Process, Voltage, Temperature (PVT) corners, silicon fabricated at submicron technology nodes shows an appreciable increase in yield in terms of meeting the timing specifications of the design. However, timing closure at multiple PVT corners is in itself a huge challenge for the physical design team. This article discusses these challenges and touches upon the methodologies available to overcome them. We will discuss in detail our solution for reducing the number of optimization corners in order to achieve efficient and coherent timing closure in minimum time.

But before this, let us briefly discuss the need for multiple PVT corners in timing signoff. Cell delays and interconnect delays are governed by the manufacturing Process (P), operating Voltage (V) and ambient Temperature (T) of the die. These factors determine the physical properties of cells and interconnects, such as the W/L ratio of cells and the Resistance (R) and Capacitance (C) values of interconnects. At the 180-nm technology node and above, timing signoff at the worst and best standard cell PVT corners with 2 RC extraction corners, namely Cmax Rmin (Cmax) and Cmin Rmax (Cmin), was sufficient.
On similar lines, at the 90-nm node 2 additional process corners, Best Hot (best process and voltage at max temperature) and Worst Cold (worst process and voltage at min temperature), were introduced for robust timing signoff, specifically for hold timing signoff, as hold is skew dependent. The RC corners for these 2 process corners were similarly Cmax at min temperature and Cmin at max temperature respectively. At 90 nm and above, a timing path is predominantly governed by cell delays. Below the 90-nm node, however, the contribution of interconnect delay to a timing path is significant, and the coupling capacitance (Cc) component of net delay can significantly alter the slack at the endpoint of a timing path.

With 4 cell corners and 4 RC extraction corners, we have 4 X 4 = 16 corners for a single timing mode/view. If a design has 8 STA modes, then in all we have 8 X 16 = 128 runs for the design. The first way to avoid such an exhausting analysis for a single mode is to look for a corner that forms a superset of the rest of the corners. However, a graphical distribution of slack values for a design block across all 16 corners shows that none of the 16 corners is a complete superset of the others, leaving us with no option but to sign off the design at 16 corners.

A silver lining amid the challenges listed above is that the situation is not that bad for setup timing analysis. Setup timing violations depend primarily on the delay of the timing path (cell delays and interconnect delays, combinational and sequential arcs). These delays differ significantly between cell PVT corners: worst corners have delays considerably greater than best corners. For setup timing, where the worst corners are a complete superset of the best corners, the choice is between the worst cold and worst hot standard cell corners to find the most critical corner for setup analysis.
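The corner and run counts quoted above can be reproduced with a short enumeration. This is only a sketch: the cell corner names follow the text, but the fourth RC corner name (`rc_other`) and the mode names (`mode0` to `mode7`) are placeholders, since the article does not name them.

```python
from itertools import product

# Cell (PVT) corners named in the text.
CELL_CORNERS = ["worst", "best", "best_hot", "worst_cold"]
# RC extraction corners; "rc_other" is a placeholder for the fourth
# corner, which the text does not name.
RC_CORNERS = ["cmax", "cmin", "xtalk", "rc_other"]
# Hypothetical STA mode names, used only to count runs.
MODES = [f"mode{i}" for i in range(8)]


def signoff_corners(cell_corners, rc_corners):
    """Every cell corner paired with every RC extraction corner."""
    return [f"{cell}_{rc}" for cell, rc in product(cell_corners, rc_corners)]


def sta_runs(modes, corners):
    """One STA run per (mode, corner) pair."""
    return list(product(modes, corners))


corners = signoff_corners(CELL_CORNERS, RC_CORNERS)
runs = sta_runs(MODES, corners)
print(len(corners), "corners per mode,", len(runs), "runs in total")
```

Dropping even one corner from the optimization list removes one run per mode, which is why the corner count matters more as the number of modes grows.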
Conventionally, the worst hot corner has larger delays, but at lower technology nodes worst cold can have larger delays, because the threshold voltage of the MOS transistor comes into the picture and the transistor gets slower at lower temperature (the temperature inversion phenomenon). When it comes to RC extraction corners, cmin is never more critical than the other 3 RC corners. So for multi mode multi corner optimization for setup, we can select the 2 worst cell corners and the cmax RC extraction corner (plus the xtalk corner if necessary) to cover most of the setup paths in the design.

But the situation is completely different for hold timing. As hold is skew driven, it is very difficult to judge which combination of cell process corner and RC extraction corner, out of the 16 combinations, will have most of the hold violations in the design. As the slack distribution plots for hold violations show, none of the 16 combinations is a superset of the others (4 plots have been shown here for convenience). The challenge is to find the optimum number of optimization corners, so that an appreciable number of violations are fixed without compromising the memory and runtime requirements of the timing and placement tools. This task becomes more daunting because extraction corners depend heavily on design layout. Even within the same design, different blocks are found to have different RC combinations that yield the maximum violations, and the same holds across different designs.

The graphs shown below represent the slack distribution of a design in two different RC corners while keeping the cell corner common. Each graph shows the slack at each endpoint for the corner combinations specified on the x and y axes. The frequency of blue dots both above and below the unity slope line indicates that some endpoints are more critical in the x-axis corner while an equally considerable number are more critical in the y-axis corner. Thus no RC corner is a superset of another RC corner.
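The unity-slope-line reading of those plots can be expressed as a simple per-endpoint comparison. The sketch below assumes slack is available as a mapping from endpoint name to slack value for each corner. The endpoint names and slack numbers are made up for illustration and are not taken from the design discussed here.

```python
def split_by_unity_line(slack_x, slack_y):
    """Count endpoints on each side of the y = x line.

    An endpoint is more critical in the corner where its slack is smaller.
    """
    common = slack_x.keys() & slack_y.keys()
    more_critical_in_x = sum(1 for ep in common if slack_x[ep] < slack_y[ep])
    more_critical_in_y = sum(1 for ep in common if slack_y[ep] < slack_x[ep])
    return more_critical_in_x, more_critical_in_y


def is_superset(slack_x, slack_y):
    """Corner x covers corner y if no endpoint is more critical in y."""
    _, more_critical_in_y = split_by_unity_line(slack_x, slack_y)
    return more_critical_in_y == 0


# Illustrative hold slacks (ns) for one cell corner and two RC corners.
slack_cmax = {"ff1/D": -0.020, "ff2/D": 0.010, "ff3/D": -0.005}
slack_cmin = {"ff1/D": -0.010, "ff2/D": -0.015, "ff3/D": 0.002}

print(split_by_unity_line(slack_cmax, slack_cmin))
print(is_superset(slack_cmax, slack_cmin), is_superset(slack_cmin, slack_cmax))
```

When both counts are non-zero, as in this made-up data, neither corner can stand in for the other, which is the situation the plots describe.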
So our focus here is to find a generic approach that helps us choose a few optimization corners out of all the signoff corners, such that by fixing timing violations in only these few corners with the APR tool, most of the timing violations are fixed in one go. Our methodology is to find the optimum number of corners for hold timing signoff and Multi Mode Multi Corner hold optimization. We took 2 design blocks and did a comprehensive hold analysis across all 16 corners individually. Selecting the most critical corners for optimization does not necessarily solve this issue; instead, we can look for the corners that have the maximum number of common violations with the other 15 corners. The magnitude of the violations can be taken care of by adding extra pessimism to the optimization runs through uncertainties.

1. For this we prepared a 16 X 16 matrix where an element m(i,j) of the matrix shows the number of common violations between the ith and jth corner combinations.

In the next step we considered one best process corner, among the 8 (highlighted in blue), having the most common violations with each of the 8 worst process corners. For example, best xtalk (in blue) has the maximum number of common violations with each of the 8 worst corners. Similarly, we considered one worst process corner, among the 8 (highlighted in purple), having the most common violations with each of the 8 best corners. As shown in the figure, worst cold xtalk (in purple) has the maximum number of common violations with each of the 8 best corners. Please note that this case may already be covered under Step 2 listed above, but in our case the violations in the worst process and best process corners did not correlate. In some designs one of the best corners can have the most common violations with the worst corners, and it can be marked with a different color code.
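The common-violation matrix can be sketched as follows, assuming each corner's hold violations are available as a set of endpoint names. The corner names echo those in the text, but the endpoint sets are invented, and ranking corners by the row sum of shared violations is a simplification of the color-coded selection described above, not the exact procedure.

```python
def common_violation_matrix(violations):
    """Return corner names and m[i][j] = violations shared by corners i and j."""
    names = sorted(violations)
    matrix = [[len(violations[a] & violations[b]) for b in names] for a in names]
    return names, matrix


def shared_totals(names, matrix):
    """Sum of violations each corner shares with every other corner."""
    size = len(names)
    return {
        names[i]: sum(matrix[i][j] for j in range(size) if j != i)
        for i in range(size)
    }


# Illustrative violating endpoints per corner (4 corners instead of 16).
violations = {
    "best_xtalk": {"ep1", "ep2", "ep3", "ep4"},
    "best_cmin": {"ep2", "ep5"},
    "worst_cold_xtalk": {"ep1", "ep3", "ep6"},
    "worst_cmax": {"ep3", "ep4"},
}

names, matrix = common_violation_matrix(violations)
totals = shared_totals(names, matrix)
best_candidate = max(totals, key=totals.get)
print(names)
for row in matrix:
    print(row)
print(best_candidate, totals)
```

The diagonal of the matrix holds each corner's own violation count, and the matrix is symmetric, so only one triangle needs to be inspected.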
Now, for each row/column, the corner with the maximum number of red, green and (blue/purple) elements is our best choice for hold optimization. In our case, this gave us the hold optimization corners "best xtalk" and "worst cold xtalk". We then fixed the hold violations in these two corners. Again a 16 X 16 matrix was made with the same rules as the first.

Corners fixed: best xtalk and worst cold xtalk

Again step 2 was followed, and this time the corner with the maximum number of common violations was found to be best cmin. The first 2 sets of fixes plus a third set of fixes on best cmin were sourced across all corners and gave us extremely positive results.

Corners fixed: best cmin, worst cold xtalk, best xtalk

Observation: The matrix formed after this third level of hold fixing showed that, on average, more than 98% of each of the 16 corners' original violations were fixed. The only violations remaining were the uncommon or mutually exclusive ones. We were able to narrow down from 16 corners to 3 corners for MMMC hold optimization, thereby reducing tool runtime and memory requirements while also reducing the number of hold violations to a large extent. The exercise can be repeated further to improve the percentage of fixed hold violations. The same methodology can also be extended across multiple STA modes, to find the mode and corner combinations having the most common violations among multiple modes and multiple corners.
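The fix-and-recheck loop described above can be approximated with a small greedy selection. This sketch rests on a strong assumption that the article does not state: fixing an endpoint in one corner is taken to fix it in every corner, whereas in practice the uncertainty margin is what makes a fix carry over. The endpoint data is invented, and `select_optimization_corners` and `fixed_percentage` are hypothetical helper names.

```python
def select_optimization_corners(violations, rounds):
    """Greedily pick corners sharing the most remaining violations with others."""
    remaining = {corner: set(eps) for corner, eps in violations.items()}
    chosen, fixed = [], set()
    for _ in range(rounds):
        candidates = [c for c in remaining if remaining[c]]
        if not candidates:
            break

        def shared(corner):
            return sum(
                len(remaining[corner] & remaining[other])
                for other in remaining
                if other != corner
            )

        best = max(candidates, key=shared)
        chosen.append(best)
        # Assumption: a fix made in the chosen corner holds in all corners.
        fixed |= remaining[best]
        remaining = {c: eps - fixed for c, eps in remaining.items()}
    return chosen, fixed


def fixed_percentage(original, fixed):
    """Percentage of each corner's original violations that are now fixed."""
    return {c: 100.0 * len(eps & fixed) / len(eps) for c, eps in original.items() if eps}


# Illustrative violating endpoints; "e8" exists in only one corner.
violations = {
    "best_xtalk": {"e1", "e2", "e3", "e4"},
    "best_cmin": {"e2", "e5", "e6"},
    "worst_cold_xtalk": {"e1", "e3", "e5", "e6", "e7"},
    "worst_cmax": {"e3", "e4", "e5", "e8"},
}

chosen, fixed = select_optimization_corners(violations, rounds=2)
coverage = fixed_percentage(violations, fixed)
print(chosen)
print(coverage)
```

In this made-up data, two optimization corners clear everything except the one violation that is unique to `worst_cmax`, mirroring the observation that what remains after fixing are the mutually exclusive violations.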
