GSRC Student Profile:

Ming Gao

mgao@ece.ucsb.edu
http://cadlab.ece.ucsb.edu/~mgao

University of California, Santa Barbara
Advisor: Tim Cheng

GSRC theme:  viability
Expected graduation:  Mar, 2012

Research Overview:  On Error Modeling and Error Checking for SoC Validation and Self-Testing

Post-silicon validation for electrical bugs is becoming both more expensive and more challenging as variability increases and noise margins shrink. High-quality validation resources are needed to address this problem. Currently, the development and selection of validation tests and design-for-debug (DfD) structures rely primarily on intuition and on the brute force of voluminous test stimuli, either generated randomly or derived from applications. Although these validation resources can be effective, error models for coverage measurement are needed to evaluate the sufficiency of test suites, to quantify the effectiveness of DfD alternatives, and to offset the potential biases of intuition.

Effective error models for post-silicon validation must meet specific criteria: they must be efficiently computable with functional tests, sufficiently aware of bug activation conditions, and able to account for the limited error observability in silicon. However, no prior error model meets all of these goals. Although the errors induced by electrical bugs ultimately manifest as excessive delay, delay faults are not suitable bug models for validation tests. An important reason is that checking error detection at system observability points requires functional fault simulation, which is unaffordable. The Random Bit-Flip (RBF) error model, on the other hand, has commonly been employed for observability evaluation, but it can hardly provide a meaningful coverage measure because it does not take electrical bug activation conditions into account.

We introduce COBE, a COnstrained Bit-flip Error model that combines the accuracy of bug activation conditions extracted from a low-level circuit model with the efficiency of error observability evaluation at a high level. We also propose a timing-simulation-based "ground truth" error modeling technique for model evaluation in lieu of silicon samples. Experimental results using an Alpha 21264 processor implementation and the OpenRISC SoC design demonstrate that the COBE model correlates with electrical bugs significantly more strongly than RBF models (correlation factors 0.921 vs. 0.482). The results also expose the shortcomings of the COBE model with only dynamic hazard constraints, highlighting the need and opportunities for improvement. We also propose a "MUX-glitch" error constraint that improves the accuracy of the COBE model by more than six times with negligible simulation overhead.
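
The contrast between unconstrained and constrained bit-flip injection can be illustrated with a toy sketch. This is not the COBE implementation: the function names, the list-of-bits state representation, and especially the `toggled` constraint are hypothetical stand-ins chosen for illustration. The actual activation conditions described above (dynamic hazard and MUX-glitch constraints) are extracted from a low-level circuit model and are not reproduced here.

```python
import random


def random_bit_flip(state, rng):
    """RBF-style injection: flip one uniformly chosen bit,
    regardless of what the circuit did in this cycle."""
    i = rng.randrange(len(state))
    flipped = list(state)
    flipped[i] ^= 1
    return flipped


def constrained_bit_flip(prev_state, state, rng, constraint):
    """Constrained injection: flip a bit only among positions whose
    activation constraint holds in this cycle."""
    candidates = [i for i in range(len(state))
                  if constraint(prev_state, state, i)]
    if not candidates:
        return list(state)  # no constraint satisfied, so no error is injected
    i = rng.choice(candidates)
    flipped = list(state)
    flipped[i] ^= 1
    return flipped


def toggled(prev_state, state, i):
    """Hypothetical placeholder constraint (an assumption, not from the
    source): bit i changed value between the previous and current cycle."""
    return prev_state[i] != state[i]


# Two consecutive register states in a toy 4-bit design.
prev_state = [0, 1, 1, 0]
state = [0, 0, 1, 1]

rng = random.Random(0)
rbf_state = random_bit_flip(state, rng)
constrained_state = constrained_bit_flip(prev_state, state, rng, toggled)
```

In this sketch the RBF injector may corrupt any of the four bits, while the constrained injector can only corrupt bits 1 and 3, the ones that switched in the current cycle. If no bit satisfies the constraint, no error is injected at all, which is the sense in which a constrained model ties coverage to bug activation conditions.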