The research in the Resilient Systems theme is centered on providing reliable computation on
unreliable silicon platforms of future technology nodes: in the upcoming transition from the
late- to post-silicon era, a host of diverse threats endanger the availability and survivability
of silicon-based platforms.
The Resilient Systems theme will concentrate on post-deployment (lifetime) resiliency and will
address availability and survivability issues with new thinking informed by its past experience.

The research will follow several guiding principles:
- Resiliency to a high number of faults with graceful degradation.
For transient failures, this corresponds to a low error rate (<0.01%).
In the presence of multiple failures, performance should degrade gracefully with the incidence of faults,
i.e., solutions should provide a fluid trade-off between a platform's health and its performance.
- Near-zero-cost resilience through hardware/software techniques, i.e., solutions that
operate across the hardware/software boundary to improve the quality or reduce the cost of resiliency.
- Runtime verification solutions for chip multi-processors. The research team plans
to investigate new practical and low-cost solutions targeting highly concurrent platforms where the
memory/communication subsystem adds complexity.
- Tailored resiliency, that is, a focus on achieving improved resiliency at lower cost through
tailored solutions that leverage the flexibility of specific architectures and/or applications.
- Resiliency for ultra-low power.

GSRC has traditionally been an incubator for low-cost, defect-tolerant solutions, protecting systems
from permanent and transient failures and extending their expected lifetime. The new generation
of the Resilient Systems theme will focus on steadily reducing the costs of reliability through
novel cross-layer mechanisms of robustness and adaptivity, and will be an important source
of progress with respect to these long-standing cross-cutting roadmap challenges. Six tasks provide
research coverage on all main platform segments. For infrastructure platforms, where component
replacement may not be possible (or immediate), especially in battlefield defense applications,
availability is of the utmost importance: these platforms demand graceful degradation under failures
and high fault resiliency, and they benefit from runtime validation solutions. Mobile platforms have
tighter power and cost constraints, making them better suited to integrated hardware/software
techniques and tailored resiliency. Finally, sensor nodes are naturally deployed in large numbers,
so they can inherently sustain some failures.