Listed here are recent talks given at GSRC venues.


GSRC talks




   Architecture and Synthesis Support for Accelerator-Rich CMPs
Pub ID:  2762 Author:  Jason Cong
Outline: motivation (why accelerator-rich CMPs (AXR-CMP)?); accelerator management and virtualization; AXR-CMP implementation alternatives; accelerator synthesis and generation; accelerator selection; concluding remarks and future work.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Alternative Research Theme Overview
Pub ID:  2742 Author:  Naresh Shanbhag
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Stochastic Communications
Pub ID:  2764 Authors:  Andrew Singer, Naresh Shanbhag
In this talk we look at non-traditional design of ADCs in an analog front end (AFE) for a communication link. We consider the goal of preserving information rather than the precise waveform (i.e., we use BER as a guide, not SNDR). This leads to non-traditional ADCs in which quantization is nonuniform in time or in amplitude, which provides a BER gain or power savings. We are able to maintain these benefits in the presence of non-ideal circuit models.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Application Drivers Theme Overview
Pub ID:  2743 Author:  Todd Austin
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   RF Real-Time Adaptation for Error Resilience, Low Power and Performance
Pub ID:  2765 Author:  Abhijit Chatterjee
CMOS technology scaling, along with the resulting large variability of circuit performance metrics in the presence of manufacturing process variations, has made post-silicon circuit built-in test and adaptation/tuning almost a necessity for deeply scaled (DSM) technologies. Currently, circuits are designed to tolerate worst-case process corners. In addition, circuits must be designed for worst-case operating conditions as well (e.g., environmental noise). This forces designers to guard-band their products excessively, and increasingly so as technology scales down to the 45 nm node and beyond, resulting in unacceptable power-performance-yield tradeoffs. One way to tackle this problem is to design circuits that are "self-aware" and can adapt to environmental operating conditions and process variations to conserve power while maximizing yield and reliability. Such self-awareness involves incorporation of built-in test, diagnosis and tuning/adaptation mechanisms into the circuits and systems concerned. A key issue is that of test, diagnosis and tuning of complex circuit and system-level parameters that must be evaluated and traded off against one another during the adaptation process without access to complex external test instrumentation. This talk summarizes recent results obtained in the design of self-aware/adaptive wireless communications systems and points to directions for future work in this area.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Platform Architectures Theme Overview
Pub ID:  2744 Authors:  Margaret Martonosi, Luca Carloni
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Stochastic Computing: Principles and Practice
Pub ID:  2766 Author:  Naresh Shanbhag
This presentation describes various principles and practices involved in stochastic computing.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Platform Viability Theme Overview
Pub ID:  2745 Author:  Kwang‑Ting Cheng
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Programming Concurrent Systems Theme Overview
Pub ID:  2746 Author:  Kurt Keutzer
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Resilient Systems Theme Overview
Pub ID:  2747 Author:  Valeria Bertacco
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Durability and Availability in RAMCloud
Pub ID:  2749 Author:  John Ousterhout
RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk; this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review
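
To put the RAMCloud recovery figures above in perspective, the short calculation below derives the implied recovery throughput. The 35 GB, 1.6 s, and 60-node values come from the abstract; the even split of work across nodes is an assumption made only for this illustration, and the function names are hypothetical.

```python
# Figures quoted in the abstract: 35 GB recovered in 1.6 s on a 60-node cluster.
RECOVERED_GB = 35.0
RECOVERY_SECONDS = 1.6
NODES = 60


def aggregate_throughput(gb, seconds):
    """Cluster-wide recovery rate in GB/s."""
    return gb / seconds


def per_node_throughput(gb, seconds, nodes):
    """Per-node rate in GB/s, ASSUMING work is spread evenly across nodes."""
    return aggregate_throughput(gb, seconds) / nodes


agg = aggregate_throughput(RECOVERED_GB, RECOVERY_SECONDS)
per_node = per_node_throughput(RECOVERED_GB, RECOVERY_SECONDS, NODES)
print(f"aggregate: {agg:.1f} GB/s, per node: {per_node:.2f} GB/s")
```

Under that even-split assumption, each node handles roughly a third of a gigabyte per second, which is why spreading the work over many servers matters more than the speed of any single one.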

   SAT-based Post-silicon Fault Localization
Pub ID:  2750 Authors:  Sharad Malik, Shucheng Zhu, Georg Weissenbacher
The localisation of faults in integrated circuits is a challenging problem and a dominating factor in the overall verification effort. Electrical bugs, in particular, surface only in the fabricated prototypes, leading to behaviour deviating from the golden model. Limited observability complicates their localisation: logging mechanisms such as trace buffers allow us to retain only a limited execution history. A symbolic analysis of the RTL design can find discrepancies between the values recorded in the trace buffer and the intended behaviour. Contemporary MAX-SAT solvers are then able to identify a maximal subset of the RTL design that is consistent with the observed behaviour. The elements in the complement of this subset represent potential locations of the fault. The scalability of contemporary decision procedures dictates the size of a window of execution cycles which we can analyse using symbolic techniques. Current MAX-SAT-based fault localisation techniques require this window to span the fault as well as the error it causes. To address the scalability issues resulting from large window sizes, we propose to slide a smaller window along the temporal axis, constraining it with the information recorded in the trace buffer for the respective execution cycles. In this scenario, the localisation attempt may fail: the limited information provided by the trace buffer may be insufficient to pin down the exact temporal and spatial location of the fault. We propose to use backbones to identify information that can be propagated across sliding windows. The backbone of a symbolic representation of a circuit is the set of signals that are immutable under the given constraints (e.g., the output and trace buffer values). This additional information has several benefits: firstly, it may be instrumental in locating the fault; secondly, it may enable a reduction of the size of the trace buffers and the sliding window.
Our preliminary experimental results demonstrate that the use of backbones allows us to reduce the size of the sliding windows or the trace buffer.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review
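
The MAX-SAT idea in the fault localization abstract above can be illustrated on a toy example. The sketch below is not the authors' tool: it replaces a real MAX-SAT solver with brute-force enumeration, uses a hypothetical three-gate circuit, and looks for maximum-cardinality consistent subsets. All names (`GATES`, `consistent`, `suspects`) are invented for this illustration.

```python
from itertools import combinations, product

# Hypothetical three-gate circuit: x = a AND b, y = x OR c, z = NOT y.
# Each entry checks whether a gate behaves correctly under assignment v.
GATES = {
    "g1": lambda v: v["x"] == (v["a"] & v["b"]),
    "g2": lambda v: v["y"] == (v["x"] | v["c"]),
    "g3": lambda v: v["z"] == 1 - v["y"],
}
SIGNALS = ["a", "b", "c", "x", "y", "z"]


def consistent(healthy, observed):
    """True if some assignment satisfies every healthy gate and the observations."""
    free = [s for s in SIGNALS if s not in observed]
    for bits in product((0, 1), repeat=len(free)):
        v = dict(observed, **dict(zip(free, bits)))
        if all(GATES[g](v) for g in healthy):
            return True
    return False


def suspects(observed):
    """Complements of the largest gate subsets consistent with the observations."""
    names = sorted(GATES)
    for size in range(len(names), -1, -1):
        found = [
            sorted(set(names) - set(healthy))
            for healthy in combinations(names, size)
            if consistent(healthy, observed)
        ]
        if found:
            return sorted(found)
    return []


# Inputs and the (wrong) output only: every gate remains a candidate.
print(suspects({"a": 1, "b": 1, "c": 0, "z": 1}))
# Adding a traced internal value y = 1 narrows the candidates to g3.
print(suspects({"a": 1, "b": 1, "c": 0, "z": 1, "y": 1}))
```

The two calls show why extra recorded values help: with only the inputs and the erroneous output, each of the three gates is a possible fault location, while one additional traced signal isolates a single gate.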

   Relyzer: Application Resiliency Analyzer for Transient Faults
Pub ID:  2751 Authors:  Siva Kumar Sastry Hari, Sarita Adve, Helia Naeimi, Pradeep Ramachandran
Future microprocessors need low-cost solutions for reliable operation in the presence of failure-prone devices. A promising approach is to detect hardware faults by deploying low-cost monitors of software-level symptoms of such faults. Recently, researchers have shown these mechanisms work well, but there remains a non-negligible risk that several faults remain undetected and result in silent data corruptions (SDCs). Further, most prior evaluations of symptom-based detectors are based on fault injection campaigns for application benchmarks, where each run simulates the impact of a fault injected at a hardware site at a certain point in the application’s execution. Since the total number of such faults is prohibitive (trillions), it is not feasible to study all possible faults. Previous work therefore typically studies a randomly selected sample of faults. However, these sampling methods, especially for application sites, have not been validated. These mechanisms also do not provide feedback on the portions of the application that remain vulnerable to SDCs so they could be protected through other means if needed. This talk presents Relyzer, an approach that systematically analyzes all application fault sites and carefully picks a small subset to perform selective fault injections. Relyzer employs novel fault pruning techniques that prune faults by either predicting their outcomes or showing them equivalent to others.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Bug Positioning System
Pub ID:  2752 Authors:  Valeria Bertacco, Andrew DeOrio, Daya Shanker Khudia
Presentation on a novel post-silicon bug diagnosis technique to detect bugs that manifest inconsistently over multiple executions.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Enabling Advanced Inference in Small-scale Sensors: resilient devices for analyzing physiological signals
Pub ID:  2753 Author:  Naveen Verma
Presentation of work in Application Drivers Theme, Task 5.1.2.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Content Aware Channel Adaptive Low Power MIMO System for Video Transmission
Pub ID:  2724 Authors:  Debashis Banerjee, Joshua Wells, Abhijit Chatterjee
With increasing demand for reliable, fast communication over adverse channel conditions, new technologies have come to the fore to ensure robust error-free data transmission and reception. A key concept that has had a large footprint in the area of reliable wireless communication is that of multiple-input-multiple-output (MIMO) wireless systems. With the help of adaptive RF circuits and systems, we can trade off performance for power in a MIMO Virtually Zero Margin Adaptive RF Front-end (MIMO-VIZOR) receiver when applicable. Further, intelligent encoding algorithms could be used for video transmission to save power at the baseband.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Bio-Inspired Sensory Signal Processing
Pub ID:  2754 Authors:  Ping‑Chen Huang, Jan Rabaey
Signal processing tasks such as classification or recognition may benefit from implementation strategies inspired by biological sensory pathways. In this project, we explore the biological system from both the top-down and bottom-up perspectives. From the top down, we explore the functional models in computational neuroscience and seek architectures that allow the use of low-power and low-precision computational units. From the bottom up, we investigate the strategies that the sensory systems have used to seamlessly interact with the analog inputs and perform the computation in an asynchronous way. In this talk, an odor recognition architecture is proposed based on the olfactory sensory pathway. This design seamlessly interacts with the analog sensor responses and performs distributed computation with an overcomplete number of low-power, low-precision analog components, leading to energy efficiency and resiliency.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Hierarchical Mixed-Signal/RF System Level Test & Validation
Pub ID:  2725 Authors:  Shyam Kumar Devarakond, Vishwanath Natarajan, Debashis Banerjee, Aritra Banerjee, Hyun Choi, Shreyas Sen
In this work, a new hierarchical signature-driven testing/validation approach for RF systems has been developed. The proposed method determines module-level performances from the system-level response (signature) to an applied RF diagnostic test using top-down model diagnosis. A comprehensive set of specifications of multiple RF module chains is computed simultaneously from the observed DUT response using a single data acquisition. A key contribution of this work is the use of test generation algorithms to determine the optimized test stimulus from which all the DUT specifications, including the system-level EVM metric, are computed. The proposed concept is applied to MIMO and SISO OFDM WLAN RF systems. Experimental results for a 2.4 GHz commercial WLAN transceiver product validate the proposed concept.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Resilient Coherence in Many-Core CMPs
Pub ID:  2756 Authors:  Konstantinos Aisopos, Valeria Bertacco, Li‑Shiuan Peh, Andrew DeOrio
The talk introduces two novel techniques to address the recovery of data within correct protocol specifications in interconnect networks for CMPs facing transistor failures.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   PROMOTE: PROcess MOnitoring and TEsting of Analog/RF circuits
Pub ID:  2728 Authors:  Shyam Kumar Devarakond, Shreyas Sen, Abhijit Chatterjee, Soumendu Bhattacharya
In this paper, a novel process-specification (cause-effect) monitoring approach is presented that allows the effects of process variations and DUT specification variations for analog/RF systems to be monitored on a per-chip basis. As opposed to existing techniques that rely only on electrical test data gathered across lots of wafers, a greater degree of process control monitoring can be achieved through the proposed technique. The method relies on the use of alternate diagnostic tests under which the DUT response (alternate diagnostic signature) exhibits strong simultaneous correlation with its specifications as well as critical SPICE-level device parameters. This allows both to be predicted accurately from the DUT response with virtually zero extra test-time or test-hardware cost. A key consequence is the ability to perform cause-effect analysis, relating specification perturbations to device-level anomalies on a per-chip basis to provide essential diagnostics.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Network-Driven Chips
Pub ID:  2757 Authors:  Li‑Shiuan Peh, Tushar Krishna
Network-Driven Chips: Towards the ideal NoC
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Coherent 3D Scene Understanding from Images
Pub ID:  2740 Authors:  Sid Ying‑Ze Bao, Jason Clemons, Mohit Bagra, Todd Austin, Silvio Savarese
A one-slide overview of the layout estimation part of the visual sonification system.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Parallel Assertions for Debugging Parallel Programs
Pub ID:  2758 Author:  Daniel Schwartz‑Narbonne
A parallel program must execute correctly even in the presence of unpredictable thread interleavings. This interleaving makes it hard to write correct parallel programs, and also makes it hard to find bugs in incorrect parallel programs. A range of tools have been developed to help debug parallel programs, ranging from atomicity-violation and data-race detectors to model-checkers and theorem provers. One technique that has been successful for debugging sequential programs, but less effective for parallel programs, is running the program using assertion predicates provided by the developer. These assertions allow programmers to specify and check their assumptions. In a multi-threaded program, the programmer's assumptions include both the current state, and any actions (e.g. access to shared memory) that other, parallel executing threads might take. We introduce parallel assertions which allow programmers to express these assumptions for parallel programs using simple and intuitive syntax and semantics. We present a proof-of-concept implementation, and demonstrate its value by testing a number of benchmark programs using parallel assertions.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   GSRC State of the Center
Pub ID:  2741 Author:  Sharad Malik
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Running 1000 Threads on a General-Purpose Multi-Core
Pub ID:  2760 Authors:  Daniel Sanchez, Christos Kozyrakis
Scaling chip-multiprocessors (CMPs) to support thousands of threads requires significant innovation across the software-hardware stack. We present a set of software and hardware contributions that tackle these important scalability challenges. First, we enable scalable software by designing runtimes that support rich abstractions for parallelism, heterogeneity, and locality, and perform scheduling dynamically and at fine granularity to avoid load imbalance. Moreover, we introduce flexible hardware support to accelerate fine-grain scheduling, ensuring low scheduler overheads at high thread and core counts. Second, we present a set of techniques that enable scalable coherent cache hierarchies that are highly efficient, provide QoS and are configurable by software. We first design a novel cache array that implements high associativity cheaply and provides analytical guarantees on associativity, and use it to implement a scalable cache partitioning technique (so that thousands of threads can share the cache in a controlled manner, providing QoS guarantees) and scalable cache coherence.
Nov 16, 2011,   GSRC/MuSyC Annual Joint Review

   Characterizing and Improving Last-level Cache Management using Signature-based and Prefetch-aware Approaches
Pub ID:  2717 Author:  Carole‑Jean Wu
Hardware prefetching and last-level caching are two independent mechanisms to mitigate the growing latency to memory. Prefetching improves performance by fetching useful data in advance, but introduces performance variability for applications under different cache management policies. In this talk, I will present a Prefetch-Aware Cache Management (PACMan) proposal for providing better and more predictable performance under the influence of prefetching by modifying the cache insertion and hit promotion policies to treat demand and prefetch requests differently. Then, I will present a novel Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines. SHiP correlates cache references with unique signatures (memory region, program counter, and instruction history sequence) and uses these signatures to better predict the re-reference intervals of cache references. While using less hardware, SHiP doubles the performance gains of the prior art. PACMan and SHiP will be presented at MICRO 2011.
Nov 1, 2011,   GSRC e-seminar: Characterizing and Improving Last-level Cache Management using Signature-based and Prefetch-aware Approaches
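
The signature-based prediction described in the abstract above can be pictured as a small table of saturating counters indexed by a signature. The sketch below is an illustration under assumptions, not the design presented in the talk: the table size, counter width, initial value, and the class and method names are all hypothetical, and the signature is simply taken to be a program counter value.

```python
class SignatureHitPredictor:
    """Saturating counters indexed by a signature (here, a program counter)."""

    def __init__(self, entries=1024, max_count=7):
        # Assumed parameters: 1024 entries, counters saturating at 7.
        self.entries = entries
        self.max_count = max_count
        # Assumption: start each counter weakly in favour of reuse.
        self.counters = [1] * entries

    def _index(self, signature):
        return signature % self.entries

    def on_hit(self, signature):
        """A line inserted under this signature was re-referenced."""
        i = self._index(signature)
        self.counters[i] = min(self.max_count, self.counters[i] + 1)

    def on_evict_unused(self, signature):
        """A line inserted under this signature was evicted without reuse."""
        i = self._index(signature)
        self.counters[i] = max(0, self.counters[i] - 1)

    def predict_reuse(self, signature):
        """Predict whether a newly inserted line is likely to be re-referenced."""
        return self.counters[self._index(signature)] > 0


predictor = SignatureHitPredictor()
predictor.on_evict_unused(0x400)  # a streaming access that was never reused
predictor.on_hit(0x500)           # an access whose line was hit again
print(predictor.predict_reuse(0x400), predictor.predict_reuse(0x500))
```

A cache controller could consult `predict_reuse` at insertion time to decide how long to retain a new line; that policy hook is left out here because the abstract does not specify it.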

   Debugging Parallel Programs for the Masses
Pub ID:  2639 Authors:  Daniel Schwartz‑Narbonne, Feng Liu, Tarun Pondicherry, David August, Sharad Malik
A parallel program must execute correctly even in the presence of unpredictable thread interleavings. This interleaving makes it hard to write correct parallel programs, and also makes it hard to find bugs in incorrect parallel programs. A range of tools have been developed to help debug parallel programs, ranging from atomicity-violation and data-race detectors to model-checkers and theorem provers. One technique that has been successful for debugging sequential programs, but less effective for parallel programs, is running the program using assertion predicates provided by the developer. These assertions allow programmers to specify and check their assumptions. In a multi-threaded program, the programmer's assumptions include both the current state, and any actions (e.g. access to shared memory) that other, parallel executing threads might take. We introduce parallel assertions which allow programmers to express these assumptions for parallel programs using simple and intuitive syntax and semantics. We present a proof-of-concept implementation, and demonstrate its value by testing a number of benchmark programs using parallel assertions.
Sep 13, 2011,   GSRC e-seminar: Debugging Parallel Programs for the Masses