GSRC posters authored by Bryan Catanzaro



   Damascene: Highly Parallel Image Contour Detection
Pub ID:  2024 Authors:  Bryan Catanzaro, Narayanan Sundaram, Bor‑Yiing Su, Yunsup Lee, Mark Murphy, Kurt Keutzer
Image contour detection is fundamental to many image analysis applications, including image segmentation, object recognition and classification. However, highly accurate image contour detection algorithms are also very computationally intensive, which limits their applicability, even for offline batch processing. In this work, we examine efficient parallel algorithms for performing image contour detection, with particular attention paid to local image analysis as well as the generalized eigensolver used in Normalized Cuts. Combining these algorithms into a contour detector, along with careful implementation on highly parallel, commodity processors from Nvidia, our contour detector provides uncompromised contour accuracy, with an F-metric of 0.70 on the Berkeley Segmentation Dataset. Runtime is reduced from 4 minutes to 1.8 seconds. The efficiency gains we realize enable high-quality image contour detection on much larger images than previously practical, and the algorithms we propose are applicable to several image segmentation approaches. Efficient, scalable, yet highly accurate image contour detection will facilitate increased performance in many computer vision applications.
Sep 3, 2009,   GSRC Annual Symposium 2009

   Damascene: Highly Parallel Image Contour Detection
Pub ID:  1510 Authors:  Bryan Catanzaro, Narayanan Sundaram, Bor‑Yiing Su, Yunsup Lee, Mark Murphy, Kurt Keutzer
Image contour detection is fundamental to many image analysis applications, including image segmentation, object recognition and classification. However, highly accurate image contour detection algorithms are also very computationally intensive, which limits their applicability, even for offline batch processing. In this work, we examine efficient parallel algorithms for performing image contour detection, with particular attention paid to local image analysis as well as the generalized eigensolver used in Normalized Cuts. Combining these algorithms into a contour detector, along with careful implementation on highly parallel, commodity processors from Nvidia, our contour detector provides uncompromised contour accuracy, with an F-metric of 0.70 on the Berkeley Segmentation Dataset. Runtime is reduced from more than 4 minutes to 2 seconds. The efficiency gains we realize enable high-quality image contour detection on much larger images than previously practical, and the algorithms we propose are applicable to several image segmentation approaches. Efficient, scalable, yet highly accurate image contour detection will facilitate increased performance in many computer vision applications.
Mar 9, 2009,   GSRC Workshop, Dallas TX

   Fast SVM Training and Classification on a GPU
Pub ID:  1399 Authors:  Bryan Catanzaro, Narayanan Sundaram, Kurt Keutzer
We are motivating our research into frameworks for parallel processing by investigating concrete computations from the area of Computer Vision and Machine Learning. This poster examines implementation and algorithmic issues encountered with Support Vector Machine Training (a Quadratic Programming optimization problem) and Support Vector Machine Classification on highly parallel Graphics Processors. An adaptive variable selection heuristic is proposed for the Sequential Minimal Optimization algorithm used for SVM Training. Results are presented showing a 9-35x speedup on SVM training, and a 5-24x speedup on SVM classification, compared to CPU routines performing the same computation. Accuracy is identical between the GPU and CPU implementations. We conclude that highly parallel processors can perform well on SVM training and classification.
Sep 29, 2008,   GSRC Annual Symposium 2008

   Efficient Parallelization of H.264 Decoding with Macro Block Level Scheduling
Pub ID:  369 Authors:  Jike Chong, Nadathur Rajagopalan Satish, Bryan Catanzaro, Kaushik Ravindran, Kurt Keutzer
The H.264 decoder has a sequential, control-intensive front end that makes it difficult to leverage the potential performance of emerging manycore processors. Preparsing is a functional parallelization technique to resolve this front-end bottleneck. However, the resulting parallel macro block (MB) rendering tasks have highly input-dependent execution times and precedence constraints, which make them difficult to schedule efficiently on manycore processors. To address these issues, we propose a two-step approach: (i) a custom preparsing technique to resolve control dependencies in the input stream and expose MB-level data parallelism, and (ii) an MB-level scheduling technique to allocate and load-balance MB rendering tasks. The run-time MB-level scheduling increases the efficiency of parallel execution in the rest of the H.264 decoder, providing a 60% speedup over greedy dynamic scheduling and a 9-15% speedup over static compile-time scheduling for more than four processors. The preparsing technique coupled with run-time MB-level scheduling enables a potential 7x speedup for H.264 decoding.
Sep 20, 2007,   GSRC Annual Symposium 2007