Department of Computer Science University of Colorado Boulder

Colloquium - Yuan

DLC 1B70

Diagnosing Production Failures with Better Logging Support
University of Illinois, Urbana-Champaign

Software systems often fail in production environments. Because these failures directly affect customers, large system vendors typically invest significant resources in diagnosing them. Unfortunately, diagnosing production failures is notoriously difficult. Constrained by both privacy concerns and cost, software vendors often cannot reproduce such failures. Support engineers and developers therefore continue to rely on the logs printed by the run-time system to diagnose them. However, because of their ad hoc nature, today's system logs are frequently insufficient for effective failure diagnosis.

In this talk, I will describe our work on improving software logging for better production failure diagnosis. One approach, LogEnhancer, uses a novel combination of program analysis and system techniques to collect additional information for each existing log message. Another approach, LogError, tackles the problem of "silent failures": failures that occur without any log message being printed. We applied LogEnhancer and LogError to a broad range of real software systems and found that improved logging can significantly improve postmortem failure diagnosis. The insights we gained could also help programmers design their software for better failure diagnosability.

Ding Yuan is a PhD candidate at the University of Illinois at Urbana-Champaign. He is also a visiting student at the University of California, San Diego. His research interests span systems, software engineering, and programming languages, with a focus on practical approaches to failure diagnosis. He has received two ASPLOS best paper nominations, an ACM SIGSOFT Distinguished Paper award, an Outstanding Teaching Assistant award, and a Saburo Muroga Fellowship. His research on failure diagnosis has been requested for release by large vendors, including Cisco, EMC, Huawei, NetApp, and Qualcomm.

Hosted by Li Shang.

The Department holds colloquia throughout the Fall and Spring semesters. These colloquia, open to the public, are typically held on Thursday afternoons, but sometimes occur at other times as well. If you would like to receive email notification of upcoming colloquia, subscribe to our Colloquia Mailing List. If you would like to schedule a colloquium, see Colloquium Scheduling.

Sign language interpreters are available upon request. Please contact Stephanie Morris at least five days prior to the colloquium.

Department of Computer Science
College of Engineering and Applied Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA

Engineering Center Office Tower
ECOT 717
FAX +1-303-492-2844
©2012 Regents of the University of Colorado
May 5, 2012 (13:29)