Haryadi S. Gunawi
Professor
JCL 347
CV: [pdf]
Bio: [html]
[ My research group is unfortunately not recruiting students at the moment ]
About UChicago CS
Research
Publications
DBLP | Full list w/ author names + slides
- Heimdall: Optimizing Storage I/O Admission with Extensive Machine Learning Pipeline [EuroSys '25]
- GPEmu: A GPU Emulator for Faster and Cheaper Prototyping and Evaluation of Deep Learning System Research [VLDB '25]
- EVStore: Storage and Caching Capabilities for Scaling Embedding Tables in Deep Recommendation Systems [ASPLOS '23]
- Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers [Supercomputing '23]
- Extending and Programming the NVMe I/O Determinism Interface for Flash Arrays [TOS '23]
- Towards Continually Learning Application Performance Models [MLFS '23]
- Layered Contention Mitigation for Cloud Storage [CLOUD '22]
- Fantastic SSD Internals and How to Learn and Use Them [SYSTOR '22] Best paper award
- IODA: A Host/Device Co-Design for Strong Predictability Contract on Modern Flash Storage [SOSP '21]
- Experiences in Managing the Performance and Reliability of a Large-Scale Genomics Cloud Platform [ATC '21]
- Fractional-Overlap Declustered Parity: Evaluating Reliability for Storage Systems [PDSW '20]
- LinnOS: Predictability on Unpredictable Flash Storage (with a Light Neural Network) [OSDI '20]
- Extreme Protection Against Data Loss with Single-Overlap Declustered Parity [DSN '20]
- Lessons Learned from the Chameleon Testbed [ATC '20]
- LeapIO: Efficient and Portable Virtual NVMe Storage on ARM SoCs [ASPLOS '20]
- FlyMC: Highly Scalable Testing for Complex Interleavings in Cloud Systems [EuroSys '19]
- ScaleCheck: A Single-Machine Approach for Discovering Scalability Bugs in Large Systems [FAST '19]
- E2E: Embracing User Heterogeneity to Improve Quality of Experience on the Web [SIGCOMM '19]
- IASO: A Fail-Slow Detection and Mitigation Framework for Distributed Storage Services [ATC '19]
- Fail-Slow at Scale: Evidence of Hardware Performance Faults in Large Production Systems [FAST '18] Best paper nominee
- The CASE of FEMU: Cheap, Accurate, Scalable and Extensible Flash Emulator [FAST '18]
- StrongBox: Stream Ciphers for Full Drive Encryption [ASPLOS '18]
- PCatch: Automatically Detecting Performance Cascading Bugs in Cloud Systems [EuroSys '18]
- MittOS: Supporting Millisecond Tail Tolerance with Fast Rejecting SLO-Aware OS Interface [SOSP '17]
- PBSE: A Robust Path-Based Speculative Execution for Degraded-Network Tail Tolerance [SoCC '17]
- Scalability Bugs: When 100-Node Testing is Not Enough [HotOS '17]
- Tiny-Tail Flash: Near-Perfect Elimination of GC Tail Latencies in NAND SSDs [TOS '17] Fast-tracked
- Tiny-Tail Flash: Near-Perfect Elimination of GC Tail Latencies in NAND SSDs [FAST '17] Best paper nominee
- DCatch: Automatically Detecting Distributed Concurrency Bugs in Cloud Systems [ASPLOS '17]
- Why Does the Cloud Stop Computing? Lessons from Hundreds of Outages [SoCC '16]
- The Tail at Store: A Revelation from Millions of Hours of Disk+SSD Deployments [FAST '16]
- TaxDC: A Taxonomy of Concurrency Bugs in Datacenter Distributed Systems [ASPLOS '16]
- Manylogs: Improved CMR/SMR Bandwidth and Durability with Scattered Logs [MSST '16]
- What Bugs Live in the Cloud? A Study of Issues in Scalable Distributed Systems [;login: '15]
- Pre-Deployment Detection of Performance Failures in Cloud Systems [HotCloud '15]
- A Fast Model Checker for Finding Heisenbugs in Distributed Systems [ISSTA '15]
- SAMC: Semantic-Aware Model Checking for Fast Discovery of Bugs in Cloud Systems [OSDI '14]
- What Bugs Live in the Cloud? A Study of 3000+ Issues in Cloud Systems [SoCC '14]
- Drill-Ready Cloud Computing (Vision Paper) [SoCC '14]
- Limplock: Understanding the Impact of Limpware on Scale-Out Cloud Systems [SoCC '13]
- The Case for Limping-Hardware Tolerant Clouds [HotCloud '13]
- HARDFS: Hardening HDFS with Selective and Lightweight Versioning [FAST '13]
- Prior to UChicago:
- Failure as a Service (FaaS): A Service for Online Failure Drills [TR-UCB '11]
- PreFail: A Programmable Tool for Multiple-Failure Injection [OOPSLA '11]
- FATE and DESTINI: A Framework for Cloud Recovery Testing [NSDI '11]
- Towards Checking Thousands of Failures with Micro-specifications [HotDep '10]
- Impact of Disk Corruption on Open-Source DBMS [ICDE '10]
- Error Propagation Analysis for File Systems [PLDI '09]
- SQCK: A Declarative File System Checker [OSDI '08]
- EIO: Error Handling is Occasionally Correct [FAST '08]
- Improving File System Reliability with I/O Shepherding [SOSP '07]
- IRON (Internal Robustness) File Systems [SOSP '05]
- Deconstructing Commodity Storage Clusters [ISCA '05]
- Deploying Safe User-Level Network Services with icTCP [OSDI '04]
- Transforming Policies into Mechanisms with Infokernel [SOSP '03]
Services
Program Co-Chairs:
2025: USENIX FAST
2018: USENIX ATC
2015: GCASR
Program Committee (and ERC):
2025: semi-sabbatical
2024: SOSP, ASPLOS, FAST, OSDI
2023: SOSP, REP, ASPLOS
2022: OSDI, EuroSys
2021: NSDI, OSDI, EuroSys
2020: OSDI (Heavy PC), FAST, SYSTOR, APSys
2019: SOSP (Heavy PC), FAST, NSDI, ASPLOS (ERC)
2018: USENIX ATC, FAST, OSDI, SoCC, VLDB
2017: SoCC, TOS, HotCloud, HPDC, ICDCS, Middleware, SOSP SRC
2016: FAST, ATC, VLDB, HotStorage, ASPLOS (ERC)
2015: GCASR (Co-chair), FAST, USENIX ATC, ASPLOS (ERC), MSST, CLOUD
2014: SoCC, HotCloud, HPDC, MSST, INFLOW
2013: SoCC, VLDB
2012: VLDB, MSST, NAS
2011: PDSW, DBTest
Teaching
(Not updated recently; the most recent offerings are internal Canvas links.)
- CMSC 230, Operating Systems: 2017 waitlist, 2016, 2015, 2014, 2013, 2012
- CMSC 154, Introduction to Computer Systems: 2017, 2016, 2015, 2014
- CMSC 331, Advanced Operating Systems: 2014, 2013
- CMSC 332, Topics in Operating Systems: 2021 (MF 3-4:20pm), 2013
Academic Background