This week I attended the amazing FCCM 2022 conference at Cornell Tech, which was the first in-person/hybrid FPGA conference since the COVID pandemic! It was exciting to catch up with friends I had not seen since FPGA 2020, the last in-person FPGA conference before the pandemic (!), and to make a lot of new friends. Here are some highlights, grouped into a few themes.

HLS Tool Talks
The first session of the conference was Tools, which is my favourite session as someone who works on HLS tools. There were four in-person talks, and they were all interesting.
Lana Josipovic from ETH Zurich presented recent progress on her HLS tool Dynamatic: resource sharing for dynamically scheduled circuits. Her work addresses a well-known problem of dynamic scheduling: dynamically scheduled hardware traditionally cannot share resources, which leads to dramatically more area than static scheduling. Her work significantly reduces this area overhead and makes dynamic scheduling more practical for realistic applications.
We also presented our work on bringing C-slow pipelining to HLS. The idea came up three years ago: when I started my PhD, one of my supervisors, George Constantinides, suggested that I read the book Reconfigurable Computing, where I first read about C-slow pipelining and became interested. C-slow pipelining is an old optimisation technique that can significantly improve hardware performance, but it has since been largely forgotten outside the textbook. We revisited it and integrated it into the HLS flow. More details about our work can be found in George's blog and our paper.
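To give a flavour of the idea, here is a simplified sketch of my own (not code from the paper, and the names are purely illustrative). A loop with a loop-carried dependency cannot start a new iteration every cycle, but if C independent streams are interleaved through the same datapath, the pipeline can be kept busy:

```c
#define C 4      /* interleaving factor (illustrative) */
#define N 1024

/* A single stream: each iteration depends on the previous result,
 * so a pipelined HLS implementation cannot start a new iteration
 * every cycle (initiation interval > 1). */
float accumulate(const float *x) {
    float acc = 0.0f;
    for (int i = 0; i < N; i++)
        acc = 0.5f * acc + x[i];   /* loop-carried dependency */
    return acc;
}

/* C-slowed version: C independent streams share the same datapath
 * round-robin, so an iteration from a different stream can enter the
 * pipeline while each stream waits for its own previous result. */
void accumulate_cslow(const float *x[C], float out[C]) {
    float acc[C] = {0.0f};
    for (int i = 0; i < N; i++)
        for (int c = 0; c < C; c++)        /* interleave the streams */
            acc[c] = 0.5f * acc[c] + x[c][i];
    for (int c = 0; c < C; c++)
        out[c] = acc[c];
}
```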
Ecenur Üstün from Cornell University presented her work on rewriting large-bitwidth multiplications for HLS. Her work significantly reduces the resource overhead of these multiplications by using E-graphs, allowing the designs to fit into an FPGA device and be optimised. E-graphs have recently become popular and useful for expression rewriting.
My colleague Michalis Pardalos presented ongoing work on formally verified resource sharing in Vericert. Vericert is a formally verified HLS tool, developed by my colleague Yann Herklotz, whose translation is proven correct. This work could significantly reduce the area of the hardware generated by Vericert.
HLS-Related Talks
There were also other interesting HLS-related works in the other sessions. The following are highlights from the in-person talks:
Tiancheng Xu from Rice University presented his work on accelerating genome variant calling. He uses HLS tools to implement an efficient memory architecture and to parallelise the column-wise computation, achieving a 30x speedup over a 16-thread CPU implementation.
Archit Gajjar from NC State University presented his work on XGBoost inference using HLS. It achieves a speedup of up to 65.8x over a CPU (Intel Xeon E5-2686 v4) and 4.1x over a GPU (NVIDIA T4 Tensor Core).
Johannes de Fine Licht from ETH Zurich gave a fantastic talk about his work on fast arbitrary-precision floating point for HPC, which enables wide-mantissa floating-point multiplication. It is related to his previous work on flexible communication-avoiding matrix multiplication, published at FPGA 2020.
Best Paper Award
Aman Arora from The University of Texas at Austin presented his work on compute-in-memory blocks for FPGAs. These blocks are quite flexible and can be reconfigured at runtime, potentially achieving better performance than traditional FPGA fabrics. His work won the Best Paper Award, and I look forward to it being integrated into commercial FPGA devices someday.
Workshops & Tutorials
At this FCCM, Yann Herklotz and I helped Lana Josipovic and John Wickerson organise the First Workshop on Formal Methods in HLS. We invited Christian Pilato from Politecnico di Milano, and Ecenur Üstün and Rachit Nigam from Cornell University, to present their work. Yann and I also shared our recent progress on using formal methods to prove and improve hardware.

(photo by John Wickerson).
The workshop went smoothly, and I also managed to meet Rajit Manohar from Yale University in person. I have been following his asynchronous HLS work, which is closely related to Dynamatic and DASS. When I casually asked about his tool, he showed me a brilliant demo that went from writing some user code to generating asynchronous Verilog designs in seconds!
Licheng Guo from UCLA gave a tutorial on their tool flow TAPA. TAPA integrates AutoBridge and RapidStream (which won the FPGA 2021 and FPGA 2022 Best Paper Awards) and enables efficient implementation of dataflow hardware.
Tim Callahan from Google gave a tutorial on customising TinyML processors on an FPGA board. This brought me back to my undergraduate days of playing with embedded systems on FPGAs. Tim showed us how to implement an efficient CPU + CFU (Custom Function Unit) system on small FPGAs. He also kindly offered each participant a tiny Fomu FPGA board, which contains about 5k LUTs and is no bigger than a USB connector.

Last but not least…
Many thanks to Zhiru Zhang and Eriko Nurvitadhi for making this possible. Many volunteers at the conference helped everything run smoothly; thanks to them for giving us such a good experience. We all had a great time in New York!
