This week I attended the amazing FCCM 2022 conference at Cornell Tech, which was the first in-person/hybrid FPGA conference since the COVID pandemic! It was exciting to catch up with friends I had not seen since FPGA 2020, the last in-person FPGA conference before the pandemic (!), and to make a lot of new friends. Here are some highlights, grouped into a few themes.

HLS Tool Talks
The first session of the conference was Tools, which is my favourite session as someone who works on HLS tools. There were four in-person talks, and they were all interesting.
Lana Josipovic from ETH Zurich presented recent progress on her HLS tool Dynamatic: resource sharing for dynamically scheduled circuits. Her work addresses a well-known problem of dynamic scheduling: dynamically scheduled hardware traditionally cannot share resources, which leads to dramatically more area than static scheduling. Her work significantly reduces this area overhead and makes dynamic scheduling more practical for realistic applications.
We also presented our work on bringing C-slow pipelining to HLS. The idea came up three years ago: when I started my PhD, one of my supervisors, George Constantinides, suggested that I read the book Reconfigurable Computing, where I first read about C-slow pipelining and became interested. C-slow pipelining is an old optimisation technique that can significantly improve hardware performance, but it has since been largely forgotten outside the textbook. We revisited it and integrated it into the HLS flow. More details about our work can be found in George's blog and our paper.
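To give a flavour of the idea, here is a simplified sketch of my own (not code from the paper, and the names are purely illustrative). A loop with a loop-carried dependency cannot start a new iteration every cycle, but if C independent streams are interleaved through the same datapath, the pipeline can be kept busy:

```c
#define C 4      /* interleaving factor (illustrative) */
#define N 1024

/* A single stream: each iteration depends on the previous result,
 * so a pipelined HLS implementation cannot start a new iteration
 * every cycle (initiation interval > 1). */
float accumulate(const float *x) {
    float acc = 0.0f;
    for (int i = 0; i < N; i++)
        acc = 0.5f * acc + x[i];   /* loop-carried dependency */
    return acc;
}

/* C-slowed version: C independent streams share the same datapath
 * round-robin, so an iteration from a different stream can enter the
 * pipeline while each stream waits for its own previous result. */
void accumulate_cslow(const float *x[C], float out[C]) {
    float acc[C] = {0.0f};
    for (int i = 0; i < N; i++)
        for (int c = 0; c < C; c++)        /* interleave the streams */
            acc[c] = 0.5f * acc[c] + x[c][i];
    for (int c = 0; c < C; c++)
        out[c] = acc[c];
}
```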
Ecenur Üstün from Cornell University presented her work on rewriting large-bitwidth multiplications for HLS. Her work significantly reduces the resource overhead of these multiplications by using E-graphs, allowing the designs to fit into an FPGA device and be optimised. E-graphs have recently become popular and useful for expression rewriting.
My colleague Michalis Pardalos presented ongoing work on formally verified resource sharing in Vericert. Vericert is a formally verified HLS tool, developed by my colleague Yann Herklotz, whose translation is proven correct. This work could significantly reduce the area of the hardware generated by Vericert.
HLS-Related Talks
There were also other interesting HLS-related works in the other sessions. The following are highlights from the in-person talks:
Tiancheng Xu from Rice University presented his work on accelerating genome variant calling. He uses HLS tools to implement an efficient memory architecture and to parallelise the column-wise computation, achieving a 30x speedup over a 16-thread CPU implementation.
Archit Gajjar from NC State University presented his work on XGBoost inference using HLS. It achieves a speedup of up to 65.8x over a CPU (Intel Xeon E5-2686 v4) and 4.1x over a GPU (NVIDIA T4 Tensor Core).
Johannes de Fine Licht from ETH Zurich gave a fantastic talk about his work on fast arbitrary-precision floating point for HPC, which enables wide-mantissa floating-point multiplication. It is related to his previous work on flexible communication-avoiding matrix multiplication, published at FPGA 2020.
Best Paper Award
Aman Arora from The University of Texas at Austin presented his work on compute-in-memory blocks for FPGAs. These blocks are quite flexible and can be reconfigured at runtime, potentially achieving better performance than traditional FPGA fabrics. His work won the Best Paper Award, and I look forward to it being integrated into commercial FPGA devices someday.
Workshops & Tutorials
At this FCCM, Yann Herklotz and I helped Lana Josipovic and John Wickerson organise the First Workshop on Formal Methods in HLS. We invited Christian Pilato from Politecnico di Milano, and Ecenur Üstün and Rachit Nigam from Cornell University, to present their work. Yann and I also shared our recent progress on using formal methods to prove and improve hardware.

(photo by John Wickerson).
The workshop went smoothly, and I also managed to meet Rajit Manohar from Yale University in person. I have been following his asynchronous HLS work, which is closely related to Dynamatic and DASS. When I casually asked about his tool, he showed me a brilliant demo that went from writing some user code to generating asynchronous Verilog designs in seconds!
Licheng Guo from UCLA gave a tutorial on their tool flow TAPA. TAPA integrates AutoBridge and RapidStream (which won the FPGA 2021 and FPGA 2022 Best Paper Awards) and enables efficient implementation of dataflow hardware.
Tim Callahan from Google gave a tutorial on customising TinyML processors on an FPGA board. This brought me back to my undergraduate days of playing with embedded systems on FPGAs. Tim showed us how to implement an efficient CPU + CFU (Custom Function Unit) system on small FPGAs. He also kindly offered each participant a tiny Fomu FPGA board, which contains about 5k LUTs and is no bigger than a USB connector.

Last but not least…
Many thanks to Zhiru Zhang and Eriko Nurvitadhi for making this possible. Many volunteers at the conference helped everything run smoothly; thanks to them for giving us such a good experience. We all had a great time in New York!
