Designing FPGA Applications Through Intelligent Design Space Exploration

FPGAs are becoming an important hardware platform for improving the ratio of compute power to energy consumption in many applications. Robotics applications are especially affected: they need more processing power for decision making, yet often run on a limited power budget. However, designing an efficient FPGA application is very time-consuming, and a single design can take several hours to map onto the hardware. Each application usually exposes many parameters that affect running time, accuracy, and other objectives. Because the space of possible parameter combinations is very large, it is often difficult, even for an expert, to predict the outcome of one particular design. We propose smart design space exploration algorithms based on active learning techniques, capable of selecting the best architectures from a pool of possible designs. We also propose tunable FPGA designs to generate and analyze design spaces for multiple applications and to improve the design space exploration methods. We specifically analyze SLAM (Simultaneous Localization And Mapping) design spaces to improve the implementation of such algorithms on FPGA hardware.
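To make "selecting the best architectures" concrete, here is a minimal sketch of one common way to define it: keeping only the Pareto-optimal designs from a pool. It assumes every design has already been measured on two objectives that are both minimized; the objective pair (run time, logic utilization) and the sample numbers are hypothetical. It shows only the selection target, not the active learning procedure itself, which would try to find these designs without compiling the whole pool.

```python
from typing import List, Tuple

# Hypothetical objective pair: (run time in ms, logic utilization), both minimized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(designs: List[Design]) -> List[Design]:
    """Return the designs that no other design dominates, in input order."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Made-up measurements for five candidate designs.
designs = [(10.0, 0.80), (12.0, 0.50), (15.0, 0.60), (20.0, 0.30), (11.0, 0.90)]
front = pareto_front(designs)
print(front)  # [(10.0, 0.8), (12.0, 0.5), (20.0, 0.3)]
```

The quadratic scan is fine for a pool of a few thousand designs; larger pools would call for a sort-based method.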

(Download poster)

Spector: OpenCL benchmarks for FPGA

High-level synthesis tools allow programmers to use OpenCL to create FPGA designs. Unfortunately, these tools have a complex compilation process that can take several hours to synthesize a single design. Understanding the design space and guiding the optimization process is crucial, but requires a significant amount of design space data that is currently unavailable or difficult to generate. To solve this problem, we have developed Spector, an OpenCL FPGA benchmark suite. We outfitted each benchmark with a range of optimization parameters (or knobs), compiled thousands of unique designs using the Altera OpenCL SDK, and recorded their corresponding performance and utilization characteristics. These benchmarks and results are completely open-source and available on our repository.
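As an illustration of how a handful of knobs turns into a large design space, the sketch below enumerates every combination of a small, hypothetical knob set. The knob names and value ranges are assumptions chosen for the example and are not taken from the Spector benchmarks. The helper that formats a design as `-D NAME=value` strings follows the general OpenCL preprocessor-definition convention; how a given toolchain consumes such definitions is also an assumption here.

```python
from itertools import product
from typing import Dict, List

# Hypothetical knobs; real benchmarks define their own names and ranges.
knobs: Dict[str, List[int]] = {
    "unroll_factor": [1, 2, 4, 8],
    "work_group_size": [64, 128, 256],
    "num_compute_units": [1, 2],
}


def enumerate_designs(knobs: Dict[str, List[int]]) -> List[Dict[str, int]]:
    """Return one dict per unique combination of knob values."""
    names = list(knobs)
    return [dict(zip(names, values)) for values in product(*(knobs[n] for n in names))]


def to_defines(design: Dict[str, int]) -> List[str]:
    """Format a design as preprocessor-style definitions (illustrative format)."""
    return [f"-D{name.upper()}={value}" for name, value in design.items()]


designs = enumerate_designs(knobs)
print(len(designs))            # 24 combinations from only three knobs
print(to_defines(designs[0]))  # ['-DUNROLL_FACTOR=1', '-DWORK_GROUP_SIZE=64', '-DNUM_COMPUTE_UNITS=1']
```

Because the count is the product of the range sizes, each added knob multiplies the number of designs. With compile times measured in hours, exhaustive compilation quickly becomes expensive.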

We published and presented this work at the ICFPT 2016 conference in Xi'an, China.

Maya Archaeology: Tunnel Mapping

Many Maya archaeological sites are fragile and not open to the public. We are experimenting with data collection methods to help create 3D visualizations. To enable fast real-time scanning, we are building upon mobile technologies and RGB-D sensors such as Microsoft Kinect, Intel RealSense, the Google Tango tablet, and the NVIDIA Jetson TX2 board. For the 2016 Guatemala deployment, we developed a basic 3D reconstruction application on the Google Tango and collected data at the excavation sites. For the 2017 deployment, we built a prototype scanning device consisting of a backpack carrying a laptop and batteries, connected to an external tablet with a light and sensors.

More information on the Engineers for Exploration webpage.
Here is the corresponding poster, which was presented at the UCSD Research Expo 2016. Related videos are below.

Building a 3D scanner prototype for the 2017 season:

Collecting data in the archaeological site of El Zotz in Guatemala, field season 2016:

Testing custom 3D reconstruction software on the Google Tango, in Anza-Borrego mud caves, before the 2016 season:

KinectFusion on FPGA

This work is based on KinectFusion, a project developed by Microsoft Research. You can use a Kinect camera to reconstruct your environment in 3D in real time, just by holding the camera and moving around. However, this program requires a modern GPU that uses a lot of power. We want to run it on more power-efficient hardware and eventually bring 3D reconstruction to embedded systems. We are modifying Kinfu, the open-source version of KinectFusion, to make it run on an FPGA by using high-level tools such as the Altera OpenCL SDK. The program is divided into three parts: Iterative Closest Point (ICP) for camera motion tracking, Volumetric Integration (VI) to build the 3D model, and Ray Tracing for screen rendering. We have integrated the ICP algorithm on an FPGA to make a hybrid GPU/FPGA application that runs in real time, and we are working on optimizing VI to run efficiently on the FPGA.
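For readers unfamiliar with the Volumetric Integration step, the following sketch shows the per-voxel update commonly used in KinectFusion-style pipelines: a running weighted average of a truncated signed distance function (TSDF). It is a simplified scalar version under stated assumptions, not the Kinfu or FPGA implementation. The function name, the truncation distance of 0.03 m, and the weight cap of 64 are illustrative choices.

```python
from typing import Tuple


def integrate_voxel(tsdf: float, weight: float, sdf: float,
                    trunc: float = 0.03, max_weight: float = 64.0) -> Tuple[float, float]:
    """Fuse one signed-distance observation (meters) into a single voxel.

    tsdf and weight are the voxel's stored values; sdf is the new measurement
    (positive in front of the observed surface). Returns the updated pair.
    """
    if sdf < -trunc:
        # Voxel lies far behind the observed surface: leave it unchanged.
        return tsdf, weight
    d = min(1.0, sdf / trunc)  # truncate and normalize to [-1, 1]
    new_tsdf = (tsdf * weight + d) / (weight + 1.0)  # running average
    new_weight = min(weight + 1.0, max_weight)       # cap so old data can fade
    return new_tsdf, new_weight


# Two observations of the same voxel, then one that is skipped.
v = integrate_voxel(0.0, 0.0, 0.015)   # first measurement, 1.5 cm in front
v = integrate_voxel(v[0], v[1], -0.015)  # second measurement, 1.5 cm behind
skipped = integrate_voxel(0.2, 3.0, -0.1)  # too far behind: unchanged
```

In a full pipeline this update runs once per voxel per depth frame. Since each voxel is updated independently, the loop parallelizes well, which is one reason it is a natural candidate for GPU or FPGA acceleration.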

We published and presented our work at the ICFPT 2014 conference in Shanghai.

This video presents the project and was created as part of a classwork requirement: