Wildfire Project
Enabling Embedded Systems and IoT with hls4ml
Background
Wildfires pose an increasingly urgent global threat, as shown by recent devastating fires in Maui, Hawaii, and across Alaska. Addressing this threat requires robust and reliable AI-based wildfire detection models. Our ongoing research has advanced video- and image-based AI models for wildfire detection and ember detection, aimed at early prevention.
Strategy
Because these models are computationally demanding, we propose deploying them on field-programmable gate arrays (FPGAs), which offer flexibility and parallel computation. FPGAs serve as efficient hardware accelerators for deep learning models, supporting timely and accurate wildfire detection.
Results
Our AI detection models have been trained using various frameworks, including PyTorch and TensorFlow-Keras. To deploy them on FPGAs, we rely on hls4ml, which plays a pivotal role in the implementation. Our project focuses on demonstrating the effectiveness of AI models on FPGAs using hls4ml, thereby enabling rapid and efficient wildfire detection and prevention strategies.
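As a rough illustration of what an hls4ml conversion step can look like for a Keras model, the sketch below generates a default configuration, applies two overrides, and builds an HLS project. The helper names `with_overrides` and `convert_keras_model`, the output directory, and the chosen override values are assumptions made for this example; they are not taken from the project. The conversion function needs hls4ml and TensorFlow installed, so hls4ml is imported lazily.

```python
def with_overrides(config, reuse_factor=1, strategy="Latency"):
    """Return a copy of an hls4ml config dict with model-level overrides.

    A higher ReuseFactor shares each multiplier across more operations,
    which lowers resource usage at the cost of latency.
    """
    tuned = dict(config)
    model_cfg = dict(tuned.get("Model", {}))
    model_cfg["ReuseFactor"] = reuse_factor
    model_cfg["Strategy"] = strategy
    tuned["Model"] = model_cfg
    return tuned


def convert_keras_model(model, output_dir="hls_project", part=None):
    """Convert a trained Keras model into an hls4ml project (hypothetical helper)."""
    import hls4ml  # lazy import: only needed when a conversion is requested

    # Start from the default model-level configuration for this network.
    config = hls4ml.utils.config_from_keras_model(model, granularity="model")
    config = with_overrides(config, reuse_factor=4, strategy="Resource")

    # Only pass an FPGA part number if the caller supplied one.
    extra = {"part": part} if part else {}
    hls_model = hls4ml.converters.convert_from_keras_model(
        model, hls_config=config, output_dir=output_dir, **extra
    )
    hls_model.compile()  # builds a C++ emulation library for checking outputs
    return hls_model
```

After `compile()`, calling `hls_model.predict(x)` on sample inputs and comparing against the original model's predictions is a common way to check the effect of fixed-point precision before running synthesis.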
High-Speed Camera + 4D TEM
Enabling Material Science with hls4ml
Background
4D Scanning Transmission Electron Microscopy (4D-STEM) is a powerful technique for atomic-resolution imaging. One common imaging mode captures a 2D diffraction image at each pixel position in real space. The direct electron detectors used can reach 4K resolution at frame rates up to 5000 frames per second. This has led to an orders-of-magnitude increase in the volume and velocity of the data collected, making it challenging to extract actionable information efficiently.
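To give a feel for how quickly a 4D dataset grows, the sketch below computes the size and acquisition time of one scan. The scan grid, detector size, and bytes per pixel are hypothetical values chosen only to make the arithmetic easy to follow; only the 5000 frames-per-second figure comes from the text above.

```python
def dataset_bytes(scan_x, scan_y, det_x, det_y, bytes_per_px):
    """Size of a 4D dataset: one det_x * det_y diffraction image per scan position."""
    return scan_x * scan_y * det_x * det_y * bytes_per_px


def acquisition_seconds(scan_x, scan_y, frames_per_s):
    """Time to record one diffraction image at every scan position."""
    return scan_x * scan_y / frames_per_s


# Hypothetical scan: 256 x 256 probe positions, 128 x 128 detector pixels, 2 bytes each.
size = dataset_bytes(256, 256, 128, 128, 2)
seconds = acquisition_seconds(256, 256, 5000)

print(f"{size / 2**30:.1f} GiB in {seconds:.1f} s")  # prints "2.0 GiB in 13.1 s"
```

Even this modest hypothetical scan produces gigabytes in seconds, which is why moving the analysis closer to the detector is attractive.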
Strategy
We propose and demonstrate a machine learning hardware implementation for real-time crystal structure, rotation, and strain detection in 4D-STEM. It leverages a novel deep neural network (DNN) called a cycle-consistent spatial-transforming autoencoder (CC-ST-AE), which is capable of learning affine transformations on real and simulated data. We then use distillation to train a smaller, quantized, easily deployable version of the model to enable real-time inference and high throughput.
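Rotation and strain can both be written as parts of a single affine map, which is one way to see why an affine-transformation model suits this task. The sketch below is a generic illustration, not the CC-ST-AE itself: it composes a 2x2 matrix from a rotation angle and small strain components, with translation omitted. The function names and parameter names (`exx`, `eyy`, `exy`) are assumptions for this example.

```python
import math


def affine_matrix(theta_rad, exx=0.0, eyy=0.0, exy=0.0):
    """Compose A = R(theta) @ (I + strain) as a 2x2 nested list."""
    c, s = math.cos(theta_rad), math.sin(theta_rad)
    rotation = [[c, -s], [s, c]]
    # Symmetric small-strain tensor added to the identity.
    deformation = [[1.0 + exx, exy], [exy, 1.0 + eyy]]
    return [
        [sum(rotation[i][k] * deformation[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def apply_affine(matrix, point):
    """Apply the 2x2 linear part of an affine map to an (x, y) point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y,
        matrix[1][0] * x + matrix[1][1] * y,
    )


# A 2% stretch along x followed by a 90-degree rotation.
a = affine_matrix(math.pi / 2, exx=0.02)
print(apply_affine(a, (1.0, 0.0)))
```

A model that predicts the entries of such a matrix exposes rotation and strain as directly readable parameters rather than as features buried in a latent vector.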
Results
We use hls4ml to synthesize the distilled model and optimize the implementation to meet the required latency constraint of 100 µs. We then integrate the neural network into the readout path of the imaging system, on board a Euresys CoaXPress frame grabber, to minimize I/O-related overhead. This work provides a proof of concept for real-time crystal structure detection in 4D-STEM, significantly increasing the potential for fast materials characterization and discovery.
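A latency target becomes easier to reason about once it is expressed in clock cycles. The sketch below converts the 100 µs constraint into a cycle budget and compares it with the frame period. The 200 MHz clock is a hypothetical value, and using 5000 frames per second as the operating frame rate is an assumption; neither is stated as a design parameter of this system.

```python
def frame_period_us(frames_per_s):
    """Time between consecutive frames, in microseconds."""
    return 1e6 / frames_per_s


def cycle_budget(latency_us, clock_mhz):
    """Number of clock cycles that fit inside a latency budget."""
    return int(latency_us * clock_mhz)


LATENCY_US = 100        # latency constraint from the text
CLOCK_MHZ = 200         # hypothetical FPGA clock frequency
FRAME_RATE = 5000       # assumed operating frame rate

print(cycle_budget(LATENCY_US, CLOCK_MHZ))   # 20000 cycles
print(frame_period_us(FRAME_RATE))           # 200.0 microseconds per frame
```

Under these assumptions the network must finish within 20,000 cycles, about half of one frame period.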