Customizable FPGA-Based Hardware Accelerator for Standard Convolution Processes

This work addresses the need to run CNN-based solutions optimized for point clouds on devices with limited resources. It proposes the design and implementation of a convolutional module for implementing CNNs in hardware. The module is configurable: all typical convolution parameters can be adjusted, and the degree of parallelism can be tuned to the resource constraints, making it capable of performing any standard convolution found in the literature.

Overall

The module allows the typical convolution parameters to be configured, so it can be applied to any CNN layer. The ReLU operation is executed by default, and the user can enable the MaxPooling operation as an alternative to using stride.

Conv_ip
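As a software reference for what such a layer computes, here is a minimal pure-Python sketch of a single-channel convolution followed by ReLU, with an optional max pooling step. It is not the hardware implementation; the function names (`conv_output_size`, `conv2d_relu`, `max_pool2d`) and the 2x2 pooling window are assumptions made for illustration.

```python
def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Standard output-size formula for one spatial dimension."""
    return (in_size + 2 * padding - kernel) // stride + 1


def conv2d_relu(image, kernel, stride=1, padding=0):
    """Single-channel 2D convolution (cross-correlation) followed by ReLU."""
    h, w, k = len(image), len(image[0]), len(kernel)
    # Zero-pad the input on all four sides.
    padded = [[0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for r in range(h):
        for c in range(w):
            padded[r + padding][c + padding] = image[r][c]
    out_h = conv_output_size(h, k, stride, padding)
    out_w = conv_output_size(w, k, stride, padding)
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for u in range(k):
                for v in range(k):
                    acc += padded[i * stride + u][j * stride + v] * kernel[u][v]
            row.append(max(acc, 0))  # ReLU
        out.append(row)
    return out


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling (window size is an assumption)."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + u][j * size + v]
                for u in range(size) for v in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

# Two ways to halve the feature map: stride 2, or stride 1 plus pooling.
strided = conv2d_relu(image, identity, stride=2, padding=1)
pooled = max_pool2d(conv2d_relu(image, identity, stride=1, padding=1))
```

Both `strided` and `pooled` are 2x2 feature maps, but their values differ: the strided version subsamples the input, while the pooled version keeps the maximum of each 2x2 window.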

The main focus during development was energy efficiency. However, parallelism was also integrated to achieve competitive processing times: several Processing Elements can be activated to increase throughput. The level of parallelism is limited by the resources available on the target hardware platform.

Bram
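To illustrate how the number of Processing Elements affects throughput, the sketch below assumes a simple scheme in which each Processing Element computes one output filter at a time. That mapping, and the names `partition_filters` and `passes_needed`, are hypothetical and are not taken from the module's design.

```python
def partition_filters(num_filters, num_pes):
    """Split filter indices into contiguous chunks, one chunk per PE.

    Chunk sizes differ by at most one (assumed scheduling, for illustration).
    """
    base, extra = divmod(num_filters, num_pes)
    chunks, start = [], 0
    for pe in range(num_pes):
        size = base + (1 if pe < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def passes_needed(num_filters, num_pes):
    """Sequential passes when every PE handles one filter per pass."""
    return -(-num_filters // num_pes)  # ceiling division


chunks = partition_filters(10, 4)
passes = passes_needed(10, 4)
```

Under this assumption, doubling the number of Processing Elements roughly halves the number of passes per layer, as long as the target platform has the resources to host them.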

Each Processing Element operates with a high level of efficiency. To achieve this, cascade processing was adopted in the core of the processing unit to perform the operations.

Master-cascade
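One common reading of cascade processing is a chain of multiply-accumulate stages, where each stage adds its own product to the partial sum received from the previous stage. The sketch below models that idea in software; whether the module's core is organized exactly this way is an assumption, and `cascade_partial_sums` is a hypothetical name.

```python
def cascade_partial_sums(inputs, weights, bias=0):
    """Model a chain of multiply-accumulate stages.

    Stage i receives the partial sum from stage i-1, adds inputs[i] * weights[i],
    and forwards the result. Returns the partial sum after every stage.
    """
    partial = bias
    sums = []
    for x, w in zip(inputs, weights):
        partial += x * w  # one cascade stage
        sums.append(partial)
    return sums


stages = cascade_partial_sums([1, 2, 3], [4, 5, 6])
result = stages[-1]  # output of the last stage is the dot product
```

The last entry of `stages` equals the dot product of the input window and the kernel weights, which is the value a single convolution output requires.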

As a case study, the convolutional module was integrated with a well-known 3D object detection model, both to validate it and to evaluate its performance in a real-world scenario.

ConvM_validation_integration

With the PointPillars model as a case study, the module reduced the processing time by up to 25% without compromising detection performance.

PP_6k


Made by pedromiguelcp & duartesilva16. Project no longer under development. 🏁 Check out our article: https://www.mdpi.com/1424-8220/22/6/2184
Contact [email protected] for more information!