- Provide theoretical concepts of High-Level Synthesis (HLS)
- Provide a comprehensive understanding of HLS design flow for FPGAs
- In-system hardware debugging of the post-implemented design
- Case studies and projects using HLS Programming
After successful completion of this internship, I learnt to:
- Create my own IP and SoC design for FPGA implementation
- Develop my own software application with Zynq APSoC
WEEK 1
Applications of FPGA
FPGA Design Approach
HLS in Industries
- NVIDIA for designing embedded GPU
- Google in building VP9 encoder/decoder
- AMD in designing microprocessors
- STM for H.265
Why do we need FPGA?
Both an FPGA and a CPU can implement digital logic and run computations.
A CPU runs an operating system whereas an FPGA does not, so why should we work with an FPGA at all?
FPGA vs CPU
Computing Structure of CPU
FPGA computing structure
A GPU also exploits parallelism, but not every application is well suited to a GPU.
FPGA Structure
In FPGA (Field-Programmable Gate Array) design, understanding the application layer and configuration layer is essential. Let's explore these layers:
- Purpose (application layer):
- The application layer represents the top-level functionality implemented on the FPGA.
- It includes user-defined logic, custom IP cores, and specific algorithms.
- Designers create and program this layer to perform specific tasks or computations.
- Purpose (configuration layer):
- The configuration layer holds the configuration memory (SRAM cells) that is loaded from the bitstream.
- It defines the contents of each LUT and the settings of the routing switches, thereby determining the behaviour of the application layer.
An LUT (LookUp Table) in FPGA design is a fundamental building block. It can implement any Boolean function of N input variables. Essentially, an LUT acts as a truth table, defining how your combinatorial logic behaves. When an FPGA is configured, it fills in the LUT with output values, which are stored in SRAM bits. The same physical LUT can implement different logic functions by changing the LUT-Mask (the truth table). Think of LUTs as memory, where inputs serve as addresses, and outputs are the data stored in those addresses.
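For example, a 4-input LUT holds $2^4 = 16$ configuration bits; rewriting those 16 SRAM bits is enough to make the same physical LUT implement any one of the $2^{16}$ possible 4-input Boolean functions.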
The FPGA (Field-Programmable Gate Array) design flow involves several critical steps to ensure the successful implementation of digital circuits. Below is an overview of the typical stages:
- Specification: Define the requirements and functionality of the design.
- Design Entry: Create the design using hardware description languages (HDL) such as VHDL or Verilog, or through schematic entry.
- Synthesis: Convert the HDL code into a gate-level netlist.
- Implementation: Map, place, and route the netlist onto the FPGA fabric. This involves:
- Mapping: Assign logic to specific resources.
- Placement: Determine the physical locations of logic blocks.
- Routing: Establish the connections between the placed blocks.
- Simulation: Verify the design functionality at various stages (pre-synthesis, post-synthesis, and post-implementation) using simulation tools.
- Configuration: Program the FPGA with the final design using configuration files.
- Validation and Testing: Test the programmed FPGA in the real hardware environment to ensure it meets the specified requirements.
FPGA consists of several fundamental components:
- Logic Blocks: The basic computational units, typically comprising lookup tables (LUTs), flip-flops, and multiplexers. They perform the primary logic functions.
- Interconnects: The routing resources that connect the logic blocks. They allow for complex signal routing within the FPGA.
- I/O Blocks: Interfaces that connect the internal FPGA logic to the external pins of the device. They manage input and output signals.
- Clock Management: Components like PLLs (Phase-Locked Loops) and MMCMs (Mixed-Mode Clock Managers) for clock distribution and timing management.
- Embedded Memory: Blocks of memory such as RAM and ROM within the FPGA for data storage and buffering.
- DSP Blocks: Dedicated blocks for digital signal processing tasks, offering efficient implementation of arithmetic operations.
A Look-Up Table (LUT) is a fundamental component in digital logic design, particularly in FPGAs (Field-Programmable Gate Arrays). It is used to implement combinational logic functions. The LUT works by mapping input combinations to their corresponding output values based on a pre-defined truth table.
The logic function provided is:
$d = (\neg a \land b) \lor c$

where $a$, $b$, and $c$ are inputs and $d$ is the output. The truth table lists all possible combinations of the inputs $a$, $b$, and $c$ and the corresponding output $d$ for the given logic function.
| a | b | c | d |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 |
| 0 | 1 | 0 | 1 |
| 1 | 1 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 1 | 0 | 1 | 1 |
| 0 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 |
A LUT can be visualized as a table that stores output values for each possible input combination. For a 3-input LUT (as in this example), there are $2^3 = 8$ possible input combinations.
- Inputs: The inputs `a`, `b`, and `c` are fed into the LUT.
- Output: The LUT outputs the value `d` based on the combination of inputs.
- Each combination of inputs `(a, b, c)` is treated as an address to access a specific location in the LUT.
- For instance, if `a = 0`, `b = 0`, and `c = 0`, this combination points to the first row in the truth table, resulting in `d = 0`.
- The LUT stores the output values for each input combination based on the truth table.
- When a specific combination of inputs is applied to the LUT, it outputs the pre-stored value for that combination.
- Input combination (0, 0, 0): `a = 0`, `b = 0`, `c = 0` corresponds to the first row in the truth table, giving `d = 0`.
- Input combination (0, 1, 0): `a = 0`, `b = 1`, `c = 0` corresponds to the third row in the truth table, giving `d = 1`.
- Input combination (1, 1, 1): `a = 1`, `b = 1`, `c = 1` corresponds to the last row in the truth table, giving `d = 1`.
- The diagram on the right shows a LUT with three inputs (`a`, `b`, `c`) and one output (`d`).
- The truth table on the left provides the mapping of inputs to the corresponding output values.
- When the inputs are applied to the LUT, it looks up the pre-stored value for that input combination and produces the output.
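To make the address/data view concrete, here is a small software model (illustration only, not HLS code) of the 3-input LUT above; the array `lut_mask` plays the role of the SRAM bits:

```cpp
#include <cstdio>

int main() {
    // SRAM contents (the LUT-Mask) for d = (!a & b) | c.
    // The read address is formed from the inputs as (c << 2) | (b << 1) | a,
    // so lut_mask[address] is simply the d column of the truth table above.
    const int lut_mask[8] = {0, 0, 1, 0, 1, 1, 1, 1};

    for (int addr = 0; addr < 8; ++addr) {
        int a = (addr >> 0) & 1;
        int b = (addr >> 1) & 1;
        int c = (addr >> 2) & 1;
        printf("a=%d b=%d c=%d -> d=%d\n", a, b, c, lut_mask[addr]);
    }
    return 0;
}
```

Changing only the eight values in `lut_mask` would make the same "LUT" implement any other 3-input function, which is exactly what reconfiguring the FPGA does.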
This document provides an explanation of key components in FPGA architecture, focusing on slices, configuration logic blocks, and the overall FPGA structure. The images included illustrate these components in the context of Xilinx's 7 Series FPGAs.
A slice is a fundamental building block within an FPGA's Configurable Logic Block (CLB). Each slice contains multiple Look-Up Tables (LUTs), flip-flops, and multiplexers. These elements are used to implement combinational and sequential logic functions.
- Components of a Slice:
- LUTs (Look-Up Tables): Used for implementing combinational logic.
- Flip-Flops: Used for storing state information and implementing sequential logic.
- Multiplexers: Used for routing signals within the slice and to/from the CLB.
The image illustrates the internal structure of a slice, showing how these components are interconnected to form complex logic functions.
A CLB is a larger building block within an FPGA that contains multiple slices. The CLB provides the necessary routing and switching logic to connect slices to each other and to other parts of the FPGA.
- Components of a CLB:
- Slices: Each CLB contains multiple slices (usually two in Xilinx FPGAs).
- Switch Matrix: Facilitates the routing of signals between slices and to/from other CLBs.
- CIN/COUT (Carry In/Carry Out): Used for implementing fast carry chains in arithmetic operations.
The image shows the structure of a CLB, highlighting its internal slices and the switch matrix used for routing signals.
The overall FPGA architecture consists of an array of CLBs, each containing multiple slices. The CLBs are interconnected by a complex routing matrix, allowing for flexible signal routing and interconnection.
- Components of FPGA Architecture:
- CLBs (Configurable Logic Blocks): The primary building blocks containing slices.
- Switch Matrix: Facilitates the routing of signals between CLBs.
- I/O Pads: Used for interfacing the FPGA with external signals and devices.
- Wires: Interconnect the CLBs and other components within the FPGA.
This document explains the analogy between software functions and hardware modules, highlighting their differences and detailing the main tasks in High-Level Synthesis (HLS).
The analogy between software functions and hardware modules can be understood as follows:
- Software Function:
  - Input Arguments: Variables passed to the function.
  - Function Body: The main logic of the function.
  - Output Arguments: Variables returned by the function.
- Hardware Module:
  - Input Ports: Equivalent to input arguments in software.
  - Hardware Body: The core logic of the hardware module.
  - Output Ports: Equivalent to output arguments in software.
The image illustrates how a software function can be mapped to a hardware module, showing the correspondence between input/output arguments and ports, as well as the function body and hardware body.
Despite the analogy, there are key differences between software functions and hardware modules:
- Execution Model:
  - Software Function: Sequential execution.
  - Hardware Module: Fully parallel execution, leveraging the intrinsic parallelism of electronic circuits.
- Memory Access:
  - Software Function: Easy memory access, often abstracted away by the programming environment.
  - Hardware Module: Complicated memory access, requiring explicit control and design.
- Control and Controller:
  - Software: Relies on the underlying cache and control mechanisms of the processor.
  - Hardware: Needs explicit design of control mechanisms to manage data flow and operations.
The image highlights these differences, showing the sequential nature of software execution versus the parallel nature of hardware execution, as well as the differences in memory access complexity.
High-Level Synthesis (HLS) involves two main tasks:
- Function Synthesis:
  - Objective: Convert the software function into a hardware module.
  - Process: Translate the function body into hardware logic, optimizing for parallelism.
  - Components: Input ports (arguments), function body (logic), output ports (arguments).
- Interface Synthesis:
  - Objective: Define the communication protocols for the hardware module.
  - Process: Create interfaces that handle the input/output ports, ensuring efficient data transfer and control.
  - Considerations: Bandwidth, throughput, and protocol adherence.
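As a minimal illustration of the two tasks (the function name and data widths are chosen for this example, not taken from the course material): function synthesis turns the body of `sat_add()` into logic and its arguments into ports, while interface synthesis, requested here with `INTERFACE` directives, decides how those ports handshake with the outside world.

```cpp
#include <cstdint>

// Saturating 16-bit adder: the function body becomes hardware logic,
// the arguments a and b become input ports, and the return value becomes
// an output port.
uint16_t sat_add(uint16_t a, uint16_t b) {
#pragma HLS INTERFACE ap_none port=a         // plain data inputs, no handshake
#pragma HLS INTERFACE ap_none port=b
#pragma HLS INTERFACE ap_ctrl_hs port=return // block-level start/done handshake
    uint32_t sum = static_cast<uint32_t>(a) + b;   // widened add
    return (sum > 0xFFFF) ? 0xFFFF : static_cast<uint16_t>(sum);
}
```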
The design flow in Vivado HLS allows you to create custom hardware accelerators from high-level C/C++ code. This process simplifies FPGA (Field-Programmable Gate Array) development by abstracting low-level details. Here’s how it works:
- C/C++ design entry: Write your algorithm or functionality in C/C++; this high-level code serves as the basis for your hardware design.
- C simulation: Simulate your C/C++ code to verify correctness, using test vectors to check for the expected behaviour.
- C synthesis: Transform the C/C++ code into RTL (Register Transfer Level); Vivado HLS generates the hardware components (datapaths, controllers, etc.).
- C/RTL co-simulation: Co-simulate the C/C++ code with the generated RTL to verify the synthesized RTL against the original C/C++ code and ensure functional equivalence.
- Export and integration: Export the generated IP (Intellectual Property) for use in the Vivado Design Suite and integrate the custom IP into your FPGA project.
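A hypothetical example of the first two steps above: a small design function and a self-checking C test bench that drives it with a known test vector before any synthesis is run (function and variable names are illustrative).

```cpp
#include <cstdio>

// Design under test: 4-tap multiply-accumulate.
int mac4(const int coeff[4], const int sample[4]) {
    int acc = 0;
    for (int i = 0; i < 4; ++i) {
        acc += coeff[i] * sample[i];
    }
    return acc;
}

// C test bench: apply a known vector and compare against the expected result.
int main() {
    const int coeff[4]  = {1, 2, 3, 4};
    const int sample[4] = {5, 6, 7, 8};
    const int expected  = 1 * 5 + 2 * 6 + 3 * 7 + 4 * 8;   // 70

    int got = mac4(coeff, sample);
    if (got != expected) {
        printf("FAIL: got %d, expected %d\n", got, expected);
        return 1;   // non-zero return marks the C simulation as failed
    }
    printf("PASS\n");
    return 0;
}
```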
- Inputs: Analogous to call-by-value arguments in software programming.
- Outputs: Analogous to function returns and call-by-reference arguments in software programming.
- Inouts: Inputs/outputs (inout) in hardware design correspond to call-by-reference arguments in software programming.
- Design: The hardware design itself is analogous to the function body in software programming.
The software side of the analogy explains how arguments and return values are handled in a function:
- Call by Value: Arguments passed by value, meaning a copy of the value is passed.
- Call by Reference: Arguments passed by reference, allowing the function to modify the variable.
- Function Return: The value that the function returns after execution.
The example code provided in the diagram demonstrates a function that performs an addition operation using both call-by-value and call-by-reference arguments:
int add(int a, int *b, int &c) {  // a: call by value; b, c: call by reference (pointer / C++ reference)
    int d;
    d = a + *b + c;   // function body: the computation that becomes hardware logic
    return d;         // function return: becomes an output port after synthesis
}
- 1-bit data
- 8-bit data
- A complex port, such as:
- USB port: Universal Serial Bus port, used for connecting peripherals to the hardware design, providing data transfer and power supply.
- DDR memory port: Double Data Rate memory port, used for connecting to high-speed memory modules, facilitating fast data read/write operations.
- Data stream port: Used for continuous data transmission between the hardware design and external devices or systems, often in real-time applications.
- I²C access port: Inter-Integrated Circuit port, used for communication between low-speed peripherals and the hardware design, typically requiring only two lines (SDA and SCL).
void ledonoff(unsigned char *o) {
    *o = 0b11111000;   // write a fixed pattern to the 8-bit LED output (bits 7..3 set, bits 2..0 clear)
}
#define data_type unsigned char

void basic_input_output(
    data_type input,       // value read from the input port
    data_type *output) {   // value written to the output port
    *output = input;       // pass the input straight through to the output
}
- Description: When the top-level function is synthesized, the arguments (or parameters) to the function are synthesized into RTL (Register Transfer Level) ports. This process is called interface synthesis.
- Illustration: The diagram shows a top-level function `top-function()` with arguments `p1`, `p2`, and `p3` that are synthesized into ports. This top function calls several other functions (`func1()`, `func2()`, `func3()`, etc.), which are also connected via ports.
- Description: Port interfaces define how data is transferred between different modules or functions in the FPGA design. The diagram provides an example of a function `basic_input_output()` that uses port interfaces.
- Example Code:

void basic_input_output(data_type *output) {
#pragma HLS INTERFACE ap_none port=output
    *output = 0b11110000;
}

- Explanation: The `#pragma HLS INTERFACE` directive specifies the type of interface for the `output` port. In this case, `ap_none` indicates that the port does not use any specific handshaking protocol.
- `ap_ack`
- `ap_bus`
- `ap_ctrl_hs`
- `ap_fifo`
- `ap_hs`
- `ap_memory`
- `ap_none`
- `ap_ovld`
These modes define different ways to handle data transfer and synchronization between the modules.
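Roughly speaking (summarising the standard Vivado HLS behaviour for orientation): `ap_none` produces a plain data port with no protocol; `ap_ack`, `ap_hs`, and `ap_ovld` add acknowledge and/or valid handshake signals; `ap_memory` and `ap_fifo` generate RAM-style and FIFO-style ports, typically for array arguments; `ap_ctrl_hs` is the block-level control protocol (start/done/idle/ready); and `ap_bus` connects the port to a bus adapter.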
- Configuration: Simple data output (`ap_none`).
- Description:
  - The module has an 8-bit output `o[7:0]` connected directly to `led[7:0]`.
  - No handshaking signals are used.
  - This configuration is the most straightforward and has no control signals.
- Configuration: Data output with HLS protocol signals (`ap_hs`).
- Description:
  - The module has an 8-bit output `o[7:0]` connected to `led[7:0]` with control signals.
  - Control signals include:
    - `ap_ack_0`: Acknowledge signal for the process.
    - `ap_clk_0`: Clock signal for synchronization.
    - `ap_rst_0`: Reset signal.
    - `ap_vld_0`: Valid signal indicating valid data.
  - These control signals facilitate handshaking, ensuring proper data transfer and synchronization between the module and the connected components.
- Configuration: Data output with an AXI-Lite interface (`s_axilite`).
- Description:
  - The module interfaces using the AXI-Lite protocol, suitable for control registers and low-throughput data transfers.
  - Signals include:
    - `s_axi_AXILiteS_AWADDR_0[9:0]`: Write address.
    - `s_axi_AXILiteS_AWVALID_0`: Write address valid.
    - `s_axi_AXILiteS_AWREADY_0`: Write address ready.
    - `s_axi_AXILiteS_WDATA_0[31:0]`: Write data.
    - `s_axi_AXILiteS_WSTRB_0[3:0]`: Write strobes.
    - `s_axi_AXILiteS_WVALID_0`: Write data valid.
    - `s_axi_AXILiteS_WREADY_0`: Write data ready.
    - `s_axi_AXILiteS_BRESP_0[1:0]`: Write response.
    - `s_axi_AXILiteS_BVALID_0`: Write response valid.
    - `s_axi_AXILiteS_BREADY_0`: Write response ready.
    - `s_axi_AXILiteS_ARADDR_0[9:0]`: Read address.
    - `s_axi_AXILiteS_ARVALID_0`: Read address valid.
    - `s_axi_AXILiteS_ARREADY_0`: Read address ready.
    - `s_axi_AXILiteS_RDATA_0[31:0]`: Read data.
    - `s_axi_AXILiteS_RRESP_0[1:0]`: Read response.
    - `s_axi_AXILiteS_RVALID_0`: Read data valid.
    - `s_axi_AXILiteS_RREADY_0`: Read data ready.
  - This interface allows communication with the module using standard AXI-Lite signals, making it suitable for integration in larger systems with AXI interconnects.
These configurations demonstrate different levels of complexity and capability for interfacing a design module, from simple direct connections (`ap_none`) to handshaking protocols (`ap_hs`) and standardized bus interfaces (`s_axilite`).
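The three configurations can be requested from the same C code with different `INTERFACE` directives. The sketch below is illustrative (the function names are made up for this comparison) and mirrors the LED example used earlier.

```cpp
// Plain 8-bit output, no handshake (ap_none configuration).
void ledonoff_none(unsigned char *o) {
#pragma HLS INTERFACE ap_none port=o
    *o = 0b11111000;
}

// Output with a valid/acknowledge handshake (ap_hs configuration).
void ledonoff_hs(unsigned char *o) {
#pragma HLS INTERFACE ap_hs port=o
    *o = 0b11111000;
}

// Output mapped into an AXI4-Lite register bank (s_axilite configuration).
void ledonoff_axi(unsigned char *o) {
#pragma HLS INTERFACE s_axilite port=o
#pragma HLS INTERFACE s_axilite port=return
    *o = 0b11111000;
}
```

Which variant to choose depends on whether the downstream logic needs flow control (`ap_hs`) or whether a processor should access the value over an AXI interconnect (`s_axilite`).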
In this course, we will learn simulation-based verification.
This flowchart outlines the steps involved in the HLS design flow. Here’s a brief summary:
- C Test Bench: Create a test environment in the C programming language.
- Design in C/C++: Perform the actual design work using C or C++.
- Constraints Directives: Specify design constraints, such as timing requirements.
- C Synthesis: Synthesize the design into a hardware description language (HDL).
- RTL Description: Create a Register Transfer Level (RTL) description.
- Package IP: Package the final design as intellectual property (IP).
huffman-encoding-core
The Canonical Huffman Encoding project involves designing, simulating, and evaluating a hardware accelerator for Huffman encoding using High-Level Synthesis (HLS) tools. The primary goal is to implement a Canonical Huffman code, which is a variant of Huffman coding where the codes are ordered in a way that minimizes storage requirements and simplifies the encoding and decoding processes.
Canonical Huffman codes are a variant of Huffman codes with specific properties that make them easier to store and manipulate. This project implements a Canonical Huffman encoding process in hardware using HLS tools. The project includes several stages, from histogram calculation to tree truncation and codeword assignment.
- Vivado HLS or similar HLS tools
- Basic understanding of Huffman encoding
- Familiarity with C/C++ programming
- Basic knowledge of FPGA design and Verilog
.
├── src
│ ├── huffman.h
│ ├── huffman.cpp
│ ├── create_codeword.cpp
│ └── testbench.cpp
├── data
│ ├── huffman.random256.txt
│ └── huffman.random256.gold
└── README.md
- `src`: Contains the source code for the Huffman encoding project.
- `data`: Contains the input frequency data and the golden reference output.
This module calculates the histogram of symbol frequencies from the input data. The histogram is used to build the Huffman tree.
Detailed Description:
The histogram calculation step involves reading the input data and counting the occurrences of each symbol. This frequency count is essential for constructing the Huffman tree, as it determines the weights of the tree nodes.
- Input: An array of input symbols (the data whose symbol occurrences are to be counted).
- Output: A histogram array where each index represents a symbol, and the value at that index represents the frequency of that symbol.
Code Snippet:
void calculate_histogram(const int* in_data, int* histogram) {
    // Count how many times each symbol value occurs in the input data
    // (the input here is assumed to contain INPUT_SYMBOL_SIZE symbols).
    for (int i = 0; i < INPUT_SYMBOL_SIZE; ++i) {
        histogram[in_data[i]]++;
    }
}
This module constructs the Huffman tree based on the symbol frequencies. It uses a priority queue to ensure that the tree is built correctly.
The Huffman tree construction involves creating a binary tree where each leaf node represents a symbol, and the path from the root to the leaf determines the symbol's codeword. The priority queue is used to build the tree efficiently, ensuring that nodes with lower frequencies are combined first.
Input: A histogram array of symbol frequencies.
Output: A Huffman tree where each node contains a symbol and its frequency.
Steps:
- Initialize a priority queue and insert all non-zero-frequency symbols as leaf nodes.
- Extract the two nodes with the smallest frequencies from the queue, combine them into a new node, and insert the new node back into the queue.
- Repeat until only one node remains in the queue; this node is the root of the Huffman tree.
Code Snippet:
#include <queue>
#include <vector>

// Build the Huffman tree from the symbol histogram. Compare is assumed to
// order the priority queue as a min-heap, so the least frequent node is on top.
HuffmanTree construct_tree(const int* histogram) {
    std::priority_queue<HuffmanNode, std::vector<HuffmanNode>, Compare> pq;
    // Insert every symbol that actually occurs as a leaf node.
    for (int i = 0; i < INPUT_SYMBOL_SIZE; ++i) {
        if (histogram[i] > 0) {
            pq.push(HuffmanNode(i, histogram[i]));
        }
    }
    // Repeatedly merge the two least frequent nodes until only the root remains.
    while (pq.size() > 1) {
        HuffmanNode left = pq.top(); pq.pop();
        HuffmanNode right = pq.top(); pq.pop();
        HuffmanNode parent(left, right);   // parent frequency = left + right
        pq.push(parent);
    }
    return pq.top();   // the remaining node is the root of the Huffman tree
}
This module truncates the Huffman tree to a maximum codeword length. This step is necessary to ensure that the codewords fit within the desired bit-width.
Detailed Description
Tree truncation involves adjusting the Huffman tree to ensure that no codeword exceeds a specified maximum length. This is important for hardware implementations where the bit-width is limited.
Input: A Huffman tree and the maximum allowed codeword length.
Output: A truncated Huffman tree.
Steps:
- Traverse the Huffman tree to determine the initial lengths of the codewords.
- If any codeword exceeds the maximum length, redistribute the codewords to shorten the longer ones while maintaining the prefix property of Huffman codes.
Code Snippet:
void truncate_tree(HuffmanTree& tree, int max_length) {
// Truncation logic here
}
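The stub above leaves the redistribution logic open. One common way to implement it (a generic heuristic, not necessarily the exact method used in the reference project) is to work on a histogram of codeword lengths rather than on the tree itself, repeatedly moving over-long codewords to shallower levels while keeping the Kraft sum, and hence the prefix property, intact:

```cpp
// Truncate codeword lengths so that none exceeds max_length.
// codeword_length_histogram[l] holds how many symbols currently have length l;
// tree_depth is the deepest occupied level. Both names are illustrative.
void truncate_lengths(int codeword_length_histogram[], int tree_depth, int max_length) {
    for (int level = tree_depth; level > max_length; --level) {
        while (codeword_length_histogram[level] > 0) {
            // Find the deepest level below max_length that still has a codeword.
            int j = max_length - 1;
            while (codeword_length_histogram[j] == 0) {
                --j;
            }
            // Remove two codewords from the over-long level; their parent
            // one level up becomes a leaf.
            codeword_length_histogram[level] -= 2;
            codeword_length_histogram[level - 1] += 1;
            // Split one codeword at level j into two at level j + 1; one of the
            // new slots absorbs a displaced symbol. Both moves leave the Kraft
            // sum unchanged, so the code remains a valid prefix code.
            codeword_length_histogram[j] -= 1;
            codeword_length_histogram[j + 1] += 2;
        }
    }
}
```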
This module assigns codewords to each symbol based on the truncated tree. It ensures that the codewords are in canonical form, which simplifies storage and manipulation.
Detailed Description
Canonical Huffman codes assign the shortest codes to the most frequent symbols. The codes are sorted by length, and within each length, the codes are assigned in lexicographic order.
Input: A truncated Huffman tree.
Output: An array of codewords for each symbol.
Steps:
Calculate the first codeword for each length. Assign codewords to symbols based on their lengths, ensuring the codes are in lexicographic order.
- First Codeword Calculation:
  - Determine the first codeword for each length using the codeword length histogram.
  - Formula:

    first_codeword[1] = 0;
    for (int i = 2; i <= MAX_CODEWORD_LENGTH; ++i) {
        first_codeword[i] = (first_codeword[i - 1] + codeword_length_histogram[i - 1]) << 1;
    }
- Assigning Codewords:
  - Assign codewords to symbols and format them for encoding and decoding.
  - Store codewords in bit-reversed order for easier decoding.
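As a small worked example (illustrative numbers, not from the project data): suppose the truncated tree produces two codewords of length 2 and three of length 3. Then `first_codeword[2] = (first_codeword[1] + 0) << 1 = 00`, so the two length-2 symbols receive `00` and `01`; and `first_codeword[3] = (00 + 2) << 1 = 100`, so the three length-3 symbols receive `100`, `101`, and `110`, assigned in lexicographic order within each length.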
Code Snippet:
void create_codeword(
    const CodewordLength symbol_bits[INPUT_SYMBOL_SIZE],
    const ap_uint<SYMBOL_BITS> codeword_length_histogram[TREE_DEPTH],
    PackedCodewordAndLength encoding[INPUT_SYMBOL_SIZE]
) {
    // First canonical codeword for each length, derived from the length histogram.
    Codeword first_codeword[MAX_CODEWORD_LENGTH];
    first_codeword[0] = 0;
    for (int i = 1; i < MAX_CODEWORD_LENGTH; ++i) {
        first_codeword[i] = (first_codeword[i - 1] + codeword_length_histogram[i - 1]) << 1;
    }
    // Assign each symbol the next codeword of its length, bit-reversed for decoding.
    for (int i = 0; i < INPUT_SYMBOL_SIZE; ++i) {
        CodewordLength length = symbol_bits[i];
        if (length != 0) {
            Codeword out_reversed = first_codeword[length];
            out_reversed.reverse();
            out_reversed >>= (MAX_CODEWORD_LENGTH - length);
            // Pack the (reversed) codeword together with its length.
            encoding[i] = (out_reversed << CODEWORD_LENGTH_BITS) + length;
            first_codeword[length]++;   // next symbol of this length gets the next code
        } else {
            encoding[i] = 0;   // symbol does not occur: no codeword assigned
        }
    }
}
The final part of the code is the testbench. It reads the input frequency values from a file, processes them using the Huffman encoding function, and compares the resulting codewords with an existing golden reference stored in a file.
Detailed Description The testbench verifies the correctness of the Huffman encoding implementation by comparing the output with a precomputed golden reference.
Input: A file containing symbol frequencies.
Output: A file containing the encoded symbols.
Steps:
Read the input frequencies from a file.
Perform the Huffman encoding process.
Write the encoded symbols to an output file.
Compare the output file with the golden reference file to verify correctness.
Code Snippet:
#include <cstdio>
#include <cstdlib>
#include "huffman.h"   // project header: ap_uint, Symbol, PackedCodewordAndLength, INPUT_SYMBOL_SIZE

int main() {
    // Allocate storage for the input frequencies and load them from file.
    ap_uint<16> *frequencies = (ap_uint<16> *)malloc(INPUT_SYMBOL_SIZE * sizeof(ap_uint<16>));
    file_to_array("data/huffman.random256.txt", frequencies, INPUT_SYMBOL_SIZE);
Symbol in[INPUT_SYMBOL_SIZE];
for (int i = 0; i < INPUT_SYMBOL_SIZE; ++i) {
in[i].frequency = frequencies[i];
in[i].value = i;
}
PackedCodewordAndLength encoding[INPUT_SYMBOL_SIZE];
int num_nonzero_symbols;
huffman_encoding(in, encoding, &num_nonzero_symbols);
FILE* output_file = fopen("data/huffman.random256.out", "w");
for (int i = 0; i < INPUT_SYMBOL_SIZE; ++i) {
fprintf(output_file, "%d, %x\n", i, (unsigned int)encoding[i]);
}
fclose(output_file);
if (system("diff data/huffman.random256.out data/huffman.random256.gold")) {
printf("FAIL: Output DOES NOT match the golden output\n");
return 1;
} else {
printf("PASS: The output matches the golden output\n");
return 0;
}
}
- Clone the Repository

  git clone https://github.com/AnoushkaTripathi/NIELIT-INTERNSHIP-ON-HLS-PROGRAMMING
  cd huffman-encoding

- Compile the Project

  Use Vivado HLS or your preferred HLS tool to compile the project.

- Run the Testbench

  Execute the testbench to verify the implementation.