Tanawin1701d

Description

VitisUnified backend

Motivation

  • The current Vitis backend does not support a complete flow from HLS4ML to bitstream generation and driver deployment (on the ZCU102 in my case).

Summarized features

  • Automatic creation of AXI master reader/writer interfaces for HLS4ML kernels.
  • Based on the v++ compiler and packaging flow.
  • Configuration aligned with the Vitis Unified IDE project structure.
  • Seamless integration with Xilinx hardware platforms:
    • A platform is the Xilinx package that encapsulates the hardware structure, such as AXI interconnects, PS configuration, and interrupts (everything except the HLS4ML kernel).
    • Xilinx ships platforms for some boards with Vitis, under /tools/Xilinx/Vitis/2023.2/base_platforms
    • Developers can also create custom platforms by following the official tutorial: https://github.com/Xilinx/Vitis-Tutorials/tree/2025.1/Vitis_Platform_Creation/Design_Tutorials/01-Edge-KV260
  • Automatic PYNQ driver generation for streamlined deployment.

Type of change

For a new feature or function, please create an issue first to discuss it
with us before submitting a pull request.

Note: Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Tests

  • We test with the tiny Keras U-Net model in test/pytest/test_backend/vitis_unified.py, covering four main aspects:

bridge test

  • Compares VitisUnified with Vitis.
  • We check that the prediction output matches 100%.

cosimulation

  • We use two VitisUnified instances:
    • the first generates the bridge simulation
    • the second starts the cosimulation and collects the result from the cosimulator
  • The results are compared with an acceptable tolerance of 1e-4 (the difference comes from rounding in the .dat file)
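
The tolerance check described above could look roughly like the following. This is a minimal sketch in plain Python, not the code from the test file: the function name max_abs_diff and the sample values are hypothetical, and only the 1e-4 tolerance comes from the description.

```python
def max_abs_diff(expected, actual):
    """Return the largest element-wise absolute difference of two equal-length sequences."""
    if len(expected) != len(actual):
        raise ValueError("outputs have different lengths")
    return max((abs(e - a) for e, a in zip(expected, actual)), default=0.0)


ATOL = 1e-4  # tolerance quoted in the PR description

# Hypothetical values: bridge-simulation output vs. cosimulation output read from a .dat file.
bridge_out = [0.12345678, -1.5, 3.0]
cosim_out = [0.12346, -1.5, 3.00004]

# True when every element differs by at most ATOL.
within_tolerance = max_abs_diff(bridge_out, cosim_out) <= ATOL
```

An absolute tolerance is used here because the description attributes the mismatch to rounding in the .dat file, which bounds the error per value rather than relative to its magnitude.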

fifo optimization test

  • The procedure is similar to the cosimulation test; it additionally checks that fifo_depth.json exists.

hardware test

  • Stress test with 10,000 queries but only 128 (input) + 128 (output) buffer entries, to make sure the AXI connections autogenerated by the Xilinx platform do not deadlock
  • the test is implemented in the function test_gen_unified
    • the test was conducted on a ZCU102 with the PYNQ framework

reproducing the tests

  • Run pytest on the file test/pytest/test_backend/vitis_unified.py
  • For the hardware test (test_gen_unified), set XPFM_PATH (the path to the .xpfm file) to the correct location.
  • If LOG_STD == True, HLS4ML prints the HLS and linker build messages to the console.
  • Otherwise, HLS4ML writes the messages to <output_project_dir>/<prefix>_err.log or <output_project_dir>/<prefix>_out.log

Test Configuration:

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings. (see the section below)
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

implementation detail

vitisUnifiedBackendFlow
  • To build the ready-to-ship files, this backend runs three stages:
    • file generation (HLS4ML generated files): prepares the files for system generation and the PYNQ driver
    • kernel synthesis (v++): runs C synthesis for the HLS4ML model
    • linking (linker + Vivado + bitfile + hwh)

File structure

template structure

  • The tree below shows the template files located at hls4ml/templates/vitis_unified
├── build_lib_multigraph.sh
├── build_lib.sh
├── driver
│   └── pynq
│       └── pynq_driver.py.hls4ml (template for pynq driver)
├── hls_kernel_config.cfg         (config for HLS4ML model Synthesis)
├── myproject_bridge.cpp          (wrapper for C++ simulation using python .predict())
├── myproject_dm.cpp              (wrapper around the HLS4ML model; converts AXI to AXI stream)
├── myproject_dm.h
├── myproject_test.cpp            (for cosimulation and fifo-optimization)
└── workspace
    ├── projectName
    │   └── vitis-comp.json       (project meta-data used for opening using vitis unified IDE)
    └── sysProj
        ├── buildAcc.sh           (script for linking the kernel with platform)
        └── buildConfig.cfg       (config file for the linking process)

output file structure

├ export                        (ready to ship file placed here!)
│   ├ pynq_driver.py
│   ├ system.bit
│   └ system.hwh
├ firmware
│   ├ <project_name>_dm.cpp     (wrapper around the HLS4ML model; converts AXI to AXI stream)
│   ├ <project_name>_dm.h       (not used by the synthesizer, but required by cosim and bridge sim)
│   └ <other files>             (other HLS source files generated by the Vitis and Vivado backends)
├ unifiedWorkspace              (folder for kernel synthesis and the linking process)
│   ├ linker                    (folder for platform linking project)
│   │   ├ buildAcc.sh           (build script for platform link)
│   │   ├ buildConfig.cfg       (config script for platform link)
│   │   └ <other files>         (files that v++ generates while linking the platform)
│   └ <project_name>            (folder for HLS project from HLS4ML model)
│       ├ unifiedPrj            (folder for Vitis HLS internal file)
│       └ vitis-comp.json       (project meta-data used for opening using vitis unified IDE)
├ build_lib.sh                  (build script for bridge simulation)
├ hls_kernel_config_cosim.cfg   (config file for cosim and fifo depth optimization)
├ hls_kernel_config_csim.cfg    (config file for csim )
├ myproject_bridge.cpp          (wrapper for C++ simulation using python .predict())
└ myproject_test.cpp            (for cosimulation and fifo-optimization)
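
As a quick sanity check after a build, one could verify that the three ready-to-ship files from the export folder in the tree above are present. The sketch below is illustrative only: the helper name missing_export_files is hypothetical and is not part of the backend.

```python
from pathlib import Path

# File names taken from the "export" folder in the tree above.
EXPECTED_EXPORT_FILES = ("pynq_driver.py", "system.bit", "system.hwh")


def missing_export_files(output_dir):
    """Return the expected export files that are absent under <output_dir>/export."""
    export_dir = Path(output_dir) / "export"
    return [name for name in EXPECTED_EXPORT_FILES if not (export_dir / name).is_file()]
```

An empty list means the driver, the bitstream, and the hardware handoff file were all produced.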

configuration

    board='zcu102',
    part=None,
    clock_period=5,
    clock_uncertainty='12.5%',
    io_type='io_stream',
    driver='python',
    input_type='float',
    output_type='float',
    in_stream_buf_size=128,
    out_stream_buf_size=128,
    xpfmPath='/opt/Xilinx/Vitis/2023.2/base_platforms/xilinx_zcu102_base_202320_1/xilinx_zcu102_base_202320_1.xpfm',
    **_,
  • input_type and output_type support only float and double, and the two must match.
  • {in/out}_stream_buf_size is measured in elements of nnet::array.
  • xpfmPath is the path to the platform's .xpfm file.
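
The type constraint above can be expressed as a small pre-flight check. This is a sketch under assumptions, not the backend's own validation: check_io_types and the config dictionary are hypothetical names, and only the rule itself (float or double, both identical) comes from the description.

```python
ALLOWED_IO_TYPES = {"float", "double"}


def check_io_types(input_type, output_type):
    """Raise ValueError unless both types are float/double and identical."""
    for name, value in (("input_type", input_type), ("output_type", output_type)):
        if value not in ALLOWED_IO_TYPES:
            raise ValueError(f"{name} must be 'float' or 'double', got {value!r}")
    if input_type != output_type:
        raise ValueError("input_type and output_type must match")


# Hypothetical settings mirroring the defaults listed above.
config = {
    "input_type": "float",
    "output_type": "float",
    "in_stream_buf_size": 128,   # counted in nnet::array elements
    "out_stream_buf_size": 128,  # counted in nnet::array elements
}
check_io_types(config["input_type"], config["output_type"])  # passes silently
```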

notes for developers

  • If you want to debug the generated HLS project in the Vitis Unified IDE, select the unifiedWorkspace folder as the workspace. The IDE will detect the project automatically.
  • For bridge simulation, if input_type/output_type is set to a type x (double or float), you cannot call predict with a numpy array of a different type.
  • The depth argument of the AXI master write interface in <project_name>_dm.cpp must match the size of the output array allocated in myproject_test.cpp for cosim and csim.
    • if the allocated array is larger than depth, the result will be incorrect
    • if the allocated array is smaller than depth, the result is correct, but the system raises a segmentation fault
    • the depth value does not affect resource usage in HLS generation
  • The linked Vivado project is at <project_folder>/unifiedWorkspace/linker/_x/link/vivado/vpl/prj
  • This backend rejects the multigraph feature.

tutorial

  • A tutorial is available in this repository:
    https://github.com/Tanawin1701d/vitisUnifiedTutorial

generated warnings

  • The HLS4ML warnings below relate only to the U-Net model used in pytest; I do not think they come from the new backend.
WARNING:absl:Skipping variable loading for optimizer 'Adam', because it has 17 variables whereas the saved optimizer has 1 variables. 
WARNING: Config parameter "algorithm" overwrites an existing attribute in layer "up_sampling2d" (Resize)
  • For kernel synthesis with Vitis, I think the warnings are the general ones, such as unused parameters, deprecated pragmas, and dataflow conflicts.

…i wrapper for vitisUnified partial backend and build the skeleton code for other generation section
@Tanawin1701d Tanawin1701d marked this pull request as draft September 2, 2025 17:41
@nghielme nghielme assigned nghielme and unassigned nghielme Sep 3, 2025
@nghielme nghielme self-requested a review September 3, 2025 07:54
@bo3z

bo3z commented Sep 3, 2025

Thank you for this contribution!

Could you please elaborate how this would compare against the Vitis Accelerator IP flow in #1134? Both PRs seem to add support for end-to-end deployment on ZCU devices.

@Tanawin1701d

Why we can't completely reuse the FIFO depth optimization code from the Vitis backend

  • With the Vitis backend, the FIFO channel information is stored in the file /.autopilot/db/channel_info.csv
    • We have to gather the data from column 1 (layer_name) and column 3 (the file name of the layer's info file)

Vitis backend/Vitis Unified backend differences

layer name diff in channel_info.csv

  • each row in channel_info.csv has the following layout:
backend        loop_name (col 0)  layer_name (col 1)  <empty_cell> (col 2)  linked_file_name (col 3)
vitis          <loop_name>        layer14_out_U       <empty_cell>          chan_status6.csv
vitis unified  <loop_name>        layer14_out_i_U     <empty_cell>          chan_status6.csv
  • Because Vitis Unified has an AXI wrapper that converts AXI memory-mapped transfers to AXI stream, layer_name (col 1) carries an extra _i.
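
To illustrate the naming difference, the sketch below recovers the same base layer name from both row styles. It is not the backend's get_vitis_optimized_fifo_depths: the helper names are hypothetical, and it assumes the file is comma-separated with no header row, which the description does not state.

```python
import csv
import io

# Two rows shaped like the examples above; "loop_a" stands in for <loop_name>.
VITIS_ROW = "loop_a,layer14_out_U,,chan_status6.csv\n"
UNIFIED_ROW = "loop_a,layer14_out_i_U,,chan_status6.csv\n"


def base_layer_name(channel_name, unified):
    """Strip "_U" (Vitis) or "_i_U" (Vitis Unified) from a channel name."""
    suffix = "_i_U" if unified else "_U"
    if not channel_name.endswith(suffix):
        raise ValueError(f"{channel_name!r} does not end with {suffix!r}")
    return channel_name[: -len(suffix)]


def read_channels(text, unified):
    """Map each base layer name (col 1) to its linked status file (col 3)."""
    rows = csv.reader(io.StringIO(text))
    return {base_layer_name(row[1], unified): row[3] for row in rows if len(row) > 3}
```

Passing the backend explicitly, rather than stripping an optional _i, avoids mangling a Vitis layer whose own name happens to end in _i.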

different location of the HLS work directory

  • HLS internal project directory
    • the Vitis backend places the project at <outputDir>/<project_name>_prj/solution1/.autopilot/db/channel_info.csv
    • Vitis Unified places the project at <outputDir>/unifiedWorkspace/<project_name>/unifiedPrj/hls/.autopilot/db/channel_info.csv
  • The location differs because I think gathering the HLS workspace and the linking workspace in a dedicated directory prevents them from polluting the rest of the HLS4ML file structure. I also think it makes the project's subsystems easier to manage with the Vitis Unified IDE.
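
The two layouts above can be captured in one path helper. This is a minimal sketch that follows the directory names quoted in this comment; the function name channel_info_path and its parameters are hypothetical.

```python
from pathlib import Path

# Common tail shared by both layouts.
CHANNEL_INFO = Path(".autopilot") / "db" / "channel_info.csv"


def channel_info_path(output_dir, project_name, unified):
    """Build the channel_info.csv location for the Vitis or Vitis Unified layout."""
    root = Path(output_dir)
    if unified:
        hls_dir = root / "unifiedWorkspace" / project_name / "unifiedPrj" / "hls"
    else:
        hls_dir = root / f"{project_name}_prj" / "solution1"
    return hls_dir / CHANNEL_INFO
```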

summary

  • Because of the layer name and work directory differences, the Vitis Unified backend must have its own get_vitis_optimized_fifo_depths

@Tanawin1701d

Brief comparison with the Vitis Accelerator IP Flow

  • If there are any mistakes, please let me know.

differences

the linking process

  • vitis acc
    • based on a dedicated Vivado Tcl script for each specific board, which the designer/maintainer has to build manually for each board (more information in hls4ml/backends/vitis_unified/supported_boards.json)
  • vitis unified
    • based on an XPFM file, i.e. a Xilinx platform that can be created entirely with the Vivado GUI and the Vitis GUI
    • Vitis provides ready-to-use XPFM files, located at /tools/Xilinx/Vitis/2023.2/base_platforms
    • the Xilinx tools decide automatically how to link the HLS4ML kernel with the platform
    • designers can share XPFM files freely; just point to the desired XPFM path

the kernel

  • vitis acc
    • it is fixed to a single axi_stream read port and a single axi_stream write port
    • I think the control system is placed in the Vivado Tcl script file
  • vitis unified
    • it is fixed to multiple axi_mmap read and write ports
    • control signals such as ap_start/ap_done can be accessed via axi_lite (v++ links it automatically)

file structure and configuration support

  • vitis acc
    • HLS4ML kernel synthesis is based on a Tcl script file and the vitis_hls workflow
    • I think vitis_hls will be deprecated by Xilinx within a few versions
  • vitis unified
    • HLS4ML kernel synthesis and linking are based on the vitis run and v++ workflow
    • the configuration is based on .cfg files
    • the project metadata is built for the Vitis Unified IDE
    • designers can open the HLS project in the Vitis Unified IDE for debugging and manual operation

multigraph support

  • vitis acc
    • supports multigraph for a single IO stream port
  • vitis unified
    • does not support multigraph
    • We think there should be another dedicated backend, such as a Vitis Unified partial backend
    • We think a complete flow for the multigraph feature should have its own dedicated DFX (partial reconfiguration) and control mechanism,
    • so another backend should support it specifically, reusing some code from this backend.

@Tanawin1701d

Thank you for this contribution!

Could you please elaborate how this would compare against the Vitis Accelerator IP flow in #1134? Both PRs seem to add support for end-to-end deployment on ZCU devices.

Thank you for your comment. The one above is a comparison with the Vitis accelerator IP flow. If there are any aspects you would like me to elaborate on, please let me know.

@qberthet qberthet mentioned this pull request Sep 5, 2025
12 tasks