High-Level and Behavioral Synthesis: Trends and Optimization
Session Chair: Zhufei Chu, Ningbo University
FM-AM: Fused-Metrics Approximate Multiplier Exploration Framework for DNNs
Presenter: Zihan Zou, Southeast University
Abstract: Quantization combined with approximate computing has been extensively explored in Deep Neural Networks (DNNs) for edge devices and hardware accelerators, offering significant reductions in memory footprint and computation overhead. However, effectively integrating these techniques presents two key challenges: (1) Conventional error metrics, such as mean squared error, fail to accurately capture the proper tradeoff between network accuracy and hardware efficiency. (2) The design space for mixed-precision DNNs is vast, making exhaustive exploration computationally prohibitive. To address these challenges, we propose the Fused-Metrics Approximate Multiplier Exploration (FM-AM) framework, which introduces two key innovations: (1) A cross-layer optimization approach that jointly configures quantization bit-widths and energy-efficient approximate multipliers, guided by fused-metrics error estimation and hardware constraints. (2) A Bayesian Optimization-based search algorithm that efficiently selects approximate computing units by pruning the multiplier library, significantly reducing search time while meeting accuracy requirements. We evaluate FM-AM on AlexNet, ResNet-18, and ResNet-50 using the CIFAR-100 dataset. Experimental results demonstrate energy reductions of 60.2%, 58.9%, and 50.0%, respectively, with only 1.63%, 1.59%, and 1.34% accuracy degradation when implemented in a 28-nm CMOS industrial technology.
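A minimal Python sketch of the cross-layer idea described above: each candidate pairs a quantization bit-width with an approximate multiplier, a fused metric folds error and energy into one score, and the library is pruned before search. All names and numbers (ApproxMult, the library entries, the weightings) are hypothetical illustrations, and plain random search stands in for the paper's Bayesian Optimization and fused-metrics error estimation.

# Toy sketch: fused-metric scoring and library pruning for approximate
# multipliers (hypothetical data; FM-AM's real metrics and BO search differ).
from dataclasses import dataclass
import random

@dataclass
class ApproxMult:
    name: str
    bit_width: int     # quantization bit-width it targets
    energy_pj: float   # energy per multiply (hypothetical numbers)
    err_mean: float    # mean relative error
    err_var: float     # error variance

LIBRARY = [
    ApproxMult("mul8_exact", 8, 1.00, 0.000, 0.000),
    ApproxMult("mul8_apx_a", 8, 0.62, 0.010, 0.002),
    ApproxMult("mul6_apx_b", 6, 0.41, 0.030, 0.008),
    ApproxMult("mul4_apx_c", 4, 0.25, 0.090, 0.030),
]

def fused_score(m: ApproxMult, w_energy=0.5, w_err=0.5) -> float:
    """Fold hardware cost and an error estimate into one scalar (lower is
    better); a stand-in for FM-AM's fused-metrics estimation."""
    return w_energy * m.energy_pj + w_err * (m.err_mean + m.err_var)

def prune(library, err_budget=0.1):
    """Drop candidates whose estimated error already violates the budget."""
    return [m for m in library if m.err_mean + m.err_var <= err_budget]

def search(library, n_layers=3, iters=50, seed=0):
    """Random-search stand-in for the BO loop: pick one multiplier per layer
    and keep the assignment with the best summed fused score."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(iters):
        assign = [rng.choice(library) for _ in range(n_layers)]
        cost = sum(fused_score(m) for m in assign)
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost

if __name__ == "__main__":
    candidates = prune(LIBRARY)
    assignment, cost = search(candidates)
    print([m.name for m in assignment], round(cost, 3))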
PipelineGen: Towards Automated Generation of Pipeline from ISA Formal Semantic Specification
Presenter: Xu He, Hunan University
Abstract: Microprocessor pipeline design still involves many manual steps, which can lead to ambiguity in design understanding, low development efficiency, and error-prone results. In this paper, we propose a method for automatically generating an in-order, single-issue pipeline from formal semantic descriptions of a given microprocessor's instruction set architecture (ISA). Following a divide-and-conquer strategy, an individual datapath is first synthesized for each instruction; the per-instruction datapaths are then composed into a single datapath through signal reuse and several other proposed techniques to produce the final pipeline. We implement the proposed method as an open-source tool, PipelineGen, which provides a complete toolchain from the formal ISA specification to RTL code and enriches the ecosystem of the Sail language. We evaluate PipelineGen on three representative microprocessor ISAs. PipelineGen generates correct pipeline designs within minutes, validating the core concept of "synthesis is correct". The promising experimental results demonstrate that our method effectively alleviates the problems of manual pipeline design.
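A toy Python sketch of the composition step: each instruction contributes a per-stage list of required datapath units, and the merged pipeline keeps, per stage, only as many units as the most demanding instruction needs, which is the essence of reusing hardware across instructions. The unit and stage names are invented for illustration; PipelineGen itself works on Sail formal semantics and emits RTL.

# Toy sketch: compose per-instruction datapaths into one shared pipeline
# datapath by reusing units across instructions (illustrative names only).
from collections import Counter

# stage -> multiset of functional units each instruction needs (hypothetical)
DATAPATHS = {
    "add":  {"EX": Counter({"alu": 1})},
    "load": {"EX": Counter({"alu": 1}), "MEM": Counter({"dmem_port": 1})},
    "mul":  {"EX": Counter({"mul": 1})},
}

def compose(datapaths):
    """Per stage, take the element-wise maximum of unit counts: a unit needed
    by several instructions is instantiated once and shared (reused)."""
    merged = {}
    for stages in datapaths.values():
        for stage, units in stages.items():
            slot = merged.setdefault(stage, Counter())
            for unit, count in units.items():
                slot[unit] = max(slot[unit], count)
    return merged

if __name__ == "__main__":
    for stage, units in compose(DATAPATHS).items():
        print(stage, dict(units))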
DOMAC: Differentiable Optimization for High-Speed Multipliers and Multiply-Accumulators
Presenter: Chenhao Xue, Peking University
Abstract: Multipliers and multiply-accumulators (MACs) are fundamental building blocks for compute-intensive applications such as artificial intelligence. With the diminishing returns of Moore’s Law, optimizing multiplier performance now necessitates process-aware architectural innovations rather than relying solely on technology scaling. In this paper, we introduce DOMAC, a novel approach that employs differentiable optimization for designing multipliers and MACs at specific technology nodes. DOMAC establishes an analogy between optimizing multi-stage parallel compressor trees and training deep neural networks. Building on this insight, DOMAC reformulates the discrete optimization challenge into a continuous problem by incorporating differentiable timing and area objectives. This formulation enables us to utilize existing deep learning toolkits for a highly efficient implementation of the differentiable solver. Experimental results demonstrate that DOMAC achieves significant enhancements in both performance and area efficiency compared to state-of-the-art baselines and commercial IPs in multiplier and MAC designs.
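A rough PyTorch sketch of the core idea: discrete per-column compressor choices are relaxed into softmax weights, so area and delay proxies become differentiable in the logits and can be minimized with a stock optimizer, much like training a small network. The cost numbers, the reduction constraint, and the three-term objective are placeholders, not DOMAC's actual timing and area models.

# Rough sketch: differentiable selection among compressor types per column,
# trained by gradient descent (placeholder cost model, not DOMAC's).
import torch

N_COLUMNS, N_TYPES = 8, 3                  # choices per column: 3:2, 2:2, none
AREA   = torch.tensor([3.0, 1.5, 0.0])     # hypothetical area cost per choice
DELAY  = torch.tensor([2.0, 1.0, 0.0])     # hypothetical delay contribution
REDUCE = torch.tensor([1.0, 0.5, 0.0])     # partial-product bits removed
TARGET_REDUCTION = 0.8                     # required reduction per column (toy)

logits = torch.zeros(N_COLUMNS, N_TYPES, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.05)

for step in range(300):
    opt.zero_grad()
    probs = torch.softmax(logits, dim=-1)          # soft, differentiable choices
    area  = (probs * AREA).sum()                   # expected total area
    delay = (probs * DELAY).sum(dim=-1).max()      # slowest column as delay proxy
    shortfall = (TARGET_REDUCTION - (probs * REDUCE).sum(dim=-1)).clamp(min=0)
    loss = delay + 0.1 * area + 10.0 * shortfall.pow(2).sum()
    loss.backward()
    opt.step()

print("per-column choice:", torch.argmax(logits, dim=-1).tolist())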
SAT-Sweeping Based on XOR-Majority Graph
Presenter: Jiaxin Peng, Ningbo University
Abstract: Combinational equivalence checking is a fundamental task in electronic design automation (EDA). However, the intricate interconnections of XOR structures in a circuit pose significant challenges for traditional equivalence checking methods. To address these challenges, this paper proposes an equivalence checking method based on SAT-sweeping for XOR-Majority Graph (XMG) networks. The approach fully exploits the efficiency and compactness of XOR and majority (MAJ) gates in XMG networks for representing logic functions. It incorporates XMG-specific rewriting optimizations and strengthens the traditional SAT-sweeping strategy with a SAT-guided method for generating high-toggle-rate simulation vectors, enabling more efficient verification of internal node equivalence. Together, these improvements raise overall verification efficiency: our experiments show 2.81× and 2.74× improvements over state-of-the-art equivalence checking methods.
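A schematic Python sketch of the sweeping loop described above: simulation vectors (plain random here, rather than SAT-guided high-toggle-rate vectors) first partition nodes into candidate-equivalence classes, and only the surviving candidates are then proven. The toy "network" is just a dictionary of Boolean functions, and exhaustive comparison stands in for the real SAT call.

# Schematic SAT-sweeping on a toy network: nodes are Boolean functions of the
# primary inputs; exhaustive comparison stands in for the real SAT call.
import random
from collections import defaultdict

N_INPUTS = 3
ALL_PATTERNS = [tuple((p >> i) & 1 for i in range(N_INPUTS))
                for p in range(2 ** N_INPUTS)]

NODES = {
    "xor01":  lambda a, b, c: a ^ b,
    "xor01b": lambda a, b, c: (a | b) & ~(a & b) & 1,   # same function, other form
    "maj":    lambda a, b, c: (a & b) | (a & c) | (b & c),
    "and01":  lambda a, b, c: a & b,
}

def signature(fn, vectors):
    """Simulate the node on a handful of input vectors (cheap filter)."""
    return tuple(fn(*v) for v in vectors)

def prove_equivalent(f, g):
    """Stand-in for the SAT call: exhaustively check all input patterns."""
    return all(f(*p) == g(*p) for p in ALL_PATTERNS)

def sat_sweep(nodes, n_vectors=4, seed=0):
    rng = random.Random(seed)
    vectors = rng.sample(ALL_PATTERNS, n_vectors)
    classes = defaultdict(list)              # 1) group by simulation signature
    for name, fn in nodes.items():
        classes[signature(fn, vectors)].append(name)
    proven = []                              # 2) prove the surviving candidates
    for members in classes.values():
        rep = members[0]
        for other in members[1:]:
            if prove_equivalent(nodes[rep], nodes[other]):
                proven.append((rep, other))
    return proven

if __name__ == "__main__":
    print(sat_sweep(NODES))                  # expect [('xor01', 'xor01b')]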
Area-oriented Boolean Resubstitution with Efficient Dependency Function Computation
Presenter: Chen Lv, Ningbo University
Abstract: Boolean resubstitution is a widely recognized and utilized optimization algorithm for logic networks. In this paper, we enhance the traditional resubstitution approach by integrating the Semi-Tensor Product (STP). Specifically, our method leverages STP to compute all feasible dependency functions for a target node in a single calculation during the resubstitution process. Using these dependency functions, we derive a more optimal implementation of the target node through decomposition and exact synthesis, ultimately leading to more efficient logic network optimization. Experiments on the IWLS benchmark suite demonstrate that our method outperforms state-of-the-art (SOTA) methods: it achieves a 7% improvement in size optimization and, in 6-LUT mapping, a 4.6% improvement in the size-depth product. Furthermore, applying our method after iterating the SOTA resubstitution algorithm until convergence further improves optimization by 8.17%.
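A small truth-table sketch of the dependency-function idea: a target node can be re-expressed over a set of divisor nodes exactly when input minterms that agree on all divisor values also agree on the target value; when that holds, the dependency function can be read off as a lookup from divisor patterns to target values. The bit-level check and the 3-input example are illustrations only; the paper computes all feasible dependency functions at once via the Semi-Tensor Product.

# Truth-table sketch of dependency-function extraction for resubstitution.
# The real method uses the Semi-Tensor Product; this is a plain bit-level check.
from itertools import product

def dependency_function(target, divisors, n_inputs):
    """If `target` is a function of `divisors` over all input minterms, return
    the dependency function as {divisor pattern -> target value}, else None."""
    table = {}
    for minterm in product((0, 1), repeat=n_inputs):
        pattern = tuple(d(*minterm) for d in divisors)
        value = target(*minterm)
        if table.setdefault(pattern, value) != value:
            return None                  # same divisor pattern, two target values
    return table

if __name__ == "__main__":
    # Hypothetical 3-input example: target = a ^ b ^ c with divisors
    # d1 = a ^ b and d2 = c, so the dependency function is d1 XOR d2.
    def target(a, b, c): return a ^ b ^ c
    def d1(a, b, c): return a ^ b
    def d2(a, b, c): return c
    dep = dependency_function(target, [d1, d2], n_inputs=3)
    print(dep)   # {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}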