USCID | Email | Name
---|---|---
4014212838 | [email protected] | Arash Fayyazi
- Low-latency inference
  - Inference without reading parameters off memory → more efficient hardware implementation
- Scalability issues
  - Different architectures for each neuron → use the truth table as a unique signature, or constrain parameters during training
  - Large number of inputs → exponential growth in the number of input combinations
Logic minimization is the process of finding a functionally equivalent representation of a given logic circuit with the goal of reducing area, delay, and/or power consumption. For example, the two-level cover ab + ab' collapses to the single literal a. ESPRESSO-II, developed in 1982 (Brayton et al.), is the most popular two-level logic minimization algorithm.
Our work relies on ESPRESSO-II for the minimization of sparse, incompletely specified functions and implements a GPU version of ESPRESSO-II that is two orders of magnitude faster than the original ESPRESSO. We will use CUDA for the implementation.
We will identify which loops can be parallelized on GPUs, evaluate and rank them by the expected benefit of parallelization, and work out the details of how that parallelization can be achieved.
$ chmod +x configure
$ /bin/bash ./configure
$ make
$ sudo make install
$ espresso -t <pla_file>
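For reference, a minimal input in the Berkeley PLA format that espresso accepts (three inputs, one output; the rows list the on-set minterms); the variable names are illustrative:

```
.i 3
.o 1
.ilb a b c
.ob f
110 1
111 1
011 1
.e
```

On this example the minimized cover should collapse to two product terms, ab + bc (written as `11- 1` and `-11 1` in PLA notation); the `-t` flag used above additionally prints a trace of the minimization steps.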
This distribution is just a reworked version of the c. 1989 Berkeley espresso source code. All kudos to the original authors.
- Mahdi Nazemi, Ghasem Pasandi, and Massoud Pedram. 2019. Energy-efficient, low-latency realization of neural networks through boolean logic minimization. In Proceedings of the 24th Asia and South Pacific Design Automation Conference (ASPDAC '19). Association for Computing Machinery, New York, NY, USA, 274–279. DOI: https://doi.org/10.1145/3287624.3287722
- R. K. Brayton, G. D. Hachtel, C. T. McMullen, and A. L. Sangiovanni-Vincentelli. Logic Minimization Algorithms for VLSI Synthesis, ser. The Kluwer International Series in Engineering and Computer Science, vol. 2. Springer, 1984. [Online]. Available: https://doi.org/10.1007/978-1-4613-2821-6