Hardware Implementation of a High Efficiency and High-Speed Squaring Architecture


  •   Shiva Maleki Varnosfaderani

  •   Bahram Rashidi

  •   Mohammad Alhawari


This paper presents the design of a simple, high-speed squaring structure built from half adders and full adders. The proposed architecture works in three main steps: simplifying the squaring structure, calculating the carry bit and transferring it to the next part, and finally applying a modified Wallace Tree Adder to sum the products. For 8-bit squaring, the proposed architecture uses only 13 half adders and 20 full adders, the lowest complexity among the works compared. The structure is modeled on Field-Programmable Gate Array (FPGA) devices and has been successfully synthesized and implemented on Xilinx Spartan-6 and Virtex-4 FPGAs. Simulation results show that the proposed structure achieves high speed and low power consumption, with the lowest propagation delay (3.25 ns) and acceptable hardware utilization compared to existing squaring models. The gate-level design of the 8-bit squarer is implemented using Cadence layout tools in 65 nm CMOS technology and has a total area of 0.095 mm².

Keywords: Field-Programmable Gate Array (FPGA), full adder (FA), gate-level design, half adder (HA), Wallace Tree Adder, Xilinx Spartan-6, Xilinx Virtex-4, 65 nm CMOS technology, 8-bit squaring
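
The abstract does not spell out how the squaring structure is simplified. One widely used simplification, assumed here purely for illustration, relies on two identities for the bits a_i of the operand: a_i·a_i = a_i, and a_i·a_j + a_j·a_i = 2·a_i·a_j, so each symmetric pair of partial products folds into a single AND term shifted one column to the left. The Python sketch below is a bit-level model of that idea followed by a plain Wallace-style column compression. All function names are hypothetical, and the compression uses only full adders on groups of three bits; it is not the authors' modified Wallace Tree Adder and is not expected to reproduce the 13 half adder and 20 full adder counts reported in the paper.

```python
def full_adder(x, y, z):
    """One-bit full adder: returns (sum, carry)."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def folded_partial_products(a, width=8):
    """Build the folded partial-product columns for a**2.

    Assumed simplification (not taken from the paper):
      a_i * a_i             -> a_i          at column 2*i
      a_i*a_j + a_j*a_i     -> a_i AND a_j  at column i + j + 1
    """
    bits = [(a >> i) & 1 for i in range(width)]
    columns = [[] for _ in range(2 * width)]
    for i in range(width):
        columns[2 * i].append(bits[i])
        for j in range(i + 1, width):
            columns[i + j + 1].append(bits[i] & bits[j])
    return columns


def compress(columns):
    """Wallace-style reduction: apply full adders to groups of three bits
    in each column until no column holds more than two bits.
    Returns the reduced columns and the number of full adders used."""
    full_adders = 0
    while any(len(col) > 2 for col in columns):
        nxt = [[] for _ in range(len(columns) + 1)]
        for k, col in enumerate(columns):
            groups = len(col) // 3
            for g in range(groups):
                s, c = full_adder(*col[3 * g: 3 * g + 3])
                nxt[k].append(s)        # sum stays in the same column
                nxt[k + 1].append(c)    # carry moves to the next column
                full_adders += 1
            nxt[k].extend(col[3 * groups:])  # leftover bits pass through
        columns = nxt
    return columns, full_adders


def square(a, width=8):
    """Square an unsigned integer of the given bit width."""
    columns, _ = compress(folded_partial_products(a, width))
    # The weighted sum below stands in for the final carry-propagate adder.
    return sum(bit << k for k, col in enumerate(columns) for bit in col)
```

For an 8-bit operand, the folding leaves 8 diagonal bits plus 28 AND terms, i.e. 36 partial-product bits instead of the 64 produced by a general 8×8 multiplier, which is the kind of saving a dedicated squarer exploits before any adder tree is applied.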



How to Cite
Maleki Varnosfaderani, S., Rashidi, B., & Alhawari, M. (2022). Hardware Implementation of a High Efficiency and High-Speed Squaring Architecture. European Journal of Information Technologies and Computer Science, 2(4), 1–4. https://doi.org/10.24018/compute.2022.2.4.71