This paper presents a performance comparison of FPGAs and GPUs. Numerical methods for solving sparse matrix systems are evaluated as the main case study. The experimental results show that GPUs outperform FPGA-based hardware (HW) emulation in run time when the number of equations is small. For a large number of equations (on the order of ten million), FPGA-based HW emulation outperforms GPUs, because the degree of parallelism achieved by the emulation is higher in that case.
Keywords: Numerical Method, FPGA, GPU, Sparse Matrices, Matrices.
Cite this paper
Khaled Salah, Mohamed AbdelSalam. (2017) Performance Comparison of FPGAs and GPUs: Solving Sparse Matrices Case-Study. International Journal of Mathematical and Computational Methods, 2, 161-170
Copyright © 2017. The author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0.