Du P, Weber R, Luszczek P, Tomov S, Peterson G, Dongarra J (2012) From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming. Parallel Comput 38(8): 391-407.
 Sengupta S, Harris M, Zhang Y, Owens JD (2007) Scan primitives for GPU computing. Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS Symposium on Graphics Hardware, San Diego, California, USA.
 Sakharnykh N (2009) Tridiagonal solvers on the GPU and applications to fluid simulation. NVIDIA GPU Technology Conference, San Jose, California, USA.
 Zhang Y, Cohen J, Owens JD (2010) Fast tridiagonal solvers on the GPU. ACM SIGPLAN Notices 45(5): 127-136.
 Göddeke D, Strzodka R (2011) Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid. IEEE T Parall Distr 22(1): 22-32.
 Davidson A, Owens JD (2011) Register packing for cyclic reduction: a case study. Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, Newport Beach, California, USA.
 Sakharnykh N (2010) Efficient tridiagonal solvers for ADI methods and fluid simulation. NVIDIA GPU Technology Conference, San Jose, California, USA.
 Egloff D (2011) Pricing financial derivatives with high performance finite difference solvers on GPUs. In: Wen-mei W. Hwu (Ed.), GPU Computing Gems Jade Edition, pp. 23.309-23.322, Burlington: Morgan Kaufmann.
 Zhang Y, Cohen J, Davidson AA, Owens JD (2011) A hybrid method for solving tridiagonal systems on the GPU. In: Wen-mei W. Hwu (Ed.), GPU Computing Gems Jade Edition, pp. 11.117-11.132, Burlington: Morgan Kaufmann.
 Wei Z, Jang B, Zhang Y, Jia Y (2013) Parallelizing alternating direction implicit solver on GPUs. Procedia Computer Science 18: 389-398.
 Kim HS, Wu S, Chang LW, Hwu WMW (2011) A scalable tridiagonal solver for GPUs. International Conference on Parallel Processing (ICPP), Taipei, Taiwan.
 Esfahanian V, Darian HM, Gohari SI (2013) Assessment of WENO schemes for numerical simulation of some hyperbolic equations using GPU. Comput Fluids 80: 260-268.
 Esfahanian V, Baghapour B, Torabzadeh M, Chizari H (2014) An efficient GPU implementation of cyclic reduction solver for high-order compressible viscous flow simulations. Comput Fluids 92: 160-171.
 Darian HM, Esfahanian V (2014) Assessment of WENO schemes for multi‐dimensional Euler equations using GPU. Int J Numer Meth Fl 76(12): 961-981.
 Zolfaghari A, Foadaddini A (2016) Developing new Checkerboard Thomas algorithm for solving tridiagonal set of equations on GPU. Modares Mechanical Engineering 16(2): 309-318. (in Persian)
 Beck JV, Wright N, Haji-Sheikh A, Cole KD, Amos D (2008) Conduction in rectangular plates with boundary temperatures specified. Int J Heat Mass Transfer 51: 4676-4690.