[1] Du P, Weber R, Luszczek P, Tomov S, Peterson G, Dongarra J (2012) From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming. Parallel Comput 38(8): 391-407.
[2] Sengupta S, Harris M, Zhang Y, Owens JD (2007) Scan primitives for GPU computing. Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware, San Diego, California.
[3] Sakharnykh N (2009) Tridiagonal solvers on the GPU and applications to fluid simulation. NVIDIA GPU Technology Conference, San Jose, California, USA.
[4] Zhang Y, Cohen J, Owens JD (2010) Fast tridiagonal solvers on the GPU. ACM Sigplan Notices 45(5): 127-136.
[5] Göddeke D, Strzodka R (2011) Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid. IEEE T Parall Distr 22(1): 22-32.
[6] Davidson A, Owens JD (2011) Register packing for cyclic reduction: a case study, Proceedings of the fourth workshop on general purpose processing on graphics processing units, Newport Beach, California, USA.
[7] Sakharnykh N (2010) Efficient tridiagonal solvers for ADI methods and fluid simulation. NVIDIA GPU Technology Conference, San Jose, California, USA.
[8] Egloff D (2011) Pricing financial derivatives with high performance finite difference solvers on GPUs, Wen-mei W. Hwu (Eds.), GPU Computing Gems Jade Edition, pp. 23.309-23.322, Burlington: Morgan Kaufmann.
[9] Zhang Y, Cohen J, Davidson AA, Owens JD (2011) A Hybrid Method for Solving Tridiagonal Systems on the GPU, Wen-mei W. Hwu (Eds.), GPU Computing Gems Jade Edition, pp. 11.117-11.132, Burlington: Morgan Kaufmann.
[10] Wei Z, Jang B, Zhang Y, Jia Y (2013) Parallelizing alternating direction implicit solver on GPUs, Procedia Computer Science. 18: 389-398.
[11] Kim HS, Wu S, Chang LW, Hwu WMW (2011) A scalable tridiagonal solver for gpus. International Conference on Parallel Processing (ICPP), Taipei, Taiwan.
[12] Esfahanian V, Darian HM, Gohari SI (2013) Assessment of WENO schemes for numerical simulation of some hyperbolic equations using GPU. Comput Fluids 80: 260-268.
[13] Esfahanian V, Baghapour B, Torabzadeh M, Chizari H (2014) An efficient GPU implementation of cyclic reduction solver for high-order compressible viscous flow simulations. Comput Fluids 92: 160-171.
[14] Darian HM, Esfahanian V (2014) Assessment of WENO schemes for multi‐dimensional Euler equations using GPU. Int J Numer Meth Fl 76(12): 961-981.
[15] Zolfaghari A, Foadaddini A (2016) Developing new Checkerboard Thomas algorithm for solving tridiagonal set of equations on GPU. Modares Mechanical Engineering 16(2): 309-318. (in Persian)
[16] Beck JV, Wright N, Haji-Sheikh A, Cole KD, Amos D (2008) Conduction in rectangular plates with boundary temperatures specified. Heat Mass Transfer 51: 4676-4690.