Industrial computed tomography (CT) is an important means of non-destructive testing (NDT) in industry, and the reconstruction algorithm is the core technique of an industrial CT system. With the development of industrial CT systems based on flat-panel detectors, reconstructing a 3D image directly from 2D projection data has become the mainstream of 3D CT research. Among the 3D cone-beam reconstruction algorithms, FDK and its derivatives remain the main methods in practical use. Industrial CT systems inspect large objects at high image resolution, so the volume of data to be reconstructed is large. How to make full use of existing software and hardware to improve reconstruction speed and accuracy is therefore an important topic in applied industrial CT research. Acceleration on commodity PC graphics boards (GPUs) is a hardware approach well suited to the characteristics of CT reconstruction.

This thesis studies the principles of the 3D cone-beam FDK algorithm and the algorithms derived from it: the standard FDK, G-FDK, HS-FDK, P-FDK, T-FDK and HT-FDK algorithms. The characteristics of these FDK-type algorithms, and the applicability and difficulties of accelerating each of them on a GPU, are analyzed and compared. The analysis shows that GPU acceleration of FDK-type reconstruction falls into two classes: one represented by the acceleration of the standard FDK algorithm, and the other applying to algorithms based on re-binning, among which T-FDK is the most suitable for GPU implementation.

After analyzing the main technical structure and implementation mechanisms of GPU-based CT reconstruction, the thesis discusses the 3D projection and back-projection processes and accelerates each of them on the GPU. On this basis, it describes in detail the applicability of GPU acceleration to the 3D cone-beam FDK and T-FDK algorithms, designs the main steps and concrete methods of the accelerated implementations, and discusses their advantages and remaining problems. Because the projection data of T-FDK are re-binned, its accelerated implementation uses a grid mapping method.

The GPU-accelerated FDK and T-FDK algorithms are implemented with Visual C++ and Cg 1.3, with OpenGL as the application programming interface (API). Compared with a software implementation that had undergone preliminary symmetry optimization, the GPU-accelerated FDK algorithm achieves a speedup of 33.935 with the ordinary 8-bit frame buffer and 14.099 with a P-Buffer. For the T-FDK algorithm, the speedup is 28.956 with the ordinary frame buffer and 12.030 with a P-Buffer. These results indicate that running the 3D cone-beam FDK and T-FDK algorithms on a GPU yields a significant acceleration. The ordinary frame buffer has the advantage in speed and the P-Buffer in image precision, so each has its own range of application.
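The back-projection step that dominates FDK reconstruction can be illustrated with a minimal CPU sketch in Python. This is not the thesis's Cg/OpenGL implementation. The geometry is an assumption made for illustration: a circular source orbit of radius `source_dist` in the z = 0 plane, with the flat detector rescaled to a virtual plane through the rotation axis. The function names and the `sample` callback (which stands in for a lookup into filtered projection data) are hypothetical, and constant normalization factors are omitted.

```python
import math


def fdk_backprojection_coords(x, y, z, beta, source_dist):
    """Map voxel (x, y, z) to virtual-detector coordinates (u, v) and its FDK weight.

    Assumed geometry (illustrative, not taken from the thesis): the source sits at
    source_dist * (cos(beta), sin(beta), 0) and the detector is scaled to a
    virtual plane through the rotation axis.
    """
    s = x * math.cos(beta) + y * math.sin(beta)    # voxel coordinate along the central ray
    t = -x * math.sin(beta) + y * math.cos(beta)   # voxel coordinate across the central ray
    depth = source_dist - s                        # source-to-voxel distance along the central ray
    if depth <= 0:
        raise ValueError("voxel lies at or behind the source")
    mag = source_dist / depth                      # cone-beam magnification at this voxel
    return t * mag, z * mag, mag * mag             # u, v, distance weight D^2 / (D - s)^2


def backproject_voxel(x, y, z, betas, source_dist, sample):
    """Accumulate weighted filtered-projection samples for one voxel.

    sample(beta, u, v) is a hypothetical callback returning the filtered
    projection value at detector position (u, v) for source angle beta.
    The angular step and constant factors are left out of this sketch.
    """
    total = 0.0
    for beta in betas:
        u, v, weight = fdk_backprojection_coords(x, y, z, beta, source_dist)
        total += weight * sample(beta, u, v)
    return total
```

Every voxel is processed independently with the same arithmetic, and the per-angle work is a coordinate mapping, a data lookup and a weighted accumulation. Operations of this shape map naturally onto texture sampling and blending hardware, which is one way to read the abstract's statement that GPU acceleration suits CT reconstruction.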