Deep learning in photoacoustic imaging


Deep learning in photoacoustic imaging combines the hybrid imaging modality of photoacoustic imaging with the rapidly evolving field of deep learning. Photoacoustic imaging is based on the photoacoustic effect, in which optical absorption causes a rise in temperature, which causes a subsequent rise in pressure via thermo-elastic expansion. This pressure rise propagates through the tissue and is sensed via ultrasonic transducers. Due to the proportionality between the optical absorption, the rise in temperature, and the rise in pressure, the ultrasound pressure wave signal can be used to quantify the original optical energy deposition within the tissue.
Photoacoustic imaging has applications of deep learning in both photoacoustic computed tomography and photoacoustic microscopy. PACT utilizes wide-field optical excitation and an array of unfocused ultrasound transducers. Similar to other computed tomography methods, the sample is imaged at multiple view angles, which are then used to perform an inverse reconstruction algorithm based on the detection geometry to elicit the initial pressure distribution within the tissue. PAM on the other hand uses focused ultrasound detection combined with weakly-focused optical excitation or tightly-focused optical excitation. PAM typically captures images point-by-point via a mechanical raster scanning pattern. At each scanned point, the acoustic time-of-flight provides axial resolution while the acoustic focusing yields lateral resolution.

Applications of deep learning in PACT

The one of the first applications of deep learning in PACT was by Reiter et al. in which a deep neural network was trained to learn spatial impulse responses and locate photoacoustic point sources. The resulting mean axial and lateral point location errors on 2,412 of their randomly selected test images were 0.28 mm and 0.37 mm respectively. After this initial implementation, the applications of deep learning in PACT have branched out primarily into removing artifacts from acoustic reflections, sparse sampling, limited-view, and limited-bandwidth. There has also been some recent work in PACT toward using deep learning for wavefront localization.. There have been networks based on fusion of information from two different reconstructions to improve the reconstruction using deep learning fusion based networks.

Using deep learning to locate photoacoustic point sources

Traditional photoacoustic beamforming techniques modeled photoacoustic wave propagation by using detector array geometry and the time-of-flight to account for differences in the PA signal arrival time. However, this technique failed to account for reverberant acoustic signals caused by acoustic reflection, resulting in acoustic reflection artifacts that corrupt the true photoacoustic point source location information. In Reiter et al., a convolutional neural network was used that took pre-beamformed photoacoustic data as input and outputted a classification result specifying the 2-D point source location.

Removing acoustic reflection artifacts (in the presence of multiple sources and channel noise)

Building on the work of Reiter et al., Allman et al. utilized a full VGG-16 architecture to locate point sources and remove reflection artifacts within raw photoacoustic channel data. This utilization of deep learning trained on simulated data produced in the MATLAB , and then later reaffirmed their results on experimental data.

Ill-posed PACT reconstruction

In PACT, tomographic reconstruction is performed, in which the projections from multiple solid angles are combined to form an image. When reconstruction methods like filtered backprojection or time reversal, are ill-posed inverse problems due to sampling under the Nyquist-Shannon's sampling requirement or with limited-bandwidth/view, the resulting reconstruction contains image artifacts. Traditionally these artifacts were removed with slow iterative methods like total variation minimization, but the advent of deep learning approaches has opened a new avenue that utilizes a priori knowledge from network training to remove artifacts. In the deep learning methods that seek to remove these sparse sampling, limited-bandwidth, and limited-view artifacts, the typical workflow involves first performing the ill-posed reconstruction technique to transform the pre-beamformed data into a 2-D representation of the initial pressure distribution that contains artifacts. Then, a convolutional neural network is trained to remove the artifacts, in order to produce an artifact-free representation of the ground truth initial pressure distribution.

Using deep learning to remove sparse sampling artifacts

When the density of uniform tomographic view angles is under what is prescribed by the Nyquist-Shannon's sampling theorem, it is said that the imaging system is performing sparse sampling. Sparse sampling typically occurs as a way of keeping production costs low and improving image acquisition speed. The typical network architectures used to remove these sparse sampling artifacts are U-net and Fully Dense U-net. Both of these architectures contain a compression and decompression phase. The compression phase learns to compress the image to a latent representation that lacks the imaging artifacts and other details. The decompression phase then combines with information passed by the residual connections in order to add back image details without adding in the details associated with the artifacts. FD U-net modifies the original U-net architecture by including dense blocks that allow layers to utilize information learned by previous layers within the dense block. Another technique was proposed using a simple CNN based architecture for removal of artifacts and improving the k-wave image reconstruction.

Removing limited-view artifacts with deep learning

When a region of partial solid angles are not captured, generally due to geometric limitations, the image acquisition is said to have limited-view. As illustrated by the experiments of Davoudi et al., limited-view corruptions can be directly observed as missing information in the frequency domain of the reconstructed image. Limited-view, similar to sparse sampling, makes the initial reconstruction algorithm ill-posed. Prior to deep learning, the limited-view problem was addressed with complex hardware such as acoustic deflectors and full ring-shaped transducer arrays, as well as solutions like compressed sensing, weighted factor, and iterative filtered backprojection. The result of this ill-posed reconstruction is imaging artifacts that can be removed by CNNs. The deep learning algorithms used to remove limited-view artifacts include U-net and FD U-net, as well as generative adversarial networks and volumetric versions of U-net. One GAN implementation of note improved upon U-net by using U-net as a generator and VGG as a discriminator, with the Wasserstein metric and gradient penalty to stabilize training.

Limited-bandwidth artifact removal with deep neural networks

The limited-bandwidth problem occurs as a result of the ultrasound transducer array's limited detection frequency bandwidth. This transducer array acts like a band-pass filter in the frequency domain, attenuating both high and low frequencies within the photoacoustic signal. This limited-bandwidth can cause artifacts and limit the axial resolution of the imaging system. The primary deep neural network architectures used to remove limited-bandwidth artifacts have been WGAN-GP and modified U-net. The typical method to remove artifacts and denoise limited-bandwidth reconstructions before deep learning was Wiener filtering, which helps to expand the PA signal's frequency spectrum. The primary advantage of the deep learning method over Wiener filtering is that Wiener filtering requires a high initial signal-to-noise ratio, which is not always possible, while the deep learning model has no such restriction.
Fusion of information for improving photoacoustic Images with deep neural networks
The complementary information is utilized using fusion based architectures for improving the photoacoustic image reconstruction. Since different reconstructions promote different characteristics in the output and hence the image quality and characteristics vary if a different reconstruction technique is used. A novel fusion based architecture was proposed to combine the output of two different reconstructions and give a better image quality as compared to any of those reconstructions. It includes weight sharing, and fusion of characteristics to achieve the desired improvement in the output image quality.

Applications of deep learning in PAM

Photoacoustic microscopy differs from other forms of photoacoustic tomography in that it uses focused ultrasound detection to acquire images pixel-by-pixel. PAM images are acquired as time-resolved volumetric data that is typically mapped to a 2-D projection via a Hilbert transform and maximum amplitude projection. The first application of deep learning to PAM, took the form of a motion-correction algorithm. This procedure was posed to correct the PAM artifacts that occur when an in vivo model moves during scanning. This movement creates the appearance of vessel discontinuities.

Deep learning to remove motion artifacts in PAM

The two primary motion artifact types addressed by deep learning in PAM are displacements in the vertical and tilted directions. Chen et al. used a simple three layer convolutional neural network, with each layer represented by a weight matrix and a bias vector, in order to remove the PAM motion artifacts. Two of the convolutional layers contain RELU activation functions, while the last has no activation function. Using this architecture, kernel sizes of 3 × 3, 4 × 4, and 5 × 5 were tested, with the largest kernel size of 5 × 5 yielding the best results. After training, the performance of the motion correction model was tested and performed well on both simulation and in vivo data.