[2B6] Uncertainty quantification for deep learning applied to ultrasonic inline pipe inspection

R Pyle¹,², P Wilcox¹, R Hughes¹ and A Ait Si Ali²
¹University of Bristol, UK
²Baker Hughes, UK

As NDE datasets become larger and more complex, the bottleneck is increasingly the human operator who must judge the condition of the part. Whilst the purpose of acquiring data is to aid defect classification, too much data for a human to interpret at once can be counterproductive. Since data interpretation is a pattern-recognition task, machine learning is a well-suited alternative to human interpretation. However, a major barrier to its adoption by industry is the difficulty of qualifying NDE inspections without some degree of uncertainty quantification (UQ). Physics-based methods often provide UQ by virtue of being explainable in terms of how they interpret data. In contrast, despite the success of deep learning in providing human-level analysis of NDE data, there has been little research into its UQ.

This paper aims to show how UQ for deep learning in NDE is possible by presenting two modern UQ methods: Monte-Carlo dropout and deep ensembles. An example application of ultrasonic inline pipe inspection is considered and a convolutional neural network (CNN) is used as the deep learning architecture, but the general approach may be applied to any inspection scenario and any machine learning technique. The training set used in this paper is simulated using a hybrid finite element and ray-based method. 14,500 datasets of A-scan responses from surface-breaking defects are simulated for training and validation purposes, while a small pool of 1485 experimental datasets from electrical discharge machined (EDM) notches is used for further validation and testing. Four distinct plane wave imaging (PWI) images are input to the CNN, which predicts the remaining wall thickness; detection of the defect is assumed to be already complete.
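To illustrate the first of the two UQ methods, the following is a minimal sketch of Monte-Carlo dropout: dropout is left active at inference time, the network is run many times on the same input, and the spread of the stochastic predictions is taken as an uncertainty estimate. The tiny fully-connected regressor, its weights and the dropout rate below are arbitrary placeholders standing in for the paper's CNN, not its actual architecture. (Deep ensembles proceed analogously, but the predictions are drawn from several independently trained networks rather than from repeated stochastic passes through one.)

```python
import random
import statistics

# Placeholder one-hidden-layer regressor (illustration only; the paper
# uses a CNN on PWI images to predict remaining wall thickness).
W1 = [[0.5, -0.2], [0.3, 0.8], [-0.6, 0.1], [0.9, 0.4]]  # 4 hidden units, 2 inputs
W2 = [0.7, -0.5, 0.2, 0.6]                               # output weights
P_DROP = 0.5                                             # dropout probability

def forward(x, rng):
    """One stochastic forward pass: dropout stays ACTIVE at test time."""
    h = []
    for w in W1:
        a = max(0.0, w[0] * x[0] + w[1] * x[1])          # ReLU pre-activation
        keep = rng.random() >= P_DROP                    # Bernoulli dropout mask
        h.append(a / (1.0 - P_DROP) if keep else 0.0)    # inverted-dropout scaling
    return sum(wi * hi for wi, hi in zip(W2, h))

def mc_dropout_predict(x, n_passes=200, seed=0):
    """Mean over passes = prediction; sample std = uncertainty estimate."""
    rng = random.Random(seed)
    samples = [forward(x, rng) for _ in range(n_passes)]
    return statistics.mean(samples), statistics.stdev(samples)

mean, std = mc_dropout_predict([1.0, 2.0])
```

Because the dropout masks differ between passes, `std` is strictly positive and grows for inputs on which the network's sub-models disagree.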

The success of the UQ is evaluated by calibration (ie whether the predicted uncertainty scales with the actual error) and by anomaly detection performance (ie whether high uncertainty is assigned to out-of-training-domain data). Calibration is tested using simulated and experimental images of surface-breaking cracks, while anomaly detection is tested using experimental side-drilled holes and simulated embedded cracks. Approaches to increase the overall interpretability of deep learning methods will also be discussed.
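A simple way to quantify calibration for a regressor with predictive uncertainty is an empirical coverage check: for a well-calibrated Gaussian predictive distribution, roughly 95% of true values should fall within two predicted standard deviations of the prediction. The sketch below, with synthetic data in place of the wall-thickness measurements, is an illustrative assumption about how such a check could be implemented, not the paper's evaluation code.

```python
import random

def empirical_coverage(y_true, y_pred, y_std, k=2.0):
    """Fraction of samples whose error lies within k predicted std devs."""
    inside = sum(abs(t - p) <= k * s
                 for t, p, s in zip(y_true, y_pred, y_std))
    return inside / len(y_true)

# Synthetic stand-in data: errors drawn to match the reported uncertainty,
# so the model is well calibrated by construction and coverage ~ 95%.
rng = random.Random(1)
y_true = [rng.uniform(0.0, 10.0) for _ in range(5000)]
y_std = [0.5] * len(y_true)
y_pred = [t + rng.gauss(0.0, s) for t, s in zip(y_true, y_std)]

coverage = empirical_coverage(y_true, y_pred, y_std)
```

An over-confident model (predicted std too small) would yield coverage well below the nominal 95%, while an under-confident one would exceed it; the same interval logic can flag out-of-domain inputs when their predicted uncertainty is unusually large.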

Keywords: deep learning, uncertainty quantification, defect characterisation, ultrasound, simulation.