Novel Signatures from Deployed Sensors for Natural Gas Transmission Pipelines
Project Number
Last Reviewed Dated

The overall goal is to leverage advances in machine learning and predictive analytics to advance the state of the art in pipeline infrastructure integrity management by forecasting pipeline condition from large sets of pipeline integrity data (periodic in-line inspection, ILI) and operational data (e.g., sensor data used to monitor flow rate and temperature) generated by natural gas transmission (NGT) pipeline operators. The current project represents Phase I, which focuses on training machine learning (ML) models on existing ILI and operational data. Phase I will culminate in research-grade diagnostic and prognostic ML models that are ready for “beta testing” under a follow-on Phase II project with the same NGT pipeline partners that provided data for ML model training during Phase I.


Pacific Northwest National Laboratory (PNNL)


NGT pipeline networks are critical infrastructures whose reliability is essential to sustaining energy sector operations and the U.S. economy. Establishing the reliability of NGT pipelines requires periodic ILI to assess their leak integrity.

ILI techniques are typically used to detect the presence of degradation such as axial cracking, mechanical damage, or corrosion in NGT pipelines. Typically, the detection of degradation triggers other analysis techniques, often based on structural mechanics, for assessing the structural integrity of pipelines. ILI techniques that have been applied for pipeline integrity assessment include magnetic flux leakage (MFL) and ultrasound.

Fundamentally, operators need to determine the current health of the pipeline network and quantify future changes in the health and reliability of the network. Such information provides the basis for predictive operational and maintenance decision making and timely preventative maintenance, which allow operators to optimize resource allocation for inspection and maintenance and to mitigate or prevent leaks and ruptures (pipeline failures). Part of the challenge in addressing these needs is the large amount of diverse data available from ILI measurements across the network, as well as the volume of operational data that may exist.

The sheer volume of ILI and operational data used in integrity management of aging pipelines and decision-making on operations and maintenance for integrity management actions points to the need for advances in data analysis methods to support cost-efficient, risk-informed decisions. Advanced data analysis methods may be able to provide new insights from existing sensor data for decision-making purposes and enable the creation of a Pipeline Reliability & Lifecycle Management System that:

  • Uses real-time sensor measurements from participating NGT pipeline operators to assess the reliability of large pipeline networks and determine their fitness for service. Specifically, the sensor measurements may be used to assign a Pipeline Health Index (PHI) that reflects the current condition of today’s pipelines and be readily updated as new data are generated.
  • Uses the PHI data with well-established pipeline degradation models to generate data-driven estimates of the remaining service life of today’s NGT pipelines, thereby providing a quantitative measure of the anticipated changes to the reliability of pipeline networks.
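
As a rough illustration of how a degradation model turns a current condition into a remaining-life estimate, the sketch below assumes a constant (linear) corrosion rate and a depth limit of 80% of wall thickness. Both are simplifying assumptions made here for illustration, and the function and parameter names are hypothetical; the project text does not specify which degradation models or limits are used.

```python
def remaining_life_years(depth_mm, wall_mm, rate_mm_per_yr, limit_fraction=0.8):
    """Years until a flaw growing at a constant rate reaches the depth limit.

    Assumptions (illustrative only): linear growth, and a limit expressed
    as a fraction of wall thickness.
    """
    if rate_mm_per_yr <= 0:
        # No measurable growth: a finite remaining life cannot be estimated.
        return float("inf")
    limit_mm = limit_fraction * wall_mm
    # A flaw already at or beyond the limit has zero remaining life.
    return max(0.0, (limit_mm - depth_mm) / rate_mm_per_yr)


# Example: a 3 mm deep flaw in a 10 mm wall growing at 0.25 mm/yr.
example = remaining_life_years(depth_mm=3.0, wall_mm=10.0, rate_mm_per_yr=0.25)
print(example)  # 20.0
```

The estimate is only as good as the assumed rate, which is why selecting a corrosion rate per pipeline is treated as a modeling problem in its own right.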

Within this context, machine learning and predictive data analytics offer new opportunities to glean information from large historical data sets (ILI and operational data such as flow and temperature) from NGT pipeline operators to: (1) identify ILI tool signatures that are useful for detection of degradation impacting the health of pipelines; (2) obtain a better understanding of the health of pipelines as a whole; and (3) use the data with well-established pipeline degradation models to develop methods for forecasting when and where in the network pipeline integrity and reliability may be compromised. Collectively, this suite of technologies for detecting and characterizing the current health of pipeline networks and predicting future changes in pipeline health supports efficient operations, maintenance planning, and planning for critical infrastructure upgrades. Insights from these technologies, along with information on application measurement needs and metrics, operational environment, and deployment attributes (size, weight, power, cost, robustness), also enable better choices of sensor technology to meet a wide range of measurement needs.


The diagnostic analytics to be developed are aimed at (1) improving detection of degradation (specifically external corrosion as a starting point) by MFL and/or acoustic (e.g., electromagnetic acoustic transducer) ILI tools in NGT pipelines and (2) improving flaw dimensioning, which will improve the accuracy of failure pressure calculations and help operators reduce the risk of pipeline failures. The prognostic analytics to be developed are aimed at improving the selection of a corrosion rate for a pipeline based on its history, environment, and material properties to enable more confident predictions of remaining useful life, enhance situational awareness of future risk, support decisions about where to focus inspections, and inform the timing of preventative maintenance to mitigate risks of pipeline failure (leak or rupture).
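
To make the link between flaw dimensioning and failure pressure concrete, the sketch below uses the Modified B31G (0.85 dL) formulation, one widely used assessment method for corrosion flaws. The choice of method is an assumption made here for illustration; the project text does not state which failure pressure calculation the partners use. The equations are written as they are commonly stated and should be checked against the current standard before any real use.

```python
import math


def modified_b31g_failure_pressure(D, t, d, L, smys_mpa):
    """Estimated failure pressure (MPa) of a pipe with a corrosion flaw.

    D: outside diameter, t: wall thickness, d: flaw depth, L: flaw axial
    length (all in the same length unit); smys_mpa: specified minimum
    yield strength in MPa. Illustrative sketch, not a design calculation.
    """
    flow_stress = smys_mpa + 68.95  # SMYS + 10 ksi, expressed in MPa
    z = L ** 2 / (D * t)
    if z <= 50:
        bulging = math.sqrt(1 + 0.6275 * z - 0.003375 * z ** 2)
    else:
        bulging = 0.032 * z + 3.3
    ratio = d / t
    failure_stress = flow_stress * (1 - 0.85 * ratio) / (1 - 0.85 * ratio / bulging)
    return 2 * failure_stress * t / D


# Hypothetical flaw: 610 mm pipe, 10 mm wall, 200 mm long flaw, SMYS 359 MPa.
# Compare a depth sized at 4 mm against the same flaw sized at 5 mm.
p_shallow = modified_b31g_failure_pressure(610, 10, 4.0, 200, 359)
p_deep = modified_b31g_failure_pressure(610, 10, 5.0, 200, 359)
print(f"depth 4 mm: {p_shallow:.2f} MPa, depth 5 mm: {p_deep:.2f} MPa")
```

Because depth and length enter the calculation directly, a sizing error of 10% of wall thickness shifts the computed failure pressure, which is the motivation for improving flaw dimensioning.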

Accomplishments (most recent listed first)

Scientific Achievements:

  • In FY22 Q1 and Q2, the PNNL team continued to improve diagnostic model performance and make progress in prognostic model development.
  • On the diagnostic model development task, the team focused on validating the finite-element model used to expand the training data for the defect identification and surface reconstruction models. The team also developed, and continues to update, a new defect segmentation workflow that partially automates the extraction of new data points from pull data and improves the accuracy of ground-truth information.
  • On the prognostic model development task, the team curated large datasets from two industry partners and developed an integrated database that provides a uniform format for ILI datasets from different timeframes and operators. The team analyzed the ILI data from both partners and estimated corrosion growth rate distributions from repeat scans (scans of the same pipeline segment on two or more occasions) of several oil and gas pipelines. The team focused on joint-aggregate-based rates, which eliminate the need for defect-to-defect matching between repeat scans. The team also evaluated the usefulness of different statistical modeling approaches, especially defect variability estimates on the ILI data, to support the development of a prognostic model relating pipe metal loss due to corrosion to pipe data including pipe age, material, coating, location, and other factors.
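
The idea of a joint-aggregate rate can be sketched as follows. Each scan is reduced to one number per pipe joint, and the rate is the change in that number divided by the time between scans, so individual defects never need to be matched. The aggregation used here (deepest reported metal loss per joint) and the data layout are assumptions made for illustration; the project text does not specify how the team aggregates features.

```python
from collections import defaultdict


def aggregate_by_joint(features):
    """Map joint_id -> deepest reported metal loss (% wall) in that joint.

    `features` is an iterable of (joint_id, depth_pct) pairs; this layout
    is hypothetical.
    """
    deepest = defaultdict(float)
    for joint_id, depth_pct in features:
        deepest[joint_id] = max(deepest[joint_id], depth_pct)
    return dict(deepest)


def joint_growth_rates(scan_a, scan_b, years_between):
    """Per-joint growth rate (% wall per year) for joints seen in both scans."""
    a = aggregate_by_joint(scan_a)
    b = aggregate_by_joint(scan_b)
    return {j: (b[j] - a[j]) / years_between for j in a.keys() & b.keys()}


# Two made-up scans taken five years apart.
first_scan = [("J1", 12.0), ("J1", 20.0), ("J2", 15.0)]
second_scan = [("J1", 25.0), ("J1", 18.0), ("J2", 14.0), ("J3", 10.0)]
rates = joint_growth_rates(first_scan, second_scan, years_between=5.0)
print(sorted(rates.items()))
```

In this made-up example joint J2 yields a small negative rate, which cannot be physical and would reflect tool sizing error. Joint J3 is skipped because it has no reported feature in the first scan.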

Industry Data Share Partners:

  • Continued the first successful partnership established on the project with a major NGT pipeline operator, who signed another non-disclosure agreement (NDA) with PNNL on 2/9/2022 to extend the original NDA (signed on March 12, 2021) to 3/12/2024. Over 600 MB of data have been shared by the operator with PNNL since May 19, 2021. The datasets include pipeline attribute data (e.g., material properties); corrosion flaw dimensions from ILI data previously analyzed by ILI service providers; pipeline operating and service history; and pipeline locations. The data are being analyzed to quantify corrosion rate and prioritize the pipeline attributes and environmental factors that are most strongly correlated with corrosion rate (to support prognostic modeling).
  • Continued the second successful partnership established on the project with a second major NGT pipeline operator, who signed another Material Transfer Agreement (MTA) with PNNL on 2/11/2022 to extend the original MTA (signed March 22, 2021) to 9/30/2023. Another 2 GB of data were shared by the operator with PNNL on April 23 and May 21, 2021, bringing the total data shared to date to 80 GB. These data are also being analyzed to quantify corrosion rate and prioritize the pipeline attributes and environmental factors that are most strongly correlated with corrosion rate (to support prognostic modeling).
  • Continued the third successful partnership established on the project with a pipeline in-line inspection (ILI) technology company/service provider, who signed another NDA with PNNL on 7/21/2021 to extend the original NDA (signed 7/17/2020) to 7/17/2023. Over 40 GB of data have been shared by the company with PNNL since July 30, 2020. The data are metadata and ground-truth (laser profile) data that correspond with magnetic flux leakage (MFL) signal datasets for the diagnostic model being developed for digital flaw reconstruction. PNNL and the ILI partner successfully collaborated on an abstract and conference paper for the ASME 2022 14th International Pipeline Conference, Collaborating for Sustainability (IPC2022), which will take place in Calgary, AB, Canada, September 26-30, 2022. The conference paper will focus on diagnostic model progress.
  • Maintained ongoing communication and collaboration with the Pipeline Research Council International (PRCI) to define new partnership activities for the coming years. The project team delivered a technical talk at the PRCI’s REX 2022 Conference on 3/8/2022. The team also coordinated with the PRCI to deliver a presentation titled Machine Learning-based Prediction of Future Pipe Condition at the joint PRCI-PNNL webinar on 10/13/2020. The webinar was attended by over 100 participants from the U.S. and other countries and generated impetus for broader industry partnership.

Technical Presentations and Conferences:

  • Abstract submitted to the International Pipeline Conference 2022 has been accepted.
  • Delivered a presentation at the PRCI REX 2022 conference in March 2022.
  • Delivered a project summary/progress presentation at the 2021 Carbon Management and Oil and Gas Research Project Review Meeting on 8/24/2021.
  • Delivered a project summary/progress presentation at the 2020 Carbon Management and Oil and Gas Research Project Review Meeting on 10/8/2020.
  • Delivered an oral presentation on Natural Gas Transmission Prognostics with Machine Learning: Using Novel Signatures from Deployed Transmission Infrastructure & Sensors at Pipeline Research Council International (PRCI) Research Exchange (REX) 2020, March 3-4, San Diego, CA.

Continued industry outreach and communications:

  • Extended existing NDAs and MTA with industry partners to facilitate ongoing collaboration for the duration of the project.
  • Continued to receive additional data from and conduct technical exchanges with industry partners in data analysis and model development.
Current Status

(May 2022)

The team will continue to refine the diagnostic model and develop the prognostic model through the remainder of FY22.

The team achieved a milestone with a diagnostic flaw reconstruction prototype. The goal of this model is to predict the defect surface (not just its geometric properties) from raw (calibrated) MFL data. The preliminary results, along with a discussion of the overall diagnostic and prognostic workflows, were documented in a conference paper submitted to IPC 2022 in late April. The diagnostic dataset of experimental data from machined and natural corrosion flaws is now being expanded using a recently validated finite-element (FE) model of the MFL tool. Simulated MFL pull-test data from modeled corrosion defects will be used to interpolate and expand the experimental dataset for the diagnostic models.

The prognostics team has achieved two milestones. First, a working training dataset of corrosion features was completed based on statistical analysis and subject matter expert (SME) advice on the most relevant features for prediction. Second, a baseline for the prognostics model was completed and preliminary results were obtained. This baseline provides the point of comparison for further model development and shows that the training dataset is suitable for integration into the model. The existing prognostic framework uses random forest models, while deep learning variants are currently being evaluated. Development of models to quantify the uncertainty in the prognostic predictions due to sensor and matching uncertainties is also underway to help assess accuracy. The team also made significant progress in quantifying corrosion rates with defect-matching and joint-aggregate-based methods. In the absence of ground-truth data, the estimates from these methods could serve either as bounds for the corrosion rate or as candidate training targets for the models. Discussion with partners about the estimated rates resulted in additional data sharing from one of the partners, which included dig reports and measurements. The team is currently evaluating these reports.
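
One simple way to see how sensor (sizing) uncertainty propagates into a corrosion rate estimate is a Monte Carlo simulation, sketched below. It assumes independent, normally distributed sizing errors with the same standard deviation on both scans, and it covers sensor uncertainty only, not matching uncertainty. These assumptions, the numbers, and the function name are illustrative and are not taken from the project.

```python
import random
import statistics


def simulate_rate_distribution(depth_a, depth_b, years, sizing_sd, n=10000, seed=0):
    """Mean and standard deviation of the rate implied by two noisy depth readings.

    depth_a, depth_b: reported depths (% wall) at the first and second scan;
    sizing_sd: assumed standard deviation of the tool sizing error (% wall).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rates = []
    for _ in range(n):
        a = rng.gauss(depth_a, sizing_sd)
        b = rng.gauss(depth_b, sizing_sd)
        rates.append((b - a) / years)
    return statistics.mean(rates), statistics.stdev(rates)


# Reported depths of 20% and 25% wall, five years apart, with an assumed
# sizing standard deviation of 10% wall on each reading.
mean_rate, sd_rate = simulate_rate_distribution(20.0, 25.0, 5.0, sizing_sd=10.0)
print(f"mean {mean_rate:.2f} %/yr, sd {sd_rate:.2f} %/yr")
```

Under these assumptions the spread of the rate is about sqrt(2) × sizing_sd / years, which here is several times larger than the nominal rate of 1% wall per year. This illustrates why a rate from a single pair of readings is weak evidence, and why distributions over many features or joints are more informative.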

The team will present the progress to date to the DOE on 6/1/2022. The anticipated outcome of the presentation is to enable the completion of the remaining project scope.

Based on the technical roadmap, the team will produce research-grade diagnostic and prognostic models by the end of Phase I (FY19-FY23). The team will complete additional alpha and beta testing of the models to produce a Pipeline Health Display (PHD) v1.0 with a user-friendly GUI by the end of Phase II (to be proposed/funded; FY24-FY25). By the end of Phase III (to be proposed/funded; FY25-FY26), the team will complete technology transfer of the verified and validated PHD v1.0 to partner operators. In short, the team anticipates that external users will be able to use PHD v1.0 in FY25 upon the completion of technology transfer, but industry partners will be involved in model and GUI testing prior to the official technology transfer and release.

Project Start
Project End
DOE Contribution


Performer Contribution


Contact Information

NETL – Eric Smistad (832-603-0435)
PNNL – Kayte Denslow (509-375-2232)
PNNL – Steven Rosenthal (509-375-4375)