
The oil and gas data production line

Thursday, September 11, 2014

The oil and gas industry can be seen as a data production line, with data flowing between different departments and gradually getting refined. With each step, some of the value is lost. Ketan Puri of Infosys explains.

As data is processed, it loses value through loss of granularity and the time it takes to reach the enterprise applications.

The data is broken up and transformed to meet the needs of different applications, which in turn makes it difficult for analytical tools to extract real value.

Extracting the real value then depends on multiple data sources with proprietary data formats. This costs time and limits the enterprise's ability to make timely decisions.

Vertical steps
Many steps are involved in the vertical journey of data from source to upstream enterprise applications.

First, the raw data is captured by the on-premise upstream systems.

It gets enriched and transformed into industry standard formats.

The data is extracted, transformed, published, and loaded into the enterprise data centres. The data centres aggregate this data and make it available to the business applications for consumption.

Data is filtered or transformed, or new data is synthesized in different formats, based on application needs and network limitations.

A grain of data is the lowest level detail created by the source system identifying the first occurrence of an event characterized by a set of parameters. The granularity of data is a measure of how close it is to the grain.
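The effect of moving away from the grain can be illustrated with a minimal sketch. The readings, the 60 second window and the `aggregate` helper below are hypothetical and are not taken from any particular upstream system; the point is only that averaging raw readings into coarser windows can hide a short event.

```python
from statistics import mean

# Hypothetical raw pressure readings: (seconds since start, pressure in psi).
raw = [(0, 100.0), (20, 101.0), (40, 180.0), (60, 102.0), (80, 100.0), (100, 99.0)]

def aggregate(readings, window_s):
    """Average readings into fixed windows of window_s seconds."""
    buckets = {}
    for t, value in readings:
        buckets.setdefault(t // window_s, []).append(value)
    # One (window start, mean value) pair per window, in time order.
    return [(k * window_s, mean(v)) for k, v in sorted(buckets.items())]

per_minute = aggregate(raw, 60)

# The highest value visible at each level of granularity.
raw_peak = max(value for _, value in raw)
aggregated_peak = max(value for _, value in per_minute)
print(raw_peak, aggregated_peak)
```

In this made-up series the spike in the raw readings no longer appears at its full height once the data is averaged per minute, which is the kind of value loss described above.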

Business classification
We can classify data by 'business' - whether it is for exploration, development or production.

Data work in exploration involves analysis of subsurface data and making 3D visualizations to understand the geology. The data streams can be analysed in flight to correct data errors. They can also be staged directly to the enterprise High Performance Computing data centres for near real time analysis.

Data work in development projects includes making data models to identify optimal drilling geometries and well spacing, based on exploration data and past drilling data from other wells.

Data work in production involves monitoring safe operations of wells, including data for temperature, pressure and fluid injection.

Data analytics, including real time and historical data analysis, can help develop data models for safer operation. Analytics models can be created to analyse the production of one well in relation to other wells in the same region or similar wells across geographies.
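As a rough illustration of comparing one well against others in the same region, the sketch below expresses each well's output as a fraction of the median of its peers. The well names, production figures, the 0.8 threshold and both function names are assumptions made for the example, not a method described by the author.

```python
from statistics import median

# Hypothetical daily production (barrels per day) for wells in one region.
production = {"well_a": 520.0, "well_b": 495.0, "well_c": 510.0, "well_d": 300.0}

def relative_to_peers(production, well):
    """Return the well's output as a fraction of the median of the other wells."""
    peers = [value for name, value in production.items() if name != well]
    return production[well] / median(peers)

def underperformers(production, threshold=0.8):
    """List wells producing less than threshold times their peers' median."""
    return sorted(
        well for well in production
        if relative_to_peers(production, well) < threshold
    )

flagged = underperformers(production)
print(flagged)
```

A real model would account for reservoir, well age and completion type; the median is used here only because it is less sensitive to a single outlying well than the mean.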

Frequency based classification
Frequency based classification means classifying data according to whether it is high frequency, medium frequency or low frequency.

High frequency data is generally produced by sensors, well site data, construction operations related to wellbores, drilling, service data, SCADA systems and other devices associated with a subset of drilling, exploration, and production operations. OPC and WITSML data falls in this category.

Medium frequency data is associated with production related activities, time series data, operations, lab analysis, well completion and flow networks. The associated time unit ranges from hours to days and weeks.

Low frequency data is associated with geospatial data, structural and stratigraphic data (faults and horizons), fractures, time and depth.
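The three frequency classes could be expressed as a simple rule on the interval between updates. The article gives a time range only for medium frequency data (hours to days and weeks), so the cut-offs below, under one hour for high and over four weeks for low, are assumptions chosen for illustration, as is the `frequency_class` name.

```python
from datetime import timedelta

def frequency_class(interval: timedelta) -> str:
    """Classify a data source by the typical interval between its updates."""
    if interval < timedelta(hours=1):      # assumed cut-off for sensor-style data
        return "high"
    if interval <= timedelta(weeks=4):     # hours to days and weeks
        return "medium"
    return "low"                           # assumed: slower than a few weeks

# Hypothetical sources and update intervals.
sources = {
    "downhole pressure sensor": timedelta(seconds=1),
    "daily production report": timedelta(days=1),
    "structural model revision": timedelta(days=365),
}
classes = {name: frequency_class(interval) for name, interval in sources.items()}
print(classes)
```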

Granularity based classification
Granularity based classification means classifying data according to whether it is raw data, operational data or standardised / refined data.

Raw data is the data produced by the different devices at site locations. This is the most granular form of data. Tapping into this data in real time can generate enormous value to the business.

Operational data is the data captured by the standard upstream systems with proprietary processing logic catering to the operational needs of the business. This includes SCADA systems, vendor specific assets to generate drill logs, alarms and events, historical data logs, and sensor data. These systems consume the raw data and provide a mechanism for the system operators to monitor the health of the upstream operations.

Standardised data is data that has been transformed into various industry standard formats. It caters to different segments of the upstream business.

Integration based classification
An integration based classification for data decides if it is streaming data, staged data (e.g. files and documents) or data about specific events.

Streaming data is produced using custom streaming programs or off the shelf product stacks from various vendors. These convert the raw data produced by the low level systems into data streams, bypassing the complicated layers of the enterprise.

Streaming provides the ability to run real time data analytics and gives the enterprise faster access to react to system anomalies. Derived analytical data is produced as a result of analytical techniques and models.

Staged data can be in the form of files or databases.

The streaming data can be stored in Massively Parallel Processing data stores for in depth analysis using advanced statistical methods and predictive analytical techniques.

Events and notifications can be generated using data models on top of both the streaming data and staged data. These can facilitate timely response strategies catering to different business scenarios.
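A very small sketch of generating events from a stream of readings is shown below. The rule (raise an event each time a value rises above a limit), the `detect_events` name, the event fields and the sample readings are all hypothetical; a production system would use a proper data model rather than a fixed threshold.

```python
def detect_events(stream, limit):
    """Yield an event each time a reading rises above limit."""
    above = False
    for timestamp, value in stream:
        if value > limit and not above:
            # Only the crossing itself raises an event, not every high reading.
            yield {"time": timestamp, "value": value, "type": "high_pressure"}
        above = value > limit

# Hypothetical (time, pressure) readings; the same function accepts a list
# read from staged files or an iterator fed by a live stream.
readings = [(0, 95.0), (1, 98.0), (2, 121.0), (3, 125.0), (4, 99.0), (5, 130.0)]
events = list(detect_events(iter(readings), limit=120.0))
print(events)
```

Because the function consumes readings one at a time, it does not need the whole data set in memory, which is what allows the same logic to sit on top of either streaming or staged data.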
