
Harnessing big data with analytics

Friday, November 21, 2014

In the oil and gas industry, 'Big data' is basically about moving from a deterministic to a probabilistic approach, says Keith Holdaway, upstream domain expert for the SAS Global Oil and Gas business unit.

'Big data' is basically about moving from a deterministic to a probabilistic approach, said Keith Holdaway, upstream domain expert for the SAS Global Oil and Gas business unit, and author of a recent book 'Harness Oil and Gas Big Data with Analytics', speaking at the Digital Energy Journal Sept 23rd conference in Aberdeen, 'Using Analytics to Improve Production.'

For people trained in physics and mathematics, where numbers have absolute values not probability ranges, moving into probability 'takes people out of their comfort zone', he said.

More specifically, the aim is to use both deterministic and probabilistic approaches together.

Keith Holdaway previously worked in Shell as a geophysicist. SAS Institute is a company based in North Carolina, specialising in analytics software, in many different industries.

It aims to develop analytical workflows and methodologies.

A tried and tested method to get value from large data sets is the 'SEMMA' process, which stands for Sample, Explore, Modify, Model and Assess, he said.

You take a sample of data, and try to work out which variables are important in a multivariate and complex system, and explore it to surface hidden patterns. You might want to fit the data together with other data ('modify'), doing projects like cluster analysis.

Ultimately you work out which model is the best to work with your data, and finally you answer your business question with a probabilistic range of decisions.

You aim to build a useful model with a small sample of your data rather than all of it, by reducing the dimensionality of the input space while retaining the variance and distribution.

You want to try to understand the trends and patterns in the data, and develop a hypothesis which is worth testing.
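The five SEMMA steps described above can be sketched in a few lines of Python. This is only an illustration, not SAS's workflow: the well data is synthetic, the variable names (`proppant`, `unrelated`, `production`) are invented, and a straight-line fit stands in for whatever model a real study would choose.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Synthetic data (an assumption for illustration): production depends on
# one input plus noise; the other input is deliberately irrelevant.
rng = random.Random(42)
wells = []
for _ in range(200):
    proppant = rng.uniform(10, 100)
    wells.append({
        "proppant": proppant,
        "unrelated": rng.uniform(0, 1),
        "production": 50 + 3 * proppant + rng.gauss(0, 10),
    })
TARGET = "production"

# Sample: work on a random subset and keep the rest for assessment.
sample_idx = set(rng.sample(range(len(wells)), 50))
sample = [w for i, w in enumerate(wells) if i in sample_idx]
holdout = [w for i, w in enumerate(wells) if i not in sample_idx]

# Explore: find the input most strongly correlated with the target.
ys = [w[TARGET] for w in sample]
inputs = [k for k in sample[0] if k != TARGET]
best = max(inputs, key=lambda k: abs(pearson([w[k] for w in sample], ys)))

# Modify: standardise the chosen input to zero mean and unit variance.
xs = [w[best] for w in sample]
mean = sum(xs) / len(xs)
std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5

def scale(x):
    return (x - mean) / std

# Model: fit a straight line on the sample only.
intercept, slope = fit_line([scale(x) for x in xs], ys)

# Assess: R-squared on the wells that were not used for fitting.
actual = [w[TARGET] for w in holdout]
predicted = [intercept + slope * scale(w[best]) for w in holdout]
mean_actual = sum(actual) / len(actual)
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot

print(best, round(r_squared, 3))
```

The point of the sketch is the order of the steps: the model is built on 50 wells and judged on the 150 it never saw, which is what makes the assessment meaningful.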

Examples of business questions the oil and gas industry would like to answer include where best to place wells, how best to optimise production and recovery, and how best to carry out hydraulic fracturing.

Ultimately, if you can come up with a model which works, you can 'operationalise' your work - develop a structured and repeatable way of getting useful data.

Data mining can be 'direct' and 'indirect'. With 'direct' data mining, you have a specific objective function or target variable mapped to a business issue you are trying to answer, and you look for variables which will help you answer it. With 'indirect' data mining, you are trying to find the relationships and correlations between independent variables, so as to classify the input space in a way that may throw light on a business question.
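A rough sketch of the 'indirect' case, assuming a simple k-means clustering (one of many possible techniques, and not necessarily the one Mr Holdaway had in mind): the wells are grouped using only their input variables, and production is looked at afterwards to see whether the groups say anything about the business question. All numbers below are invented.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from those already chosen.
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return labels, centroids

# Hypothetical, already-scaled input variables for six wells.
inputs = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
# Production is NOT used for clustering, only for interpretation afterwards.
production = [10, 9, 11, 50, 52, 49]

labels, centroids = kmeans(inputs, k=2)

mean_production = {}
for j in set(labels):
    rates = [r for r, lab in zip(production, labels) if lab == j]
    mean_production[j] = sum(rates) / len(rates)

print(labels, mean_production)
```

In the 'direct' case the target variable would drive the search from the start, for example by ranking each input by its correlation with production.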

If the work doesn't lead to a model which you can use to make decisions, 'you're not really doing anything other than academic exercise,' he said.

Ultimately 'It is OK finding the facts - the truth behind the data - but we have to find the truth that enables us to have business value,' he said.

US case study
In one example, a US unconventional oil operator wanted to know what the best hydraulic fracture strategy would be. They had many wells going into a reservoir but didn't know which stages were most productive.

The first step was to gather together all of the available data relevant to the objective, and then use techniques like cluster analytics and neural networks to see what patterns it could find.

The data showed that a lot of the production was coming from certain pockets of the field, and ultimately the analysis worked out which parameters had the most impact on production.

The company identified good wells and bad wells. Finally they could make the best decision about where to place the next well.

Middle East
SAS did a project with a National Oil Company in the Middle East, which wanted a better understanding of which wells would benefit from a workover, and better predictions of what the outcome of a workover would be.

The system provides an answer which is more probabilistic than deterministic.

The work included analysis on the water level in the reservoir. The company's expert on free water level saw the data SAS had produced and first said the data didn't make sense, but then realised it might be his understanding of how water behaves in reservoirs that was wrong, Mr Holdaway said.

Good and bad wells
Another company wanted to do analysis on thousands of wells, and find ways to improve production.

The analytics work identified 'good' wells and 'bad' wells, and their characteristics in terms of how they responded to different inputs.

SAS used the SEMMA process as defined above, to try to put together a model that would work, and understand what the important variables were.

It would aim to get the data to a point where a geoscientist could easily gain insights from it.

Ultimately you can identify if wells are under-producing (ie producing less than you would expect, given the data of their part of the reservoir), and so which wells should respond most to an intervention.
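The idea of an under-producing well can be illustrated with a minimal sketch. It assumes the 'expected' rate for a well is simply the median rate of the wells in the same part of the reservoir, and that a well counts as under-producing if it falls below 70% of that figure. Both choices are placeholders for whatever a fitted model would supply, and the well names, areas and rates are invented.

```python
from collections import defaultdict
from statistics import median

def underproducers(wells, ratio=0.7):
    """Return (name, shortfall) pairs for wells below ratio * area median,
    largest shortfall first."""
    by_area = defaultdict(list)
    for w in wells:
        by_area[w["area"]].append(w["rate"])
    # Stand-in for a model: expected rate is the median of the area.
    expected = {area: median(rates) for area, rates in by_area.items()}

    flagged = []
    for w in wells:
        target = expected[w["area"]]
        if w["rate"] < ratio * target:
            flagged.append((w["name"], target - w["rate"]))
    return sorted(flagged, key=lambda item: item[1], reverse=True)

# Hypothetical wells in two parts of a reservoir.
wells = [
    {"name": "W1", "area": "A", "rate": 100},
    {"name": "W2", "area": "A", "rate": 110},
    {"name": "W3", "area": "A", "rate": 60},
    {"name": "W4", "area": "B", "rate": 40},
    {"name": "W5", "area": "B", "rate": 42},
    {"name": "W6", "area": "B", "rate": 20},
]

candidates = underproducers(wells)
print(candidates)
```

Sorting by shortfall puts the wells with the largest gap between expected and actual production at the top of the list, which is one simple way to rank intervention candidates.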

There are many different statistical and visualisation techniques to work through multiple sets of data. The neural network is 'one of the more popular ones,' for working on multiple variables, he said.

There are processes to look at trends of different variables and see how they are related statistically to a target variable.

One audience member asked if there were ways to integrate time series data (such as pressure readings at points in time) with spatial data (such as understanding a reservoir).

'Many people are doing analytics on time series quite well. But in our world we deal with geospatial not necessarily time series,' the audience member said.

Mr Holdaway said you can do it with seismic attribute analysis and a neural network approach, to work out reservoir properties between the wells. 'So there's well established techniques for marrying the spatial world with the time based world.'
