
How to use analytics in subsurface

Tuesday, April 17, 2018

Jane McConnell, Practice Partner - Oil and Gas at Teradata, shared her views on how analytics and data science can be used in subsurface work.

Change happening

There's 'definitely change happening' in how analytics and data science are being used in oil and gas, she said, particularly in analytics and drilling.

The UK is running competitions where people are challenged to find new ways to 'unlock the value of data' with analytics. There was a geoscience data hackathon at the Paris EAGE conference in June 2017.

There is a lot of analytics work going on with processing seismic data and generating structural models.

'The way people are interacting with this kind of data is changing,' she said. 'It is not just fixed workflows.'

Data management

Analytics needs good quality data, which means that the data needs to be well managed.

This probably means that the industry needs to move from manual data management to more automated methods, she said.

The industry has long employed a large army of data managers, moving files to wherever they are needed and doing the necessary conversions along the way, for example exporting seismic SEG-Y data into Landmark interpretation software.

But along with this way of working comes a habit of fixing problems with the data just before the data is needed, rather than fixing problems with the data stores behind it, she said. Also the manual methods can be error prone.

For analytics to work, the systems need to be able to retrieve good quality data from the company data stores.

Teradata has seen many data science projects in oil companies where the analytics was well underway before someone realised there was a major problem with the underlying data, such as half of the data being in one unit of measure and half in another, she said. As an example, 'we had people working with weather data - half of the data had the wind speed in metres per second and the other half was in knots.'
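The mixed-units problem she describes can be caught and fixed with a few lines of code before analytics starts. The sketch below is a hypothetical illustration: the readings, field names and unit labels are invented for the example; only the knots-to-metres-per-second conversion factor is a known fact.

```python
# Hypothetical illustration: normalising a mixed-unit wind speed column.
# The readings and 'unit' labels below are invented for the example.
KNOTS_TO_MS = 0.514444  # 1 knot = 0.514444 metres per second

readings = [
    {"speed": 10.0, "unit": "m/s"},
    {"speed": 20.0, "unit": "knots"},
]

def to_metres_per_second(reading):
    """Convert a single reading to metres per second."""
    if reading["unit"] == "m/s":
        return reading["speed"]
    if reading["unit"] == "knots":
        return reading["speed"] * KNOTS_TO_MS
    raise ValueError(f"Unknown unit: {reading['unit']}")

normalised = [to_metres_per_second(r) for r in readings]
print(normalised)  # the 20-knot reading becomes roughly 10.29 m/s
```

The key design point is that the check raises an error on an unrecognised unit rather than silently passing the number through, which is how mixed-unit data slips into analytics unnoticed.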

So people working with analytics end up spending endless time trying to resolve problems or fill gaps in the core data stores.

It doesn't help that the people who originally created the data, and understand it, are often not available to help.

Data lake

One desired end goal is described as a 'data lake', where all the data which analytics systems might need is readily accessible.

A data lake is not a physical data store, but more an architecture, where all of the data stores are available to the analytics systems, she said.

Many people have got the wrong idea about a data lake, thinking that if they just copied their data into a single file store, 'as if by magic good things would happen,' she said.

For this reason, some people are turning away from the term data lake, using the term 'discovery lake' instead.

The analysis company Gartner has described three types of data lake. One is the 'inflow data lake', where you bring in data from multiple sources together to one place, such as a dashboard. The second is the 'outflow data lake' where one set of data serves multiple different applications. The third type is a 'basic data lake' which is a starting point, with some controlled data management.

Data preparation

There is never enough time to get perfect data, so it is useful to define the minimum quality data which is acceptable for what you want to do with it, rather than try to get it perfect.
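A 'minimum acceptable quality' rule can be made concrete as a gate that runs before analytics starts. This is a sketch under assumptions: the threshold, field names and sample records are invented for illustration.

```python
# Sketch of a 'minimum acceptable quality' gate, run before analytics starts.
# The threshold and field names are assumptions for illustration.
def good_enough(records, required_fields, max_null_fraction=0.1):
    """Return True if every required field is populated often enough."""
    if not records:
        return False
    for field in required_fields:
        nulls = sum(1 for r in records if r.get(field) is None)
        if nulls / len(records) > max_null_fraction:
            return False
    return True

wells = [
    {"depth": 1500.0, "porosity": 0.21},
    {"depth": 1510.0, "porosity": None},
    {"depth": 1520.0, "porosity": 0.19},
]
print(good_enough(wells, ["depth", "porosity"]))  # False: a third of porosity values are missing
```

The point is not the specific threshold but that 'good enough' is written down and checked automatically, rather than being a judgment made afresh each time.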

In this sense, subsurface data work is different to financial data, and most traditional data management work, where everything has to be absolutely correct at all times. Subsurface data work is more experimental.

Most successful data science projects just focus on one or two specific areas, rather than cover the whole company, she said. You may want to build a data lake for trying to solve a specific problem, and make sure the data is just good enough for that.

However, as the industry sees more successful data science projects, it is probable that it will want to do more and more of it. This will increase the need to bring in structured data governance processes. It will also increase the complexity of the data preparation work, so create more room for error.

Master Data Management (MDM) means setting up processes, governance and standards for managing the critical data of an organisation, making it available from a single point of reference.

'I can't think of any oil companies that do MDM beautifully,' she said.
Spending time on MDM, across the company, is a good way to improve data management and to be better organised.

Build not buy

Until now, the upstream oil and gas industry has mainly worked with purchased software packages, which include data management, visualisation and analytics tools in the package. Data preparation work has mainly been in accordance with the requirements of the package. This could be called 'buy not build'.

Part of the reason for this is that corporate IT departments typically did not have much understanding of the petrotechnical and engineering domains, so it made sense for oil companies to use software developed by oil and gas service companies, such as Landmark and Schlumberger.

But this approach is not best suited to the analytics era, where data scientists want to put together different types of data in new ways to gather new insights from them. The data stores within software packages are usually in formats which only that software can understand.

It would probably be better to take a 'build not buy' approach, making analytics tools as you go along, and building tools to get data to them in the right format. Data engineers can write code to run data pipelines, including tools to split files, move data and run data quality checks.
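A data pipeline of this kind can be as simple as a chain of plain functions. The sketch below is a minimal illustration of the split / parse / quality-check pattern described above; the stage names and the comma-separated record layout are invented for the example.

```python
# Minimal sketch of a hand-built data pipeline: each stage is a plain
# function, chained in order, with a quality check at the end.
# Stage names and the record layout are invented for illustration.
def split_records(raw_text):
    """Split a raw export into one record per non-empty line."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]

def parse_record(line):
    """Parse 'well,depth,value' into a typed record."""
    well, depth, value = line.split(",")
    return {"well": well, "depth": float(depth), "value": float(value)}

def quality_check(records):
    """Reject the whole batch if any depth is non-positive."""
    bad = [r for r in records if r["depth"] <= 0]
    if bad:
        raise ValueError(f"{len(bad)} records failed the depth check")
    return records

raw = "W1,1500,0.21\nW1,1510,0.19\n"
records = quality_check([parse_record(l) for l in split_records(raw)])
print(len(records))  # 2
```

In practice each stage would be scheduled and monitored by an orchestration tool, which is where software like Apache NiFi comes in.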

There is an open source project, Apache NiFi, which develops software to automate the flow of data between software systems, and which Teradata is involved in. NiFi is short for Niagara Files, a previous name of the software, when it was developed by the US National Security Agency.

NiFi can be used to define the jobs you want to do with data such as split or check files, and schedule jobs and check for problems.

Many oil and gas companies have data stored in a range of old file formats, which they think can only be read with special software. But there is a surprising amount of open source software tools and scripts available which can work with old file formats, she said.
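As an illustration of reading an 'old' format with plain open tooling: well logs are widely distributed in the text-based LAS format, whose data section can be read in a few lines of ordinary Python. This is a minimal sketch only; real LAS files have richer header sections (version, well, curve definitions) that a production reader would also parse, and the sample content here is invented.

```python
# Minimal sketch of reading the data section of a text-based LAS well log
# with plain Python - no proprietary software needed. Real LAS files have
# richer headers; this only pulls numeric rows from the ~A (data) section.
SAMPLE_LAS = """\
~Version
VERS. 2.0 :
~ASCII data
1500.0  0.21  2.45
1510.0  0.19  2.47
"""

def read_las_data(text):
    rows, in_data = [], False
    for line in text.splitlines():
        if line.startswith("~"):
            # A new section begins; the ~A section holds the log samples
            in_data = line.upper().startswith("~A")
            continue
        if in_data and line.strip():
            rows.append([float(v) for v in line.split()])
    return rows

print(read_las_data(SAMPLE_LAS))  # [[1500.0, 0.21, 2.45], [1510.0, 0.19, 2.47]]
```

For binary formats such as SEG-Y, open source libraries exist as well (segyio is one example), so 'only readable with special software' is less true than it once was.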

The important point is that 'we need to move to a place where data is not held away from us by the software vendors and applications companies,' she said.

This philosophy was adopted by the team setting up the 'Diskos' National Data Repository in Norway, where there was a view that data could only be stored in non-proprietary formats.

Why subsurface is different

Subsurface data is different in many ways to data from most other industries, and it is important to understand these differences if you want to do data science with it, she said.

Subsurface data analytics includes a lot of measurement data, which is not found in other industry sectors. There can be very complex jargon and data structures.

Petrotechnical and engineering software systems are usually built around the specific needs of the subsurface domain. In this sense, they are different to oil and gas business IT, which is similar to business IT in other industries, she said.

So oil and gas business departments have long been doing analytics with the same software that other business departments use, such as Tibco Spotfire, but the petrotechnical world has been limited in what it can do within the subsurface software environment. And it has not been very easy to do analytics which involves bringing petrotechnical and business data together.

Another issue is that much subsurface data management work evolved out of records management, looking after physical items such as printed well logs, fluid samples and seismic tapes. The culture is around making sure the original data doesn't get lost, rather than finding ways to move forward, she said.

Rules for analytics

When doing subsurface data analytics, the first rule can be to get the right people. The best people are described as 'T shaped,' having both depth (being very good at one narrow aspect of E&P) and breadth (understanding how it all fits together). For data science projects, you want people who have in-depth data science skills, but also who understand the broader oil and gas domain.

The second rule is to work on the right platform (software system). The E&P industry typically works with linear workflows, where data is worked on with one application in one department, then sent on to another application in another department, and these methods have evolved over time.

But this means that some data types have never been put together, because the traditional apps don't have a way to do it. Trying out new ways to put data together is usually a big part of analytics work.

Analytics also often involves looking deeply within data to see if there is something worth looking at further, which is not something which can be done easily if the data can only be accessed via an application.

The third rule is to work around 'good enough' data management - doing the minimum amount of work to be able to answer the business question you want to answer. This might mean storing data so you can just pull out the piece you want, such as individual seismic traces, or well log data just for a specific depth level.
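Retrieving 'just the piece you want' can be sketched simply: if log samples are stored as individually addressable records rather than locked inside an application file, pulling one depth interval is a one-line query. The sample data and field names below are invented for illustration.

```python
# Sketch of 'store data so you can pull out just the piece you want':
# retrieving well log samples for one depth interval only.
# The sample data and field names are invented for illustration.
log = [
    {"depth": 1500.0, "gr": 45.0},
    {"depth": 1510.0, "gr": 52.0},
    {"depth": 1520.0, "gr": 48.0},
]

def slice_by_depth(samples, top, base):
    """Return only the samples whose depth falls in [top, base)."""
    return [s for s in samples if top <= s["depth"] < base]

print(slice_by_depth(log, 1505.0, 1515.0))  # [{'depth': 1510.0, 'gr': 52.0}]
```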

It helps if data is 'profiled' so people can get an idea of what it is, without having to load it into the right software system to understand it.
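A data profile of this kind can be a handful of summary statistics per field. The sketch below is an assumed, minimal version; the field names and sample records are invented, and a real profiler would also report types, distinct values and distributions.

```python
# Sketch of data 'profiling': summarising a field so people can see what
# the data looks like without loading it into specialist software.
# Field names and sample records are assumptions for illustration.
def profile(records, field):
    values = [r[field] for r in records if r.get(field) is not None]
    return {
        "count": len(records),
        "missing": len(records) - len(values),
        "min": min(values) if values else None,
        "max": max(values) if values else None,
    }

samples = [{"porosity": 0.21}, {"porosity": None}, {"porosity": 0.19}]
print(profile(samples, "porosity"))  # {'count': 3, 'missing': 1, 'min': 0.19, 'max': 0.21}
```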

The fourth rule is to be 'agile', or focus narrowly on what the goal requires. The industry can so easily get stuck into 'waterfall' rigid step by step processes which take years, rather than going as fast as possible to answer the specific question.

'If you have a business request for a piece of work, do that piece of work, make sure the value is delivered, don't turn it into a 10 year project,' she said. Small projects can be better - people working quickly to see if they can achieve some specific outcome which is useful for the business.

The fifth rule is to get business buy-in. If you want to make changes across silos of the business, you need business support at a level which the various departments will all listen to, which might mean 'C' level. Otherwise you can't escape out of any silo. One idea is to have a C level 'chief data officer' who guides the company on things to stop doing or start doing.

The chief data officer might also try to stop people using software tools which are very difficult for someone else to work with, such as Excel and PowerPoint. They can also ensure continued governance on the data and continued data quality improvement. Dashboards can be a good way to drive data quality.

Companies are increasingly forming 'asset teams', where people with different disciplines work together on business problems - rather than the old model where one department works on data and throws it over a wall to the next: geophysicists do their seismic interpretation, reservoir engineers do their simulation.
