
How to make data science useful in oil and gas

Thursday, July 16, 2020

Poh Hean Yap, senior manager with Accenture, explained some of the best ways to make data science useful in upstream oil and gas - including by combining data science expertise with domain expertise, and working in focussed 'sprints' to develop a 'minimum viable product', so the client can quickly get a return on investment.

She presented a case study of how Accenture saved an oil and gas client millions of pounds, with a tool to better manage its injection of chemicals to manage emulsion in oil, and recommendations to change the sampling points on an offshore platform.

Ms Yap has been combining data science with domain expertise since doing a PhD in data mining for the textile sector. Her PhD project was to identify factors which impacted the fibre quality of wool in Australia.

Wool which does not meet the fibre specification in Australia cannot be sold for use in making clothing, so 'your whole batch needs to be torn up and made into carpet,' she said.

To understand fibre quality, she had to develop an in-depth understanding of how the industry worked, and the factors impacting fibre quality. This was then used together with data science to make recommendations.

Ms Yap has also spent 6 years working in a refinery, just focussing on using data for optimisation. Working for such a period in a single domain is a good way to build up domain expertise, she said. Now, she focusses only on data science in the resources domain, including oil and gas, chemicals and mining.

Ms Yap's case study, outlined below, was a system Accenture built to enable an oil and gas company to better understand how much chemical it needed to inject in an oil processing facility, in order to prevent emulsion forming in the oil, and oil subsequently getting rejected by the client, a refinery. In order to do this, you need some knowledge at least of what an emulsion is, she said.

Some data scientists promise just to be able to make predictions out of data, with no domain expertise. 'I tell them, go away. If they don't try to understand your domain before they start telling you what can be done, it is really useless.'

A minimum viable product

It is easy for data science projects to become very long, lose focus and become very expensive. To avoid this, Ms Yap advocates that projects should be constrained to a finite length, such as 14 weeks, and aim to deliver a useful output within this time, known as a 'minimum viable product' (MVP) - something useful enough to provide a return on the investment.

In the industrial data science world, the MVP might be a dashboard tool for calculating or monitoring something specific, plus some useful recommendations.

Ms Yap explains the concept of MVP with the analogy of someone developing a device for personal transportation where none had previously existed. Your MVP in the first stage might just be a board with wheels, like a skateboard. You might add a handlebar to help balancing in the second stage. You would not try to build a car from scratch.

Another analogy is how Apple developed the iPod before it developed the iPhone, she said.

A focus on MVP also means you are focussing on collecting the minimum amount of data you need to make something which works. This is important, because otherwise you can spend enormous amounts of time finding and cleaning up data.

'Let the data scientist tell you if there is enough data to work with or not,' she said.

Starting a project

Projects often begin with a vague request from the client, such as 'show me how digital can help us', 'show me how I can resolve this specific problem with digital technology', or 'can you build an AI which can scan through documents to tell me, how much should I be spending on this well.'

In the example below, the initial request from the client was, 'we suspect over-use of chemicals [for injection], we are spending too much money, can you please look at it.'

So one of the most important parts of the work is aiming to distil this initial request into something specific which may be possible to deliver.

Accenture spends the entire first two weeks of the 14-week period on a workshop with the client, trying to identify in depth what the client is trying to achieve, and where to start.

Participants can be split up into groups, each discussing how they think digital technology can help them, or what they want to achieve. 'You have to define your North Star,' she says. You need more than 'I just want to go digital'.

Participants are asked to specify how the company would get business value from the investment, such as achieving a 10 per cent reduction in spend on wells, or reducing the cost of optimising wells by 5-10 per cent. They can't just stipulate software as the desired output, such as 'I want a dashboard'.

Accenture brings in a number of use cases from similar projects it has done.

This process may generate a number of ideas, which need to be further screened and refined for their business value.

You need to consider how much the project will disrupt the organization and how much value it will achieve. If you need to change the entire company to do one thing, it will need to show a lot of value to be worthwhile. Conversely, you may have projects which are high value but low disruption, so they are relatively easy to do. 'That is your low hanging fruit,' she said.
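The value-versus-disruption screening she describes can be sketched in a few lines. This is a hypothetical illustration only - the use cases, scores and thresholds below are invented, not taken from the project.

```python
# Hypothetical sketch: screening candidate use cases by business value
# versus organisational disruption. All names and scores are invented.

use_cases = [
    {"name": "well spend dashboard",     "value": 8, "disruption": 2},
    {"name": "chemical injection model", "value": 9, "disruption": 4},
    {"name": "company-wide data lake",   "value": 7, "disruption": 9},
]

def is_low_hanging_fruit(case, value_floor=7, disruption_ceiling=5):
    """High value, low disruption - the projects to start with."""
    return case["value"] >= value_floor and case["disruption"] <= disruption_ceiling

quick_wins = [c["name"] for c in use_cases if is_low_hanging_fruit(c)]
print(quick_wins)
```

Here the company-wide data lake scores well on value but would disrupt the whole organisation, so it drops out of the first round of candidates.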

You also need to consider that the people actually attending the workshop may not be representative of the whole company - there will be people who did not attend the workshop who will not gain much value from the product, but may still need to participate for it to be successful.

Use on other equipment

Companies are often tempted by the idea that they can develop a tool which can then be used in multiple different ways around the company. 'They've paid for it, they want more value from it.'

But this rarely works. 'All the data is very sensitive to wherever you are trying to explore,' she said.

You find that parameters work together in different ways on different equipment, or something else matters.

Even if the model is being used on very similar equipment, it will require re-training and customisation to work on the second set of equipment.

Work process

Once you have defined what you want to achieve, with a rough idea of the 'minimum viable product', the work can begin.

The first step is to define the team structure and governance system.

Typically, there will be a 'business results manager' whose role is to keep the focus on making a MVP and support the work of getting there. There will be business 'domain experts' who understand the industrial process the system relates to. You can have data scientists and data engineers to work with the data.

You may have business analysts who sit between the domain experts and data specialists. And you may have visualization experts who drive how the data should be presented and how the user interface should work.

You can have 'scrum masters' who coordinate the work if it is split into multiple streams.

You have to define what technology you will use, for example Spotfire for visualization.

Once the direction has been defined, the next 11 weeks of work can be very focussed and fast paced, experimenting with different ways of working with the data and discussing it with clients. The work with clients is collaborative. 'We want to bring you on a journey with us', she said.

Much of the 11 weeks of work involves pre-processing and cleaning data.

The 'data discovery' work may give you some results - for example, a means of predicting what the output will be with certain input data. Then you can build this into a model and put it into a software tool, which the customer can use to predict what will happen based on the same input data in future.
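The step from data discovery to model can be sketched very simply. The code below fits a single-input linear relationship between chemical dose and emulsion level - a deliberately minimal stand-in for the real multi-parameter model, with invented numbers.

```python
# Minimal sketch of the 'data discovery to model' step: fit a simple
# linear relationship between one input (chemical dose) and one output
# (emulsion level in the export crude), then use it to predict.
# The data is invented; the real model combined many operating parameters.

def fit_line(xs, ys):
    """Ordinary least squares for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
            / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: more chemical, less emulsion
dose     = [50, 100, 150, 200]
emulsion = [8.0, 6.0, 4.0, 2.0]

slope, intercept = fit_line(dose, emulsion)
predict = lambda d: slope * d + intercept
print(round(predict(175), 2))  # predicted emulsion at a 175-unit dose
```

Once such a relationship is validated, it can sit behind a dashboard so the customer can query it with new input data.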

There is a difference between 'predictive' and 'prescriptive' - predictive tells you what is likely to happen, with elements of probability in it, while prescriptive tells you what action to take about it.

The 11 weeks allows time for 'deep dive' into the problem. Perhaps you will find some people in a company see a problem, but not everyone agrees.

Chemical injection example

Ms Yap presented one example, of an oil and gas operator which felt it was spending too much on chemicals which were being injected into the oil stream to stop it forming an emulsion.

The oil had been rejected by a refinery downstream because it was too high in emulsion, leading to the injection rate being increased.

The client started by just saying, 'We are spending too much money, we suspect over-use of chemicals. Can you please look at it.'

At the process plant, the work process was to take a sample of the crude to see if it was forming an emulsion, and if not, to add more chemical. The oil needed to be heated to a certain temperature for the chemical to work.

Building a model

In data science terms, this project could be defined as a request to make a model of how much chemical is actually required to be injected in order to ensure the output oil meets the specification of the downstream refinery - and then assess how closely this matches what is actually being injected.

A first task was to discover how much chemical was being injected. Accenture's team discovered that the client had no reliable records.

The chemical injection rate could be recorded by operations staff in handwritten notes, which were then typed into a spreadsheet and sent periodically to a production chemist. Sometimes there were gaps in the data, which could indicate that someone had forgotten to record the amount of injection, or nothing was injected.

Sometimes people offshore forget to inject one day, then realise, and inject twice as much the next day, but do not record this omission in their records.
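The kind of cleaning step this record-keeping forces can be sketched as a simple screening pass over the daily log. This is a hypothetical illustration - the function, doses and threshold are invented, and in practice only a production chemist can confirm what a gap really means.

```python
# Hypothetical sketch of cleaning hand-recorded injection logs: a gap
# (None) may mean 'nothing injected' or 'forgot to record', and a
# double-sized dose after a gap suggests a missed day made up late.
# Values and the threshold are invented for illustration.

daily_dose = [100, 100, None, 200, 100, None, 100]

def flag_suspect_records(doses, typical=100):
    """Label each day's record so a reviewer can check it."""
    flags = []
    for i, dose in enumerate(doses):
        if dose is None:
            # Could be zero injection or a missed entry - the data alone cannot tell
            flags.append("missing")
        elif i > 0 and doses[i - 1] is None and dose >= 2 * typical:
            # Double dose after a gap: likely a missed injection made up the next day
            flags.append("possible catch-up")
        else:
            flags.append("ok")
    return flags

print(flag_suspect_records(daily_dose))
```

The point is not the code but the workflow: ambiguous records get flagged for a human to resolve, rather than silently treated as zeros.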

Accenture's team also found that there was very little communication between the various silos of the company. A downstream group bought crude from the upstream group. The downstream group said the oil did not meet specification and caused the refinery to shut down; the upstream group claimed it did meet the specification. There was also not much communication between the production chemistry specialists and the separator maintenance teams.

The first discovery was that the oil from the upstream group actually was not meeting the specification, but still being shipped.

A second discovery was that the impact of the chemicals depended on the specification of the input crude and the operating conditions. So to build a model of how much chemical was required, it would be necessary to understand how the system was working.

How the system is working

So Accenture's team needed to understand how the oil was actually flowing around the processing plant.

A simple process flow diagram was drawn, showing that incoming crude goes into one of four heat exchangers, then goes to separators, with some circulation so it would then heat up more incoming crude.

The team found that the client's existing piping and instrumentation diagram was inaccurate, with sampling points positioned in the wrong place.

The chemical was only injected into two of the four heat exchange streams, so half of the oil was not being treated. But the sampling points were in the other two flow streams - so the company was not injecting and sampling in the same stream.

Build a data model

This led to work to construct a data model of how the different parameters inter-relate.

You need to separate causation and correlation in doing this. For example, you may observe that the emulsion level in export crude is low when a certain parameter is high. But this does not tell you that you have identified the parameter driving the emulsion level, she said.
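The trap she describes - a strong correlation driven by a third, hidden factor - can be shown in a few lines. All variables and numbers below are invented for illustration.

```python
# Sketch of correlation versus causation: two variables can move together
# because a third drives both. Here, 'heater duty' drives both oil
# temperature and a pump reading; the pump reading correlates perfectly
# with emulsion level without causing it. All numbers are invented.

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

heater_duty  = [1, 2, 3, 4, 5]
oil_temp     = [d * 10 + 40 for d in heater_duty]  # caused by heater duty
pump_reading = [d * 2 + 1 for d in heater_duty]    # also caused by heater duty
emulsion     = [10 - t / 10 for t in oil_temp]     # caused by temperature only

# The pump reading correlates perfectly with emulsion level...
print(round(pearson(pump_reading, emulsion), 2))
# ...but changing the pump would not change the emulsion; the temperature would.
```

Domain expertise is what tells you which of the correlated parameters is plausibly the driver.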

At the end of the work, a data model could be built containing the predictions, constructed within a dashboard software tool for the client.

Given the inputs, the model can make predictions of what the temperature or emulsion level will be at a certain point, or where it will go in the next hour. You can ask the model, if you add a certain amount more chemical, how that will change the output.
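A what-if query of this kind, against a fitted model, can be sketched as below. The model form and coefficients are invented for illustration - they are not the client's actual model.

```python
# Hypothetical what-if query: given a fitted model, ask how the predicted
# emulsion level changes if the chemical dose is increased. The linear
# form and coefficients are invented for illustration.

def predicted_emulsion(dose_litres, temp_c):
    """Invented model: more chemical and higher temperature both reduce
    emulsion in the export crude."""
    return 12.0 - 0.03 * dose_litres - 0.05 * temp_c

baseline = predicted_emulsion(dose_litres=100, temp_c=80)
what_if  = predicted_emulsion(dose_litres=150, temp_c=80)
print(round(baseline - what_if, 2))  # emulsion reduction from 50 litres more chemical
```

The same model answers both questions in the text: what the emulsion level will be at a given point, and how it changes if more chemical is added.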

The software tool was given to the client together with recommendations to move the injection point or use a different sampling point.

The system was then put through a field trial, in which the client was asked to close off certain trains and add a certain amount of chemical, so the actual results could be compared with the model's predictions.
