Improving data quality

Thursday, April 11, 2013

There are many complaints in the oil and gas industry about poor data quality, but perhaps not so many great ideas for how to improve it.

The best way to ensure data quality is probably to have better data integration between systems, so that there is less retyping and less opportunity for data to get corrupted, said Eric Toogood, DISKOS manager, Norwegian Petroleum Directorate.

'You have systems which are talking to each other from day one. That needs a complete redesign of the process.'

Ugur Algan, director of consultancy Volantice, said that people need to define what data quality is first, before working out how to achieve it.

'Does it mean, check that log is taken in the right way and the calibration was correct when we acquired the log? Is it the quality of the metadata? Is it being described correctly? Is it the completeness?'

'The answer there is yes to all these questions,' said Mr Toogood. 'You have to attack them all and make sure you have control. Or indicate what you don't have control over, [say] that you're not sure about the actual position of the well.'

Standard definitions

Another way to improve data quality might be to make sure people are using the same definitions for data, or at least know which definitions they are using.

'If you go and drill in Russia, their gamma ray logs are exactly opposite of what's in the Western world,' said Samit Sengupta, managing director of Geologix, a company which helps companies communicate and work with well data.

'Now you can imagine the impact if you didn't know that, you'd perforate the wrong areas, or you'd look for oil in the wrong areas.'

'If there were data dictionaries which were standardised in the industry and easily accessible, then that job [of understanding data] would be easier.'

'Some years ago, there was talk of having that as a web service, where you have the entire data dictionary available.'

'PPDM is incorporating some existing naming standards, data dictionaries, mnemonics, and things like that,' said Jess Kozman of Mubadala Petroleum, who has worked with the board of PPDM.

'Over the last few years, we've spent probably too much time on the mechanics of delivering the data, but not thinking about what does the end user need to know about the data to be able to use it faster,' Mr Kozman said.

Living with uncertainty

On the other hand, maybe we would be better off learning to live with uncertainty in data, rather than trying to improve the data all the time, said Julian Pickering of Digital Oilfield Solutions, chairing the session.

'Life is based on uncertainty. We hold fast to certain pieces of information that we believe are absolutely correct, but we don't know they are correct at all.

'The important thing is to recognise the uncertainty. You read something that's 27.836543, you wonder if it's good to the 4th decimal place. Digital systems imply a level of accuracy that is way beyond what they actually have.'

'The key thing is knowing whether this value is good to +/- 20 per cent, 50 per cent, 500 per cent, whatever it may be. So you make your decisions with bounded uncertainty based on the quality of your data.'
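
As an illustration of Mr Pickering's point, a reading can be displayed with only the digits its uncertainty supports. The Python sketch below is not from the session; the function name `format_with_uncertainty` and the idea of passing the uncertainty as a fraction of the value are assumptions made for this example.

```python
import math


def format_with_uncertainty(value, rel_uncertainty):
    """Show a value rounded to the precision its uncertainty justifies.

    rel_uncertainty is a fraction of the value (0.20 means +/- 20 per cent).
    Hypothetical helper for illustration only.
    """
    abs_unc = abs(value) * rel_uncertainty
    if abs_unc == 0:
        # No stated uncertainty: nothing to round against.
        return f"{value}"
    # Power of ten of the leading digit of the absolute uncertainty.
    exponent = math.floor(math.log10(abs_unc))
    # Decimal places to print; none if the uncertainty is 1 or larger.
    decimals = max(0, -exponent)
    rounded_value = round(value, -exponent)
    rounded_unc = round(abs_unc, -exponent)
    return f"{rounded_value:.{decimals}f} +/- {rounded_unc:.{decimals}f}"


print(format_with_uncertainty(27.836543, 0.20))  # prints "28 +/- 6"
```

With a 20 per cent uncertainty, the six decimal places in 27.836543 collapse to "28 +/- 6", which is closer to what the measurement can actually support.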

'Perfect data is never worth it, because you can never get perfect data,' said John Redfern, chairman of Digital Earth.

'Google Maps is 99 per cent correct, that's good enough. It's not worth paying thousands of bucks to get absolutely perfect data.'

Crowdsourcing to improve data

Another way to improve data is to share it more and invite people who are using it to help clean it up or suggest improvements, said Oracle's David Hattrick.

'Apple is correcting serious errors in their mapping by mobilising the millions of Apple users around the world to correct their map data, saying tell us what's wrong, we'll fix it.'

Another example is the company which uses sections of ancient manuscripts as 'CAPTCHA' boxes on sign-up forms, so that people type in what the text says, more accurately than a computer could read the manuscripts.

'There's the power of social networking, the power of the network effect, being able to mobilise millions of people to do a job they don't know they are doing,' he said.

The Norwegian Petroleum Directorate was also able to use the strength of the crowd to improve data quality, when it started publishing companies' data on its website.

'The quality went up straight away, because there's a feedback loop,' said Eric Toogood from NPD. 'You're exposing your data, you get good feedback, you can correct things.'
