Data formats within OSDU

Tuesday, December 7, 2021

It can be confusing that while OSDU is a standard data platform, it is not a standard data format, but makes use of standard data formats which already exist. Energistics' Jay Hollingsworth explained more.

The Open Subsurface Data Universe (OSDU) is designed as a standard data platform for subsurface data. But it is not a standard data format. It makes use of standard data formats, such as the Energistics standards.

Jay Hollingsworth, CTO of Energistics, explained the difference, speaking at a Society of Petroleum Data Managers forum on June 22-24.

The Energistics core standards, WITSML (drilling), PRODML (production) and RESQML (reservoir), are designed as a means of exchanging subsurface and operational data between one company or system and another.

Energistics also publishes reference documents describing its standards. Sometimes it distributes source code or sample data, provided by members, which can be used with the standards.

The Energistics standards are used within OSDU.

OSDU, on the other hand, is a standardised, open, working commercial system for handling subsurface data, normally on a cloud server. It covers data entry / upload, quality control, management, search, and security. Part of the deliverable is actual code, which is released as open source web server software under the Apache 2.0 licence.

The benefit to oil and gas companies is that they have an easy way to put their data on a cloud native (built for cloud) system. They don't need to create their own 'data environments' or 'data lakes' themselves. Instead, they can put their effort into working together with other companies to develop a system all companies can use. This makes commercial sense, since they don't gain any competitive advantage from this work.

Oil and gas companies can purchase access to a working system directly from cloud providers. Or they can install and manage the source code on servers themselves.

The platform has standard APIs, which provide standard ways to move data in and out. This means that operators can move their data to a different cloud provider, and their software applications will still work with it.
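The portability argument can be illustrated with a minimal Python sketch. It does not use the real OSDU APIs: `DataPlatform`, `ProviderA`, `ProviderB`, `migrate` and `count_wells` are all hypothetical names, invented here to show how application code written against one shared interface keeps working when the data moves to another provider.

```python
from typing import Protocol


class DataPlatform(Protocol):
    """Hypothetical stand-in for a standard platform API (not the real OSDU API)."""

    def put(self, record_id: str, record: dict) -> None: ...
    def get(self, record_id: str) -> dict: ...
    def ids(self) -> list: ...


class ProviderA:
    """Imaginary cloud provider that keeps records in a dictionary."""

    def __init__(self) -> None:
        self._store = {}

    def put(self, record_id, record):
        self._store[record_id] = dict(record)

    def get(self, record_id):
        return dict(self._store[record_id])

    def ids(self):
        return sorted(self._store)


class ProviderB:
    """Second imaginary provider with a different internal layout (list of pairs)."""

    def __init__(self) -> None:
        self._rows = []

    def put(self, record_id, record):
        # Replace any existing row with the same id, then append the new one.
        self._rows = [row for row in self._rows if row[0] != record_id]
        self._rows.append((record_id, dict(record)))

    def get(self, record_id):
        for rid, rec in self._rows:
            if rid == record_id:
                return dict(rec)
        raise KeyError(record_id)

    def ids(self):
        return sorted(rid for rid, _ in self._rows)


def migrate(source: DataPlatform, target: DataPlatform) -> None:
    """Copy every record across using only the shared interface."""
    for record_id in source.ids():
        target.put(record_id, source.get(record_id))


def count_wells(platform: DataPlatform) -> int:
    """Application code: it never needs to know which provider it talks to."""
    return sum(1 for rid in platform.ids() if platform.get(rid).get("kind") == "well")
```

In this sketch, calling `migrate(old_provider, new_provider)` and then running `count_wells` against either object gives the same answer, because the application depends only on the interface and not on how each provider stores the data.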

It is also possible to combine data held in OSDU with data held in another system, over this API.

As a 'data universe', it is not OSDU's ambition to contain any software other than what is needed to run the data platform itself. So any software applications would sit outside OSDU, connecting via the API.

OSDU is an open source project, which uses many other open source projects and standards. These include Apache Airflow for managing the workflows, PROJ for geodetic transformation, Elasticsearch as the search engine, and REST for the APIs. The main upstream-specific standards used are the Energistics ones.
'Those things are all fundamental parts of the OSDU platform,' he said.

Work on OSDU began in late 2019. A large part of the original OSDU code was written by Schlumberger, and donated in autumn 2019.

A long-term ambition of OSDU is that oil and gas companies use it for all upstream technical data, including data on carbon capture, underground storage, and other types of energy.

The OSDU website has a great deal of free, openly available documentation about the system.

Using OSDU

The OSDU platform includes tools for 'ingesting' (taking in) data. These come with 'workflow orchestration', meaning guided tools for the various steps of uploading and reviewing data.

If the data is provided in a standard format, it can be uploaded directly in that format.

Data generated offshore, such as sensor data, can be uploaded to a cloud server directly from offshore.

The data needs to have some sort of governance, in a process that can be manual or automated. It can be given a tag indicating the quality of the data, such as 'ready for use' or 'low quality.'
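An automated version of such a governance step might look like the following Python sketch. It is an illustration only, not OSDU code: the `Record` class, the `REQUIRED_FIELDS` rule and the `apply_quality_tag` function are assumptions made for this example. The two tag values are the ones mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical data record with free-form metadata and governance tags."""
    record_id: str
    metadata: dict
    tags: dict = field(default_factory=dict)


# Assumed rule for this sketch: a record is usable only if these fields are filled in.
REQUIRED_FIELDS = ("name", "country", "acquired")


def apply_quality_tag(record: Record) -> Record:
    """Tag the record 'ready for use' or 'low quality' based on metadata completeness."""
    missing = [name for name in REQUIRED_FIELDS if not record.metadata.get(name)]
    record.tags["quality"] = "low quality" if missing else "ready for use"
    record.tags["missing_fields"] = missing
    return record
```

A manual process would set the same tag after a person has reviewed the record. The point of the sketch is only that the quality label travels with the data as a tag.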

There are three sorts of data within the system: master data (the subsurface data itself), reference data / metadata (which describes what the data is), and 'work product components' (which describe the operations on the data).
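As a rough illustration of this three-way split, the Python sketch below labels each item with one of the three categories and groups a mixed list by category. The names `Category`, `Item` and `group_by_category` are hypothetical and are not taken from the OSDU schemas. The category descriptions simply follow the wording above.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    MASTER = "master data"                       # the subsurface data itself
    REFERENCE = "reference data / metadata"      # describes what the data is
    WORK_PRODUCT_COMPONENT = "work product component"  # describes operations on the data


@dataclass(frozen=True)
class Item:
    """Hypothetical catalogue item carrying its category as a label."""
    item_id: str
    category: Category


def group_by_category(items):
    """Return a dict mapping each category to the ids of the items in it."""
    groups = {category: [] for category in Category}
    for item in items:
        groups[item.category].append(item.item_id)
    return groups
```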

Some countries have rules that their oil and gas data cannot leave their borders. OSDU has tools to manage that, perhaps connecting the global OSDU system with an OSDU system hosted within that country.

A key part of the platform is the ability to search or 'discover' data, which requires some data indexing.

It can support 'all the data operations that you'd expect,' he said.

Not a data model

It may be easy to confuse the data storage aspects of OSDU's platform with a relational data model. It is very different.

A relational data model is a structure for how data elements themselves relate. For example, a spreadsheet can contain a relational data model if it shows the date of birth of a number of people, or how a sensor reading changed over time. Relational data models are not limited to two variables; they can have any number. The weakness is that the structure is normally very rigid. Once software has been written around data in a relational data model, it is very hard to change anything.

OSDU's data storage could be described instead as an architecture for structuring the metadata about data items, so you can access the item that you want.

If we are searching for data files like seismic data, it makes more sense to do it by looking through the metadata, rather than looking in the data itself.
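The idea of searching metadata rather than the data itself can be sketched in a few lines of Python. This is a toy in-memory index, not the Elasticsearch-based search that OSDU uses. `CatalogEntry`, `build_index`, `search` and the example file paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical pointer to a data file plus the metadata that describes it."""
    path: str
    metadata: dict


def build_index(entries):
    """Map each (field, value) pair to the set of file paths carrying it."""
    index = {}
    for entry in entries:
        for key, value in entry.metadata.items():
            index.setdefault((key, str(value).lower()), set()).add(entry.path)
    return index


def search(index, **criteria):
    """Return paths whose metadata matches every criterion; files are never opened."""
    matches = None
    for key, value in criteria.items():
        paths = index.get((key, str(value).lower()), set())
        matches = paths if matches is None else matches & paths
    return sorted(matches or [])


# Example catalogue: the large seismic and log files stay untouched on disk.
catalogue = [
    CatalogEntry("surveys/north_sea_3d.sgy", {"type": "seismic", "country": "Norway"}),
    CatalogEntry("surveys/gulf_2d.sgy", {"type": "seismic", "country": "USA"}),
    CatalogEntry("wells/a12.las", {"type": "well log", "country": "Norway"}),
]
index = build_index(catalogue)
```

Here `search(index, type="seismic", country="Norway")` returns only the path of the matching survey. The lookup touches the small metadata index and never reads the seismic file, which is why it stays fast however large the underlying data items are.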

The data items in OSDU may or may not be relational. For example, they are relational if they are sensor data over time; they are not relational if they are documents, well logs, or seismic data.

Open Group

The OSDU forum is a group within the standards organisation Open Group. 'Open Group is similar to Energistics but much larger,' Mr Hollingsworth said.

The background of Open Group is that it was formed from the merger of groups running different UNIX systems in the 1990s. Open Group owns the UNIX trademark, Mr Hollingsworth said.

Open Group has some history in the oil and gas industry. Its 'Open Process Automation Forum', which works on an open way to describe process automation data, was established by a number of oil and gas operators.
