
How to get better at Data Operations

Thursday, July 6, 2023

Data Operations, or 'DataOps', is what companies need to get good at if they are to become 'digital companies'. A webinar with Forrester and Cognite reviewed what that means for heavy industry companies.

'Data operations', or DataOps, is a term for the practices of managing data in pipelines, rather than as static or periodic data, such as in reports or files.

Most data is originally created as a continuous stream, such as from continuous sales or emissions. But we have got used to treating it in batches, such as for annual emissions data, annual revenue data, or subsurface survey data, because that makes it much easier.

But if we are going to do more with data, such as helping people get better situational awareness of where they are right now and what to prioritise, then it will help if we are looking at data in pipelines of continuous flow, rather than waiting for the next interim report.

The data infrastructure needed to handle data as pipelines is more complicated and evolved, and demands more data standardisation. But once we have a data pipeline system, it is comparatively easy to build new valuable solutions, such as tools to inform our situation awareness and decision making.
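The difference between batch and pipeline handling can be sketched in a few lines. This is an illustrative example, not from the webinar; the readings are invented. A batch view produces one figure after the period closes, while a pipeline view keeps a running figure up to date as each reading arrives.

```python
def running_total(readings):
    """Yield a cumulative total as each reading arrives,
    instead of summing a closed batch at the period end."""
    total = 0
    for value in readings:
        total += value
        yield total

# Batch style: one number, available only after the period closes.
batch_total = sum([2, 3, 5])

# Pipeline style: an up-to-date figure after every reading.
stream_totals = list(running_total([2, 3, 5]))
print(batch_total)        # 10
print(stream_totals)      # [2, 5, 10]
```

Both approaches end at the same number; the pipeline version simply makes it visible continuously, which is what situational-awareness tools need.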

A challenge is that it is hard for a company to justify the expense and effort of turning their existing data management systems, which we could consider 'static', into pipeline-based systems. That is, until we reach the point where we can provide exciting digital solutions. Then the value is obvious.

A webinar organised by data operations technology company Cognite, with a speaker from Forrester, explored the topic.


What maturity looks like

Michele Goetz, principal analyst with Forrester, sees three levels of maturity in a company's data operations, which she terms 'walk', 'run' and 'fly'.

The highest, 'fly', level looks like your idea of a 'digital business', she said: for example, farms where the tractors are all autonomous, engineering work designed and tested using simulations, and people with tools to help them continually prioritise work and refine schedules.

Many companies state something like this as their desired digital destination, such as saying that they want to take certain data sources and build machine learning models and deploy them, she said. 'It's not all that easy. We have to think about how we go about this.'

One example of a company with good 'data operations' she provided was an oil refinery which has managed to organise or 'tune' its production to the expected arrivals of tankers collecting product orders. This can lead to savings over the conventional method of organisation at a refinery, where the refinery makes different products and then keeps them in storage until the designated vessel arrives to collect them.

With good data operations, it is easier to bring in machine learning tools to simulate different scenarios and use cases and work out the most likely best berthing schedule, she said.

The data operations approach combines data streams from the refinery process with data streams about expected future vessel arrival times.

Normally, scheduling the berth for ship arrivals is an extremely manual process, based on estimating when different ships are due to arrive from their current locations.
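The webinar did not show how such a scheduler works; the sketch below is an invented illustration of the basic idea. It pairs each vessel's estimated arrival time with the hour its product batch is predicted to be ready, and reports the earliest feasible berthing hour for each. Vessel names and figures are made up.

```python
def berth_schedule(vessel_etas, product_ready):
    """Pair each vessel with its cargo's predicted ready time and
    report the berthing hour: the later of arrival and readiness.
    Both inputs map vessel name -> hours from now (invented data)."""
    schedule = {}
    for vessel, eta in sorted(vessel_etas.items(), key=lambda kv: kv[1]):
        schedule[vessel] = max(eta, product_ready[vessel])
    return schedule

etas = {"MT Alpha": 30, "MT Bravo": 12}    # predicted arrivals, hours
ready = {"MT Alpha": 24, "MT Bravo": 18}   # predicted product-ready, hours
print(berth_schedule(etas, ready))
# Bravo arrives at h12 but its product is only ready at h18;
# Alpha's cargo is ready (h24) before it arrives at h30.
```

A real system would feed both inputs from live pipelines, with production forecasts on one side and AIS-derived vessel ETAs on the other, and would simulate many candidate schedules rather than applying one rule.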


Walk and run levels

The most basic maturity level for data operations, which Ms Goetz defines as 'walk', is similar to the processes most companies have now, with enterprise data management systems. Data is often handled manually and brought together to give useful views to decision makers, she said.

The data sets themselves are often large and messy, so cannot be organised by machine - only manually.

It can be tricky to get investment to do more with it, because it isn't obvious what the direct benefits of this would be. 'You are hunting for a use case,' she said.

The next level, which Ms Goetz calls 'run', builds on the 'walk' level. It brings in more of a data engineering type approach.

You have people with data competency working in different areas of the company, improving the data flows, and building tools people want.

You might invest in data quality tools, and data 'virtualisation' systems which make it easy to search and retrieve data from multiple systems at once. You may also have cloud data pipelines, where data is sent to cloud servers, and data preparation tools.
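As a minimal, invented illustration of what a data quality tool automates: flagging readings that fall outside a plausible range, and gaps where a sensor stopped reporting. The thresholds and readings here are made up.

```python
def quality_flags(readings, lo, hi, max_gap_s=60):
    """Flag out-of-range values and timestamp gaps in a sensor stream.
    readings: time-ordered list of (unix_timestamp, value) pairs."""
    issues = []
    for i, (ts, value) in enumerate(readings):
        if not lo <= value <= hi:
            issues.append((ts, "out_of_range"))
        if i > 0 and ts - readings[i - 1][0] > max_gap_s:
            issues.append((ts, "gap"))
    return issues

data = [(0, 21.5), (30, 22.0), (200, 95.0)]   # invented readings
print(quality_flags(data, lo=0, hi=50))
# [(200, 'out_of_range'), (200, 'gap')]
```

Commercial tools apply checks like these continuously across thousands of tags, but the underlying logic is this simple.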

But there are still challenges in identifying, sourcing, and understanding the data, and challenges getting executive support for projects, and not much centralised data management, she said.


The fly level

The 'fly' level, building on the 'run' level, is when you can get to 'solution engineering', the ability to develop tools with specific outcomes.

Data operations has more of a 'federated' structure, with data engineering staff working within different company divisions also co-ordinating back with the centralised enterprise data management.

This way they might find ways to re-use data assets and systems within the company.

It should be management culture to rely on data-driven decision making, and to trust that the data is correct. There should be improved collaboration and co-ordination across different data and analytics processes.

The company supports the data related skills of its employees, such as for advanced analytics / data science, and AI. There is co-ordination and collaboration between data related roles, such as data science, cybersecurity and IT.

The company is able to contextualise data, bringing different data sources together.
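At its simplest, contextualising data means joining sources on shared identifiers, so a reading carries information about the asset it describes. The sketch below uses invented tag names and equipment records to illustrate the idea with plain Python.

```python
# Invented example data: join time-series tags to equipment records so
# each reading carries its asset context (what it measures, its unit).
sensor_readings = [
    {"tag": "PT-101", "value": 4.2},
    {"tag": "TT-205", "value": 88.0},
]
equipment = {
    "PT-101": {"asset": "Pump A", "unit": "bar"},
    "TT-205": {"asset": "Heater B", "unit": "degC"},
}

contextualised = [
    {**reading, **equipment.get(reading["tag"], {})}
    for reading in sensor_readings
]
print(contextualised[0])
# {'tag': 'PT-101', 'value': 4.2, 'asset': 'Pump A', 'unit': 'bar'}
```

In practice the join keys are rarely this clean: matching tags across IT and OT systems, documents and 3D models is much of what industrial DataOps platforms automate.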


Technology products

It is unlikely that you can purchase a single 'data operations platform' to do all this, because every company's data streams are different, she said.

You will probably need a number of different tools, and a focus on 'data product portfolio management' to manage these tools, 'or everything becomes chaos.'

'You need to manage the pipelines, synchronising the different processes, continuously deploy new capabilities.'

You will probably also see company 'business users' getting involved in choosing and configuring tools, and perhaps making models, rather than leaving all this to the IT department, she said. 'The literacy of subject matter experts to work with the data is going to increase.'


Cognite perspective

'Working with data is very hard, it has a unique set of challenges,' said Knut Vidvei, lead product manager with Cognite. 'There are specific complexities which come in the industrial space, asset heavy industries particularly.'

Cognite is a Norwegian company which describes its core product, Cognite Data Fusion, as 'The Industrial DataOps Platform'. Mr Vidvei works as a product manager on Cognite Data Fusion, developing better ways to make data available to subject matter experts and data scientists in a company, and building solutions to do it.

Typical projects can involve large volumes of data, very specific types of data, and challenges of matching data from the IT and OT side, such as sensor data, images and 3D data, he said.

And while the data is complex, the tools and practices to get value out of it need to be simple, user friendly, and 'understand the industry's language', he said.

'Finding data, having trust in the data, understanding it in context, is really time consuming.'

The goal is 'easy access to all IT, OT, engineering, visual data, available to you in a simple way in context,' he said.

'We want to empower everyone, specifically people with domain knowledge, to rapidly build and scale solutions.'

It should also be possible to re-use these solutions, rather than building a new one every time, he said.

Data Operations processes can enable machine learning processes, Mr Vidvei said. You can have large volumes of data in a standardised format, which is ready for analytics, such as to look for patterns.


Classes of data

A starting point is to identify the different classes of data, which all need different techniques. Mr Vidvei segments data found in an oil and gas company into engineering, operational, visual and conventional.

In this categorisation, engineering data includes documents and piping / instrumentation diagrams. Operational data includes time series sensor data, simulations, and documents. Visual data includes 3D models, laser (LIDAR) scans and video. Conventional IT data includes work orders, tabular documents (spreadsheets), equipment data, and ERP data.
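Mr Vidvei's four classes could be represented directly in code, for example to route incoming files to the right handler. The categorisation is from the article; the file extensions and mapping below are invented for illustration.

```python
from enum import Enum

class DataClass(Enum):
    ENGINEERING = "engineering"    # documents, P&IDs
    OPERATIONAL = "operational"    # time series, simulations
    VISUAL = "visual"              # 3D models, LIDAR scans, video
    CONVENTIONAL = "conventional"  # work orders, spreadsheets, ERP

# Invented mapping from file extension to class, for routing.
EXTENSION_CLASS = {
    ".pid": DataClass.ENGINEERING,
    ".csv": DataClass.OPERATIONAL,
    ".las": DataClass.VISUAL,
    ".xlsx": DataClass.CONVENTIONAL,
}

def classify(filename):
    """Return the data class for a filename, or None if unrecognised."""
    for ext, cls in EXTENSION_CLASS.items():
        if filename.endswith(ext):
            return cls
    return None

print(classify("scan01.las"))
```

Each class then gets its own ingestion technique: OCR and diagram parsing for engineering documents, time-series databases for operational data, point-cloud processing for visual data, and ordinary tabular pipelines for conventional IT data.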


Domain experts

Company domain experts are very important in helping you work with this. Mr Vidvei defines a domain expert as 'a person with expertise in the field, typically an engineer or specialist.' Examples are engineers specialising in maintenance, production, mechanical / reliability, process / chemical, petroleum, and performance. Also plant managers, asset managers, asset operators and data analysts. 'These people are all over your company,' he said.

'They are not transitioning to become 'data professionals' but using data as part of day-to-day work.'

'Most of them don't code or prefer not to code to solve day to day problems. So, we need 'out of the box' tools which work, with no code.'

Any software tools 'should be simple and speak your language,' he said.

It would be useful if ready built models were available for specific tasks. 'You can just launch them and start the job,' he said.

For example, Cognite produces packaged but customisable models for production optimisation, smart maintenance, 'digital worker' and sustainability.


Watch the webinar online here
https://www.cognite.com/en/webinar/unpacking-the-technology-behind-the-practice
