Can we improve corporate search?

Friday, January 6, 2017

Most of us expect a search engine to be a tool which delivers us results such as documents, web pages, people profiles, lessons learnt and best practices when we type something into a box. But can it do more? Oil and gas data scientist Paul Cleverley is doing a PhD to try to find out.

Development of 'enterprise search' technology has fairly stagnated in companies, said Paul Cleverley, speaking at the Digital Energy Journal Aberdeen conference in May, 'Subsurface computing and competitive advantage'.

It is usually seen as a utility, not something which affects the bottom line of the company. 'Quite often the user interface is a pretty bland search box,' he said. This may follow the theory-in-use dominant culture of Google which drives our expectations, but are some latent needs going unmet?

Considering the benefit to the company of making it easier for people to find what they are looking for and what may be valuable, perhaps it is worth investing in a better search engine architecture and design, Mr Cleverley believes.

Company information as a whole may be under-exploited and under-explored. Where the 'whole is greater than the sum of the parts', using the right approach, company information may be able to surface an answer or association that is not present in any one single document. Using a common metaphor, as well as finding 'needles in haystacks', smashing together information haystacks and finding 'new needles' could be a gamechanger.

This isn't just an oil and gas problem. Scientists and engineers are generally interested in similar concepts in any industry, he said. Some of the concepts from the research have been shared with NASA and incorporated into their communication and designs.

Better searching tools can help geoscientists be more objective - weighing up a range of different possibilities, rather than sorting for information which fits their hypothesis.

Many of us may have heard executives saying that having a geologist who knows the basin inside out is a valuable asset - but can also be a liability, if he or she is not willing to engage with an alternate point of view about how it works.

Paul Cleverley is an information scientist. He is in his 4th year of a PhD at Robert Gordon University, Aberdeen.

Push technology

You might have also noticed that search engines we use in our personal lives (for example, searching for celebrities) are making far more effort these days to tell you something you might want but didn't actually look for.

Search for a celebrity in Google and you might see Wikipedia information, the latest movies that person is in, former partners and even how much money they are worth.

So we are getting beyond 'pull' (the person enters a search and gets results back) to 'push' (enter a search, and the computer makes all kinds of other suggestions).

How could this kind of thinking apply to oil and gas subsurface?

Like all humans, subsurface people try to understand something by developing a theory on it and testing it, Mr Cleverley says. They look for data which reflects their theory. But this can mean that they miss something which they are not actually looking for.

Like all humans, subsurface people can also find something useful which they weren't actually looking for, just by ferreting around and stumbling on something.

A mood for serendipity

The likelihood of chance discoveries may also be linked to personality and state of mind, Mr Cleverley said. Many of us will have observed that some of the people we know personally seem to have more 'happy accidents' than others.

Much of it comes down to people's personalities and mood. Perhaps the more focussed someone is on a narrow goal, the less likely they are to find something they weren't looking for. But the information provided to them also plays a role.

Perhaps search systems can support a more 'serendipitous' frame of mind, where people are more open to different ideas, by giving them more results which they weren't expecting but still find helpful, without distracting them.

Studies show that 80 to 90 per cent of information searches in companies are looking for a specific item, where there is a single right answer, he said. The rest could be called 'exploratory' search, with multiple results, where people don't know exactly what they are looking for and won't necessarily recognise it if they are given it. They are learning.

Another approach is for the computer to get better at working out what you may be searching for, Mr Cleverley said.

Looking for a specific piece of information in corporate document management systems can be much harder than finding an exchange rate on Google. If you are searching geological information by word, it can be very hard to come up with the perfect string of words which will find the document you want.

Some companies have developed enterprise software with personalized 'activity streams', like on social media, where you can see what other people are doing around you. This can lead to useful chance encounters with information, but won't necessarily find something which helps you, unless someone else has found it already.

Other companies have experimented with tagging systems, where someone has to manually tag a document with labels such as 'Jurassic'. But this relies on someone tagging the original document, and it can be hard to get oil and gas people to do that. Using pre-filled metadata templates and positioning search within a specific task context and technology can also help accuracy.

It is likely that a blend of manual and automated machine reading techniques is needed to organize content 'with search in mind', rather than one approach or the other, as an enterprise strategy.

Mr Cleverley interviewed staff from both a large oil company and a small geoscience consultancy, and asked them whether their current search systems can facilitate chance encounters. Of those asked, 41 per cent said 'to a moderate extent'. This rose significantly when staff were presented with certain algorithms to augment their current search systems.

Word patterns

One way computers might get better at bringing you something helpful is by gaining a better automated understanding of the 'meaning' in a document, by looking for words which often occur together, combined with predefined knowledge representations.

Computers can automatically scan millions of documents in this way, using probabilities, to work out which word connections may be most useful.

This can be linked to a human entering known information about the domain, such as what sort of data is likely to be linked to a well, or what rules will show you whether the relationships you have found will fit the real world.

The computer can also make automatic connections between documents, so if you find one document useful, you might find another similar one useful.

As one example, a tool was built in less than a week by an oil and gas company using the published research. People could search their reports with a primary term (such as 'submarine fan') and multiple secondary terms (such as geological eras, basin names or company names).

The computer system can automatically work out which associated terms might be connected to each primary and secondary term, by looking in the body text of documents to see which words typically occur around it.
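As a rough illustration of the idea of looking at which words occur around a term, here is a minimal Python sketch. It is not the tool described in the article: the function names, the five-word window, the stopword list and the example sentences are all assumptions made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def associated_terms(documents, term, window=5, stopwords=frozenset()):
    """Count the words found within `window` tokens of each mention of `term`.

    `term` may be a multi-word phrase such as 'submarine fan'.
    """
    target = tokenize(term)
    n = len(target)
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                left = tokens[max(0, i - window):i]
                right = tokens[i + n:i + n + window]
                counts.update(w for w in left + right if w not in stopwords)
    return counts


# Invented example sentences, standing in for the body text of reports.
reports = [
    "The submarine fan contains thick turbidite sandstone in the Jurassic section.",
    "A Jurassic submarine fan with turbidite channels was mapped.",
    "Fluvial channels dominate the Triassic interval.",
]
STOPWORDS = frozenset({"the", "a", "in", "with", "was", "of"})

top = associated_terms(reports, "submarine fan", window=5, stopwords=STOPWORDS)
print(top.most_common(3))
```

A real system would work over millions of documents and would probably weight the counts statistically rather than use raw frequencies, but the principle of building a list of associated words from the surrounding text is the same.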

People were 'browsing' with it. They have a mental model of which terms are usually associated with a certain primary term in that secondary context, look for something unusual, and then dig deeper to find out why a certain word would be associated with that primary term, for example a geological feature found with a depositional environment you would not usually expect. This led to valuable insights that influenced their regional play model, which they would not have had otherwise.

Mr Cleverley likens it to a shop, where you walk in and something catches your eye because you have not seen it before.

It may mean there is something unusual happening in this region which isn't happening in any other, or an activity one company is doing that others are not. This could be called 'discriminatory' search, looking for something with an unusual match of characteristics.
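One way to picture a 'discriminatory' search is to compare how often a word appears near the primary term in one context (say, one basin) with how often it appears in all the others. The Python sketch below does this with a simple smoothed frequency ratio. The scoring method, the function name, the basin labels and the counts are all invented for illustration and are not taken from the research.

```python
from collections import Counter


def unusual_terms(context_counts, target, min_count=2):
    """Rank words by how much more frequent they are in `target`
    than in all other contexts combined (ratio above 1 = over-represented)."""
    target_counts = context_counts[target]
    rest = Counter()
    for name, counts in context_counts.items():
        if name != target:
            rest.update(counts)

    target_total = sum(target_counts.values())
    rest_total = sum(rest.values())
    vocab = len(set(target_counts) | set(rest))

    scores = {}
    for term, count in target_counts.items():
        if count < min_count:
            continue  # ignore words seen too rarely to judge
        # Add-one smoothing so words absent elsewhere do not divide by zero.
        p_target = (count + 1) / (target_total + vocab)
        p_rest = (rest[term] + 1) / (rest_total + vocab)
        scores[term] = p_target / p_rest
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented counts of words found near 'submarine fan' in three hypothetical basins.
context_counts = {
    "Basin A": Counter({"turbidite": 8, "sandstone": 6, "evaporite": 4}),
    "Basin B": Counter({"turbidite": 9, "sandstone": 7}),
    "Basin C": Counter({"turbidite": 7, "sandstone": 5, "channel": 3}),
}

ranked = unusual_terms(context_counts, "Basin A")
for term, score in ranked:
    print(f"{term}: {score:.2f}")
```

In this made-up data, 'evaporite' comes out on top for Basin A because it never appears near the primary term in the other basins. That is the kind of unexpected association a geoscientist might then want to dig into.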

The opposite is a 'similarity' search, where the computer searches thousands of documents for an entity such as geological formation names. You end up with a kind of fingerprint for what words are most associated with entities, such as 'Kimmeridge Clay'.

You might be looking to see what formations are similar to the one you are currently reading an article about, and click on 'find similar' for analogues. This is still search, but instead of getting a list of documents back, you are getting a list of rock formations (entities).
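The 'find similar' idea can be sketched by treating each entity's fingerprint as a bag of associated words and comparing fingerprints with cosine similarity. This is one common technique, assumed here for illustration; the article does not say which similarity measure was used. 'Formation X' and 'Formation Y' are hypothetical names and all the word counts are invented.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count fingerprints (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(fingerprints, entity, top_n=3):
    """Return the entities whose fingerprints are closest to that of `entity`."""
    query = fingerprints[entity]
    scored = [
        (other, cosine(query, fp))
        for other, fp in fingerprints.items()
        if other != entity
    ]
    scored.sort(key=lambda kv: kv[1], reverse=True)
    return scored[:top_n]


# Invented fingerprints: words most associated with each formation name.
fingerprints = {
    "Kimmeridge Clay": Counter({"shale": 10, "organic": 8, "source": 7}),
    "Formation X": Counter({"shale": 9, "organic": 6, "source": 5, "marine": 2}),
    "Formation Y": Counter({"sandstone": 10, "fluvial": 7, "channel": 5}),
}

analogues = find_similar(fingerprints, "Kimmeridge Clay")
for name, score in analogues:
    print(f"{name}: {score:.2f}")
```

Note that the result is a ranked list of formations rather than a list of documents, which is the point made above about getting entities back from a search.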

Satisfied with search?

Mr Cleverley ran an experiment with 26 experienced oil and gas professionals, to try to understand how well their searching is going.

It found that, regardless of search technology, companies may have underestimated the role that the searcher's mental models of the information space play in delivering task outcomes for exploratory search. There was significant variation in how well experienced staff performed, given the same knowledge of the tool and task. Those who constructed the most sophisticated search queries were often outperformed by those who did not. There is likely to be more to searching than simply knowing how to construct search queries.

An odd discovery was that 60 per cent of people declared themselves satisfied with their search results, although people only found 27 per cent of the high value items present for the given task. An almost complete absence of formal and informal learning practices for search based on outcomes was identified.

'There was no association between how satisfied people were and how well they did,' he said. This is significant. There are examples in the health sector where people have died because certain information that was present was missed during searching. Technology will not solve all our problems - in a world of increasing information volumes, search literacy (our mental models) and sensemaking of search results may be a source of competitive advantage for companies.

Different search presentations

Mr Cleverley ran an experiment with 70,000 SPE articles, made available under special agreement for academic use.

He tried different ways (algorithms) to present the result of search and asked 54 petroleum engineers and scientists from over 30 organizations which ones they preferred.

The experiment showed that not everyone preferred data to be presented in the same way, although some methods were more popular than others.

Modality

It appears companies may not be designing and deploying their search technologies with serendipity or creativity (to stimulate ideas) as a design principle.

Some companies may fall into the trap of technological solutionism and treat search as a technology problem to be solved, not as a system that includes the behaviours of people.

Instead of asking what technology is or should be used, perhaps another question to ask is: what principles are being used in your search deployment?

Search appears to be seen by many companies as simply a positivist utility, a time saver, to find specific things (containers of information) that already exist independent of us. That is certainly one important modality for search, but there may be so much more that search technology could suggest to us, which may enable us, in tandem, to construct new knowledge. After all, we may be asking the wrong questions.


