Artificial intelligence isn’t a magic box: Gazprom Neft specialist Anna Dubovik on the importance of digital product samples

Sputnik radio

Anna Dubovik, Head of Advanced Analytics and Machine Learning at Gazprom Neft, speaks to Sputnik radio at the World Digital Summit on the Internet of Things and Artificial Intelligence in Kazan.


— How should the flow of information on the development of neural networks and artificial intelligence (AI) be understood — where are we talking about genuine breakthroughs, and what is just PR?

— It seems to me that every industry — oil, IT, even retail — needs to make samples of its products available. When you go to a grocery store, you can slice off a bit of sausage and try it; it should be the same in IT. If someone offers you an extremely good solution, you should ask right away where you can read about it, who has written about it and where you can try it. It’s important to remember that artificial intelligence isn’t a magic box. That’s why we try to make samples of a number of our products available, so that companies across the world — clients, contractors and other developers — can see how we work and familiarise themselves with what we offer.

Nvidia is a well-known company that develops hardware for training neural networks. Their servers and cards are extremely expensive. Nonetheless, they make them available to other companies to test for free. These companies try them out, and if things don’t work out, they discuss why. That’s a lot of money to spend just to increase their influence across the world. As a company, we do something similar. Yes, we’ve invested significant resources and effort, but we want to put our developments out there, because that’s the point at which the industry will start developing and using them, and attracting the right kind of machine-learning talent.

— What results can you gain from this?

— The most striking example of free implementation is Google’s TensorFlow library, which they’ve developed and released worldwide. All the current developments in computer vision and machine learning are largely down to it. Just picture how exponentially the industry has advanced over the past four years. If they hadn’t done it, perhaps someone else might have. But now they’re the leaders, and everyone else compares themselves to them. So now, when we are facing a similar opportunity to consolidate our own position, I think this is the right strategy.
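To give a sense of what a library like TensorFlow automates, here is a minimal sketch in plain NumPy of the gradient-descent training loop at the heart of such frameworks. The toy data and learning rate are illustrative assumptions, not anything from the interview.

```python
import numpy as np

# Toy problem: recover w and b such that y = w*x + b, the kind of
# optimisation loop that TensorFlow and similar libraries automate
# and accelerate on specialised hardware.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 1.0              # ground truth: w=2, b=1

w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    pred = w * x + b
    err = pred - y
    # Gradients of mean squared error with respect to w and b
    grad_w = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges towards w ≈ 2.0, b ≈ 1.0
```

In a real framework the gradients are derived automatically and the loop runs on GPUs, but the underlying idea is this simple.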

— What companies are currently at the cutting edge in developing artificial intelligence in Russia? Is there a big gap between Google and the other global majors?

— I know that, according to several reports, Yandex, for example, is doing well with its self-driving cars, and the market finds them very interesting. And Yandex also has its own open-access library, through which you can judge the quality of their work. That’s definitely a breakthrough — we can sense it, we can feel it, you don’t just see it in press releases — it’s there as a product. So I think Yandex is a great example. There are many small companies, and even some research laboratories. I know there are definitely several teams at Skoltech that are running small projects in the field of machine learning — but they’re running them extremely well. And this is the most important thing — quality, not size or scale.

— Are there any common standards or roadmaps that global companies are working in line with for developing artificial intelligence?

— I know there’s major work going on to develop ISO standards, which exist in relation to all other areas, but I don’t think that’s the most important thing here. There are internal corporate standards, and all companies need to have these. These are the rules by which people write code. So we’ve been trying — albeit softly — to introduce our own corporate standard. It’s certainly available on an open-access basis through the Gazprom Neft repository. You can see the standard our models, model descriptions, datasets and metrics are written to. This definitely gives other companies an understanding of how we work.

Our company is focussed on partnership, including in the field of technology. Thanks to our corporate standard, any company can work with us in a consistent, standard format, and all our applications and solutions will be compatible; there won’t be any problems supporting third-party code.

— And what challenges can artificial intelligence already cope with, and what do you think it will be able to deal with in the future?

— If we’re talking about industry — and the oil industry, specifically — then our developments are starting to cover most aspects of the “big picture” in terms of field development and geological prospecting. We need to combine these into a single process, so that there’s almost no human involvement — so that the person is above the process. When we receive data from a geologist, or from seismic surveying, it is processed step by step, model by model, down the line. We’re building that interconnected process. The expert, in this case, acts as a validator and supporter of these solutions, so that we can control the entire process. This means not only can work be sped up, but employee time can be reallocated to more interesting and challenging tasks — and the oil industry has those in abundance — you can find new patterns, and add these to the model.

With regard to the industry as a whole, it seems to me that artificial intelligence has already penetrated pretty deeply; it’s everywhere — you’ve got it on your phone, FaceID — everything’s running on neural networks. They’ve even started processing photos on Instagram through pre-trained algorithms that make your life better, more worthwhile, and more beautiful. So it’s everywhere — and there’s more of it to come.
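The step-by-step, model-by-model pipeline with the expert “above the process” can be sketched as a chain of stages with a validation hook between them. All stage names, formulas and data below are hypothetical placeholders, not Gazprom Neft’s actual models.

```python
# Hypothetical stages: each consumes the previous stage's output.
def interpret_seismic(raw_traces):
    # Turn raw seismic readings into a structural depth estimate
    return {"horizon_depth_m": sum(raw_traces) / len(raw_traces)}

def build_geomodel(structure):
    # Turn the structural estimate into a reservoir property
    return {"net_pay_m": structure["horizon_depth_m"] * 0.01}

def plan_wells(geomodel):
    # Propose a well count from the reservoir property
    return {"wells": max(1, int(geomodel["net_pay_m"]))}

def run_pipeline(raw_traces, validate):
    # Each stage's output passes through the expert's validator before
    # the next model consumes it: the person is above the process,
    # not inside it.
    result = raw_traces
    for stage in (interpret_seismic, build_geomodel, plan_wells):
        result = stage(result)
        if not validate(stage.__name__, result):
            raise ValueError(f"expert rejected output of {stage.__name__}")
    return result

# Here the "expert" approves everything; in practice the validator
# would surface each intermediate result for human review.
plan = run_pipeline([2100.0, 2140.0, 2120.0], validate=lambda name, out: True)
print(plan)  # {'wells': 21}
```

The design choice is that the validator sits between stages rather than replacing any of them, which matches the expert-as-validator role described above.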

— The advantages of artificial intelligence are clear and well understood by everyone, but at the same time technologies, including those used in cyberattacks, are developing. What risks are companies likely to come up against in the future?

— There’s a good example in my own field, in generative neural networks: one network tries to fool, and another tries to detect whether it’s being fooled. And a lot of experiments have already shown that a network that’s trying to fool will keep trying until it actually breaks through. It will try everything: it might take a while, but nothing is absolutely safe and 100-percent protected. But there are certain guidelines that you have to adhere to. These concern coding rules, rules on data structuring, and your servers — everything you retain.

— So as it turns out, you have to be prepared for this in any case, right?

— Yes, being prepared to repel cyberattacks and safeguard your data is the most important thing. Because if the worst happens, an attack won’t unfold in a matter of seconds — there will be a window in which you can do some critical things, and it’s important for you to use that time as best you can.

Anonymising data is extremely important. There are certain things that are not protected at all, and today the majority of scandals arising from cyberattacks concern personal data. The biggest problem is that people aren’t paying attention to this. If you hold data, and it’s lying around somewhere and can be accessed, it must be anonymised. Everything we do is anonymised. When we’re working with fields, with wells, we don’t know where they are. If something happens, nobody knows where such and such a well is — whether it’s in Russia or not, whether it’s ours, or from open sources — it’s just a number, it exists somewhere. This won’t be useful for anyone else, because they won’t be able to interpret it and understand how to parse the data.
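The kind of anonymisation described — a well becomes “just a number” that nobody else can map back to a location — can be sketched with a keyed hash: identifiers are replaced by opaque tokens that are useless without the secret key. The key, well names and field values below are all hypothetical.

```python
import hashlib
import hmac

# The key must live outside the dataset; whoever steals the data
# without it cannot link tokens back to real wells.
SECRET_KEY = b"keep-this-out-of-the-dataset"

def anonymise(well_id: str) -> str:
    # Keyed hash (HMAC-SHA256) of the real identifier, truncated to a
    # short opaque token. Deterministic, so the same well always maps
    # to the same token and records stay joinable.
    digest = hmac.new(SECRET_KEY, well_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:12]

# Hypothetical records: real names are replaced before storage.
records = {anonymise(w): {"pressure_bar": p}
           for w, p in [("Field-A/Well-101", 187.0),
                        ("Field-B/Well-007", 203.5)]}

print(records)  # only opaque tokens remain — "just a number"
```

A keyed hash rather than a plain one is the important choice here: a plain hash of a small identifier space can be reversed by brute force, while the keyed version cannot without the key.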