Opinions & Interviews

2019-10-09 Severino Meregalli, Leonardo Maria De Rossi, Lorenzo Diaferia

How to Exploit Data in the Post-Digital Era?

A proposal to talk less about data, and start using it

The World Economic Forum (WEF) recently estimated that by the end of 2020 the amount of stored data will reach 44 zettabytes, a number of bytes forty times greater than the number of stars in the universe.

In the world of big data and digitalization, imaginative references such as that of the WEF on the trend in data growth have now become the norm. Research centers, vendors, and the publishing world compete in using analogies and exponential metrics to present the issue of data explosion. Since we now recognize that we are facing an unstoppable trend of great dimensions, the fundamental question increasingly becomes how to begin using this data and extracting value from it. The DEVO Lab at SDA Bocconi has conducted a series of studies on this subject, the latest of which was carried out with the encouragement and support of Google Italia.

The widespread nature of the phenomenon means that the initiatives that revolve around data involve multiple sources and actors that come together in all kinds of ways. In many cases, launching an initiative to exploit data requires forms of cooperation and partnership within a true "data ecosystem," where private actors (companies) and public actors (entities and organizations) can find mutual benefits.

What triggers this need to create chains that involve different types of actors? It is often the very nature of the data necessary to obtain concrete results, i.e. in initiatives that are driven not by a hype effect but by a real commitment to creating value, taking into account everything that is needed to generate it.

In this regard, a fundamental distinction is that between data on phenomena, i.e. data that describes a precise event (such as the volume of ice cream sales at a cafe), and data on the relevant context, which is essential for a conscious and non-distorted interpretation (for example, the temperature in the area where the ice cream is sold).

Matrix

Another element must be added: not all data can be used in the same manner. Some is in open format, i.e. accessible from public sources, while other data - the majority - is owned privately and held by companies or other entities.

Let us take the example of a company that decides to launch a project to generate value from the data on phenomena it possesses. In this case, depending on the aims of the project, it may be necessary to enrich that data with data that describes the context. Such data is complementary and essential to generate significant results. There are two options available for this enrichment: contact other companies that sell such information, or use platforms - if they exist - that allow for access to the data on the context in an open format. Where context data exists, the value obtained is higher than in cases where only data about the phenomenon is used. In some cases, without context data it is not even possible to obtain any value (for example, sales data without knowing the performance of the economy or the actions taken by competitors).
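As a minimal sketch of this kind of enrichment, the Python snippet below joins data on a phenomenon (daily ice cream sales at a cafe) with context data (the local temperature) and compares average sales on hot and cool days. All dates and figures, the 25 °C threshold, and the function names `enrich` and `mean_sales_by_heat` are hypothetical, chosen only for illustration; they do not come from the study.

```python
from statistics import mean

# Hypothetical data on the phenomenon: units of ice cream sold per day.
sales = {
    "2019-07-01": 120,
    "2019-07-02": 310,
    "2019-07-03": 295,
    "2019-07-04": 110,
}

# Hypothetical context data: daily maximum temperature (degrees Celsius),
# as it might be obtained from a vendor or an open data platform.
temperature = {
    "2019-07-01": 19.0,
    "2019-07-02": 31.5,
    "2019-07-03": 30.0,
    "2019-07-04": 18.5,
}


def enrich(phenomenon, context):
    """Join the two sources on date, keeping only days present in both."""
    return [
        (day, phenomenon[day], context[day])
        for day in sorted(phenomenon)
        if day in context
    ]


def mean_sales_by_heat(rows, threshold=25.0):
    """Return (mean sales on hot days, mean sales on cool days)."""
    hot = [units for _, units, temp in rows if temp >= threshold]
    cool = [units for _, units, temp in rows if temp < threshold]
    return (mean(hot) if hot else None, mean(cool) if cool else None)


rows = enrich(sales, temperature)
hot_mean, cool_mean = mean_sales_by_heat(rows)
print(f"Mean sales on hot days: {hot_mean}, on cool days: {cool_mean}")
```

Taken alone, the sales series shows only unexplained swings from day to day; once joined with the temperature, the same figures can be read against the conditions in which they occurred.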

What does all this mean for companies and public regulators?

For companies, this simple observation means that when a decision is made to take on a project of this type, proper weight must be given to every category of information required, and a structured relationship must be established with all of the actors involved, in order to create a chain of content that is well aligned with the goal.

For those who deal with policy-making, it means paying particular attention to the creation of context data, stimulating the emergence of public or private subjects that favor accessibility in an open format, and promoting a legislative framework that also incentivizes the reutilization of public open data.

These considerations are supported by the study of over 20 initiatives for data exploitation carried out principally in Italy and Europe. The analysis led to the identification of three fundamental questions for structuring projects based less on the hype of exponential data growth and more on the generation of measurable economic returns.

The key questions from this perspective are:

  1. Who are the essential actors for the success of the project?
  2. What should be the nature of these actors? Are they exclusively private subjects, or is some sort of public-private cooperation necessary?
  3. What are the steps to follow in the process to generate tangible economic value?

Actors

Synergetic cooperation between the five categories of actors described in the chart above turns out to be crucial. Among them, we must stress the importance of the technological infrastructure, which cuts across the various phases of the process and enables the management and analysis of the data. While it is true that the exploitation process is based on a careful evaluation of the actors who will have to preside over these five areas, it is also true that those roles often cannot be covered by the same organization, or by organizations of the same nature. This leads to the need to provide for and incentivize forms of cooperation between public and private subjects in order to supply the necessary data and capacity, depending on each case.

Interesting experiences in Italy, such as Trenitalia's Dynamic Maintenance Management System, the ACEA initiative, or the Milano City Pulse project, teach us that in addition to carefully defining all of the necessary roles, it is essential to take the right steps in the process.

The DEVO Lab reworking of the traditional Data Value Chain model makes it evident that a structured approach is necessary to go from raw data to the creation of information.

Value chain

In projects for data exploitation, the value chain must be driven by the final goal, which determines what is needed for the generation of economic value. The actors involved and the value chain must be organized and integrated in a single project as a function of this goal. The relationship between the sponsor of an initiative, data sources, aggregators, users, and supporting infrastructure needs to be placed within a broader framework of cooperation that allows for easier integration between data on phenomena and data on context, and that, through adequate public policies, encourages the creation, maintenance, and reutilization of data in an open format.

In this context, help can come from those companies that, due to their digital vocation and availability of infrastructure, find themselves managing large quantities of contextual data. These companies can offer, or should in any case be encouraged, to act as aggregators or basic infrastructure for data projects. Without this component, the transformation of data on phenomena into value often turns out to be fruitless.

A complete discussion of the outcomes and the instruments developed during the DEVO Lab-Google study cited above will be presented in an analysis in the first issue of E&M in 2020.

Severino Meregalli is Associate Professor of Practice of Information Systems at SDA Bocconi School of Management and Scientific Coordinator of the research lab “Digital Enterprise Value and Organization” (DEVO Lab). Lorenzo Diaferia is Trainee of the Knowledge Group Information Systems-IT and part of the core team of the research lab “Digital Enterprise Value and Organization” (DEVO Lab).

 
