Some considerations on Big Data.
As is now well known, Big Data is a technology which mainly deals with collecting, processing and selecting huge quantities of different data.
As in some of Hegel’s works, here Quantity immediately becomes Quality. The mass of data and the links between them change, and hence their meaning and use change as well.
It is a technology or, rather, a set of technologies joined together, which processes many terabytes (a terabyte being 2 to the power of 40 bytes, equivalent to 1,048,576 megabytes) at the same time. A huge amount and, above all, processed simultaneously: another type of quantity that is immediately turned into quality.
After the creation of the International Telecommunication Union in Geneva in 2017, still led by the Chinese Houlin Zhao, we have some additional facts to evaluate the extraordinary relevance of the Big Data Science.
To begin with, the mere everyday collection and processing of huge amounts of news allows, even by simple comparison, the discovery of many new data and often even of industrial or state secrets.
Moreover, if data can be processed along different chains of meaning at the same time, its full importance will be revealed, often in roles different from those in which we are used to interpreting it.
This is obviously essential to make the economic, financial, political, military or intelligence leaders’ analyses and decisions accurate and effective.
Approximately 90% of the data currently present in the world has been generated over the last two years. It seems impossible, but it is so.
Furthermore, every day 2.5 quintillion bytes of new data (a quintillion is 10 to the power of 18) are added to the big data networks alone, but 80% of this mass is not analyzed and cannot be studied with the usual comparative technologies, however fast we apply them.
According to other models for analyzing global news flows, over 1.2 zettabytes (a zettabyte being 10 to the power of 21 bytes, i.e. a sextillion bytes) were produced in 2010 alone, while in 2020 a total of 35 zettabytes a year will be produced.
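The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes a terabyte as 2 to the power of 40 bytes, a quintillion as 10 to the power of 18 (short scale) and a zettabyte as 10 to the power of 21 bytes, and it assumes that the daily figure of 2.5 quintillion is measured in bytes. The variable names are invented for the example.

```python
# Unit sizes in bytes. The terabyte uses the binary definition given in the
# text; quintillion and zettabyte use the standard short-scale/decimal values.
MEGABYTE = 2 ** 20
TERABYTE = 2 ** 40
QUINTILLION = 10 ** 18
ZETTABYTE = 10 ** 21

# One terabyte expressed in megabytes.
megabytes_per_terabyte = TERABYTE // MEGABYTE
print(megabytes_per_terabyte)  # 1048576

# 2.5 quintillion bytes per day (assumed unit: bytes), kept as an exact integer.
daily_bytes = 25 * QUINTILLION // 10
yearly_zettabytes = daily_bytes * 365 / ZETTABYTE
print(f"{yearly_zettabytes:.4f} zettabytes per year")

# Implied average yearly growth if output rises from 1.2 ZB (2010) to 35 ZB (2020).
growth_factor = 35 / 1.2
annual_growth = growth_factor ** (1 / 10) - 1
print(f"{annual_growth:.1%} average growth per year")
```

Under these assumptions the daily figure adds up to a little under one zettabyte a year, the same order of magnitude as the 2010 estimate, and the 2010-2020 projection corresponds to a growth of roughly 40% a year.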
Hence the larger the quantity and the more varied the forms of big data, the lower our ability to use it without very advanced technologies. However, the larger the quantity of big data, the greater the need to choose the policies to be adopted on the basis of these quantities.
Hence, if the world produces all this data, we must inevitably consider at least the reason for its huge size. The problems, too, are as big as Big Data.
Just think of environmental and ecological issues or of energy and Internet networks.
It seems almost a paradox, but it is inevitable that nowadays political, military and strategic decision-making is based on a quantity of news far exceeding what was available, even in the best cases, in the twentieth century.
Governments, however, mainly need the intrinsic predictive ability these new technologies have.
Certainly big data is currently needed, for example, to predict and manage road traffic in large areas and to organize healthcare, as well as to protect against terrorist attacks, to protect the environment and to guard against natural disasters.
Nevertheless, Big Data technology is particularly useful for evaluating the development trends of very complex phenomena: trends which emerge, and become visible and statistically relevant, only on the basis of huge amounts of data.
We are, however, heading towards a quantification of decision-making which is possible, both technologically and ethically, because the huge amount of data collected is anonymous, already structured and, above all, processed for strictly statistical purposes.
With specific reference to military and strategic defense and, in particular, to intelligence, which are already the strong points of big data technologies, the progress in news gathering stems from the creation of the In-Q-Tel company “incubator”, at least for the main US intelligence service, namely the CIA.
It is a non-profit company which evaluates and then invests in the most technologically advanced projects, or at least in those with some relevance for intelligence.
The initial idea for investing in Big Data – at least for the USA and its agencies – was to avoid the most serious mistakes of Human Intelligence (Humint).
Such mistakes had already occurred in Iraq and, previously, in Lebanon. Still today, however, data is catalogued according to the old system, which divides it into structured, semi-structured and unstructured data.
The first class is the one in which each storage element has at least four distinct characteristics identifying it. The second class has only some identifying features, which are never fully used.
The class of news that currently expands most is obviously that of unstructured data.
Nevertheless, the sequence of operations is more complex: in addition to the typical intelligence collection, data must be cleaned, annotated and represented in such a way that it is readily available for analysis. Furthermore, data needs to be processed and specific algorithms created, while mechanisms for measuring the similarity between news items must be developed so as to extract the news needed, which is probably not known to human users.
This technology is known as data mining.
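One very simple way to picture a “news similarity” mechanism is to compare the sets of words two texts have in common. The sketch below uses the Jaccard index for this purpose; it is an assumption chosen for illustration, not the method used by any specific intelligence system, and the news items and function names are invented.

```python
def tokenize(text):
    """Lowercase a text and return its set of words, stripped of punctuation."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return {w for w in words if w}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def most_similar(query, items):
    """Return (index, score) of the item most similar to the query text."""
    query_words = tokenize(query)
    scores = [jaccard(query_words, tokenize(item)) for item in items]
    best = max(range(len(items)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical news items, invented for illustration only.
news = [
    "Port authority reports unusual cargo traffic",
    "Central bank keeps interest rates unchanged",
    "Unusual cargo traffic reported near the northern port",
]

index, score = most_similar("cargo traffic at the port", news)
print(index, round(score, 2))
```

Real data mining pipelines would add the cleaning and annotation steps mentioned above and would normally use richer representations than bare word sets, but the principle of ranking items by a similarity score is the same.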
Algorithms are also used to build data collection models which continuously teach computers how to refine their searches.
This is what is known as machine learning.
Computers learn from a set of data, defined as “examples”, in an automatic process called learning: they automatically adjust their algorithms so as to assign already known values and categories to examples not yet classified, without deleting or changing the incoming data.
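The idea of assigning known categories to examples not yet classified can be sketched with one of the simplest learning schemes, a nearest-centroid classifier. This is merely one possible technique, chosen here for brevity; the feature values and the labels “routine” and “alert” are invented for the example.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of each label to obtain one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(centroids, features):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical labeled examples: (feature vector, known category).
examples = [
    ((1.0, 1.0), "routine"),
    ((1.2, 0.8), "routine"),
    ((5.0, 5.0), "alert"),
    ((4.8, 5.2), "alert"),
]

centroids = train_centroids(examples)
print(classify(centroids, (4.5, 4.9)))  # alert
```

The incoming example is only read, never modified: learning changes the model (here, the centroids), not the data.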
In more practical terms, thematic big data collections and the creation of examples permit the wide use of automatic transcription of audio conversations, with a view to making them searchable by key words. A sentiment analysis can then be made of the reactions on social media, thus mapping the reaction of the population to an event, a stance, a future law or a future trade war.
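A sentiment analysis of this kind can be sketched, in its crudest form, as a count of positive and negative key words. The word lists and the posts below are invented for illustration; real systems rely on trained models rather than on a fixed lexicon.

```python
from collections import Counter

# Hypothetical mini-lexicons, assumed for this example only.
POSITIVE = {"good", "support", "welcome", "fair"}
NEGATIVE = {"bad", "oppose", "unfair", "angry"}


def sentiment_score(post):
    """Number of positive words minus number of negative words in a post."""
    words = [w.strip(".,;:!?\"'()").lower() for w in post.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(posts):
    """Count how many posts are positive, negative or neutral."""
    summary = Counter()
    for post in posts:
        score = sentiment_score(post)
        if score > 0:
            summary["positive"] += 1
        elif score < 0:
            summary["negative"] += 1
        else:
            summary["neutral"] += 1
    return summary


# Hypothetical reactions to a proposed law.
posts = [
    "I support this, a fair move!",
    "This law is unfair and bad",
    "The vote is scheduled for Tuesday",
]

print(summarize(posts))
```

Aggregating such counts by time or by place is what turns individual reactions into a map of popular sentiment.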
There is also, among others, the Geofeedia software, another example of sectoral use of machine learning in the Big Data sector: a platform enabling analysts to monitor social media in geo-localized areas.
In the case of the analytical process, the large “trawlers” of Big Data are mainly needed to define the most probable strategic scenarios in the future or to create more specific and operative working assumptions in the intelligence field, or to analyze the opinion trends of the public and of the debate within the party and Parliamentary ruling classes.
All this is certainly not enough, because the intelligence that matters is like the black pearl or the black swan, or the particular correlation that – if tested within a range of options – creates the most rational choice or, possibly, even the most obvious one for the leadership of an opposing country.
Here the issue does not lie in collecting all the stamps of New Guinea, but in finding the Penny Black that nobody has seen so far.
Nevertheless the analysis of the popular sentiment, or of the most obvious development trends of a social, financial or natural phenomenon, certainly guarantees that these options will be very probable and above all less “polluted” by adverse operations.
Or is this not the case? Indeed, the trolls’ actions are mainly related to the hybrid war and to the great operations of what, at the time of the Cold War, was called dezinformatsjia, literally “disinformation” in Russian.
In the pre-IT phase, before the World Wide Web reached its global dimension, disinformation meant targeting a certain sector of the adversary and saturating it with fake news, which would naturally lead to a wrong decision (to be exploited as the enemy’s mistake or incapacity), to a decision-making block, or to the decision that the Enemy wanted you to take. Everything changes, however, with the trolls, which are a product of Big Data.
Trolls are, in any case, subjects who interact on the Web with other participants without disclosing their identity.
Hence the trolls always operate with huge amounts of data that shield them from others’ sight. They enter the social media of vast user communities and always react so as never to disclose their true nature. They often split and create other trolls.
Hence online dezinformatsjia currently operates with large data sets, such as Big Data, and affects the vast masses of Web users with a view to changing their perceptions and their political action, even on the Web, as well as blocking any reaction in the masses penetrated by an Enemy and, indeed, creating a new self-image for them.
Much data, many features with which to hide the new identity of users-adversaries – and the more they are flooded with data, the more they will forget their old identity.
This is the action of a troll in the “hybrid war” and hence in what we could today define as an automated “mass psychological war”.
Currently there is a relationship that is both symmetrical and opposite between the Big Data of two enemy countries, as in the series of frescoes known as The Allegory of Good and Bad Government, painted by Ambrogio Lorenzetti and housed in Siena’s Palazzo Pubblico.
On the one hand, the Angels ensuring justice – the typically Aristotelian, “commutative” or “distributive” justice – on the other, the Bad Government, the devilish tyrant who administers cruelty, betrayal and fraud, which are the opposite of the three theological-political virtues of the Good Government.
Hence, in more topical terms, Big Data is an extraordinary equalizer of strategic power: there is no longer any country, small or large, nor any non-State community as opposed to traditional States, which cannot wage a fight, even one invisible to most, against major powers.
Nevertheless, reverting to the current strategic and technological situation, Big Data will have many unexpected effects, at military and geopolitical levels, that we can summarize as follows: a) all “high” and “low” communication will become mobile and geo-localized social media.
Hence, in the future, intelligence will increasingly deal with the selective dissemination of its data, as well as with their careful spatial-personal determination and with their specification according to areas and receptors.
We will have an increasingly tailor-made intelligence. Furthermore, b) the Big Data challenge is in some ways the opposite of the old Cold War-style technology.
While, in the past, the data collected ranged from Much to Little, looking for the confidential or secret information that changed the whole geopolitical perspective, nowadays it ranges from Much to Much, because the collection of declassified data – if well-processed – generates confidential news and information that are often unknown even to those who generated them.
Currently the secret is a whole technology, not just a mere datum or fact.
It is a technology changing according to the data it processes, precisely at the moment when it processes it.
Furthermore, c) the future “Big Data” solutions will be modelled and increasingly user-friendly.
They will often be intuitive and hence available also to medium-low level operators in the field.
The old division between “analysis” and “operations” will no longer exist. The true or fake news will be so manifold as to become – as such – war actions.
No longer messages to the ruling classes, but mass signals to the masses or selective operations for individual groups.
Moreover, d) the all-pervasive nature of the Web will be such as to create both new information opportunities and unavoidable “holes” that the Enemy will exploit easily.
Nor should we forget the use of other new technologies, such as laser optical space communications, which will make military and “service” communications safer – although further challenges, such as the new encrypted and adaptable “Internet of things”, will already be on the horizon.
In essence, in the intelligence field, Big Data will match the human operators’ analytical potential, thus making them often capable of operating in restricted and selected areas with a speed equal to that of the perceived threat.
A sort of “artisanalisation” of the intelligence Services’ analysis, which will incorporate more data from the action field and will be ever less controllable ex-ante by some central political authorities.
Again thanks to the huge amounts of incoming data (or data targeted to the Enemy), there will be vertical integration between strategic analysis and top political decision-making, while both analytical and operational choices will be entrusted to local units, which will see an ever-increasing integration between operators and analysts.
Nor must we forget the actual military technologies: the analysis of social networks, which can be automated, at least initially, and can manipulate both popular sentiment and the adversary’s technologies.
There is also the automatic updating of weapon system networks, increasingly integrated via the “Internet of Things”, as well as intelligence and the analysis of trends for tactical operations. Finally, there is activity-based intelligence, i.e. a methodology, again supported by IT networks, which allows the analysis of even microscopic anomalies in the enemy’s patterns of life.
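The search for small anomalies in a pattern of life can be pictured, in a deliberately simplified way, as a statistical test against a baseline of normal activity. The sketch below flags an observation whose z-score exceeds a threshold; the daily counts, the threshold of 3 and the function names are assumptions made for the example, not a description of any actual activity-based intelligence system.

```python
import statistics


def z_score(baseline, value):
    """Distance of a value from the baseline mean, in standard deviations."""
    mean = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    if spread == 0:
        # A perfectly constant baseline: any deviation at all is extreme.
        return 0.0 if value == mean else float("inf")
    return (value - mean) / spread


def is_anomalous(baseline, value, threshold=3.0):
    """Flag values lying more than `threshold` standard deviations from the mean."""
    return abs(z_score(baseline, value)) > threshold


# Hypothetical daily counts of some observed activity over one week.
baseline = [10, 12, 11, 9, 10, 11, 12]

print(is_anomalous(baseline, 11))  # False
print(is_anomalous(baseline, 25))  # True
```

In practice the baseline would be built from far larger and richer data, which is precisely why this kind of analysis depends on Big Data storage and processing.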
There will be new types of analysis and hence new collections of large (and new) data.
Hence not only Big Data, but new storage for new classes of data.
Moreover, we should not forget the real cultural revolution that all this very advanced technology will make absolutely necessary.
Hence, while in the past the intelligence area was well defined and concerned the (not always easy) correct perception of the national interest or the position of one’s own stable international alliances, currently, thanks to Big Data, all this becomes not obsolete, but in any case very different from the logic of Nation-States.
Nowadays, for example, the analysis of intelligence Services, at least of the most advanced ones, will be increasingly oriented to the creation and verification of the different fault lines of the opposing public opinions, or to a new sector we could define as “political intelligence”, which is no longer just the manipulation of the enemy ruling classes, nor even the current mass dezinformatsjia spread through Big Data.
In the future, I already see the creation of diversified managerial classes from outside, with the distribution of technologies which is allowed or forbidden depending on the geopolitical choices of one or more adversaries. Hence we shall imagine a new intelligence which, unlike what currently happens, plays a role in the determination of the international “value chains” and in the global distribution of work, but above all of the technologies that enhance it.
Everything will take place ex ante and ever less ex post. Nevertheless this implies a transformation of the ruling classes and hence a profound change in their selection.
GIANCARLO ELIA VALORI
Honorable de l’Académie des Sciences de l’Institut de France