Media monitoring focused on well-known topics that the public considers relevant is essential. But it is even more important to detect what was unknown until now and is just emerging. The automatic discovery of topics and the analysis of trends help us understand what we don’t know we don’t know.
Monitoring of media (both social and traditional) and other sources of information usually tracks and analyzes a “focus” established a priori: a set of known, predefined topics and aspects (people, companies, brands, industries…). This might be enough in scenarios where innovations and changes are infrequent; that is, in practically no scenario at the present time.
Indeed, monitoring the conversation about something everybody already agrees is worth tracking is no longer enough. In many cases, what really matters is to discover the emergence of new topics (from news or rumors) with the potential to become relevant. To capture this value, we need early warnings that put those topics “on the radar” and let us identify and understand the trend as soon as possible. In other words, we need tools to discover our “unknown unknowns”.
Discovering and identifying new trends
A community or a market may be talking in “business as usual” mode, around a set of topics, keywords and frequencies that are more or less known and stable. But suddenly people may start to talk about a genuinely new topic or issue (or known topics may multiply their frequency), or the conversation may even gravitate towards previously unknown terms.
This situation can indicate the emergence of a new, relevant topic or concept. It is important to discover and validate these emerging topics as soon as possible, because they might constitute market-moving information or trigger a crisis of some kind (an incident, a reputational crisis).
For example, for a provider of financial market information it may be important to detect, in forums and social networks, rumors about the unexpected merger of two companies, even a few minutes before such information “becomes news” and begins to appear on the screens of Bloomberg or Thomson Reuters. Similarly, for civil protection or emergency management services it is essential to discover as soon as possible that people are starting to talk about a mass gathering or a potentially dangerous incident.
But the discovery of our “unknown unknowns” is not limited to social media. Consider agencies and corporate departments engaged in user experience management and in the analysis of the voice of the customer coming from contact center interactions, satisfaction surveys, etc. For them, analyzing the voice of the customer according to predefined categories and topics (e.g. the company’s activities and departments, its products and brands, its competitors) is just as important as discovering the “new voice”: topics that were not on the agenda and emerge as relevant.
From a technical point of view, the detection of bursty keywords and clustering are useful tools for discovering possible trends. To identify a topic, it is necessary to recognize the concepts mentioned in a text, and then to examine how often those concepts appear in a set of texts and how they co-occur.
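The counting step described above can be sketched in a few lines. This is a minimal illustration, not the actual Daedalus pipeline: it assumes an upstream concept recognizer has already turned each text into a list of concepts.

```python
from collections import Counter
from itertools import combinations

def concept_stats(documents):
    """Count how often each concept appears across a set of texts,
    and how often pairs of concepts co-occur within the same text.
    `documents` is a list of concept lists, assumed to come from an
    upstream concept recognizer."""
    freq = Counter()
    cooc = Counter()
    for concepts in documents:
        unique = set(concepts)
        freq.update(unique)
        # count each unordered pair of concepts once per document
        cooc.update(frozenset(p) for p in combinations(sorted(unique), 2))
    return freq, cooc

docs = [
    ["merger", "acme corp", "stock"],
    ["merger", "acme corp", "rumor"],
    ["weather", "stock"],
]
freq, cooc = concept_stats(docs)
print(freq["merger"])                            # 2
print(cooc[frozenset(["merger", "acme corp"])])  # 2
```

Concepts that co-occur frequently can then be clustered together into candidate topics.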
These processes let us group concepts into topics or themes. But a given concept can be expressed in many ways, some of them multiword terms, so in all cases a normalization process is indispensable to identify unique concepts. To make a topic easier to understand, we must then choose a concept to represent it, taking into account aspects such as how frequently that concept is used or whether it stands for a named entity.
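The label-selection heuristic mentioned above might look like the following sketch. The preference order (named entity first, then frequency of use) is an illustrative assumption, not Daedalus's actual method:

```python
def choose_representative(variants, usage_freq, named_entities):
    """Pick one surface form to stand for a cluster of normalized
    variants of the same concept, preferring forms that are named
    entities and, among those, the most frequently used one.
    Illustrative heuristic only."""
    return max(variants, key=lambda v: (v in named_entities, usage_freq.get(v, 0)))

variants = ["ny", "new york", "New York City"]
usage_freq = {"ny": 40, "new york": 25, "New York City": 10}
label = choose_representative(variants, usage_freq,
                              named_entities={"New York City"})
print(label)  # "New York City": named-entity status outranks raw frequency
```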
Nevertheless, recognizing a trend and identifying a topic are not the same thing. A trend is an interesting evolution of a certain topic over time. To detect one, we must consider the a priori probability of a topic appearing a given number of times, in a specific period, in a set of conversations. If that probability is exceeded by a considerable margin, the topic may be of interest; if that behavior is sustained over several periods of time, we have a trend.
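This burst-then-persistence logic can be sketched as follows. The expected count is approximated here as a historical baseline rate times the window size, and the thresholds (`burst_factor`, `min_windows`) are made-up defaults for illustration:

```python
def is_trend(observed_counts, baseline_rate, window_totals,
             burst_factor=3.0, min_windows=2):
    """A topic 'bursts' in a time window when its observed mention
    count exceeds the a-priori expected count (baseline_rate * window
    total) by `burst_factor`. It is flagged as a trend when the burst
    persists for at least `min_windows` consecutive windows.
    Thresholds are illustrative, not Daedalus's actual parameters."""
    streak = best = 0
    for observed, total in zip(observed_counts, window_totals):
        expected = baseline_rate * total  # a-priori expected mentions
        streak = streak + 1 if observed > burst_factor * expected else 0
        best = max(best, streak)
    return best >= min_windows

# A topic that historically appears in ~1% of conversations:
print(is_trend([1, 12, 15, 14], 0.01, [100, 100, 100, 100]))  # True
print(is_trend([1, 2, 1, 2], 0.01, [100, 100, 100, 100]))     # False
```

A one-window spike is treated as noise; only a sustained deviation from the expected frequency counts as a trend.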
Most existing topic identification algorithms are not designed to also recognize trends, i.e. to take the time variable into account (at least not with moderate computing time). Moreover, none of them relies on linguistic information to normalize the references to a topic’s concepts, and on many occasions they do not even allow multiword terms to represent concepts. For this reason, at Daedalus we have developed our own extensions to topic detection algorithms, up to the point of enabling them to discover trends.
Monitoring and understanding trends
In order to monitor and understand what we have identified as a potential trend, we must be able to define its “meaning pattern”: the thematic categories, keywords, entities and concepts that define it. Once this is done, technology lets us detect all the conversations that match that pattern and group them automatically.
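A minimal sketch of such pattern matching is shown below. Representing a meaning pattern as a required set plus an "any of" set of normalized concepts is an assumption made for illustration; a real pattern would also cover categories and entities:

```python
def matches_pattern(concepts, required, any_of=()):
    """A conversation matches the pattern when it mentions every
    concept in `required` and, if `any_of` is non-empty, at least
    one concept from it. Field names and logic are illustrative."""
    found = set(concepts)
    return set(required) <= found and (not any_of or bool(set(any_of) & found))

pattern_required = {"merger"}
pattern_any_of = {"acme corp", "globex"}

conversations = [
    ["merger", "acme corp", "rumor"],
    ["merger", "weather"],
    ["acme corp", "earnings"],
]
matched = [c for c in conversations
           if matches_pattern(c, pattern_required, pattern_any_of)]
print(len(matched))  # 1: only the first conversation fits the pattern
```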
We should be able to perform an exhaustive, aggregated analysis of these conversations: identify the sentiment (positive, negative) associated with each aspect of every comment, discover the community’s perception of the trend (the semantically related concepts that appear most often around it), segment users, and track the trend’s evolution over time.
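The aggregation step can be sketched as follows. It assumes an upstream sentiment analyzer has already scored each (aspect, comment) pair with a polarity in [-1.0, +1.0]; the data and field layout are illustrative:

```python
from collections import defaultdict

def aggregate_sentiment(records):
    """records: (aspect, polarity) pairs, with polarity in [-1.0, +1.0]
    as produced by an upstream sentiment analyzer (assumed here).
    Returns the mean polarity per aspect."""
    totals = defaultdict(lambda: [0.0, 0])
    for aspect, polarity in records:
        totals[aspect][0] += polarity
        totals[aspect][1] += 1
    return {aspect: s / n for aspect, (s, n) in totals.items()}

records = [("service", 1.0), ("service", -1.0),
           ("price", -1.0), ("price", -1.0)]
summary = aggregate_sentiment(records)
print(summary)  # {'service': 0.0, 'price': -1.0}
```

Computing the same aggregate per time window gives the sentiment dimension of the trend's evolution.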
At Daedalus we are applying this analysis, for example, to monitor emergencies in the physical world (accidents, gatherings) and build a social dashboard with aggregated information about them.
Finally, we must be able to define alerts on the evolution of the trend so that we can act quickly and accordingly: for example, detecting that the conversation volume around the trend exceeds a maximum or minimum threshold, that its associated sentiment reaches extreme positive or negative polarity, or that it comes into contact with (and can “contaminate”) certain entities, etc.
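The first two alert rules above can be sketched as simple threshold checks on each time window. All thresholds here are made-up defaults, not real product parameters:

```python
def check_alerts(volume, mean_sentiment,
                 vol_max=1000, vol_min=5, extreme=0.8):
    """Evaluate illustrative alert rules for one time window of a
    monitored trend: conversation volume crossing a maximum or minimum
    threshold, or mean sentiment reaching extreme polarity."""
    alerts = []
    if volume > vol_max:
        alerts.append("volume above maximum threshold")
    if volume < vol_min:
        alerts.append("volume below minimum threshold")
    if abs(mean_sentiment) >= extreme:
        alerts.append("sentiment reached extreme polarity")
    return alerts

alerts = check_alerts(volume=1500, mean_sentiment=-0.9)
print(alerts)  # both the volume rule and the sentiment rule fire
```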
Semantic analysis lets us not only discover and identify new topics, but also monitor and understand their evolution, in order to focus on what is most relevant at any given moment. (And remember that Textalytics, our Meaning as a Service product, is the easiest, least risky and most affordable way to embed semantic analysis in your applications.)