

TASS 2015

Welcome to the 4th evaluation workshop for sentiment analysis focused on Spanish. TASS 2015 will be held as part of the 31st SEPLN Conference in Alicante, Spain, on September 15-18th, 2015. You are invited to attend the workshop, take part in the proposed tasks and visit this beautiful city!


Welcome to TASS 2015!

TASS is an experimental evaluation workshop for sentiment analysis and online reputation analysis focused on the Spanish language, organized as a satellite event of the annual conference of the Spanish Society for Natural Language Processing (SEPLN). After three previous successful editions, TASS 2015 will take place on September 15th, 2015 at the University of Alicante, Spain.

The aim of TASS is to provide a forum for discussion and communication where the latest research work and developments in the field of sentiment analysis in social media, specifically focused on the Spanish language, can be shown and discussed by scientific and business communities. The main objective is to promote the application of state-of-the-art algorithms and techniques for sentiment analysis applied to short opinion texts extracted from social media messages (specifically Twitter).

Several challenge tasks are proposed, intended to provide a benchmark forum for comparing the latest approaches in these fields. In addition, with the creation and release of the fully tagged corpus, we aim to provide a benchmark dataset that enables researchers to compare their algorithms and systems.


Tasks

First of all, we are interested in evaluating how the different approaches to sentiment analysis and text classification in Spanish have evolved over these years. Therefore, the traditional sentiment analysis at global level task will be repeated, reusing the same corpus, so that results can be compared. Moreover, we want to foster research in fine-grained polarity analysis at the aspect level (aspect-based sentiment analysis), one of the new requirements of the natural language processing market in these areas.

Thus the following two tasks are proposed this year.

Participants are expected to submit up to 3 results of different experiments for one or both of these tasks, in the appropriate format described below.

Along with the submission of experiments, participants will be invited to submit a paper to the workshop in order to describe their experiments and discuss the results with the audience in a regular workshop session. More information about format and requirements will be provided soon.

Task 1: Sentiment Analysis at global level

This task consists of performing automatic sentiment analysis to determine the global polarity of each message in the test set of the General corpus (see below). It is a re-run of the task from previous years. Participants will be provided with the training set of the General corpus so that they may train and validate their models.

There will be two different evaluations: one based on 6 different polarity labels (P+, P, NEU, N, N+, NONE) and another based on just 4 labels (P, N, NEU, NONE).

Participants are expected to submit (up to 3) experiments for the 6-labels evaluation, but are also allowed to submit (up to 3) specific experiments for the 4-labels scenario.

Accuracy (the proportion of tweets correctly classified according to the gold standard) will be used to rank the systems. Precision, recall and F1-measure will be used to evaluate each individual category.
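
As a rough illustration of this evaluation scheme, the sketch below computes accuracy and per-category precision, recall and F1 from two dictionaries mapping tweet IDs to labels. The function and variable names are purely illustrative; this is not the official scoring script.

        def evaluate(gold, pred, labels=("P+", "P", "NEU", "N", "N+", "NONE")):
            """Accuracy plus per-category precision, recall and F1.

            gold and pred are dicts mapping tweetid -> polarity label;
            tweets missing from pred simply count as errors.
            """
            correct = sum(1 for tid, lab in gold.items() if pred.get(tid) == lab)
            accuracy = correct / len(gold)

            per_class = {}
            for lab in labels:
                tp = sum(1 for tid, g in gold.items() if g == lab and pred.get(tid) == lab)
                fp = sum(1 for tid, p in pred.items() if p == lab and gold.get(tid) != lab)
                fn = sum(1 for tid, g in gold.items() if g == lab and pred.get(tid) != lab)
                precision = tp / (tp + fp) if tp + fp else 0.0
                recall = tp / (tp + fn) if tp + fn else 0.0
                f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
                per_class[lab] = (precision, recall, f1)
            return accuracy, per_class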

Results must be submitted in a plain text file with the following format:

tweetid \t polarity

where polarity can be:

  • P+, P, NEU, N, N+ and NONE for the 6-labels case
  • P, NEU, N and NONE for the 4-labels case.
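
For illustration, a minimal sketch of writing such a file is shown below. The mapping from the 6-label to the 4-label scheme (collapsing P+ into P and N+ into N) is our assumption about the intended correspondence, not an official specification.

        # Assumed collapse of the 6-label scheme into the 4-label scheme.
        TO_4_LABELS = {"P+": "P", "P": "P", "NEU": "NEU", "N": "N", "N+": "N", "NONE": "NONE"}

        def write_run(predictions, path, four_labels=False):
            """predictions is a dict mapping tweetid -> label from the 6-label set."""
            with open(path, "w", encoding="utf-8") as out:
                for tweetid, polarity in predictions.items():
                    if four_labels:
                        polarity = TO_4_LABELS[polarity]
                    out.write(f"{tweetid}\t{polarity}\n")

        write_run({"0000000000": "P+"}, "task1-6labels.txt")
        write_run({"0000000000": "P+"}, "task1-4labels.txt", four_labels=True)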

The same test corpus as in previous years will be used for the evaluation, to allow for comparison among systems. Obviously, participants are not allowed to use any test data to train their systems. However, to address the problem reported in previous years of the imbalanced distribution of labels between the training and test sets, a new test subset containing 1 000 tweets with a label distribution similar to that of the training corpus will be extracted and used for an alternative evaluation of system performance.

Task 2: Aspect-based sentiment analysis

Participants will be provided with a corpus tagged with a series of aspects, and systems must identify the polarity at the aspect-level. Training and test sets for two corpora will be provided: the Social-TV corpus, used last year, and the new Politics corpus, collected this year (both described later).

Participants are expected to submit up to 3 experiments for each corpus, each in a plain text file with the following format:

tweetid \t aspect \t polarity

Allowed polarity values are P, NEU and N.

Micro-averaged precision, recall and F1-measure will be used to evaluate the systems, considering a single label that combines aspect and polarity. Systems will be ranked by F1.
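
As an illustration of this metric, the sketch below reads gold and system files in the tweetid \t aspect \t polarity format and computes micro-averaged precision, recall and F1 over the combined aspect-polarity labels. It assumes one line per (tweet, aspect) annotation and is not the official scoring script.

        def load_pairs(path):
            """Read tab-separated (tweetid, aspect, polarity) lines as combined labels."""
            pairs = []
            with open(path, encoding="utf-8") as f:
                for line in f:
                    tweetid, aspect, polarity = line.rstrip("\n").split("\t")
                    pairs.append((tweetid, aspect + "#" + polarity))
            return pairs

        def micro_scores(gold_path, pred_path):
            gold, pred = load_pairs(gold_path), load_pairs(pred_path)
            # A prediction is a true positive if the same (tweet, aspect#polarity)
            # pair appears in the gold standard; each gold pair is matched at most once.
            remaining, tp = list(gold), 0
            for item in pred:
                if item in remaining:
                    remaining.remove(item)
                    tp += 1
            precision = tp / len(pred) if pred else 0.0
            recall = tp / len(gold) if gold else 0.0
            f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
            return precision, recall, f1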


Corpus

General Corpus

The General corpus contains over 68 000 Twitter messages, written in Spanish between November 2011 and March 2012 by about 150 well-known personalities and celebrities from the worlds of politics, economics, communication, mass media and culture. Although the extraction context has a Spain-focused bias, the diverse nationalities of the authors, including people from Spain, Mexico, Colombia, Puerto Rico, the USA and many other countries, give the corpus broad coverage of the Spanish-speaking world.

Each Twitter message includes its ID (tweetid), the creation date (date) and the user ID (user). Due to restrictions in the Twitter API Terms of Service, it is forbidden to redistribute a corpus that includes text contents or information about users. However, redistribution is allowed if those fields are removed and only the IDs (tweet IDs and user IDs) are provided. The actual message content can be obtained by querying the Twitter API with the tweetid.
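
For example, a minimal sketch of this hydration step is shown below, assuming application-only (bearer token) authentication against the Twitter REST API v1.1 statuses/lookup endpoint, which accepts up to 100 comma-separated IDs per request; rate limits and endpoint details may change, so check the current Twitter API documentation.

        import requests

        LOOKUP_URL = "https://api.twitter.com/1.1/statuses/lookup.json"

        def fetch_texts(tweet_ids, bearer_token):
            """Return a dict mapping tweetid -> tweet text for the given list of IDs."""
            texts = {}
            headers = {"Authorization": "Bearer " + bearer_token}
            for i in range(0, len(tweet_ids), 100):          # at most 100 IDs per request
                batch = tweet_ids[i:i + 100]
                resp = requests.get(LOOKUP_URL, headers=headers,
                                    params={"id": ",".join(batch)})
                resp.raise_for_status()
                for tweet in resp.json():
                    texts[tweet["id_str"]] = tweet["text"]
            return texts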

The general corpus has been divided into two sets: training (about 10%) and test (90%). The training set will be released so that participants may train and validate their models. The test corpus will be provided without any tagging and will be used to evaluate the results provided by the different systems. Obviously, it is not allowed to use the test data from previous years to train the systems.

Each message in both the training and test set is tagged with its global polarity, indicating whether the text expresses a positive, negative or neutral sentiment, or no sentiment at all. A set of 6 labels has been defined: strong positive (P+), positive (P), neutral (NEU), negative (N), strong negative (N+) and one additional no sentiment tag (NONE).

In addition, there is also an indication of the level of agreement or disagreement of the sentiment expressed within the content, with two possible values: AGREEMENT and DISAGREEMENT. This is especially useful for distinguishing whether a neutral sentiment comes from neutral wording or from the text containing positive and negative sentiments at the same time.

Moreover, the polarity at entity level, i.e., the polarity values related to the entities mentioned in the text, is also included where applicable. These values are tagged with the same 6 possible values and include the level of agreement with respect to each entity.

Finally, a set of topics has been selected based on the thematic areas covered by the corpus, such as "política" ("politics"), "fútbol" ("soccer"), "literatura" ("literature") or "entretenimiento" ("entertainment"). Each message in both the training and test sets has been assigned to one or several of these topics (most messages are associated with just one topic, due to the short length of the text).

All tagging has been done semi-automatically: a baseline machine learning model is first run and then all tags are manually checked by human experts. In the case of the polarity at entity level, due to the high volume of data to check, this tagging has been done for the training set only.

The following figure shows the information for two sample tweets. The first tweet is tagged only with the global polarity, as the text contains no mention of any entity, while the second one is tagged with both the global polarity of the message and the polarity associated with each of the entities that appear in the text (UPyD and Foro Asturias).

        <tweet>
          <tweetid>0000000000</tweetid>
          <user>usuario0</user>
          <content><![CDATA['Conozco a alguien q es adicto al drama! Ja ja ja te suena d algo!]]></content>
          <date>2011-12-02T02:59:03</date>
          <lang>es</lang>
          <sentiments>
            <polarity><value>P+</value><type>AGREEMENT</type></polarity>
          </sentiments>
          <topics>
            <topic>entretenimiento</topic>
          </topics>
        </tweet>
        <tweet>
          <tweetid>0000000001</tweetid>
          <user>usuario1</user>
          <content><![CDATA['UPyD contará casi seguro con grupo gracias al Foro Asturias.]]></content>
          <date>2011-12-02T00:21:01</date>
          <lang>es</lang>
          <sentiments>
            <polarity><value>P</value><type>AGREEMENT</type></polarity>
            <polarity><entity>UPyD</entity><value>P</value><type>AGREEMENT</type></polarity>
            <polarity><entity>Foro_Asturias</entity><value>P</value><type>AGREEMENT</type></polarity>
          </sentiments>
          <topics>
            <topic>política</topic>
          </topics>
        </tweet>        
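
To give an idea of how these records could be processed, the sketch below reads a corpus file with Python's xml.etree.ElementTree and extracts the tweet ID, global polarity, topics and any entity-level polarities. It assumes the distributed file wraps the <tweet> elements in a single root element; adapt the parsing if the actual layout differs.

        import xml.etree.ElementTree as ET

        def read_general_corpus(path):
            """Yield (tweetid, global_polarity, topics, entity_polarities) per tweet."""
            root = ET.parse(path).getroot()
            for tweet in root.iter("tweet"):
                tweetid = tweet.findtext("tweetid")
                topics = [t.text for t in tweet.findall("topics/topic")]
                global_polarity, entity_polarities = None, {}
                for pol in tweet.findall("sentiments/polarity"):
                    entity = pol.findtext("entity")
                    value = pol.findtext("value")
                    if entity is None:
                        global_polarity = value          # polarity of the whole message
                    else:
                        entity_polarities[entity] = value
                yield tweetid, global_polarity, topics, entity_polarities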
      

Social-TV Corpus

This corpus was collected during the final of the 2014 Copa del Rey championship in Spain between Real Madrid and F.C. Barcelona, played on 16 April 2014 at Mestalla Stadium in Valencia. Over 1 million tweets were collected, from 15 minutes before to 15 minutes after the match. After filtering out useless content and tweets in languages other than Spanish, a subset of 2 773 tweets was selected.

All tweets were manually tagged with the aspects mentioned in the message and their sentiment polarity. Tweets may cover more than one aspect. The list of aspects is:

  • Afición
  • Arbitro
  • Autoridades
  • Entrenador
  • Equipos: Equipo-Atlético_de_Madrid, Equipo-Barcelona, Equipo-Real_Madrid, Equipo (any other team)
  • Jugadores: Jugador-Alexis_Sánchez, Jugador-Alvaro_Arbeloa, Jugador-Andrés_Iniesta, Jugador-Angel_Di_María, Jugador-Asier_Ilarramendi, Jugador-Carles_Puyol, Jugador-Cesc_Fábregas, Jugador-Cristiano_Ronaldo, Jugador-Dani_Alves, Jugador-Dani_Carvajal, Jugador-Fábio_Coentrão, Jugador-Gareth_Bale, Jugador-Iker_Casillas, Jugador-Isco, Jugador-Javier_Mascherano, Jugador-Jesé_Rodríguez, Jugador-José_Manuel_Pinto, Jugador-Karim_Benzema, Jugador-Lionel_Messi, Jugador-Luka_Modric, Jugador-Marc_Bartra, Jugador-Neymar_Jr., Jugador-Pedro_Rodríguez, Jugador-Pepe, Jugador-Sergio_Busquets, Jugador-Sergio_Ramos, Jugador-Xabi_Alonso, Jugador-Xavi_Hernández, Jugador (any other player)
  • Partido
  • Retransmisión

Sentiment polarity has been tagged from the point of view of the person who writes the tweet, using 3 levels: P, NEU and N. No distinction is made between cases where the author does not express any sentiment and cases where the expressed sentiment is neither positive nor negative.

The Social-TV corpus was randomly divided into two sets: training (1 773 tweets) and test (1 000 tweets), with a similar distribution of both aspects and sentiments. The training set will be released so that participants may train and validate their models. The test corpus will be provided without any tagging and will be used to evaluate the results provided by the different systems.

The following figure shows the information of three sample tweets in the training set.

        <tweet id="456544898791907328"><sentiment aspect="Equipo-Real_Madrid" polarity="P">#HalaMadrid</sentiment> ganamos sin <sentiment aspect="Jugador-Cristiano_Ronaldo" polarity="NEU">Cristiano</sentiment>. .perdéis con <sentiment aspect="Jugador-Lionel_Messi" polarity="N">Messi</sentiment>. Hala <sentiment aspect="Equipo-Real_Madrid" polarity="P">Madrid</sentiment>! !!!!!</tweet>
        <tweet id="456544898942906369">@nevermind2192 <sentiment aspect="Equipo-Barcelona" polarity="P">Barça</sentiment> por siempre!!</tweet>
        <tweet id="456544898951282688"><sentiment aspect="Partido" polarity="NEU">#FinalCopa</sentiment> Hala <sentiment aspect="Equipo-Real_Madrid" polarity="P">Madrid</sentiment>, hala <sentiment aspect="Equipo-Real_Madrid" polarity="P">Madrid</sentiment>, campeón de la <sentiment aspect="Partido" polarity="P">copa del rey</sentiment></tweet>
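
A possible way to read these inline annotations is sketched below: it yields one (tweet id, aspect, polarity, text span) tuple per <sentiment> element, again assuming that the distributed file wraps the <tweet> elements in a single root element.

        import xml.etree.ElementTree as ET

        def read_social_tv(path):
            """Yield (tweet_id, aspect, polarity, span_text) tuples from the corpus."""
            root = ET.parse(path).getroot()
            for tweet in root.iter("tweet"):
                for sent in tweet.iter("sentiment"):
                    yield tweet.get("id"), sent.get("aspect"), sent.get("polarity"), sent.text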
      

Politics Corpus

Another corpus has been gathered for the aspect-based sentiment analysis task. Messages written during the campaign for the 2015 Andalusian parliamentary election by the official Twitter accounts of the 6 main political parties, or by their most important candidates in each region, were collected. We are currently defining the aspects that represent the main topics of political discussion, such as "economic crisis", "unemployment", "corruption", "role of women", "academic failure" or "abortion", and assigning the polarity of the sentiment expressed in each tweet towards them.

This corpus will be split into training and test sets and will be used in Task 2.

More information about the corpus will be provided when available.

Important Dates

April 6th, 2015: Release of tasks.
End of April, 2015: Release of training and test corpora (General, Social-TV and Politics).
June 10th, 2015: Experiment submissions by participants.
June 25th, 2015: Evaluation results.
July 5th, 2015: Submission of papers.
September 15th, 2015: Workshop.

Registration

Please send an email to tass AT daedalus.es with the completed TASS Corpus License agreement, including your email and affiliation (institution, company or any other kind of organization). You will be given a password to download the files from the password-protected area.

All corpora will be made freely available to the community after the workshop.

If you use the corpus in your research (papers, articles, presentations for conferences or educational purposes), please include a citation to one of the following publications:

Downloads

(no files yet)

Organization

Organizing Committee

  • Julio Villena-Román - Daedalus, Spain
  • Janine García-Morera - Daedalus, Spain
  • Miguel Ángel García-Cumbreras - University of Jaen, Spain (SINAI-UJAEN)
  • Eugenio Martínez-Cámara - University of Jaen, Spain (SINAI-UJAEN)
  • L. Alfonso Ureña-López - University of Jaen, Spain (SINAI-UJAEN)
  • María-Teresa Martín-Valdivia - University of Jaen, Spain (SINAI-UJAEN)

Contributors

  • David Vilares Calvo - University of Coruña, Spain
  • Ferran Pla Santamaria - Universitat Politècnica de València, Spain
  • Lluís F. Hurtado - Universitat Politècnica de València, Spain
  • David Tomás - University of Alicante, Spain
  • Yoan Gutiérrez Vázquez - University of Alicante, Spain
  • Manuel Montes - National Institute For Astrophysics, Optics and Electronics (INAOE), Mexico
  • Luis Villaseñor - National Institute For Astrophysics, Optics and Electronics (INAOE), Mexico

Programme Committee

  • Alexandra Balahur - EC-Joint Research Centre, Italy
  • José Carlos González-Cristóbal - Technical University of Madrid, Spain (GSI-UPM)
  • José Carlos Cortizo - European University of Madrid, Spain
  • Ana García-Serrano - UNED, Spain
  • José María Gómez-Hidalgo - Optenet, Spain
  • Carlos A. Iglesias-Fernández - Technical University of Madrid, Spain
  • Zornitsa Kozareva - Information Sciences Institute, USA
  • Sara Lana-Serrano - Technical University of Madrid, Spain
  • Paloma Martínez-Fernandez - Carlos III University of Madrid, Spain
  • Ruslan Mitkov - University of Wolverhampton, U.K.
  • Andrés Montoyo - University of Alicante, Spain
  • Rafael Muñoz - University of Alicante, Spain
  • Constantin Orasan - University of Wolverhampton, U.K.
  • José Manuel Perea - University of Extremadura, Spain
  • Mike Thelwall - University of Wolverhampton, U.K.
  • José Antonio Troyano - University of Seville, Spain