Célia Nunes

 
Software
  • Demos
  • Python Packages
  • APPs

YAKE! - Yet Another Keyword Extractor


YAKE! (Best Short Paper of ECIR'18) is a lightweight, unsupervised automatic keyword extraction method that relies on statistical text features extracted from single documents to select the most important keywords of a text. Our system does not need to be trained on a particular set of documents, nor does it depend on dictionaries, external corpora, text size, language or domain. To demonstrate the merits and significance of our proposal, we compare it against ten state-of-the-art unsupervised approaches (TF.IDF, KP-Miner, RAKE, TextRank, SingleRank, ExpandRank, TopicRank, TopicalPageRank, PositionRank and MultipartiteRank) and one supervised method (KEA). Experimental results on twenty datasets show that our method significantly outperforms state-of-the-art methods across collections of different sizes, languages and domains. YAKE! is available through a demo, a Python package and an API.
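
To give a feel for what scoring keywords from a single document, without training data or an external corpus, can look like, here is a minimal toy sketch. It is not YAKE!'s scoring formula: the two features used (word frequency and position of first occurrence), the way they are combined, and the function name `toy_keywords` are all assumptions made for illustration only.

```python
import re
from collections import Counter


def toy_keywords(text, top=5, min_len=3):
    """Rank single words of one document by a toy statistical score.

    Illustrative only (NOT the YAKE! formula). Lower score = more important.
    Uses nothing but the document itself: no training set, no external corpus.
    """
    # Lowercase the text and keep alphabetic tokens of at least min_len letters.
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= min_len]

    # Feature 1: how often each word occurs in the document.
    freq = Counter(words)

    # Feature 2: index of the first occurrence of each word.
    first_pos = {}
    for index, word in enumerate(words):
        first_pos.setdefault(word, index)

    # Assumed combination: words that appear early and often get a low score.
    scores = {w: (1 + first_pos[w]) / freq[w] for w in freq}

    # Sort by score, then alphabetically so that ties are deterministic.
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))[:top]


if __name__ == "__main__":
    sample = "Keyword extraction finds keyword candidates. Extraction uses statistics."
    for word, score in toy_keywords(sample, top=3):
        print(word, score)
```

The actual method combines more features than this sketch; the referenced articles describe them.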

References

YAKE! Keyword Extraction from Single Documents using Multiple Local Features [Article Download]

A Text Feature Based Automatic Keyword Extraction Method for Single Documents [Article Download]

YAKE! Collection-independent Automatic Keyword Extractor [Article Download]

 

Time-Matters


The Time-Matters demo makes use of the Time-Matters Python package to score the relevant temporal expressions found within a single text. This gives users the chance to better understand the temporal dimension of a text's narrative. Time-Matters is available through a demo and a Python package.

References

Time-Matters: Temporal Unfolding of Texts [Article Download]

 

narrArquivo


narrArquivo makes use of the Time-Matters Python package to score the relevant temporal expressions found within a single text. This gives users the chance to better understand the temporal dimension of a text's narrative. In comparison to Time-Matters, narrArquivo focuses on texts collected from the Portuguese web archive (Arquivo.pt). narrArquivo is available through a demo and a Python package.

References

Time-Matters: Temporal Unfolding of Texts [Article Download]

 

GTE-Cluster and GTE-Rank


Here we provide two user interfaces so that the research community can test the GTE-Cluster and GTE-Rank temporal search engine applications. To retrieve the query results, we rely on the recently launched Bing Search API (5000 transactions/month allowed), parameterized with the en-US market language parameter to retrieve 50 results per query. The proposed solutions are computationally efficient and can easily be tested online. Although our work is mainly motivated by queries of a temporal nature, the implemented prototypes allow the execution of any query, including non-temporal ones. Below is a detailed description of both user interfaces.

References

GTE-Cluster: A Temporal Search Interface for Implicit Temporal Queries [Article Download]

GTE-Rank: Searching for Implicit Temporal Query Results [Article Download]

GTE-Cluster



GTE-Cluster is a temporal clustering search engine that offers the user two options: return all the clusters (including the non-relevant ones) or return only the relevant ones. The value shown in front of each cluster is the similarity value computed by the GTE similarity measure. Clusters with a similarity value < 0.35 are considered non-relevant and marked in red; relevant clusters are marked in blue.
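
The relevance rule described above can be sketched in a few lines. This is only an illustration of the 0.35 threshold and the two display options; the function name `label_clusters`, the dictionary input and the sample similarity values are hypothetical and are not taken from the GTE-Cluster implementation.

```python
RELEVANCE_THRESHOLD = 0.35  # threshold stated in the description above


def label_clusters(clusters, only_relevant=False):
    """Attach a display colour to each cluster.

    clusters: dict mapping a cluster label to its GTE similarity value
              (hypothetical input format, assumed for this sketch).
    only_relevant: if True, non-relevant clusters are dropped from the output.
    """
    labelled = []
    for name, similarity in clusters.items():
        # Values strictly below the threshold are non-relevant.
        relevant = similarity >= RELEVANCE_THRESHOLD
        if only_relevant and not relevant:
            continue
        colour = "blue" if relevant else "red"
        labelled.append((name, similarity, colour))
    return labelled


if __name__ == "__main__":
    # Made-up similarity values, for illustration only.
    example = {"2010": 0.82, "1998": 0.12, "2011": 0.35}
    print(label_clusters(example))
    print(label_clusters(example, only_relevant=True))
```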

 

 

 

GTE-Rank



GTE-Rank is a temporal re-ranking search engine that offers the user two options: return all the web snippets (including those without dates) or return only the web snippets with relevant dates. The number in red is the ranking position initially obtained from the Bing search engine. The value shown in front of each snippet ID is the ranking value computed by the GTE-Rank methodology.
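
The re-ranking view described above can be sketched as follows. This is an illustration under stated assumptions, not the GTE-Rank implementation: the `Snippet` fields, the sample values, the assumption that a higher GTE-Rank value means a better position, and the tie-break on the original Bing position are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snippet:
    snippet_id: int
    bing_rank: int           # position initially returned by Bing
    gte_rank_value: float    # value computed by the ranking method (made up here)
    has_relevant_date: bool  # whether the snippet contains a relevant date


def rerank(snippets: List[Snippet], only_relevant_dates: bool = False) -> List[Snippet]:
    """Order snippets by ranking value, optionally keeping only dated ones."""
    if only_relevant_dates:
        snippets = [s for s in snippets if s.has_relevant_date]
    # Assumption: higher value ranks first; ties keep the original Bing order.
    return sorted(snippets, key=lambda s: (-s.gte_rank_value, s.bing_rank))


if __name__ == "__main__":
    results = [
        Snippet(1, 1, 0.20, False),
        Snippet(2, 2, 0.90, True),
        Snippet(3, 3, 0.55, True),
    ]
    for s in rerank(results):
        print(s.snippet_id, s.bing_rank, s.gte_rank_value)
```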

 

 

Below you can find a number of Python packages made available by our research team.

 

YAKE! - Yet Another Keyword Extractor


YAKE! (Best Short Paper of ECIR'18) is a lightweight, unsupervised automatic keyword extraction method that relies on statistical text features extracted from single documents to select the most important keywords of a text. YAKE! is available as an open-source Python package.

References

YAKE! Keyword Extraction from Single Documents using Multiple Local Features [Article Download]

A Text Feature Based Automatic Keyword Extraction Method for Single Documents [Article Download]

YAKE! Collection-independent Automatic Keyword Extractor [Article Download]

 

Time-Matters


Time-Matters (winner of the Fraunhofer Portugal Challenge 2013 PhD Contest) is an algorithm that extracts relevant dates from a set of documents. Time-Matters is available as an open-source Python package and as a Docker image.

References

Campos, R., Duque, J., Cândido, T., Mendes, J., Dias, G., Jorge, A., and Nunes, C. (2021). Time-Matters: Temporal Unfolding of Texts. In: .... (eds), Advances in Information Retrieval. ECIR'21 (Lucca, Italy. March 28 - April 1). Lecture Notes in Computer Science, vol ..., pp. x - x. Springer. [Article Download - To appear]

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2017). Identifying Top Relevant Dates for Implicit Time Sensitive Queries. In Information Retrieval Journal. Springer, Vol 20(4), pp 363-398 [Article Download]

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2016). GTE-Rank: a Time-Aware Search Engine to Answer Time-Sensitive Queries. In Information Processing & Management an International Journal. Elsevier, Vol 52(2), pp. 273-298 [Article Download]

Campos, R., Dias, G., Jorge, A., and Nunes, C. (2014). GTE-Cluster: A Temporal Search Interface for Implicit Temporal Queries. In M. de Rijke et al. (Eds.), Lecture Notes in Computer Science - Advances in Information Retrieval - 36th European Conference on Information Retrieval (ECIR2014). Amsterdam, Netherlands, 13 - 16 April. (Vol. 8416-2014, pp. 775 - 779) [Article Download]

Campos, R., Jorge, A., Dias, G. and Nunes, C. (2012). Disambiguating Implicit Temporal Queries by Clustering Top Relevant Dates in Web Snippets. In Proceedings of The 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies Macau, China, 04 - 07 December, Vol. 1, pp 1 - 8. IEEE Computer Society Press. [Article Download]

 

Time-Matters-Query


Time-Matters-Query is a package that extracts relevant dates from a set of documents given a query. Time-Matters-Query is available as an open-source Python package.

References

Campos, R., Duque, J., Cândido, T., Mendes, J., Dias, G., Jorge, A., and Nunes, C. (2021). Time-Matters: Temporal Unfolding of Texts. In: .... (eds), Advances in Information Retrieval. ECIR'21 (Lucca, Italy. March 28 - April 1). Lecture Notes in Computer Science, vol ..., pp. x - x. Springer. [Article Download - To appear]

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2017). Identifying Top Relevant Dates for Implicit Time Sensitive Queries. In Information Retrieval Journal. Springer, Vol 20(4), pp 363-398 [Article Download]

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2016). GTE-Rank: a Time-Aware Search Engine to Answer Time-Sensitive Queries. In Information Processing & Management an International Journal. Elsevier, Vol 52(2), pp. 273-298 [Article Download]

Campos, R., Dias, G., Jorge, A., and Nunes, C. (2014). GTE-Cluster: A Temporal Search Interface for Implicit Temporal Queries. In M. de Rijke et al. (Eds.), Lecture Notes in Computer Science - Advances in Information Retrieval - 36th European Conference on Information Retrieval (ECIR2014). Amsterdam, Netherlands, 13 - 16 April. (Vol. 8416-2014, pp. 775 - 779) [Article Download]

Campos, R., Jorge, A., Dias, G. and Nunes, C. (2012). Disambiguating Implicit Temporal Queries by Clustering Top Relevant Dates in Web Snippets. In Proceedings of The 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies Macau, China, 04 - 07 December, Vol. 1, pp 1 - 8. IEEE Computer Society Press. [Article Download]

 

 

Below you can find a number of APPs developed by our team.

 

YAKE! - Yet Another Keyword Extractor


YAKE! is now available on Google Play.

References

YAKE! Keyword Extraction from Single Documents using Multiple Local Features [Article Download]

A Text Feature Based Automatic Keyword Extraction Method for Single Documents [Article Download]

YAKE! Collection-independent Automatic Keyword Extractor [Article Download]