
Use of deep learning to build an information retrieval model applied to the mining sector in Brazil

Abstract

The exponential growth of data and information generated by sensors and social media has led to the development of an ecosystem of new storage and processing infrastructures, known as Big Data, and in turn to a new area of knowledge, Data Science. Even with this ecosystem and field in place, the discomfort caused by an overabundance of data remains, and it becomes more acute when companies realize that they could use zettabytes of data and information to guide their strategy and operations. Against this background, this research sought to develop a method for summarizing news about the mining sector in Brazil and for identifying the effect of semantic similarity on the analysis, so that the information can be retrieved and used to understand the sector. In this method, the BERTSUM transformer was applied to summarize the news, and the BERT transformer was then applied to measure the similarity between the summarized news items. The method made it possible to reduce the total volume of text by 75%, remove news items with the same semantic content, and infer that there is a pattern in the discourse of news related to the mining sector.
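The similarity step described above can be illustrated with a minimal sketch. It assumes that each summarized news item has already been turned into an embedding vector by a BERT-style encoder; the abstract does not state which similarity metric or cutoff was used, so cosine similarity, the 0.9 threshold, the function names, and the toy vectors below are all illustrative assumptions, not the authors' implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def drop_near_duplicates(embeddings, threshold=0.9):
    """Return indices of items to keep, in original order.

    An item is kept only if its similarity to every previously kept
    item is below `threshold` (an assumed cutoff, not from the paper).
    """
    kept = []
    for i, vec in enumerate(embeddings):
        if all(cosine_similarity(vec, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept

# Toy stand-ins for BERT embeddings of three summaries (hypothetical data):
# the second vector is almost parallel to the first, the third is orthogonal.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],
    [0.0, 1.0, 0.0],
]
kept_indices = drop_near_duplicates(toy_embeddings)
print(kept_indices)
```

In this toy case the second item is discarded as a near-duplicate of the first. A greedy pass like this keeps the earliest item of each group of similar summaries, which is one simple design choice among several possible ones.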

Keywords:
natural language processing; deep learning; BERT; ATS; mining

Universidade Federal do Rio Grande do Sul, Rua Ramiro Barcelos, 2705, sala 519, CEP: 90035-007, Phone: +55 (51) 3308-2141, Porto Alegre, RS, Brazil
E-mail: emquestao@ufrgs.br