
Interface terminology: Natural language processing of clinical data in Electronic Health Record narratives



To present the retrieval and analysis of clinical data from anamneses in the Electronic Health Record (EHR), referred to in this research as Interface Terminology.


The clinical data were collected from electronic patient records at a private hospital. The sample consisted of 18,256 gynecology anamneses from 2018. The clinical data were retrieved through Natural Language Processing using the Python language. The most frequent terms related to clinical data were analysed, including abbreviations and acronyms, stop words, procedures, and n-grams.
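The kind of term analysis described above might be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' published code: the stop-word list, the sample notes, the acronym pattern, and all function names are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real study would use a curated list
# in the language of the records.
STOP_WORDS = {"the", "of", "and", "with", "for", "in", "a"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in stop_words]


def ngrams(tokens, n):
    """Return contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def term_frequencies(notes, n=1):
    """Count n-grams across all notes after stop-word removal."""
    counts = Counter()
    for note in notes:
        counts.update(ngrams(remove_stop_words(tokenize(note)), n))
    return counts


def find_acronyms(text):
    """Candidate abbreviations: assumed here to be runs of 2-5 capitals."""
    return re.findall(r"\b[A-Z]{2,5}\b", text)


# Invented sample notes, not taken from the study's data.
notes = [
    "Patient reports pelvic pain for two weeks. USG requested.",
    "Pelvic pain with irregular cycle. USG and PAP requested.",
]

if __name__ == "__main__":
    print(term_frequencies(notes, n=1).most_common(3))
    print(term_frequencies(notes, n=2).most_common(3))
    print(find_acronyms(notes[1]))
```

Counting n-grams per note, rather than over one concatenated text, avoids creating spurious n-grams that span the boundary between two patients' records.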


Clinical data have the potential to be reused for scientific production, epidemiological profiling, the creation of dictionaries, and the enrichment of controlled vocabularies for EHRs and other health information systems. They are also important in defining algorithms for information retrieval. As a result, a repository was created in the OSF containing spreadsheets and tables with clinical data for reuse in the delimitation of algorithms, and a word cloud was created to identify the most frequent terms in electronic patient records in the field of Gynecology. The algorithms used to retrieve the information were made available in a GitHub repository.


Clinical data are information about the patient used for care purposes and hospital administration, and they enable research related to the patient's health and illness. The Interface Terminology, exemplified in the research hospital's EHR, presented a diversity of clinical data in the anamneses.

Electronic Health Records; Patient Generated Health Data; Gynecology; Information retrieval; Natural Language Processing; Interface Terminology

Universidade Federal de Santa Catarina Campus Universitário Reitor João David Ferreira Lima - Trindade. CEP-88040-900, Telefone: +55 (48) 3721-2237 - Florianópolis - SC - Brazil