Page 1 of results: 1340 digital items found in 0.008 seconds

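Utilização de técnicas de text mining sobre registos clínicos de epilepsia em crianças, para auxílio ao diagnóstico e classificação; Use of text mining techniques on clinical records of epilepsy in children to support diagnosis and classification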

Pereira, Luís Miguel Oliveira
Source: Instituto Politécnico de Leiria Publisher: Instituto Politécnico de Leiria
Type: Master's dissertation
Published in 2013 POR
Search relevance
66.35%
Dissertation presented to the Escola Superior de Tecnologia e Gestão of the IPL to obtain the degree of Master in Computer Engineering - Mobile Computing, supervised by Dr. Rui Rijo and Dr. Catarina Silva.; Medical information has grown continuously over time, producing very large quantities of data. Analyzing and extracting these data offers the possibility of reducing the effort and time needed to suggest and classify a diagnosis. Processing medical data is a major challenge, since these data are usually presented as free text with specific technical vocabulary. Clinical records are among the richest and most relevant data. Analyzing clinical records is complex because a correct diagnosis must take into account several characteristics, such as symptoms, exams, patient history, treatments and medications, among others. Moreover, producing a reliable diagnosis from this analysis requires mastery of several areas of knowledge, including data mining, text mining, electronic clinical records and the clinical field. These diagnoses must also be classified according to standards...

Exploiting text mining techniques for contextual recommendations

Domingues, Marcos Aurelio; Sundermann, Camila Vaccari; Manzato, Marcelo Garcia; Marcacini, Ricardo Marcondes; Rezende, Solange Oliveira
Source: University of Warsaw; Institute of Electrical and Electronics Engineers - IEEE; Web Intelligence Consortium - WIC; Association for Computing Machinery - ACM; Warsaw Publisher: University of Warsaw; Institute of Electrical and Electronics Engineers - IEEE; Web Intelligence Consortium - WIC; Association for Computing Machinery - ACM; Warsaw
Type: Conference paper or conference object
ENG
Search relevance
66.26%
Unlike traditional recommender systems, which make recommendations only by using the relation between users and items, a context-aware recommender system makes recommendations by incorporating available contextual information into the recommendation process. One problem of context-aware approaches is that techniques are required to extract such additional information automatically. In this paper, we propose to use two text mining techniques, applied to textual data, to infer contextual information automatically: named entity recognition and topic hierarchies. We evaluate the proposed technique in four context-aware recommender systems. The empirical results demonstrate that by using named entities and topic hierarchies we can provide better recommendations.; São Paulo Research Foundation (FAPESP) (grants 2010/20564-8, 2011/19850-9, 2012/13830-9, 2013/16039-3, 2013/22547-1); CAPES; CNPq

Avaliação de métodos não-supervisionados de seleção de atributos para mineração de textos; Evaluation of unsupervised feature selection methods for Text Mining

Nogueira, Bruno Magalhães
Source: Biblioteca Digitais de Teses e Dissertações da USP Publisher: Biblioteca Digitais de Teses e Dissertações da USP
Type: Master's dissertation Format: application/pdf
Published on 27/03/2009 PT
Search relevance
66.52%
Selecting attributes is sometimes a necessary activity for the correct development of machine learning tasks. In Text Mining, reducing the number of attributes in a text collection is essential for the effectiveness of the process and the comprehensibility of the extracted knowledge, since one deals with high-dimensional, sparse spaces. When the text collection is unlabeled, unsupervised attribute reduction methods are used. However, there is no general predefined way of obtaining measures of attribute usefulness in unsupervised methods, which makes them more laborious to carry out. This work therefore addresses unsupervised feature selection through an exploratory study of methods of this kind, comparing the effectiveness of each in reducing the number of attributes in Text Mining applications. Ten methods are compared: Ranking by Term Frequency, Ranking by Document Frequency, Term Frequency-Inverse Document Frequency, Term Contribution, Term Variance, Term Variance Quality, Luhn's Method, the LuhnDF Method, Salton's Method and Zone-Scored Term Frequency. Two of them, the LuhnDF Method and Zone-Scored Term Frequency, are proposed here. The evaluation has two focuses...
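Three of the simpler rankings named in this abstract (term frequency, document frequency and term variance) can be sketched in a few lines. This is only an illustrative sketch, not the dissertation's implementation: the whitespace tokenizer, the function name `rank_terms`, and the term-variance formula used here (the sum over documents of the squared deviation of a term's count from its mean count) are assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def rank_terms(docs, top_k):
    """Rank terms of an unlabeled collection by TF, DF and term variance."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    tf = Counter()  # total occurrences of each term in the collection
    df = Counter()  # number of documents containing each term
    for tokens in tokenized:
        tf.update(tokens)
        df.update(set(tokens))

    # Term variance (assumed definition): sum_j (f_tj - mean_t)^2 over documents j.
    variance = {}
    for term, total in tf.items():
        mean = total / n_docs
        variance[term] = sum((tokens.count(term) - mean) ** 2 for tokens in tokenized)

    def top(scores):
        # Highest score first; ties broken alphabetically for determinism.
        ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [term for term, _ in ordered[:top_k]]

    return {"tf": top(tf), "df": top(df), "tv": top(variance)}


docs = ["text mining text text", "data mining", "clustering data mining"]
ranking = rank_terms(docs, top_k=2)
```

Keeping only the top-ranked terms under any of these scores reduces the attribute space without using labels. Note that a term occurring uniformly in every document ("mining" in the toy collection) scores high on DF but zero on variance, which is why the rankings can disagree.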

Análise de dados por meio de agrupamento fuzzy semi-supervisionado e mineração de textos; Data analysis using semisupervised fuzzy clustering and text mining

Medeiros, Debora Maria Rossi de
Source: Biblioteca Digitais de Teses e Dissertações da USP Publisher: Biblioteca Digitais de Teses e Dissertações da USP
Type: Doctoral thesis Format: application/pdf
Published on 08/12/2010 PT
Search relevance
66.26%
This thesis presents a set of techniques proposed to improve Data Clustering (DC) processes. The main goal is to provide the scientific community with a toolset for a complete analysis of the implicit structures in data sets, from the discovery of these structures, allowing the use of prior knowledge about the data, to the analysis of their meaning in the context in which they are embedded. This toolset has two main points. The first is the semi-supervised fuzzy DC algorithm SSL+P and its evolution SSL+P*, which can take into account the prior knowledge available about the data in two forms: labels and proximity levels for pairs of examples, here called Prior Knowledge Hints (PKHs). These algorithms also allow the distance metric to be fitted to the data and to the PKHs. The SSL+P* algorithm also seeks to estimate the ideal number of clusters for a given data set, taking the available PKHs into account. The SSL+P and SSL+P* algorithms involve the minimization of an objective function by means of a Population-Based Optimization (PBO) algorithm. This thesis also provides tools that can be used directly at this point: the two modified versions of the Particle Swarm Optimization (PSO) algorithm...

Evidence-based software engineering: systematic literature review process based on visual text mining; Engenharia de software baseada em evidências: processo de revisão sistemática de literatura baseado em mineração visual de texto

Scannavino, Katia Romero Felizardo
Source: Biblioteca Digitais de Teses e Dissertações da USP Publisher: Biblioteca Digitais de Teses e Dissertações da USP
Type: Doctoral thesis Format: application/pdf
Published on 15/05/2012 PT
Search relevance
66.26%
Context: Systematic literature review (SLR) is a methodology used to aggregate all relevant evidence on a specific research question. One of the activities associated with the SLR process is the selection of primary studies. The process used to select primary studies can be arduous, particularly when the researcher faces large volumes of primary studies. Another activity associated with an SLR is the presentation of the results of the primary studies that meet the SLR purpose. The results are generally summarized in tables, and an alternative that reduces the time needed to understand the data is the use of graphic representations. Systematic mapping (SM) is a more open form of SLR used to build a classification and categorization scheme for a field of interest. The categorization and classification activities in SM are not trivial tasks, since they require manual effort and domain knowledge from reviewers to achieve adequate results. Although clearly crucial, both SLR and SM processes are time-consuming and most activities are conducted manually. Objective: The aim of this research is to use Visual Text Mining (VTM) to support different activities of SLR and SM processes, e.g., the selection of primary studies, the presentation of the results of an SLR, and the categorization and classification of an SM. Method: Extensions to the SLR and SM processes based on VTM were proposed. A series of case studies were conducted to demonstrate the usefulness of the VTM techniques in the selection...

Aircraft interior failure pattern recognition utilizing text mining and neural networks

Rodrigues, Rogerio S.; Balestrassi, Pedro Paulo; Paiva, Anderson P.; Garcia-Diaz, Alberto; Pontes, Fabricio J.
Source: Springer Publisher: Springer
Type: Journal article Format: 741-766
ENG
Search relevance
66.26%
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); Being more competitive is routine in the aeronautical sector. Airline competitiveness is affected by such factors as time, price, reliability, availability, safety, technology, quality, and information management. To remain competitive, airlines must promptly identify and correct failures found in their fleet. This study aims at reducing the time spent on identifying and correcting such logged failures. Utilizing Text Mining techniques during the pre-processing phase, our study processes an extensive database of events from commercial regional jets. The result is a unique list of keywords that describes each reported failure. Later, an Artificial Neural Network (ANN) identifies and classifies failure patterns, yielding a respective disposition for a given failure pattern. Approximately five years of historical data were used to build and validate the present model. Results obtained were promising.

@Note : a workbench for biomedical text mining

Lourenço, Anália; Carreira, Rafael; Carneiro, S.; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, E. C.; Rocha, I.; Rocha, Miguel
Source: Elsevier Publisher: Elsevier
Type: Journal article
Published in 08/2009 ENG
Search relevance
66.41%
Biomedical Text Mining (BioTM) is providing valuable approaches to the automated curation of scientific literature. However, most efforts have addressed the benchmarking of new algorithms rather than user operational needs. Bridging the gap between BioTM researchers and biologists’ needs is crucial to solve real-world problems and promote further research. We present @Note, a platform for BioTM that aims at the effective translation of the advances between three distinct classes of users: biologists, text miners and software developers. Its main functional contributions are the ability to process abstracts and full texts; an information retrieval module enabling PubMed search and journal crawling; a pre-processing module with PDF-to-text conversion, tokenisation and stopword removal; a semantic annotation schema; a lexicon-based annotator; a user-friendly annotation view that allows annotations to be corrected; and a Text Mining Module supporting dataset preparation and algorithm evaluation. @Note improves interoperability, modularity and flexibility when integrating in-house and open-source third-party components. Its component-based architecture allows the rapid development of new applications, emphasizing the principles of transparency and simplicity of use. Although it is still on-going...

Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

Santos, André Fernandes dos; Nogueira, R.; Lourenço, Anália
Source: Universidad de Salamanca Publisher: Universidad de Salamanca
Type: Journal article
Published in 2012 ENG
Search relevance
66.33%
Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, error-prone and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

Classifying heart sounds using SAX motifs, random forests and text mining techniques

Gomes, Elsa Ferreira; Jorge, Alípio M.; Azevedo, Paulo J.
Source: 334-337 Publisher: 334-337
Type: Conference paper or conference object
Published in 2014 ENG
Search relevance
66.26%
In this paper we describe an approach to classifying heart sounds (classes Normal, Murmur and Extra-systole) that is based on the discretization of sound signals using the SAX (Symbolic Aggregate Approximation) representation. The ability to automatically classify heart sounds, or at least to support human decisions in this task, is socially relevant for spreading the reach of medical care using simple mobile devices or digital stethoscopes. In our approach, sounds are first pre-processed using signal processing techniques (decimate, low-pass filter, normalize, Shannon envelope). Then the pre-processed signals are transformed into sequences of discrete SAX symbols. These sequences are subject to a process of motif discovery. Frequent sequences of symbols (motifs) are adopted as features. Each sound is then characterized by the frequent motifs that occur in it and their respective frequency. This is similar to the term frequency (TF) model used in text mining. In this paper we compare the TF model with the application of TFIDF (Term Frequency - Inverse Document Frequency) and the use of bi-grams (frequent size-two sequences of motifs). Results show the ability of the motif-based TF approach to separate classes and the relative value of the TFIDF and bi-gram variants. The separation of the Extra-systole class is overly difficult, and much better results are obtained for separating the Murmur class. Empirical validation is conducted using real data collected in noisy environments. We have also assessed the cost-reduction potential of the proposed methods by considering a fixed cost model and using a cost-sensitive meta-algorithm.
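The pipeline this abstract outlines (discretize a signal into SAX symbols, then treat symbol sequences like terms in a TF or TFIDF model) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it skips the signal pre-processing steps, assumes the signal length is a multiple of the number of segments, uses equiprobable Gaussian breakpoints computed with `statistics.NormalDist`, and counts every fixed-length symbol n-gram as a "motif" instead of running a frequent-motif discovery step. The function names are hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter
from statistics import NormalDist, mean, pstdev


def sax(signal, n_segments, alphabet="abcd"):
    """Discretize a signal into a SAX word (assumes len(signal) % n_segments == 0)."""
    mu, sigma = mean(signal), pstdev(signal)
    # z-normalize so that Gaussian breakpoints are meaningful.
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in signal]
    # Piecewise Aggregate Approximation: mean of each equal-length segment.
    seg = len(z) // n_segments
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    # Breakpoints split the standard normal into len(alphabet) equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


def motif_tf(word, n=2):
    """Count every length-n symbol sequence (simplified stand-in for motifs)."""
    return Counter(word[i:i + n] for i in range(len(word) - n + 1))


def motif_tfidf(words, n=2):
    """Weight motif counts by log(N / document frequency)."""
    tfs = [motif_tf(w, n) for w in words]
    df = Counter(m for tf in tfs for m in tf)
    total = len(words)
    return [{m: c * math.log(total / df[m]) for m, c in tf.items()} for tf in tfs]


word = sax([0, 0, 1, 1, 2, 2, 3, 3], n_segments=4)
weights = motif_tfidf(["abab", "abcd"])
```

The resulting per-sound dictionaries play the role of document vectors and could be fed to any classifier. A motif that appears in every sound gets a TFIDF weight of zero, which is the usual reason TF and TFIDF features behave differently.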

Automatic information retrieval through text-mining

Viana, Hugo Henrique Amorim
Source: Faculdade de Ciências e Tecnologia Publisher: Faculdade de Ciências e Tecnologia
Type: Master's dissertation
Published in 2013 ENG
Search relevance
66.36%
Dissertation presented for obtaining the Master’s Degree in Electrical Engineering and Computer Science, at Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia; Nowadays, a very large number of firms in the European Union are classified as Small and Medium Enterprises (SMEs), and they employ a large share of the active workforce in Europe. Nonetheless, SMEs cannot afford to implement methods or tools to systematically adopt innovation as part of their business processes. Innovation is the engine of competitiveness in the globalized environment, especially in the current socio-economic situation. This thesis provides a platform that, when integrated with the ExtremeFactories (EF) project, helps SMEs become more competitive by means of a scheduled monitoring functionality. The thesis presents a text-mining platform that can schedule keyword-based information gathering. Several implementation choices were made in developing the platform; the one that deserves particular emphasis is the framework, Apache Lucene Core 2, which supplies an efficient text-mining tool and is used extensively for the purposes of the thesis.

Technological indicators of nanocellulose advances obtained from data and text mining applied to patent documents

Milanez, Douglas Henrique; Amaral, Roniberto Morato do; Faria, Leandro Innocentini Lopes de; Gregolin, José Angelo Rodrigues
Source: ABM, ABC, ABPol Publisher: ABM, ABC, ABPol
Type: Journal article Format: text/html
Published on 01/12/2014 EN
Search relevance
66.35%
Nanocelluloses are remarkable cellulose-based nanomaterials that have potential for innovation and sustainable appeal. Their advances can be assessed using patent indicators and text mining techniques. The aim of this study was to analyze the advances in nanocellulose based on indicators compiled from patents filed at the United States Patent and Trademark Office (USPTO) from 2000 to 2012. Assignees, technological subjects, highly cited patents, applications and types of nanocellulose were obtained by mining structured and unstructured data. The results highlighted the different interests in the USA market, mainly after 2007. Mined terms from titles and abstracts could add further information to the analysis. However, although the method applied was useful, it was not sufficient to identify all applications and types of nanocellulose involved in the sample analyzed; it is therefore recommended that other document parts be included in future analyses.

BOOKISH: Uma ferramenta para contextualização de documentos utilizando mineração de textos e expansão de consulta; BOOKISH: A tool for background documents using text mining and query expansion

SILVA, Luciana Oliveira e
Source: Universidade Federal de Goiás; BR; UFG; Mestrado em Ciência da Computação; Ciências Exatas e da Terra - Ciências da Computação Publisher: Universidade Federal de Goiás; BR; UFG; Mestrado em Ciência da Computação; Ciências Exatas e da Terra - Ciências da Computação
Type: Dissertation Format: application/pdf
POR
Search relevance
66.26%
The continuous development of technology and its dissemination in all domains have caused significant changes in society and in education. The new global society demands new skills and provides an opportunity to introduce new technologies into the educational process, improving traditional education systems. The focus should be on the search for information, significant research, and on the development of projects, rather than on the pure transmission of content. When delivering a lecture about a given content, teachers often provide additional sources that will help students deepen their understanding of the subject and carry out activities. Furthermore, it is desirable to have proactive students, capable of interpreting and identifying other sources of information that complement and expand the subject being studied. However, one of the challenges today is information overload: there are many documents available and few effective ways to handle them. Every day, large numbers of documents are stored and made available. These documents contain a lot of relevant information. However, finding that knowledge is a difficult task. The BOOKISH system, proposed in this work, assists students in their search activity. Analyzing PowerPoint slide presentations...

DENDROID: A text mining approach to analyzing and classifying code structures in Android malware families

Suárez-Tangil, Guillermo; Estévez-Tapiador, Juan M.; Peris-López, Pedro; Blasco, Jorge
Source: Elsevier Publisher: Elsevier
Type: info:eu-repo/semantics/submittedVersion; info:eu-repo/semantics/article
Published in 03/2014 ENG
Search relevance
66.36%
The rapid proliferation of smartphones over the last few years has come hand in hand with an impressive growth in the number and sophistication of malicious apps targeting smartphone users. The availability of reuse-oriented development methodologies and automated malware production tools makes it exceedingly easy to produce new specimens. As a result, market operators and malware analysts are increasingly overwhelmed by the amount of newly discovered samples that must be analyzed. This situation has stimulated research in intelligent instruments to automate parts of the malware analysis process. In this paper, we introduce DENDROID, a system based on text mining and information retrieval techniques for this task. Our approach is motivated by a statistical analysis of the code structures found in a dataset of Android OS malware families, which reveals some parallelisms with classical problems in those domains. We then adapt the standard Vector Space Model and reformulate the modelling process followed in text mining applications. This enables us to measure similarity between malware samples, which is then used to automatically classify them into families. We also investigate the application of hierarchical clustering over the feature vectors obtained for each malware family. The resulting dendrograms resemble the so-called phylogenetic trees for biological species...
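The general idea of adapting the Vector Space Model to this setting can be illustrated with a short sketch: each sample is a bag of code-structure tokens, each family is an aggregated count vector, and a new sample is assigned to the family with the highest cosine similarity. This is a generic illustration, not DENDROID's actual feature extraction or weighting scheme; the token names (`cs1`, `cs2`, ...), the family names and the helper functions are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_family_vectors(labelled_samples):
    """Aggregate the token counts of all samples belonging to each family."""
    families = {}
    for family, tokens in labelled_samples:
        families.setdefault(family, Counter()).update(tokens)
    return families


def classify(tokens, families):
    """Assign a sample to the most similar family vector (nearest neighbour)."""
    vector = Counter(tokens)
    return max(families, key=lambda name: cosine(vector, families[name]))


# Hypothetical training data: (family label, code-structure tokens of one sample).
training = [
    ("famA", ["cs1", "cs1", "cs2"]),
    ("famA", ["cs1", "cs2", "cs2"]),
    ("famB", ["cs3", "cs4"]),
    ("famB", ["cs3", "cs3", "cs4"]),
]
families = build_family_vectors(training)
predicted = classify(["cs1", "cs2", "cs5"], families)
```

The same family vectors could also be passed to a hierarchical clustering routine to produce a dendrogram of families, which is the second use the abstract mentions.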

Business intelligence in banking: A literature analysis from 2002 to 2013 using Text Mining and latent Dirichlet allocation

Moro, Sérgio; Cortez, Paulo; Rita, Paulo
Source: Elsevier Publisher: Elsevier
Type: Journal article
Published in 02/2015 ENG
Search relevance
66.39%
...intelligence applications for the banking industry. Searches were performed in relevant journals, resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit of relevant terms in both the business intelligence and banking domains. Moreover, latent Dirichlet allocation modeling was used in order to group articles into several relevant topics. The analysis was conducted using a dictionary of terms belonging to both the banking and business intelligence domains. This procedure allowed the identification of relationships between terms and the topics grouping articles, enabling hypotheses regarding research directions to emerge. To confirm such hypotheses, relevant articles were collected and scrutinized, allowing the text mining procedure to be validated. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or denial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of articles focused more on business intelligence techniques and their applications...

A practical application of text mining to literature on cognitive rehabilitation and enhancement through neurostimulation

Balan, Puiu F.; Gerits, Annelies; Vanduffel, Wim
Source: Frontiers Media S.A. Publisher: Frontiers Media S.A.
Type: Journal article
EN_US
Search relevance
66.3%
The exponential growth in publications represents a major challenge for researchers. Many scientific domains, including neuroscience, are not yet fully engaged in exploiting large bodies of publications. In this paper, we promote the idea of partially automating the processing of scientific documents, specifically using text mining (TM), to efficiently review big corpora of publications. The “cognitive advantage” given by TM is mainly related to the automatic extraction of relevant trends from corpora of literature that would otherwise be impossible to analyze in short periods of time. Specifically, the benefits of TM are increased speed, quality and reproducibility of text processing, boosted by rapid updates of the results. First, we selected a set of TM tools that allow user-friendly approaches to the scientific literature, and which could serve as a guide for researchers willing to incorporate TM in their work. Second, we used these TM tools to obtain basic insights into the relevant literature on cognitive rehabilitation (CR) and cognitive enhancement (CE) using transcranial magnetic stimulation (TMS). TM readily extracted the diversity of TMS applications in CR and CE from vast corpora of publications, automatically retrieving trends already described in published reviews. TMS emerged as one of the important non-invasive tools that can both improve cognitive and motor functions in numerous neurological diseases and induce modulations/enhancements of many fundamental brain functions. TM also revealed trends in big corpora of publications by extracting occurrence frequency and relationships of particular subtopics. Moreover...

OSCAR4: a flexible architecture for chemical text-mining

Jessop, David M; Adams, Sam; Willighagen, Egon L; Hawizy, Lezan; Murray-Rust, Peter
Source: University of Cambridge Publisher: University of Cambridge
Type: Article; not applicable
EN
Search relevance
66.26%
The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a clean API that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a foundation upon which chemistry-specific text-mining tools can be built, and its development and usage are discussed.; We gratefully acknowledge OMII-UK, JISC (ChETA project) and EPSRC (Sciborg, Pathways to Impact awards) for funding.

Maximum Entropy Models for Text mining from the Life Sciences Literature; Addressing Data heterogeneity

Nikolov, Nikolay
Source: DAMTP and Department of Chemistry Publisher: DAMTP and Department of Chemistry
Type: Report
EN
Search relevance
66.26%
This is supporting data and software for an MPhil project report submitted on 2009-08-18 by Nikolay Nikolov. The data should be used in conjunction with the OSCAR3 software as described in the project report; The life sciences nowadays are characterized by rapid growth. Due to the huge number of publications per year (in the hundreds of thousands and growing), it is becoming increasingly difficult for researchers to stay abreast of the latest developments. Thus, automated methods of analysing scientific information grow in importance. Text mining in the Life Sciences aims at extracting information from textual data (usually abstracts or full texts of scientific publications, but also non-publications like clinical histories or patents). It normally involves some kind of machine learning technique that requires training data from the given thematic domain. Our case study concerns the automatic identification of chemical named entities (e.g. compounds, reaction names) from the life science literature. We investigate the impact of data heterogeneity on the performance of Maximum Entropy Markov models and explore possible solutions to this problem. This is, to the best of our knowledge, the first study to explore thematic heterogeneity in the chemistry-related life science literature and its impact on named entity recognition. Thus it is necessarily general - its role is to collect evidence...

Text mining e inferencia de defensas en el análisis del discurso en psicología; Text mining and inference of defenses in discourse analysis in psychology

Stein-Sparvieri, Elena
Source: Subjetividad y procesos cognitivos Publisher: Subjetividad y procesos cognitivos
Type: Journal article Format: text/html
Published on 01/12/2010 ES
Search relevance
66.26%
Text mining is a technique employed in discourse analysis, increasingly used by software tools that identify key information. It consists of extracting from the discourse only the data that are of interest to the researcher. This paper explains some of the TM techniques used in discourse analysis in psychology. The following points are considered: a) extraction of information of specific interest from the patient's discourse, b) classification of the information according to observed patterns, c) evaluation and interpretation of the results, d) organization of the results in databases, e) formulation of hypotheses from the results. Within this framework, the paper shows that it is possible to infer the defenses employed at the verbal level of the patient / therapist discourse analyzed with the David Liberman algorithm (ADL) methodology, using new computational language-processing techniques, specifically the implementation of the GATE Natural Language Processing (NLP) tool for the ADL. In order to justify the inference of defenses in automatic discourse analysis, Freudian theory is presented with regard to: a) central defenses...

A Semantically-based Lattice Approach for Assessing Patterns in Text Mining Tasks

Atkinson, John; Figueroa, Alejandro; Pérez, Claudio
Source: Centro de Investigación en computación, IPN Publisher: Centro de Investigación en computación, IPN
Type: Journal article Format: text/html
Published on 01/12/2013 EN
Search relevance
66.26%
In this paper, a new approach to automatically assessing patterns in text mining is proposed. It combines corpus-based semantics and Formal Concept Analysis in order to deal with semantic and structural properties for concepts discovered in tasks such as the generation of association rules. Experiments show the promise of our evaluation method to effectively assess discovered patterns when compared with other state-of-the-art evaluation methods.

Monitoring interaction and collective text production through text mining; Acompañamiento de la interacción y producción textual colectiva através de la mineración de textos; Acompanhamento da interação e produção textual coletiva por meio de mineração de textos

Macedo, Alexandra Lorandi; Behar, Patricia Alejandra; Azevedo, Breno Fabrício Terra
Source: ETD - Educação Temática Digital Publisher: ETD - Educação Temática Digital
Type: info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; Format: application/pdf
Published on 07/03/2014 POR
Search relevance
66.38%
This article presents the Concepts Network tool, developed using text mining technology. The main objective of this tool is to extract and relate terms of greatest incidence from a text and exhibit the results in the form of a graph. The Network was implemented in the Collective Text Editor (CTE) which is an online tool that allows the production of texts in synchronized or non-synchronized forms. This article describes the application of the Network both in texts produced collectively and texts produced in a forum. The purpose of the tool is to offer support to the teacher in managing the high volume of data generated in the process of interaction amongst students and in the construction of the text. Specifically, the aim is to facilitate the teacher’s job by allowing him/her to process data in a shorter time than is currently demanded. The results suggest that the Concepts Network can aid the teacher, as it provides indicators of the quality of the text produced. Moreover, messages posted in forums can be analyzed without their content necessarily having to be pre-read.
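The core operation described here, extracting the most frequent terms of a text and relating them in a graph, can be sketched as a sentence-level co-occurrence network. This is an assumed, simplified reading of the idea and not the Concepts Network tool itself: the stopword list, the regular-expression tokenizer, the choice of "same sentence" as the relation, and the function name `concept_network` are all illustrative assumptions.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a tiny illustrative stopword list; a real tool would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def concept_network(text, top_k=5):
    """Return (nodes, edges): top_k frequent terms and their co-occurrence counts."""
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    tokenized = [
        [w for w in re.findall(r"[a-z]+", s) if w not in STOPWORDS]
        for s in sentences
    ]
    # Nodes: the terms with the greatest incidence in the whole text.
    freq = Counter(w for sentence in tokenized for w in sentence)
    nodes = {term for term, _ in freq.most_common(top_k)}
    # Edges: pairs of node terms that occur in the same sentence.
    edges = Counter()
    for sentence in tokenized:
        present = sorted(set(sentence) & nodes)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return nodes, edges


text = "Students write text. Students edit text. Teachers read text."
nodes, edges = concept_network(text, top_k=2)
```

The `edges` counter is an adjacency list with weights, so it can be handed to any graph-drawing library to display the network. Heavier edges indicate terms that the authors of the text tend to discuss together.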