Page 1 of results: 425 digital items found in 0.005 seconds

"Métodos para análise discursiva automática" ; Methods for Automatic Discourse Analysis

Pardo, Thiago Alexandre Salgueiro
Source: Biblioteca Digital de Teses e Dissertações da USP Publisher: Biblioteca Digital de Teses e Dissertações da USP
Type: Doctoral thesis Format: application/pdf
Published on 04/08/2005 PT
Search relevance
46.29%
Research in Linguistics and Computational Linguistics has long established that a text is more than a simple sequence of juxtaposed sentences. A text has a highly elaborate underlying structure that relates all of its content and gives it coherence. This structure is called discourse structure, and it is the object of study of the research area known as Discourse Analysis. Because this knowledge is useful for many Natural Language Processing applications, for example automatic text summarization and anaphora resolution, automatic discourse analysis has received a great deal of attention. For Brazilian Portuguese in particular, there are few resources and little research in this area. In this setting, this doctoral thesis aims to investigate, develop and implement methods for automatic discourse analysis, adopting as its main discourse theory the Rhetorical Structure Theory, one of the most widespread theories today. From the rhetorical annotation and analysis of a corpus of scientific texts in Computing, the first automatic rhetorical analyzer for Brazilian Portuguese was produced, called DiZer (DIscourse analyZER)...

Sintaxe x-barra : uma aplicação computacional; X-bar syntax : a computational application

Menuzzi, Sérgio de Moura; Othero, Gabriel de Ávila
Source: Universidade Federal do Rio Grande do Sul Publisher: Universidade Federal do Rio Grande do Sul
Type: Journal article Format: application/pdf
POR
Search relevance
46.32%
In this work, we present a computational application of X-bar theory (cf. HAEGEMAN 1994, MIOTO et al. 2004) through the program Grammar Play, a syntactic parser in Prolog. Grammar Play analyzes simple declarative sentences of Brazilian Portuguese, identifying their constituent structure. Its grammar is implemented in Prolog using DCGs and follows the schema proposed by X-bar theory. The parser is a first attempt to expand the coverage of similar analyzers, such as those sketched in Pagani (2004) and Othero (2004). The goals guiding the present version of Grammar Play are to implement computationally coherent linguistic models applied to the description of Portuguese, and to create a computational tool that can be used didactically, for example in introductory classes on syntax and linguistics.; In this article, we present an application of X-bar syntax in a computational environment. We present the parser Grammar Play, a syntactic parser in Prolog. The parser analyses simple declarative sentences of Brazilian Portuguese, identifying their constituent structure. The grammar is implemented in Prolog, making use of DCGs, and it is based on the X-bar theory (HAEGEMAN 1994...

Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications

Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M.; Maudsley, Stuart
Source: Frontiers Media S.A. Publisher: Frontiers Media S.A.
Type: Journal article
Published on 30/01/2013 EN
Search relevance
46.19%
Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses...
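
As a rough illustration of the LSI technique this review covers, the standard textbook formulation can be written as below. The symbols ($A$, $k$, $q$, $d_j$) are assumptions introduced here for the sketch, not notation taken from the review itself.

```latex
% A: term-by-document matrix (m terms, n documents); k: number of retained dimensions
A \approx A_k = U_k \Sigma_k V_k^{\top}, \qquad
U_k \in \mathbb{R}^{m \times k},\;
\Sigma_k \in \mathbb{R}^{k \times k},\;
V_k \in \mathbb{R}^{n \times k}

% Fold a query vector q (length m) into the k-dimensional latent space
\hat{q} = \Sigma_k^{-1} U_k^{\top} q

% Rank document j by cosine similarity with its latent vector d_j (row j of V_k)
\operatorname{sim}(\hat{q}, d_j) = \frac{\hat{q} \cdot d_j}{\lVert \hat{q} \rVert \, \lVert d_j \rVert}
```

Because queries and documents are compared in the reduced space rather than by exact term overlap, two texts can score as similar without sharing a term, which is one way to read the abstract's remark about extracting indirect relationships.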

Synchronous grammars as tree transducers

Shieber, Stuart
Source: Harvard University Publisher: Harvard University
EN_US
Search relevance
46.19%
Tree transducer formalisms were developed in the formal language theory community as generalizations of finite-state transducers from strings to trees. Independently, synchronous tree-substitution and -adjoining grammars arose in the computational linguistics community as a means to augment strictly syntactic formalisms to provide for parallel semantics. We present the first synthesis of these two independently developed approaches to specifying tree relations, unifying their respective literatures for the first time, by using the framework of bimorphisms as the generalizing formalism in which all can be embedded. The central result is that synchronous tree-substitution grammars are equivalent to bimorphisms where the component homomorphisms are linear and complete.; Engineering and Applied Sciences

Issues in the foundations of cognitive psychology

Stabler, Edward Palmer
Source: Massachusetts Institute of Technology Publisher: Massachusetts Institute of Technology
Type: Doctoral thesis Format: 214 leaves
ENG
Search relevance
46.23%
by Edward Palmer Stabler, Jr.; Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Linguistics and Philosophy, 1981.; MICROFICHE COPY AVAILABLE IN ARCHIVES AND HUMANITIES.; Includes bibliographies.

Ranking contemporary American poems

Dalvean, Michael Coleman
Source: Oxford University Press; European Association for Digital Humanities; http://www.oxfordjournals.org/ Publisher: Oxford University Press; European Association for Digital Humanities; http://www.oxfordjournals.org/
Type: Journal article; Submitted Version Format: 22 pages
Search relevance
46.19%
In this paper I use computational linguistics to find the differences between poems written by amateurs and poems written by professionals. I identify a number of linguistic variables that are important in distinguishing between the two classes of poems. To a large extent the findings corroborate those of earlier researchers, such as the fact that professional poems have more concrete language than amateur poems. However, I go on to use the identified characteristics to create an ensemble classifier using the principles of machine learning. The holdout sample classification accuracy of the classifier is 80%. Furthermore, I go on to use the scores generated by the classifier to rank a number of contemporary American poets on a continuum from "amateur" to "professional". This method could be used by publishers to run an initial check on submitted poems to determine their merit.; Article submitted for publication in January 2013. Submitted version available on Social Science Research Network site at http://ssrn.com/abstract=2208452. Published version (28 June 2013) available as an Advance Access article.

The Axial Age: an examination of cognitive evolution

Dalvean, Michael
Source: Australian National University Publisher: Australian National University
Type: Working/Technical Paper
Search relevance
46.19%
Prior to the first millennium BCE, introspection and reflexive thought were essentially absent from literary texts. At approximately the middle of the first millennium BCE, introspection and reflexive thought began to emerge. This period has been described as the "Axial Age". However, objective measurement of the phenomenon has not been undertaken. In this paper I attempt to track this change using techniques derived from computational linguistics. Specifically, I use the variable Cogmech (cognitive mechanisms) from Linguistic Inquiry and Word Count (LIWC) to measure the extent to which there were changes in cognitive elements of expression used in literary works written in the period from 2000 BCE to the modern era. The evidence is that Cogmech rises significantly at approximately 650 BCE and remains at this level with minor variations up to the modern era, thus supporting the Axial Age hypothesis.
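
The measurement described above can be sketched as a dictionary-based category score: the percentage of a text's word tokens that fall in a given word list. The code below is only a minimal sketch under stated assumptions. `COGMECH_WORDS` is a tiny hypothetical stand-in (the real LIWC dictionary is proprietary and far larger), `category_percentage` is a name invented here, and matching is exact-word only.

```python
import re

# Hypothetical stand-in for a cognitive-mechanism word list; the actual
# LIWC dictionary is proprietary and much larger than this.
COGMECH_WORDS = {"think", "know", "because", "consider", "cause", "ought"}


def category_percentage(text, category_words):
    """Return the percentage of word tokens in `text` found in `category_words`."""
    # Lowercase the text and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in category_words)
    return 100.0 * hits / len(tokens)


sample = "I think I know the cause because I consider it"
score = category_percentage(sample, COGMECH_WORDS)
print(score)  # 5 of the 10 tokens are in the list, so this prints 50.0
```

Computing such a score per work and plotting it against composition date would give the kind of time series the abstract describes; how the paper dates and samples its texts is not stated here.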

Word Sense Disambiguation with GermaNet; Disambiguierung von Wortbedeutungen mit GermaNet

Henrich, Verena
Source: Universität Tübingen Publisher: Universität Tübingen
Type: Dissertation
EN
Search relevance
46.19%
The subject of this dissertation is boosting research on word sense disambiguation (WSD) for German. WSD is a very active area of research in computational linguistics, but most of the work is focused on English. One of the factors that has hampered WSD research for other languages such as German is the lack of appropriate resources, particularly in the form of sense-annotated corpus data. Hence, this work inevitably has to start with the preparation of resources before actual WSD experiments can be performed. The work program is fourfold. Firstly, since sense definitions are necessary to distinguish word senses (both for humans and for automatic WSD algorithms), the German wordnet GermaNet is (semi-)automatically extended with sense descriptions. This is done by automatically mapping GermaNet senses to descriptions in the online dictionary Wiktionary. Secondly, since the availability of sense-annotated corpora is a prerequisite for evaluating and developing word sense disambiguation systems, two GermaNet sense-annotated corpora are constructed. One corpus is automatically constructed and the other corpus is manually sense-annotated. Thirdly, several knowledge-based WSD algorithms are applied and evaluated -- using the newly created sense-annotated corpora. These algorithms are based on a suite of semantic relatedness measures...

WASSA 2012 - Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis

BALAHUR DOBRESCU ALEXANDRA
Source: The Association for Computational Linguistics (ACL) Publisher: The Association for Computational Linguistics (ACL)
Type: Books Format: Online
ENG
Search relevance
56.34%
In the past years, the quantity of contents generated by users on the Web, in social networking sites, fora and microblogs has reached an unprecedented level. All this data adds on to the contents generated in traditional media, such as newspapers, bringing additional factual, as well as a high quantity of opinionated and subjective information. In the context of the society in which we live, where sifting through the immense quantities of information to gather knowledge has become a must, the challenge of processing opinionated and subjective information is becoming more and more a focus to the Natural Language Processing (NLP) research communities worldwide. In the past decade, the interest in proposing computational methods to deal with subjectivity and sentiment in text has grown constantly from the NLP community. However, although the subjectivity and sentiment analysis research fields have been highly dynamic in this period, much remains still to be done, so that systems dealing with subjectivity, sentiment and, more generally, affect in text, can be reliably used in critical decision-making environments. Moreover, the new means of communication and user connection, in microblogs and social networks, become more and more relevant to these two tasks...

Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

BALAHUR DOBRESCU ALEXANDRA; VAN DER GOOT Erik
Source: The Association for Computational Linguistics Publisher: The Association for Computational Linguistics
Type: Books Format: Online
ENG
Search relevance
56.43%
Research in automatic Subjectivity and Sentiment Analysis, as subtasks in Affective Computing within the Artificial Intelligence field of Natural Language Processing (NLP), has flourished in the past years. The growth in interest in these tasks was motivated by the birth and rapid expansion of the Social Web that made it possible for people all over the world to share, comment or consult content on any given topic. In this context, opinions, sentiments and emotions expressed in Social Media texts have been shown to have a high influence on the social and economic behaviour worldwide. The aim of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA 2013) was to continue the line of the previous three editions, bringing together researchers in Computational Linguistics working on Subjectivity and Sentiment Analysis and researchers working on interdisciplinary aspects of affect computation from text. Additionally, this year, we extended the focus to Social Media phenomena and the impact of affect-related phenomena in this context. WASSA 2013 was organized in conjunction with the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies...

5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis WASSA 2014. Proceedings of the Workshop

BALAHUR DOBRESCU ALEXANDRA; VAN DER GOOT Erik; STEINBERGER Ralf; MONTOYO Andrés
Source: Association for Computational Linguistics Publisher: Association for Computational Linguistics
Type: Books Format: Online
ENG
Search relevance
56.43%
Research in automatic Subjectivity and Sentiment Analysis (SSA), as subtasks of Affective Computing and Natural Language Processing (NLP), has flourished in the past years. The growth in interest in these tasks was motivated by the birth and rapid expansion of the Social Web that made it possible for people all over the world to share, comment or consult content on any given topic. In this context, opinions, sentiments and emotions expressed in Social Media texts have been shown to have a high influence on the social and economic behavior worldwide. SSA systems are highly relevant to many real-world applications (e.g. marketing, eGovernance, business intelligence, social analysis) and also to many tasks in Natural Language Processing (NLP) - information extraction, question answering, textual entailment, to name just a few. The importance of this field has been proven by the high number of approaches proposed in research in the past decade, as well as by the interest that it raised from other disciplines (Economics, Sociology, Psychology) and the applications that were created using its technology. Despite the large interest shown by the research community and the development of a set of benchmarking resources and methods to tackle sentiment analysis...

Tecnologies de la llengua i les seves aplicacions

Martí Antonín, M. Antònia; Civit Torruella, Montserrat; Taulé Delor, Mariona
Source: Universidade da Coruña Publisher: Universidade da Coruña
Type: Journal article
CAT
Search relevance
46.46%
[Abstract] Research in Computational Linguistics and Natural Language Processing has in recent years given rise to the so-called Language Technologies, whose main goal is the development of computer systems capable of recognizing, understanding and generating human language in all its forms. To this end, a range of applications has been developed, such as Machine Translation, Information Extraction and Retrieval, Document Classification, etc., which process information to facilitate access to, and the organization and transmission of, the knowledge generated by the so-called Information Society in which we live. As in other scientific disciplines, the area of Computational Linguistics and Natural Language Processing has moved from an initial stage centred on basic research of an experimental nature to one that interacts more with society and is therefore more interested in creating products and applications that solve real problems. This means developing systems and resources capable of analysing unrestricted language, that is, offering broad linguistic coverage. This article presents, by way of introduction, the (linguistic) resources and the most characteristic applications currently being developed within the framework of Language Technologies. Specifically...

Ranking canonical English poems

Dalvean, Michael
Source: Australian National University Publisher: Australian National University
Type: Journal article Format: 43 pages
Search relevance
46.19%
In this article I extend recent work on the application of computational linguistics to the analysis of poetry. The dataset consists of 85 canonical English poems and a matched control group of obscure poems. I use several content analysis dictionaries to create over 500 linguistic variables and then use machine learning to develop a classifier. The classifier, consisting of 7 linguistic variables, is tested using 10-fold cross-validation. Classifier accuracy is 69%. Of the 7 variables, three confirm previous findings about the nature of how “successful” poems differ from less successful poems while four of the variables give further insight into the phenomenon. I then go on to rank the poems using the probability scores of the classifier and find that Blake's “A poison Tree” scores highest. I explain the ranking method as being a means of distilling the “literary” appeal from the “popular” appeal of the poems in the sample. Finally, I discuss the implications for the theory of poetry in general.
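
The 10-fold validation mentioned in the abstract is commonly implemented as k-fold cross-validation: the data are split into k disjoint folds, and each fold serves once as the test set while the rest are used for training. The following is a minimal sketch of the fold construction only. The function name `k_fold_indices` and the 25-item example are hypothetical, and the article's actual tooling and shuffling strategy are not stated.

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical example: 25 items split into 10 folds.
folds = list(k_fold_indices(25, 10))
for train, test in folds:
    # A classifier would be fitted on `train` and scored on `test` here.
    print(len(train), len(test))
```

Averaging the per-fold accuracy gives a single estimate of the kind the abstract reports. In practice the items would normally be shuffled first, which this sketch omits.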

Propriedades das línguas naturais e o processo de aquisição : reflexões a partir da implementação do modelo em Berwick (1985); Properties of natural languages and the acquisitions process : reflections based on an implementation of the model in Berwick (1985)

Pablo Picasso Feliciano de Faria
Source: Biblioteca Digital da Unicamp Publisher: Biblioteca Digital da Unicamp
Type: Master's dissertation Format: application/pdf
Published on 03/12/2009 PT
Search relevance
46.19%
The main goal of this master's dissertation is to reflect on some properties of language and of the acquisition process, taking as a starting point questions that arose while implementing the model proposed in Berwick (1985). The general theoretical framework of this research is Generative Grammar, in the Chomskyan line, and in particular the model implemented here has Transformational Grammar as its main theoretical basis (cf. CHOMSKY, 1965). The properties of language we discuss include the distinctive features of lexical items, the asymmetry between specifiers and complements, empty categories, and the role of thematic information in syntax. The underlying idea running through these reflections is the search for a more abstract view of grammatical knowledge, seeking to revise or even eliminate devices that, first, appear as significant obstacles for the parser and, second, resist the identification of evidence for their acquisition from the point of view of the language learner. To achieve these goals, the first half of the work offers a brief theoretical discussion and then presents the Berwick model in reasonable detail...

The semantics of grammar formalisms seen as computer languages

Shieber, Stuart; Pereira, Fernando C. N.
Source: Association for Computational Linguistics Publisher: Association for Computational Linguistics
Type: Conference Paper
EN_US
Search relevance
46.19%
The design, implementation, and use of grammar formalisms for natural language have constituted a major branch of computational linguistics throughout its development. By viewing grammar formalisms as just a special case of computer languages, we can take advantage of the machinery of denotational semantics to provide a precise specification of their meaning. Using Dana Scott's domain theory, we elucidate the nature of the feature systems used in augmented phrase-structure grammar formalisms, in particular those of recent versions of generalized phrase structure grammar, lexical functional grammar and PATR-II, and provide a denotational semantics for a simple grammar formalism. We find that the mathematical structures developed for this purpose contain an operation of feature generalization, not available in those grammar formalisms, that can be used to give a partial account of the effect of coordination on syntactic features.; Engineering and Applied Sciences

Proceedings of the First Workshop on Computing News Storylines (CNewsStory 2015)

CASELLI Tommaso; VAN ERP Marieke; MINARD Anne-Lyse; FINLAYSON Mark; MILLER Ben; ATSERIAS Jordi; BALAHUR-DOBRESCU ALEXANDRA; VOSSEN Piek
Source: Association for Computational Linguistics Publisher: Association for Computational Linguistics
Type: Books Format: Online
ENG
Search relevance
46.3%
This volume contains the proceedings of the 1st Workshop on Computing News Storylines (CNewsStory 2015) held in conjunction with the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015) at the China National Convention Center in Beijing, on July 31st 2015. Narratives are at the heart of information sharing. Ever since people began to share their experiences, they have connected them to form narratives. The study of storytelling and the field of literary theory called narratology have developed complex frameworks and models related to various aspects of narrative such as plot structures, narrative embeddings, characters’ perspectives, reader response, point of view, narrative voice, narrative goals, and many others. These notions from narratology have been applied mainly in Artificial Intelligence and to model formal semantic approaches to narratives (e.g. Plot Units developed by Lehnert (1981)). In recent years, computational narratology has qualified as an autonomous field of study and research. Narrative has been the focus of a number of workshops and conferences (AAAI Symposia, Interactive Storytelling Conference (ICIDS)...

Editorial for the First Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics

Atanassova, Iana; Bertin, Marc; Mayr, Philipp
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 17/06/2015
Search relevance
46.46%
The workshop "Mining Scientific Papers: Computational Linguistics and Bibliometrics" (CLBib 2015), co-located with the 15th International Society of Scientometrics and Informetrics Conference (ISSI 2015), brought together researchers in Bibliometrics and Computational Linguistics in order to study the ways Bibliometrics can benefit from large-scale text analytics and sense mining of scientific papers, thus exploring the interdisciplinarity of Bibliometrics and Natural Language Processing (NLP). The goals of the workshop were to answer questions like: How can we enhance author network analysis and Bibliometrics using data obtained by text analytics? What insights can NLP provide on the structure of scientific writing, on citation networks, and on in-text citation analysis? This workshop is the first step to foster the reflection on the interdisciplinarity and the benefits that the two disciplines Bibliometrics and Natural Language Processing can derive from it.; Comment: 4 pages, Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics at ISSI 2015

Algorithms for Estimating Information Distance with Application to Bioinformatics and Linguistics

Kaltchenko, Alexei
Source: Cornell University Publisher: Cornell University
Type: Journal article
Published on 20/04/2004
Search relevance
46.26%
After reviewing unnormalized and normalized information distances based on incomputable notions of Kolmogorov complexity, we discuss how Kolmogorov complexity can be approximated by data compression algorithms. We argue that optimal algorithms for data compression with side information can be successfully used to approximate the normalized distance. Next, we discuss an alternative information distance, which is based on relative entropy rate (also known as Kullback-Leibler divergence), and compression-based algorithms for its estimation. Based on available biological and linguistic data, we arrive at the unexpected conclusion that in Bioinformatics and Computational Linguistics this alternative distance is more relevant and important than the ones based on Kolmogorov complexity.; Comment: 4 pages
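
The idea of approximating Kolmogorov complexity with a compressor can be illustrated with the widely used normalized compression distance (NCD). This is a sketch under stated assumptions, not the paper's algorithm. It substitutes the general-purpose `zlib` compressor for the side-information compressors the abstract argues for, and the function names and sample strings are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude proxy for its complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample data: two similar English-like strings and one unrelated byte pattern.
english_a = b"the cat sat on the mat and the dog sat on the log " * 20
english_b = b"the dog sat on the mat and the cat sat on the log " * 20
other = bytes(range(256)) * 4

print(ncd(english_a, english_b))  # similar inputs: smaller distance
print(ncd(english_a, other))      # unrelated inputs: larger distance
```

Values near 0 suggest the two inputs share most of their structure, and values near 1 suggest they share little. The relative-entropy-based distance that the abstract favours would need a different estimator and is not shown here.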

Using word and phrase abbreviation patterns to extract age from Twitter microtexts

Moseley, Nathaniel
Source: Rochester Institute of Technology Publisher: Rochester Institute of Technology
Type: Doctoral thesis
EN_US
Search relevance
46.56%
The wealth of texts available publicly online for analysis is ever increasing. Much work in computational linguistics focuses on syntactic, contextual, morphological and phonetic analysis on written documents, vocal recordings, or texts on the internet. Twitter messages present a unique challenge for computational linguistic analysis due to their constrained size. The constraint of 140 characters often prompts users to abbreviate words and phrases. Additionally, as an informal writing medium, messages are not expected to adhere to grammatically or orthographically standard English. As such, Twitter messages are noisy and do not necessarily conform to standard writing conventions of linguistic corpora, often requiring special pre-processing before advanced analysis can be done. In the area of computational linguistics, there is an interest in determining latent attributes of an author. Attributes such as author gender can be determined with some amount of success from many sources, using various methods, such as analysis of shallow linguistic patterns or topic. Author age is more difficult to determine, but previous research has been somewhat successful at classifying age as a binary (e.g. over or under 30), ternary, or even as a continuous variable using various techniques. Twitter messages present a difficult problem for latent user attribute analysis...

Corpora for computational linguistics

Orasan, Constantin; University of Wolverhampton - United Kingdom; Ha, Le An; Evans, Richard; Hasler, Laura; Mitkov, Ruslan
Source: UFSC Publisher: UFSC
Type: info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion Format: application/pdf
Published on 12/11/2008 POR
Search relevance
66.54%
Since the mid 90s corpora have become very important for computational linguistics. This paper offers a survey of how they are currently used in different fields of the discipline, with particular emphasis on anaphora and coreference resolution, automatic summarisation and term extraction. Their influence on other fields is also briefly discussed.