The advent of the internet and its constant evolution required the development of health information systems, giving nursing students broader, faster, and more effective access to information and improving the quality of bibliographic searches. Despite the evolution and dynamic updating of these information resources, and despite their free availability, some students still show weaknesses when searching for and retrieving this information. The objective of this qualitative study is to understand and analyze how bibliographic searches are carried out with respect to the formulation of search strategies, the choice of subject descriptors, the use of bibliographic databases, and the retrieval of full-text documents, and to identify the difficulties and advances encountered in this process by undergraduate nursing students in the Bachelor's and the Bachelor's and Teaching Licentiate programs of the Escola de Enfermagem de Ribeirão Preto-USP. Twenty-one students from these programs were interviewed in November 2010. Thematic analysis yielded the following themes: 1 - information-seeking needs and practices: difficulties experienced by nursing students; 2 - the organization of the teaching of bibliographic searching and the role of the teacher; and 3 - the librarian as educator. Google stands out as the main Web search resource favored by the students...
Knowledge discovery in databases is a process that aims at the discovery of associations
within data sets. The analysis of geo-referenced data demands a particular approach
in this process. This chapter presents a new approach to the process of knowledge
discovery, in which qualitative geographic identifiers give the positional aspects of
geographic data. Those identifiers are manipulated using qualitative reasoning
principles, which allows for the inference of new spatial relations required for the data
mining step of the knowledge discovery process. The efficacy and usefulness of the
implemented system — PADRÃO — has been tested with a bank dataset. The results
obtained support that traditional knowledge discovery systems, developed for
relational databases and not having semantic knowledge linked to spatial data, can
be used in the process of knowledge discovery in geo-referenced databases, since some
of this semantic knowledge and the principles of qualitative spatial reasoning are
available as spatial domain knowledge.
NoSQL databases were initially devised to support a few concrete extreme scale applications. Since the specificity and scale of the target systems justified the investment of manually crafting application code, their limited query and indexing capabilities were not a major impediment. However, with a considerable number of mature alternatives now available, there is an increasing willingness to use NoSQL databases in a wider and more diverse spectrum of applications and, to most of them, hand-crafted query code is not an enticing trade-off. In this paper we address this shortcoming of current NoSQL databases with an effective approach for executing SQL queries while preserving their scalability and schema flexibility. We show how a full-fledged SQL engine can be integrated atop HBase, leading to an ANSI SQL compliant database. Under a standard TPC-C workload our prototype scales linearly with the number of nodes in the system and outperforms a NoSQL TPC-C implementation optimized for HBase.
Several studies show that biological knowledge is growing at a continuous rate and is distributed among different databases. This makes data integration a hard task, because the databases have different structures, different ways of storing data, and different approaches to exporting information, and are usually developed to provide information for a specific organism. Due to the large amount of biological data, data integration has been one of the major challenges in the field of bioinformatics, as has discovering information about Transcriptional Regulatory Networks (TRN). With a single source, this task is not easy to perform, since the source often lacks enough information to complete it successfully. It is therefore necessary to find information in several databases in order to create a useful body of knowledge. This work presents a new approach to integrating data related to TRNs for Escherichia coli by creating a new integrated data repository that gathers information from the KEGG, EcoCyc, Regulon and NCBI databases.; CNPq; BIOSYSTEMS
Expressed sequence tags (ESTs) are randomly sequenced cDNA clones.
Currently, nearly 3 million human and 2 million mouse ESTs provide
valuable resources that enable researchers to investigate the products
of gene expression. The EST databases have proven to be useful tools
for detecting homologous genes, for exon mapping, revealing differential splicing,
etc. With the increasing availability of large amounts of poorly
characterised eukaryotic (notably human) genomic sequence, ESTs
have now become a vital tool for gene identification, sometimes yielding
the only unambiguous evidence for the existence of a gene expression
product. However, BLAST-based Web servers available to the general user
have not kept pace with these developments and do not provide appropriate
tools for querying EST databases with large highly spliced genes,
often spanning 50 000–100 000 bases or
more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/),
a server that brings together a set of tools enabling efficient
retrieval of ESTs matching large DNA queries and their subsequent
analysis. RepeatMasker is used to mask dispersed repetitive sequences
(such as Alu elements) in the query, BLAST2 for searching EST databases
and Artemis for graphical display of the findings. Gene2EST combines
these components into a Web resource targeted at the researcher
who wishes to study one or a few genes to a high level of detail.
A non-redundant database of nuclear, protein-encoding, genomic DNA sequences highlighting nuclear pre-mRNA introns was constructed using information contained in the SWISS-PROT and GenBank sequence databases. This Intron DataBase (IDB) contains information about (i) introns (including nucleotide sequence, location, phase, length, GC content and consensus-sequence rule violations), (ii) exons (including nucleotide sequence, length and GC content), (iii) protein coding regions (including amino acid sequence and length), and (iv) descriptive information about the source gene and organism (including gene designations and species taxonomy). The Intron Evolution DataBase (IEDB) provides a statistical analysis of the exon and intron sequences catalogued in IDB as well as data concerning intron penetration (relative number of coding regions with introns), density (number of introns per kb of total coding sequence DNA), distribution, and consensus sequences for each species present in IDB. This supplement is provided to furnish insights into the phylogenetic distribution and evolution of introns. Both databases are extensively cross-referenced to the SWISS-PROT and GenBank databases. IDB currently contains information on over 63 000 genes and 154 000 introns; IEDB summarizes information on over 2800 species. IDB and IEDB will be updated twice a year and are available via the internet (http://nutmeg.bio.indiana.edu/intron/index.html).
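The two summary statistics defined in the abstract above, intron penetration and intron density, can be illustrated with a minimal Python sketch. The `CodingRegion` record and both function names are hypothetical and are not taken from IDB or IEDB. Reading "relative number" as a simple fraction is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class CodingRegion:
    """Hypothetical record for one coding region (not the IDB schema)."""
    cds_length_bp: int  # total coding sequence length in base pairs
    intron_count: int   # number of introns interrupting this coding region


def intron_penetration(regions):
    """Fraction of coding regions that contain at least one intron.

    Assumption: 'relative number of coding regions with introns' is read
    as (regions with introns) / (all regions).
    """
    if not regions:
        raise ValueError("at least one coding region is required")
    with_introns = sum(1 for r in regions if r.intron_count > 0)
    return with_introns / len(regions)


def intron_density(regions):
    """Number of introns per kb of total coding sequence."""
    total_bp = sum(r.cds_length_bp for r in regions)
    if total_bp == 0:
        raise ValueError("total coding sequence length must be positive")
    total_introns = sum(r.intron_count for r in regions)
    return total_introns / (total_bp / 1000)


# Made-up numbers for illustration only.
regions = [CodingRegion(1500, 3), CodingRegion(500, 0), CodingRegion(2000, 1)]
print(intron_penetration(regions))  # 2 of 3 regions contain introns
print(intron_density(regions))      # 4 introns over 4 kb of coding sequence
```

Both values would be computed per species, as the abstract describes, by grouping the records by organism first.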
The American Nurses Association (ANA) Cabinet on Nursing Practice mandated the formation of the Steering Committee on Databases to Support Clinical Nursing Practice. The Committee has established the process and the criteria by which to review and recommend nursing classification schemes based on the ANA Nursing Process Standards and elements contained in the Nursing Minimum Data Set (NMDS) for inclusion of nursing data elements in national databases. Four classification schemes have been recognized by the Committee for use in national databases. These classification schemes have been forwarded to the National Library of Medicine (NLM) for inclusion in the Unified Medical Language System (UMLS) and to the International Council of Nurses for the development of a proposed International Classification of Nursing Practice.
The Ribosomal RNA Mutation Databases (16SMDB and 23SMDB) provide lists of mutated positions in 16S and 23S ribosomal RNA from Escherichia coli and the identity of each alteration. Information provided for each mutation includes: (i) a brief description of the phenotype(s) associated with each mutation; (ii) whether a mutant phenotype has been detected by in vivo or in vitro methods; and (iii) relevant literature citations. The databases are available via ftp and on the World Wide Web. Expansion of the databases to include information about mutations isolated in organisms other than E.coli is currently in progress.
Expanded versions of the Ribosomal RNA Mutation Databases provide lists of mutated positions in 16S and 16S-like ribosomal RNA (16SMDBexp) and 23S and 23S-like ribosomal RNA (23SMDBexp) and the identity of each alteration. Alterations from organisms other than Escherichia coli are reported at positions according to the E.coli numbering system. Information provided for each mutation includes: (i) a brief description of the phenotype(s) associated with each mutation, (ii) whether a mutant phenotype has been detected by in vivo or in vitro methods, and (iii) relevant literature citations. The databases are available via ftp and on the World Wide Web at the following URL: http://www.fandm.edu/Departments/Biology/Databases/RNA.html
Tchieu, Jason H.; Fana, Fariba; Fink, J. Lynn; Harper, Jeffrey; Nair, T. Murlidharan; Niedner, R. Hannes; Smith, Douglas W.; Steube, Kenneth; Tam, Tobey M.; Veretnik, Stella; Wang, Degeng; Gribskov, Michael
Source: Oxford University Press; Publisher: Oxford University Press
PlantsP and PlantsT allow users to quickly gain a global understanding of plant phosphoproteins and plant membrane transporters, respectively, from evolutionary relationships to biochemical function as well as a deep understanding of the molecular biology of individual genes and their products. As one database with two functionally different web interfaces, PlantsP and PlantsT are curated plant-specific databases that combine sequence-derived information with experimental functional-genomics data. PlantsP focuses on proteins involved in the phosphorylation process (i.e., kinases and phosphatases), whereas PlantsT focuses on membrane transport proteins. Experimentally, PlantsP provides a resource for information on a collection of T-DNA insertion mutants (knockouts) in each kinase and phosphatase, primarily in Arabidopsis thaliana, and PlantsT uniquely combines experimental data regarding mineral composition (derived from inductively coupled plasma atomic emission spectroscopy) of mutant and wild-type strains. Both databases provide extensive information on motifs and domains, detailed information contributed by individual experts in their respective fields, and descriptive information drawn directly from the literature. The databases incorporate a unique user annotation and review feature aimed at acquiring expert annotation directly from the plant biology community. PlantsP is available at http://plantsp.sdsc.edu and PlantsT is available at http://plantst.sdsc.edu.
SRS (Sequence Retrieval System) is a widely used keyword search engine for querying biological databases. BLAST2 is the most widely used tool to query databases by sequence similarity search. These tools allow users to retrieve sequences by shared keyword or by shared similarity, with many public web servers available. However, with the increasingly large datasets available it is now quite common that a user is interested in some subset of homologous sequences but has no efficient way to restrict retrieval to that set. By allowing the user to control SRS from the BLAST output, BLAST2SRS (http://blast2srs.embl.de/) aims to meet this need. This server therefore combines the two ways to search sequence databases: similarity and keyword.
The cis-acting elements that promote efficient ribosomal frameshifting in the −1 (5′) direction have been well characterized in several viral systems. Results from many studies have convincingly demonstrated that the basic molecular mechanisms governing programmed −1 ribosomal frameshifting are almost identical from yeast to humans. We are interested in testing the hypothesis that programmed −1 ribosomal frameshifting can be used to control cellular gene expression. Toward this end, a computer program was designed to search large DNA databases for consensus −1 ribosomal frameshift signals. The results demonstrated that consensus programmed −1 ribosomal frameshift signals can be identified in a substantial number of chromosomally encoded mRNAs and that they occur with frequencies from two- to sixfold greater than random in all of the databases searched. A preliminary survey of the databases resulting from the computer searches found that consensus frameshift signals are present in at least 21 homologous genes from different species, 2 of which are nearly identical, suggesting evolutionary conservation of function. We show that four previously described missense alleles of genes that are linked to human diseases would disrupt putative programmed −1 ribosomal frameshift signals...
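The abstract above does not describe the search program itself, so the following is only a rough Python sketch of how one part of such a scan might look. It assumes the commonly cited heptameric "slippery site" pattern X XXY YYZ (X any nucleotide repeated three times, Y either A or U repeated three times, Z not G) and ignores the downstream spacer and secondary structure that a complete frameshift signal also involves. The function name and the `cds_start` parameter are hypothetical.

```python
import re

# Slippery heptamer X XXY YYZ written in DNA letters (U -> T).
# A lookahead is used so that overlapping candidates are all reported.
_SLIPPERY = re.compile(r"(?=(([ACGT])\2\2(?:AAA|TTT)[ACT]))")


def find_slippery_sites(seq, cds_start=None):
    """Return (position, heptamer) pairs for candidate slippery sites.

    If cds_start (0-based index of the first base of the coding frame) is
    given, only heptamers whose first base is the third base of a codon are
    kept, matching the X XXY YYZ codon spacing in the zero frame.
    """
    dna = seq.upper().replace("U", "T")
    hits = []
    for match in _SLIPPERY.finditer(dna):
        pos = match.start()
        if cds_start is not None and (pos - cds_start) % 3 != 2:
            continue  # heptamer is not in the X XXY YYZ register
        hits.append((pos, match.group(1)))
    return hits


# Toy sequence for illustration: codons ATG GGT TTA CC
print(find_slippery_sites("ATGGGTTTACC", cds_start=0))
```

Comparing hit counts against shuffled sequences would be one way to estimate the "greater than random" frequencies the abstract reports, though the authors' actual statistical procedure is not given here.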
I present a common philosophy for implementing the EMBL and GENBANK (BBN-Los Alamos) nucleic acid sequence databases, as well as the National Biological Foundation (Dayhoff) protein sequence database. The associated FORTRAN 77 fully transportable software package includes: 1) modules for implementing each of these databases from the initial magnetic tape file, 2) modules performing a fast mnemonic access, 3) modules performing key-string access and allowing the definition of user-specific database subsets, 4) a common probe searching module allowing the stacking of multiple combined search requests over the databases. This software is particularly suitable for 32-bit mini/microcomputers but would eventually run on 16-bit computers.
Modern bibliographic databases provide the basis for scientific research and
its evaluation. While their content and structure differ substantially, there
exist only informal notions on their reliability. Here we compare the
topological consistency of citation networks extracted from six popular
bibliographic databases including Web of Science, CiteSeer and arXiv.org. The
networks are assessed through a rich set of local and global graph statistics.
We first reveal statistically significant inconsistencies between some of the
databases with respect to individual statistics. For example, the introduced
field bow-tie decomposition of DBLP Computer Science Bibliography substantially
differs from the rest due to the coverage of the database, while the citation
information within arXiv.org is the most exhaustive. Finally, we compare the
databases over multiple graph statistics using the critical difference diagram.
The citation topology of DBLP Computer Science Bibliography is the least
consistent with the rest, while, not surprisingly, Web of Science is
significantly more reliable from the perspective of consistency. This work can
serve either as a reference for scholars in bibliometrics and scientometrics or
a scientific evaluation guideline for governments and research agencies.; Comment: 16 pages...
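For readers unfamiliar with bow-tie decompositions, here is a minimal Python sketch of the classic version: a core made of the largest strongly connected component, an IN set that can reach the core, an OUT set reachable from it, and everything else. This is not the "field bow-tie" variant the abstract introduces, whose definition is not given here. The function names and the toy edge list are hypothetical, and the quadratic component search is chosen for brevity rather than speed.

```python
from collections import defaultdict


def _reachable(adj, sources):
    """Return all nodes reachable from `sources` (inclusive) via `adj`."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def bow_tie(edges):
    """Classic bow-tie decomposition of a directed graph.

    `edges` is an iterable of (citing, cited) pairs. Returns a dict with
    the node sets 'core', 'in', 'out' and 'other'.
    """
    fwd, rev = defaultdict(set), defaultdict(set)
    nodes = set()
    for u, v in edges:
        fwd[u].add(v)
        rev[v].add(u)
        nodes.update((u, v))

    # The strongly connected component of u is the set of nodes that are
    # both reachable from u and able to reach u.
    core, assigned = set(), set()
    for u in sorted(nodes):
        if u in assigned:
            continue
        scc = _reachable(fwd, [u]) & _reachable(rev, [u])
        assigned |= scc
        if len(scc) > len(core):
            core = scc

    downstream = _reachable(fwd, core)  # core plus everything it reaches
    upstream = _reachable(rev, core)    # core plus everything reaching it
    return {
        "core": core,
        "in": upstream - core,
        "out": downstream - core,
        "other": nodes - upstream - downstream,
    }


# Toy citation graph for illustration only.
parts = bow_tie([("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("e", "f")])
print(parts)
```

Citation networks are close to acyclic, so the strongly connected core of the classic decomposition tends to be small; that may be one reason a modified decomposition is used in the work summarized above.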
The conventional clustering algorithms mine static databases and generate a
set of patterns in the form of clusters. Many real life databases keep growing
incrementally. For such dynamic databases, the patterns extracted from the
original database become obsolete. Thus the conventional clustering algorithms
are not suitable for incremental databases due to lack of capability to modify
the clustering results in accordance with recent updates. In this paper, the
author proposes a new incremental clustering algorithm called CFICA (Cluster
Feature-Based Incremental Clustering Approach for numerical data) to handle
numerical data and suggests a new proximity metric called Inverse Proximity
Estimate (IPE), which considers the proximity of a data point to a cluster
representative as well as its proximity to the farthest point in its vicinity.
CFICA makes use of the proposed proximity metric to determine the membership of
a data point into a cluster.; Comment: 19 pages
We present an introduction to RNA databases. The history and technology
behind RNA databases are briefly discussed. We examine differing methods of data
collection and curation, and discuss their impact on both the scope and
accuracy of the resulting databases. Finally, we demonstrate these principles
through detailed examination of four leading RNA databases: Noncode, miRBase,
Rfam, and SILVA.; Comment: 27 pages, 10 figures, 1 tables. Submitted as a chapter for "An
introduction to RNA bioinformatics" to be published by "Methods in Molecular
This paper presents various emotion classification and recognition systems that implement methods aimed at improving Human Machine Interaction. The modalities and approaches used for affect detection vary and contribute to accuracy and efficacy in detecting human emotions. This paper examines them in a comparative and descriptive manner. Various applications that use these methodologies in different contexts to address real-time challenges are discussed. This survey also describes the databases that can be used as standard data sets in the process of emotion identification. Thus, an integrated discussion of methods, databases, and applications pertaining to the emerging field of Affective Computing (AC) is presented.
The declining cost of computer hardware and the
increasing data processing needs of geographically
dispersed organizations have led to substantial interest
in distributed data management. These characteristics
have led to a reconsideration of the design of centralized databases.
Distributed databases have appeared as a result
of those considerations.
A number of advantages result from having duplicate
copies of data in a distributed database. Some of these
advantages are: increased data accessibility, more
responsive data access, higher reliability, and load
These and other benefits must be balanced against
the additional cost and complexity introduced in doing
so. This thesis considers the problem of concurrency
control of multiple copy databases. Several synchronization
techniques are mentioned and a few algorithms
for concurrency control are evaluated and compared.