Databases of genomic variation and phenotypes: existing resources and future needs

Johnston, Jennifer J.; Biesecker, Leslie G.
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific Journal Article
EN
Search Relevance
26.48%
Massively parallel sequencing (MPS) has become an important tool for identifying medically significant variants in both research and the clinic. Accurate variation and genotype–phenotype databases are critical to our ability to make sense of the vast amount of information that MPS generates. The purpose of this review is to summarize the state of the art of variation and genotype–phenotype databases, how they can be used, and opportunities to improve these resources. Our working assumption is that the objective of the clinical genomicist is to identify highly penetrant variants that could explain existing disease or predict disease risk for individual patients or research participants. We have detailed how current databases contribute to this goal by providing frequency data, literature reviews and predictions of causation for individual variants. For variant annotation, databases vary greatly in their ease of use, their use of standard mutation nomenclature, the comprehensiveness of their variant cataloging and the degree of expert opinion they include. Ultimately, we need a dynamic and comprehensive reference database of medically important variants that is easily cross-referenced to exome and genome sequence data and allows for an accumulation of expert opinion.

The World Bacterial Biogeography and Biodiversity through Databases: A Case Study of NCBI Nucleotide Database and GBIF Database

Selama, Okba; James, Phillip; Nateche, Farida; Wellington, Elizabeth M. H.; Hacène, Hocine
Source: Hindawi Publishing Corporation Publisher: Hindawi Publishing Corporation
Type: Scientific Journal Article
EN
Search Relevance
26.48%
Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area has to each database. The basis for data analysis of this study was the metadata provided by both databases, mainly, the taxonomy and the geographical area origin of isolation of the microorganism (record). These were directly obtained from GBIF through the online interface, while E-utilities and Python were used in combination with a programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.
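The abstract above mentions combining E-utilities with Python to query the NCBI Nucleotide Database. As a rough illustration only (not the authors' script), the sketch below builds an ESearch request URL. The function name `build_esearch_url`, the example search term and the choice of JSON output are assumptions made for this example; `nuccore` is the E-utilities database name for Nucleotide.

```python
from urllib.parse import urlencode

# Public base address of the NCBI E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for the Nucleotide ("nuccore") database.

    Hypothetical helper for illustration; it only builds the URL and
    performs no network access.
    """
    params = {
        "db": "nuccore",      # Nucleotide database
        "term": term,         # Entrez query string
        "retmax": retmax,     # maximum number of record IDs to return
        "retmode": "json",    # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"
```

The resulting URL could then be fetched with any HTTP client, for example `urllib.request.urlopen(build_esearch_url("Proteobacteria[Organism]"))`, subject to NCBI's usage guidelines.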

Databases and Software for NMR-Based Metabolomics

Ellinger, James J.; Chylla, Roger A.; Ulrich, Eldon L.; Markley, John L.
Source: PubMed Publisher: PubMed
Type: Scientific Journal Article
EN
Search Relevance
26.48%
New software and increasingly sophisticated NMR metabolite spectral databases are advancing the unique abilities of NMR spectroscopy to identify and quantify small molecules in solution for studies of metabolite biomarkers and metabolic flux. Public and commercial databases now contain experimental 1D 1H, 13C and 2D 1H-13C spectra and extracted spectral parameters for over a thousand compounds and theoretical data for thousands more. Public databases containing experimental NMR data from complex metabolic studies are emerging. These databases are providing information vital for the construction and testing of new computational algorithms for NMR-based chemometric and quantitative metabolomics studies. In this review we focus on database and software tools that support a quantitative NMR approach to the analysis of 1D and 2D NMR spectra of complex biological mixtures.

Rule-based deduplication of article records from bibliographic databases

Jiang, Yu; Lin, Can; Meng, Weiyi; Yu, Clement; Cohen, Aaron M.; Smalheiser, Neil R.
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific Journal Article
Published on 16/01/2014 EN
Search Relevance
26.48%
We recently designed and deployed a metasearch engine, Metta, that sends queries and retrieves search results from five leading biomedical databases: PubMed, EMBASE, CINAHL, PsycINFO and the Cochrane Central Register of Controlled Trials. Because many articles are indexed in more than one of these databases, it is desirable to deduplicate the retrieved article records. This is not a trivial problem because data fields contain a lot of missing and erroneous entries, and because certain types of information are recorded differently (and inconsistently) in the different databases. The present report describes our rule-based method for deduplicating article records across databases and includes an open-source script module that can be deployed freely. Metta was designed to satisfy the particular needs of people who are writing systematic reviews in evidence-based medicine. These users want the highest possible recall in retrieval, so it is important to err on the side of not deduplicating any records that refer to distinct articles, and it is important to perform deduplication online in real time. Our deduplication module is designed with these constraints in mind. Articles that share the same publication year are compared sequentially on parameters including PubMed ID number...
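The abstract above is truncated before the full rule set is given, so the following is only a minimal sketch of the general idea, not Metta's actual module. It assumes hypothetical record dictionaries with `year`, `pmid` and `title` keys, compares only records that share a publication year, and treats two records as duplicates when their PubMed IDs match or, if an ID is missing, when their normalized titles match. It errs on the side of keeping records, in line with the recall requirement described above.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def is_duplicate(a, b):
    """Illustrative rules only: PMID decides when both records have one."""
    if a.get("pmid") and b.get("pmid"):
        return a["pmid"] == b["pmid"]
    title_a = normalize_title(a.get("title") or "")
    title_b = normalize_title(b.get("title") or "")
    # An empty title is never treated as evidence of a match.
    return bool(title_a) and title_a == title_b


def deduplicate(records):
    """Keep the first record of each duplicate group, year by year."""
    seen_by_year = defaultdict(list)
    kept = []
    for record in records:
        bucket = seen_by_year[record.get("year")]
        if not any(is_duplicate(record, earlier) for earlier in bucket):
            bucket.append(record)
            kept.append(record)
    return kept
```

Grouping by year first keeps the number of pairwise comparisons small, which matters when deduplication has to run online.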

P-MITE: a database for plant miniature inverted-repeat transposable elements

Chen, Jiongjiong; Hu, Qun; Zhang, Yu; Lu, Chen; Kuang, Hanhui
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific Journal Article
EN
Search Relevance
26.48%
Miniature inverted-repeat transposable elements (MITEs) are prevalent in eukaryotic species including plants. MITE families vary dramatically and usually cannot be identified based on homology. In this study, we de novo identified MITEs from 41 plant species, using the computer programs MITE Digger, MITE-Hunter and/or Repetitive Sequence with Precise Boundaries (RSPB). MITEs were found in all but one species (Cyanidioschyzon merolae). Combined with the MITEs identified previously from the rice genome, >2.3 million sequences from 3527 MITE families were obtained from 41 plant species. In general, higher plants contain more MITEs than lower plants, with a few exceptions such as papaya, with only 538 elements. The largest number of MITEs is found in apple, with 237 302 MITE sequences. The number of MITE sequences in a genome is significantly correlated with genome size. A series of databases (plant MITE databases, P-MITE), available online at http://pmite.hzau.edu.cn/django/mite/, was constructed to host all MITE sequences from the 41 plant genomes. The databases are available for sequence similarity searches (BLASTN), and MITE sequences can be downloaded by family or by genome. The databases can be used to study the origin and amplification of MITEs...

Filling the gap in functional trait databases: use of ecological hypotheses to replace missing data

Taugourdeau, Simon; Villerd, Jean; Plantureux, Sylvain; Huguenin-Elie, Olivier; Amiaud, Bernard
Source: John Wiley & Sons Ltd. Publisher: John Wiley & Sons Ltd.
Type: Scientific Journal Article
EN
Search Relevance
26.48%
Functional trait databases are powerful tools in ecology, though most of them contain large amounts of missing values. The goal of this study was to test the effect of imputation methods on the evaluation of trait values at species level and on the subsequent calculation of functional diversity indices at community level using functional trait databases. Two simple imputation methods (average and median), two methods based on ecological hypotheses, and one multiple imputation method were tested using a large plant trait database, together with the influence of the percentage of missing data and differences between functional traits. At community level, the complete-case approach and three functional diversity indices calculated from grassland plant communities were included. At the species level, one of the methods based on ecological hypotheses was more accurate for all traits than imputation with average or median values, but the multiple imputation method was superior for most of the traits. The method based on functional proximity between species was the best method for traits with an unbalanced distribution, while the method based on the existence of relationships between traits was the best for traits with a balanced distribution. The ranking of the grassland communities for their functional diversity indices was not robust with the complete-case approach...
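To make the two simple baselines named above concrete, here is a minimal sketch of average and median imputation for a single trait. The function name `impute` and the use of `None` to mark a missing value are assumptions for this example; the ecological-hypothesis and multiple-imputation methods from the study are not reproduced here.

```python
from statistics import mean, median


def impute(values, method="mean"):
    """Fill missing trait values (None) with the mean or median of the rest."""
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a trait with no observed values")
    fill = mean(observed) if method == "mean" else median(observed)
    return [fill if v is None else v for v in values]
```

With a skewed trait such as `[1.0, None, 3.0, 8.0]`, the two baselines already disagree: the mean fills in 4.0 while the median fills in 3.0.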

Accurate Assignment of Significance to Neuropeptide Identifications Using Monte Carlo K-Permuted Decoy Databases

Akhtar, Malik N.; Southey, Bruce R.; Andrén, Per E.; Sweedler, Jonathan V.; Rodriguez-Zas, Sandra L.
Source: Public Library of Science Publisher: Public Library of Science
Type: Scientific Journal Article
Published on 17/10/2014 EN
Search Relevance
26.48%
In support of accurate neuropeptide identification in mass spectrometry experiments, novel Monte Carlo permutation testing was used to compute significance values. Testing was based on k-permuted decoy databases, where k denotes the number of permutations. These databases were integrated with a range of peptide identification indicators from three popular open-source database search software tools (OMSSA, Crux, and X! Tandem) to assess the statistical significance of neuropeptide spectra matches. Significance p-values were computed as the fraction of the sequences in the database with match indicator value better than or equal to the true target spectra. When applied to a test-bed of all known manually annotated mouse neuropeptides, permutation tests with k-permuted decoy databases identified up to 100% of the neuropeptides at p-value < 10^-5. The permutation test p-values using hyperscore (X! Tandem), E-value (OMSSA) and Sp score (Crux) match indicators outperformed all other match indicators. The robust performance to detect peptides of the intuitive indicator “number of matched ions between the experimental and theoretical spectra” highlights the importance of considering this indicator when the p-value was borderline significant. Our findings suggest permutation decoy databases of size 1×10^5 are adequate to accurately detect neuropeptides and this can be exploited to increase the speed of the search. The straightforward Monte Carlo permutation testing (comparable to a zero order Markov model) can be easily combined with existing peptide identification software to enable accurate and effective neuropeptide detection. The source code is available at http://stagbeetle.animal.uiuc.edu/pepshop/MSMSpermutationtesting.
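The p-value definition above (the fraction of database sequences whose match indicator is at least as good as the target's) can be sketched as follows. This is an illustrative sketch, not the published source code: the function names are hypothetical, decoys are made by shuffling residues, and a higher score is assumed to be better (for an indicator such as an E-value, where lower is better, the comparison would be reversed).

```python
import random


def permuted_decoys(sequence, k, seed=0):
    """Return k decoy sequences, each a random permutation of the residues."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    residues = list(sequence)
    decoys = []
    for _ in range(k):
        rng.shuffle(residues)
        decoys.append("".join(residues))
    return decoys


def permutation_p_value(target_score, decoy_scores):
    """Fraction of decoy scores that are better than or equal to the target.

    Assumes a higher score means a better match.
    """
    if not decoy_scores:
        raise ValueError("at least one decoy score is required")
    hits = sum(1 for score in decoy_scores if score >= target_score)
    return hits / len(decoy_scores)
```

Because the smallest non-zero p-value this estimator can return is 1/k, resolving significance at 10^-5 requires on the order of 10^5 decoys, which is consistent with the database size quoted in the abstract.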

Biological Databases for Behavioral Neurobiology

Baker, Erich J.
Source: PubMed Publisher: PubMed
Type: Scientific Journal Article
Published in 2012 EN
Search Relevance
26.48%
Databases are, at their core, abstractions of data and their intentionally derived relationships. They serve as a central organizing metaphor and repository, supporting or augmenting nearly all bioinformatics. Behavioral domains provide a unique stage for contemporary databases, as research in this area spans diverse data types, locations, and data relationships. This chapter provides foundational information on the diversity and prevalence of databases, how data structures support the various needs of behavioral neuroscience analysis and interpretation. The focus is on the classes of databases, data curation, and advanced applications in bioinformatics using examples largely drawn from research efforts in behavioral neuroscience.

Comparison of human cell signaling pathway databases—evolution, drawbacks and challenges

Chowdhury, Saikat; Sarkar, Ram Rup
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific Journal Article
Published on 28/01/2015 EN
Search Relevance
26.48%
Elucidating the complexities of cell signaling pathways is of immense importance for understanding various biological phenomena, such as the dynamics of gene/protein expression regulation, cell fate determination, embryogenesis and disease progression. The successful completion of the Human Genome Project has also helped experimental and theoretical biologists to analyze various important pathways. To advance this study, during the past two decades, systematic collections of pathway data from experimental studies have been compiled and distributed freely by several databases, which also integrate various computational tools for further analysis. Despite significant advancements, there exist several drawbacks and challenges, such as pathway data heterogeneity, annotation, regular update and automated image reconstructions, which motivated us to perform a thorough review of 24 popular and actively functioning cell signaling databases. Based on two major characteristics, pathway information and technical details, freely accessible data from commercial and academic databases are examined to understand their evolution and enrichment. This review not only helps to identify some novel and useful features that are not yet included in any of the databases, but also highlights their current limitations and proposes reasonable solutions for future database development...

Comparative Evaluation of Registration Algorithms in Different Brain Databases With Varying Difficulty: Results and Insights

Ou, Yangming; Akbari, Hamed; Bilello, Michel; Da, Xiao; Davatzikos, Christos
Source: PubMed Publisher: PubMed
Type: Scientific Journal Article
EN
Search Relevance
26.48%
Evaluating various algorithms for the inter-subject registration of brain magnetic resonance images (MRI) is a necessary topic receiving growing attention. Existing studies evaluated image registration algorithms in specific tasks or using specific databases (e.g., only for skull-stripped images, only for single-site images, etc.). Consequently, the choice of registration algorithms seems task- and usage/parameter-dependent. Nevertheless, recent large-scale, often multi-institutional imaging-related studies create the need and raise the question whether some registration algorithms can 1) generally apply to various tasks/databases posing various challenges; 2) perform consistently well, and while doing so, 3) require minimal or ideally no parameter tuning. In seeking answers to this question, we evaluated 12 general-purpose registration algorithms, for their generality, accuracy and robustness. We fixed their parameters at values suggested by algorithm developers as reported in the literature. We tested them in 7 databases/tasks, which present one or more of 4 commonly-encountered challenges: 1) inter-subject anatomical variability in skull-stripped images; 2) intensity homogeneity, noise and large structural differences in raw images; 3) imaging protocol and field-of-view (FOV) differences in multi-site data; and 4) missing correspondences in pathology-bearing images. Totally 7...

Quantifying the Consistency of Scientific Databases

Šubelj, Lovro; Bajec, Marko; Mileva Boshkoska, Biljana; Kastrin, Andrej; Levnajić, Zoran
Source: Public Library of Science Publisher: Public Library of Science
Type: Scientific Journal Article
Published on 18/05/2015 EN
Search Relevance
26.48%
Science is a social process with far-reaching impact on our modern society. In recent years, for the first time we are able to scientifically study the science itself. This is enabled by massive amounts of data on scientific publications that is increasingly becoming available. The data is contained in several databases such as Web of Science or PubMed, maintained by various public and private entities. Unfortunately, these databases are not always consistent, which considerably hinders this study. Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases. We found that identifying a single "best" database is far from easy. Nevertheless, our results indicate appreciable differences in mutual consistency of different databases, which we interpret as recipes for future bibliometric studies.

Automatic Categorization of Diverse Experimental Information in the Bioscience Literature

Fang, Ruihua; Schindelman, Gary; Auken, Kimberly Van; Fernandes, Jolene; Chen, Wen; Wang, Xiaodong; Davis, Paul; Tuli, Mary Ann; Marygold, Steven J; Millburn, Gillian; Matthews, Beverley; Zhang, Haiyan; Brown, Nick; Gelbart, William Martin; Sternberg, Pau
Source: BioMed Central Publisher: BioMed Central
Type: Scientific Journal Article
EN_US
Search Relevance
26.48%
Background: Curation of information from bioscience literature into biological knowledge databases is a crucial way of capturing experimental information in a computable form. During the biocuration process, a critical first step is to identify from all published literature the papers that contain results for a specific data type the curator is interested in annotating. This step normally requires curators to manually examine many papers to ascertain which few contain information of interest, and is thus usually time-consuming. We developed an automatic method for identifying papers containing these curation data types among a large pool of published scientific papers based on the machine learning method Support Vector Machine (SVM). This classification system is completely automatic and can be readily applied to diverse experimental data types. It has been in use in production for automatic categorization of 10 different experimental data types in the biocuration process at WormBase for the past two years and it is in the process of being adopted in the biocuration process at FlyBase and the Saccharomyces Genome Database (SGD). We anticipate that this method can be readily adopted by various databases in the biocuration community and thereby greatly reduce the time spent on an otherwise laborious and demanding task. We also developed a simple...

Scaling geospatial searches in large spatial databases

Cary, Ariel
Source: FIU Digital Commons Publisher: FIU Digital Commons
Type: Scientific Journal Article
EN
Search Relevance
26.48%
Modern geographical databases, which are at the core of geographic information systems (GIS), store a rich set of aspatial attributes in addition to geographic data. Typically, aspatial information comes in textual and numeric format. Retrieving information constrained on spatial and aspatial data from geodatabases gives GIS users the ability to perform more interesting spatial analyses, and allows applications to support composite location-aware searches; for example, in a real estate database: “Find the nearest homes for sale to my current location that have a backyard and whose prices are between $50,000 and $80,000”. Efficient processing of such queries requires combined indexing strategies for multiple types of data. Existing spatial query engines commonly apply a two-filter approach (spatial filter followed by nonspatial filter, or vice versa), which can incur large performance overheads. On the other hand, more recently, the amount of geolocation data has grown rapidly in databases due in part to advances in geolocation technologies (e.g., GPS-enabled smartphones) that allow users to associate location data to objects or events. The latter poses potential data ingestion challenges of large data volumes for practical GIS databases. In this dissertation...
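The two-filter approach described above (a spatial filter followed by a nonspatial one) can be illustrated with the real-estate query from the abstract. This is a deliberately naive sketch of the baseline the dissertation criticizes, not its proposed indexing method. The record fields (`x`, `y`, `price`, `backyard`), the planar distance and the fixed search radius are all assumptions for this example.

```python
import math


def two_filter_search(homes, origin, radius, min_price, max_price):
    """Spatial filter first, attribute filter second, nearest results first."""
    ox, oy = origin

    def distance(home):
        # Planar Euclidean distance; real GIS data would need a geodesic one.
        return math.hypot(home["x"] - ox, home["y"] - oy)

    # Filter 1 (spatial): keep homes within the search radius.
    nearby = [h for h in homes if distance(h) <= radius]
    # Filter 2 (aspatial): backyard required, price within the range.
    matches = [
        h for h in nearby
        if h["backyard"] and min_price <= h["price"] <= max_price
    ]
    return sorted(matches, key=distance)
```

The intermediate `nearby` list shows where the overhead comes from: every home inside the radius is materialized before the price and backyard conditions discard most of them, which is what a combined index is meant to avoid.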

Query translation and optimisation for complex value databases

Liu, Hong-Cheu
Source: The Australian National University Publisher: The Australian National University
Type: Thesis (PhD)
EN_AU
Search Relevance
26.48%
This thesis considers the theory of database queries on the complex value data model extended with external functions. In modern intelligent database systems, we expect that query systems be able to handle a wide range of calculus formulas correctly and efficiently. Accordingly, they will require general query translators and efficient optimisers. Motivated by these concerns, this thesis undertakes a comprehensive study of query evaluation in the complex value model and investigates the following issues: (1) identifying recursive sets of complex value formulas which define domain independent queries; (2) implementing complex value calculus queries with the incorporation of functions; (3) solving the problem of how to process join operations in complex value databases; and (4) investigating some algebraic properties concerning nested relational operators. The first part of this thesis extends some classical properties of the relational theory - particularly those related to query safety - to the context of complex value databases with fixed external functions and investigates the problem of how to implement calculus queries. Two notions of syntactic criteria for queries which guarantee domain independence, namely, embedded evaluable and embedded allowed...

Using population-based routine data for evidence-based health policy decisions: lessons from three examples of setting and evaluating national health policy in Australia, the UK and the USA

Morrato, E.; Elias, M.; Gericke, C.
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific Journal Article
Published in 2007 EN
Search Relevance
26.48%
Background: The desire for evidence-based health policy and practice is well established. Routine population-based health information systems play a fundamental role to inform policy decisions and to evaluate their effectiveness. Methods: This paper presents three case studies of using population-based data in national health policy from three countries: the USA (prescription drug safety), Australia (childhood immunization) and the UK (hospital waiting times), which were chosen to represent a diversity of health policy issues. The utilization of population-based databases and the social and political context in which the data were used are examined. Our goal was to summarize general lessons learned for policy decision-makers and other users and developers of population-based databases. Results: Key lessons presented include: the importance of political will in initiating and sustaining data collection and analysis at a national level; the types of decision-making factors databases can address; and how the data were integrated into the decision-making process. Conclusion: Population-based routine data provide an important piece of the mosaic of evidence for health policy decision makers. They can be used to assess the magnitude of the health problem...

Bioinformatics methods for the analysis of hepatitis viruses

Moriconi, F.; Beard, M.; Yuen, L.
Source: Int Medical Press Ltd Publisher: Int Medical Press Ltd
Type: Scientific Journal Article
Published in 2013 EN
Search Relevance
26.48%
HBV and HCV are the only hepatotropic viruses capable of establishing chronic infections. More than 500 million people worldwide are estimated to have chronic infections with HBV and/or HCV, and they have an increased risk of developing liver complications, such as cirrhosis or hepatocellular carcinoma. During the past decade, several antiviral agents including immune-modulatory drugs and nucleoside/nucleotide analogues have been approved for the treatment of HBV and HCV infections. In recent years, the focus has been on the development of new and better therapeutic agents for management of chronic HCV infections. Bioinformatics has only been applied recently to the field of viral hepatitis research. In addition to the wide range of general tools freely available for identification of open reading frames, gene prediction, homology searching, sequence alignment, and motif and epitope recognition, several public database systems designed specifically for HBV and HCV research have now been developed. The focus of these databases ranges from serving as viral sequence repositories to providing bioinformatics tools for viral genome analysis, as well as HBV or HCV drug resistance prediction. This review provides an overview of these public databases...

A method for finding common attributes in hetrogenous DoD databases

Zobair, Hamza A.
Source: Monterey, California. Naval Postgraduate School Publisher: Monterey, California. Naval Postgraduate School
Type: Doctoral Thesis
Search Relevance
26.48%
Approved for public release; distribution is unlimited. Traditional database development has been done for a specific, self-contained purpose with no plan to share or merge the data with other databases in the future. As these systems have matured, users have realized that a requirement exists to share their data. Finding common attributes among databases is a time-consuming task. However, it is one that is necessary as more and more corporations and agencies consolidate operations. In terms of DoD, the requirement to consolidate systems has come about because the various data systems used by DoD agencies and our allies need to communicate with each other for a well-coordinated operation. One alternative for achieving the desired interconnectivity is to specify the requirement for interoperability in new systems. A more practical, less costly process is to merge existing systems and consolidate the common components. This paper proposes a process for consolidating portions of data dictionaries of two existing databases. The proposed method uses commercial-off-the-shelf software in finding common attributes between multiple databases and represents an improvement in accuracy and time over previous methods.

Le banche dati genetiche per fini giudiziari e i diritti della persona alla ricerca di una legislazione europea armonizzata

Scaffardi, Lucia
Source: Universidade da Coruña Publisher: Universidade da Coruña
Type: Scientific Journal Article
ITA
Search Relevance
26.48%
Recent emergency legislation in several EU countries, together with continuous developments in scientific techniques and an improved use of genetic databases in both crime and terrorism prevention and trial proceedings, makes DNA database legislation one of the more delicate challenges of legislative harmonization at the European level. Indeed, the balance between the right to privacy, which finds increasingly detailed and enforced protection at all levels, from national to supranational and international arenas, and the right to security and to fair trials is hard to achieve, and it depends greatly on the cultural, historical and philosophical background that characterises each country. At present solutions differ widely in Europe: on the one side there are cases like Italy, which has no official policy on the subject, while others, such as the United Kingdom, have developed detailed policies. And among the countries that have adopted specific legislation on the creation, use and management of genetic databases, the approaches are quite different, as are the outcomes in terms of protection of privacy, security in a broad sense and the right to a fair trial. Despite the presence of several international declarations...

Are all credit default swap databases equal?

Mayordomo, Sergio; Peña Sánchez de Rivera, Juan Ignacio; Schwartz, Eduardo S.
Source: National Bureau of Economic Research Publisher: National Bureau of Economic Research
Type: info:eu-repo/semantics/submittedVersion; info:eu-repo/semantics/workingPaper Format: application/pdf
Published in 12/2010 ENG
Search Relevance
26.48%
The presence of different prices in different databases for the same securities can impair the comparability of research efforts and seriously damage the management decisions based upon such research. In this study we compare the six major sources of corporate Credit Default Swap prices: GFI, Fenics, Reuters EOD, CMA, Markit and JP Morgan, using the most liquid single name 5-year CDS of the components of the leading market indexes, iTraxx (European firms) and CDX (US firms) for the period from 2004 to 2010. We find systematic differences between the data sets, implying that deviations from the common trend among prices in the different databases are not purely random but are explained by idiosyncratic factors as well as liquidity, global risk and other trading factors. The fewer transaction prices are available, the greater the deviation among databases. Our results suggest that the CMA database quotes lead the price discovery process in comparison with the quotes provided by other databases. Several robustness tests confirm these results.