Forensic DNA databases

Corte-Real, Francisco
Source: Universidade de Coimbra Publisher: Universidade de Coimbra
Type: Scientific journal article Format: application/pdf
ENG
Search relevance
36.41%
Genetic databases have been created in several countries: the United Kingdom was the first European country to have, in 1995, a DNA database. Subsequently, the Netherlands and Austria (1997), Germany (1998), Finland and Norway (1999) and many others have introduced or are preparing databases.; http://www.sciencedirect.com/science/article/B6T6W-4DTKFSF-6/1/b1b40e97eb4be818493af08cb4b86cc1

O processo de extração de conhecimento de base de dados apoiado por agentes de software.; The process of knowledge discovery in databases supported by software agents.

Oliveira, Robson Butaca Taborelli de
Source: Biblioteca Digitais de Teses e Dissertações da USP Publisher: Biblioteca Digitais de Teses e Dissertações da USP
Type: Master's dissertation Format: application/pdf
Published on 01/12/2000 PT
Search relevance
36.41%
Commercial and scientific application systems generate ever larger amounts of data that cannot be easily analyzed without appropriate tools and techniques. Many of these applications are also Internet-based, meaning their data are distributed, which makes tasks such as data collection even more difficult. The field of Knowledge Discovery in Databases (KDD) concerns the techniques and tools used to automatically discover knowledge embedded in data. In a computer network environment, some steps of the KDD process, such as data collection and processing, are harder to carry out. New technologies can therefore be used to help execute the knowledge discovery process. Software agents are computer programs with properties such as autonomy, reactivity and mobility that can be used for this purpose. Accordingly, the aim of this work is to present a proposal for a multi-agent system, called Minador, to assist in executing and managing the Knowledge Discovery in Databases process.

Bancos de dados geográficos: uma análise das arquiteturas dual (Spring) e integrada (Oracle Spatial). ; Spatial databases: an analyse of the architectures dual (Spring) e integrated (Oracle Spatial).

Silva, Rosângela
Source: Biblioteca Digitais de Teses e Dissertações da USP Publisher: Biblioteca Digitais de Teses e Dissertações da USP
Type: Master's dissertation Format: application/pdf
Published on 29/08/2002 PT
Search relevance
36.41%
The particular characteristics of geographic data are the reason new data types must be structured and new forms of data storage and access designed. This work presents an analysis of the Dual and Integrated Architectures with respect to how spatial information is managed and retrieved together with non-spatial information. It covers the fundamental concepts of Geographic Database Management Systems. To demonstrate how these concepts matter and directly influence the efficiency of such systems, the work concludes with a set of functionality tests on two tools with distinct architectures: SPRING, with a Dual Architecture, and ORACLE SPATIAL, with an Integrated Architecture. The functionality tests aimed to verify whether and how the tools under study support certain types of spatial queries. To that end, an Urban Planning scenario was chosen, along with several types of queries involving spatial components that are commonly implemented in this kind of application. The results allow the conclusion, chiefly, that the tools analyzed supported the spatial queries used in the tests, some of which involve the spatial object and the attribute at the same time, however...

Modelagem de processo de extração de conhecimento em banco de dados para sistemas de suporte à decisão.; Modeling of knowledge discovery in databases for decision systems.

Shiba, Sonia Kaoru
Source: Biblioteca Digitais de Teses e Dissertações da USP Publisher: Biblioteca Digitais de Teses e Dissertações da USP
Type: Master's dissertation Format: application/pdf
Published on 26/06/2008 PT
Search relevance
36.41%
This work presents the modeling of a knowledge discovery process in which the information for data analysis originates from transactional databases and a data warehouse. Data mining focused on generating descriptive models from classification techniques based on Bayes' theorem and on the direct method of classification rule extraction, defining a methodology for generating learning models. A knowledge discovery process was implemented to generate learning models for decision support, applying data mining techniques for descriptive models and classification rule generation. The possibility was explored of turning the learning models into knowledge bases using a relational database, accessible through an expert system, in order to classify new records, or else to allow the results to be viewed in spreadsheets. In the scenario described in this work, the organization of the preprocessing procedures allowed the extraction of additional attributes or the transformation of data to be carried out iteratively...

Análise visual de dados relacionais: uma abordagem interativa suportada por teoria dos grafos; Visual analysis of relational databases: an interactive approach supported by graph theory

Lima, Daniel Mário de
Source: Biblioteca Digitais de Teses e Dissertações da USP Publisher: Biblioteca Digitais de Teses e Dissertações da USP
Type: Master's dissertation Format: application/pdf
Published on 18/12/2013 PT
Search relevance
36.41%
Relational databases are rigidly structured data sources characterized by complex relationships among a set of relations (tables). Making sense of such relationships is a challenging problem because users must consider multiple relations, understand integrity constraints, interpret many attributes, and write SQL queries for each exploration attempt. In this scenario, a two-step methodology is introduced: first, a graph organized as a hierarchical structure is used to model the database relationships; then, a new visualization technique for relational exploration is proposed. The results show that the proposal makes database exploration significantly simpler, since the user can navigate the data visually with little or no knowledge of the underlying structure. In addition, visual data navigation removes the need for SQL queries and all the complexity they entail. This approach is believed to offer an innovative paradigm for understanding relational data.

Descoberta de equivalência semântica entre atributos em bancos de dados utilizando redes neurais; Discovering semantic equivalences on attributes in databases using neural networks

Lima Junior, José
Source: Universidade Federal do Rio Grande do Sul Publisher: Universidade Federal do Rio Grande do Sul
Type: Dissertation Format: application/pdf
POR
Search relevance
36.41%
As more companies adopt database technologies, database administrators constantly create new schemas, and in most cases there is no standardization or formal procedure to ensure this task is performed uniformly. The result is incompatible databases, which makes data exchange between them difficult. When Database Systems (DBS) are designed and implemented independently, incompatibilities between the data of different DBS are to be expected. The main conflicts found in DBS schemas include problems related to attribute names, storage in different units of measurement, different levels of detail, different attributes with the same name or identical attributes with different names, different data types, size, precision, etc. These problems compromise information quality and raise data maintenance costs. They are consequences of attributes being specified redundantly. These facts have sparked great interest in discovering knowledge in databases in order to identify semantically equivalent information stored in schemas. The process capable of discovering this knowledge in databases is called DCDB (Descoberta de Conhecimento em Bancos de Dados, i.e. Knowledge Discovery in Databases). The tools available for carrying out DCDB tasks are generic and derived from other areas of knowledge...

Molecular modeling databases: A new way in the search of protein targets for drug development

da Silveira, Nelson José Freitas; Bonalumi, Carlos Eduardo; Arcuri, Helen Andrade; de Azevedo Jr., Walter Filgueira
Source: Universidade Estadual Paulista Publisher: Universidade Estadual Paulista
Type: Scientific journal article Format: 1-10
ENG
Search relevance
36.41%
DBMODELING is a relational database of annotated comparative protein structure models and their metabolic pathway characterization. It is focused on enzymes identified in the genomes of Mycobacterium tuberculosis and Xylella fastidiosa. The main goal of the present database is to provide structural models to be used in docking simulations and drug design. However, since the accuracy of structural models is highly dependent on sequence identity between template and target, it is necessary to make clear to the user that only models which show high structural quality should be used in such efforts. Molecular modeling of these genomes generated a database in which all structural models were built using alignments with more than 30% sequence identity, generating models with medium and high accuracy. All models in the database are publicly accessible at http://www.biocristalografia.df.ibilce.unesp.br/tools. The DBMODELING user interface provides user-friendly menus, so that all information can be printed in one stop from any web browser. Furthermore, DBMODELING also provides a docking interface, which allows the user to carry out geometric docking simulations against the molecular models available in the database. There are three other important homology model databases: MODBASE...

ETHNOS: A versatile electronic tool for the development and curation of national genetic databases

van Baal, Sjozef; Zlotogora, Joël; Lagoumintzis, George; Gkantouna, Vassiliki; Tzimas, Ioannis; Poulas, Konstantinos; Tsakalidis, Athanassios; Romeo, Giovanni; Patrinos, George P
Source: BioMed Central Publisher: BioMed Central
Type: Scientific journal article
Published on 01/06/2010 EN
Search relevance
36.41%
National and ethnic mutation databases (NEMDBs) are emerging online repositories, recording extensive information about the described genetic heterogeneity of an ethnic group or population. These resources facilitate the provision of genetic services and provide a comprehensive list of genomic variations among different populations. As such, they enhance awareness of the various genetic disorders. Here, we describe the features of the ETHNOS software, a simple but versatile tool based on a flat-file database that is specifically designed for the development and curation of NEMDBs. ETHNOS is a freely available software which runs more than half of the NEMDBs currently available. Given the emerging need for NEMDB in genetic testing services and the fact that ETHNOS is the only off-the-shelf software available for NEMDB development and curation, its adoption in subsequent NEMDB development would contribute towards data content uniformity, unlike the diverse contents and quality of the available gene (locus)-specific databases. Finally, we allude to the potential applications of NEMDBs, not only as worldwide central allele frequency repositories, but also, and most importantly, as data warehouses of individual-level genomic data, hence allowing for a comprehensive ethnicity-specific documentation of genomic variation.

Recent updates and developments to plant genome size databases

Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel;
Source: Oxford University Press Publisher: Oxford University Press
Type: Scientific journal article
EN
Search relevance
36.41%
Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols.

Quality standards for DNA sequence variation databases to improve clinical management under development in Australia

Bennetts, B.; Caramins, M.; Hsu, A.; Lau, C.; Mead, S.; Meldrum, C.; Smith, T.D.; Suthers, G.; Taylor, G.R.; Cotton, R.G.H.; Tyrrell, V.
Source: Elsevier Publisher: Elsevier
Type: Scientific journal article
Published in 2014 EN
Search relevance
36.41%
Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, The Royal College of Pathologists of Australasia (RCPA) in collaboration with the Human Genetics Society of Australasia (HGSA), and the Human Variome Project (HVP) is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.

An Abstract Algebraic Theory of L-Fuzzy Relations for Relational Databases

Chowdhury, Abdul Wazed
Source: Brock University Publisher: Brock University
Type: Electronic Thesis or Dissertation
ENG
Search relevance
36.41%
Classical relational databases lack proper ways to manage certain real-world situations, including imprecise or uncertain data. Fuzzy databases overcome this limitation by allowing each entry in the table to be a fuzzy set where each element of the corresponding domain is assigned a membership degree from the real interval [0, 1]. But this fuzzy mechanism becomes inappropriate in modelling scenarios where data might be incomparable. Therefore, we become interested in a further generalization of fuzzy databases into L-fuzzy databases. In such a database, the characteristic function for a fuzzy set maps to an arbitrary complete Brouwerian lattice L. From the query language perspective, the language of fuzzy databases, FSQL, extends the regular Structured Query Language (SQL) by adding fuzzy-specific constructions. In addition, the L-fuzzy query language LFSQL introduces appropriate linguistic operations to define and manipulate inexact data in an L-fuzzy database. This research mainly focuses on defining the semantics of LFSQL. However, it requires an abstract algebraic theory which can be used to prove all the properties of, and operations on, L-fuzzy relations. In our study, we show that the theory of arrow categories forms a suitable framework for that. Therefore...
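
The contrast between [0, 1]-valued and lattice-valued membership can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names are hypothetical, and the lattice used here (pairs from [0, 1] x [0, 1] ordered componentwise) is just one convenient example of a lattice with incomparable elements.

```python
from typing import Dict, Tuple

# Ordinary fuzzy set: every element gets a membership degree in [0, 1].
FuzzySet = Dict[str, float]


def fuzzy_union(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    """Union takes the larger of the two membership degrees."""
    return {k: max(a.get(k, 0.0), b.get(k, 0.0)) for k in a.keys() | b.keys()}


def fuzzy_intersection(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    """Intersection takes the smaller of the two membership degrees."""
    return {k: min(a.get(k, 0.0), b.get(k, 0.0)) for k in a.keys() | b.keys()}


# Illustrative lattice L (an assumption): pairs ordered componentwise.
Degree = Tuple[float, float]


def leq(x: Degree, y: Degree) -> bool:
    """x is below y only if it is below in both components."""
    return x[0] <= y[0] and x[1] <= y[1]


def comparable(x: Degree, y: Degree) -> bool:
    return leq(x, y) or leq(y, x)


def join(x: Degree, y: Degree) -> Degree:
    """Least upper bound in the componentwise order."""
    return (max(x[0], y[0]), max(x[1], y[1]))


def meet(x: Degree, y: Degree) -> Degree:
    """Greatest lower bound in the componentwise order."""
    return (min(x[0], y[0]), min(x[1], y[1]))


tall = {"ann": 0.9, "bob": 0.4}
fast = {"ann": 0.3, "bob": 0.7}
x, y = (0.2, 0.9), (0.8, 0.1)
print(fuzzy_union(tall, fast), comparable(x, y), join(x, y), meet(x, y))
```

In [0, 1] any two degrees can be ranked, whereas `x` and `y` above cannot; their join and meet still exist, which is what lets union and intersection carry over to the lattice-valued setting.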

Análisis del nivel de aplicación y uso docente de herramientas teleformativas en el área de programación y bases de datos; Analysis of the level of implementation and use e-learning tools by the professors in the area of programming and databases

Muñoz Carril, Pablo César; González Sanmamed, Mercedes
Source: Universidad de Murcia Publisher: Universidad de Murcia
Type: Scientific journal article Format: application/pdf
SPA
Search relevance
36.41%
This article presents the results of a quantitative survey study conducted at the University of A Coruña. Part of this research focused on determining the level of use and handling by university teachers of e-learning tools in the area of programming and database management systems. The statistical analysis showed that the higher the level of teacher training in tools and applications related to programming and databases, the greater the degree of application and use of these programs, revealing a significant positive correlation. Likewise, the inferential analysis showed that certain variables, such as "teaching experience using virtual environments" and "administrative category", significantly influenced the level of application and use shown by teachers in the area of programming and databases in e-learning environments.

Databases and Information Integration for the Medicago truncatula Genome and Transcriptome

Cannon, Steven B.; Crow, John A.; Heuer, Michael L.; Wang, Xiaohong; Cannon, Ethalinda K.S.; Dwan, Christopher; Lamblin, Anne-Francoise; Vasdewani, Jayprakash; Mudge, Joann; Cook, Andrew; Gish, John; Cheung, Foo; Kenton, Steve; Kunau, Timothy M.; Brown, D
Source: American Society of Plant Biologists Publisher: American Society of Plant Biologists
Type: Scientific journal article
Published in 05/2005 EN
Search relevance
36.41%
An international consortium is sequencing the euchromatic genespace of Medicago truncatula. Extensive bioinformatic and database resources support the marker-anchored bacterial artificial chromosome (BAC) sequencing strategy. Existing physical and genetic maps and deep BAC-end sequencing help to guide the sequencing effort, while EST databases provide essential resources for genome annotation as well as transcriptome characterization and microarray design. Finished BAC sequences are joined into overlapping sequence assemblies and undergo an automated annotation process that integrates ab initio predictions with EST, protein, and other recognizable features. Because of the sequencing project's international and collaborative nature, data production, storage, and visualization tools are broadly distributed. This paper describes databases and Web resources for the project, which provide support for physical and genetic maps, genome sequence assembly, gene prediction, and integration of EST data. A central project Web site at medicago.org/genome provides access to genome viewers and other resources project-wide, including an Ensembl implementation at medicago.org, physical map and marker resources at mtgenome.ucdavis.edu, and genome viewers at the University of Oklahoma (www.genome.ou.edu)...

Strong types for relational databases : functional pearl

Visser, J.; Silva, Alexandra M.
Source: ACM Press Publisher: ACM Press
Type: Conference paper or conference object
Published in 2006 ENG
Search relevance
36.41%
Haskell's type system with multi-parameter constructor classes and functional dependencies allows static (compile-time) computations to be expressed by logic programming on the level of types. This emergent capability has been exploited for instance to model arbitrary-length tuples (heterogeneous lists), extensible records, functions with variable length argument lists, and (homogeneous) lists of statically fixed length (vectors). We explain how type-level programming can be exploited to define a strongly-typed model of relational databases and operations on them. In particular, we present a strongly typed embedding of a significant subset of SQL in Haskell. In this model, meta-data is represented by type-level entities that guard the semantic correctness of database operations at compile time. Apart from the standard relational database operations, such as selection and join, we model functional dependencies (among table attributes), normal forms, and operations for database transformation. We show how functional dependency information can be represented at the type level, and can be transported through operations. This means that type inference statically computes functional dependencies on the result from those on the arguments. Our model shows that Haskell can be used to design and prototype typed languages for designing...

A survey on data and transaction management in mobile databases

Selvarani, D. Roselin; Ravi, T. N.
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 23/11/2012
Search relevance
36.48%
The popularity of the Mobile Database is increasing day by day as people need information even on the move in the fast changing world. This database technology permits employees using mobile devices to connect to their corporate networks, hoard the needed data, work in the disconnected mode and reconnect to the network to synchronize with the corporate database. In this scenario, the data is being moved closer to the applications in order to improve the performance and autonomy. This leads to many interesting problems in mobile database research and Mobile Database has become a fertile land for many researchers. In this paper a survey is presented on data and Transaction management in Mobile Databases from the year 2000 onwards. The survey focuses on the complete study on the various types of Architectures used in Mobile databases and Mobile Transaction Models. It also addresses the data management issues namely Replication and Caching strategies and the transaction management functionalities such as Concurrency Control and Commit protocols, Synchronization, Query Processing, Recovery and Security. It also provides Research Directions in Mobile databases.; Comment: 20 Pages; International Journal of Database Management Systems (IJDMS) Vol.4...

Consistency Checking and Querying in Probabilistic Databases under Integrity Constraints

Flesca, Sergio; Furfaro, Filippo; Parisi, Francesco
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 13/03/2013
Search relevance
36.48%
We address the issue of incorporating a particular yet expressive form of integrity constraints (namely, denial constraints) into probabilistic databases. To this aim, we move away from the common way of giving semantics to probabilistic databases, which relies on considering a unique interpretation of the data, and address two fundamental problems: consistency checking and query evaluation. The former consists in verifying whether there is an interpretation which conforms to both the marginal probabilities of the tuples and the integrity constraints. The latter is the problem of answering queries under a "cautious" paradigm, taking into account all interpretations of the data in accordance with the constraints. In this setting, we investigate the complexity of the above-mentioned problems, and identify several tractable cases of practical relevance.; Comment: Probabilistic databases, Integrity constraints, Consistency checking
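
As a rough illustration of what consistency checking asks, consider the smallest possible case: two tuples `t1` and `t2` with marginal probabilities `p1` and `p2`, and a single denial constraint forbidding both from appearing in the same possible world. The Python sketch below is an assumption-laden toy, not the paper's algorithm; the function names are hypothetical. In this special case an interpretation exists exactly when `p1 + p2 <= 1`, and one can be written down directly.

```python
from typing import Dict, FrozenSet, Optional

World = FrozenSet[str]


def consistent_interpretation(p1: float, p2: float,
                              eps: float = 1e-12) -> Optional[Dict[World, float]]:
    """Return a distribution over possible worlds that matches the marginals
    p1, p2 and never contains t1 and t2 together, or None if none exists."""
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("marginals must lie in [0, 1]")
    # The only allowed worlds are {}, {t1}, {t2}; the world {t1, t2} is denied.
    # Matching the marginals forces P({t1}) = p1 and P({t2}) = p2.
    rest = 1.0 - p1 - p2
    if rest < -eps:
        return None  # the marginals need more than total probability 1
    return {
        frozenset(): max(rest, 0.0),
        frozenset({"t1"}): p1,
        frozenset({"t2"}): p2,
    }


def marginal(dist: Dict[World, float], tuple_id: str) -> float:
    """Probability that tuple_id appears, summed over all worlds."""
    return sum(p for world, p in dist.items() if tuple_id in world)


ok = consistent_interpretation(0.3, 0.5)
bad = consistent_interpretation(0.7, 0.6)
print(ok is not None, bad is None)
```

With more tuples and overlapping constraints, no such closed form is available in general, which is where the complexity questions studied in the paper arise.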

Table manipulation in simplicial databases

Spivak, David I.
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 13/03/2010
Search relevance
36.48%
In [Spi], we developed a category of databases in which the schema of a database is represented as a simplicial set. Each simplex corresponds to a table in the database. There, our main concern was to find a categorical formulation of databases; the simplicial nature of the schemas was to some degree unexpected and unexploited. In the present note, we show how to use this geometric formulation effectively on a computer. If we think of each simplex as a polygonal tile, we can imagine assembling custom databases by mixing and matching tiles. Queries on this database can be performed by drawing paths through the resulting tile formations, selecting records at the start-point of this path and retrieving corresponding records at its end-point.; Comment: 8 pages.

Constructing Bio-molecular Databases on a DNA-based Computer

Chang, Weng-Long; Ho, Michael; Guo, Minyi
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 11/12/2007
Search relevance
36.48%
Codd [Codd 1970] wrote the first paper in which the model of a relational database was proposed. Adleman [Adleman 1994] wrote the first paper in which DNA strands in a test tube were used to solve an instance of the Hamiltonian path problem. [Adleman 1994] indicates that storing information in molecules of DNA allows for an information density of approximately 1 bit per cubic nanometer (nm), a dramatic improvement over existing storage media such as video tape, which store information at a density of approximately 1 bit per 10^12 cubic nanometers. This paper demonstrates that biological operations can be applied to construct bio-molecular databases where data records in relational tables are encoded as DNA strands. In order to achieve this goal, DNA algorithms are proposed to perform eight operations of relational algebra (calculus) on bio-molecular relational databases, which include Cartesian product, union, set difference, selection, projection, intersection, join and division. Furthermore, this work presents clear evidence of the ability of molecular computing to perform data retrieval operations on bio-molecular relational databases.; Comment: The article includes 35 pages, several tables and figures
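
For readers unfamiliar with the relational algebra operations the abstract lists, the following Python sketch shows their conventional set-based semantics on ordinary in-memory data. It says nothing about the DNA encodings or algorithms of the paper; the `Relation` type and function names are hypothetical, and division is omitted for brevity.

```python
from typing import Callable, Dict, FrozenSet, NamedTuple, Sequence, Tuple


class Relation(NamedTuple):
    attrs: Tuple[str, ...]   # attribute names, in column order
    rows: FrozenSet[tuple]   # set of value tuples (no duplicates)


def select(r: Relation, pred: Callable[[Dict[str, object]], bool]) -> Relation:
    """Keep the rows for which pred(row-as-dict) is true."""
    return Relation(r.attrs, frozenset(
        row for row in r.rows if pred(dict(zip(r.attrs, row)))))


def project(r: Relation, attrs: Sequence[str]) -> Relation:
    """Keep only the named columns; duplicates collapse automatically."""
    idx = [r.attrs.index(a) for a in attrs]
    return Relation(tuple(attrs), frozenset(
        tuple(row[i] for i in idx) for row in r.rows))


def _same_schema(r: Relation, s: Relation) -> None:
    if r.attrs != s.attrs:
        raise ValueError("relations must have identical attribute lists")


def union(r: Relation, s: Relation) -> Relation:
    _same_schema(r, s)
    return Relation(r.attrs, r.rows | s.rows)


def difference(r: Relation, s: Relation) -> Relation:
    _same_schema(r, s)
    return Relation(r.attrs, r.rows - s.rows)


def intersection(r: Relation, s: Relation) -> Relation:
    _same_schema(r, s)
    return Relation(r.attrs, r.rows & s.rows)


def product(r: Relation, s: Relation) -> Relation:
    """Cartesian product; attribute names are assumed to be disjoint."""
    if set(r.attrs) & set(s.attrs):
        raise ValueError("attribute names must be disjoint")
    return Relation(r.attrs + s.attrs, frozenset(
        a + b for a in r.rows for b in s.rows))


def natural_join(r: Relation, s: Relation) -> Relation:
    """Combine rows that agree on every attribute the relations share."""
    common = [a for a in r.attrs if a in s.attrs]
    extra = [a for a in s.attrs if a not in common]
    out = set()
    for a in r.rows:
        da = dict(zip(r.attrs, a))
        for b in s.rows:
            db = dict(zip(s.attrs, b))
            if all(da[c] == db[c] for c in common):
                out.add(a + tuple(db[e] for e in extra))
    return Relation(r.attrs + tuple(extra), frozenset(out))


emp = Relation(("name", "dept"), frozenset({("ann", "db"), ("bob", "ai")}))
dept = Relation(("dept", "floor"), frozenset({("db", 1), ("ai", 2)}))
print(sorted(natural_join(emp, dept).rows))
```

Relations are modeled as sets of tuples, so duplicate elimination in `project` and `union` comes for free.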

Spatial Indexing of Large Multidimensional Databases

Csabai, István; Trencséni, Márton; Herczegh, Géza; Dobos, László; Józsa, Péter; Purger, Norbert; Budavári, Tamás; Szalay, Alexander
Source: Cornell University Publisher: Cornell University
Type: Scientific journal article
Published on 28/09/2012
Search relevance
36.48%
Scientific endeavors such as large astronomical surveys generate databases on the terabyte scale. These usually multidimensional databases must be visualized and mined in order to find interesting objects or to extract meaningful and qualitatively new relationships. Many statistical algorithms required for these tasks run reasonably fast when operating on small sets of in-memory data, but take noticeable performance hits when operating on large databases that do not fit into memory. We utilize new software technologies to develop and evaluate fast multidimensional indexing schemes that inherently follow the underlying, highly non-uniform distribution of the data: they are layered uniform grid indices, hierarchical binary space partitioning, and sampled flat Voronoi tessellation of the data. Our working database is the 5-dimensional magnitude space of the Sloan Digital Sky Survey with more than 270 million data points, where we show that these techniques can dramatically speed up data mining operations such as finding similar objects by example, classifying objects or comparing extensive simulation sets with observations. We are also developing tools to interact with the multidimensional database and visualize the data at multiple resolutions in an adaptive manner.; Comment: 12 pages...
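
To give a feel for grid-based indexing, here is a minimal single-layer uniform grid in Python. It is a simplified in-memory sketch under assumed names (`UniformGridIndex`, `query_box`), not the layered grid, binary space partitioning, or Voronoi schemes evaluated in the paper.

```python
from collections import defaultdict
from itertools import product
from math import floor
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


class UniformGridIndex:
    """Buckets points by the grid cell that contains them."""

    def __init__(self, cell_size: float) -> None:
        if cell_size <= 0:
            raise ValueError("cell_size must be positive")
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, ...], List[Point]] = defaultdict(list)

    def _cell(self, p: Sequence[float]) -> Tuple[int, ...]:
        # Integer cell coordinates; floor keeps negative values consistent.
        return tuple(floor(x / self.cell_size) for x in p)

    def insert(self, p: Sequence[float]) -> None:
        self.cells[self._cell(p)].append(tuple(p))

    def query_box(self, lo: Sequence[float], hi: Sequence[float]) -> List[Point]:
        """Return all points p with lo[i] <= p[i] <= hi[i] in every dimension."""
        lo_c, hi_c = self._cell(lo), self._cell(hi)
        ranges = [range(a, b + 1) for a, b in zip(lo_c, hi_c)]
        hits: List[Point] = []
        # Visit only the cells overlapping the box, then filter exactly.
        # Note: the number of visited cells grows quickly with dimension.
        for cell in product(*ranges):
            for p in self.cells.get(cell, ()):
                if all(l <= x <= h for l, x, h in zip(lo, p, hi)):
                    hits.append(p)
        return hits


index = UniformGridIndex(cell_size=1.0)
for pt in [(0.1, 0.1), (0.9, 0.9), (2.5, 2.5), (-0.5, 0.2)]:
    index.insert(pt)
print(sorted(index.query_box((0.0, 0.0), (1.0, 1.0))))
```

A single fixed cell size handles uniform data well but wastes effort on highly non-uniform distributions, which is the motivation the abstract gives for layered and adaptive schemes.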