snowdeal.org > {bio,medical}informatics: Business Intelligence Network: How Federated Databases Benefit Bioinformatics Research

{bio,medical} informatics

Tuesday, January 10, 2006


Business Intelligence Network: How Federated Databases Benefit Bioinformatics Research

"“Find 3-D structure of hypothetical protein UPF0301 in V. cholerae, SWISS-PROT homology models, PFAM/SCOP classifications, and recent PUBMED entries.”

These are difficult tasks posed by biomedical researchers.

If there is any hope of providing answers for them, bioinformaticians will have to provide coherent access to a wide variety of data and information sources. Database federations are one promising solution. These federations act as integration points to bring together biological and biomedical information from widely distributed sources. Let’s briefly examine database federations and their benefits for biomedical research."

redux [08.16.05]
Business Intelligence Network: Bioinformatics Data Integration
"The need for data integration is widely acknowledged in the bioinformatics community. Bioinformatics data is currently spread across the Internet and throughout organizations in a wide variety of formats. Success in most bioinformatics-related activities, from functional characterization of genomic sequences to prioritization of drug targets, requires an integrated view of all relevant data in a drug discovery R&D program. The challenges of data integration may be addressed using a wide variety of approaches, and integration systems abound in both the academic and commercial sectors. While each approach has strengths and weaknesses, it can be difficult to evaluate which approach suits a particular need best without fully understanding the data integration landscape."

redux [11.05.03]
Nature: Science Update: Biology gets digital in Maryland
"The scientists spending two days at the National Institutes of Health (NIH) in Bethesda this week want to integrate the reams of information spewed out from sequencing machines and computer models.

At the moment, it is a struggle to link a patient's genetic profile with their brain scans and the latest clinical studies. It's like a primitive PC running incompatible word-processing, e-mail and spreadsheet programs, says Erik Jakobsson of the National Institute of General Medical Sciences, who helped to convene the meeting. "We're way behind in making it all work together," he says."

redux [03.31.03]
eWeek: Virtual Databases Make Sense of Varied Data
"Separately, IBM last week announced it is working with a Canadian bioresearch center to create an information system that uses a virtual database to integrate data from a variety of databases, flat-file formats and file types. The iQ Engine, being developed with iCapture Center, of Vancouver, British Columbia, uses IBM's DB2 database and DiscoveryLink integration technology.

The goal is to create a system that will assist researchers in correlating genetic susceptibility of patients with cardiovascular and respiratory diseases to environmental influences such as culture, socioeconomic status, educational background, inhaled cigarette smoke, pollutants, viruses, allergens, diet and obesity."

redux [03.17.02]
The Scientist: Life Sentences
[requires 'free' registration]

"The great challenge in biological research today is how to turn data into knowledge. I have met people who think data is knowledge but these people are then striving for a means of turning knowledge into understanding. Knowledge and science are related words and to know, I believe, is to understand. Before rushing to convert genomics to 'genamics' and finding that it is another dead end, we should consider evacuating the Tower of Babel. We need a theoretical framework in which to embed biological data so that the endless stream of data, filled with the flotsam and jetsam of evolution, can be sifted and abstracted.

Very simply, the network we should be interested in is not the network of names but the network of the objects themselves. The language of these objects is not the Oxford Dictionary of Molecular Biology--the Ontology Consortium's main source--but that of molecular recognition, the language of molecular biology itself."

redux [01.08.02]
Stanford Medical Informatics Preprint Archive: Ontology Development for a Pharmacogenetics Knowledge Base
"Research directed toward discovering how genetic factors influence a patient's response to drugs requires coordination of data produced from laboratory experiments, computational methods, and clinical studies. A public repository of pharmacogenetic data to which investigators from different centers can contribute will facilitate hypothesis generation for further research. We are developing a pharmacogenetics knowledge base (PharmGKB) that will support storage and retrieval of experimental data and conceptual knowledge. We are confronted with the challenge of designing an Internet-based resource that integrates complex biological, pharmacological, and clinical data in such a way that researchers can submit their data and users can retrieve information that supports genotype-phenotype correlations. Successful management of the names, meaning, and organization of concepts used within the system is crucial. We have selected a frame-based knowledge-representation system for development of an ontology of concepts and relationships that represent the domain and that will permit storage of experimental data. Preliminary experience shows that the ontology we have developed for gene-sequence data submissions is appropriate for experimental data that researchers will enter."

The Molecular Biology Ontology Working Group: An Evaluation of Ontology Exchange Languages for Bioinformatics
"Ontologies are specifications of the concepts in a given field and the relationships among those concepts. The development of ontologies for molecular-biology information and the sharing of those ontologies within the bioinformatics community are central problems in bioinformatics. If the bioinformatics community is to share ontologies effectively, ontologies must be exchanged in a form that uses standardized syntax and semantics. This paper reports on an effort among the authors to evaluate a number of alternative ontology-exchange languages, and to recommend one or more languages for use within the larger bioinformatics community. The study selected a set of candidate languages, and defined a set of capabilities that the ideal ontology-exchange language should satisfy. The study scored the languages according to the degree to which they provided each capability. In addition, the authors performed several ontology-exchange experiments with the two languages that received the highest scores: OML and Ontolingua. The result of those experiments, and the main conclusions of this study, was that the frame-based semantic model of Ontolingua is preferable to the conceptual graph model of OML, but that the XML-based syntax of OML is preferable to the Lisp-based syntax of Ontolingua."

redux [05.10.00]
SemanticWeb.Org: Tutorial on Knowledge Markup Techniques
"There is an increasing demand for formalized knowledge on the Web. Several communities (e.g. in bioinformatics and educational media) are getting ready to offer semiformal or formal Web content. XML-based markup languages provide a 'universal' storage and interchange format for such Web-distributed knowledge representation. This tutorial introduces techniques for knowledge markup: we show how to map AI representations (e.g., logics and frames) to XML (incl. RDF and RDF Schema), discuss how to specify XML DTDs and RDF (Schema) descriptions for various representations, survey existing XML extensions for knowledge bases/ontologies, deal with the acquisition and processing of such representations, and detail selected applications. After the tutorial, participants will have absorbed the theoretical foundation and practical use of knowledge markup and will be able to assess XML applications and extensions for AI. Besides bringing to bear existing AI techniques for a Web-based knowledge markup scenario, the tutorial will identify new AI research directions for further developing this scenario."

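The tutorial abstract above mentions mapping frames to XML and RDF. As a rough illustration only, here is a minimal Python sketch of what such a mapping could look like. The `frame_to_rdf` helper, the `http://example.org/bio#` namespace, and the slot names are all hypothetical and do not come from the tutorial; only the RDF syntax namespace is the real W3C one.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # standard RDF namespace
EX_NS = "http://example.org/bio#"  # hypothetical vocabulary, for illustration only


def frame_to_rdf(frame_id, frame_type, slots):
    """Serialize one frame (an id, a type, and slot/value pairs) as RDF/XML.

    The frame type becomes a typed node element, the frame id becomes its
    rdf:about URI, and each slot becomes a property element with a literal.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    node = ET.SubElement(
        root,
        f"{{{EX_NS}}}{frame_type}",
        {f"{{{RDF_NS}}}about": EX_NS + frame_id},
    )
    for slot, value in slots.items():
        # Every slot value is written as a plain string literal in this sketch.
        ET.SubElement(node, f"{{{EX_NS}}}{slot}").text = str(value)
    return root


if __name__ == "__main__":
    doc = frame_to_rdf("BRCA1", "Gene", {"symbol": "BRCA1", "chromosome": 17})
    print(ET.tostring(doc, encoding="unicode"))
```

A real mapping would also need to handle slots whose values are other frames (as `rdf:resource` links) and class hierarchies (RDF Schema); this sketch covers literal-valued slots only.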
redux [03.22.01]
Peter Karp: A Vision of DB Interoperation
"To realize the full potential of biological databases requires more than the interactive, hypertext flavor of database interoperation that is now so popular in the bioinformatics community. Interoperation based on declarative queries to multiple network-accessible databases will support analyses and investigations that are orders of magnitude faster and more powerful than what can be accomplished through interactive navigation. I present a vision of the capabilities that a query-based interoperation infrastructure should provide, and identify assumptions behind, and requirements of, this vision. I then propose an architecture for query-based interoperation that identifies a number of novel components of an information infrastructure for molecular biology. Those components include: A knowledge base that describes relationships among the conceptualizations used in different biological databases; a module that can determine what known DBs are relevant to a particular query; a module that can translate a query, or the results of a query, from one conceptualization to another; a family of DB drivers that provide uniform physical access to different DBMSs; a family of translators that can interconvert among different database schema languages; and a database that describes the network location and access methods for biological databases. A number of the components are translators because biological databases exhibit heterogeneity at several different levels, including the conceptual level, the data model, the query language, and data formats."

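To make the components listed in Karp's abstract a little more concrete, here is a toy Python sketch of query-based interoperation: a catalog of sources, a relevance check, query and result translation between a shared vocabulary and each source's local field names, and uniform "drivers". Everything in it (`SourceEntry`, `CATALOG`, `federated_query`, the two in-memory "databases" and their records) is a hypothetical stand-in, not the architecture or code from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Toy in-memory "databases"; the records are invented for illustration.
PROTEINS = [
    {"acc": "P1", "organism": "V. cholerae"},
    {"acc": "P2", "organism": "E. coli"},
]
STRUCTURES = [
    {"protein_id": "P1", "species": "V. cholerae", "method": "X-ray"},
]


def make_driver(rows: List[dict]) -> Callable[[Dict[str, str]], List[dict]]:
    """Uniform access: every source is queried with a dict of field -> value."""
    def driver(local_query: Dict[str, str]) -> List[dict]:
        return [r for r in rows if all(r.get(k) == v for k, v in local_query.items())]
    return driver


@dataclass
class SourceEntry:
    name: str
    concepts: Set[str]            # shared concepts this source can answer
    field_map: Dict[str, str]     # shared concept -> source-local field name
    driver: Callable[[Dict[str, str]], List[dict]]


# The "database that describes ... access methods", reduced to a list.
CATALOG = [
    SourceEntry("protein_db", {"protein", "organism"},
                {"protein": "acc", "organism": "organism"},
                make_driver(PROTEINS)),
    SourceEntry("structure_db", {"protein", "organism", "method"},
                {"protein": "protein_id", "organism": "species", "method": "method"},
                make_driver(STRUCTURES)),
]


def relevant_sources(query: Dict[str, str]) -> List[SourceEntry]:
    """A source is relevant if it knows every concept the query mentions."""
    return [s for s in CATALOG if set(query) <= s.concepts]


def translate_query(query: Dict[str, str], source: SourceEntry) -> Dict[str, str]:
    """Rewrite shared concept names into the source's own field names."""
    return {source.field_map[c]: v for c, v in query.items()}


def translate_result(row: dict, source: SourceEntry) -> dict:
    """Rewrite a source-local row back into the shared vocabulary."""
    inverse = {local: concept for concept, local in source.field_map.items()}
    return {inverse[k]: v for k, v in row.items() if k in inverse}


def federated_query(query: Dict[str, str]) -> List[dict]:
    """Run one declarative query against every relevant source."""
    results = []
    for source in relevant_sources(query):
        for row in source.driver(translate_query(query, source)):
            results.append({"source": source.name, **translate_result(row, source)})
    return results
```

Calling `federated_query({"organism": "V. cholerae"})` consults both toy sources, while `federated_query({"method": "X-ray"})` is routed only to `structure_db`, because `protein_db` does not advertise the `method` concept.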
redux [02.28.01]
PENN Database Research Group: K2/Kleisli and GUS: Experiments in integrated access to genomic data sources
"The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, "on-the-fly" integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear "winner". Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application. Our experiences also point to some practical tips on how updates should be published by the community, and how XML can be used to facilitate the processing of updates in a warehousing environment."

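The view-versus-warehouse trade-off in the abstract above can be shown with a very small Python sketch. The `View` and `Warehouse` classes, the `clean` function, and the sample gene rows are hypothetical and are not K2 or GUS code. They only illustrate that a view reads the external source at query time, while a warehouse answers from a cleaned local copy that is only as fresh as its last refresh.

```python
from typing import Callable, List

# A stand-in for an external data source; the rows are invented.
SOURCE: List[dict] = [{"gene": " brca1", "chrom": "17"}]


def fetch_source() -> List[dict]:
    """Pretend network call that returns the source's current contents."""
    return list(SOURCE)


def clean(row: dict) -> dict:
    """Normalize a raw row (trim whitespace, upper-case the gene symbol)."""
    return {"gene": row["gene"].strip().upper(), "chrom": row["chrom"]}


class View:
    """Integrates on the fly: every query goes back to the source."""

    def __init__(self, fetch: Callable[[], List[dict]]):
        self._fetch = fetch

    def query(self, gene: str) -> List[dict]:
        return [r for r in map(clean, self._fetch()) if r["gene"] == gene]


class Warehouse:
    """Integrates ahead of time: queries hit a cleaned local copy."""

    def __init__(self, fetch: Callable[[], List[dict]]):
        self._fetch = fetch
        self._rows: List[dict] = []

    def refresh(self) -> None:
        # Download and clean everything once; queries never touch the source.
        self._rows = [clean(r) for r in self._fetch()]

    def query(self, gene: str) -> List[dict]:
        return [r for r in self._rows if r["gene"] == gene]
```

If a row is appended to `SOURCE` after `Warehouse.refresh()` has run, the view sees it immediately and the warehouse does not until the next refresh. That is the cost side of the warehouse's faster, source-independent queries, and it is why the abstract's point about how updates are published matters.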
redux [01.17.01]
The Collection of Computer Science Bibliographies: Bibliography on Mediation, Database Integration, Database Interoperability and related topics
"Personal bibliography on query mediation, database integration, database interoperability and related topics, concentrating on projects in genomic research."

[ rhetoric ]

“Bioinformatics will be at the core of biology in the 21st century. In fields ranging from structural biology to genomics to biomedical imaging, ready access to data and analytical tools are fundamentally changing the way investigators in the life sciences conduct research and approach problems. Complex, computationally intensive biological problems are now being addressed and promise to significantly advance our understanding of biology and medicine. No biological discipline will be unaffected by these technological breakthroughs.”

BIOINFORMATICS IN THE 21st CENTURY
