
{bio,medical} informatics

Friday, October 17, 2003


The New York Times Digging for Nuggets of Wisdom
[requires 'free' registration]

"MICHAEL N. LIEBMAN knows his limitations. Even with a Ph.D. and a long career in medical research, he cannot keep up with all the developments in his area of interest, breast cancer. Medline, the database that already houses more than 10 million abstracts for journal articles, is adding 7,000 to 8,000 abstracts per week. Only a fraction of these are about cancer, but the volume of information is daunting nonetheless."

"Yet Dr. Liebman is convinced that new cures could someday emerge for breast cancer if only someone could read all the literature and synthesize it. So he has found a solution: enlisting a computer program to read the articles for him."

redux [11.09.02]
Stanford Medical Informatics Preprint Archive Using Text Analysis to Identify Functionally Coherent Gene Groups
"The analysis of large-scale genomic information (such as sequence data or expression patterns) frequently involves grouping genes on the basis of common experimental features. Often, as with gene expression clustering, there are too many groups to easily identify the functionally relevant ones. One valuable source of information about gene function is the published literature. We present a method, neighbor divergence, for assessing whether the genes within a group share a common biological function based on their associated scientific literature. The method uses statistical natural language processing techniques to interpret biological text. It requires only a corpus of documents relevant to the genes being studied (e.g., all genes in an organism) and an index connecting the documents to appropriate genes. Given a group of genes, neighbor divergence assigns a numerical score indicating how "functionally coherent" the gene group is from the perspective of the published literature. We evaluate our method by testing its ability to distinguish 19 known functional gene groups from 1900 randomly assembled groups. Neighbor divergence achieves 79% sensitivity at 100% specificity, comparing favorably to other tested methods. We also apply neighbor divergence to previously published gene expression clusters to assess its ability to recognize gene groups that had been manually identified as representative of a common function."
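The evaluation quoted above ("79% sensitivity at 100% specificity") can be illustrated with a small sketch. It does not implement neighbor divergence itself. It only assumes that some scoring function has already given each gene group a coherence score, and the function name and the scores are made up for illustration. At 100% specificity no random group may be accepted, so the cutoff sits at the highest random-group score, and sensitivity is the fraction of known functional groups scoring above it.

```python
def sensitivity_at_full_specificity(positive_scores, negative_scores):
    """Fraction of positives that score strictly above every negative.

    Hypothetical helper: placing the cutoff at the highest negative score
    means no negative is accepted, i.e. specificity is 100%.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    threshold = max(negative_scores)
    hits = sum(1 for score in positive_scores if score > threshold)
    return hits / len(positive_scores)


# Made-up scores, not data from the paper.
known_groups = [9.1, 7.4, 6.8, 3.2, 8.5]
random_groups = [1.0, 2.5, 4.0, 3.9]

print(sensitivity_at_full_specificity(known_groups, random_groups))  # 0.8
```

With 19 known groups, 15 groups above the cutoff would round to 79%, though the abstract does not state the count.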

redux [10.08.01]
BioNLP.Org Natural language processing of biology text
"The literature of the field of biology is the largest of all the sciences. The volume of biology literature each year, measured in bytes, is about fifty times the size of the entire human genome, junk and all. But locked in this literature is an enormous amount of information that can tell us much about the structure and function of genes, proteins, cells and organisms -- how they work as well as how they can fail.

The newly emergent interest in natural language processing for biology has been christened "Information Extraction". But work in this area has been going on for many decades under different names and this site includes a good deal of information about past and current work in NLP and in information extraction for biology in particular."

redux [04.30.01]
New Scientist Biologists in Norway use a computer program to "read" the scientific literature and successfully predict gene interactions
"Biologists in Norway have used a computer program to "read" the scientific literature and successfully predict gene interactions.

This data-mining of the "biobibliome" provides a way of dealing with the ever-increasing torrent of biological data - millions of papers a year. But even more impressively, the completely automated process can make new genetic discoveries - essentially free research."

Stanford Medical Informatics Preprint Archive Including Biological Literature Improves Homology Search
"Annotating the tremendous amount of sequence information being generated requires accurate automated methods for recognizing homology. Although sequence similarity is only one of many indicators of evolutionary homology, it is often the only one used. Here we find that supplementing sequence similarity with information from biomedical literature is successful in increasing the accuracy of homology search results. We modified the PSI-BLAST algorithm to use literature similarity in each iteration of its database search. The modified algorithm is evaluated and compared to standard PSI-BLAST in searching for homologous proteins. The performance of the modified algorithm achieved 32% recall with 95% precision, while the original one achieved 33% recall with 84% precision; the literature similarity requirement preserved the sensitive characteristic of the PSI-BLAST algorithm while improving the precision."
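The idea of requiring literature similarity before accepting a sequence hit can be sketched roughly as follows. This is not the modified PSI-BLAST from the abstract. It assumes, purely for illustration, that each protein comes with some associated text and that literature similarity is a cosine similarity over word counts. The function names, the threshold, and the example texts are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_hits(query_text, hits, min_similarity):
    """Keep only hits whose text is similar enough to the query's text.

    hits is a list of (hit_id, literature_text) pairs. In an iterative
    search, a filter like this would run on each round's candidate hits.
    """
    return [
        hit_id
        for hit_id, text in hits
        if cosine_similarity(query_text, text) >= min_similarity
    ]


# Made-up example texts, not real annotations.
query = "kinase signaling pathway"
candidates = [
    ("P1", "kinase signaling cascade"),
    ("P2", "ribosome assembly factor"),
]
print(filter_hits(query, candidates, min_similarity=0.5))  # ['P1']
```

A stricter threshold trades recall for precision, which is the direction of the trade-off the abstract reports.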

MIT Technology Review Emerging Technologies That Will Change the World: Data Mining
"And the future of data-mining technology? Wide open, says Fayyad - especially as researchers begin to move beyond the field's original focus on highly structured, relational databases. One very hot area is "text data mining": extracting unexpected relationships from huge collections of free-form text documents. The results are still preliminary, as various labs experiment with natural-language processing, statistical word counts and other techniques. But the University of California at Berkeley's LINDI system, to take one example, has already been used to help geneticists search the biomedical literature and produce plausible hypotheses for the function of newly discovered genes."

[ rhetoric ]

“Bioinformatics will be at the core of biology in the 21st century. In fields ranging from structural biology to genomics to biomedical imaging, ready access to data and analytical tools are fundamentally changing the way investigators in the life sciences conduct research and approach problems. Complex, computationally intensive biological problems are now being addressed and promise to significantly advance our understanding of biology and medicine. No biological discipline will be unaffected by these technological breakthroughs.”

BIOINFORMATICS IN THE 21st CENTURY
