Introduction:
Genome-wide, high-density SNP association studies are expected to identify various SNP alleles associated with different complex disorders. Understanding the biological significance of these SNP alleles in the context of the existing literature is a major challenge, because the current Medline search engine and Google Scholar are not designed for searches based on SNPs or other genetic markers. Although extensive effort has been devoted to literature mining of gene and protein functions, similar work on genetic markers and their related diseases is still in its infancy. Our goal is to develop an effective solution that facilitates the mining of Medline literature related to genetic studies and gene/protein function studies. Our solution is based on four main function modules: (1) identifying different types of genetic markers or genetic variations in Medline records; (2) distinguishing positive from negative linkage or association relationships between genetic markers and diseases; (3) integrating marker genomic location data from different databases to enable HapMap data-based queries that retrieve Medline records for all genetically related markers, such as markers in the same linkage disequilibrium region; and (4) a web interface called MarkerInfoFinder to search, display, sort and download Medline citation results. Tests using published data suggest that MarkerInfoFinder can significantly increase the efficiency of finding genetic disorders and their underlying molecular mechanisms. The functions we developed will also be used to build a knowledge base for genetic markers and diseases.
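To make module (1) concrete, the following is a minimal sketch of how marker mentions might be pulled out of citation text with regular expressions. It is not the MarkerInfoFinder implementation, whose rules are not described here. The names `MARKER_PATTERNS` and `find_markers` are hypothetical, the two patterns cover only dbSNP-style rs identifiers and D-number microsatellite names, and the sample sentence is invented for illustration.

```python
import re

# Hypothetical patterns for illustration only; a real system would need many
# more marker types and naming variants than these two.
MARKER_PATTERNS = {
    # dbSNP reference SNP identifiers, e.g. "rs7903146"
    "snp_rsid": re.compile(r"\brs\d+\b"),
    # D-number microsatellite names, e.g. "D10S587" or "DXS1001"
    "microsatellite": re.compile(r"\bD(?:\d{1,2}|X|Y)S\d+\b"),
}


def find_markers(text):
    """Return a sorted list of unique (marker_type, marker_id) pairs in text."""
    found = set()
    for marker_type, pattern in MARKER_PATTERNS.items():
        for match in pattern.finditer(text):
            found.add((marker_type, match.group(0)))
    return sorted(found)


# Invented example sentence, not taken from any Medline record.
sample = "Genotyping included rs7903146 and the microsatellite D10S587."
markers = find_markers(sample)
for marker_type, marker_id in markers:
    print(marker_type, marker_id)
```

Pattern matching of this kind only locates candidate marker names. Deciding whether a record reports a positive or a negative association, as in module (2), would need additional analysis of the surrounding sentence.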