 
 
MarkerInfoFinder Resources

Introduction:
       Genome-wide high-density SNP association studies are expected to identify many SNP alleles associated with complex disorders. Understanding the biological significance of these alleles in the context of the existing literature is a major challenge, because the current Medline search engine and Google Scholar are not designed for searches based on SNPs or other genetic markers. Although extensive effort has been devoted to literature mining of gene and protein functions, similar work on genetic markers and their related diseases is still in its infancy. Our goal is to develop an effective solution that facilitates the mining of Medline literature related to genetic studies and gene/protein function studies. Our solution is based on four main functional modules: (1) identifying different types of genetic markers or genetic variations in Medline records; (2) distinguishing positive from negative linkage or association relationships between genetic markers and diseases; (3) integrating marker genomic location data from different databases to enable HapMap data-based queries that retrieve Medline records for all genetically related markers, such as markers in the same linkage disequilibrium region; and (4) a web interface, MarkerInfoFinder, for searching, displaying, sorting and downloading Medline citation results. Tests using published data suggest that MarkerInfoFinder can significantly increase the efficiency of finding genetic disorders and their underlying molecular mechanisms. The functions we developed will also be used to build a knowledge base for genetic markers and diseases.
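
To illustrate module (1) in its simplest form, the sketch below scans a piece of text for dbSNP-style reference SNP identifiers (for example "rs6265"). It is only an assumption about how such a scan could look, not the MarkerInfoFinder implementation: the pattern RS_ID, the function find_snp_ids and the sample sentence are all hypothetical, and real marker identification also has to cover other marker types and naming variants.

```python
import re

# Hypothetical pattern for dbSNP-style identifiers: "rs" followed by digits.
# This is an illustrative assumption, not the pattern used by MarkerInfoFinder.
RS_ID = re.compile(r"\brs\d+\b", re.IGNORECASE)


def find_snp_ids(text):
    """Return unique rs identifiers, lowercased, in order of first appearance."""
    seen = []
    for match in RS_ID.finditer(text):
        rs_id = match.group(0).lower()
        if rs_id not in seen:
            seen.append(rs_id)
    return seen


if __name__ == "__main__":
    # Made-up sentence used only to exercise the function.
    sample = "Both rs6265 and RS4680 were genotyped; rs6265 was analysed first."
    print(find_snp_ids(sample))
```

Lowercasing the identifiers means that "RS4680" and "rs4680" are counted as the same marker, which matters when results are later grouped by marker.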

 
Evaluation Corpus and Annotation:

(1) Cytoband and STS Evaluation Corpus and Annotation:

 
Cytoband Extraction
STS Extraction
Testing corpus
Corpus PMIDs
Testing results
Annotation results

 

(2) Genetic Disease and Marker-Disease Negative Association Evaluation Corpus and Annotation:

 
Genetic Disease Extraction
Marker-Disease Negative Association Identification
Testing corpus
Corpus PMIDs
Testing results
Annotation results
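
As a rough illustration of what negative association identification involves, the sketch below flags sentences that contain a negation cue next to an association or linkage term. The cue list is a small hypothetical sample written for this example; it is not the compiled "negation feature patterns" resource listed on this page, and the function name is_negated_association is likewise an assumption.

```python
import re

# Hypothetical negation cues, assumed for illustration only.
NEGATION_CUES = [
    r"\bno (?:significant )?(?:association|linkage|evidence)\b",
    r"\bnot (?:significantly )?(?:associated|linked)\b",
    r"\bfailed to (?:find|detect|replicate|confirm)\b",
    r"\black of (?:association|linkage)\b",
]
NEGATION_RE = re.compile("|".join(NEGATION_CUES), re.IGNORECASE)


def is_negated_association(sentence):
    """Return True if the sentence contains one of the assumed negation cues."""
    return NEGATION_RE.search(sentence) is not None


if __name__ == "__main__":
    # Made-up sentences used only to exercise the function.
    for s in [
        "No significant association was found between the marker and the disease.",
        "The marker was significantly associated with the disorder.",
    ]:
        print(is_negated_association(s), "-", s)
```

A cue match only says that the sentence is negated somewhere; deciding which marker-disease pair the negation applies to needs additional analysis that this sketch does not attempt.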

 

 
Evaluation Online:
Scan cytobands in text online: Cytoband Extraction Online
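
For readers who want a feel for what scanning text for cytobands means, here is a minimal sketch. It assumes human cytoband mentions of the form chromosome, arm (p or q), band and optional sub-band, such as "7q31.2" or "Xp11.23". The pattern CYTOBAND and the function find_cytobands are hypothetical and are not the code behind the online tool.

```python
import re

# Assumed form: chromosome 1-22, X or Y; arm p or q; 1-2 band digits;
# optional sub-band after a dot. Illustrative only.
CYTOBAND = re.compile(
    r"\b(?:[1-9]|1\d|2[0-2]|X|Y)[pq]\d{1,2}(?:\.\d{1,2})?\b"
)


def find_cytobands(text):
    """Return all cytoband-like mentions in the order they appear."""
    return CYTOBAND.findall(text)


if __name__ == "__main__":
    # Made-up sentence used only to exercise the function.
    sample = "Linkage was reported at 7q31.2 and Xp11.23, but not at 22q11."
    print(find_cytobands(sample))
```

This sketch does not handle ranges such as "15q11-q13" (only the first band would be returned), and it cannot tell a cytoband from an unrelated token that happens to have the same shape; feature terms such as those in the compiled resources below are the kind of context a fuller approach would use for that.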
Other Compiled Resources:
             
Download                                            Description
Download compiled resources used in our programs    Resource: Compiled genetic or sequence variation-related MeSH descriptors
Download compiled resources used in our programs    Resource: disease feature terms
Download compiled resources used in our programs    Resource: negation feature patterns
Download compiled resources used in our programs    Resource: negation feature action (base form)
Download compiled resources used in our programs    Resource: frequent last tokens of disease names
Download compiled resources used in our programs    Resource: common short words in Medline
Download compiled resources used in our programs    Resource: words for stop disease name scanning
Download compiled resources used in our programs    Resource: inflections of disease keywords
Download compiled resources used in our programs    Resource: trivial disease keywords
Download compiled resources used in our programs    Resource: OMIM to gene mapping
Download compiled resources used in our programs    Resource: disease keyword frequency
Download compiled resources used in our programs    Resource: OMIM disease name normalization
Download compiled resources used in our programs    Resource: cytoband regions
Download compiled resources used in our programs    Resource: cytoband regions (human, max/min)
Download compiled resources used in our programs    Resource: cytoband feature terms 1
Download compiled resources used in our programs    Resource: cytoband feature terms 2
Download compiled resources used in our programs    Resource: Medline records with SNP annotation
Download compiled resources used in our programs    Resource: STS genomic locations
Download compiled resources used in our programs    Resource: most ambiguous short symbols from gene/protein identification
Download compiled resources used in our programs    Resource: most ambiguous gene/protein patterns
Download compiled resources used in our programs    Resource: stop word list
Download compiled resources used in our programs    Resource: strong gene/protein feature terms
 
             

 

 
Contact: Contact Us
 
 

 

                                                           

 Microarray Lab, Department of Psychiatry / Molecular and Behavioral Neuroscience Institute, University of Michigan

 
