Data Mining
Microarray technology can generate expression data for
thousands of genes at a time, which has increased the
need to cross-reference experimental data with previously
reported biological facts, theories and results.
Biomedical literature databases provide the knowledge
warehouses required for such cross-referencing. However,
the overwhelming amount of biomedical literature makes
the task intimidating.
Analyze Affymetrix CEL files with
our powerful customized function.
The ontology mapping tool helps interpret the biological
significance of lists of differentially expressed genes
derived from high-throughput experiments. It maps genes to
their corresponding Gene Ontology (GO) terms and ranks the
statistical significance of the GO term matches based on
the hypergeometric distribution. The method is rather
simple, but it is useful because it translates a list of
genes into biological meaning.
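The ranking step described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names, the input layout (a mapping from term to annotated genes) and the choice of a one-sided upper-tail test are all assumptions made for the example.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """Upper-tail probability P(X >= k) of the hypergeometric distribution.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the selected list, k: selected genes annotated to the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible combinations contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def rank_go_terms(selected, background, term_to_genes):
    """Rank terms by hypergeometric p-value, most significant first.

    term_to_genes is a hypothetical input: {term: iterable of gene IDs}.
    """
    background = set(background)
    selected = set(selected) & background
    results = []
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        overlap = len(annotated & selected)
        if overlap == 0:
            continue  # terms with no matching genes are not reported
        p = hypergeom_pvalue(overlap, len(selected), len(annotated), len(background))
        results.append((term, overlap, p))
    return sorted(results, key=lambda row: row[2])
```

Calling `rank_go_terms(gene_list, all_genes_on_array, term_to_genes)` returns `(term, overlap, p_value)` tuples. No multiple-testing correction is applied in this sketch; whether the tool applies one is not stated here.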
On one side, results are pouring in from microarray
experiments. On the other side, the volume of literature
is growing at an unprecedented rate. Medline alone contains
more than 12 million citations, making it almost impossible
for researchers to keep up with current research in their
fields. There is an urgent need to bridge the gap between
high-throughput experiments and vast knowledge repositories.
Automatic extraction of information from the biomedical
literature will thus play a critical role in aiding research
and speeding up the discovery process. We are currently
interested in identifying and extracting macromolecular
entities from the biomedical literature. Our approach will
map the identified entities to individual LocusLink entries,
thus enabling the seamless integration of literature
information with existing gene and protein databases.
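As a rough illustration of what mapping identified entities to LocusLink entries can look like, the sketch below uses a plain dictionary lookup. This is an assumption made for the example only. The extraction method actually used is not described here, and the three-entry lexicon and the `map_entities` helper are hypothetical.

```python
import re

# Tiny illustrative lexicon (an assumption, not the project's data):
# lower-cased gene symbol or alias -> LocusLink ID.
LEXICON = {
    "tp53": 7157,
    "p53": 7157,
    "brca1": 672,
}

TOKEN = re.compile(r"[A-Za-z0-9]+")

def map_entities(text):
    """Return (mention, start offset, LocusLink ID) for each lexicon hit."""
    hits = []
    for match in TOKEN.finditer(text):
        locus_id = LEXICON.get(match.group().lower())
        if locus_id is not None:
            hits.append((match.group(), match.start(), locus_id))
    return hits
```

A real system would also have to handle multi-word names, ambiguous aliases and spelling variants, which a single-token lookup like this does not attempt.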
Our goal is to develop an effective solution that can
facilitate the mining of Medline literature related to
genetic studies and gene/protein function studies.
Because the biomedical domain lacks a consensus tagged
corpus, this page is set up to let users evaluate the
performance of some components using their own test data,
which facilitates the evaluation process.
The Neighborhood Analysis Algorithm is based on the paper by Vamsi K. Mootha et al., "Identification of a gene causing human cytochrome c oxidase deficiency by integrative genomics", PNAS, January 21, 2003, vol. 100, no. 2.