The aim of the PhenCode project
is to display information on mutations throughout the genome on the
UCSC Genome Browser, where it can be better integrated with
functional annotation and genome conservation data. The mutations can be
linked in a two-way connection between the Browser and the individual
Locus Specific Databases (LSDBs). Thus users can start at an LSDB to
find mutations that fit various criteria using the LSDB's query
interface, and then view them on the Browser in register with additional
annotation tracks of their choosing. Conversely, Browser users,
after loading the custom tracks from the PhenCode site, may find
mutations of interest, and follow links back to the source LSDB for
more complete information. Examples of this usage, drawn from
several LSDBs, are illustrated in the
PhenCode examples pages.
Penn State researchers began discussions with the UCSC Genome Browser staff in the summer of 2005, as an offshoot of work on the ENCODE project. As a proof of principle, Belinda Giardine at Penn State entered into a collaboration with the UCSC staff to build the interfaces and connections between our databases of hemoglobin variants and genotype-phenotype relations (HbVar and GenPhen) and the Genome Browser. At about the same time, UCSC added a track derived from the protein variant data in Swiss-Prot to the test version of the Browser.
We knew that, to be really effective, information from many (ideally all) LSDBs would need to be displayed and linked. We thought that two-way connections between the HGVS Central Repository and the UCSC Browser, with further links to the LSDBs, would be the next step. However, at the October 2005 HGVS meeting, we learned that the Central Repository was on hold while the WayStation project got rolling. We met with Dr. Robert Flegg (WayStation) and talked with Prof. Cotton at the meeting. Dr. Flegg agreed that passing new mutations recorded in the WayStation into our system would be a good idea, and we plan to do that. Clearly, having a home for new mutations in the WayStation is highly desirable; we are just offering a means to access, display, and hopefully better interpret them.
Some database curators have been concerned about the time it would take to convert from their database's coordinate system to the human genome assembly coordinates used by the Browser. Belinda has developed software that can do most of this conversion automatically; once the LSDB coordinate system is explained to her, usually the only work required from the curators is to spot-check the results and examine any atypical records that could not be converted automatically. This process should handle the minimal fields required to display the mutations on the Browser. Additional database-specific attributes can also be included and are encouraged, but may entail further consultation.
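To give curators a rough idea of what such a conversion involves, here is a minimal sketch in Python. It is not the PhenCode software itself. It assumes, purely for illustration, that the LSDB numbers positions along a transcript (1-based) and that the exon boundaries on the genome assembly are known. The function names, the exon model, and the toy coordinates are all hypothetical. Records that cannot be mapped are set aside for manual review, and the rest are written as BED-style lines (0-based start, half-open end), the kind of format the Browser accepts for custom tracks.

```python
from typing import List, Optional, Tuple

# Hypothetical exon model: genomic (start, end) pairs, 1-based inclusive,
# listed in transcript order (so reversed on the genome for "-" strand genes).
Exon = Tuple[int, int]


def transcript_to_genomic(pos: int, exons: List[Exon], strand: str) -> Optional[int]:
    """Map a 1-based transcript position to a 1-based genomic position.

    Returns None when the position falls outside the exons, so the caller
    can flag the record for manual review.
    """
    if pos < 1:
        return None
    remaining = pos
    for start, end in exons:
        length = end - start + 1
        if remaining <= length:
            if strand == "+":
                return start + remaining - 1
            # On the minus strand the transcript runs from high to low coordinates.
            return end - remaining + 1
        remaining -= length
    return None


def to_bed_line(chrom: str, genomic_pos: int, name: str) -> str:
    """Format a single-base feature as a BED line (0-based start, half-open end)."""
    return f"{chrom}\t{genomic_pos - 1}\t{genomic_pos}\t{name}"


def convert_records(
    records: List[Tuple[str, int]], chrom: str, exons: List[Exon], strand: str
) -> Tuple[List[str], List[str]]:
    """Convert (name, transcript position) records; collect names that fail."""
    bed_lines: List[str] = []
    needs_review: List[str] = []
    for name, pos in records:
        genomic = transcript_to_genomic(pos, exons, strand)
        if genomic is None:
            needs_review.append(name)
        else:
            bed_lines.append(to_bed_line(chrom, genomic, name))
    return bed_lines, needs_review


if __name__ == "__main__":
    # Toy coordinates, invented for this example only.
    toy_exons = [(100, 109), (200, 209)]
    toy_records = [("var1", 1), ("var2", 11), ("var3", 25)]
    lines, review = convert_records(toy_records, "chr1", toy_exons, "+")
    print("\n".join(lines))
    print("needs manual review:", review)
```

In this sketch the `needs_review` list plays the role of the atypical records mentioned above: anything the automatic step cannot place is handed back to the curator rather than guessed at. A real LSDB may number positions quite differently (for example, relative to a reference sequence entry or with intronic offsets), which is why the coordinate system has to be explained first.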