Gene Fusion Events:
A gene fusion event occurs when two separate genes in one organism are found to have high combined sequence identity to one long, continuous gene in another organism. We took a list of target protein sequences (including some that designate largely unfolded and partially unfolded proteins) and ran BLAST searches for them against FusionDB, a database of bacterial and archaeal gene fusion events. Next, we analyzed the hits to the longer "fusion" sequences in other organisms. In each hit, either the N-terminus or the C-terminus of the fusion gene had high sequence identity with the query sequence. The database also returned the sequence of the gene from the query genome that matches the other terminus of the fusion gene. A hit must meet the following criteria to be considered a gene fusion event:

An example of a good gene fusion event is shown below:

In this example, the query protein "SR10" is similar to a transcriptional regulator in the organism Bacillus subtilis. It appears in the second alignment in the image above, where it has high sequence identity with the C-terminal region of the longer fusion sequence (from the organism Vibrio parahaemolyticus). The first alignment shows another protein sequence from the same query genome (similar to peptide methionine sulfoxide reductase) aligned against the fusion genome. It has high sequence identity with the N-terminal region of the fusion sequence. Together, these two smaller genes produce a high BLAST score with the longer fusion gene (the third alignment shown).
The quality of a gene fusion event can be evaluated with two measures: the fusion index and the separation index. Although the separation index is considered more important, both are crucial in evaluating the integrity of the gene fusion. The separation index is a value between 0 and 1; larger values indicate less overlap and more separation between the two smaller sequences when they are placed in a triple alignment with the fusion sequence. Hits with a separation index greater than 0.6 are considered true gene fusion events. The fusion index is also a value between 0 and 1; higher values indicate greater sequence identity between the two smaller genes and the fusion gene in the triple alignment.
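
The screening rule described above can be sketched in a few lines of Python. This is only an illustration, not the procedure FusionDB itself uses: it assumes the fusion index and separation index have already been read off the database output for each hit, and the names FusionCandidate, is_true_fusion and rank_candidates are hypothetical. The 0.6 cutoff comes from the text; sorting by separation index first and fusion index second is an assumption that reflects the statement that the separation index is the more important of the two.

```python
from dataclasses import dataclass

# Cutoff stated in the text: separation index must be greater than 0.6.
SEPARATION_THRESHOLD = 0.6


@dataclass
class FusionCandidate:
    """One candidate hit (hypothetical container, not a FusionDB type)."""
    query_id: str            # identifier of the query protein
    fusion_index: float      # 0 to 1, higher = greater identity to the fusion gene
    separation_index: float  # 0 to 1, higher = less overlap between the two genes


def is_true_fusion(candidate, threshold=SEPARATION_THRESHOLD):
    """Return True if the separation index is strictly above the cutoff."""
    for name in ("fusion_index", "separation_index"):
        value = getattr(candidate, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    return candidate.separation_index > threshold


def rank_candidates(candidates):
    """Keep candidates that pass the cutoff, best first.

    Assumption: sort by separation index, then by fusion index,
    because the text treats the separation index as more important.
    """
    passing = [c for c in candidates if is_true_fusion(c)]
    return sorted(
        passing,
        key=lambda c: (c.separation_index, c.fusion_index),
        reverse=True,
    )
```

Calling rank_candidates on a list of FusionCandidate objects returns only the hits whose separation index exceeds 0.6, ordered from strongest to weakest. A hit with a separation index of exactly 0.6 is excluded, since the text says "greater than".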