This is about Hox genes, but not in the way I’d normally write about them. They are the rock stars of evo-devo for a reason, but this post is not about their awesomeness. (See this Pharyngula post for a great introduction to Hox genes.) Generally, I’m more interested in how genes evolve than how they work. However, when I saw this paper in PNAS, I couldn’t help but wonder about function.
I’ll explain. The study by Papadopoulos et al. (2011) is about engineering artificial Hox genes to see how various bits of sequence affect the function of the gene product. Its main focus is the role of a small sequence motif, and the “linker” sequence that connects that motif to the main DNA-binding portion of the protein. This is a schematic of a typical Hox protein sequence, not to scale:
In a typical Hox protein, the homeodomain interacts with DNA (Hox proteins are transcription factors – they turn genes on and off), while the YPWM motif binds to other proteins. Those other proteins change the DNA-binding preferences of the homeodomain, so they influence which specific genes are switched on or off. All Hox proteins have a homeodomain, and the majority have some variation on the YPWM theme. The yellow parts are more variable.
Now, one thing the aforementioned study did was cut pieces off a particularly well-studied Hox gene called Antennapedia (Antp). One of the truncated genes encoded a protein that consisted of just the YPWM motif and homeodomain. This gene appeared to do the same things a normal Antp gene does. It activated the same target genes. Its product interacted with the same proteins. When the researchers switched it on in the wrong places, it turned the antennae of adult flies developing in pupae into legs, and the heads of fly embryos into thoraxes – just like normal Antp.
Just to give you an idea, this is the sequence of the protein in question (copied from Genbank):
MTMSTNNCESMTSYFTNSYMGADMHHGHYPGNGVTDLDAQQMHHYSQNANHQGNMPYPRFPPYDRMPYYN GQGMDQQQQHQVYSRPDSPSSQVGGVMPQAQTNGQLGVPQQQQQQQQQPSQNQQQQQAQQAPQQLQQQLP QVTQQVTHPQQQQQQPVVYASCKLQAAVGGLGMVPEGGSPPLVDQMSGHHMNAQMTLPHHMGHPQAQLGY TDVGVPDVTEVHQNHHNMGMYQQQSGVPPVGAPPQGMMHQGQGPPQMHQGHPGQHTPPSQNPNSQSSGMP SPLYPWMRSQFGKCQERKRGRQTYTRYQTLELEKEFHFNRYLTRRRRIEIAHALC LTERQIKIWFQNRRMKWKKENKTKGEPGSGGEGDEITPPNSPQ
If I understood them right, the highlighted portion is what the experimenters left of it.
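If the highlighting doesn't come through for you, here is a rough Python sketch for finding the motif in the sequence and getting a feel for how little of the protein is left. It rests on a big assumption of mine: that the retained piece runs from the YPWM motif all the way to the end of the protein. The paper's construct may well have been cut differently, so treat the fraction as a ballpark. The names `ANTP_RAW`, `clean` and `find_motif` are made up for this example.

```python
# Antp protein sequence as quoted above; the whitespace is just layout.
ANTP_RAW = """
MTMSTNNCESMTSYFTNSYMGADMHHGHYPGNGVTDLDAQQMHHYSQNANHQGNMPYPRFPPYDRMPYYN
GQGMDQQQQHQVYSRPDSPSSQVGGVMPQAQTNGQLGVPQQQQQQQQQPSQNQQQQQAQQAPQQLQQQLP
QVTQQVTHPQQQQQQPVVYASCKLQAAVGGLGMVPEGGSPPLVDQMSGHHMNAQMTLPHHMGHPQAQLGY
TDVGVPDVTEVHQNHHNMGMYQQQSGVPPVGAPPQGMMHQGQGPPQMHQGHPGQHTPPSQNPNSQSSGMP
SPLYPWMRSQFGKCQERKRGRQTYTRYQTLELEKEFHFNRYLTRRRRIEIAHALC
LTERQIKIWFQNRRMKWKKENKTKGEPGSGGEGDEITPPNSPQ
"""


def clean(raw):
    """Remove all whitespace so the sequence is one continuous string."""
    return "".join(raw.split())


def find_motif(seq, motif="YPWM"):
    """Return the 0-based start position of every occurrence of motif."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


seq = clean(ANTP_RAW)
hits = find_motif(seq)

# ASSUMPTION (mine, not the paper's): the truncated protein is everything
# from the first YPWM motif to the C-terminal end of the sequence.
retained = seq[hits[0]:] if hits else ""
fraction = len(retained) / len(seq)

print(f"full length: {len(seq)} residues")
print(f"YPWM found at 0-based position(s): {hits}")
print(f"retained under my assumption: {len(retained)} residues ({fraction:.0%})")
```

If the real construct stopped before the end of the protein, the retained fraction would be smaller still, which only makes the question below more pressing.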
So… if that little segment is enough to perform the duties of the whole thing, why the hell is the rest of the protein there? It seems like a total waste of genome space.
(There is an obvious answer: this study didn’t actually prove that the full-length and the reduced versions are totally equivalent. For that, they would have had to engineer flies without their own Antp (which would be some horribly messed up flies), and put this mini-Antp into some of them. If mini-Antp can make an otherwise screwed-up fly completely normal, chances are the original gene doesn’t do anything that mini-Antp can’t do. They didn’t do that experiment, so we can’t be sure that there aren’t differences between the two versions where they didn’t look.
It’s probably just that. It has to be that, right?)
 Well, I should say that Hox genes normally live in neat clusters. There are animals in which they have become jumbled within a cluster, torn into two or more clusters or even completely scattered throughout the genome. Nevertheless, on the whole they have a remarkable tendency to stay together and ordered.
 YPWM stands for the amino acids tyrosine, proline, tryptophan and methionine, in case you wondered.
 Actually, the homeodomain also interacts with other proteins, but DNA binding is regarded as its main function.
Papadopoulos DK et al. (2011) Functional synthetic Antennapedia genes and the dual roles of YPWM motif and linker size in transcriptional activation and repression. PNAS 108:11959-11964