To dump a chunk of trunk

The Mammal has deemed that Hox genes and good old-fashioned feel-good evo-devo are a good way to blink back to life*. Also, tardigrades. Tardigrades are awesome. Here is one viewed from above, from the Goldstein lab via Encyclopedia of Life:


Tardigrades or water bears are also a bit unusual. Their closest living relatives are velvet worms (Onychophora) and arthropods. Exactly who’s closest to whom in that trio of phyla collectively known as the Panarthropoda is not clear, and I don’t have the energy to wade into the debate – besides, it’s not really important for the purposes of this post. What Smith et al. (2016) concluded about these adorably indestructible little creatures holds irrespective of their precise phylogenetic position.

Anyway. I said tardigrades were unusual, and I don’t mean their uncanny ability to survive the apocalypse and pick up random genes in the process (Boothby et al., 2015). (ETA: so apparently there may not be nearly as much foreign gene hoarding as the genome paper suggests – see Sujai Kumar’s comment below! Doesn’t change the fact that tardigrades are tough little buggers, though 🙂 ) The oddity we’re interested in today lies in the fact that all known species are built to the exact same compact body plan. Onychophorans and many arthropods are elongated animals with lots of segments, lots of legs, and often lots of variation in the number and type of such body parts. Tardigrades? A wee head, four chubby pairs of legs, and that’s it.

How does a tardigrade body relate to that of a velvet worm, or a centipede, or a spider? Based solely on anatomy, that’s a hell of a question to answer; even the homology of body parts between different kinds of arthropods can be difficult to determine. I have so far remained stubbornly uneducated on the minutiae of (pan)arthropod segment homologies, although I do see papers purporting to match brain parts, appendages and suchlike between different kinds of creepy-crawlies on a fairly regular basis. Shame on me for not being able to care about the details, I guess – but the frequency with which the subject comes up suggests that the debate is far from over.

Now, when I was first drawn to the evo-devo field, one of the biggest attractions was the notion that the expression of genes as a body part forms can tell us what that body part really is even when anatomical clues are less than clear. That, of course, is too good to be simply true, but sometimes the lure of genes and neat homology stories is just too hard to resist. Smith et al.’s investigation of tardigrade Hox genes is definitely that kind of story.

Hox genes are generally a good place to look if you’re trying to decipher body regions, since their more or less neat, orderly expression patterns are remarkably conserved between very distantly related animals (they are probably as old as the Bilateria, to be precise). A polychaete worm, a vertebrate and an arthropod show the same general pattern – there is no active Hox gene at the very front of the embryo, then Hoxes 1, 2, 3 and so on appear in roughly that order, all the way to the rear end. There are variations in the pattern – e.g. the expression of a gene can have sharp boundaries or fade in and out gradually; different genes can overlap to different extents, the order isn’t always perfect, etc. – but staggered Hox gene expression domains, with the same genes starting up in the same general area along the main body axis, can be found all across the Bilateria.

Tardigrades are no exception, in a sense – but they are also quite exceptional. First, their complement of Hox genes is a bit of a mess. At long last, we have a tardigrade genome to hand, in which Smith et al. (2016) found good honest Hox genes. What they didn’t find was a Hox cluster, an orderly series of Hox genes sitting like beads on a DNA string. Instead, the Hox genes in Hypsibius dujardini, the sequenced species, are all over the genome, associating with all kinds of dubious fellows who aren’t Hoxes.

What Smith et al. also didn’t find was half of the Hox genes they expected. A typical arthropod has ten or so Hox genes, a pretty standard ballpark for an animal that isn’t a vertebrate. H. dujardini has only seven, three of which are copies of Abdominal-B, a gene that normally exists in a single copy in arthropods. So basically, only five kinds of Hox gene – number two and most of the “middle” ones are missing. What’s more, two more tardigrades that aren’t closely related to H. dujardini also appear to have the same five Hox gene types (though only one Abd-B each), so this massive loss is probably a common feature of Tardigrada. (No word on whether the scattering of the Hox cluster is also shared by the other two species.)
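For anyone who wants the bookkeeping spelled out, here is a minimal Python sketch of the count. The ten-gene arthropod list is the standard textbook complement. The tardigrade list follows the numbers above (seven genes, three of them Abd-B copies), but listing Hox3 among the survivors is an inference from "number two and most of the middle ones are missing", so treat that as an assumption. The helper name `summarise` is made up for illustration.

```python
from collections import Counter

# Standard arthropod Hox complement, front to back.
ARTHROPOD_HOX = ["lab", "pb", "Hox3", "Dfd", "Scr", "ftz", "Antp", "Ubx", "abd-A", "Abd-B"]

# H. dujardini as described in the post: seven genes, three of them Abd-B copies.
# Assumption: Hox3 is the fifth surviving gene type (not named explicitly in the post).
TARDIGRADE_HOX = ["lab", "Hox3", "Dfd", "ftz", "Abd-B", "Abd-B", "Abd-B"]

def summarise(genes, reference):
    """Count total genes, distinct gene types, and reference types that are absent."""
    counts = Counter(genes)
    missing = [g for g in reference if g not in counts]
    return {"total": len(genes), "kinds": len(counts), "missing": missing}

summary = summarise(TARDIGRADE_HOX, ARTHROPOD_HOX)
print(summary)
```

Under these assumptions the missing types are pb (Hox2) and the block from Scr to abd-A, with ftz as the lone "middle" survivor.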

We know that the genes are scattered and decimated, but are their expression patterns similarly disrupted? You don’t actually need an intact Hox cluster for orderly Hox expression, and indeed, tardigrade Hox genes are activated in a perfectly neat and perfectly usual pattern that resembles what you see in their panarthropod cousins. Except for the bit where half the pattern is missing!

Here’s part of Figure 4 from the paper, a schematic comparison of tardigrade Hox expression to that of other panarthropods – a generic arachnid, a millipede and a velvet worm. (otd is a “head” gene that lives in the Hox-free anterior region; lab is the arthropod equivalent of Hox1, Dfd is Hox4, and I’m not sure which of Hox6-8 ftz is currently supposed to be.) The interesting thing about this is that according to Hox genes, the entire body of the tardigrade corresponds to just the front end of arthropods and velvet worms.


In addition, one thing that is not shown on this diagram is that Abdominal-B, which normally marks the butt end of the animal, is still active in the tardigrade, predictably in the last segment (L4, that is). So if you take the Hox data at face value, a tardigrade is the arse end of an arthropod tacked straight onto its head. Weird. It’s like evolution took a perfectly ordinary velvet worm-like creature and chopped out most of its trunk.

The tardigrade data suggest that the original panarthropod was probably more like arthropods and velvet worms than tardigrades – an elongated animal with many segments. The strange tardigrade situation can’t be the ancestral one, since the Hox genes that tardigrades lack long predate the panarthropod ancestor. Now, it might be possible to lose half your Hox genes while keeping your ancestral body plan, but an unusual body plan and an unusual set of Hox genes is a bit of a big coincidence, innit?

Smith et al. point out that the loss of the Hox genes was unlikely to be the cause of the loss of the trunk region – Hox genes only specify what grows on a segment, they don’t have much say in how many segments develop in the first place. Instead, the authors reason, the loss of the trunk in the tardigrade ancestor probably made the relevant Hox genes dispensable.

Damn, this story makes me want to see the Hox genes of all those oddball lobopodians from the Cambrian. Some of them are bound to be tardigrade relatives, right?



Boothby TC et al. (2015) Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade. PNAS 112:15976-15981

Smith FW et al. (2016) The compact body plan of tardigrades evolved by the loss of a large body region. Current Biology 26:224-229


*The Mammal has been pretty depressed lately. As in mired up to her head in weird energy-sucking flu. Unfortunately, writing is one of those things that the damn brain monster has eaten most of the fun out of. Also, I have a shitty normal person job at the moment, and shitty job taking up time + barely enough motivation to crawl out of bed and pretend to be human means I have, at best, one afternoon per week that I actually spend on catching up with science. That is just enough to scroll through my feeds and file away the interesting stuff, but woefully insufficient for the writing of posts, not to mention that my ability to concentrate is, to be terribly technical, absolutely fucked. It’s not an ideal state of affairs by any stretch, and I’m pretty sure that if I made more of an effort to read and write about cool things, it would pay off in the mental health department, but… well. That sort of reasonable advice is hard to hear with the oozing fog-grey suckers of that thing clamped onto my brain.

In which a “living fossil’s” genome delights me

I promised myself I wouldn’t go on for thousands and thousands of words about the Lingula genome paper (I’ve got things to do, and there is a LOT of stuff in there), but I had to indulge myself a little bit. Four or five years ago when I was a final year undergrad trying to figure out things about Hox gene evolution, I would have killed for a complete brachiopod genome. Or even a complete brachiopod Hox cluster. A year or two ago, when I was trying to sweat out something resembling a PhD thesis, I would have killed for some information about the genetics of brachiopod shells that amounted to more than tables of amino acid abundances. Too late for my poor dissertations, but a brachiopod genome is finally sequenced! The paper is right here, completely free (Luo et al., 2015). Yay for labs who can afford open-access publishing!

In case you’re not familiar with Lingula, it’s this guy (image from Wikipedia):

In a classic case of looks being deceiving, it’s not a mollusc, although it does look a bit like one except for the weird white stalk sticking out of the back of its shell. Brachiopods, the phylum to which Lingula belongs, are one of those strange groups no one really knows where to place, although nowadays we are pretty sure they are somewhere in the general vicinity of molluscs, annelid worms and their ilk. Unlike bivalve molluscs, whose shell valves are on the left and right sides of the animal, the shells of brachiopods like Lingula have top and bottom valves. Lingula’s shell is also made of different materials: while bivalve shells contain calcium carbonate deposited into a mesh of chitin and silk-like proteins,* the subgroup of brachiopods Lingula belongs to uses calcium phosphate, the same mineral that dominates our bones, and a lot of collagen (again like bone). But we’ll come back to that in a moment…

One of the reasons the Lingula genome is particularly interesting is that Lingula is a classic “living fossil”. In the Paleobiology Database, there’s even an entry for a Cambrian fossil classified as Lingula, and there are plenty of entries from the next geological period. If the database is to be believed, the genus Lingula has existed for something like 500 million years, which must be some kind of record for an animal.** Is its genome similarly conservative? Or did the DNA hiding under a deceptively conservative shell design evolve as quickly as anyone’s?

In a heroic feat of self-control, I’m not spending all night poring over the paper, but I did give a couple of interesting sections a look. Naturally, the first thing I dug out was the Hox cluster hiding in the rather large supplement. This was the first clue that Lingula’s genome is definitely “living” and not at all a fossil in any sense of the word. If it were, we’d expect one neat string of Hox genes, all in the order we’re used to from other animals. Instead, what we find is two missing genes, one plucked from the middle of the cluster and tacked onto its “front” end, and two genes totally detached from the rest. It’s not too bad as Hox cluster disintegration goes – six out of nine genes are still neatly ordered – but it certainly doesn’t look like something left over from the dawn of animals.

The bigger clue that caught my eye, though, was this little family tree in Figure 2:


The red numbers on each branch indicate the number of gene families that expanded or first appeared in that lineage, and the green numbers are the families shrunk or lost. Note that our “living fossil” takes the lead in both. What I find funny is that it’s miles ahead of not only the animals generally considered “conservative” in terms of genome evolution, like the limpet Lottia and the lancelet Branchiostoma, but also the sea squirt (Ciona). Squirts are notorious for having incredibly fast-evolving genomes; then again, most of that notoriety was based on the crazily divergent sequences and often wildly scrambled order of their genes. A genome can be conservative in some ways and highly innovative in others. In fact, many of the genes involved in basic cellular functions are very slow-evolving in Lingula. (Note also: humans are pretty slow-evolving as far as gene content goes. This is not the first study to find that.)

So, Lingula, living fossil? Not so much.

The last bit I looked at was the section about shell genetics. Although it’s generally foolish to expect the shell-forming gene sets of two animals from different phyla to be similar (see my first footnote), if there are similarities, they could potentially go at least two different ways. First, brachiopods might be quite close to molluscs, which is the hypothesis Luo et al.’s own treebuilding efforts support. Like molluscs, brachiopods also have a specialised mantle that secretes shell material, though having the same name doesn’t mean the two “mantles” actually share a common origin. So who knows, some molluscan shell proteins, or shell regulatory genes, might show up in Lingula, too.

On the other hand, the composition of Lingula’s shell is more similar to our skeletons’. So, since they have to capture the same mineral, could the brachiopods share some of our skeletal proteins? The answer to both questions seems to be “mostly no”.

Molluscan shell matrix proteins, those that are actually built into the structure of the shell, are quite variable even within Mollusca. It’s probably not surprising, then, that most of the relevant genes that are even present in Lingula are not specific to the mantle, and those that are are the kinds of genes that are generally involved in the handling of calcium or the building of the stuff around cells in all kinds of contexts. Some of the regulatory mechanisms might be shared – Luo et al. report that BMP signalling seems to be going on around the edge of the mantle in baby Lingula, and this cellular signalling system is also involved in molluscan shell formation. Then again, a handful of similar signalling systems “are involved” in bloody everything in animal development, so how much we can deduce from this similarity is anyone’s guess.

As for “bone genes” – the ones that are most characteristically tied to bone are missing (disappointingly or reassuringly, take your pick). The SCPP protein family is so far known only from vertebrates, and its various members are involved in the mineralisation of bones and teeth. SCPPs originate from an ancient protein called SPARC, which seems to be generally present wherever collagen is (IIRC, it’s thought to help collagen fibres arrange themselves correctly). Lingula has a gene for SPARC all right, but nothing remotely resembling an SCPP gene.

I mentioned that the shell of Lingula is built largely on collagen, but it turns out that it isn’t “our” kind of collagen. “Collagen” is just a protein with a particular kind of repetitive sequence. Three amino acids (glycine-proline-something else, in case you’re interested) are repeated ad nauseam in the collagen chain, and these repetitive regions let the protein twist into characteristic rope-like fibres that make collagen such a wonderfully tough basis for connective tissue. Aside from the repeats they all share, collagens are a large and diverse bunch. The ones that form most of the organic matrix in bone contain a non-repetitive and rather easily recognised domain at one end, but when Luo et al. analysed the genome and the proteins extracted from the Lingula shell, they found that none of the shell collagens possessed this domain. Instead, most of them had EGF domains, which are pretty widespread in all kinds of extracellular proteins. Based on the genome sequence, Lingula has a whole little cluster of these collagens-with-EGF-domains that probably originated from brachiopod-specific gene duplications.
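To make the "repeated ad nauseam" bit concrete, here is a toy Python sketch that looks for runs of the glycine-proline-something triplet in a protein sequence written in one-letter amino acid codes. This is only an illustration of the repeat described above, not how Luo et al. identified their collagens. The function name `longest_gpx_run`, the `min_repeats` threshold and the toy sequence are all made up.

```python
import re

def longest_gpx_run(protein: str, min_repeats: int = 3) -> int:
    """Return the number of triplets in the longest run of consecutive
    Gly-Pro-X repeats (G, P, then any residue); 0 if no run reaches min_repeats."""
    pattern = re.compile(r"(?:GP[A-Z]){%d,}" % min_repeats)
    best = 0
    for match in pattern.finditer(protein.upper()):
        best = max(best, len(match.group(0)) // 3)
    return best

# Entirely invented toy sequence: four G-P-X triplets flanked by non-repetitive residues.
toy = "MKT" + "GPAGPSGPKGPQ" + "LLV"
print(longest_gpx_run(toy))
```

A real collagen search would also want to recognise the non-repetitive domains discussed above (or the EGF domains), which is a job for proper domain-annotation tools rather than a regular expression.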

So, to recap: Lingula is not as conservative as its looks would suggest (never judge a living fossil by its cover, right?). We also finally have actual sequences for lots of its shell proteins, which reveal that when it comes to building shells, Lingula does its own thing. Not much of a surprise, but still, knowing is a damn sight better than thinkin’ it’s probably so. We are scientists here, or what.

I am Very Pleased with this genome. (I just wish it was published five years ago 😛 )



*This, interestingly, doesn’t seem to be the general case for all molluscs. Jackson et al. (2010) compared the genes building the pearly layer of snail (abalone, to be precise) and bivalve (pearl oyster) shells, and found that the snail showed no sign of the chitin-making enzymes and silk type proteins that were so abundant in its bivalved cousins. It appears that even within molluscs, different groups have found different ways to make often very similar shell structures. However, all mollusc shells, regardless of the underlying genetics, are predominantly composed of calcium carbonate.

**You often hear about sharks, or crocodiles, or coelacanths, existing “unchanged” for 100 or 200 or whatever million years, but in reality, 200-million-year-old crocodiles aren’t even classified in the same families, let alone the same genera, as any of the living species. Again, the living coelacanth is distinct enough from its relatives in the Cretaceous, when they were last seen, to warrant its own genus in the eyes of taxonomists. I’ve no time to check up on sharks, but I’m willing to bet the situation is similar. Whether Lingula’s jaw-dropping 500-million-year tenure on earth is a result of taxonomic lumping or the shells genuinely looking that similar, I don’t know. Anyway, rant over.



Jackson DJ et al. (2010) Parallel evolution of nacre building gene sets in molluscs. Molecular Biology and Evolution 27:591-608

Luo Y-J et al. (2015) The Lingula genome provides insights into brachiopod evolution and the origin of phosphate biomineralization. Nature Communications 6:8301

Putting the cart before the… snake?

Time to reexamine some assumptions (again)! And also, talk about Hox genes, because do I even need a reason?

Hox genes often come up when we look for explanations for various innovations in animal body plans – the digits of land vertebrates, the limbless abdomens of insects, the various feeding and walking and swimming appendages of crustaceans, the strongly differentiated vertebral columns of mammals, and so on.

Speaking of differentiated vertebral columns, here’s one group I’d always thought of as having pretty much the exact opposite of them: snakes. Vertebral columns are patterned, among other things, by Hox genes. Boundaries between different types of vertebrae such as cervical (neck) and thoracic (the ones bearing the ribcage) correspond to boundaries of Hox gene expression in the embryo – e.g. the thoracic region in mammals begins where HoxC6 starts being expressed.

In mammals like us, and also in archosaurs (dinosaurs/birds, crocodiles and extinct relatives thereof), these boundaries can be really obvious and sharply defined – here’s Wikipedia’s crocodile skeleton for an example:

In contrast, the spine of a snake (example from Wikipedia below) just looks like a very long ribcage with a wee tail:

Snakes, of course, are rather weird vertebrates, and weird things make us sciencey types dig for an explanation.

Since Hox genes appear to be responsible for the regionalisation of vertebral columns in mammals and archosaurs, it stands to reason that they’d also have something to do with the comparative lack of regionalisation (and the disappearance of limbs) seen in snakes and similar creatures. In a now classic paper, Cohn and Tickle (1999) observed that unlike in chicks, the Hox genes that normally define the neck and thoracic regions are kind of mashed together in embryonic pythons. Below is a simple schematic from the paper showing where three Hox genes are expressed along the body axis in these two animals. (Green is HoxB5, blue is C8, red is C6.)


As more studies examined snake embryos, others came up with different ideas about the patterning of serpentine spines. Woltering et al. (2009) had a more in-depth look at Hox gene expression in both snakes and caecilians (limbless amphibians) and saw that there are in fact regions ruled by different Hoxes in these animals, if a little fuzzier than you’d expect in a mammal or bird – but they don’t appear to translate to different anatomical regions. Here’s their summary of their findings, showing the anteriormost limit of the activity of various Hox genes in a corn snake compared to a mouse:


Such differences aside, both of the above studies operated on the assumption that the vertebral column of snakes is “deregionalised” – i.e. that it evolved by losing well-defined anatomical regions present in its ancestors. But is that actually correct? Did snakes evolve from more regionalised ancestors, and did they then lose this regionalisation?

Head and Polly (2015) argue that the assumption of deregionalisation is a bit stinky. First, that super-long ribcage of snakes does in fact divide into several regions, and these regions respect the usual boundaries of Hox expression. Second, ordinary lizard-shaped lizards (from which snakes descended back in the days of the dinosaurs) are no more regionalised than snakes.

The study is mostly a statistical analysis of the shapes of vertebrae. Using an approach called geometric morphometrics, it turned these shapes from dozens of squamate (snake and lizard) species into sets of coordinates, which could then be compared to see how much they vary along the spine and whether the variation is smooth and continuous or clustered into different regions. The authors evaluated hypotheses regarding the number of distinct regions to see which one(s) best explained the observed variation. They also compared the squamates to alligators (representing archosaurs).
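The "which number of regions explains the variation best" step can be sketched in a few lines of Python. To be clear, this is a heavily simplified stand-in, not Head and Polly's actual pipeline: it assumes a single made-up shape score per vertebra instead of full morphometric coordinates, fits flat (piecewise-constant) regions by least squares, and picks the number of regions with an AIC-style penalty. All names (`sse`, `best_split`, `choose_regions`) and the toy data are hypothetical.

```python
import math

def sse(values, i, j):
    """Sum of squared deviations from the mean for values[i:j]."""
    seg = values[i:j]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)

def best_split(values, k):
    """Lowest total SSE and the breakpoints for k contiguous regions (dynamic programming)."""
    n = len(values)
    # cost[r][j]: best SSE for covering values[:j] with r regions
    cost = [[math.inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for r in range(1, k + 1):
        for j in range(r, n + 1):
            for i in range(r - 1, j):
                c = cost[r - 1][i] + sse(values, i, j)
                if c < cost[r][j]:
                    cost[r][j], back[r][j] = c, i
    # Walk backwards to recover where each region starts.
    breaks, j = [], n
    for r in range(k, 0, -1):
        j = back[r][j]
        breaks.append(j)
    return cost[k][n], sorted(breaks)[1:]  # drop the leading 0

def choose_regions(values, max_k):
    """Pick the number of regions with the lowest AIC-style score."""
    n = len(values)
    scored = []
    for k in range(1, max_k + 1):
        total, breaks = best_split(values, k)
        n_params = 2 * k - 1  # k region means plus k - 1 breakpoints
        aic = n * math.log(max(total, 1e-12) / n) + 2 * n_params
        scored.append((aic, k, breaks))
    _, k, breaks = min(scored)
    return k, breaks

# Made-up shape scores for 12 vertebrae, front to back.
scores = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0, 5.0, 5.1, 4.9, 5.0]
best_k, boundaries = choose_regions(scores, max_k=5)
print(best_k, boundaries)
```

The penalty term is what stops the model from simply declaring every vertebra its own region. Without it, more regions always fit better.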

The results were partly what you’d expect. First, alligators showed much more overall variation in vertebral shape than squamates. Note that that’s all squamates – leggy lizards are nearly (though not quite) as uniform as their snake-like relatives. However, in all squamates, the best-fitting model of regionalisation was still one with either three or four distinct regions in front of the hips/cloaca, and in the majority, it was four, the same number as the alligator had.

Moreover, there appeared to be no strong support for an evolutionary pattern to the number of regions – specifically, none of the scenarios in which the origin of snake-like body plans involved the loss of one or more regions were particularly favoured by the data. There was also no systematic variation in the relative lengths of various regions; the idea that snakes in general have ridiculously long thoraxes is not supported by this analysis.

In summary, snakes might show a little less variation in vertebral shape than their closest relatives, but they certainly didn’t descend from alligator-style sharply regionalised ancestors, and they do still have regionalised spines.

Hox gene expression is not known for most of the creatures for which vertebral shapes were analysed, but such data do exist for mammals (mice, here), alligators, and corn snakes. What is known about different domains of Hox gene activation in these three animals turns out to match the anatomical boundaries defined by the models pretty well. In the mouse and alligator, Hox expression boundaries are sharp, and the borders of regions fall within one vertebra of them.

In the snake, the genetic and morphological boundaries are both gradual, but the boundaries estimated by the best model are always within the fuzzy boundary region of an appropriate Hox gene expression domain. Overall, the relationship between Hox genes and regions of the spine is pretty consistent in all three species.

To finish off, the authors make the important point that once you start turning to the fossil record and examining extinct relatives of mammals, or archosaurs, or squamates, or beasties close to the common ancestor of all three groups (collectively known as amniotes), you tend to find something less obviously regionalised than living mammals or archosaurs – check out this little figure from Head and Polly (2015) to see what they’re talking about:


(Moving across the tree, Seymouria is an early relative of amniotes but not quite an amniote; Captorhinus is similarly related to archosaurs and squamates, Uromastyx is the spiny-tailed lizard, Lichanura is a boa, Thrinaxodon is a close relative of mammals from the Triassic, and Mus, of course, is everyone’s favourite rodent. Note how alligators and mice really stand out with their ribless lower backs and suchlike.)

Although they don’t show stats for extinct creatures, Head and Polly argue that mammals and archosaurs, not snakes, are the weird ones when it comes to vertebral regionalisation. For most of amniote evolution, the norm was the more subtle version seen in living squamates. It was only during the origin of mammals and archosaurs that boundaries were sharpened and differences between regions magnified. Nice bit of convergent/parallel evolution there!



Cohn MJ & Tickle C (1999) Developmental basis of limblessness and axial patterning in snakes. Nature 399:474-479

Head JJ & Polly PD (2015) Evolution of the snake body form reveals homoplasy in amniote Hox gene function. Nature 520:86-89

Woltering JM et al. (2009) Axial patterning in snakes and caecilians: evidence for an alternative interpretation of the Hox code. Developmental Biology 332:82-89

Finally, that sponge ParaHox gene

ParaHox genes are a bit like the underappreciated sidekicks of Hox genes. Or little sisters, as the case may be, since the two families are closely related. Hox genes are probably as famous as anything in evo-devo. Being among the first genes controlling embryonic development to be (a) discovered, (b) found to be conserved between very distantly related animals, they are symbolic of the late 20th century evo-devo revolution.

ParaHoxes get much less attention despite sharing some of the most exciting properties of Hox genes. Like those, they are involved in anteroposterior patterning – that is, partitioning an embryo along its head to tail axis. Also like Hox genes, they are often neatly clustered in the genome, and when they are, they tend to be expressed in the same order (both in space and time) in which they sit in the cluster*. Their main ancestral roles for bilaterian animals seem to be in patterning the gut and the central nervous system (Garstang and Ferrier, 2013).

There are three known types of ParaHox gene, which are generally thought to be homologous to specific subsets of Hox genes – by the most accepted scheme, Gsx is the closest sister of Hox1 and Hox2, Xlox is closest to Hox3, and Cdx to Hox9 and above. It is abundantly clear that Hoxes and ParaHoxes are closely related, but there has been a bit of debate concerning the number of genes in the ancestral gene cluster that gave rise to both – usually called “ProtoHox” (Garcia-Fernàndez, 2005).

Another big question about these genes is precisely when they originated, and in this regard, ParaHox genes are proving much more interesting than Hoxes. You see, there are plenty of animals with both Hox and ParaHox genes, which is what you’d expect given the ProtoHox hypothesis, but there are also animals with only ParaHoxes. If there really was a ProtoHox gene/cluster that then duplicated to give rise to Hoxes and ParaHoxes, then lone ParaHoxes (or Hoxes for that matter) shouldn’t happen – unless the other cluster was lost along the way.

So a suspiciously Gsx-like gene in the weird little blob-creature Trichoplax, which has nothing remotely resembling a Hox gene, was a big clue that (a) Hox/ParaHox genes might go back further in animal evolution than we thought, (b) the loss of the entire Hox or ParaHox cluster is totally possible**, despite how fundamental these genes appear to be for correctly building an animal.

I wrote (at length) about a study by Mendivil Ramos et al. (2012), which revealed that while Trichoplax had no Hox genes and only one of the three types of ParaHox gene, it preserved the more or less intact genomic neighbourhoods in which Hox and ParaHox clusters are normally situated. One of the more interesting results of that paper was that the one sponge genome available at the time – that of Amphimedon queenslandica, which had no trace of either Hox or ParaHox genes – also contained statistically significant groupings of Hox and ParaHox neighbour genes, as if it had a Hox neighbourhood and a ParaHox neighbourhood, but the Hoxes and ParaHoxes themselves had moved out.

That study thus pointed towards an intriguing hypothesis, previously championed by Peterson and Sperling (2007) based solely on gene phylogenies: sponges once did have Hox and ParaHox genes/clusters, which at least some of them later lost. This would essentially mean that the two gene clusters go straight back to the origin of animals if not further***, and we may never find any surviving remnant of the ancestral ProtoHox cluster, since the closest living relatives of animals have neither the genes nor their neighbourhoods (that we know of).

Hypotheses are nice, but as we know, they do have a tendency to be tragically slain by ugly facts. Can we further test this particular hypothesis about sponges? Are there facts that could say yay or nay? (Of course there are. I wouldn’t be writing this otherwise 😉 )

I keep saying that we should always be careful when generalising from one or a few model organisms, that we ignore diversity in the animal kingdom at our own peril, and that “distantly related to us” = “looks like our distant ancestors” is an extremely dodgy assumption. Well, here’s another lesson in that general vein: unlike Amphimedon, some sponges have not just the ghosts of vanished ParaHox clusters, but intact, honest to god ParaHox genes!

It’s calcareous sponges again. Sycon ciliatum and Leucosolenia complicata, two charming little calcisponges, recently had their genomes sequenced (alas, they weren’t yet public last time I checked), and since then, there’s been a steady stream of “cool stuff we found in calcisponge genomes” papers from Maja Adamska’s lab and their collaborators. I’ve discussed one of them (Robinson et al., 2013), in which the sponges revealed their rather unhelpful microRNAs, and back in October (when I was slowly self-destructing from thesis stress), another study announced a couple of delicious ParaHoxes (Fortunato et al., 2014).

(Exciting as it is, the paper starts by tickling my pet peeves right off the bat by calling sponges “strong candidates for being the earliest extant lineage(s) of animals”… I suppose nothing can be perfect… *sigh*)

The study actually covers more than just (Para)Hox genes; it looks at an entire gene class called Antennapedia (ANTP), which includes Hoxes and ParaHoxes plus a handful of related families I’m far less interested in. Sycon and Leucosolenia don’t have a lot of ANTP genes – only ten in the former and twelve in the latter, whereas a typical bilaterian like a fruit fly or a lancelet has several times that number – but from phylogenetic analyses, these appear to be a slightly different assortment of genes from those present in Amphimedon, the owner of the first sequenced sponge genome. This picture is most consistent with a scenario in which all of the ANTP genes in question were present in our common ancestor with sponges, and each sponge lineage lost some of them independently. (You may not realise this until you start delving into the history of various gene families, but genes come and go a LOT in evolution.)

Sadly, many of the branches on these gene trees are quite wonky, including the one linking a gene from each calcisponge to the ParaHox gene Cdx. However, somewhat fuzzy trees are not the only evidence the study presents. First, the putative sponge Cdxes possess a little motif in their protein sequences that is only present in a handful of gene families within the ANTP class. If you take only these families rather than everything ANTP and make trees with them, the two genes come out as Cdx in every single tree, and with more statistical support than the global ANTP trees gave them. Another motif they share with all Hoxes, ParaHoxes and a few of their closest relatives, but not with other ANTP class families.

Second, at least the gene in Sycon appears to have the right neighbours (Leucosolenia was not analysed for this). Since the Sycon genome sequence is currently in pieces much smaller than whole chromosomes, only four or so of the genes flanking ParaHox clusters in other animals are clearly linked to the putative Cdx in the sponge. However, when the researchers did the same sort of simulation Mendivil Ramos et al. (2012) did for Amphimedon, testing whether Hox neighbours and ParaHox neighbours found across all fragments of the genome are (a) close to other Hox/ParaHox neighbours or randomly scattered, and (b) mixed or segregated, they once again found cliques of genes with little overlap, indicating the once-existence of separate Hox and ParaHox clusters.
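For the curious, the "mixed or segregated" half of that question can be sketched as a label-shuffling test. This is only my toy version in Python, not the simulation either paper actually ran: the gene names, scaffold names and the statistic (number of scaffolds carrying both kinds of neighbour) are all made up for illustration.

```python
import random

def mixed_scaffolds(labels_by_gene, scaffold_of):
    """Count scaffolds carrying both Hox-type and ParaHox-type neighbours."""
    kinds = {}
    for gene, label in labels_by_gene.items():
        kinds.setdefault(scaffold_of[gene], set()).add(label)
    # With only two labels in play, a set of size 2 means "mixed".
    return sum(1 for seen in kinds.values() if len(seen) == 2)

def segregation_p(labels_by_gene, scaffold_of, trials=2000, seed=0):
    """Fraction of label shuffles with as few mixed scaffolds as observed."""
    rng = random.Random(seed)
    observed = mixed_scaffolds(labels_by_gene, scaffold_of)
    genes = list(labels_by_gene)
    labels = [labels_by_gene[g] for g in genes]
    hits = 0
    for _ in range(trials):
        rng.shuffle(labels)  # break any real link between gene and label
        if mixed_scaffolds(dict(zip(genes, labels)), scaffold_of) <= observed:
            hits += 1
    return hits / trials

# Toy data (hypothetical): scaffolds A and B hold only Hox neighbours,
# C and D only ParaHox neighbours.
scaffold_of = {f"h{i}": "AB"[i % 2] for i in range(8)}
scaffold_of.update({f"p{i}": "CD"[i % 2] for i in range(8)})
labels_by_gene = {g: ("Hox" if g.startswith("h") else "ParaHox") for g in scaffold_of}

print(mixed_scaffolds(labels_by_gene, scaffold_of))
print(segregation_p(labels_by_gene, scaffold_of))
```

If the two kinds of neighbour really sit in separate cliques, shuffled labels should almost never produce a picture as tidy as the real one, and the returned fraction ends up tiny.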

Fortunato et al. (2014) also examined the expression of their newfound Cdx gene, and found it no less intriguing than its sequence or location in the genome, although their description in the paper is very limited (no doubt because they’re trying to cram results on ten genes into a four-page Nature paper). The really interesting activity they mention and picture is in the inner cell mass of the young sponge in its post-larval stages – the bit that develops into the lining of its feeding chambers. Which, Adamska’s team contend, may well be homologous to our gut lining. In bilaterians, developing guts are one of the major domains of Cdx and ParaHox genes in general!

So at least three different lines of evidence – sequence, neighbours and expression – make this picture hang together quite prettily. It’s incredibly cool – the turning on their heads of long-held assumptions is definitely the most exciting part of science, I say! On the other hand, it’s also a little disheartening, because now that everyone in the animal kingdom except ctenophores has definitive ParaHox genes and at least the empty seats once occupied by Hox genes, are we ever going to find a ProtoHox thingy? May it be that it’ll turn up in one of the single-celled beasties people like Iñaki Ruiz-Trillo are sequencing? That would be cool and weird.

The coolest twist on this story, though, would be to discover traces of ProtoHoxes in a ctenophore, since solid evidence for ProtoHox-wielding ctenophores would (a) confirm the strange and frankly quite dubious-sounding idea that ctenophores, not sponges, are the animal lineage farthest removed from ourselves, (b) SHOW US A FREAKING PROTOHOX CLUSTER. (*bounces* >_> Umm, * cough* OK, maturity can suck it 😀 ) However, given how horribly scrambled at least one ctenophore genome is (Ryan et al., 2013), that’s probably a bit too much to ask…



*Weirdly, the order of expression in time is the opposite of that of the Hox cluster. In both clusters, the “anterior” gene(s), i.e. Hox1-2 or Gsx, are active nearest the front of the embryo, but while anterior Hox genes are also the earliest to turn on, in the ParaHox cluster the posterior gene (Cdx) wakes up first. /end random trivia

**Of course we’ve long known that losing a Hox cluster is not that big a deal, but previously, all confirmed losses occurred in animals with more than one Hox cluster to begin with – a fish has plenty of Hox genes left even after chucking an entire set of them.

***With the obligatory ctenophore caveat



Fortunato SAV et al. (2014) Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes. Nature 514:620-623

Garcia-Fernàndez J (2005) The genesis and evolution of homeobox gene clusters. Nature Reviews Genetics 6:881-892

Garstang M & Ferrier DEK (2013) Time is of the essence for ParaHox homeobox gene clustering. BMC Biology 11:72

Mendivil Ramos O et al. (2012) Ghost loci imply Hox and ParaHox existence in the last common ancestor of animals. Current Biology 22:1951-1956

Peterson KJ & Sperling EA (2007) Poriferan ANTP genes: primitively simple or secondarily reduced? Evolution and Development 9:405-408

Robinson JM et al. (2013) The identification of microRNAs in calcisponges: independent evolution of microRNAs in basal metazoans. Journal of Experimental Zoology B 320:84-93

Ryan JF et al. (2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342:1242592

A bit of Hox gene nostalgia

I had the most random epiphany over my morning tea today. I don’t even know what got me thinking about the Cambrian explosion (as if I needed a reason…). Might have been remembering something from the Euro Evo Devo conference I recently went to. (I kind of wanted to post about that, because I saw some awesome things, but too much effort. My brain isn’t very cooperative these days.)


I was thinking about explanations of the Cambrian explosion and remembering how the relevant chapter in The Book of Life (otherwise known as the book that made me an evolutionary biologist)  tried to make it all about Hox genes. It’s an incredibly simplistic idea, and almost certainly wrong given what we now know about the history of Hox genes (and animals)*. At the time, and for a long time afterwards, I really wanted it to be true because it appeals to my particular biases. But I digress.

Then it dawned on me just how new and shiny Hox genes were when this book was written. I thought, holy shit, TBoL is old. And how far evo-devo as a field has come since!

The Book of Life was first published in 1993. That is less than a decade after the discovery of the homeobox in fruit fly genes that controlled the identity of segments (McGinnis et al., 1984; Scott and Weiner, 1984), and the finding that homeoboxes were shared by very distantly related animals (Carrasco et al., 1984). It was only four years after the recognition that fly and vertebrate Hox genes are activated in the same order along the body axis (Graham et al., 1989; Duboule and Dollé, 1989).

This was a HUGE discovery. Nowadays, we’re used to the idea that many if not most of the genes and gene networks animals use to direct embryonic development are very ancient, but before the discovery of Hox genes and their clusters and their neatly ordered expression patterns, this was not at all obvious. What were the implications of these amazing, deep connections for the evolution of animal form? It’s not surprising that Hox genes would be co-opted to explain animal evolution’s greatest mysteries.

It also occurred to me that 1993 is the year of the zootype paper (Slack et al., 1993). Slack et al. reads like a first peek into a brave new world with limitless possibilities. They first note the similarity of Hox gene expression throughout much of the animal kingdom, then propose that this expression pattern (their “zootype”) should be the definition of an animal. After that, they speculate that just as the pattern of Hox genes could define animals, the patterns of genes controlled by Hoxes could define subgroups within animals. Imagine, they say, if we could solve all those tough questions in animal phylogeny by looking at gene expression.

As always, things turned out More Complicated, what with broken and lost Hox clusters and all the other weird shit developmental “master” genes get up to… but it was nice to look back at the bright and simple childhood of my field.

(And my bright and simple childhood. I read The Book of Life in 1998 or 1999, not entirely sure, and in between Backstreet Boys fandom, exchanging several bookfuls of letters with my BFF and making heart-shaped eyes at long-haired guitar-playing teenage boys, I somehow found true, eternal, nerdy love. *nostalgic sigh*)


*Caveat: it’s been years since I last re-read the book, and my copy is currently about 2500 km from me, so the discussion of the Cambrian explosion might be more nuanced than I remember. Also, my copy is the second edition, so I’m only assuming that the Hox gene thing is there in the original.



Carrasco AE et al. (1984) Cloning of an X. laevis gene expressed during early embryogenesis coding for a peptide region homologous to Drosophila homeotic genes. Cell 37:409-414

Duboule D & Dollé P (1989) The structural and functional organization of the murine HOX gene family resembles that of Drosophila homeotic genes. The EMBO Journal 8:1497-1505

Graham A et al. (1989) The murine and Drosophila homeobox gene complexes have common features of organization and expression. Cell 57:367-378

McGinnis W et al. (1984) A conserved DNA sequence in homoeotic genes of the Drosophila Antennapedia and bithorax complexes. Nature 308:428-433

Scott MP & Weiner AJ (1984) Structural relationships among genes that control development: sequence homology between the Antennapedia, Ultrabithorax, and fushi tarazu loci of Drosophila. PNAS 81:4115-4119

Slack JMW et al. (1993) The zootype and the phylotypic stage. Nature 361:490-492

Lamprey Hox clusters and genome duplications, oh my!

What the hell is up with lamprey Hox clusters?

Lampreys are among the few living jawless vertebrates, creatures that parted evolutionary ways with our ancestors somewhere on the order of 500 million years ago. If you want to know where things like jaws, paired fins or our badass adaptive immune systems came from, a vertebrate that doesn’t possess some of these things and may have diverged from the rest of the vertebrates soon after others originated is just what you need for comparison.

The vertebrate fossil record is pretty rich thanks to us having hard tissues, so a lot can be inferred about these things from the wealth of extinct fishes we have at our disposal. However, there are times when comparisons of living creatures are just as useful as examinations of fossils, if not more so. (Fossils, for example, tend not to have immune systems. ;))

One of the things you absolutely need a living animal to study is, of course, genome evolution. Vertebrates – well, at least jawed vertebrates – are now generally accepted to have the remnants of four genomes. Our long-gone ancestors underwent two rounds of whole genome duplication. Afterwards, most of the extra genes were lost, but evidence for the duplications can still be found in the structure of our genomes, where entire recognisable gene neighbourhoods of our close invertebrate relatives often still exist in up to four copies (Putnam et al., 2008).

Among these neighbourhoods are the four clusters of Hox genes most groups of jawed vertebrates possess. A “normal” animal like a snail or a centipede only has one of these. Since Hox genes are involved in the making of body plans, you have to wonder how suddenly having four sets of them and other developmental “master genes” might have influenced the evolution of vertebrate bodies.

Of course, to guess that, you need to know precisely when these duplications happened. That’s where lampreys come in: their lineage branched off from our definitely quadruple-genomed one after the next closest, definitely single-genomed group. But was it before, between, or after, the two rounds of duplication?

A few years ago, a phylogenetic analysis of 55 gene families by Kuraku et al. (2009) suggested that the lamprey-jawed vertebrate split happened after the 2R. Just this year, the genome of the sea lamprey Petromyzon marinus was finally published (Smith et al., 2013), and its authors agreed that yes, lampreys probably split off from us post-2R. (I don’t entirely get all the things they did to arrive at this conclusion. Groups of linked genes show up again, among other approaches.)

However, that isn’t the whole story, the latest lamprey genomics paper argues (Mehta et al., 2013). The P. marinus genome assembly couldn’t stitch all the Hox clusters properly together. There were two that sat on nice big scaffolds with the whole row of Hox genes and a few of their neighbours, and then there were a bunch of “loose” Hox genes that they couldn’t link to anything (diagram comparing humans and P. marinus below from Smith et al., 2013; the really pale blue boxes under the numbers represent Hox genes):


Given that Hox9 genes exist in four copies in this species, it seems like there may be four clusters. However, in hagfish, the other kind of living jawless vertebrate, a study found Hox genes that seemed to have as many as seven copies (Stadler et al., 2004). Another round of duplication? It wouldn’t be unheard of. Most teleosts, which include most of the things we call “fish” in everyday parlance, have seven Hox clusters courtesy of an extra genome duplication and loss of one cluster*. Salmon and kin have thirteen, after yet another duplication. Maybe hagfish also had another one – but did lampreys? How many more clusters do those lonely Hox genes belong to?

Mehta et al. hunted down the Hox clusters of Japanese lampreys (Lethenteron japonicum), hoping to pin down exactly how many there were. They used large chunks of DNA derived partly from the testicles, where sperm cells and their precursors keep the full genome throughout the animal’s life (lampreys throw away large chunks of the genome in most non-reproductive cells [Smith et al., 2009]). They probed these for Hox genes and sequenced the ones that tested positive. Plus they also got about two-thirds of the full genome together in fairly big pieces. Together, these data allowed them to get a better idea of the mess that is lamprey Hox cluster genomics.

They assembled four whole clusters, including their neighbouring genes, and a partial fifth cluster. A bunch of other genes sat on smaller sequence fragments containing only a couple of Hoxes, or a Hox and a non-Hox, but they were tentatively assigned to a total of eight clusters, eight being the number of different Hox4 genes in the data (no known vertebrate Hox cluster contains more than one Hox4 gene). The L. japonicum equivalents of the 31 publicly available Hox sequences from P. marinus spread out over six of these, which indicates that both species have at least six clusters. Seems like lampreys had another round of genome duplication after 2R? (Summary of L. japonicum Hox clusters from Mehta et al. below.)
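The counting logic is simple enough to sketch in a few lines of Python. If no cluster ever carries more than one gene from the same paralogue group, then the most-copied group sets a lower bound on the number of clusters. The gene labels below are invented for the example and are not the actual annotation from Mehta et al.

```python
from collections import Counter
import re

def min_cluster_count(gene_names):
    """Lower bound on the number of Hox clusters.

    Assumes each cluster carries at most one gene per paralogue group,
    so the most-copied group sets the minimum.
    """
    groups = Counter()
    for name in gene_names:
        # Pull the paralogue group number out of labels like "Hox4-3".
        match = re.match(r"Hox(\d+)", name)
        if match:
            groups[int(match.group(1))] += 1
    return max(groups.values(), default=0)

# Hypothetical labels: eight Hox4 copies and four Hox9 copies.
genes = [f"Hox4-{i}" for i in range(1, 9)] + [f"Hox9-{i}" for i in range(1, 5)]
print(min_cluster_count(genes))
```

Note that this only ever gives a minimum: a cluster that has lost its Hox4 gene would be invisible to this kind of count.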

But wait, that’s not the end of it.

First of all, although there are undoubtedly four complete Hox clusters in L. japonicum, the relationships of these clusters to our four are terribly confused. Whether you look at the phylogenetic trees of individual genes, or the arrangement of non-Hox genes on either side of the cluster, only a big pile of what the fuck emerges. Phylogenies are problematic because the unusual composition of lamprey genes and proteins (Smith et al., 2013) could easily throw them off. All the complete lamprey clusters have a patchwork of neighbours that look like a mashup of more than one of our Hox clusters. Might it mean that lampreys’ proliferation of Hox clusters occurred independently of ours? Did we split before 2R after all?

Hox genes are not the only interesting things in a Hox cluster. In the long gaps between them, there are all sorts of little DNA switches that regulate their behaviour. Some of these are conserved across the jawed vertebrates. Mehta et al. aligned complete Hox clusters of humans, elephant sharks and lampreys to look for such sequences – called conserved non-coding elements or CNEs – in the lamprey.

They only found a few, but that’s enough for a bit more head-scratching. Most CNEs in, say, the human HoxA cluster are only found in one elephant shark cluster, and vice versa. Humans have a HoxA cluster, elephant sharks have a HoxA cluster, they’re clearly the same thing, pretty straightforward. Not so for lampreys. Homologues of individual CNEs in the complete lamprey clusters are spread out over all four human/elephant shark clusters. More evidence for independent duplications?

Mehta et al. are cautious – they point out that the silly mix of Hox cluster neighbours in lampreys could just be due to independent post-2R losses, which is plausible if the split between lamprey and jawed vertebrate lineages happened not too long after 2R. There’s also the fact that the weird lamprey sequences are phylogenetic minefields – however, that’s a double-edged sword, since the same caveat applies to analyses that support a post-2R divergence. Then, perhaps the same argument that goes for Hox cluster neighbours could also apply to CNEs. And, of course, this is just Hox clusters. Smith et al.‘s (2013) findings about overall genome structure don’t go away just because lamprey Hox clusters are weird.

So, in summary, thanks, lampreys. Fat lot of help you are! 😛


*Actually, two losses of two separate clusters in two different teleost lineages. Because Hox evolution wasn’t already complicated enough.



Kuraku S et al. (2009) Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after? Molecular Biology and Evolution 26:47-59

Mehta TK et al. (2013) Evidence for at least six Hox clusters in the Japanese lamprey (Lethenteron japonicum). PNAS 110:16044-16049

Putnam NH et al. (2008) The amphioxus genome and the evolution of the chordate karyotype. Nature 453:1064-1071

Smith JJ et al. (2009) Programmed loss of millions of base pairs from a vertebrate genome. PNAS 106:11212-11217

Smith JJ et al. (2013) Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution. Nature Genetics 45:415-421

Stadler PF et al. (2004) Evidence for independent Hox gene duplications in the hagfish lineage: a PCR-based gene inventory of Eptatretus stoutii. Molecular Phylogenetics and Evolution 32:686-694

Lotsa news

Hah, I open my Google Reader (damn you, Google, why do you have to kill it??? >_<), expecting to find maybe a handful of new articles since my last login, and instead getting both Nature and Science in one big heap of awesome. The latest from the Big Two are quite a treat!


By now, of course, the internet is abuzz with the news of all those four-winged birdies from China (Zheng et al., 2013). I’m a sucker for anything with feathers anywhere, plus these guys are telling us in no uncertain terms that four-wingedness is not just some weird dromaeosaur/troodontid quirk but an important stage in bird evolution. Super-cool.


Then there is that Cambrian acorn worm from the good old Burgess Shale (Caron et al., 2013). It’s described as being like modern acorn worms in most respects, except it apparently lived in a tube. Living in tubes is something that pterobranchs, a poorly known group related to acorn worms, do today. The Burgess Shale fossils (along with previous molecular data) suggest that pterobranchs, which are tiny, tentacled creatures living in colonies, are descendants rather than cousins of the larger, tentacle-less and solitary acorn worms. This has all kinds of implications for all kinds of common ancestors…


Third, a group used a protein from silica-based sponge skeletons to create unusually bendy calcareous rods (Natalio et al., 2013). Calcite, the mineral that makes up limestone, is not normally known for its flexibility, but the sponge protein helps tiny crystals of it assemble into a structure that bends rather than breaks. Biominerals would just be ordinary rocks without the organic stuff in them, and this is a beautiful demonstration of what those organic molecules are capable of!


And finally, Japanese biologists think they know where the extra wings of ancient insects went (Ohde et al., 2013). Today, most winged insects have two pairs of wings, one pair on the second thoracic segment and another on the third. But closer to their origin, they had wing-like outgrowths all the way down the thorax and abdomen. Ohde et al. propose that these wing homologues didn’t just disappear – they were instead modified into other structures. Their screwing with Hox gene activity in mealworm beetles transformed some of the parts on normally wingless segments into somewhat messed up wings. What’s more, the normal development of the same bits resembles that of wings and relies on some of the same master genes. It’s a lot like bithorax mutant flies with four wings (normal flies only have two, the hindwings being replaced by balancing organs), except no modern insect has wings where these victims of genetic wizardry grew them. The team encourage people to start looking for remnants of lost wings in other insects…

Lots of interesting stuff today! And we got more Hox genes, yayyyy!



Caron J-B et al. (2013) Tubicolous enteropneusts from the Cambrian period. Nature advance online publication 13/03/2013, doi: 10.1038/nature12017

Natalio F et al. (2013) Flexible minerals: self-assembled calcite spicules with extreme bending strength. Science 339:1298-1302

Ohde T et al. (2013) Insect morphological diversification through the modification of wing serial homologs. Science Express, published online 14/03/2013, doi: 10.1126/science.1234219

Zheng X et al. (2013) Hind wings in basal birds and the evolution of leg feathers. Science 339:1309-1312

“Same” function, but the devil is in the details.

Aaaaaand todaaaaay, ladies and, um, other kinds of people…. Hox genes!

Considering that I did my Honours project on them and I think they are made of awesome, I’m kind of shocked by the general lack of them here*. Hmmmmmm. Well, having just found Sambrani et al. (2013), I think today is a good time to do something about that.

Hox genes in general are “what goes where” type regulators of development. In bilaterian animals, they tend to work along the head to tail axis of the embryo. (Cnidarians like sea anemones also have them, but the situation re: main body axis and Hox genes in cnidarians is a leeeetle less clear. And heaven knows what sort of weird things happened with the rest of the animals.)

Hox genes are responsible for one of the peculiarities of the insect body plan. Unlike many other arthropods, insects have leg-free abdomens. On the left below is a poor little lobster with legs or related appendages all the way down (plus a bonus clutch of eggs). (Arnstein Rønning, Wikimedia Commons). To her right is a bland, boring insect abdomen (Hans Hillewaert, Wikimedia Commons).

As I said, Hox genes are responsible for the difference. Three of them are expressed in various segments of the abdomen of a developing insect: Ultrabithorax (Ubx), Abdominal-A and Abdominal-B. I’m going to whip out that amazing fluorescent image of Hox gene expression in a fruit fly embryo from Lemons and McGinnis (2006) because aside from being cool as hell, it also happens to be a good illustration:

(The embryo is folded back on itself, so the Abd-B-expressing tail end is right next to the Hox gene-free head)

In insects, all three can turn off the expression of the leg “master” gene distal-less (dll). However, they turn out to do so through two different mechanisms. Ubx and Abd-A proteins have long been known to team up with the distantly related Extradenticle (Exd) and Homothorax (Hth). With their partners, the Hoxes can sit on a regulatory region belonging to the dll gene and prevent its activation.

Sambrani et al. were curious whether Abd-B works in the same way. Sure enough, Abd-B also represses dll wherever it shows up. However, when it comes to interacting with Exd and Hth, differences start to emerge. For starters, those two aren’t even present in the rear end of the abdomen, where Abd-B does its business. When the researchers took the regulatory region of dll and threw various combinations of proteins at it, they found that (1) Abd-B is perfectly capable of binding the DNA on its own, (2) Exd, Hth or engrailed (another Hox cofactor) didn’t improve this ability at all, (3) Hth alone or in combination with the others actually inhibited the binding of Abd-B to the dll regulatory sequence.

Interestingly, dll repression in the anterior and posterior abdominal segments requires the exact same bits of regulatory DNA even though different proteins are involved. It looks like in the posterior segments, Abd-B actually takes over an “Exd” binding site – maybe that’s how it can do the job without getting Exd itself involved.

Furthermore, while the DNA-binding ability of Abd-B is crucial to its ability to kill dll expression, the same is not the case for Ubx. The authors speculate that cooperation with Exd and Hth kind of exempts Ubx from having to bind the regulatory sequences itself, while Abd-B, being on its own, can’t afford to slack off like that. The paper illustrates the idea with such a deliciously ugly pair of drawings that I feel compelled to post it:

(I know they’re going for colour-matching with the fluorescent images, but unfortunately glowy greens and reds that look good on a black background kind of just hurt my eyes on white.)

I don’t really have a point to make here. (There doesn’t always have to be a point, right?) There’s absolutely nothing surprising about the fact that different Hox genes evolved the same overall function in different ways –  after all, they existed as separate entities long before insects lost their buttward legs. I just think Hox genes are cool, and this was an interesting look into the nuts and bolts of how they work. And that’s that.



*Well, aside from this one I’ve written three posts about them and a couple more where they are mentioned. That’s maybe not that bad considering how many different things I’m interested in.



Lemons D & McGinnis W (2006) Genomic evolution of Hox gene clusters. Science 313:1918-1922

Sambrani N et al. (2013) Distinct molecular strategies for Hox-mediated limb suppression in Drosophila: From cooperativity to dispensability/antagonism in TALE partnership. PLoS Genetics 9:e1003307

The origin of Hox genes: a telltale neighbourhood

Gods, it’s been so hard to keep my mouth shut about this. A friend of mine just published a paper about Hox genes, and I’ve known about it for a while and it’s been keeping me crazy excited because it’s fascinating and, well: Hox genes! Now that it’s finally out, I can blather about it to my heart’s content, and so I will. Be prepared for a long ride 😉

First of all, a quick rundown of Hox genes for those who aren’t evo-devo geeks. These genes encode transcription factors – proteins that switch genes on/off. They are members of the large and distinguished class of homeobox genes, many of which play important roles in orchestrating embryonic development. Hox genes in particular are famous for laying out the plan for the head to tail axes of bilaterian animals, and for often sitting in neat clusters in the genome and being expressed along the body axis in the same order they are in the cluster. (Below: one of my favourite scientific figures ever, a fruit fly embryo stained in different colours for each of its Hox genes*. From Lemons and McGinnis [2006] via Pharyngula) In short, Hox genes are fucking awesome and extremely important to boot.

Tracing origins

One of the unresolved questions about Hox genes is exactly where they come from, and the new study draws some interesting conclusions regarding their origins. Before we delve into Mendivil Ramos et al. (2012) itself, perhaps it’s best to pull out my old sketch of animal phylogeny, because the relationships of the great old animal lineages are kind of important for the discussion. So this is the family tree of animals at first approximation (photos were all sourced from Wikimedia Commons; more info about them in my Nectocaris post):

Mendivil Ramos et al. follow one of the more popular resolutions of the question marks, in which cnidarians are closest to bilaterians and placozoans are the sister group to cnidarians+bilaterians. They don’t talk too much about ctenophores, but I’ll return to that later 🙂

Bilaterians all have Hox genes, and in most of them they do what they were originally discovered doing in fruit flies: patterning the anterior-posterior axis as they say in Jargonese. Some bilaterians have duplicated individual genes or even whole Hox clusters (we have four clusters, and salmon have as many as 13), but it’s pretty uncontroversial that a neat Hox cluster with representatives of most existing types of Hox genes was present already on the left side of the bilaterian box. So was the little sister of the Hox cluster, unimaginatively called the ParaHox cluster, which only contains three kinds of genes but operates in a similar way to its more famous sister (Brooke et al., 1998).

Where did Hox and ParaHox genes come from? Given the phylogeny of the genes, it’s likely that there was originally a small (maybe 2-3 genes) ProtoHox cluster that duplicated to give rise to both Hoxes and ParaHoxes. We know that cnidarians like sea anemones have both Hox and ParaHox genes, which behave somewhat like their bilaterian counterparts (Ryan et al., 2007). Therefore, the ProtoHox cluster must have existed before the common ancestor of these two great lineages.

Enter the Blob

What about placozoans? That’s where things get a bit complicated. Trichoplax, the mysterious little blob that is the only living representative of this oddball phylum, has only one Hox-like gene noncommittally named Trox-2. A relic of the ProtoHox era? Not really – in phylogenetic analyses of the protein sequence, it tends to group with the ParaHox gene Gsx, whereas you would expect a leftover ProtoHox gene to remain outside the Hox+ParaHox clique.

Is Trox-2 a ProtoHox gene anyway? That would mean something weird happened in the evolution of Hox and ParaHox genes after the cluster duplication: Gsx (and its sisters Hox1-2) would have stagnated somewhere near its ancestral condition while all the other genes sped ahead. It’s a long shot, but evolution has been known to do strange things to gene sequences. Also, homeobox genes are often difficult to classify by sequence alone. Scientists typically use the DNA-binding region that the homeobox encodes for this purpose, but a homeodomain is only 60 amino acids and simply doesn’t contain enough information to place some problematic sequences. And unless we’re examining very closely related genes, the rest of the protein sequence is too different to be compared.

Guilt by association

However, there is another way of solving the mystery. Hox and ParaHox genes are not alone in the genome. They sit on huge chromosomes, and while they tend to banish non-*Hox genes from among them, the flanks of each cluster are populated by a variety of unrelated genes. The key thing is that Hox clusters and ParaHox clusters have different neighbours. Thus, looking at a problem gene’s neighbours can tell us what it is!

(Above: the neighbours of Trox-2. Yellow genes are ParaHox neighbours in humans, green genes are Hox neighbours, grey genes have no human counterparts, and orange genes are parts of both Hox and ParaHox neighbourhoods. From Mendivil Ramos et al. [2012])

This is exactly what happened. My lovely friend Olivia looked at the chunk of genomic sequence that contains Trox-2 and found about two dozen genes on it that had clear homologues in humans. She then tallied where each of the human homologues were, and behold: many of them crowded around ParaHox clusters (we also have several of those, courtesy of whole genome duplications), while only one was a Hox neighbour in humans. If Trox-2 were a ProtoHox, we’d expect a mixture of Hox and ParaHox neighbours, but that’s not what we find at all. Statistically speaking, it’s a no-brainer. Trox-2 is exactly where a ParaHox gene should be.
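To get a feel for why a lopsided tally like that is a no-brainer, here is a back-of-the-envelope Python sketch. Everything in it is my assumption rather than the paper's method: I pretend that a ProtoHox neighbourhood would give a 50/50 mix of Hox and ParaHox neighbours, and I plug in made-up counts (only the "just one Hox neighbour" part comes from the study).

```python
from math import comb

def prob_at_most(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Hypothetical tally: 1 Hox neighbour among 11 genes that are either
# Hox or ParaHox neighbours in humans (illustrative numbers only).
hox_neighbours, classified = 1, 11

# Chance of seeing this few Hox neighbours under the assumed 50/50 mix.
p_value = prob_at_most(hox_neighbours, classified)
print(f"P(<= {hox_neighbours} Hox neighbours of {classified}) = {p_value:.4f}")
```

With these toy numbers the probability works out to about 0.006, so an even mixture would be a poor explanation for the neighbourhood. The real analysis may well use a different null model.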

Ghosts in the genome

Now, we have a problem. If Trox-2 is a ParaHox gene, it must have come after the Hox/ParaHox duplication. So where the hell is the Hox cluster? Well, seeing as Trichoplax only has one ParaHox gene instead of the more typical three or so, gene loss certainly sounds like a possibility. Is there an “empty” Hox cluster lurking somewhere in the blob’s genome? Here, cnidarians turn out to be pretty helpful. After sequencing the genome of the sea anemone Nematostella vectensis, Putnam et al. (2007) attempted to reconstruct parts of the original chromosomes of the cnidarian-bilaterian ancestor. They called the results Putative Ancestral Linkage Groups (PALs), in other words, groups of genes that have stayed together since cnidarians and bilaterians diverged 600 or so million years ago.

One of these PALs contains over 200 conserved Hox neighbours, nearly all of which are present in Trichoplax. Strikingly, about half of them are close enough to one another that they are in the same chunk of sequence even though the Trichoplax genome hasn’t been stitched together to the level of whole chromosomes. That’s much more than you’d expect by chance. Trichoplax has a Hox locus without Hox genes, what Mendivil Ramos et al. call a ghost Hox locus.

Hox genes all the way down?

If you followed so far, you might have noticed that we’ve been pushing that elusive ProtoHox further and further back in animal evolution. It preceded bilaterians, it preceded cnidarians and bilaterians, and now it turns out it also preceded our split from placozoans. Will we find it if we look in the remaining animal lineages? Since a ctenophore genome hasn’t yet been released to the public, that question transforms into: will we find it in sponges?

The sponge Amphimedon queenslandica does have a publicly available genome, and much has been made of its apparent lack of many developmentally important transcription factor families (e.g. Larroux et al., 2008). It doesn’t have anything that looks like a Hox, ParaHox or ProtoHox gene, but what about the neighbourhoods?

Like that of Trichoplax, the Amphimedon genome sequence is in relatively small pieces, so a little clever statisticking was needed to decide whether it contains Hox, ParaHox or ProtoHox neighbourhoods. The starting points were the PAL of Hox neighbours mentioned above, and a PAL of ParaHox neighbours the team constructed using the human and Trichoplax genomes. These genes were distributed among many genomic scaffolds, but of course, lacking chromosome-level information, the group didn’t know whether any of these scaffolds were actually linked to each other in the sponge genome.

The solution was a simulation: take the number of genes in the PAL, take the number and size (in number of genes) of the thousands of Amphimedon scaffolds, and scatter the PAL members randomly among the scaffolds with the larger scaffolds proportionately more likely to receive a PAL gene. When all the PAL members are handed out, count the number of scaffolds with PAL members on them. Repeat this a thousand times, and you get an idea what the distribution of Hox and ParaHox neighbours would be if they weren’t clustered together. This approach showed that the real distribution is anything but random. Hox and ParaHox neighbours are clearly clustered in the sponge genome, and what’s more, they are clustered separately.
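
For the curious, the scattering procedure can be sketched in a few lines of Python. This is a minimal sketch of the idea as described above, not the authors' actual code: the function names, the toy scaffold sizes and the "observed" count are all made up for illustration, and I'm assuming each PAL gene lands on its own gene slot (sampling without replacement).

```python
import random

def simulate_occupancy(n_pal_genes, scaffold_sizes, n_reps=1000, seed=0):
    """Scatter PAL genes at random; return the occupied-scaffold count per run."""
    rng = random.Random(seed)
    # One entry per gene slot, labelled with the scaffold it sits on, so that
    # bigger scaffolds are proportionately more likely to receive a PAL gene.
    slots = [idx for idx, size in enumerate(scaffold_sizes) for _ in range(size)]
    counts = []
    for _ in range(n_reps):
        # Assumption: no two PAL genes share a slot (sampling without replacement).
        chosen = rng.sample(slots, n_pal_genes)
        counts.append(len(set(chosen)))  # scaffolds with at least one PAL gene
    return counts

def empirical_p(observed, null_counts):
    """Fraction of random runs that use as few scaffolds as the real genome, or fewer."""
    return sum(c <= observed for c in null_counts) / len(null_counts)

# Hypothetical toy input: 10 large scaffolds and 300 small ones, sizes in genes.
toy_sizes = [40] * 10 + [3] * 300
null = simulate_occupancy(60, toy_sizes)

# Hypothetical "real" observation: the 60 PAL genes sit on only 12 scaffolds.
toy_observed = 12
print(empirical_p(toy_observed, null))
```

If the real genome packs its PAL genes onto far fewer scaffolds than nearly all of the random runs do, the fraction returned by `empirical_p` is small, and that is what clustering looks like in this setup.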

Still no ProtoHox locus, in other words. At some point in the murky depths of their ancestry, sponges lost bona fide Hox and ParaHox genes!


That raises a couple of issues. First, where is the ProtoHox? Hox-like genes have never been found outside animals. These are smart people we’re talking about, so they checked the genome of the closest non-animal relative we have today, a choanoflagellate. Neither Hox/ParaHox nor ProtoHox neighbourhoods were there – the PAL genes didn’t cluster together any more than they would by chance. The whole *Hox phenomenon seems unique to animals (or else the choanoflagellate genome is totally scrambled). It appears that somewhere in our ancestry, ProtoHox gene(s) appeared and parted ways before sponges split from the rest of the animals. Since we have no surviving descendants of these ancestors outside of sponges and the rest of the animals, we’ll probably never find unduplicated descendants of the ProtoHox cluster.

Second, what happened in ctenophores? Everything we know about their genomes suggests that they completely lack Hox-like genes. Although there have been studies that placed them even further out than sponges (Dunn et al., 2008), it’s more likely that they are much closer to bilaterians than that (Philippe et al., 2011). I think I’m not the only one itching to examine a ctenophore genome for Hox neighbours…

And finally, if some distant ancestor of all animals had full-blown Hox and ParaHox clusters, what the heck was it doing with them? Was it something unexpectedly complex that would need genes for axial patterning? Are sponges and placozoans grossly simplified descendants of a much more complex ancestor, or did Hox-like genes only become involved in dividing up body axes later in evolution?

The more we learn the less we know. One thing is (once again) clear: assuming that a simple animal is a good proxy for an ancestral animal is a dangerous, dangerous assumption to make.


*Technically, fruit flies have twelve Hox genes, but only seven are shown in the image. Hox2/proboscipedia is a normal Hox gene involved in the development of mouthparts among others, but four more genes have completely lost their “canonical” Hox gene-like activities. That includes all three of Drosophila’s weird triplicated Hox3 genes.



Brooke NM et al. (1998) The ParaHox gene cluster is an evolutionary sister of the Hox gene cluster. Nature 392:920-922

Dunn CW et al. (2008) Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452:745-749

Larroux C et al. (2008) Genesis and expansion of metazoan transcription factor gene classes. Molecular Biology and Evolution 25:980-996

Lemons D and McGinnis W (2006) Genomic evolution of Hox gene clusters. Science 313:1918-1922

Mendivil Ramos O et al. (2012) Ghost loci imply Hox and ParaHox existence in the last common ancestor of animals. Current Biology in press, available online 26/09/2012, doi: 10.1016/j.cub.2012.08.023

Philippe H et al. (2011) Resolving difficult phylogenetic questions: why more sequences are not enough. PLoS Biology 9:e1000602

Putnam NH et al. (2007) Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317:86-94

Ryan JF et al. (2007) Pre-bilaterian origins of the Hox cluster and the Hox code: evidence from the sea anemone, Nematostella vectensis. PLoS ONE 2:e153

Hexapeptide or bust!

I have a big soft spot for Hox genes, or rather, Hox proteins. Thanks to some of my earlier work, I also have a soft spot for all their secret little sequence motifs that help them interact with other proteins and help us classify them (e.g. Balavoine et al., 2002). Probably the best-known such motif is the hexapeptide. (“Hexa-” would kind of imply that it’s made of six amino acids, but people only ever seem to talk about four. Don’t ask me why they call it a hexapeptide…)

This motif is very widespread, occurring not just in Hox proteins but also in many others in the larger class of homeodomain proteins that Hoxes belong to. For many years, the hexapeptide has been regarded as the key to the interaction of Hox proteins with another homeodomain-bearing protein, called Extradenticle in flies and Pbx plus a number (we have 3 of them) in vertebrates*. (Above is a cartoon version of DNA with the homeodomains – the purple curls – of Exd and the fly Hox protein Ubx bound to it, from the Protein Data Bank.) Hox proteins bind DNA to regulate various genes, and are absolutely vital for an embryo to develop the right organs in the right places. Teaming up with Exd/Pbx changes their DNA-binding behaviour, making the hexapeptide possibly the most important four amino acids in animal development.

And now we’re supposed to scrap that?

I’ve just skimmed through Hudry et al. (2012), and died a little inside.

The study claims – on what seems to be good evidence – that the hexapeptide is not all it’s cracked up to be. Out of six fruit fly Hox proteins examined, only two stop interacting with Exd when the hexapeptide is mutated beyond recognition, and even then one of them is kind of half-hearted about it. The team also tested a few mouse Hox genes in cultured cells and – for whatever reason – chick embryos, and got largely the same results.

I rather like their approach, though. I think the method for detecting interaction is incredibly clever, though it’s clearly not something they invented. The idea is based on fluorescent proteins. These are very commonly used to track the levels and whereabouts of other proteins. Since they are pretty small and innocuous, the gene encoding them can be tacked onto the gene of interest, and the resulting protein chimaera will do whatever the target protein would do without its fluorescent companion. The only difference is now it glows wherever it goes. The more protein, the brighter the glow.

The nice thing about fluorescent proteins is that you can cut them in half, and if the two parts get close enough, they’ll still glow. Therefore, if you glue one half of the DNA for a fluorescent protein to gene 1, the other half to gene 2, and let them loose in the same cell, you can tell whether the protein products of gene 1 and gene 2 interact just by looking for the telltale fluorescence. And the tale it’s telling is that all these Hox proteins are getting snug with Exd despite the loss of the motif supposedly necessary for the interaction.

I’ll just go away and quietly get over that now.


*Vertebrate geneticists have no imagination. Okay, they did come up with lunatic fringe and Sonic hedgehog. After the fly people named fringe and hedgehog.



Balavoine G et al. (2002) Hox clusters and bilaterian phylogeny. Molecular Phylogenetics and Evolution 24:366-373

Hudry B et al. (2012) Hox proteins display a common and ancestral ability to diversify their interaction mode with the PBC class cofactors. PLoS Biology 10:e1001351