The Mammal’s very own Hox genes (excite! Woo!)

It’s kind of hard to begin this post. First of all, let’s get the important news out of the way: I’ve just published a paper. In a moment, I’ll get around to discussing it at even more than my usual length, but I feel that I can’t do my excited puppy act without at least trying to capture how bloody much this paper means to me. The following may get a little personal; if you want to jump straight to the Cool Stuff, feel free to scroll a couple of paragraphs down.

<personal bit>

As you may have guessed from the long silence here, it’s not been a good handful of years, Real Life and mental health-wise. After my PhD, the prospect of the research career I’d dreamed of since I first began to grasp the meaning of the word “scientist” no longer seemed so dreamlike. It may surprise you to hear this from someone who finished a PhD with four published papers and spent the years of said PhD blathering regularly on the internet, but I find writing things for other people to read very, very stressful. In the case of a job application or a thesis chapter, that becomes “I’m not eating or sleeping properly” stressful. (Don’t ask me how I survived 20+ years of formal education.)

Long story short, for the last 3 years I’ve been getting by with a minimum wage job for which I’m both vastly overqualified and singularly ill-suited. I started the research project that culminated in the paper you can now read (for free, yay!) in BMC Evolutionary Biology (Szabó and Ferrier, 2018) while unemployed and broke, and I did most of it in my free time around work. This paper is a hard-won victory over myself and my circumstances. It’s a tiny glint of self-worth in the depth of the tunnel. In some ways, it was harder than my thesis: no funding body to satisfy, no lab mates to gripe at, no deadlines to spur me on. The only constant was my ex-supervisor turned co-author, who took my hobby project under his wing for the slim reward of having his name on a paper and, with unending patience, nudged me into finishing it. Here’s to Dave Ferrier, champion of non-model organisms, homeobox guy extraordinaire and all-round excellent human being. Dave, I hope you know you’re an absolute star.

</personal bit>

With that out of the way, it’s time for the Cool Stuff. There are Hox genes! More Hox genes than anyone ever imagined! (That is kind of the point, in fact!)

Apologies for the word count. I thought it would be a good idea to explain a few things, but also, I think I enjoy waffling about my baby far too much 😊

Hox therapy

The story of my Hox paper begins with an unemployed biologist with an overabundance of free time and a desperate need to do something scientific. Since I have a slightly odd idea of “fun”, back in 2015 I decided to catalogue Hox gene (or rather, protein) diversity in the animal kingdom, with particular focus on obscure and poorly studied groups. (I didn’t get very far, as we’ll see.)

Since it’s hard to discuss the paper without dropping some arcane zoological nomenclature, here’s my trusty old animal phylogeny to (re)acquaint us with the general outlines of the animal kingdom (I might need to update this in light of the Great Ctenophore Controversy some day, but we’re not dealing with anything outside the Bilateria today):

[Figure: animal phylogeny]

For the purposes of my paper, we’re zooming into the deuterostome branch, which looks something like this on the inside (borrowing my own rather lacklustre last-minute figure from Szabó and Ferrier [2018]):

[Figure: deuterostome phylogeny, Fig. 1 of Szabó and Ferrier (2018)]

Everything on this tree apart from chordates (that’s us) belongs to a group called Ambulacraria, which contains two phyla, hemichordates (top two branches) and echinoderms (the next five). Echinoderms are the more familiar of the two – starfish and sea urchins and suchlike – and also the focus of my project. (I could find no Hox gene data from pterobranchs, which puts a slight caveat on everything I say about hemichordates.)

Back to Hox genes.

Hox genes were kind of my gateway drug into evolutionary developmental biology. A few decades earlier, they had served the same purpose for developmental biology as a whole, since they were among the first genes to be discovered that (1) directed embryonic development and (2) were comparable between very disparate animal groups. The short version, which will suffice for our purposes here, is that Hox genes are important in what we eggheads call anteroposterior patterning, or determining what body parts go where along the head (anterior) to tail (posterior) axis of a (bilaterian) animal.

In (I think, I haven’t counted) the majority of animals that have them, Hox genes are clustered to a greater or lesser extent. Rather than being scattered haphazardly across the genome, they sit close to one another along the same stretch of DNA. (Duboule [2007] is an excellent – albeit now slightly out of date – review of the various known configurations.)

Since my study is about echinoderms, the schematic Hox cluster shown below is the neatest known example from an echinoderm, the crown-of-thorns starfish Acanthaster planci (source: Baughman et al., 2014):

[Figure: schematic Hox cluster of Acanthaster planci, from Baughman et al. (2014)]

In this image, Hox genes are colour-coded according to a commonly used classification scheme. This classification is mostly based on the homeodomain, or the “business” end of the protein that a Hox gene encodes. A homeodomain makes up a relatively small portion (maybe 1/5th on average) of a typical Hox protein, but it’s the part that interacts with the DNA switches through which Hoxes control their target genes, and it’s often the only part that is similar enough to be compared between different Hox types.

The important genes for us today are the “posterior” Hox genes shown in pink and red above, especially the last two. The four posterior Hox genes seen here represent the “standard” set for ambulacrarians, although it’s uncertain whether Hox11/13b-c were already separate genes or just a single precursor gene in the ambulacrarian ancestor.

Eureka… or WTF?

“The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, ‘hmm… that’s funny…’”

Almost certainly not Isaac Asimov

In creating my grand catalogue, I’d quickly breezed through vertebrates (which are all essentially the same for my purposes) and other chordates (for which the data I could find were rather limited). I thought echinoderms would be an easy job, too: there were good in-depth studies of a few species, and they hadn’t revealed anything terribly unusual other than a rearrangement of the Hox cluster in sea urchins (Cameron et al., 2006).

In fact, through comparison with their sister group, the hemichordates (Freeman et al., 2012), it seemed likely that the ancestral echinoderm had a nice, ordered Hox cluster with few if any oddities (Baughman et al., 2014). So I clicked my way to the wonderful Echinobase, which has searchable draft genomes from four of the five living classes of echinoderms (crinoids, a.k.a. sea lilies and feather stars, are missing, although a genome in a very early, fragmentary stage exists here). I expected to double-check the published data, collect the same genes from the groups for which Hox papers hadn’t been published, and be off to protostomes in a day or two. Two years later, I still haven’t made it to protostomes, but I’ve gone rather deeper than expected in echinoderms…

(Below: my cast. The main characters are Strongylocentrotus purpuratus [photo: Kirt L. Onthank] and Lytechinus variegatus [photo: Hans Hillewaert] representing sea urchins, Patiria miniata [photo: Jerry Kirkhart] and Acanthaster planci [photo: JSLUCAS75] for sea stars, Parastichopus parvimensis [from here] and Apostichopus japonicus [photo: OpenCage] for sea cucumbers, Metacrinus rotundus [photo: OpenCage] and Anneissia japonica [photo: OpenCage] for crinoids, Ophiothrix spiculata [photo: Jerry Kirkhart] for brittle stars, with supporting acts from Peronella japonica [sea urchins, photo: Endo et al., 2018], Ophiopsila aranea [brittle stars, photo: Bernard Picton], Balanoglossus simodensis [photo: Misaki Marine Biological Station, U of Tokyo], Saccoglossus kowalevskii [photo: Lowe lab] and Ptychodera flava [photo: Moorea BioCode via CalPhotos] for hemichordates, and Branchiostoma floridae [photo via JGI genome portal], Latimeria menadoensis [photo: Claudio Martino] and Callorhinchus milii [photo: fir0002/Flagstaffotos] for chordates. I sourced the photos through Wikipedia/Wikimedia Commons where I could; other sources are linked where applicable.)

[Image: photos of the cast listed above]

You see, I didn’t want to stop at just homeodomains. Homeodomains are cool and important and all, but one thing I’d learned from my earlier forays into the world of Hox genes was that valuable information hid in small patches of conserved sequence elsewhere in their proteins. Besides, I am a pathological perfectionist. I felt a terrible need to collect complete Hox sequences wherever possible.

I already mentioned that sequence similarity between Hoxes outside the homeodomain can be weak to non-existent. I ran into this problem with Echinobase’s brittle star, Ophiothrix spiculata. Using the known sea urchin Hoxes to search its genome, I’d found believable matches for many of them, but the 11/13s defeated me. I had two homeodomains that I thought represented 11/13b and c, but I couldn’t for the life of me recover the rest of the proteins.

The problem with genome databases (or their great advantage depending on your perspective) is that they contain all of the DNA that could be sequenced from the owner of the genome. The problem with Hox genes – most of our genes, in fact – is that they aren’t continuous stretches of DNA. Your typical gene exists in multiple segments (exons) separated by a whole lot of DNA that leaves no trace in the protein product of the gene. (Hox genes normally have two or three exons, the first of which is devoid of homeodomain parts.)

When a gene is expressed, the cell first makes an RNA copy of all that, which is edited to throw out the introns and splice the exons together. That intron-less RNA copy is then carried off to be translated into a protein. Transcriptomes are derived from the RNA copies of active genes. Introns lie forgotten on the cutting room floor: in the sequenced transcripts, one exon continues straight into the next. Therefore, if I could find a brittle star transcriptome, and the 11/13b-c homeodomains in it, perhaps there would be enough of the rest in there to reconstruct those elusive first exons.
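For the programming-minded, here is a minimal sketch of why a transcript is so much friendlier to search than a genome. The sequence and exon coordinates are made up for illustration (this is not real Hox data), and `splice` is a hypothetical helper of my own, not a function from any bioinformatics library.

```python
from __future__ import annotations

# Toy gene: two exons separated by one intron.
# The sequence and coordinates are invented, not real Hox data.

def splice(genomic: str, exons: list[tuple[int, int]]) -> str:
    """Join exon slices (0-based, end-exclusive) in order, discarding introns."""
    return "".join(genomic[start:end] for start, end in exons)

# Uppercase = exon, lowercase = intron (just a display convention).
genomic = "ATGGCC" + "gtaagtttttcag" + "CGTAAA"
exons = [(0, 6), (19, 25)]

transcript = splice(genomic, exons)
print(transcript)  # ATGGCCCGTAAA
```

In the genome, the two exons are separated by intron sequence that a protein-based search has to jump across; in the transcript, they sit right next to each other.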

Luckily, Delroisse et al. (2016) had published exactly what I needed. In one of their transcriptomes, I found a homeodomain that looked like my Ophiothrix Hox11/13c, as part of a near-complete sequence. Excited, I did the reciprocal search against the Ophiothrix genome…

… and hit neither 11/13b nor 11/13c.

So here I am, staring at a beautiful match between this transcript and a part of the Ophiothrix genome that I hadn’t examined before. The match contains sequence from the first exon, which, given my previous experience with these buggers, is a sure sign that they’re the same gene. And it’s neither of the ones I’d expected.
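The logic of that reciprocal search can be sketched in a few lines. Everything here is a toy: the names and scores are invented, and in real life the scores would come from a sequence similarity search tool rather than a hand-typed dictionary.

```python
# Toy reciprocal search. All names and scores are invented for illustration.

def best_hit(query, scores):
    """Return the target with the highest score for this query."""
    hits = scores[query]
    return max(hits, key=hits.get)

# Forward search: a genomic homeodomain used as a query against transcripts.
genome_vs_transcripts = {
    "genome_11/13c_homeodomain": {"transcript_A": 95, "transcript_B": 60},
}

# Reverse search: the best transcript used as a query against the genome.
transcripts_vs_genome = {
    "transcript_A": {
        "genome_11/13b_homeodomain": 70,
        "genome_11/13c_homeodomain": 88,
        "genome_unexamined_region": 140,
    },
}

forward = best_hit("genome_11/13c_homeodomain", genome_vs_transcripts)
reverse = best_hit(forward, transcripts_vs_genome)
is_reciprocal = reverse == "genome_11/13c_homeodomain"
print(forward, reverse, is_reciprocal)
```

If the transcript really were 11/13c, the reverse search would land back where it started. In this toy version, as in my actual search, it lands somewhere else entirely.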

A bit later in a different database, I hit upon an automatically predicted sea urchin protein that definitely isn’t 11/13b or c either. This is the model sea urchin, S. purpuratus, the one I thought we knew inside out when it came to Hoxes. I check the genome on Echinobase, and lo and behold, there’s the third 11/13b-c type gene, and it’s nowhere near the Hox cluster.

If memory serves, it’s roughly at this point that the words, “What. The. Actual. Fuck. Is. Going. On.” occur in my research notes. (Complete with punctuation.)

I checked the other species on Echinobase. Three 11/13b-c genes again, every time. Over on GenBank, I found a complete protein sequence from a sand dollar that Tsuchimoto and Yamaguchi (2014) had previously classified as 11/13c by exclusion. The Japanese duo had a clear b, but this other sequence was behaving oddly in their phylogenetic analyses. Now I had the obvious explanation: it wasn’t 11/13c at all.*

I wrote to Dave and found out that this was also news to him. By all appearances, I had stumbled on something truly new, in a gene family that’s both iconic in our field, and dear to my obsessive little heart.

We decided to try to turn it into a paper.

In search of the alphabet’s end

Once we’d made that decision, and following Dave’s advice, I had a few tasks ahead of me. I had to check how far back in evolution our new gene (which we called Hox11/13d) went. I had to test whether it had truly escaped the Hox cluster in all of our study species. I had to refresh my memory on deuterostome posterior Hox genes in general, both for paper-writing purposes and in case there was a forgotten reference to our “new” gene lurking somewhere in the literature.

There wasn’t, but.

In a figure legend in Thomas-Chollier et al. (2010), there is a brief mention of an unnamed “Hox11/13c-like” sequence in sea urchins. When I saw that, I damn near soiled myself, but the authors couldn’t definitively identify this sequence as a Hox gene, so they left it at that throwaway comment and a few bits of supplementary data. Luckily, they had a gene ID that I could look up on Echinobase.

Gods help me, it turned out to be another new Hox. When the shock of Hox11/13d had barely worn off, I was confronted with a possible Hox11/13e. And this one wasn’t in the Hox cluster either.

Aside from not being part of the Hox cluster, Hox11/13d is a pretty good echinoderm Hox gene. The homeodomain it encodes is reminiscent of Hox11/13b and c, and, although they are hard for automated searches to find, there are similarities outside the homeodomain that place it firmly in the same group as b-c.

Unlike d, Thomas-Chollier’s “11/13c-like” sequence isn’t that 11/13c-like at all, as you might have guessed from the fact that they weren’t even sure it’s a Hox. The region immediately following the homeodomain (sometimes known as the C-peptide) is very similar to the same part of Hox11/13d. These kinds of motifs can sometimes be used to tell different Hox genes apart. Two C-peptides being strongly similar is a clue that we’re dealing with related genes. However, the homeodomain of Hox11/13e, as we indeed dubbed Thomas-Chollier’s sequence, is really, really weird. It isn’t just unlike 11/13c, it’s unlike anything else I’d seen before. It groups with posterior Hoxes when we test it against a variety of homeodomains, but you wouldn’t know that simply from looking at it.

It is, however, an oddball with a history. As strange as that homeodomain is, once I knew what I was looking for, I found examples in all my other echinoderms. This combination of strong conservation of one Hox gene with considerable differences from other Hox genes just screams “study me more!”, especially when you realise that Hox11/13e appears to be limited to echinoderms (unless something like it is hiding in protostomes…). I looked quite carefully in the hemichordates available to me (Simakov et al., 2015), but the only thing I found that wasn’t one of the “canonical” four posteriors is something called “Abdominal B-like”, which is weird in its own way and not obviously connected to either of our two new genes.

Tangled histories and unhelpful clues

I alluded to the question of Hox11/13b-c origins earlier on. Posterior Hox genes in deuterostomes are notoriously difficult to classify (Ferrier et al., 2000; Thomas-Chollier et al., 2010). When you try to use traditional tree-building methods on them, you get a big unresolved mess, as if the twigs on the tree emerged from an impenetrable mist that hides the arrangement of the older branches from view. Ambulacrarians are definitely the better-behaved half of the Deuterostomia in this regard, since we can say with some confidence that Hox9/10, 11/13a and at least a single precursor to 11/13b-c were present in their last common ancestor.

Nonetheless, two new genes, at least one of which is clearly close to 11/13b-c, complicate matters (Abdominal B-like, as they say in scientist-speak, is beyond the scope of this work). Were they lost in hemichordates? Did echinoderms undergo extra gene duplications, and if so, was it from one or two ancestral genes? Where on earth does Hox11/13e fit? I did a lot of exploratory tree-building for this paper, none of which was particularly helpful in answering those questions.

My other hope was to look at the parts of the protein sequence that led me to my new Hoxes in the first place: all the stuff other than the homeodomain. Using a program called MEME, I found a fair few conserved motifs, but they only seemed to add to the confusion. Hox11/13e, for which I only had first exons (and tentative ones at that) from sea urchins and sea stars, yielded nothing of use apart from its striking C-peptide. In the others, the distribution of motifs created a patchwork of similarities that didn’t neatly align with any one possible history. Echinoderm Hox11/13c mostly did its own thing, while b and d each shared a different subset of motifs with one or both of the hemichordate b-c proteins.

I’m almost inclined to think that there was a single, “prototype” Hox11/13b+ sequence in the ambulacrarian ancestor, which contained all of the motifs I found. In that scenario, separate b and c (and d and maybe e) genes would have evolved independently in hemichordates and echinoderms, and each descendant gene would have lost some of the original motifs more or less at random. Duplicated genes can split the functions of their single ancestor between them (Force et al., 1999), so why not motifs? Short sequence motifs like the ones I was looking for can have important functions, after all. It’s a possibility, but we may never know for sure.

Hox genes gone rogue

I mentioned before that Hox11/13d was outside the Hox cluster. Well, so is Hox11/13e. As far as I can tell, Hox11/13d and e always reside on separate chunks of the genome from any other Hox gene, including each other. They are always accompanied by neighbouring genes that aren’t Hoxes. Although detachment of a posterior gene from an otherwise apparently intact Hox cluster also happened in ragworms (Hui et al., 2012), it’s still a surprise in echinoderms. Since the relationship between the organisation of Hox genes and their regulation in space and time is… kinda complicated, we can’t really tell what, if anything, all this wandering implies without actually looking at some gene expression.

What are they for?

Then there’s the question of what on earth these genes do. Thanks to Tsuchimoto and Yamaguchi (2014), we know that Hox11/13d is active in later embryonic stages of some sea urchins. It even looks like it might be working with Hox11/13b in a Hox-like fashion, the two of them having adjacent expression domains. We have some transcriptomic evidence that this gene is also active in other sea urchins, brittle stars and starfish, but no idea what it’s doing in any of the above.

We know even less about Hox11/13e. The only evidence for expression I’m aware of is from starfish testicles, and testicles will express any old piece of DNA with an “on” switch. If it’s somehow involved in development, it must be either at very low levels that are difficult to capture in a transcriptome, or at developmental stages that weren’t included in the data I encountered.

If it does have a role in adult echinoderm development, that would be crazy exciting, as both adult echinoderm anatomy and Hox11/13e are so weird and unique. Although they develop from bilaterally symmetrical larvae, adult echinoderms have dispensed with the symmetry that gave Bilateria its name. Instead, like a sea anemone (or a regular anemone…), they are radially symmetrical. Hox genes are involved in both larval and adult development in echinoderms, but from what little I’ve been able to glean from the existing literature, it’s different subsets in larvae and adults rather than the entire Hox cluster together. Is Hox11/13e in the “adult” subset, missed until now due to its unusual sequence? I really hope someone with a lab and a ready supply of baby echinoderms investigates in the near future…

A lesson about expectations

I could go on for a lot longer about this project, but it’s probably time to form some sort of conclusion. For me, perhaps the most important take-home message of this adventure is not what I found, but how and where and why I found it.

I didn’t set out to discover anything. All I wanted to do was collect and organise information already out there. (If a genie popped out of my desk lamp, I might just wish for a full-time job where I get to build my Hox directory… given the volume of genome data already out there and coming out every time I look, continuing this as a hobby project in my free time seems hopelessly Sisyphean now.)

The discovery of Hox11/13d and all that followed was an accidental side effect of my penchant for perfectionism. If I’d contented myself with the homeodomains most students of Hox evolution focus on, I would never have seen a Hox that wasn’t in the books, a Hox I hadn’t expected to exist.

Expectations are important. I’d told myself that I wanted to make sure I had everything, but when my searches spat out a hundred different results, I started to slack off soon after I ticked off the Hoxes I knew. I gave the rest of the hit list a half-hearted effort at best. Hox11/13d has a homeodomain that’s split across two exons, and Hox11/13e is weird. In a search that scores both the closeness and the length of a match, that pushes them to the bottom of the results, where a casual observer, or an observer who thinks they know what they’re looking for, will most likely miss them. I thought I knew that sea urchins had a single, intact(ish) Hox cluster with 11 genes. I’d read a pretty good paper on it. Only the paper wasn’t quite right, after all.
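Here is a toy illustration of how that ranking effect works. The score is simply aligned length times fraction identical, which is far cruder than what real search tools use, and all the lengths and identities are invented. The point is only that a match broken into pieces sinks down the list even when each piece is a closer match.

```python
# Toy ranking: score = aligned length x fraction identical.
# Real search tools score matches differently; all numbers are invented.

hits = [
    # (label, aligned length in amino acids, fraction identical)
    ("known Hox, intact homeodomain", 60, 0.70),
    ("split homeodomain, exon 1 part", 40, 0.85),
    ("split homeodomain, exon 2 part", 20, 0.85),
]

def score(hit):
    """Crude length-sensitive score for one hit."""
    _, length, identity = hit
    return length * identity

ranked = sorted(hits, key=score, reverse=True)
for label, length, identity in ranked:
    print(f"{label}: {length * identity:.0f}")
```

The intact hit comes out on top, and the two halves of the split homeodomain trail behind it as separate, unimpressive-looking entries.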

To me, this study stands as a reminder to keep looking. In an era when new genomes are popping up left and right and Big Data with automated analyses is the scientific zeitgeist, it’s still worth rolling your sleeves up, picking up the old magnifying glass and taking a closer look – even in organisms you think you know. You might just chance upon some real treasure.

***

Note:

*A “Hox11/13c” behaving oddly should be immediately suspicious based on what I saw in my own trees, where echinoderm Hox11/13c consistently formed a strongly supported group. But that’s hindsight for you…

***

References:

Baughman KW et al. (2014) Genomic organization of Hox and ParaHox clusters in the echinoderm, Acanthaster planci. Genesis 52:952-958

Cameron RA et al. (2006) Unusual gene order and organization of the sea urchin hox cluster. JEZ B 306:45-58

Delroisse J et al. (2016) De novo adult transcriptomes of two European brittle stars: spotlight on opsin-based photoreception. PLoS ONE 11: e0152988

Duboule D (2007) The rise and fall of Hox gene clusters. Development 134:2549-2560

Endo M et al. (2018) Hidden genetic history of the Japanese sand dollar Peronella (Echinoidea: Laganidae) revealed by nuclear intron sequences. Gene 659:37-43

Ferrier DEK et al. (2000) The amphioxus Hox cluster: deuterostome posterior flexibility and Hox14. Evol Dev 2:284-293

Force A et al. (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545

Freeman R et al. (2012) Identical genomic organization of two hemichordate Hox clusters. Curr Biol 22:2053-2058

Hui JH et al. (2012) Extensive chordate and annelid macrosynteny reveals ancestral homeobox gene organization. Mol Biol Evol 29:157-165

Simakov O et al. (2015) Hemichordate genomes and deuterostome origins. Nature 527:459-465

Szabó R and Ferrier DEK (2018) Two more Posterior Hox genes and Hox cluster dispersal in echinoderms. BMC Evol Biol 18:203

Thomas-Chollier M et al. (2010) A non-tree-based comprehensive study of metazoan Hox and ParaHox genes prompts new insights into their origin and evolution. BMC Evol Biol 10:73

Tsuchimoto J and Yamaguchi M (2014) Hox expression in the direct-type developing sea urchin Peronella japonica. Dev Dyn 243:1020-1029


To dump a chunk of trunk

The Mammal has deemed that Hox genes and good old-fashioned feel-good evo-devo are a good way to blink back to life*. Also, tardigrades. Tardigrades are awesome. Here is one viewed from above, from the Goldstein lab via Encyclopedia of Life:

[Image: Hypsibius dujardini viewed from above, Goldstein lab via Encyclopedia of Life]

Tardigrades or water bears are also a bit unusual. Their closest living relatives are velvet worms (Onychophora) and arthropods. Exactly who’s closest to whom in that trio of phyla collectively known as the Panarthropoda is not clear, and I don’t have the energy to wade into the debate – besides, it’s not really important for the purposes of this post. What Smith et al. (2016) concluded about these adorably indestructible little creatures holds irrespective of their precise phylogenetic position.

Anyway. I said tardigrades were unusual, and I don’t mean their uncanny ability to survive the apocalypse and pick up random genes in the process (Boothby et al., 2015). (ETA: so apparently there may not be nearly as much foreign gene hoarding as the genome paper suggests – see Sujai Kumar’s comment below! Doesn’t change the fact that tardigrades are tough little buggers, though 🙂 ) The oddity we’re interested in today lies in the fact that all known species are built to the exact same compact body plan. Onychophorans and many arthropods are elongated animals with lots of segments, lots of legs, and often lots of variation in the number and type of such body parts. Tardigrades? A wee head, four chubby pairs of legs, and that’s it.

How does a tardigrade body relate to that of a velvet worm, or a centipede, or a spider? Based solely on anatomy, that’s a hell of a question to answer; even the homology of body parts between different kinds of arthropods can be difficult to determine. I have so far remained stubbornly uneducated on the minutiae of (pan)arthropod segment homologies, although I do see papers purporting to match brain parts, appendages and suchlike between different kinds of creepy-crawlies on a fairly regular basis. Shame on me for not being able to care about the details, I guess – but the frequency with which the subject comes up suggests that the debate is far from over.

Now, when I was first drawn to the evo-devo field, one of the biggest attractions was the notion that the expression of genes as a body part forms can tell us what that body part really is even when anatomical clues are less than clear. That, of course, is too good to be simply true, but sometimes the lure of genes and neat homology stories is just too hard to resist. Smith et al.‘s investigation of tardigrade Hox genes is definitely that kind of story.

Hox genes are generally a good place to look if you’re trying to decipher body regions, since their more or less neat, orderly expression patterns are remarkably conserved between very distantly related animals (they are probably as old as the Bilateria, to be precise). A polychaete worm, a vertebrate and an arthropod show the same general pattern – there is no active Hox gene at the very front of the embryo, then Hoxes 1, 2, 3 and so on appear in roughly that order, all the way to the rear end. There are variations in the pattern – e.g. the expression of a gene can have sharp boundaries or fade in and out gradually; different genes can overlap to different extents, the order isn’t always perfect, etc. – but staggered Hox gene expression domains, with the same genes starting up in the same general area along the main body axis, can be found all across the Bilateria.

Tardigrades are no exception, in a sense – but they are also quite exceptional. First, their complement of Hox genes is a bit of a mess. At long last, we have a tardigrade genome to hand, in which Smith et al. (2016) found good honest Hox genes. What they didn’t find was a Hox cluster, an orderly series of Hox genes sitting like beads on a DNA string. Instead, the Hox genes in Hypsibius dujardini, the sequenced species, are all over the genome, associating with all kinds of dubious fellows who aren’t Hoxes.

What Smith et al. also didn’t find was half of the Hox genes they expected. A typical arthropod has ten or so Hox genes, a pretty standard ballpark for an animal that isn’t a vertebrate. H. dujardini has only seven, three of which are copies of Abdominal-B, a gene that normally exists in a single copy in arthropods. So basically, only five kinds of Hox gene – number two and most of the “middle” ones are missing. What’s more, two more tardigrades that aren’t closely related to H. dujardini also appear to have the same five Hox gene types (though only one Abd-B each), so this massive loss is probably a common feature of Tardigrada. (No word on whether the scattering of the Hox cluster is also shared by the other two species.)

We know that the genes are scattered and decimated, but are their expression patterns similarly disrupted? You don’t actually need an intact Hox cluster for orderly Hox expression, and indeed, tardigrade Hox genes are activated in a perfectly neat and perfectly usual pattern that resembles what you see in their panarthropod cousins. Except for the bit where half the pattern is missing!

Here’s part of Figure 4 from the paper, a schematic comparison of tardigrade Hox expression to that of other panarthropods – a generic arachnid, a millipede and a velvet worm. (otd is a “head” gene that lives in the Hox-free anterior region; lab is the arthropod equivalent of Hox1, Dfd is Hox4, and I’m not sure which of Hox6-8 ftz is currently supposed to be.) The interesting thing about this is that according to Hox genes, the entire body of the tardigrade corresponds to just the front end of arthropods and velvet worms.

[Figure: schematic comparison of panarthropod Hox expression, Fig. 4A of Smith et al. (2016)]

In addition, one thing that is not shown on this diagram is that Abdominal-B, which normally marks the butt end of the animal, is still active in the tardigrade, predictably in the last segment (L4, that is). So if you take the Hox data at face value, a tardigrade is the arse end of an arthropod tacked straight onto its head. Weird. It’s like evolution took a perfectly ordinary velvet worm-like creature and chopped out most of its trunk.

The tardigrade data suggest that the original panarthropod was probably more like arthropods and velvet worms than tardigrades – an elongated animal with many segments. The strange tardigrade situation can’t be the ancestral one, since the Hox genes that tardigrades lack long predate the panarthropod ancestor. Now, it might be possible to lose half your Hox genes while keeping your ancestral body plan, but an unusual body plan and an unusual set of Hox genes is a bit of a big coincidence, innit?

Smith et al. point out that the loss of the Hox genes was unlikely to be the cause of the loss of the trunk region – Hox genes only specify what grows on a segment, they don’t have much say in how many segments develop in the first place. Instead, the authors reason, the loss of the trunk in the tardigrade ancestor probably made the relevant Hox genes dispensable.

Damn, this story makes me want to see the Hox genes of all those oddball lobopodians from the Cambrian. Some of them are bound to be tardigrade relatives, right?

***

References:

Boothby TC et al. (2015) Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade. PNAS 112:15976-15981

Smith FW et al. (2016) The compact body plan of tardigrades evolved by the loss of a large body region. Current Biology 26:224-229

***

*The Mammal has been pretty depressed lately. As in mired up to her head in weird energy-sucking flu. Unfortunately, writing is one of those things that the damn brain monster has eaten most of the fun out of. Also, I have a shitty normal person job at the moment, and shitty job taking up time + barely enough motivation to crawl out of bed and pretend to be human means I have, at best, one afternoon per week that I actually spend on catching up with science. That is just enough to scroll through my feeds and file away the interesting stuff, but woefully insufficient for the writing of posts, not to mention that my ability to concentrate is, to be terribly technical, absolutely fucked. It’s not an ideal state of affairs by any stretch, and I’m pretty sure that if I made more of an effort to read and write about cool things, it would pay off in the mental health department, but… well. That sort of reasonable advice is hard to hear with the oozing fog-grey suckers of that thing clamped onto my brain.

In which a “living fossil’s” genome delights me

I promised myself I wouldn’t go on for thousands and thousands of words about the Lingula genome paper (I’ve got things to do, and there is a LOT of stuff in there), but I had to indulge myself a little bit. Four or five years ago when I was a final year undergrad trying to figure out things about Hox gene evolution, I would have killed for a complete brachiopod genome. Or even a complete brachiopod Hox cluster. A year or two ago, when I was trying to sweat out something resembling a PhD thesis, I would have killed for some information about the genetics of brachiopod shells that amounted to more than tables of amino acid abundances. Too late for my poor dissertations, but a brachiopod genome is finally sequenced! The paper is right here, completely free (Luo et al., 2015). Yay for labs who can afford open-access publishing!

In case you’re not familiar with Lingula, it’s this guy (image from Wikipedia):

In a classic case of looks being deceiving, it’s not a mollusc, although it does look a bit like one except for the weird white stalk sticking out of the back of its shell. Brachiopods, the phylum to which Lingula belongs, are one of those strange groups no one really knows where to place, although nowadays we are pretty sure they are somewhere in the general vicinity of molluscs, annelid worms and their ilk. Unlike bivalve molluscs, whose shell valves are on the left and right sides of the animal, the shells of brachiopods like Lingula have top and bottom valves. Lingula’s shell is also made of different materials: while bivalve shells contain calcium carbonate deposited into a mesh of chitin and silk-like proteins,* the subgroup of brachiopods Lingula belongs to uses calcium phosphate, the same mineral that dominates our bones, and a lot of collagen (again like bone). But we’ll come back to that in a moment…

One of the reasons the Lingula genome is particularly interesting is that Lingula is a classic “living fossil”. In the Paleobiology Database, there’s even an entry for a Cambrian fossil classified as Lingula, and there are plenty of entries from the next geological period. If the database is to be believed, the genus Lingula has existed for something like 500 million years, which must be some kind of record for an animal.** Is its genome similarly conservative? Or did the DNA hiding under a deceptively conservative shell design evolve as quickly as anyone’s?

In a heroic feat of self-control, I’m not spending all night poring over the paper, but I did give a couple of interesting sections a look. Naturally, the first thing I dug out was the Hox cluster hiding in the rather large supplement. This was the first clue that Lingula’s genome is definitely “living” and not at all a fossil in any sense of the word. If it were, we’d expect one neat string of Hox genes, all in the order we’re used to from other animals. Instead, what we find is two missing genes, one plucked from the middle of the cluster and tacked onto its “front” end, and two genes totally detached from the rest. It’s not too bad as Hox cluster disintegration goes – six out of nine genes are still neatly ordered – but it certainly doesn’t look like something left over from the dawn of animals.

The bigger clue that caught my eye, though, was this little family tree in Figure 2:

[Figure 2 from Luo et al. (2015): family tree with gene family gains and losses marked on each branch]

The red numbers on each branch indicate the number of gene families that expanded or first appeared in that lineage, and the green numbers are the families shrunk or lost. Note that our “living fossil” takes the lead in both. What I find funny is that it’s miles ahead of not only the animals generally considered “conservative” in terms of genome evolution, like the limpet Lottia and the lancelet Branchiostoma, but also the sea squirt (Ciona). Squirts are notorious for having incredibly fast-evolving genomes; then again, most of that notoriety was based on the crazily divergent sequences and often wildly scrambled order of their genes. A genome can be conservative in some ways and highly innovative in others. In fact, many of the genes involved in basic cellular functions are very slow-evolving in Lingula. (Note also: humans are pretty slow-evolving as far as gene content goes. This is not the first study to find that.)

So, Lingula, living fossil? Not so much.

The last bit I looked at was the section about shell genetics. Although it’s generally foolish to expect the shell-forming gene sets of two animals from different phyla to be similar (see my first footnote), if there are similarities, they could potentially go at least two different ways. First, brachiopods might be quite close to molluscs, which is the hypothesis Luo et al.’s own tree-building efforts support. Like molluscs, brachiopods also have a specialised mantle that secretes shell material, though having the same name doesn’t mean the two “mantles” actually share a common origin. So who knows, some molluscan shell proteins, or shell regulatory genes, might show up in Lingula, too.

On the other hand, the composition of Lingula’s shell is more similar to our skeletons’. So, since they have to capture the same mineral, could the brachiopods share some of our skeletal proteins? The answer to both questions seems to be “mostly no”.

Molluscan shell matrix proteins, those that are actually built into the structure of the shell, are quite variable even within Mollusca. It’s probably not surprising, then, that most of the relevant genes that are even present in Lingula are not specific to the mantle, and those that are are the kinds of genes that are generally involved in the handling of calcium or the building of the stuff around cells in all kinds of contexts. Some of the regulatory mechanisms might be shared – Luo et al. report that BMP signalling seems to be going on around the edge of the mantle in baby Lingula, and this cellular signalling system is also involved in molluscan shell formation. Then again, a handful of similar signalling systems “are involved” in bloody everything in animal development, so how much we can deduce from this similarity is anyone’s guess.

As for “bone genes” – the ones that are most characteristically tied to bone are missing (disappointingly or reassuringly, take your pick). The SCPP protein family is so far known only from vertebrates, and its various members are involved in the mineralisation of bones and teeth. SCPPs originate from an ancient protein called SPARC, which seems to be generally present wherever collagen is (IIRC, it’s thought to help collagen fibres arrange themselves correctly). Lingula has a gene for SPARC all right, but nothing remotely resembling an SCPP gene.

I mentioned that the shell of Lingula is built largely on collagen, but it turns out that it isn’t “our” kind of collagen. “Collagen” is just a protein with a particular kind of repetitive sequence. Three amino acids (glycine-proline-something else, in case you’re interested) are repeated ad nauseam in the collagen chain, and these repetitive regions let the protein twist into characteristic rope-like fibres that make collagen such a wonderfully tough basis for connective tissue. Aside from the repeats they all share, collagens are a large and diverse bunch. The ones that form most of the organic matrix in bone contain a non-repetitive and rather easily recognised domain at one end, but when Luo et al. analysed the genome and the proteins extracted from the Lingula shell, they found that none of the shell collagens possessed this domain. Instead, most of them had EGF domains, which are pretty widespread in all kinds of extracellular proteins. Based on the genome sequence, Lingula has a whole little cluster of these collagens-with-EGF-domains that probably originated from brachiopod-specific gene duplications.
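For the curious, here is roughly what "a particular kind of repetitive sequence" looks like if you go hunting for it in a protein. This is a minimal sketch, not anything Luo et al. did: it assumes the simplest possible definition of a collagen-like stretch (a glycine at every third position, with any two residues in between), the five-triplet minimum is an arbitrary threshold I made up, and the toy sequence is invented rather than taken from Lingula.

```python
import re

# Collagen-like stretch: glycine at every third position (G-x-y triplets).
# The minimum of 5 consecutive triplets is an arbitrary cutoff for this sketch.
TRIPLET_RUN = re.compile(r"(?:G[A-Z]{2}){5,}")


def collagen_like_regions(protein: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of runs of at least five G-x-y triplets."""
    return [m.span() for m in TRIPLET_RUN.finditer(protein.upper())]


# Invented toy sequence, NOT a real Lingula protein:
# 6 non-repetitive residues, 9 G-x-y triplets, 7 non-repetitive residues.
toy = "MKTLLA" + "GPAGPPGEK" * 3 + "CDEFHIK"
print(collagen_like_regions(toy))  # -> [(6, 33)]
```

Real domain annotation relies on curated profiles rather than a bare regular expression, so treat this as an illustration of the repeat itself and nothing more.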

So, to recap: Lingula is not as conservative as its looks would suggest (never judge a living fossil by its cover, right?). We also finally have actual sequences for lots of its shell proteins, which reveal that when it comes to building shells, Lingula does its own thing. Not much of a surprise, but still, knowing is a damn sight better than thinkin’ it’s probably so. We are scientists here, or what.

I am Very Pleased with this genome. (I just wish it was published five years ago 😛 )

***

Notes:

*This, interestingly, doesn’t seem to be the general case for all molluscs. Jackson et al. (2010) compared the genes building the pearly layer of snail (abalone, to be precise) and bivalve (pearl oyster) shells, and found that the snail showed no sign of the chitin-making enzymes and silk type proteins that were so abundant in its bivalved cousins. It appears that even within molluscs, different groups have found different ways to make often very similar shell structures. However, all mollusc shells, regardless of the underlying genetics, are predominantly composed of calcium carbonate.

**You often hear about sharks, or crocodiles, or coelacanths, existing “unchanged” for 100 or 200 or whatever million years, but in reality, 200-million-year-old crocodiles aren’t even classified in the same families, let alone the same genera, as any of the living species. Again, the living coelacanth is distinct enough from its relatives in the Cretaceous, when they were last seen, to warrant its own genus in the eyes of taxonomists. I’ve no time to check up on sharks, but I’m willing to bet the situation is similar. Whether Lingula’s jaw-dropping 500-million-year tenure on earth is a result of taxonomic lumping or the shells genuinely looking that similar, I don’t know. Anyway, rant over.

***

References:

Jackson DJ et al. (2010) Parallel evolution of nacre building gene sets in molluscs. Molecular Biology and Evolution 27:591-608

Luo Y-J et al. (2015) The Lingula genome provides insights into brachiopod evolution and the origin of phosphate biomineralization. Nature Communications 6:8301

Putting the cart before the… snake?

Time to reexamine some assumptions (again)! And also, talk about Hox genes, because do I even need a reason?

Hox genes often come up when we look for explanations for various innovations in animal body plans – the digits of land vertebrates, the limbless abdomens of insects, the various feeding and walking and swimming appendages of crustaceans, the strongly differentiated vertebral columns of mammals, and so on.

Speaking of differentiated vertebral columns, here’s one group I’d always thought of as having pretty much the exact opposite of them: snakes. Vertebral columns are patterned, among other things, by Hox genes. Boundaries between different types of vertebrae such as cervical (neck) and thoracic (the ones bearing the ribcage) correspond to boundaries of Hox gene expression in the embryo – e.g. the thoracic region in mammals begins where HoxC6 starts being expressed.

In mammals like us, and also in archosaurs (dinosaurs/birds, crocodiles and extinct relatives thereof), these boundaries can be really obvious and sharply defined – here’s Wikipedia’s crocodile skeleton for an example:

In contrast, the spine of a snake (example from Wikipedia below) just looks like a very long ribcage with a wee tail:

Snakes, of course, are rather weird vertebrates, and weird things make us sciencey types dig for an explanation.

Since Hox genes appear to be responsible for the regionalisation of vertebral columns in mammals and archosaurs, it stands to reason that they’d also have something to do with the comparative lack of regionalisation (and the disappearance of limbs) seen in snakes and similar creatures. In a now classic paper, Cohn and Tickle (1999) observed that unlike in chicks, the Hox genes that normally define the neck and thoracic regions are kind of mashed together in embryonic pythons. Below is a simple schematic from the paper showing where three Hox genes are expressed along the body axis in these two animals. (Green is HoxB5, blue is C8, red is C6.)

[Figure from Cohn and Tickle (1999): Hox expression domains along the body axis in chick and python]

As more studies examined snake embryos, others came up with different ideas about the patterning of serpentine spines. Woltering et al. (2009) had a more in-depth look at Hox gene expression in both snakes and caecilians (limbless amphibians) and saw that there are in fact regions ruled by different Hoxes in these animals, if a little fuzzier than you’d expect in a mammal or bird – but they don’t appear to translate to different anatomical regions. Here’s their summary of their findings, showing the anteriormost limit of the activity of various Hox genes in a corn snake compared to a mouse:

[Figure from Woltering et al. (2009): anterior limits of Hox gene expression in mouse versus corn snake]

Such differences aside, both of the above studies operated on the assumption that the vertebral column of snakes is “deregionalised” – i.e. that it evolved by losing well-defined anatomical regions present in its ancestors. But is that actually correct? Did snakes evolve from more regionalised ancestors, and did they then lose this regionalisation?

Head and Polly (2015) argue that the assumption of deregionalisation is a bit stinky. First, that super-long ribcage of snakes does in fact divide into several regions, and these regions respect the usual boundaries of Hox expression. Second, ordinary lizard-shaped lizards (from which snakes descended back in the days of the dinosaurs) are no more regionalised than snakes.

The study is mostly a statistical analysis of the shapes of vertebrae. Using an approach called geometric morphometrics, it turned these shapes from dozens of squamate (snake and lizard) species into sets of coordinates, which could then be compared to see how much they vary along the spine and whether the variation is smooth and continuous or clustered into different regions. The authors evaluated hypotheses regarding the number of distinct regions to see which one(s) best explained the observed variation. They also compared the squamates to alligators (representing archosaurs).
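To make "evaluating hypotheses regarding the number of distinct regions" a bit more concrete, here is a toy version of the idea. Big caveat: this is a simplified stand-in, not the method of Head and Polly (2015). I'm assuming a single made-up shape score per vertebra (the real data are multivariate landmark coordinates), modelling each region as a flat stretch with its own mean, and using a BIC-style penalty to stop the model from inventing a new region for every wobble. All the numbers are invented.

```python
import math


def sse(values, i, j):
    """Sum of squared deviations from the mean for values[i:j]."""
    seg = values[i:j]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def best_split(values, k):
    """Lowest total SSE and internal boundaries for k contiguous regions."""
    n = len(values)
    # cost[r][j]: best SSE for the first j values cut into r regions
    cost = [[math.inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for r in range(1, k + 1):
        for j in range(r, n + 1):
            for i in range(r - 1, j):
                c = cost[r - 1][i] + sse(values, i, j)
                if c < cost[r][j]:
                    cost[r][j], back[r][j] = c, i
    # Walk back through the table to recover where each region starts.
    bounds, j = [], n
    for r in range(k, 0, -1):
        j = back[r][j]
        bounds.append(j)
    return cost[k][n], sorted(bounds)[1:]  # drop the leading 0


def choose_regions(values, max_k):
    """Pick the number of regions with the lowest BIC-style score."""
    n = len(values)
    scored = []
    for k in range(1, max_k + 1):
        total, bounds = best_split(values, k)
        # k region means plus k - 1 boundaries = 2k - 1 free parameters
        bic = n * math.log(max(total, 1e-12) / n) + (2 * k - 1) * math.log(n)
        scored.append((bic, k, bounds))
    _, k, bounds = min(scored)
    return k, bounds


# Invented shape scores for 12 vertebrae, built to contain three regions.
toy = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0, 5.0, 5.1, 4.9, 5.0]
print(choose_regions(toy, 5))  # -> (3, [4, 8])
```

The point of the penalty term is the same as in the paper's model comparison: more regions always fit better, so you need some way of asking whether the extra region earns its keep.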

The results were partly what you’d expect. First, alligators showed much more overall variation in vertebral shape than squamates. Note that that’s all squamates – leggy lizards are nearly (though not quite) as uniform as their snake-like relatives. However, in all squamates, the best-fitting model of regionalisation was still one with either three or four distinct regions in front of the hips/cloaca, and in the majority, it was four, the same number as the alligator had.

Moreover, there appeared to be no strong support for an evolutionary pattern to the number of regions – specifically, none of the scenarios in which the origin of snake-like body plans involved the loss of one or more regions were particularly favoured by the data. There was also no systematic variation in the relative lengths of various regions; the idea that snakes in general have ridiculously long thoraxes is not supported by this analysis.

In summary, snakes might show a little less variation in vertebral shape than their closest relatives, but they certainly didn’t descend from alligator-style sharply regionalised ancestors, and they do still have regionalised spines.

Hox gene expression is not known for most of the creatures for which vertebral shapes were analysed, but such data do exist for mammals (mice, here), alligators, and corn snakes. What is known about different domains of Hox gene activation in these three animals turns out to match the anatomical boundaries defined by the models pretty well. In the mouse and alligator, Hox expression boundaries are sharp, and the borders of regions fall within one vertebra of them.

In the snake, the genetic and morphological boundaries are both gradual, but the boundaries estimated by the best model are always within the fuzzy boundary region of an appropriate Hox gene expression domain. Overall, the relationship between Hox genes and regions of the spine is pretty consistent in all three species.

To finish off, the authors make the important point that once you start turning to the fossil record and examining extinct relatives of mammals, or archosaurs, or squamates, or beasties close to the common ancestor of all three groups (collectively known as amniotes), you tend to find something less obviously regionalised than living mammals or archosaurs – check out this little figure from Head and Polly (2015) to see what they’re talking about:

[Figure from Head and Polly (2015): vertebral columns across the amniote phylogeny]

(Moving across the tree, Seymouria is an early relative of amniotes but not quite an amniote; Captorhinus is similarly related to archosaurs and squamates, Uromastyx is the spiny-tailed lizard, Lichanura is a boa, Thrinaxodon is a close relative of mammals from the Triassic, and Mus, of course, is everyone’s favourite rodent. Note how alligators and mice really stand out with their ribless lower backs and suchlike.)

Although they don’t show stats for extinct creatures, Head and Polly argue that mammals and archosaurs, not snakes, are the weird ones when it comes to vertebral regionalisation. For most of amniote evolution, the norm was the more subtle version seen in living squamates. It was only during the origin of mammals and archosaurs that boundaries were sharpened and differences between regions magnified. Nice bit of convergent/parallel evolution there!

***

References:

Cohn MJ & Tickle C (1999) Developmental basis of limblessness and axial patterning in snakes. Nature 399:474-479

Head JJ & Polly PD (2015) Evolution of the snake body form reveals homoplasy in amniote Hox gene function. Nature 520:86-89

Woltering JM et al. (2009) Axial patterning in snakes and caecilians: evidence for an alternative interpretation of the Hox code. Developmental Biology 332:82-89

Finally, that sponge ParaHox gene

ParaHox genes are a bit like the underappreciated sidekicks of Hox genes. Or little sisters, as the case may be, since the two families are closely related. Hox genes are probably as famous as anything in evo-devo. Being among the first genes controlling embryonic development to be (a) discovered, (b) found to be conserved between very distantly related animals, they are symbolic of the late 20th century evo-devo revolution.

ParaHoxes get much less attention despite sharing some of the most exciting properties of Hox genes. Like those, they are involved in anteroposterior patterning – that is, partitioning an embryo along its head to tail axis. Also like Hox genes, they are often neatly clustered in the genome, and when they are, they tend to be expressed in the same order (both in space and time) in which they sit in the cluster*. Their main ancestral roles for bilaterian animals seem to be in patterning the gut and the central nervous system (Garstang and Ferrier, 2013).

There are three known types of ParaHox gene, which are generally thought to be homologous to specific subsets of Hox genes – by the most accepted scheme, Gsx is the closest sister of Hox1 and Hox2, Xlox is closest to Hox3, and Cdx to Hox9 and above. It is abundantly clear that Hoxes and ParaHoxes are closely related, but there has been a bit of debate concerning the number of genes in the ancestral gene cluster that gave rise to both – usually called “ProtoHox” (Garcia-Fernàndez, 2005).

Another big question about these genes is precisely when they originated, and in this regard, ParaHox genes are proving much more interesting than Hoxes. You see, there are plenty of animals with both Hox and ParaHox genes, which is what you’d expect given the ProtoHox hypothesis, but there are also animals with only ParaHoxes. If there really was a ProtoHox gene/cluster that then duplicated to give rise to Hoxes and ParaHoxes, then lone ParaHoxes (or Hoxes for that matter) shouldn’t happen – unless the other cluster was lost along the way.

So a suspiciously Gsx-like gene in the weird little blob-creature Trichoplax, which has nothing remotely resembling a Hox gene, was a big clue that (a) Hox/ParaHox genes might go back further in animal evolution than we thought, (b) the loss of the entire Hox or ParaHox cluster is totally possible**, despite how fundamental these genes appear to be for correctly building an animal.

I wrote (at length) about a study by Mendivil Ramos et al. (2012), which revealed that while Trichoplax had no Hox genes and only one of the three types of ParaHox gene, it preserved the more or less intact genomic neighbourhoods in which Hox and ParaHox clusters are normally situated. One of the more interesting results of that paper was that the one sponge genome available at the time – that of Amphimedon queenslandica, which had no trace of either Hox or ParaHox genes – also contained statistically significant groupings of Hox and ParaHox neighbour genes, as if it had a Hox neighbourhood and a ParaHox neighbourhood, but the Hoxes and ParaHoxes themselves had moved out.

That study thus pointed towards an intriguing hypothesis, previously championed by Peterson and Sperling (2007) based solely on gene phylogenies: sponges once did have Hox and ParaHox genes/clusters, which at least some of them later lost. This would essentially mean that the two gene clusters go straight back to the origin of animals if not further***, and we may never find any surviving remnant of the ancestral ProtoHox cluster, since the closest living relatives of animals have neither the genes nor their neighbourhoods (that we know of).

Hypotheses are nice, but as we know, they do have a tendency to be tragically slain by ugly facts. Can we further test this particular hypothesis about sponges? Are there facts that could say yay or nay? (Of course there are. I wouldn’t be writing this otherwise 😉 )

I keep saying that we should always be careful when generalising from one or a few model organisms, that we ignore diversity in the animal kingdom at our own peril, and that “distantly related to us” = “looks like our distant ancestors” is an extremely dodgy assumption. Well, here’s another lesson in that general vein: unlike Amphimedon, some sponges have not just the ghosts of vanished ParaHox clusters, but intact, honest to god ParaHox genes!

It’s calcareous sponges again. Sycon ciliatum and Leucosolenia complicata, two charming little calcisponges, recently had their genomes sequenced (alas, they weren’t yet public last time I checked), and since then, there’s been a steady stream of “cool stuff we found in calcisponge genomes” papers from Maja Adamska’s lab and their collaborators. I’ve discussed one of them (Robinson et al., 2013), in which the sponges revealed their rather unhelpful microRNAs, and back in October (when I was slowly self-destructing from thesis stress), another study announced a couple of delicious ParaHoxes (Fortunato et al., 2014).

(Exciting as it is, the paper starts by tickling my pet peeves right off the bat by calling sponges “strong candidates for being the earliest extant lineage(s) of animals”… I suppose nothing can be perfect… *sigh*)

The study actually covers more than just (Para)Hox genes; it looks at an entire gene class called Antennapedia (ANTP), which includes Hoxes and ParaHoxes plus a handful of related families I’m far less interested in. Sycon and Leucosolenia don’t have a lot of ANTP genes – only ten in the former and twelve in the latter, whereas a typical bilaterian like a fruit fly or a lancelet has several times that number – but from phylogenetic analyses, these appear to be a slightly different assortment of genes from those present in Amphimedon, the owner of the first sequenced sponge genome. This picture is most consistent with a scenario in which all of the ANTP genes in question were present in our common ancestor with sponges, and each sponge lineage lost some of them independently. (You may not realise this until you start delving into the history of various gene families, but genes come and go a LOT in evolution.)
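The logic of that "everyone lost a different bit" scenario is easy to sketch. Under the (strong) assumption that gene families are only ever lost and never invented twice, the ancestral repertoire is simply the union of everything seen in the descendants, and each lineage's losses are whatever it is missing from that union. The lineage and family names below are placeholders I made up, not the actual ANTP repertoires of any sponge.

```python
def infer_losses(repertoires):
    """Losses-only scenario: ancestral set = union of all observed repertoires."""
    ancestral = set().union(*repertoires.values())
    losses = {
        name: sorted(ancestral - genes) for name, genes in repertoires.items()
    }
    return sorted(ancestral), losses


# Placeholder lineages and gene families, NOT real data.
toy = {
    "sponge_X": {"famA", "famB", "famC"},
    "sponge_Y": {"famA", "famC", "famD"},
    "sponge_Z": {"famA", "famB", "famD", "famE"},
}
ancestral, losses = infer_losses(toy)
print(ancestral)  # -> ['famA', 'famB', 'famC', 'famD', 'famE']
print(losses["sponge_Z"])  # -> ['famC']
```

In real life the hard part is upstream of this: deciding, from wonky gene trees, which sponge gene belongs to which family in the first place.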

Sadly, many of the branches on these gene trees are quite wonky, including the one linking a gene from each calcisponge to the ParaHox gene Cdx. However, somewhat fuzzy trees are not the only evidence the study presents. First, the putative sponge Cdxes possess a little motif in their protein sequences that is only present in a handful of gene families within the ANTP class. If you take only these families rather than everything ANTP and make trees with them, the two genes come out as Cdx in every single tree, and with more statistical support than the global ANTP trees gave them. Another motif they share with all Hoxes, ParaHoxes and a few of their closest relatives, but not with other ANTP class families.

Second, at least the gene in Sycon appears to have the right neighbours (Leucosolenia was not analysed for this). Since the Sycon genome sequence is currently in pieces much smaller than whole chromosomes, only four or so of the genes flanking ParaHox clusters in other animals are clearly linked to the putative Cdx in the sponge. However, when the researchers did the same sort of simulation Mendivil Ramos et al. (2012) did for Amphimedon, testing whether Hox neighbours and ParaHox neighbours found across all fragments of the genome are (a) close to other Hox/ParaHox neighbours or randomly scattered, (b) mixed or segregated, they once again found cliques of genes with little overlap, indicating the once-existence of separate Hox and ParaHox clusters.
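If you're wondering what a "mixed or segregated" test might look like in practice, here is a bare-bones permutation sketch. To be clear, this is my own simplified illustration and not the simulation used by Fortunato et al. (2014) or Mendivil Ramos et al. (2012): I'm assuming each neighbour gene is simply tagged with the scaffold it sits on and with "hox" or "parahox", counting scaffolds that carry both kinds, and asking how often randomly reshuffled labels look at least as segregated as the real thing. The scaffold names and gene counts are invented.

```python
import random


def mixed_scaffolds(scaffolds, labels):
    """Count scaffolds carrying both Hox-neighbour and ParaHox-neighbour genes."""
    seen = {}
    for scaffold, label in zip(scaffolds, labels):
        seen.setdefault(scaffold, set()).add(label)
    return sum(1 for kinds in seen.values() if len(kinds) > 1)


def segregation_p_value(scaffolds, labels, n_shuffles=10000, seed=0):
    """Fraction of label shuffles that are at least as segregated as observed."""
    rng = random.Random(seed)
    observed = mixed_scaffolds(scaffolds, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mixed_scaffolds(scaffolds, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Invented example: four scaffolds with four neighbour genes each,
# perfectly segregated by type.
scaffolds = ["s1"] * 4 + ["s2"] * 4 + ["s3"] * 4 + ["s4"] * 4
labels = ["hox"] * 4 + ["parahox"] * 4 + ["hox"] * 4 + ["parahox"] * 4
observed, p = segregation_p_value(scaffolds, labels)
print(observed, p)
```

A small p-value here means that segregation this clean rarely turns up by chance. Testing part (a), whether the neighbours clump together at all, would additionally need the rest of the genome as background, which this sketch leaves out.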

Fortunato et al. (2014) also examined the expression of their newfound Cdx gene, and found it no less intriguing than its sequence or location in the genome, although their description in the paper is very limited (no doubt because they’re trying to cram results on ten genes into a four-page Nature paper). The really interesting activity they mention and picture is in the inner cell mass of the young sponge in its post-larval stages – the bit that develops into the lining of its feeding chambers. Which, Adamska’s team contend, may well be homologous to our gut lining. In bilaterians, developing guts are one of the major domains of Cdx and ParaHox genes in general!

So at least three different lines of evidence – sequence, neighbours and expression – make this picture hang together quite prettily. It’s incredibly cool – the turning on their heads of long-held assumptions is definitely the most exciting part of science, I say! On the other hand, it’s also a little disheartening, because now that everyone in the animal kingdom except ctenophores has definitive ParaHox genes and at least the empty seats once occupied by Hox genes, are we ever going to find a ProtoHox thingy? May it be that it’ll turn up in one of the single-celled beasties people like Iñaki Ruiz-Trillo are sequencing? That would be cool and weird.

The coolest twist on this story, though, would be to discover traces of ProtoHoxes in a ctenophore, since solid evidence for ProtoHox-wielding ctenophores would (a) confirm the strange and frankly quite dubious-sounding idea that ctenophores, not sponges, are the animal lineage farthest removed from ourselves, (b) SHOW US A FREAKING PROTOHOX CLUSTER. (*bounces* >_> Umm, *cough* OK, maturity can suck it 😀 ) However, given how horribly scrambled at least one ctenophore genome is (Ryan et al., 2013), that’s probably a bit too much to ask…

***

Notes

*Weirdly, the order of expression in time is the opposite of that of the Hox cluster. In both clusters, the “anterior” gene(s), i.e. Hox1-2 or Gsx, are active nearest the front of the embryo, but while anterior Hox genes are also the earliest to turn on, in the ParaHox cluster the posterior gene (Cdx) wakes up first. /end random trivia

**Of course we’ve long known that losing a Hox cluster is not that big a deal, but previously, all confirmed losses occurred in animals with more than one Hox cluster to begin with – a fish has plenty of Hox genes left even after chucking an entire set of them.

***With the obligatory ctenophore caveat

***

References

Fortunato SAV et al. (2014) Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes. Nature 514:620-623

Garcia-Fernàndez J (2005) The genesis and evolution of homeobox gene clusters. Nature Reviews Genetics 6:881-892

Garstang M & Ferrier DEK (2013) Time is of the essence for ParaHox homeobox gene clustering. BMC Biology 11:72

Mendivil Ramos O et al. (2012) Ghost loci imply Hox and ParaHox existence in the last common ancestor of animals. Current Biology 22:1951-1956

Peterson KJ & Sperling EA (2007) Poriferan ANTP genes: primitively simple or secondarily reduced? Evolution and Development 9:405-408

Robinson JM et al. (2013) The identification of microRNAs in calcisponges: independent evolution of microRNAs in basal metazoans. Journal of Experimental Zoology B 320:84-93

Ryan JF et al. (2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342:1242592

A bit of Hox gene nostalgia

I had the most random epiphany over my morning tea today. I don’t even know what got me thinking about the Cambrian explosion (as if I needed a reason…). Might have been remembering something from the Euro Evo Devo conference I recently went to. (I kind of wanted to post about that, because I saw some awesome things, but too much effort. My brain isn’t very cooperative these days.)

Anyway.

I was thinking about explanations of the Cambrian explosion and remembering how the relevant chapter in The Book of Life (otherwise known as the book that made me an evolutionary biologist) tried to make it all about Hox genes. It’s an incredibly simplistic idea, and almost certainly wrong given what we now know about the history of Hox genes (and animals)*. At the time, and for a long time afterwards, I really wanted it to be true because it appeals to my particular biases. But I digress.

Then it dawned on me just how new and shiny Hox genes were when this book was written. I thought, holy shit, TBoL is old. And how far evo-devo as a field has come since!

The Book of Life was first published in 1993. That is less than a decade after the discovery of the homeobox in fruit fly genes that controlled the identity of segments (McGinnis et al., 1984; Scott and Weiner, 1984), and the finding that homeoboxes were shared by very distantly related animals (Carrasco et al., 1984). It was only four years after the recognition that fly and vertebrate Hox genes are activated in the same order along the body axis (Graham et al., 1989; Duboule and Dollé, 1989).

This was a HUGE discovery. Nowadays, we’re used to the idea that many if not most of the genes and gene networks animals use to direct embryonic development are very ancient, but before the discovery of Hox genes and their clusters and their neatly ordered expression patterns, this was not at all obvious. What were the implications of these amazing, deep connections for the evolution of animal form? It’s not surprising that Hox genes would be co-opted to explain animal evolution’s greatest mysteries.

It also occurred to me that 1993 is the year of the zootype paper (Slack et al., 1993). Slack et al. reads like a first peek into a brave new world with limitless possibilities. They first note the similarity of Hox gene expression throughout much of the animal kingdom, then propose that this expression pattern (their “zootype”) should be the definition of an animal. After that, they speculate that just as the pattern of Hox genes could define animals, the patterns of genes controlled by Hoxes could define subgroups within animals. Imagine, they say, if we could solve all those tough questions in animal phylogeny by looking at gene expression.

As always, things turned out More Complicated, what with broken and lost Hox clusters and all the other weird shit developmental “master” genes get up to… but it was nice to look back at the bright and simple childhood of my field.

(And my bright and simple childhood. I read The Book of Life in 1998 or 1999, not entirely sure, and in between Backstreet Boys fandom, exchanging several bookfuls of letters with my BFF and making heart-shaped eyes at long-haired guitar-playing teenage boys, I somehow found true, eternal, nerdy love. *nostalgic sigh*)

***

*Caveat: it’s been years since I last re-read the book, and my copy is currently about 2500 km from me, so the discussion of the Cambrian explosion might be more nuanced than I remember. Also, my copy is the second edition, so I’m only assuming that the Hox gene thing is there in the original.

***

References:

Carrasco AE et al. (1984) Cloning of an X. laevis gene expressed during early embryogenesis coding for a peptide region homologous to Drosophila homeotic genes. Cell 37:409-414

Duboule D & Dollé P (1989) The structural and functional organization of the murine HOX gene family resembles that of Drosophila homeotic genes. The EMBO Journal 8:1497-1505

Graham A et al. (1989) The murine and Drosophila homeobox gene complexes have common features of organization and expression. Cell 57:367-378

McGinnis W et al. (1984) A conserved DNA sequence in homoeotic genes of the Drosophila Antennapedia and bithorax complexes. Nature 308:428-433

Scott MP & Weiner AJ (1984) Structural relationships among genes that control development: sequence homology between the Antennapedia, Ultrabithorax, and fushi tarazu loci of Drosophila. PNAS 81:4115-4119

Slack JMW et al. (1993) The zootype and the phylotypic stage. Nature 361:490-492

Lamprey Hox clusters and genome duplications, oh my!

What the hell is up with lamprey Hox clusters?

Lampreys are among the few living jawless vertebrates, creatures that parted evolutionary ways with our ancestors somewhere on the order of 500 million years ago. If you want to know where things like jaws, paired fins or our badass adaptive immune systems came from, a vertebrate that doesn’t possess some of these things and may have diverged from the rest of the vertebrates soon after others originated is just what you need for comparison.

The vertebrate fossil record is pretty rich thanks to us having hard tissues, so a lot can be inferred about these things from the wealth of extinct fishes we have at our disposal. However, there are times when comparisons of living creatures are just as useful as examinations of fossils, if not more so. (Fossils, for example, tend not to have immune systems. ;))

One of the things you absolutely need a living animal to study is, of course, genome evolution. Vertebrates – well, at least jawed vertebrates – are now generally accepted to have the remnants of four genomes. Our long-gone ancestors underwent two rounds of whole genome duplication. Afterwards, most of the extra genes were lost, but evidence for the duplications can still be found in the structure of our genomes, where entire recognisable gene neighbourhoods of our close invertebrate relatives often still exist in up to four copies (Putnam et al., 2008).

Among these neighbourhoods are the four clusters of Hox genes most groups of jawed vertebrates possess. A “normal” animal like a snail or a centipede only has one of these. Since Hox genes are involved in the making of body plans, you have to wonder how suddenly having four sets of them and other developmental “master genes” might have influenced the evolution of vertebrate bodies.

Of course, to guess that, you need to know precisely when these duplications happened. That’s where lampreys come in: their lineage branched off from our definitely quadruple-genomed one after the next closest, definitely single-genomed group. But was it before, between, or after, the two rounds of duplication?

A few years ago, a phylogenetic analysis of 55 gene families by Kuraku et al. (2009) suggested that the lamprey-jawed vertebrate split happened after the 2R. Just this year, the genome of the sea lamprey Petromyzon marinus was finally published (Smith et al., 2013), and its authors agreed that yes, lampreys probably split off from us post-2R. (I don’t entirely get all the things they did to arrive at this conclusion. Groups of linked genes show up again, among other approaches.)

However, that isn’t the whole story, the latest lamprey genomics paper argues (Mehta et al., 2013). The P. marinus genome assembly couldn’t stitch all the Hox clusters properly together. There were two that sat on nice big scaffolds with the whole row of Hox genes and a few of their neighbours, and then there were a bunch of “loose” Hox genes that they couldn’t link to anything (diagram comparing humans and P. marinus below from Smith et al., 2013; the really pale blue boxes under the numbers represent Hox genes):

[Figure: Hox clusters of humans and P. marinus, Fig. 4 of Smith et al., 2013]

Given that Hox9 genes exist in four copies in this species, it seems like there may be four clusters. However, in hagfish, the other kind of living jawless vertebrate, a study found Hox genes that seemed to have as many as seven copies (Stadler et al., 2004). Another round of duplication? It wouldn’t be unheard of. Most teleosts, which include most of the things we call “fish” in everyday parlance, have seven Hox clusters courtesy of an extra genome duplication and loss of one cluster*. Salmon and kin have thirteen, after yet another duplication. Maybe hagfish also had another one – but did lampreys? How many more clusters do those lonely Hox genes belong to?

Mehta et al. hunted down the Hox clusters of Japanese lampreys (Lethenteron japonicum), hoping to pin down exactly how many there were. They used large chunks of DNA derived partly from the testicles, where sperm cells and their precursors keep the full genome throughout the animal’s life (lampreys throw away large chunks of the genome in most non-reproductive cells [Smith et al., 2009]). They probed these for Hox genes and sequenced the ones that tested positive. Plus they also got about two-thirds of the full genome together in fairly big pieces. Together, these data allowed them to get a better idea of the mess that is lamprey Hox cluster genomics.

They assembled four whole clusters, including their neighbouring genes, and a partial fifth cluster. A bunch of other genes sat on smaller sequence fragments containing only a couple of Hoxes, or a Hox and a non-Hox, but they were tentatively assigned to a total of eight clusters, eight being the number of different Hox4 genes in the data (no known vertebrate Hox cluster contains more than one Hox4 gene). The L. japonicum equivalents of the 31 publicly available Hox sequences from P. marinus spread out over six of these, which indicates that both species have at least six clusters. Seems like lampreys had another round of genome duplication after 2R? (Summary of L. japonicum Hox clusters from Mehta et al. below.)

But wait, that’s not the end of it.

First of all, although there are undoubtedly four complete Hox clusters in L. japonicum, the relationships of these clusters to our four are terribly confused. Whether you look at the phylogenetic trees of individual genes, or the arrangement of non-Hox genes on either side of the cluster, only a big pile of what the fuck emerges. Phylogenies are problematic because the unusual composition of lamprey genes and proteins (Smith et al., 2013) could easily throw them off. All the complete lamprey clusters have a patchwork of neighbours that look like a mashup of more than one of our Hox clusters. Might it mean that lampreys’ proliferation of Hox clusters occurred independently of ours? Did we split before 2R after all?

Hox genes are not the only interesting things in a Hox cluster. In the long gaps between them, there are all sorts of little DNA switches that regulate their behaviour. Some of these are conserved across the jawed vertebrates. Mehta et al. aligned complete Hox clusters of humans, elephant sharks and lampreys to look for such sequences – called conserved non-coding elements or CNEs – in the lamprey.

They only found a few, but that’s enough for a bit more head-scratching. Most CNEs in, say, the human HoxA cluster are only found in one elephant shark cluster, and vice versa. Humans have a HoxA cluster, elephant sharks have a HoxA cluster, they’re clearly the same thing, pretty straightforward. Not so for lampreys. Homologues of individual CNEs in the complete lamprey clusters are spread out over all four human/elephant shark clusters. More evidence for independent duplications?

Mehta et al. are cautious – they point out that the silly mix of Hox cluster neighbours in lampreys could just be due to independent post-2R losses, which is plausible if the split between lamprey and jawed vertebrate lineages happened not too long after 2R. There’s also the fact that the weird lamprey sequences are phylogenetic minefields – however, that’s a double-edged sword, since the same caveat applies to analyses that support a post-2R divergence. Then, perhaps the same argument that goes for Hox cluster neighbours could also apply to CNEs. And, of course, this is just Hox clusters. Smith et al.’s (2013) findings about overall genome structure don’t go away just because lamprey Hox clusters are weird.

So, in summary, thanks, lampreys. Fat lot of help you are! 😛

***

*Actually, two losses of two separate clusters in two different teleost lineages. Because Hox evolution wasn’t already complicated enough.

***

References

Kuraku S et al. (2009) Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after? Molecular Biology and Evolution 26:47-59

Mehta TK et al. (2013) Evidence for at least six Hox clusters in the Japanese lamprey (Lethenteron japonicum). PNAS 110:16044-16049

Putnam NH et al. (2008) The amphioxus genome and the evolution of the chordate karyotype. Nature 453:1064-1071

Smith JJ et al. (2009) Programmed loss of millions of base pairs from a vertebrate genome. PNAS 106:11212-11217

Smith JJ et al. (2013) Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution. Nature Genetics 45:415-421

Stadler PF et al. (2004) Evidence for independent Hox gene duplications in the hagfish lineage: a PCR-based gene inventory of Eptatretus stoutii. Molecular Phylogenetics and Evolution 32:686-694

Lotsa news

Hah, I open my Google Reader (damn you, Google, why do you have to kill it??? >_<), expecting to find maybe a handful of new articles since my last login, and instead I get both Nature and Science in one big heap of awesome. The latest from the Big Two are quite a treat!

*

By now, of course, the internet is abuzz with the news of all those four-winged birdies from China (Zheng et al., 2013). I’m a sucker for anything with feathers anywhere, plus these guys are telling us in no uncertain terms that four-wingedness is not just some weird dromaeosaur/troodontid quirk but an important stage in bird evolution. Super-cool.

*

Then there is that Cambrian acorn worm from the good old Burgess Shale (Caron et al., 2013). It’s described as being like modern acorn worms in most respects, except it apparently lived in a tube. Living in tubes is something that pterobranchs, a poorly known group related to acorn worms, do today. The Burgess Shale fossils (along with previous molecular data) suggest that pterobranchs, which are tiny, tentacled creatures living in colonies, are descendants rather than cousins of the larger, tentacle-less and solitary acorn worms. This has all kinds of implications for all kinds of common ancestors…

*

Third, a group used a protein from silica-based sponge skeletons to create unusually bendy calcareous rods (Natalio et al., 2013). Calcite, the mineral that makes up limestone, is not normally known for its flexibility, but the sponge protein helps tiny crystals of it assemble into a structure that bends rather than breaks. Biominerals would just be ordinary rocks without the organic stuff in them, and this is a beautiful demonstration of what those organic molecules are capable of!

*

And finally, Japanese biologists think they know where the extra wings of ancient insects went (Ohde et al., 2013). Today, most winged insects have two pairs of wings, one pair on the second thoracic segment and another on the third. But closer to their origin, they had wing-like outgrowths all the way down the thorax and abdomen. Ohde et al. propose that these wing homologues didn’t just disappear – they were instead modified into other structures. Their screwing with Hox gene activity in mealworm beetles transformed some of the parts on normally wingless segments into somewhat messed up wings. What’s more, the normal development of the same bits resembles that of wings and relies on some of the same master genes. It’s a lot like bithorax mutant flies with four wings (normal flies only have two, the hindwings being replaced by balancing organs), except no modern insect has wings where these victims of genetic wizardry grew them. The team encourage people to start looking for remnants of lost wings in other insects…

Lots of interesting stuff today! And we got more Hox genes, yayyyy!

***

References:

Caron J-B et al. (2013) Tubicolous enteropneusts from the Cambrian period. Nature advance online publication 13/03/2013, doi: 10.1038/nature12017

Natalio F et al. (2013) Flexible minerals: self-assembled calcite spicules with extreme bending strength. Science 339:1298-1302

Ohde T et al. (2013) Insect morphological diversification through the modification of wing serial homologs. Science Express, published online 14/03/2013, doi: 10.1126/science.1234219

Zheng X et al. (2013) Hind wings in basal birds and the evolution of leg feathers. Science 339:1309-1312

“Same” function, but the devil is in the details.

Aaaaaand todaaaaay, ladies and, um, other kinds of people…. Hox genes!

Considering that I did my Honours project on them and I think they are made of awesome, I’m kind of shocked by the general lack of them here*. Hmmmmmm. Well, having just found Sambrani et al. (2013), I think today is a good time to do something about that.

Hox genes in general are “what goes where” type regulators of development. In bilaterian animals, they tend to work along the head to tail axis of the embryo. (Cnidarians like sea anemones also have them, but the situation re: main body axis and Hox genes in cnidarians is a leeeetle less clear. And heaven knows what sort of weird things happened with the rest of the animals.)

Hox genes are responsible for one of the peculiarities of the insect body plan. Unlike many other arthropods, insects have leg-free abdomens. On the left below is a poor little lobster with legs or related appendages all the way down (plus a bonus clutch of eggs). (Arnstein Rønning, Wikimedia Commons). To her right is a bland, boring insect abdomen (Hans Hillewaert, Wikimedia Commons).

As I said, Hox genes are responsible for the difference. Three of them are expressed in various segments of the abdomen of a developing insect: Ultrabithorax (Ubx), Abdominal-A (Abd-A) and Abdominal-B (Abd-B). I’m going to whip out that amazing fluorescent image of Hox gene expression in a fruit fly embryo from Lemons and McGinnis (2006) because aside from being cool as hell, it also happens to be a good illustration:

(The embryo is folded back on itself, so the Abd-B-expressing tail end is right next to the Hox gene-free head)

In insects, all three can turn off the expression of the leg “master” gene distal-less (dll). However, they turn out to do so through two different mechanisms. Ubx and Abd-A proteins have long been known to team up with the distantly related Extradenticle (Exd) and Homothorax (Hth). With their partners, the Hoxes can sit on a regulatory region belonging to the dll gene and prevent its activation.

Sambrani et al. were curious whether Abd-B works in the same way. Sure enough, Abd-B also represses dll wherever it shows up. However, when it comes to interacting with Exd and Hth, differences start to emerge. For starters, those two aren’t even present in the rear end of the abdomen, where Abd-B does its business. When the researchers took the regulatory region of dll and threw various combinations of proteins at it, they found that (1) Abd-B is perfectly capable of binding the DNA on its own, (2) Exd, Hth or engrailed (another Hox cofactor) didn’t improve this ability at all, and (3) Hth alone or in combination with the others actually inhibited the binding of Abd-B to the dll regulatory sequence.

Interestingly, dll repression in the anterior and posterior abdominal segments requires the exact same bits of regulatory DNA even though different proteins are involved. It looks like in the posterior segments, Abd-B actually takes over an “Exd” binding site – maybe that’s how it can do the job without getting Exd itself involved.

Furthermore, while the DNA-binding ability of Abd-B is crucial to its ability to kill dll expression, the same is not the case for Ubx. The authors speculate that cooperation with Exd and Hth kind of exempts Ubx from having to bind the regulatory sequences itself, while Abd-B, being on its own, can’t afford to slack off like that. The paper illustrates the idea with such a deliciously ugly pair of drawings that I feel compelled to post it:

(I know they’re going for colour-matching with the fluorescent images, but unfortunately glowy greens and reds that look good on a black background kind of just hurt my eyes on white.)

I don’t really have a point to make here. (There doesn’t always have to be a point, right?) There’s absolutely nothing surprising about the fact that different Hox genes evolved the same overall function in different ways – after all, they existed as separate entities long before insects lost their buttward legs. I just think Hox genes are cool, and this was an interesting look into the nuts and bolts of how they work. And that’s that.

Cheerio!

***

*Well, aside from this one I’ve written three posts about them and a couple more where they are mentioned. That’s maybe not that bad considering how many different things I’m interested in.

***

References:

Lemons D and McGinnis W (2006) Genomic evolution of Hox gene clusters. Science 313:1918-1922

Sambrani N et al. (2013) Distinct molecular strategies for Hox-mediated limb suppression in Drosophila: From cooperativity to dispensability/antagonism in TALE partnership. PLoS Genetics 9:e1003307

The origin of Hox genes: a telltale neighbourhood

Gods, it’s been so hard to keep my mouth shut about this. A friend of mine just published a paper about Hox genes, and I’ve known about it for a while and it’s been keeping me crazy excited because it’s fascinating and, well: Hox genes! Now that it’s finally out, I can blather about it to my heart’s content, and so I will. Be prepared for a long ride 😉

First of all, a quick rundown of Hox genes for those who aren’t evo-devo geeks. These genes encode transcription factors – proteins that switch genes on/off. They are members of the large and distinguished class of homeobox genes, many of which play important roles in orchestrating embryonic development. Hox genes in particular are famous for laying out the plan for the head to tail axes of bilaterian animals, and for often sitting in neat clusters in the genome and being expressed along the body axis in the same order they are in the cluster. (Below: one of my favourite scientific figures ever, a fruit fly embryo stained in different colours for each of its Hox genes*. From Lemons and McGinnis [2006] via Pharyngula) In short, Hox genes are fucking awesome and extremely important to boot.

Tracing origins

One of the unresolved questions about Hox genes is exactly where they come from, and the new study draws some interesting conclusions regarding their origins. Before we delve into Mendivil Ramos et al. (2012) itself, perhaps it’s best to pull out my old sketch of animal phylogeny, because the relationships of the great old animal lineages are kind of important for the discussion. So this is the family tree of animals at first approximation (photos were all sourced from Wikimedia Commons; more info about them in my Nectocaris post):

Mendivil Ramos et al. follow one of the more popular resolutions of the question marks, in which cnidarians are closest to bilaterians and placozoans are the sister group to cnidarians+bilaterians. They don’t talk too much about ctenophores, but I’ll return to that later 🙂

Bilaterians all have Hox genes, and in most of them they do what they were originally discovered doing in fruit flies: patterning the anterior-posterior axis as they say in Jargonese. Some bilaterians have duplicated individual genes or even whole Hox clusters (we have four clusters, and salmon have as many as 13), but it’s pretty uncontroversial that a neat Hox cluster with representatives of most existing types of Hox genes was present already on the left side of the bilaterian box. So was the little sister of the Hox cluster, unimaginatively called the ParaHox cluster, which only contains three kinds of genes but operates in a similar way to its more famous sister (Brooke et al., 1998).

Where did Hox and ParaHox genes come from? Given the phylogeny of the genes, it’s likely that there was originally a small (maybe 2-3 genes) ProtoHox cluster that duplicated to give rise to both Hoxes and ParaHoxes. We know that cnidarians like sea anemones have both Hox and ParaHox genes, which behave somewhat like their bilaterian counterparts (Ryan et al., 2007). Therefore, the ProtoHox cluster must have existed before the common ancestor of these two great lineages.

Enter the Blob

What about placozoans? That’s where things get a bit complicated. Trichoplax, the mysterious little blob that is the only living representative of this oddball phylum, has only one Hox-like gene noncommittally named Trox-2. A relic of the ProtoHox era? Not really – in phylogenetic analyses of the protein sequence, it tends to group with the ParaHox gene Gsx, whereas you would expect a leftover ProtoHox gene to remain outside the Hox+ParaHox clique.

Is Trox-2 a ProtoHox gene anyway? That would mean something weird happened in the evolution of Hox and ParaHox genes after the cluster duplication: Gsx (and its sisters Hox1-2) would have stagnated somewhere near its ancestral condition while all the other genes sped ahead. It’s a long shot, but evolution has been known to do strange things to gene sequences. Also, homeobox genes are often difficult to classify by sequence alone. Scientists typically use the DNA-binding region that the homeobox encodes for this purpose, but a homeodomain is only 60 amino acids and simply doesn’t contain enough information to place some problematic sequences. And unless we’re examining very closely related genes, the rest of the protein sequence is too different to be compared.

Guilt by association

However, there is another way of solving the mystery. Hox and ParaHox genes are not alone in the genome. They sit on huge chromosomes, and while they tend to banish non-*Hox genes from among them, the flanks of each cluster are populated by a variety of unrelated genes. The key thing is that Hox clusters and ParaHox clusters have different neighbours. Thus, looking at a problem gene’s neighbours can tell us what it is!

(Above: the neighbours of Trox-2. Yellow genes are ParaHox neighbours in humans, green genes are Hox neighbours, grey genes have no human counterparts, and orange genes are parts of both Hox and ParaHox neighbourhoods. From Mendivil Ramos et al. [2012])

This is exactly what happened. My lovely friend Olivia looked at the chunk of genomic sequence that contains Trox-2 and found about two dozen genes on it that had clear homologues in humans. She then tallied where each of the human homologues was, and behold: many of them crowded around ParaHox clusters (we also have several of those, courtesy of whole genome duplications), while only one was a Hox neighbour in humans. If Trox-2 were a ProtoHox, we’d expect a mixture of Hox and ParaHox neighbours, but that’s not what we find at all. Statistically speaking, it’s a no-brainer. Trox-2 is exactly where a ParaHox gene should be.
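To get a feel for why a lopsided tally like this is so convincing, here is a minimal sketch of a one-sided binomial test in Python. It is not necessarily the test used in the paper. The 50:50 null (a ProtoHox neighbourhood being an even mix of Hox and ParaHox neighbours) and the ParaHox count of 12 are assumptions made up purely for illustration; only the single Hox neighbour comes from the text above.

```python
from math import comb


def binomial_lower_tail(k, n, p=0.5):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical tally of where the human homologues of Trox-2's neighbours sit.
# "Hox": 1 is from the post; "ParaHox": 12 is an invented placeholder.
tally = {"ParaHox": 12, "Hox": 1}

n_informative = sum(tally.values())

# Chance of seeing this few Hox neighbours if the neighbourhood were an
# even Hox/ParaHox mixture (the assumed ProtoHox expectation).
p_value = binomial_lower_tail(tally["Hox"], n_informative)
print(f"P(at most {tally['Hox']} Hox neighbour(s) out of {n_informative}) = {p_value:.4f}")
```

Swapping in the real counts from the paper's figure would change the exact number, but the logic stays the same: the fewer Hox neighbours turn up, the harder it is to call the neighbourhood a mixture.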

Ghosts in the genome

Now, we have a problem. If Trox-2 is a ParaHox gene, it must have come after the Hox/ParaHox duplication. So where the hell is the Hox cluster? Well, seeing as Trichoplax only has one ParaHox gene instead of the more typical three or so, gene loss certainly sounds like a possibility. Is there an “empty” Hox cluster lurking somewhere in the blob’s genome? Here, cnidarians turn out to be pretty helpful. After sequencing the genome of the sea anemone Nematostella vectensis, Putnam et al. (2007) attempted to reconstruct parts of the original chromosomes of the cnidarian-bilaterian ancestor. They called the results Putative Ancestral Linkage Groups, in other words, groups of genes that have stayed together since cnidarians and bilaterians diverged 600 or so million years ago.

One of these PALs contains over 200 conserved Hox neighbours, nearly all of which are present in Trichoplax. Strikingly, about half of them are close enough to one another that they are in the same chunk of sequence even though the Trichoplax genome hasn’t been stitched together to the level of whole chromosomes. That’s much more than you’d expect by chance. Trichoplax has a Hox locus without Hox genes, what Mendivil Ramos et al. call a ghost Hox locus.

Hox genes all the way down?

If you followed so far, you might have noticed that we’ve been pushing that elusive ProtoHox further and further back in animal evolution. It preceded bilaterians, it preceded cnidarians and bilaterians, and now it turns out it also preceded our split from placozoans. Will we find it if we look in the remaining animal lineages? Since a ctenophore genome hasn’t yet been released to the public, that question transforms into: will we find it in sponges?

The sponge Amphimedon queenslandica does have a publicly available genome, and much has been made of its apparent lack of many developmentally important transcription factor families (e.g. Larroux et al., 2008). It doesn’t have anything that looks like a Hox, ParaHox or ProtoHox gene, but what about the neighbourhoods?

Like that of Trichoplax, the Amphimedon genome sequence is in relatively small pieces, so a little clever statisticking was needed to decide whether it contains Hox, ParaHox or ProtoHox neighbourhoods. The starting points were the PAL of Hox neighbours mentioned above, and a PAL of ParaHox neighbours the team constructed using the human and Trichoplax genomes. These genes were distributed among many genomic scaffolds, but of course, lacking chromosome-level information, the group didn’t know whether any of these scaffolds were actually linked to each other in the sponge genome.

The solution was a simulation: take the number of genes in the PAL, take the number and size (in number of genes) of the thousands of Amphimedon scaffolds, and scatter the PAL members randomly among the scaffolds with the larger scaffolds proportionately more likely to receive a PAL gene. When all the PAL members are handed out, count the number of scaffolds with PAL members on them. Repeat this a thousand times, and you get an idea what the distribution of Hox and ParaHox neighbours would be if they weren’t clustered together. This approach showed that the real distribution is anything but random. Hox and ParaHox neighbours are clearly clustered in the sponge genome, and what’s more, they are clustered separately.
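The procedure above can be sketched in a few lines of Python. This is a toy reconstruction from the description, not the authors’ actual code: the function names, the toy scaffold sizes, the observed value and the one-sided empirical p-value are all my own assumptions. It also samples with replacement, so a small scaffold could in principle receive more PAL genes than it actually holds.

```python
import random


def simulate_scaffold_counts(n_pal_genes, scaffold_sizes, n_reps=1000, seed=0):
    """Scatter PAL genes over scaffolds at random, weighted by scaffold size.

    Returns, for each replicate, the number of distinct scaffolds that
    received at least one PAL gene.
    """
    rng = random.Random(seed)
    scaffold_ids = list(range(len(scaffold_sizes)))
    counts = []
    for _ in range(n_reps):
        # Larger scaffolds are proportionately more likely to receive a gene.
        hits = rng.choices(scaffold_ids, weights=scaffold_sizes, k=n_pal_genes)
        counts.append(len(set(hits)))
    return counts


def empirical_p_value(observed, simulated):
    """Fraction of replicates at least as clustered as the observation.

    Fewer occupied scaffolds means tighter clustering, so this is a
    lower-tail test; the +1 terms avoid reporting a p-value of exactly 0.
    """
    as_extreme = sum(1 for c in simulated if c <= observed)
    return (as_extreme + 1) / (len(simulated) + 1)


# Toy numbers, invented for illustration only.
toy_scaffold_sizes = [10] * 200   # 200 scaffolds of 10 genes each
toy_n_pal_genes = 20              # PAL genes to hand out
toy_observed = 3                  # scaffolds the "real" PAL genes sit on

simulated = simulate_scaffold_counts(toy_n_pal_genes, toy_scaffold_sizes)
print("empirical p-value:", empirical_p_value(toy_observed, simulated))
```

If the real PAL genes occupy far fewer scaffolds than almost every random replicate, they are more clustered than chance would explain.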

Still no ProtoHox locus, in other words. At some point in the murky depths of their ancestry, sponges lost bona fide Hox and ParaHox genes!

So…

That raises a couple of issues. First, where is the ProtoHox? Hox-like genes have never been found outside animals. These are smart people we’re talking about, so they checked the genome of the closest non-animal relative we have today, a choanoflagellate. Neither Hox/ParaHox nor ProtoHox neighbourhoods were there – the PAL genes didn’t cluster together any more than they would by chance. The whole *Hox phenomenon seems unique to animals (or else the choanoflagellate genome is totally scrambled). It appears that somewhere in our ancestry, ProtoHox gene(s) appeared and parted ways before sponges split from the rest of the animals. Since we have no surviving descendants of these ancestors outside of sponges and the rest of the animals, we’ll probably never find unduplicated descendants of the ProtoHox cluster.

Second, what happened in ctenophores? Everything we know about their genomes suggests that they completely lack Hox-like genes. Although there have been studies that placed them even further out than sponges (Dunn et al., 2008), it’s more likely that they are much closer to bilaterians than that (Philippe et al., 2011). I think I’m not the only one itching to examine a ctenophore genome for Hox neighbours…

And finally, if some distant ancestor of all animals had full-blown Hox and ParaHox clusters, what the heck was it doing with them? Was it something unexpectedly complex that would need genes for axial patterning? Are sponges and placozoans grossly simplified descendants of a much more complex ancestor, or did Hox-like genes only become involved in dividing up body axes later in evolution?

The more we learn the less we know. One thing is (once again) clear: assuming that a simple animal is a good proxy for an ancestral animal is a dangerous, dangerous assumption to make.

***

*Technically, fruit flies have twelve Hox genes, but only seven are shown in the image. Hox2/proboscipedia is a normal Hox gene involved in the development of mouthparts among others, but four more genes have completely lost their “canonical” Hox gene-like activities. That includes all three of Drosophila‘s weird triplicated Hox3 genes.

***

References

Brooke NM et al. (1998) The ParaHox gene cluster is an evolutionary sister of the Hox gene cluster. Nature 392:920-922

Dunn CW et al. (2008) Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452:745-749

Larroux C et al. (2008) Genesis and expansion of metazoan transcription factor gene classes. Molecular Biology and Evolution 25:980-996

Lemons D and McGinnis W (2006) Genomic evolution of Hox gene clusters. Science 313:1918-1922

Mendivil Ramos O et al. (2012) Ghost loci imply Hox and ParaHox existence in the last common ancestor of animals. Current Biology in press, available online 26/09/2012, doi: 10.1016/j.cub.2012.08.023

Philippe H et al. (2011) Resolving difficult phylogenetic questions: why more sequences are not enough. PLoS Biology 9:e1000602

Putnam NH et al. (2007) Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317:86-94

Ryan JF et al. (2007) Pre-bilaterian origins of the Hox cluster and the Hox code: evidence from the sea anemone, Nematostella vectensis. PLoS ONE 2:e153