The Mammal’s very own Hox genes (excite! Woo!)

It’s kind of hard to begin this post. First of all, let’s get the important news out of the way: I’ve just published a paper. In a moment, I’ll get around to discussing it at even more than my usual length, but I feel that I can’t do my excited puppy act without at least trying to capture how bloody much this paper means to me. The following may get a little personal; if you want to jump straight to the Cool Stuff, feel free to scroll a couple of paragraphs down.

<personal bit>

As you may have guessed from the long silence here, it’s not been a good handful of years, Real Life and mental health-wise. After my PhD, the prospect of the research career I’d dreamed of since I first began to grasp the meaning of the word “scientist” no longer seemed so dreamlike. It may surprise you to hear this from someone who finished a PhD with four published papers and spent the years of said PhD blathering regularly on the internet, but I find writing things for other people to read very, very stressful. In the case of a job application or a thesis chapter, that becomes “I’m not eating or sleeping properly” stressful. (Don’t ask me how I survived 20+ years of formal education.)

Long story short, for the last 3 years I’ve been getting by with a minimum wage job for which I’m both vastly overqualified and singularly ill-suited. I started the research project that culminated in the paper you can now read (for free, yay!) in BMC Evolutionary Biology (Szabó and Ferrier, 2018) while unemployed and broke, and I did most of it in my free time around work. This paper is a hard-won victory over myself and my circumstances. It’s a tiny glint of self-worth in the depth of the tunnel. In some ways, it was harder than my thesis: no funding body to satisfy, no lab mates to gripe at, no deadlines to spur me on. The only constant was my ex-supervisor turned co-author, who took my hobby project under his wing for the slim reward of having his name on a paper and nudged me into finishing it with unending patience. Here’s to Dave Ferrier, champion of non-model organisms, homeobox guy extraordinaire and all-round excellent human being. Dave, I hope you know you’re an absolute star.

</personal bit>

With that out of the way, it’s time for the Cool Stuff. There are Hox genes! More Hox genes than anyone ever imagined! (That is kind of the point, in fact!)

Apologies for the word count. I thought it would be a good idea to explain a few things, but also, I think I enjoy waffling about my baby far too much 😊

Hox therapy

The story of my Hox paper begins with an unemployed biologist with an overabundance of free time and a desperate need to do something scientific. Since I have a slightly odd idea of “fun”, back in 2015 I decided to catalogue Hox gene (or rather, protein) diversity in the animal kingdom, with particular focus on obscure and poorly studied groups. (I didn’t get very far, as we’ll see.)

Since it’s hard to discuss the paper without dropping some arcane zoological nomenclature, here’s my trusty old animal phylogeny to (re)acquaint us with the general outlines of the animal kingdom (I might need to update this in light of the Great Ctenophore Controversy some day, but we’re not dealing with anything outside the Bilateria today):


For the purposes of my paper, we’re zooming into the deuterostome branch, which looks something like this on the inside (borrowing my own rather lacklustre last-minute figure from Szabó and Ferrier [2018]):


Everything on this tree apart from chordates (that’s us) belongs to a group called Ambulacraria, which contains two phyla, hemichordates (top two branches) and echinoderms (the next five). Echinoderms are the more familiar of the two – starfish and sea urchins and suchlike – and also the focus of my project. (I could find no Hox gene data from pterobranchs, which puts a slight caveat on everything I say about hemichordates.)

Back to Hox genes.

Hox genes were kind of my gateway drug into evolutionary developmental biology. A few decades earlier, they had served the same purpose for developmental biology as a whole, since they were among the first genes to be discovered that (1) directed embryonic development and (2) were comparable between very disparate animal groups. The short version, which will suffice for our purposes here, is that Hox genes are important in what we eggheads call anteroposterior patterning, or determining what body parts go where along the head (anterior) to tail (posterior) axis of a (bilaterian) animal.

In (I think, I haven’t counted) the majority of animals that have them, Hox genes are clustered to a greater or lesser extent. Rather than being scattered haphazardly across the genome, they sit close to one another along the same stretch of DNA. (Duboule [2007] is an excellent – albeit now slightly out of date – review of the various known configurations.)

Since my study is about echinoderms, the schematic Hox cluster shown below is the neatest known example from an echinoderm, the crown-of-thorns starfish Acanthaster planci (source: Baughman et al., 2014):


In this image, Hox genes are colour-coded according to a commonly used classification scheme. This classification is mostly based on the homeodomain, or the “business” end of the protein that a Hox gene encodes. A homeodomain makes up a relatively small portion (maybe 1/5th on average) of a typical Hox protein, but it’s the part that interacts with the DNA switches through which Hoxes control their target genes, and it’s often the only part that is similar enough to be compared between different Hox types.

The important genes for us today are the “posterior” Hox genes shown in pink and red above, especially the last two. The four posterior Hox genes seen here represent the “standard” set for ambulacrarians, although it’s uncertain whether Hox11/13b-c were already separate genes or just a single precursor gene in the ambulacrarian ancestor.

Eureka… or WTF?

“The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ (I found it!) but rather, ‘hmm… that’s funny…’”
Almost certainly not Isaac Asimov

In creating my grand catalogue, I’d quickly breezed through vertebrates (which are all essentially the same for my purposes) and other chordates (for which the data I could find were rather limited). I thought echinoderms would be an easy job, too: there were good in-depth studies of a few species, and they hadn’t revealed anything terribly unusual other than a rearrangement of the Hox cluster in sea urchins (Cameron et al., 2006).

In fact, through comparison with their sister group, the hemichordates (Freeman et al., 2012), it seemed likely that the ancestral echinoderm had a nice, ordered Hox cluster with few if any oddities (Baughman et al., 2014). So I clicked my way to the wonderful Echinobase, which has searchable draft genomes from four of the five living classes of echinoderms (crinoids, a.k.a. sea lilies and feather stars, are missing, although a genome in a very early, fragmentary stage exists here). I expected to double-check the published data, collect the same genes from the groups for which Hox papers hadn’t been published, and be off to protostomes in a day or two. Two years later, I still haven’t made it to protostomes, but I’ve gone rather deeper than expected in echinoderms…

(Below: my cast. The main characters are Strongylocentrotus purpuratus [photo: Kirt L. Onthank] and Lytechinus variegatus [photo: Hans Hillewaert] representing sea urchins, Patiria miniata [photo: Jerry Kirkhart] and Acanthaster planci [photo: JSLUCAS75] for sea stars, Parastichopus parvimensis [from here] and Apostichopus japonicus [photo: OpenCage] for sea cucumbers, Metacrinus rotundus [photo: OpenCage] and Anneissia japonica [photo: OpenCage] for crinoids, Ophiothrix spiculata [photo: Jerry Kirkhart] for brittle stars, with supporting acts from Peronella japonica [sea urchins, photo: Endo et al., 2018], Ophiopsila aranea [brittle stars, photo: Bernard Picton], Balanoglossus simodensis [photo: Misaki Marine Biological Station, U of Tokyo], Saccoglossus kowalevskii [photo: Lowe lab] and Ptychodera flava [photo: Moorea BioCode via CalPhotos] for hemichordates, and Branchiostoma floridae [photo via JGI genome portal], Latimeria menadoensis [photo: Claudio Martino] and Callorhinchus milii [photo: fir0002/Flagstaffotos] for chordates. I sourced the photos through Wikipedia/Wikimedia Commons where I could; other sources are linked where applicable.)


You see, I didn’t want to stop at just homeodomains. Homeodomains are cool and important and all, but one thing I’d learned from my earlier forays into the world of Hox genes was that valuable information hid in small patches of conserved sequence elsewhere in their proteins. Besides, I am a pathological perfectionist. I felt a terrible need to collect complete Hox sequences wherever possible.

I already mentioned that sequence similarity between Hoxes outside the homeodomain can be weak to non-existent. I ran into this problem with Echinobase’s brittle star, Ophiothrix spiculata. Using the known sea urchin Hoxes to search its genome, I’d found believable matches for many of them, but the 11/13s defeated me. I had two homeodomains that I thought represented 11/13b and c, but I couldn’t for the life of me recover the rest of the proteins.

The problem with genome databases (or their great advantage depending on your perspective) is that they contain all of the DNA that could be sequenced from the owner of the genome. The problem with Hox genes – most of our genes, in fact – is that they aren’t continuous stretches of DNA. Your typical gene exists in multiple segments (exons) separated by a whole lot of DNA (introns) that leaves no trace in the protein product of the gene. (Hox genes normally have two or three exons, the first of which is devoid of homeodomain parts.)

When a gene is expressed, the cell first makes an RNA copy of all that, which is edited to throw out the introns and splice the exons together. That intron-less RNA copy is then carried off to be translated into a protein. Transcriptomes are derived from the RNA copies of active genes. Introns lie forgotten on the cutting room floor: in the sequenced transcripts, one exon continues straight into the next. Therefore, if I could find a brittle star transcriptome, and the 11/13b-c homeodomains in it, perhaps there would be enough of the rest in there to reconstruct those elusive first exons.
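For anyone who prefers to see this in code, here is a toy sketch of why a transcriptome helps. Everything in it is made up for illustration: the mini “gene”, the exon coordinates and the `splice` helper are hypothetical, and none of it comes from a real Hox gene or from the tools I actually used.

```python
def splice(genomic, exons):
    """Join exon slices (0-based, end-exclusive) into an intron-free transcript."""
    return "".join(genomic[start:end] for start, end in exons)


# Hypothetical mini-gene: uppercase = exon, lowercase = intron.
genomic = "ATGGCC" + "gtaagtttttcag" + "AAACGT"
exons = [(0, 6), (19, 25)]  # assumed coordinates of the two exons

transcript = splice(genomic, exons)
print(transcript)

# A stretch that spans the exon-exon junction exists in the transcript,
# but not as one continuous piece in the genome.
junction = "CCAAA"
print(junction in transcript)
print(junction in genomic.upper())
```

The point of the last two lines is that anything straddling the junction is only contiguous in the spliced transcript, which is why a transcript can link a homeodomain to a first exon that a genome search alone struggles to find.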

Luckily, Delroisse et al. (2016) had published exactly what I needed. In one of their transcriptomes, I found a homeodomain that looked like my Ophiothrix Hox11/13c, as part of a near-complete sequence. Excited, I did the reciprocal search against the Ophiothrix genome…

… and hit neither 11/13b nor 11/13c.

So here I am, staring at a beautiful match between this transcript and a part of the Ophiothrix genome that I hadn’t examined before. The match contains sequence from the first exon, which, given my previous experience with these buggers, is a sure sign that they’re the same gene. And it’s neither of the ones I’d expected.

A bit later in a different database, I hit upon an automatically predicted sea urchin protein that definitely isn’t 11/13b or c either. This is the model sea urchin, S. purpuratus, the one I thought we knew inside out when it came to Hoxes. I check the genome on Echinobase, and lo and behold, there’s the third 11/13b-c type gene, and it’s nowhere near the Hox cluster.

If memory serves, it’s roughly at this point that the words, “What. The. Actual. Fuck. Is. Going. On.” occur in my research notes. (Complete with punctuation.)

I checked the other species on Echinobase. Three 11/13b-c genes again, every time. Over on Genbank, I found a complete protein sequence from a sand dollar that Tsuchimoto and Yamaguchi (2014) had previously classified as 11/13c by exclusion. The Japanese duo had a clear b, but this other sequence was behaving oddly in their phylogenetic analyses. Now I had the obvious explanation: it wasn’t 11/13c at all.*

I wrote to Dave and found out that this was also news to him. By all appearances, I had stumbled on something truly new, in a gene family that’s both iconic in our field, and dear to my obsessive little heart.

We decided to try to turn it into a paper.

In search of the alphabet’s end

Once we’d made that decision, and following Dave’s advice, I had a few tasks ahead of me. I had to check how far back in evolution our new gene (which we called Hox11/13d) went. I had to test whether it had truly escaped the Hox cluster in all of our study species. I had to refresh my memory on deuterostome posterior Hox genes in general, both for paper-writing purposes and in case there was a forgotten reference to our “new” gene lurking somewhere in the literature.

There wasn’t, but.

In a figure legend in Thomas-Chollier et al. (2010), there is a brief mention of an unnamed “Hox11/13c-like” sequence in sea urchins. When I saw that, I damn near soiled myself, but the authors couldn’t definitively identify this sequence as a Hox gene, so they left it at that throwaway comment and a few bits of supplementary data. Luckily, they had a gene ID that I could look up on Echinobase.

Gods help me, it turned out to be another new Hox. When the shock of Hox11/13d had barely worn off, I was confronted with a possible Hox11/13e. And this one wasn’t in the Hox cluster either.

Aside from not being part of the Hox cluster, Hox11/13d is a pretty good echinoderm Hox gene. The homeodomain it encodes is reminiscent of Hox11/13b and c, and, although they are hard for automated searches to find, there are similarities outside the homeodomain that place it firmly in the same group as b-c.

Unlike d, Thomas-Chollier’s “11/13c-like” sequence isn’t that 11/13c-like at all, as you might have guessed from the fact that they weren’t even sure it’s a Hox. The region immediately following the homeodomain (sometimes known as the C-peptide) is very similar to the same part of Hox11/13d. These kinds of motifs can sometimes be used to tell different Hox genes apart. Two C-peptides being strongly similar is a clue that we’re dealing with related genes. However, the homeodomain of Hox11/13e, as we indeed dubbed Thomas-Chollier’s sequence, is really, really weird. It isn’t just unlike 11/13c, it’s unlike anything else I’d seen before. It groups with posterior Hoxes when we test it against a variety of homeodomains, but you wouldn’t know that simply from looking at it.

It is, however, an oddball with a history. As strange as that homeodomain is, once I knew what I was looking for, I found examples in all my other echinoderms. This combination of strong conservation of one Hox gene with considerable differences from other Hox genes just screams “study me more!”, especially when you realise that Hox11/13e appears to be limited to echinoderms (unless something like it is hiding in protostomes…). I looked quite carefully in the hemichordates available to me (Simakov et al., 2015), but the only thing I found that wasn’t one of the “canonical” four posteriors is something called “Abdominal B-like”, which is weird in its own way and not obviously connected to either of our two new genes.

Tangled histories and unhelpful clues

I alluded to the question of Hox11/13b-c origins earlier on. Posterior Hox genes in deuterostomes are notoriously difficult to classify (Ferrier et al., 2000; Thomas-Chollier et al., 2010). When you try to use traditional tree-building methods on them, you get a big unresolved mess, as if the twigs on the tree emerged from an impenetrable mist that hides the arrangement of the older branches from view. Ambulacrarians are definitely the better-behaved half of the Deuterostomia in this regard, since we can say with some confidence that Hox9/10, 11/13a and at least a single precursor to 11/13b-c were present in their last common ancestor.

Nonetheless, two new genes, at least one of which is clearly close to 11/13b-c, complicate matters (Abdominal B-like, as they say in scientist-speak, is beyond the scope of this work). Were they lost in hemichordates? Did echinoderms undergo extra gene duplications, and if so, was it from one or two ancestral genes? Where on earth does Hox11/13e fit? I did a lot of exploratory tree-building for this paper, none of which was particularly helpful in answering those questions.

My other hope was to look at the parts of the protein sequence that led me to my new Hoxes in the first place: all the stuff other than the homeodomain. Using a program called MEME, I found a fair few conserved motifs, but they only seemed to add to the confusion. Hox11/13e, for which I only had first exons (and tentative ones at that) from sea urchins and sea stars, yielded nothing of use apart from its striking C-peptide. In the others, the distribution of motifs created a patchwork of similarities that didn’t neatly align with any one possible history. Echinoderm Hox11/13c mostly did its own thing, while b and d each shared a different subset of motifs with one or both of the hemichordate b-c proteins.

I’m almost inclined to think that there was a single, “prototype” Hox11/13b+ sequence in the ambulacrarian ancestor, which contained all of the motifs I found. In that scenario, separate b and c (and d and maybe e) genes would have evolved independently in hemichordates and echinoderms, and each descendant gene would have lost some of the original motifs more or less at random. Duplicated genes can split the functions of their single ancestor between them (Force et al., 1999), so why not motifs? Short sequence motifs like the ones I was looking for can have important functions, after all. It’s a possibility, but we may never know for sure.

Hox genes gone rogue

I mentioned before that Hox11/13d was outside the Hox cluster. Well, so is Hox11/13e. As far as I can tell, Hox11/13d and e always reside on separate chunks of the genome from any other Hox gene, including each other. They are always accompanied by neighbouring genes that aren’t Hoxes. Although detachment of a posterior gene from an otherwise apparently intact Hox cluster also happened in ragworms (Hui et al., 2012), it’s still a surprise in echinoderms. Since the relationship between the organisation of Hox genes and their regulation in space and time is… kinda complicated, we can’t really tell what, if anything, all this wandering implies without actually looking at some gene expression.

What are they for?

Then there’s the question of what on earth these genes do. Thanks to Tsuchimoto and Yamaguchi (2014), we know that Hox11/13d is active in later embryonic stages of some sea urchins. It even looks like it might be working with Hox11/13b in a Hox-like fashion, the two of them having adjacent expression domains. We have some transcriptomic evidence that this gene is also active in other sea urchins, brittle stars and starfish, but no idea what it’s doing in any of the above.

We know even less about Hox11/13e. The only evidence for expression I’m aware of is from starfish testicles, and testicles will express any old piece of DNA with an “on” switch. If it’s somehow involved in development, it must be either at very low levels that are difficult to capture in a transcriptome, or at developmental stages that weren’t included in the data I encountered.

If it does have a role in adult echinoderm development, that would be crazy exciting, as both adult echinoderm anatomy and Hox11/13e are so weird and unique. Although they develop from bilaterally symmetrical larvae, adult echinoderms have dispensed with the symmetry that gave Bilateria its name. Instead, like a sea anemone (or a regular anemone…), they are radially symmetrical. Hox genes are involved in both larval and adult development in echinoderms, but from what little I’ve been able to glean from the existing literature, it’s different subsets in larvae and adults rather than the entire Hox cluster together. Is Hox11/13e in the “adult” subset, missed until now due to its unusual sequence? I really hope someone with a lab and a ready supply of baby echinoderms investigates in the near future…

A lesson about expectations

I could go on for a lot longer about this project, but it’s probably time to form some sort of conclusion. For me, perhaps the most important take-home message of this adventure is not what I found, but how and where and why I found it.

I didn’t set out to discover anything. All I wanted to do was collect and organise information already out there. (If a genie popped out of my desk lamp, I might just wish for a full-time job where I get to build my Hox directory… given the volume of genome data already out there and coming out every time I look, continuing this as a hobby project in my free time seems hopelessly Sisyphean now.)

The discovery of Hox11/13d and all that followed was an accidental side effect of my penchant for perfectionism. If I’d contented myself with the homeodomains most students of Hox evolution focus on, I would never have seen a Hox that wasn’t in the books, a Hox I hadn’t expected to exist.

Expectations are important. I’d told myself that I wanted to make sure I had everything, but when my searches spat out a hundred different results, I started to slack off soon after I ticked off the Hoxes I knew. I gave the rest of the hit list a half-hearted effort at best. Hox11/13d has a homeodomain that’s split across two exons, and Hox11/13e is weird. In a search that scores both the closeness and the length of a match, that pushes them to the bottom of the results, where a casual observer, or an observer who thinks they know what they’re looking for, will most likely miss them. I thought I knew that sea urchins had a single, intact(ish) Hox cluster with 11 genes. I’d read a pretty good paper on it. Only the paper wasn’t quite right, after all.
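To make the ranking problem concrete, here is a minimal sketch with invented numbers. The `toy_score` function is a stand-in I made up (aligned length times fraction of identical positions), not the scoring any real search tool uses, and the hit lengths and identities are hypothetical. The only real-world figure is that a homeodomain is about 60 amino acids long.

```python
def toy_score(aligned_length, identity):
    """Toy hit score rewarding both match length and closeness (not a real search score)."""
    return round(aligned_length * identity)


# Hypothetical hits against a 60-residue homeodomain query.
hits = {
    "known Hox, intact homeodomain": toy_score(60, 0.90),
    "split homeodomain, exon 1 part": toy_score(35, 0.80),
    "split homeodomain, exon 2 part": toy_score(25, 0.80),
    "weird homeodomain": toy_score(60, 0.40),
}

# Best hits first, the way a results page would list them.
ranked = sorted(hits, key=hits.get, reverse=True)
for name in ranked:
    print(f"{hits[name]:>3}  {name}")
```

Under these assumptions, the split homeodomain shows up as two short hits and the weird one as a long but poor hit, so all three land well below the familiar genes even though each one is a genuine Hox.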

To me, this study stands as a reminder to keep looking. In an era when new genomes are popping up left and right and Big Data with automated analyses is the scientific zeitgeist, it’s still worth rolling your sleeves up, picking up the old magnifying glass and taking a closer look – even in organisms you think you know. You might just chance upon some real treasure.



*A “Hox11/13c” behaving oddly should be immediately suspicious based on what I saw in my own trees, where echinoderm Hox11/13c consistently formed a strongly supported group. But that’s hindsight for you…



Baughman KW et al. (2014) Genomic organization of Hox and ParaHox clusters in the echinoderm, Acanthaster planci. Genesis 52:952-958

Cameron RA et al. (2006) Unusual gene order and organization of the sea urchin hox cluster. JEZ B 306:45-58

Delroisse J et al. (2016) De novo adult transcriptomes of two European brittle stars: spotlight on opsin-based photoreception. PLoS ONE 11:e0152988

Duboule D (2007) The rise and fall of Hox gene clusters. Development 134:2549-2560

Endo M et al. (2018) Hidden genetic history of the Japanese sand dollar Peronella (Echinoidea: Laganidae) revealed by nuclear intron sequences. Gene 659:37-43

Ferrier DEK et al. (2000) The amphioxus Hox cluster: deuterostome posterior flexibility and Hox14. Evol Dev 2:284-293

Force A et al. (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545

Freeman R et al. (2012) Identical genomic organization of two hemichordate Hox clusters. Curr Biol 22:2053-2058

Hui JH et al. (2012) Extensive chordate and annelid macrosyntheny reveals ancestral homeobox gene organization. Mol Biol Evol 29:157-165

Simakov O et al. (2015) Hemichordate genomes and deuterostome origins. Nature 527:459-465

Szabó R and Ferrier DEK (2018) Two more Posterior Hox genes and Hox cluster dispersal in echinoderms. BMC Evol Biol 18:203

Thomas-Chollier M et al. (2010) A non-tree-based comprehensive study of metazoan Hox and ParaHox genes prompts new insights into their origin and evolution. BMC Evol Biol 10:73

Tsuchimoto J and Yamaguchi M (2014) Hox expression in the direct-type developing sea urchin Peronella japonica. Dev Dyn 243:1020-1029


To dump a chunk of trunk

The Mammal has deemed that Hox genes and good old-fashioned feel-good evo-devo are a good way to blink back to life*. Also, tardigrades. Tardigrades are awesome. Here is one viewed from above, from the Goldstein lab via Encyclopedia of Life:


Tardigrades or water bears are also a bit unusual. Their closest living relatives are velvet worms (Onychophora) and arthropods. Exactly who’s closest to whom in that trio of phyla collectively known as the Panarthropoda is not clear, and I don’t have the energy to wade into the debate – besides, it’s not really important for the purposes of this post. What Smith et al. (2016) concluded about these adorably indestructible little creatures holds irrespective of their precise phylogenetic position.

Anyway. I said tardigrades were unusual, and I don’t mean their uncanny ability to survive the apocalypse and pick up random genes in the process (Boothby et al., 2015). (ETA: so apparently there may not be nearly as much foreign gene hoarding as the genome paper suggests – see Sujai Kumar’s comment below! Doesn’t change the fact that tardigrades are tough little buggers, though 🙂 ) The oddity we’re interested in today lies in the fact that all known species are built to the exact same compact body plan. Onychophorans and many arthropods are elongated animals with lots of segments, lots of legs, and often lots of variation in the number and type of such body parts. Tardigrades? A wee head, four chubby pairs of legs, and that’s it.

How does a tardigrade body relate to that of a velvet worm, or a centipede, or a spider? Based solely on anatomy, that’s a hell of a question to answer; even the homology of body parts between different kinds of arthropods can be difficult to determine. I have so far remained stubbornly uneducated on the minutiae of (pan)arthropod segment homologies, although I do see papers purporting to match brain parts, appendages and suchlike between different kinds of creepy-crawlies on a fairly regular basis. Shame on me for not being able to care about the details, I guess – but the frequency with which the subject comes up suggests that the debate is far from over.

Now, when I was first drawn to the evo-devo field, one of the biggest attractions was the notion that the expression of genes as a body part forms can tell us what that body part really is even when anatomical clues are less than clear. That, of course, is too good to be simply true, but sometimes the lure of genes and neat homology stories is just too hard to resist. Smith et al.’s investigation of tardigrade Hox genes is definitely that kind of story.

Hox genes are generally a good place to look if you’re trying to decipher body regions, since their more or less neat, orderly expression patterns are remarkably conserved between very distantly related animals (they are probably as old as the Bilateria, to be precise). A polychaete worm, a vertebrate and an arthropod show the same general pattern – there is no active Hox gene at the very front of the embryo, then Hoxes 1, 2, 3 and so on appear in roughly that order, all the way to the rear end. There are variations in the pattern – e.g. the expression of a gene can have sharp boundaries or fade in and out gradually; different genes can overlap to different extents, the order isn’t always perfect, etc. – but staggered Hox gene expression domains, with the same genes starting up in the same general area along the main body axis, can be found all across the Bilateria.

Tardigrades are no exception, in a sense – but they are also quite exceptional. First, their complement of Hox genes is a bit of a mess. At long last, we have a tardigrade genome to hand, in which Smith et al. (2016) found good honest Hox genes. What they didn’t find was a Hox cluster, an orderly series of Hox genes sitting like beads on a DNA string. Instead, the Hox genes in Hypsibius dujardini, the sequenced species, are all over the genome, associating with all kinds of dubious fellows who aren’t Hoxes.

What Smith et al. also didn’t find was half of the Hox genes they expected. A typical arthropod has ten or so Hox genes, a pretty standard ballpark for an animal that isn’t a vertebrate. H. dujardini has only seven, three of which are triplicates of Abdominal-B, a gene that normally exists in a single copy in arthropods. So basically, only five kinds of Hox gene – number two and most of the “middle” ones are missing. What’s more, two more tardigrades that aren’t closely related to H. dujardini also appear to have the same five Hox gene types (though only one Abd-B each), so this massive loss is probably a common feature of Tardigrada. (No word on whether the scattering of the Hox cluster is also shared by the other two species.)
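The bookkeeping is easy to lose track of, so here is a small sketch of the counting. The gene names are my assumption: I’m using the standard arthropod Hox nomenclature and my reading of which four non-Abd-B genes survive in tardigrades, neither of which is spelled out in the text above. Only the totals (ten or so, seven, three Abd-B copies, five kinds) come from the post itself.

```python
from collections import Counter

# Assumed: the usual ten arthropod Hox gene types (names not listed in the post).
arthropod_kinds = {"lab", "pb", "Hox3", "Dfd", "Scr", "ftz", "Antp", "Ubx", "abd-A", "Abd-B"}

# Assumed identities for the seven H. dujardini genes, three of them Abd-B copies.
tardigrade_genes = ["lab", "Hox3", "Dfd", "ftz", "Abd-B", "Abd-B", "Abd-B"]

counts = Counter(tardigrade_genes)
kinds = set(counts)                # distinct Hox gene types present
missing = arthropod_kinds - kinds  # types an arthropod has but the tardigrade lacks

print(len(tardigrade_genes), "genes,", len(kinds), "kinds")
print("Abd-B copies:", counts["Abd-B"])
print("missing:", sorted(missing))
```

If those assumed names are right, the missing set is Hox2 (pb) plus four of the “middle” genes, which matches the description of what got lost.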

We know that the genes are scattered and decimated, but are their expression patterns similarly disrupted? You don’t actually need an intact Hox cluster for orderly Hox expression, and indeed, tardigrade Hox genes are activated in a perfectly neat and perfectly usual pattern that resembles what you see in their panarthropod cousins. Except for the bit where half the pattern is missing!

Here’s part of Figure 4 from the paper, a schematic comparison of tardigrade Hox expression to that of other panarthropods – a generic arachnid, a millipede and a velvet worm. (otd is a “head” gene that lives in the Hox-free anterior region; lab is the arthropod equivalent of Hox1, Dfd is Hox4, and I’m not sure which of Hox6-8 ftz is currently supposed to be.) The interesting thing about this is that according to Hox genes, the entire body of the tardigrade corresponds to just the front end of arthropods and velvet worms.


In addition, one thing that is not shown on this diagram is that Abdominal-B, which normally marks the butt end of the animal, is still active in the tardigrade, predictably in the last segment (L4, that is). So if you take the Hox data at face value, a tardigrade is the arse end of an arthropod tacked straight onto its head. Weird. It’s like evolution took a perfectly ordinary velvet worm-like creature and chopped out most of its trunk.

The tardigrade data suggest that the original panarthropod was probably more like arthropods and velvet worms than tardigrades – an elongated animal with many segments. The strange tardigrade situation can’t be the ancestral one, since the Hox genes that tardigrades lack long predate the panarthropod ancestor. Now, it might be possible to lose half your Hox genes while keeping your ancestral body plan, but an unusual body plan and an unusual set of Hox genes is a bit of a big coincidence, innit?

Smith et al. point out that the loss of the Hox genes was unlikely to be the cause of the loss of the trunk region – Hox genes only specify what grows on a segment, they don’t have much say in how many segments develop in the first place. Instead, the authors reason, the loss of the trunk in the tardigrade ancestor probably made the relevant Hox genes dispensable.

Damn, this story makes me want to see the Hox genes of all those oddball lobopodians from the Cambrian. Some of them are bound to be tardigrade relatives, right?



Boothby TC et al. (2015) Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade. PNAS 112:15976-15981

Smith FW et al. (2016) The compact body plan of tardigrades evolved by the loss of a large body region. Curr Biol 26:224-229


*The Mammal has been pretty depressed lately. As in mired up to her head in weird energy-sucking flu. Unfortunately, writing is one of those things that the damn brain monster has eaten most of the fun out of. Also, I have a shitty normal person job at the moment, and shitty job taking up time + barely enough motivation to crawl out of bed and pretend to be human means I have, at best, one afternoon per week that I actually spend on catching up with science. That is just enough to scroll through my feeds and file away the interesting stuff, but woefully insufficient for the writing of posts, not to mention that my ability to concentrate is, to be terribly technical, absolutely fucked. It’s not an ideal state of affairs by any stretch, and I’m pretty sure that if I made more of an effort to read and write about cool things, it would pay off in the mental health department, but… well. That sort of reasonable advice is hard to hear with the oozing fog-grey suckers of that thing clamped onto my brain.

In which a “living fossil’s” genome delights me

I promised myself I wouldn’t go on for thousands and thousands of words about the Lingula genome paper (I’ve got things to do, and there is a LOT of stuff in there), but I had to indulge myself a little bit. Four or five years ago when I was a final year undergrad trying to figure out things about Hox gene evolution, I would have killed for a complete brachiopod genome. Or even a complete brachiopod Hox cluster. A year or two ago, when I was trying to sweat out something resembling a PhD thesis, I would have killed for some information about the genetics of brachiopod shells that amounted to more than tables of amino acid abundances. Too late for my poor dissertations, but a brachiopod genome is finally sequenced! The paper is right here, completely free (Luo et al., 2015). Yay for labs who can afford open-access publishing!

In case you’re not familiar with Lingula, it’s this guy (image from Wikipedia):

In a classic case of looks being deceiving, it’s not a mollusc, although it does look a bit like one except for the weird white stalk sticking out of the back of its shell. Brachiopods, the phylum to which Lingula belongs, are one of those strange groups no one really knows where to place, although nowadays we are pretty sure they are somewhere in the general vicinity of molluscs, annelid worms and their ilk. Unlike bivalve molluscs, whose shell valves are on the left and right sides of the animal, the shells of brachiopods like Lingula have top and bottom valves. Lingula’s shell is also made of different materials: while bivalve shells contain calcium carbonate deposited into a mesh of chitin and silk-like proteins,* the subgroup of brachiopods Lingula belongs to uses calcium phosphate, the same mineral that dominates our bones, and a lot of collagen (again like bone). But we’ll come back to that in a moment…

One of the reasons the Lingula genome is particularly interesting is that Lingula is a classic “living fossil”. In the Paleobiology Database, there’s even an entry for a Cambrian fossil classified as Lingula, and there are plenty of entries from the next geological period. If the database is to be believed, the genus Lingula has existed for something like 500 million years, which must be some kind of record for an animal.** Is its genome similarly conservative? Or did the DNA hiding under a deceptively conservative shell design evolve as quickly as anyone’s?

In a heroic feat of self-control, I’m not spending all night poring over the paper, but I did give a couple of interesting sections a look. Naturally, the first thing I dug out was the Hox cluster hiding in the rather large supplement. This was the first clue that Lingula’s genome is definitely “living” and not at all a fossil in any sense of the word. If it were, we’d expect one neat string of Hox genes, all in the order we’re used to from other animals. Instead, what we find is two missing genes, one plucked from the middle of the cluster and tacked onto its “front” end, and two genes totally detached from the rest. It’s not too bad as Hox cluster disintegration goes – six out of nine genes are still neatly ordered – but it certainly doesn’t look like something left over from the dawn of animals.

The bigger clue that caught my eye, though, was this little family tree in Figure 2:


The red numbers on each branch indicate the number of gene families that expanded or first appeared in that lineage, and the green numbers are the families that shrank or were lost. Note that our “living fossil” takes the lead in both. What I find funny is that it’s miles ahead of not only the animals generally considered “conservative” in terms of genome evolution, like the limpet Lottia and the lancelet Branchiostoma, but also the sea squirt (Ciona). Squirts are notorious for having incredibly fast-evolving genomes; then again, most of that notoriety was based on the crazily divergent sequences and often wildly scrambled order of their genes. A genome can be conservative in some ways and highly innovative in others. In fact, many of the genes involved in basic cellular functions are very slow-evolving in Lingula. (Note also: humans are pretty slow-evolving as far as gene content goes. This is not the first study to find that.)

So, Lingula, living fossil? Not so much.

The last bit I looked at was the section about shell genetics. Although it’s generally foolish to expect the shell-forming gene sets of two animals from different phyla to be similar (see my first footnote), if there are similarities, they could potentially go at least two different ways. First, brachiopods might be quite close to molluscs, which is the hypothesis Luo et al.’s own treebuilding efforts support. Like molluscs, brachiopods also have a specialised mantle that secretes shell material, though having the same name doesn’t mean the two “mantles” actually share a common origin. So who knows, some molluscan shell proteins, or shell regulatory genes, might show up in Lingula, too.

On the other hand, the composition of Lingula’s shell is more similar to our skeletons’. So, since they have to capture the same mineral, could the brachiopods share some of our skeletal proteins? The answer to both questions seems to be “mostly no”.

Molluscan shell matrix proteins, those that are actually built into the structure of the shell, are quite variable even within Mollusca. It’s probably not surprising, then, that most of the relevant genes that are even present in Lingula are not specific to the mantle, and those that are are the kinds of genes that are generally involved in the handling of calcium or the building of the stuff around cells in all kinds of contexts. Some of the regulatory mechanisms might be shared – Luo et al. report that BMP signalling seems to be going on around the edge of the mantle in baby Lingula, and this cellular signalling system is also involved in molluscan shell formation. Then again, a handful of similar signalling systems “are involved” in bloody everything in animal development, so how much we can deduce from this similarity is anyone’s guess.

As for “bone genes” – the ones that are most characteristically tied to bone are missing (disappointingly or reassuringly, take your pick). The SCPP protein family is so far known only from vertebrates, and its various members are involved in the mineralisation of bones and teeth. SCPPs originate from an ancient protein called SPARC, which seems to be generally present wherever collagen is (IIRC, it’s thought to help collagen fibres arrange themselves correctly). Lingula has a gene for SPARC all right, but nothing remotely resembling an SCPP gene.

I mentioned that the shell of Lingula is built largely on collagen, but it turns out that it isn’t “our” kind of collagen. “Collagen” is just a protein with a particular kind of repetitive sequence. Three amino acids (glycine-proline-something else, in case you’re interested) are repeated ad nauseam in the collagen chain, and these repetitive regions let the protein twist into characteristic rope-like fibres that make collagen such a wonderfully tough basis for connective tissue. Aside from the repeats they all share, collagens are a large and diverse bunch. The ones that form most of the organic matrix in bone contain a non-repetitive and rather easily recognised domain at one end, but when Luo et al. analysed the genome and the proteins extracted from the Lingula shell, they found that none of the shell collagens possessed this domain. Instead, most of them had EGF domains, which are pretty widespread in all kinds of extracellular proteins. Based on the genome sequence, Lingula has a whole little cluster of these collagens-with-EGF-domains that probably originated from brachiopod-specific gene duplications.
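If you want a feel for what “repeated ad nauseam” looks like in an actual sequence, here is a minimal sketch that counts the longest run of three-residue units starting with glycine. Everything in it is made up for illustration: the function name is hypothetical, the toy sequence is invented rather than taken from Lingula or from Luo et al., and checking only for a glycine at every third position is a deliberate simplification of how real collagen-finding tools work.

```python
def longest_gly_triplet_run(seq):
    """Return the largest number of consecutive 3-residue units that start with glycine (G).

    All three reading frames are tried, because the repeat can begin
    at any position in the protein.
    """
    best = 0
    for frame in range(3):
        run = 0
        # Step through complete triplets only (i + 2 must be a valid index).
        for i in range(frame, len(seq) - 2, 3):
            if seq[i] == "G":
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


# Invented toy sequence: a short non-repetitive start, eight Gly-Pro-X units, a short tail.
toy_seq = "MKT" + "GPA" * 5 + "GPP" * 3 + "LLE"
print(longest_gly_triplet_run(toy_seq))
```

A real collagen would show runs of hundreds of such units; a non-repetitive domain like the one missing from the Lingula shell collagens would score close to zero.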

So, to recap: Lingula is not as conservative as its looks would suggest (never judge a living fossil by its cover, right?). We also finally have actual sequences for lots of its shell proteins, which reveal that when it comes to building shells, Lingula does its own thing. Not much of a surprise, but still, knowing is a damn sight better than thinkin’ it’s probably so. We are scientists here, or what.

I am Very Pleased with this genome. (I just wish it was published five years ago 😛 )



*This, interestingly, doesn’t seem to be the general case for all molluscs. Jackson et al. (2010) compared the genes building the pearly layer of snail (abalone, to be precise) and bivalve (pearl oyster) shells, and found that the snail showed no sign of the chitin-making enzymes and silk type proteins that were so abundant in its bivalved cousins. It appears that even within molluscs, different groups have found different ways to make often very similar shell structures. However, all mollusc shells, regardless of the underlying genetics, are predominantly composed of calcium carbonate.

**You often hear about sharks, or crocodiles, or coelacanths, existing “unchanged” for 100 or 200 or whatever million years, but in reality, 200-million-year-old crocodiles aren’t even classified in the same families, let alone the same genera, as any of the living species. Again, the living coelacanth is distinct enough from its relatives in the Cretaceous, when they were last seen, to warrant its own genus in the eyes of taxonomists. I’ve no time to check up on sharks, but I’m willing to bet the situation is similar. Whether Lingula’s jaw-dropping 500-million-year tenure on earth is a result of taxonomic lumping or the shells genuinely looking that similar, I don’t know. Anyway, rant over.



Jackson DJ et al. (2010) Parallel evolution of nacre building gene sets in molluscs. Molecular Biology and Evolution 27:591-608

Luo Y-J et al. (2015) The Lingula genome provides insights into brachiopod evolution and the origin of phosphate biomineralization. Nature Communications 6:8301

Wherein scientists DON’T spill blood over a Precambrian animal

Having gone through much of my backlog, I was going to post about pretty blue limpet shells, then I saw that people have been arguing over Haootia. You remember Haootia? It’s that Precambrian fossil with probable muscle impressions that looks kind of like a modern-day staurozoan jellyfish (living staurozoan Haliclystus californiensis by Allen Collins, Encyclopedia of Life; Haootia quadriformis reconstruction from Liu et al., 2014):


It’s pretty much a law of Precambrian palaeontology that no interpretation of a fossil can ever remain uncontested, and Haootia is no exception. Nonetheless, this might be the tamest debate anyone ever had about a Precambrian fossil, and it gives me all kinds of warm feels.

Good news: Miranda et al. (2015) don’t dispute that the fossils show muscle impressions. They don’t even dispute that they belong to a cnidarian-grade creature. However, they question some of the details of the muscular arrangement, which could have implications for what this creature was and how it functioned.

They don’t have much of an issue with the muscles that run along the stalk and arms. The main point of contention, as far as I can tell, is that the muscles that run around the body (called coronal muscles in modern jellies) are not that big in living staurozoans. Those are the muscles that regular jellyfish use to contract their bells while swimming, but staurozoans don’t swim and therefore don’t need huge coronal muscles.

By Liu et al.’s (2014) reconstruction (see above), Haootia has pretty massive coronal muscles. Miranda et al. (2015) wonder whether this was really the case, or whether the deformation of the fossils combined with the subconscious influence of regular jellyfish misled the original authors. They offer an alternative reconstruction, in which most of the body musculature runs up and down rather than around the body wall:


However, they also entertain the possibility that Liu et al.’s reconstruction is correct – in which case, they note, Haootia must have done something with those muscles. Did jellyfish-like pulsations somehow form part of its feeding method? Could this even be a precursor to the jellyfish way of swimming? Who knows!

Liu et al. (2015) gave the most amazing response – much of their short reply to Miranda et al.’s comments is basically thanking them for all the extra information and insight. They seem really pleased that biologists who study living cnidarians are taking an interest in their fossils, and enthusiastic about fruitful discussions in the future. (I concur. Biologists and palaeontologists need to talk to each other!)

They did take another, closer look at Haootia and maintain that they still see a large amount of musculature running around the body. So perhaps this peculiar Precambrian animal was doing something peculiarly Precambrian that has few or no parallels in modern seas. “We must keep in mind,” they write, “that some, or maybe most, Ediacaran body plans and feeding strategies may have been specifically adapted to Ediacaran conditions.”

Either way, the whole exchange makes me very warm and fuzzy – I love to see scientists having constructive debates and learning from each other. (I also love that Miranda et al. thank Alex Liu in their acknowledgements; they were so obviously not out to tear one another down.) Plus both teams agree that we DO have a cnidarian-type creature from the Precambrian, and we DO have lovely lovely muscle impressions. Here’s to nice people, and to the slowly sizzling fuse of the Cambrian explosion! 🙂



Liu AG et al. (2014) Haootia quadriformis n. gen., n. sp., interpreted as a muscular cnidarian impression from the Late Ediacaran period (approx. 560 Ma). Proceedings of the Royal Society B 281:20141202

Liu AG et al. (2015) The arrangement of possible muscle fibres in the Ediacaran taxon Haootia quadriformis. Proceedings of the Royal Society B 282:20142949

Miranda LS et al. (2015) Is Haootia quadriformis related to extant Staurozoa (Cnidaria)? Evidence from the muscular system reconsidered. Proceedings of the Royal Society B 282:20142396

Hi, real world, again!

The Mammal has emerged from a thesis-induced supermassive black hole and a Christmas-induced food coma, only to find that in the month or so that she spent barely functional and buried in chapters covered in the supervisor’s dreaded Red Pen, things actually happened in the world outside. This, naturally, manifested in thousands of items feeling thoroughly neglected in RSS readers and email inboxes. (Jesus. How many times have I vowed never to neglect my RSS feed again? Oh well, it’s not like unemployment is such a busy occupation that I can’t deal with a measly two and a half thousand articles 😛 )

… earlier tonight, the paragraph here said I wasn’t doing a proper post yet, “just pointing out” a couple of the cooler things I’ve missed. Then somehow this thing morphed into a 1000+ word post that goes way beyond “pointing things out”. It’s almost like I’ve been itching to write something that isn’t my thesis. >_>

So the first cool thing I wanted to “point out” is the genome paper of the centipede Strigamia maritima, which is a rather nondescript little beast hiding under rocks on the coasts of Northwest Europe. This is the first sequenced genome of a myriapod – the last great class of arthropods to remain untouched by the genome sequencing craze after many genomes from insects, crustaceans and chelicerates (spiders, mites and co.). The genome sequence itself has been available for years (yay!), but its “official” paper (Chipman et al., 2014) is just recently out.

Part of the appeal of Strigamia – and myriapods in general – is that they are considered evolutionarily conservative for an arthropod. In some respects, the genome analysis confirms this. Compared to its inferred common ancestor with us, Strigamia has lost fewer genes than insects, for example. Quite a lot of its genes are also linked together similarly to their equivalents in distantly related animals, indicating relatively little rearrangement in the last 600 million years or so. But this otherwise conservative genome also has at least one really unique feature.

Specifically, this centipede – which is blind – has not only lost every bit of DNA coding for known light-sensing proteins, but also all known genes specific to the circadian clock. In other animals, genes like clock and period mutually regulate one another in a way that makes the abundance of each gene product oscillate in a regular manner (this is about the simplest graphical representation I could find…). The clock runs on a roughly daily cycle all by itself, but it’s also connected to external light via the aforementioned light-sensing proteins, so we can constantly adjust our internal rhythms according to real day-night cycles.

There are many blind animals, and many that live underground or otherwise find day and night kind of irrelevant, but even these are often found to have a functioning circadian clock or keep some photoreceptor genes around. However, based on the genome data, our favourite centipede may be the first to have completely lost both. The authors of the genome paper hypothesise that this may be related to the length of evolutionary time the animals have spent without light. Things like mole rats are relatively recent “inventions”. However, the geophilomorph order of centipedes, to which Strigamia belongs, is quite old (its most likely sister group is known from the Carboniferous, so they’re probably at least that ancient). Living geophilomorphs are all blind, so chances are they’ve been that way for the last 300+ million years.

Nonetheless, the authors also note that geophilomorphs are still known to avoid light – the question now is how the hell they do it… And, of course, whether Strigamia has a clock is not known – only that it doesn’t have the clock we’re used to. We also have no idea at this point how old the gene losses actually are, since all the authors know is that one other centipede from a different group has perfectly good clock genes and opsins.

In comparison with fruit flies and other insects, the Strigamia genome also reveals that evolutionary cats can be skinned in multiple ways. There is an immune-related gene family we share with arthropods and other animals, called Dscam. The product of this gene is involved in pathogen recognition among other things, and in flies, Dscam genes are divided into roughly 100 chunks or exons, most of which are found in clusters of variant copies. When the gene is transcribed, only one of these copies is used from each such cluster, so in practical terms the handful of fruit fly Dscam genes can encode tens of thousands of different proteins, enough to adapt to a lot of different pathogens.
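For the numerically inclined, here is a back-of-the-envelope sketch of why “one variant per cluster” adds up so fast: the choices multiply. The cluster sizes below are the figures commonly quoted for the fruit fly Dscam1 gene (12, 48, 33 and 2 variants), not numbers taken from Chipman et al., and the function name is just something made up for this illustration.

```python
from math import prod

# Assumed variant counts per exon cluster, as commonly quoted for fruit fly Dscam1.
# These are illustrative inputs, not values from the Strigamia genome paper.
variants_per_cluster = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(clusters):
    """Each transcript uses exactly one variant from every cluster,
    so the number of possible combinations is the product of the cluster sizes."""
    return prod(clusters.values())


print(count_isoforms(variants_per_cluster))  # 12 * 48 * 33 * 2 = 38016
```

Four clusters with a few dozen variants each are enough to get from a single gene to tens of thousands of potential proteins.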

A similar arrangement is seen in the closely related crustaceans, although with fewer potential alternative products. In other groups – the paper uses vertebrates, echinoderms, nematodes and molluscs for comparison – the Dscam family is pretty boring, with at most one or two members and none of this duplicated exons and alternative splicing business. However, it looks like insects+crustaceans are not the only arthropods to come up with a lot of DSCAM proteins. Strigamia might also make lots of different ones (“only” hundreds in this case), but it achieved this by having dozens of copies of the whole gene instead of performing crazy editing feats on a small number of genes. Convergent evolution FTW!

Before I paraphrase the entire paper in my squeeful enthusiasm (no, seriously, I’ve not even mentioned the Hox genes, and the convergent evolution of chemoreceptors, and I think it’s best if I shut up now), let’s get to something else that I can’t not “point out” at length: a shiny new vetulicolian, and they say it’s related to sea squirts!

Vetulicolians really deserve a proper discussion, but in lieu of a spare week to read up on their messiness, for now, it’s enough to say that these early Cambrian animals have baffled palaeontologists since day one. Reconstructions of various types look like… a balloon with a fin? Inflated grubs without faces? I don’t know. Drawings below (Stanton F. Fink, Wikipedia) show an assortment of the beasts, plus Yunnanozoon, which may or may not have something to do with them. Here are some photos of their fossils, in case you wondered.


They’re certainly difficult creatures to make sense of. Since their discovery, they’ve been called both arthropods and chordates, and you can’t get much farther than that with bilaterian animals (they’re kind of like the Nectocaris of old, come to think of it…).

The latest one was dug up from the Emu Bay Shale of Australia, the same place that yielded our first good look at anomalocaridid eyes. Its newest treasure has been named Nesonektris aldridgei by its taxonomic parents (García-Bellido et al., 2014), and it looks something like this (Diego García-Bellido’s reconstruction from the paper):


In other words, pretty typical vetulicolian “life but not as we know it”, at first glance. Its main interest lies in the bit labelled “nc” in the specimens shown below (from the same figure):


This chunky structure in the animal’s… tail or whatever is a notochord, the authors contend. Now, only one kind of animal has a notochord: a chordate. (Suspicious annelid muscle bundles notwithstanding. Oh yeah, I also wanted to post on Lauri et al. 2014. Oops?) So if this thing in the middle of Nesonektris’s tail is a notochord, then at the very least it is more closely related to chordates than anything else.

Why do they think it is one? Well, there are several long paragraphs devoted to just that, so here goes a summary:

1. It’s probably not the gut. A gut would be the other obvious ID, but it doesn’t fit very well in this case. Structures interpreted as guts in other vetulicolians – which sometimes contain stuff that may be half-digested food – (a) start in the front half of the body, where the mouth is, (b) constrict and expand and coil and generally look much floppier than this, (c) don’t look segmented, (d) sometimes occur alongside these tail rod-like thingies, so probably aren’t the same structure.

2. It positively resembles modern half-decayed notochords. The notochords of living chordates are long stacks of (muscular or fluid-filled) discs, which fall apart into big blocks as the animal decomposes after death. Here’s what remains of the notochord of a lamprey after two months for comparison (from Sansom et al. (2013)):


This one isn’t as regular as the blockiness in the fossils, I think, but that could just be the vetulicolians not being quite as rotten.

There is, of course, a but(t). To be precise, there are also long paragraphs discussing why the structure might not be a notochord after all. It’s much thicker than anything currently interpreted as such in reasonably clear Cambrian chordates, for one thing. Moreover, it ends right where the animal does, in a little notch that looks like a good old-fashioned arsehole. By the way, the paper notes, vetulicolian tails in general don’t go beyond their anuses by any reasonable interpretation of the anus, and a tail behind the anus is kind of a defining feature of chordates, though this study cites a book from the 1970s claiming that sea squirt larvae have a vestigial bit of proto-gut going all the way to the tip of the tail. (I suspect that claim could use the application of some modern cell labelling techniques, but I’ve not actually seen the book…)

… and there is a phylogenetic analysis, in which, if you interpret vetulicolians as deuterostomes (which impacts how you score their various features), they come out specifically as squirt relatives whether or not you count the notochord. I’m never sure how much stock to put in a phylogenetic analysis based on a few bits of anatomy gleaned from highly contentious fossils, but at least we can say that there are other things – like a hefty cuticle – beyond that notochord-or-not linking vetulicolians to a specific group of chordates.

Having reached the end, I don’t feel like this paper solved anything. Nice fossils either way 🙂

And with that, I’m off. Maybe next time I’ll write something that manages to be about the same thing throughout. I’ve been thinking that I should try to do more posts about broader topics rather than one or two papers (like the ones I wrote about ocean acidification or homology versus developmental genetics), but I’ve yet to see whether I’ll have the willpower to handle the necessary reading. I’m remarkably lazy for someone who wants to know everything 😀

(Aside: holy crap, did I ALSO miss a fucking Nature paper about calcisponges’ honest to god ParaHox genes? Oh my god, oh my GOD!!! *sigh* This is also a piece of incredibly exciting information I’ve known for years, and I miss it when it actually comes out in a journal bloody everyone reads. You can tell I’ve been off-planet!)


Chipman AD et al. (2014) The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima. PLoS Biology 12:e1002005

García-Bellido DC et al. (2014) A new vetulicolian from Australia and its bearing on the chordate affinities of an enigmatic Cambrian group. BMC Evolutionary Biology 14:214

Lauri A et al. (2014) Development of the annelid axochord: insights into notochord evolution. Science 345:1365-1368

Sansom RS et al. (2013) Atlas of vertebrate decay: a visual and taphonomic guide to fossil interpretation. Palaeontology 56:457-474