Echinoderm bonanza

Smith et al. (2013) has been sitting on my desktop waiting to be read for the last month or so. Man, am I glad that I finally opened the thing. I’m quite fond of echinoderms, and this paper is full of them. Of course. It’s about echinoderms. Specifically, it’s about the diverse menagerie of them that existed, it seems, a little bit earlier than thought.

The brief little paper introduces new echinoderm finds from two Mid-Cambrian formations in Morocco, which at the time was part of the great continent of Gondwana. As far as I’m concerned, it was worth reading just for this lineup of Cambrian echinoderms. I mean, echinoderms are so amazingly weird in such a variety of ways. They’re a delight.


(The drawings themselves are from Fig. 3. of the paper; I rearranged them to fit into my post width, and the boxes are my additions. Dark box = new groups/species from Morocco, light grey box = known groups/species whose first appearance was pushed back in time by the Moroccan finds.)

Although none of the creatures above belong to the living classes of echinoderms, they display a wide range of body plans. You could say their body plans are more diverse* than those of living echinoderms. (And if you said that, the ghost of Stephen Jay Gould would nod approvingly.) For example, modern echinoderms tend to have either (usually five-part) radial symmetry (any old starfish) or bilateral symmetry that clearly comes from radial symmetry (heart urchins).

In these Early- to Mid-Cambrian varieties, you can see some five-rayed creatures, some that are more or less bilateral without any obvious connection to the prototypical five-point star, animals that are just kind of asymmetric, and those strange spindle-shaped helicoplacoids that look like someone took an animal with radial symmetry and wrung it out. And then there are all the various arrangements of arms and stalks and armour plates that I tend to gloss over when reading about the beasts. (Yeah. I have no attention span.)

The Moroccan finds have some very interesting highlights. The second creature in the lineup above is one of them. Its top half looks like a helicoplacoid such as Helicoplacus itself (first drawing). It’s got that characteristic spiral arrangement of plates and a mouth at the top end. However, unlike previously known helicoplacoids, it sits on a stalk that resembles that of the radially symmetric eocrinoids (like the creature on its right). It’s a transitional form all right, though we’ll have to wait for future publications and perhaps future discoveries to see which way evolution actually went. It’ll already help palaeontologists make sense of helicoplacoids themselves, though, which I gather is a big thing in itself. The authors promise to publish a proper description of the creature, which is really exciting.

The other exciting thing about the Moroccan echinoderms is their age. As I already hinted at with my grey boxes, the new fossils push back the known time range of many echinoderm body plans by millions of years. This means that the wide variety of body plans we saw above was already present as little as 10-15 million years after the first appearance of scattered bits of echinoderm skeleton in the fossil record.

Smith et al. argue that this is a fairly solid conclusion based on the mineralogy of echinoderm skeletons. Organisms with calcium carbonate hard parts have a tendency to adopt the “easiest” mineralogy at the time they first evolve skeletons. Seawater composition changes over geological time; most importantly, the ratio of calcium to magnesium fluctuates. Calcium carbonate can adopt several different crystal forms, and the Ca/Mg ratio influences which of them are easier to make. So when there’s a lot of Mg in the sea, aragonite is the “natural” choice, whereas low Mg levels favour calcite.

The first appearance of echinoderms around 525 million years ago coincides with a shift in ocean chemistry from “aragonite seas” to “calcite seas”. Echinoderms and a bunch of other groups that first show up around that time have skeletons that are calcite in their structure but incorporate a lot of Mg. Since the ocean before was favourable to aragonite, it’s unlikely that echinoderm skeletons appeared much earlier than this date. In other words, echinoderm evolution during this geologically short period was truly worthy of the name “Cambrian explosion”.

That is, of course, if the appearance of echinoderm skeletons precedes the appearance of echinoderm body plans. The oldest of our Cambrian treasure troves of soft-bodied fossils, such as the rocks that yielded the Chengjiang biota of China, are roughly the same age as the first echinoderm skeletons. However, they don’t contain undisputed echinoderms as far as I can tell (Clausen et al., 2010). Proposed “echinoderms” from before the Cambrian are even less accepted. Of course, the unique structure of echinoderm skeletons is easy to recognise, but how do you identify an echinoderm ancestor without such a skeleton? (Is all that bodyplan diversity even possible without hard skeletal support?)

Caveats aside, this Moroccan stuff is awesome. And also, if my caveat proves overly cautious, echinoderms did some serious evolving in their first few million years on earth. A supersonic ride with Macroevolution Airlines?


*OK, if I want to be absolutely pedantic, and I do, then body plans are disparate rather than diverse. “Disparity” in palaeontological/evo-devo parlance refers to how different two or more creatures are. Diversity means how many different creatures there are. Maybe I should do a post on that, actually.



Clausen S et al. (2010) The absence of echinoderms from the Lower Cambrian Chengjiang fauna of China: Palaeoecological and palaeogeographical implications. Palaeogeography, Palaeoclimatology, Palaeoecology 294:133-141

Smith AB et al. (2013) The oldest echinoderm faunas from Gondwana show that echinoderm body plan diversification was rapid. Nature Communications 4:1385

Oh, look, an argument!

It seems like forever since I posted about the very old putative bilaterian burrows Pecoits et al. (2012) reported in Science. I read the paper, thought about the implications, wrote the post and then filed the whole thing away in the giant messy cabinet at the back of my mind.

But a big claim like the one Pecoits et al. made – burrows from bilaterian animals that appear before the first Ediacaran fossils! – is unlikely to go unchallenged by the scientific community. Now the argument has broken out. Gaucher et al. (2013) wrote a comment in Science criticising the reasoning that put such an old date on the formation where the burrows were found. Pecoits et al. (2013) responded. The plot is thickening!

The main bone of contention seems to be whether the huge body of granite that gave the actual radiometric date of 585 million years lies below the burrow-bearing formation (in which case it must be older than the fossils) or cuts through it (in which case it’s younger). The other question is whether the fossils and the rocks they’re found in actually belong to another nearby formation that is thought to be Permian in age. Burrows in Permian rocks would be no surprise at all. By that time reptiles and the ancestors of mammals walked the earth, insects of all kinds flew over it, and armadas of worms had been boring through soft sediments for hundreds of millions of years. Burrows that far into the Precambrian, on the other hand…

The argument is all very geological, and as I repeatedly said, I’m not much of a geologist. Looking at the figures wouldn’t help me decide who to believe at all. I’m rather amused by some of the snark that gets into the text, though. I have this feeling that Pecoits et al. are annoyed. Watch this, for example:

In this case, Gaucher et al. (1) take no notice of the outcrop-scale relationships and instead prefer to show five photographs from just one hand sample that they assigned to fossil site C to discredit the intrusive nature of the granite [figure 1, B to F, in (1)]. We do not want to speculate on the origin of this sample, but we see no evidence that it comes from fossil site C; it is not the ferruginized basal sandstone we previously documented [figure S3C in (2)].

Oh, yeah. “We do not want to speculate,” but we think something’s fishy with your evidence, only we’re too polite to say it in so many words!

Tee-hee. Academia’s version of an online flame war.



Gaucher C et al. (2013) Comment on “Bilaterian burrows and grazing behavior at >585 million years ago”. Science 339:906

Pecoits E et al. (2012) Bilaterian burrows and grazing behavior at >585 million years ago. Science 336:1693-1696

Pecoits E et al. (2013) Response to comment on “Bilaterian burrows and grazing behavior at >585 million years ago”. Science 339:906

Score one for punk eek

Speaking of macroevolution…

I think it’s fair to say that the concept of punctuated equilibria is one of the most famous and most misunderstood ideas in 20th century evolutionary biology. PE, or “punk eek”, was proposed by palaeontologists Niles Eldredge and Stephen Jay Gould (Eldredge and Gould, 1972) as a reconciliation of the Modern Evolutionary Synthesis and the fossil record. Its core idea is that most (visible) evolutionary change happens during the formation of new species, and that this process is usually quick compared to the lifetime of a species. (An excellent layperson-friendly explanation of punk eek is available here.)

Of course, punk eek is not a law of nature – it’s only one way evolution might proceed, and it’s a decent explanation of the dearth of low-level (species to species) transitions in the fossil record. But there’s nothing to say that this is how evolution always proceeds, and consequently, exactly how often it does so is a valid (and still actively debated) question in evolutionary biology.

A related question is how often new species arise by the wholesale transformation of the ancestral species (anagenesis) or by the splitting of the ancestor into two or more descendants (cladogenesis). Since punk eek posits that most new species come from small isolated populations of the ancestor, under punk eek scenarios you’d expect most speciation to occur by cladogenesis.

However, assessing the exact contribution of each requires an exceptionally good fossil record where ancestor-descendant relationships and precise times of appearance and disappearance can be determined. This makes the investigation difficult to impossible in most groups. In the latest issue of PNAS, Strotz and Allen (2013) went to one of the few groups with a good enough record to answer such questions and analysed the living shit out of them.

Foraminiferans, or forams for short, are single-celled creatures that build hard shells to live in. They are very abundant, widely distributed in the world’s oceans, and because of their shells they make excellent (if tiny) fossils. Their relationships have also been studied with molecular methods, so we have a pretty good understanding of who’s related to whom and how well morphology meshes with genetics.

Therefore, as Strotz and Allen point out, we can say with a fair amount of confidence that what we’ve identified as species in the fossil record are likely to actually be species, not just varieties. (It doesn’t always work the opposite way – some “species” that look exactly the same on the outside are known to hide several genetically distinct lineages.) The genetic data also help sort out who begat whom.

Armed with this knowledge of genetics and the detailed fossil record of planktonic forams in the last 65 million years, the pair formulated criteria for identifying cladogenetic events:

  • If morphologically distinct ancestor and descendant(s) overlap in time (factoring in dating and classification error), the descendant must have arisen by cladogenesis.
  • Likewise, cladogenesis must have occurred if the two species occur together in the same sample even if their morphologies overlap at that point.
  • Third, if an ancestor gave rise to a series of descendants, all but the last of those must have formed by cladogenesis – the ancestral form has to continue existing for it to sprout more descendants!
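
For the programmatically inclined, the three filters above can be sketched roughly like this. It's only an illustration under my own assumptions: the Species record, its field names, the later_siblings counter and the error_margin parameter are all hypothetical rather than anything taken from Strotz and Allen, and the "found in the same sample" criterion is boiled down to a simple flag.

```python
from dataclasses import dataclass


@dataclass
class Species:
    """Hypothetical record of a fossil species' time range."""
    name: str
    first: float  # first appearance, million years ago (larger = older)
    last: float   # last appearance, million years ago


def must_be_cladogenesis(ancestor, descendant, later_siblings=0,
                         same_sample=False, error_margin=0.0):
    """Return True if any of the three criteria rules out anagenesis.

    later_siblings: number of descendants the same ancestor produced
        after this one (assumed bookkeeping, not from the paper).
    same_sample: True if both species were found together in one sample.
    error_margin: allowance for dating/classification error, in Myr.
    """
    # 1. The ancestor is still around after the descendant first appears,
    #    by more than the error allowance.
    overlaps_in_time = (descendant.first - ancestor.last) > error_margin
    # 2. Both species turn up in the same sample (passed in as a flag).
    # 3. The ancestor went on to sprout further descendants afterwards.
    return overlaps_in_time or same_sample or later_siblings > 0


ancestor = Species("A", first=30.0, last=20.0)
overlapping = Species("B", first=22.0, last=5.0)
successor = Species("C", first=20.0, last=10.0)

print(must_be_cladogenesis(ancestor, overlapping, error_margin=0.5))
print(must_be_cladogenesis(ancestor, successor))
```

Only pairs for which this kind of check comes back False stay in the "could be anagenesis" pile, which is the whole point of the exercise.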

Thus, the possibility of anagenesis only remains for ancestor-descendant pairs that didn’t get caught on any of the above filters. And the number of those turns out to be very low. Depending on how you estimate the errors associated with identifying fossils, only around 43-64 out of 337 speciation events (less than a fifth of the total) in the last 65 million years show no evidence against anagenesis. The numbers are even lower, dipping below one-tenth of all events, if you only consider the last 23 million years, for which more precise dating information is available. In conclusion, for planktonic forams since the death of the dinosaurs, splitting an old species has been by far the more common way of forming new species.
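
If you want to check the "less than a fifth" bit for yourself, the arithmetic is quick. The sketch below uses only the 43-64 and 337 figures quoted above; the variable names are mine.

```python
total_events = 337            # speciation events in the last 65 million years
possible_anagenesis = (43, 64)  # low and high estimates quoted above

# Share of events that show no evidence against anagenesis
shares = [n / total_events for n in possible_anagenesis]

for n, share in zip(possible_anagenesis, shares):
    print(f"{n}/{total_events} = {share:.1%}")
```

Both ends of the range come out under 20%, so even the generous estimate leaves cladogenesis with over four-fifths of the events.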

It’s important to talk about the things this paper doesn’t say. It doesn’t, for example, say that its findings apply to all organisms. Speciation need not work the same way for all groups, and a subset of forams need not be representative of anything. It also doesn’t say – and the authors are quite explicit about this – that morphological evolution only occurs when species split. Instead, they argue, their findings support a modified view of punk eek in which species do change throughout their lifetimes – but the changes are fluctuations due to short-term influences, and they only persist if populations get isolated.

(Myself, I just think the simple fact that we have a fossil record where such ideas can be tested is pretty amazing. You can complain about the patchiness of the record all you like, but in the meantime it’s worth stopping and appreciating what we do have!)



Eldredge N & Gould SJ (1972) Punctuated equilibria: an alternative to phyletic gradualism. In Schopf TJM (ed) Models in Paleobiology. Freeman, Cooper & Co., pp. 82-115

Strotz LC & Allen AP (2013) Assessing the role of cladogenesis in macroevolution by integrating fossil and molecular evidence. PNAS 110:2904-2909

Macroevolution Airlines

(This post has been mostly written for a long time but I never got round to publishing it. It’s kind of my darling baby, and I never felt quite ready to let it out into the world. Well, every parent has to let go at some point…)

In the creation vs. evolution section of Christian Forums, “macroevolution” is a common topic of name-calling discussion. At some point in what seems like every other thread, a creationist demands “proof” of macroevolution. The common reaction from the evolution side is that the creationist doesn’t understand evolution, and macroevolution is just lots of microevolution, and here is a list of observed speciation events anyway. While the first point is true more often than not, I have been increasingly uncomfortable with the second lately. To my mind, and I think to anyone interested in palaeontology and/or evo-devo, it’s not at all obvious that macroevolution must be fundamentally similar to the everyday adaptations and driftings we commonly observe in real time.

(Image from the UCMP Understanding Evolution site)

What exactly is macroevolution?

Before I continue my musings, I must first clarify what I mean by micro- and macroevolution. I see two interpretations in use in the scientific community, and I don’t think they are entirely equivalent. The “rigorous” interpretation defines microevolution as anything that happens this side of speciation. Populations adapting to short-term environmental change, individuals and their genes migrating back and forth between neighbouring populations, ordinary everyday genetic drift, etc. are microevolutionary phenomena. Macroevolution starts with the formation of new species. The “wishy-washy” interpretation defines macroevolution as “evolution on the large scale”, or “big change”. This is the one I think many palaeontologists would prefer, and many students of evo-devo as well. This is also the one most creationists seem to have in mind. Most – if not all – of the examples in the well-worn speciation lists I’m guilty of pulling out myself are only macroevolution in the first sense. This is something people often seem unaware of: speciation and big change do not go hand in hand.

The definition I prefer (and I changed my mind on this fairly recently) is the second, because despite its vagueness, it gives us a word for something vitally important, all the things that are (usually) bigger than the evolutionary processes we can readily observe on human timescales. How did something resembling a sausage on legs give rise to the mind-boggling diversity of arthropods? How did our own ancestors end up with legs instead of fins? Why did dinosaurs grow into giants and rule the land while the ancestors of mammals retreated to the shadows? This is what macroevolution means to me. As far as I’m concerned, the population geneticists’ kind of macroevolution already has a perfectly good word for it, and that word is speciation.

The question: what is the question?

With that in mind, is macroevolution something different? This is actually at least two questions. One can ask whether the external forces that set out the path of evolution act in the same way on all scales. Did the environment always exert the same kinds of pressures on living things? The answer to this is probably no – from the appearance of oxygen in the atmosphere to the arrival of predators in animal communities, both non-living and living factors have changed the rules of ecosystems many times in earth history. Do the same sorts of pressures that determine the fate of single populations also affect whole lineages? Does selection operate on more than one level? Do the same traits that natural selection favours in ordinary times also help you in extraordinary times? (Another “no”, if David Jablonski can be believed.)

Alternatively, one can also ask whether small and large changes in the properties of organisms are governed by different intrinsic rules. Do, say, new body parts originate through the same kinds of mutations as new hair colours? Are major changes and small adjustments associated with different developmental stages (Arthur, 2008)? Did the nature of variation itself change over evolutionary time (Gould, 1989; Erwin, 2011)? That last one especially intrigues me, and it may yet return in future meanderings. (It’ll return in force if I ever muster the fortitude to discuss the Cambrian explosion ;))

The way to America

In the aforementioned creation vs. evolution debates, physical distance is a commonly used analogy for evolutionary distance. If you believe in centimetres, the argument goes, how can you not believe in kilometres? If you can walk to the kitchen, why can’t you walk a mile?

I think this analogy is worth examining a little further, because it turns out to be a great parallel to the micro vs. macro issue. It is true that anyone who can walk can walk a mile. It may take long and it may tire you out, depending on your physique, but it is possible. However, it isn’t very hard to think of destinations that are simply impossible to reach by walking. I live in Europe. Barring ice ages and Bering land bridges, no amount of steps would take me to America. It is still possible for me to go there, but I have to take a flight or perhaps hop on a ship. Is macroevolution like a mile, or is it more like the distance between Europe and the New World? Does a velvet worm-like creature evolve into an arthropod by lots of tiny steps of its chubby legs, or does it take a ride with Macroevolution Airlines?



Arthur W (2008) Conflicting hypotheses on the nature of mega-evolution. In: Minelli A & Fusco G (eds.) Evolving Pathways: Key Themes in Evolutionary Developmental Biology. Cambridge University Press, pp. 50-61

Erwin DH (2011) Evolutionary uniformitarianism. Developmental Biology 357:27-34

Gould SJ (1989) Wonderful Life: The Burgess Shale and the Nature of History. W. W. Norton & Co.

Celebrating the molecular revolution

I forgot to say happy Darwin Day yesterday, but to make up for that, I present to you Max Telford’s extremely cool way of celebrating.

In 1988, on Darwin Day, no less, a 5-page little paper was published in Science that would absolutely revolutionise the study of animal evolution. Field et al. (1988) was one of the earliest studies to apply this newfangled thing called molecular biology to the phylogeny of animals. Methods for molecular phylogenetics (or indeed any kind of phylogenetics) were extremely limited by the performance of the computers of the time, but that didn’t stop scientists from trying them. And once someone kicked this snowball, the avalanche couldn’t be stopped.

This early attempt yielded some huge surprises. Arthropods, which were thought to have arisen from segmented worms, were not closely related to any kind of worm. Brachiopods, long thought to belong to their own major group, showed up deep among worms, molluscs and other uncontroversial protostomes instead. Cnidarians such as hydras and sea anemones, and bilaterians such as ourselves, arose independently from single-celled ancestors.

Some of their conclusions – among them the last one about several origins for animals – were contradicted by more sophisticated analyses. Nevertheless, what they stirred up was the beginning of our current understanding of animal phylogeny. For the 25th anniversary of this pivotal publication, Max Telford, animal phylogeneticist extraordinaire of University College London, went back to the roots of his field and reanalysed Field et al.’s data (Telford, 2013).

Could the data and methods of the time have yielded a more accurate tree? How does a “modernised” dataset fare under the latest methods? What advances in methodology and understanding led the molecular phylogenetics of animals from the first tentative steps in the 1980s to where we are today?

Analysing the original data with methods similar to the original, of course, repeats most of the original mistakes. It’s when Telford starts tweaking things that the interesting stuff starts to happen. For example, just switching from the original method to a more complex one that was available but would have taken years to run at the time pulls all animals back together. “Updating” the analysis by using more complete sequences of the same gene, slower-evolving relatives of some original species, and modern methods impossible to run on 80s computers comes very close to today’s consensus. In other words, Field et al. basically did the best they could. Since then, data availability, careful sampling and far more computer muscle have changed some of their conclusions – but confirmed others.

Telford highlights one way in which the classics got lucky, too. Back in the eighties, sequencing nucleic acids was a difficult affair. Field et al. (1988) picked 18S ribosomal RNA mostly because it was less difficult than most others. But, as Telford points out, they also hit on a really good gene for phylogenetics. The 18S is quite long, providing an abundance of data. It has both very conserved and variable regions, so it has something to say on all levels of divergence. And, as Telford’s updated analysis shows, it can actually give reasonably accurate results on its own, which cannot always be said of single genes. For long years after Field et al. (1988), 18S rRNA continued to be used to probe into animal relationships, and had a few more revolutions up its sleeve (Aguinaldo et al., 1997; Ruiz-Trillo et al., 1999) before yielding to huge multi-gene datasets.

Contemplating Telford’s little historical excursion, I’m reminded of Isaac Asimov’s fantastic essay The Relativity of Wrong. We’ve come a long way from our first bumbling attempts at molecular phylogenetics. We were wrong many times, and I can guarantee you we’re still wrong about a lot of things. But I like to think that, as with the shape of the earth, we are not quite as wrong as our predecessors. Over the years, some great branches of the animal tree have crystallised from a sea of studies. With dogged determination, science approaches the truth.

I think that’s a good note to end on when we commemorate the birthday of a scientist who spent decades perfecting his theory of evolution before publishing perhaps the most important book in the history of biology. Happy belated Darwin Day! 🙂



Aguinaldo AMA et al. (1997) Evidence for a clade of arthropods and other molting animals. Nature 387:489-493

Asimov I (1989) The Relativity of Wrong. The Skeptical Inquirer 14(1):35-44.

Field KG et al. (1988) Molecular phylogeny of the animal kingdom. Science 239:748-753

Ruiz-Trillo I et al. (1999) Acoel flatworms: earliest extant bilaterian metazoans, not members of platyhelminthes. Science 283:1919-1923

Telford MJ (2013) Field et al. redux. EvoDevo 4:5

“Silk beauty” my arse!


I got a little package of cosmetics for Christmas. It goes by the brand name of “Silk Beauty” (Luxury Velvet Edition!!!), which sounds slightly pretentious, or rather totally ridiculous, but okay. That’s marketing for you. The stuff smells reasonably nice, and that’s the only difference I’ve ever been able to notice between different brands. I’m quite able to ignore a laughable brand name to smell nice.

However, today in the shower I happened to glance at the ingredient list of my Silk Beauty (*snort*) shower gel. At the end of a standard list of shower gel ingredients stood “silk amino acids”.

Wait a minute.

(During which I waver between laughing my head off and fuming at the outrageous hoodwinking they tried on me.)

Silk amino acids?

(I decided on laughing for the moment.)

Hate to be a killjoy, but the amino acids in silk are exactly like the amino acids in any other protein. Fine, the major ingredient of silk, fibroin, is like 50% glycine, which is a very unusual composition, but once you break it down into amino acids, it’s still just a pile of perfectly ordinary glycine. You can find it in any other protein. You can find it floating around in every one of your cells.

What makes silk smooth and shiny is not the amino acids in themselves but the way they’re arranged in the protein. (Which, ironically, is basically “GAGA” hundreds of times over.) Unfortunately, no matter how much glycine you smear on your skin – assuming it even gets through the outer layer that’s specifically there to keep stuff out – it won’t reassemble into silk protein for the simple reason that you need a fibroin gene to make fibroin. Ribosomes can synthesise any protein if you give them the instructions, but they’re not exactly creative.

Spoiler alert: humans do not have fibroin genes. (If you had one, you could produce silk without magical cosmetics anyway.)

So, basically, Oriflame expect me to smear the totally average building blocks of a special protein on my skin to make my skin like the special protein. What next? Feeding me tiger meat to give me the strength of the beast?

Slime moulds don’t play by the rules

I’m starting to think dictyostelids are seriously interesting. These are the guys whose eerily animal-like epithelial tissues prompted the idea of multicellularity being ancestral to the lineage containing animals, choanoflagellates, fungi and amoebae. (Incidentally, Parfrey and Lahr [2013] wrote a nice critical response to that hypothesis – it deserves a post of its own, but not this post.) They are used as model organisms in (evolutionary) developmental biology (Schaap, 2011), a field which is mostly dominated by animals and plants for obvious reasons.

Recently I wrote about the developmental hourglass pattern, which means that the most conserved developmental stages are not the earliest (as Karl von Baer thought at the dawn of comparative embryology), but some way into development. This pattern has been found in several animal phyla both at the morphological level and in various features of developmental gene expression, and it was recently also discovered in plants, which prompted my first post about it.

A group of researchers reckoned they should check how universal the hourglass is, and they thought the slime mould/social amoeba and honoured developmental model organism Dictyostelium would be a good place to look (Tian et al., 2013). Unlike plants and animals, which develop from a single cell, the multicellular life stage of dictyostelids is a gathering of thousands of previously independent cells that may not be genetically identical. Therefore, these tiny creatures represent a very different approach to development from our favourite lab animals. Whether or not they still show an hourglass pattern could give clues about the deeper laws that govern all developmental processes.

Dictyostelids turn out to be complete deviants in this respect. Comparisons of the genes two species of Dictyostelium use in their multicellular development show neither von Baer’s “funnel” pattern of similarity nor an hourglass. If you include single-celled stages that aren’t, strictly speaking, “developmental”, similarities of gene expression give a “reverse hourglass” with lowest similarity in the middle. If you only consider the actual multicellular developmental stages, conservation increases towards the end – an “inverted funnel”. Other measures gave Tian et al. largely consistent results – genes expressed later in development were more likely to also be present in the other species, and their sequences were more similar on average.
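
To keep the four shapes straight, here is a crude way of telling them apart from a conservation profile, i.e. a list of similarity scores ordered from the earliest to the latest stage. This is my own toy classifier, not the analysis Tian et al. ran; the function name and the made-up numbers are purely illustrative, and a real profile with several peaks or ties would need something smarter.

```python
def classify_conservation(profile):
    """Name the shape of a conservation profile.

    profile: similarity scores for successive stages, early to late,
             where a higher score means more conserved.
    """
    last = len(profile) - 1
    peak = profile.index(max(profile))    # most conserved stage
    trough = profile.index(min(profile))  # least conserved stage

    if 0 < peak < last:
        return "hourglass"          # most conserved mid-development
    if 0 < trough < last:
        return "reverse hourglass"  # least conserved in the middle
    if peak == 0:
        return "funnel"             # most conserved at the start (von Baer)
    return "inverted funnel"        # conservation rises towards the end


# Made-up example profiles, one per pattern
print(classify_conservation([0.9, 0.7, 0.5]))
print(classify_conservation([0.4, 0.9, 0.5]))
print(classify_conservation([0.8, 0.3, 0.9]))
print(classify_conservation([0.3, 0.5, 0.9]))
```

In these terms, the Dictyostelium data look like the third case when the single-celled stages are included and like the fourth when only the multicellular stages are counted.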

Now that we have a pattern – what could explain it? The authors speculate that an idea that had been used to explain the hourglass in animals may apply just as well to the inverted funnel of slime moulds. This idea is that the evolvability of a developmental stage depends on the interactions that occur during it. The more interactions between genes/cells/tissues, the worse the effect of a tiny screw-up and the smaller the chance of a beneficial change, hence the most interconnected developmental stages will tend to be most conserved in evolution.

In animals, goes the reasoning, early development is relatively simple, and later development is relatively modular. Early on, there’s less to screw up, whereas later, every screw-up is limited to part of the embryo. In between is the sweet spot where everything talks to everything and a small modification can have large knock-on effects. The result is the hourglass. In slime moulds, however, that later stage when the developing organism is subdivided into semi-independent modules never comes. All tissues keep communicating and affecting each other right up to the point where the multicellular body is fully developed. Thus, if you like, only the first half of the hourglass happens in these creatures.

It’s an interesting idea. I like it.



Parfrey LW & Lahr DJG (2013) Multicellularity arose several times in the evolution of eukaryotes. BioEssays advance online publication, 11/01/2013, doi: 10.1002/bies.201200143

Schaap P (2011) Evolutionary crossroads in developmental biology: Dictyostelium discoideum. Development 138:387-396

Tian X et al. (2013) Dictyostelium development shows a novel pattern of evolutionary conservation. Molecular Biology and Evolution advance online publication 16/01/2013, doi: 10.1093/molbev/mst007


When I discussed sponge microRNAs last week, I said deep animal phylogeny was difficult. Quite fortuitously, another paper went online recently that explores exactly this difficulty (Nosenko et al., 2013). Following on from the microRNA post, I’ll use this paper as an excuse/guide to discuss the tangled relationships of animals.

First of all, let’s recap the problem. My trusty old family tree of animals just so happens to be an excellent illustration:


When I first made this tree to explain what the hell I was talking about re: the Cambrian creature Nectocaris, I put in some question marks mostly out of laziness. To illustrate why the “old” Nectocaris didn’t make sense, I only needed the relationships of bilaterians among themselves. Everything outside the Bilateria was irrelevant to the little creature’s mystery, so I decided to forgo reading up on them and stay on an uninformed fence.

But, in fact, said fence is not just my half-arsed perch. I appear to share it with an entire, very much whole-arsed field. While there’s now reasonable agreement over ecdysozoans and deuterostomes and all that jazz, the non-bilaterians still wander all over the place depending on how you do your analysis. Nosenko et al. cite a number of recent large-scale studies, and point out that they totally fail to agree on where to put poor Trichoplax and jellies of various kinds. The other thing they fail at is deciding how many branches sponges actually represent (the problem the microRNA study I discussed tried to tackle). To illustrate the extent of the chaos, I sketched the phylogenies that six recent studies cited by Nosenko and colleagues came up with (sponge lineages are marked by dots):


Remarkably, all six studies agree on the basic deuterostome-ecdysozoan-lophotrochozoan arrangement inside Bilateria in spite of using different sets of bilaterian species. In contrast, the non-bilaterian animals – sponges of all kinds, cnidarians, ctenophores and Trichoplax – appear in pretty much every conceivable configuration.

A plethora of pitfalls

Why? What makes these questions so difficult that datasets made of 100+ genes from dozens of species representing all major animal groups and using the best available methods have this much trouble answering them?

Time is probably not the issue, or at least not in the simple sense of “it all happened too long ago”. The Nosenko paper brings up the example of fungi, which are roughly as ancient (or, in the context of all living things, as young) as animals. Studies that tried to use the exact same set of genes to analyse the relationships within each group could apparently produce a nice clear tree for fungi. Animals? A whole lot of noise.

Perhaps the “tree” of animals is really more like one of Rokas and Carroll’s (2006) evolutionary bushes, with its base branching so quickly that genes didn’t have time to accumulate many informative changes between one split and the next. Perhaps it even happened so fast that ancient within-species sequence variation was carried through several such events, resulting in what population geneticists call incomplete lineage sorting, a situation where the history of genes is not the same as the history of species.
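To put a rough number on how much trouble fast branching can cause, here is a minimal sketch using the textbook three-species coalescent result: the chance that a gene's tree matches the species tree is 1 − (2/3)·e^(−t), where t is the length of the internal branch in coalescent units (generations divided by twice the effective population size, for diploids). The function name and the example branch lengths are my own assumptions for illustration; nothing here comes from the Nosenko paper.

```python
import math

def gene_tree_concordance(t):
    """Probability that a gene tree matches a three-species species tree.

    t is the internal branch length in coalescent units
    (generations / (2 * effective population size) for diploids).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # With probability exp(-t) the two sister lineages fail to coalesce on the
    # internal branch; the gene tree is then right only one time in three.
    return 1.0 - (2.0 / 3.0) * math.exp(-t)

# Hypothetical branch lengths, from a near-instant split to a leisurely one.
for t in (0.0, 0.1, 1.0, 5.0):
    print(f"t = {t:>4}: P(gene tree = species tree) = {gene_tree_concordance(t):.3f}")
```

When the split is effectively instantaneous (t close to 0), each of the three possible gene trees is about equally likely, so individual genes tell you next to nothing about the species tree.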

Perhaps we haven’t got a good enough sample of genes, animals, or both.

If early animal evolution was bush-like, only a large amount of good data has any hope of accurately resolving how it went. But finding suitable genes for phylogenetic analysis is not easy. They have to be known in all of our species. They should have unambiguous identities so we know we’re actually comparing the same gene across species. They should evolve slowly enough that chance hasn’t had time to wash away their records of relatedness.

Likewise, picking suitable species can be difficult. Aside from the availability of sequences, the two greatest problems are taxon sampling and long branches. Good taxon sampling means covering the diversity of a group. So for example, if you have to pick three vertebrates, you don’t want them all to be mammals. A mammal, a shark and, say, a bony fish would be a much more representative sample.

Long branches are the bogeyman of phylogenetics. “Long” here means many evolutionary changes compared to other lineages in your sample. Similarities in gene/protein sequences are not always due to shared ancestry: because there’s a limited number of letters in the DNA and protein alphabets, sometimes they happen just by chance. If you have two unusually long branches, they might have a lot of these chance similarities, many more than either of them shares with its true relatives by common ancestry. Some of the newer changes might also have overwritten the older similarities linking them with their real families, a problem known as saturation. The overall outcome is that long branches attract each other.
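Saturation is easy to see with a toy calculation. The sketch below assumes the Jukes-Cantor model, the simplest DNA model there is (four bases, all changes equally likely), which is my choice for illustration and not one of the protein models used in the paper. Under that assumption, the proportion of sites that differ between two sequences levels off as the true number of changes keeps climbing.

```python
import math

def jc_expected_p(d):
    """Expected proportion of differing sites after d substitutions per site
    under the Jukes-Cantor model."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def jc_distance(p):
    """Invert the formula: estimate substitutions per site from an observed
    proportion of differing sites p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical amounts of true change, in substitutions per site.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"true changes per site = {d:>4}: observed difference = {jc_expected_p(d):.3f}")
```

The observed difference never passes 75%, which is the flip side of two unrelated DNA sequences matching at about a quarter of their sites by pure chance. Once a pair of long branches is close to that ceiling, the sequences can no longer tell you how far apart they really are.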

Last but not least, perhaps the assumptions we put into our analyses don’t actually fit the data. All phylogenetic analyses are based on a model of evolution. For molecular data, these models specify, for example, how likely different sequence changes are, and which bases or amino acids are commonest and rarest. All analyses also need a way of picking the best tree; the options range from simply choosing the one with the fewest changes to choices based on complicated probability theory. Sometimes, models and methods still work reasonably well when their assumptions are violated, but, as you might expect, counting on that is generally a stupid idea.
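For the curious, the “fewest changes” criterion (parsimony) is simple enough to sketch in a few lines. The code below counts the minimum number of changes a tree requires using Fitch's algorithm. The four taxa, their made-up sequences and all the function names are toy assumptions of mine, and this is not how the studies discussed here were analysed.

```python
def fitch_site(tree, states):
    """Return (candidate ancestral states, minimum changes) for one alignment column.

    tree is a leaf name (str) or a pair (left_subtree, right_subtree).
    states maps each leaf name to its character at this column.
    """
    if isinstance(tree, str):              # leaf: observed state, no changes yet
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    shared = left_set & right_set
    if shared:                             # children can agree: no extra change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over every column of the alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total

# Hypothetical four-taxon alignment and the three possible groupings.
alignment = {"A": "AAGT", "B": "AAGC", "C": "CTGT", "D": "CTAC"}
trees = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}
for name, tree in trees.items():
    print(name, parsimony_score(tree, alignment))
```

With these toy sequences, the tree grouping A with B needs 5 changes, while the other two need 6 and 7, so parsimony would pick the first one. Note that the last column actually favours a different grouping: that's the kind of conflicting signal that chance similarities on long branches can pile up.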

Nosenko et al. (2013) come to the conclusion that the issue of non-bilaterian animal phylogeny is plagued by pretty much the whole package.

Dissecting the problem

First, studies may have increased the size of their datasets by incorporating less than ideal genes. To test the effect of gene sampling, Nosenko et al. (2013) divided their collection of 122 genes into two parts. One consisted of genes involved in protein synthesis, mostly genes encoding ribosomal proteins, which all evolve very slowly. The other was a mixed bag of non-ribosomal genes with all sorts of functions and evolutionary rates.

Perhaps not surprisingly, the latter set displayed a much higher level of saturation. Accordingly, when they analysed the ribosomal dataset with models of evolution that are more prone to errors due to saturation, they got the same trees they’d seen using more accurate models on the non-ribosomal data. Clearly, saturation, gene and model choice are affecting the answers they’re getting, and they are all problems that would affect your average phylogenomic study.

Second, the authors found every indication of a serious long-branch problem. In most phylogenetic trees, the longest branch is the outgroup. Outgroups are organisms outside your group of interest (the ingroup). Similarities between the outgroup and members of the ingroup are likely to have evolved before the origin of the ingroup, therefore they can be used to locate the root of the ingroup tree. However, outgroups are rarely sampled as well as ingroups, hence they tend to form long branches, making them a liability.

In the case of animals, removing the outgroup cleared the disagreements between the different gene sets, demonstrating that some of them had been due to long-branch artefacts. (Of course, without an outgroup you don’t know which animal lineages split first, which makes this solution not much use at all for important evolutionary questions like what the common ancestor of all animals looked like.)

Likewise, using a more distant outgroup changed the trees considerably. Ctenophores are worth special mention here. When Dunn et al. (2008) placed these jellyfish-like creatures as the sister group to all other animals, it was an odd, unexpected result. Well, ctenophore genomes evolve ridiculously fast, and there’s a good chance that their position “way out there” is an artefact of that. In Nosenko et al.’s analyses, they ended up in the Dunn position when the more saturated non-ribosomal data were used – or when the ribosomal dataset was analysed with a more distant outgroup. When everything possible was done to reduce long-branch issues, they stayed deep in the crown of the tree next to cnidarians.

Fourth, the assumptions of even the best evolutionary model don’t take into account an annoying property of protein sequences: their overall amino acid compositions can differ across lineages. Changing the entire makeup of an organism’s protein complement involves changes in evolutionary patterns that none of the models account for. Once again, those damned ctenophores are one of the problem taxa with “deviant” sequence compositions. (The even worse news is that the closest available outgroups also differ from typical animals in this respect.)
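A crude way to spot “deviant” compositions is to compare each taxon's amino acid counts with what you'd expect from the pooled dataset. The sketch below computes a chi-square-style statistic for that comparison. The sequences, taxon labels and function names are invented for illustration, and I'm not claiming this is the test the authors ran.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_counts(seq):
    """Count the 20 standard amino acids, ignoring gaps and ambiguity codes."""
    return Counter(c for c in seq.upper() if c in AMINO_ACIDS)

def pooled_frequencies(sequences):
    """Overall amino-acid frequencies across all (non-empty) sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(amino_acid_counts(seq))
    total = sum(counts.values())
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

def composition_chi2(seq, pooled):
    """Chi-square statistic: one sequence's counts versus pooled expectations."""
    counts = amino_acid_counts(seq)
    total = sum(counts.values())
    stat = 0.0
    for aa in AMINO_ACIDS:
        expected = pooled[aa] * total
        if expected > 0:                   # skip residues absent from the dataset
            stat += (counts[aa] - expected) ** 2 / expected
    return stat

# Hypothetical toy data: two similar taxa and one with a skewed composition.
sequences = {
    "typical_1": "ACDEFGHIKL" * 2,
    "typical_2": "ACDEFGHIKL" * 2,
    "deviant": "AAAAAAAAKKKKKKKKLLLL",
}
pooled = pooled_frequencies(sequences.values())
for name, seq in sequences.items():
    print(f"{name}: chi-square = {composition_chi2(seq, pooled):.2f}")
```

A taxon with a much larger statistic than the rest is one whose proteins are built from a noticeably different mix of amino acids, which is exactly the situation a single, tree-wide model of composition can't describe.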

Fifth, taxon sampling is influencing what you get. For example, the more sponges Nosenko et al. included, the more support they got for sponges being a single lineage. Ctenophores probably also suffer from this problem. For one thing, they’re very poorly known in almost every way that is relevant to picking species for phylogenetic analysis.

For another, they may actually have an additional problem that is literally impossible to crack – phylogenetic analysis of ctenophores themselves and a look at their fossil record hint that most ctenophore lineages have died out, with existing species all coming from a relatively recent common ancestor. That would make the entire phylum incurably long-branched no matter how many living species you throw at your datasets!

And finally, the ribosomal dataset that was the least prone to long-branch artefacts and the most informative about the deepest branches in animal phylogeny comes with a big caveat: it’s not a random selection of genes. In fact, all of these genes are interacting parts of a single system, which means they might not evolve independently (in the statistical sense). Are they all affected by a common set of biases, and does it render them unsuitable for recovering the true history of animals? We don’t yet know.

Hope dies last…

Being the phylogeny nut that I am, I really enjoyed this dissection of a thorny problem. At the same time, the results are kind of depressing. (Especially if, like me, you’re interested in early animal evolution.) No matter how carefully you set up your analysis, biases lurk around the corner waiting to jump on you and destroy your conclusions. You have a choice between not knowing where to root the tree of animals and being screwed by the outgroup. Well-worn measures of statistical confidence can support contradictory hypotheses. Ctenophores are fucking hopeless.

Is there anything we can do about this conundrum? Nosenko et al. conclude their paper on a somewhat hopeful note. There are other methods in molecular phylogenetics than simple sequence comparison. Although they’ve been no more helpful so far than traditional sequence analysis, we’re getting more and more full genome sequences from all over the animal kingdom. There’s more to look at than ever. Perhaps, one day, we’ll find a tool that can trim this thorny beast of a bush (or bush of beasts?) into shape.

Meanwhile, the quandary of deep animal phylogeny stands as a reminder that science is not all-powerful. The universe is a puzzle, but we have no reason to assume that nature left us enough information to solve it all. Which, as far as I’m concerned, shouldn’t stop us from trying. 😉



Dunn CW et al. (2008) Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452:745-749

Erwin DH et al. (2011) The Cambrian conundrum: early divergence and later ecological success in the early history of animals. Science 334:1091-1097

Nosenko T et al. (2013) Deep metazoan phylogeny: when different genes tell different stories. Molecular Phylogenetics and Evolution (in press), doi: 10.1016/j.ympev.2013.01.010

Philippe H et al. (2009) Phylogenomics revives traditional views on deep animal relationships. Current Biology 19:706-712

Pick KS et al. (2010) Improved phylogenomic taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution 27:1983-1987

Rokas A & Carroll SB (2006) Bushes in the tree of life. PLoS Biology 4:e352

Schierwater B et al. (2009) Concatenated analysis sheds light on early metazoan evolution and fuels a modern “urmetazoon” hypothesis. PLoS Biology 7:e20

Sperling EA et al. (2009) Phylogenetic-signal dissection of nuclear housekeeping genes supports the paraphyly of sponges and the monophyly of Eumetazoa. Molecular Biology and Evolution 26:2261-2274