Thornbushes

When I discussed sponge microRNAs last week, I said deep animal phylogeny was difficult. Quite fortuitously, another paper went online recently that explores exactly this difficulty (Nosenko et al., 2013). Following on from the microRNA post, I’ll use this paper as an excuse/guide to discuss the tangled relationships of animals.

First of all, let’s recap the problem. My trusty old family tree of animals just so happens to be an excellent illustration:

[Figure: family tree of animals]

When I first made this tree to explain what the hell I was talking about re: the Cambrian creature Nectocaris, I put in some question marks mostly out of laziness. To illustrate why the “old” Nectocaris didn’t make sense, I only needed the relationships of bilaterians among themselves. Everything outside the Bilateria was irrelevant to the little creature’s mystery, so I decided to forgo reading up on them and stay on an uninformed fence.

But, in fact, said fence is not just my half-arsed perch. I appear to share it with an entire, very much whole-arsed field. While there’s now reasonable agreement over ecdysozoans and deuterostomes and all that jazz, the non-bilaterians still wander all over the place depending on how you do your analysis. Nosenko et al. cite a number of recent large-scale studies, and point out that they totally fail to agree where to put poor Trichoplax and jellies of various kinds. The other thing they fail at is deciding how many branches sponges actually represent (the problem the microRNA study I discussed tried to tackle). To illustrate the extent of the chaos, I sketched the phylogenies that six recent studies cited by Nosenko and colleagues came up with (sponge lineages are marked by dots):

[Figure: animal phylogenies from six recent studies; sponge lineages marked by dots]

Remarkably, all six studies agree on the basic deuterostome-ecdysozoan-lophotrochozoan arrangement inside Bilateria in spite of using different sets of bilaterian species. In contrast, the non-bilaterian animals – sponges of all kinds, cnidarians, ctenophores and Trichoplax – appear in pretty much every conceivable configuration.

A plethora of pitfalls

Why? What makes these questions so difficult that datasets made of 100+ genes from dozens of species representing all major animal groups and using the best available methods have this much trouble answering them?

Time is probably not the issue, or at least not in the simple sense of “it all happened too long ago”. The Nosenko paper brings up the example of fungi, which are roughly as ancient (or, in the context of all living things, as young) as animals. Studies that tried to use the exact same set of genes to analyse the relationships within each group could apparently produce a nice clear tree for fungi. Animals? A whole lot of noise.

Perhaps the “tree” of animals is really more like Rokas and Carroll’s (2006) evolutionary bushes, with its base branching so quickly that genes didn’t have time to accumulate many informative changes between one split and the next. Perhaps it even happened so fast that ancient within-species sequence variation was carried through several such events, resulting in what population geneticists call incomplete lineage sorting, a situation where the history of genes is not the same as the history of species.
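To put a rough number on how fast "too fast" is, here is a minimal sketch. It assumes the textbook three-species result from coalescent theory (not anything taken from Nosenko et al.): a gene tree matches the species tree with probability 1 - (2/3)e^(-T), where T is the length of the internal branch in coalescent units. The function name is made up for illustration.

```python
import math

def p_gene_tree_matches(t_internal):
    """Probability that a gene tree matches a three-species tree.

    t_internal: length of the internal branch in coalescent units
    (generations divided by 2N for a diploid population). Assumes the
    standard neutral coalescent; an illustration, not a method from
    the paper.
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t_internal)

# Shorter internal branches leave less time for lineages to sort.
for t in (0.0, 0.1, 1.0, 5.0):
    print(f"T = {t:>4}: P(match) = {p_gene_tree_matches(t):.3f}")
```

When the internal branch shrinks to nothing, the three possible gene trees become equally likely, so genes sampled from a bush-like radiation can disagree with each other for perfectly real reasons.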

Perhaps we haven’t got a good enough sample of genes, animals, or both.

If early animal evolution was bush-like, only a large amount of good data has any hope of accurately resolving how it went. But finding suitable genes for phylogenetic analysis is not easy. They have to be known in all of our species. They should have unambiguous identities so we know we’re actually comparing the same gene across species. They should evolve slowly enough that chance hasn’t had time to wash away their records of relatedness.

Likewise, picking suitable species can be difficult. Aside from the availability of sequences, the two greatest problems are taxon sampling and long branches. Good taxon sampling means covering the diversity of a group. So for example, if you have to pick three vertebrates, you don’t want them all to be mammals. A mammal, a shark and, say, a bony fish would be a much more representative sample.

Long branches are the bogeyman of phylogenetics. “Long” here means many evolutionary changes compared to other lineages in your sample. Similarities in gene/protein sequences are not always due to shared ancestry: because there’s a limited number of letters in the DNA and protein alphabets, sometimes they happen just by chance. If you have two unusually long branches, they might have a lot of these chance similarities, many more than either of them shares with its true relatives by common ancestry. Some of the newer changes might also have overwritten the older similarities linking them with their real families, a problem known as saturation. The overall outcome is that long branches attract each other.
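Saturation is easy to see with a little arithmetic. The sketch below assumes the simplest possible model, in which every change between letters is equally likely (Jukes-Cantor style for DNA, the same idea with 20 letters for proteins). Real sequences don't behave this nicely, and the function name is mine, so take it as an illustration only.

```python
import math

def expected_observed_difference(d, n_states):
    """Expected fraction of differing sites between two sequences.

    d: true number of substitutions per site separating them.
    n_states: alphabet size (4 for DNA, 20 for proteins).
    Assumes all changes are equally likely, which is a simplification.
    """
    k = n_states
    return (k - 1) / k * (1.0 - math.exp(-k * d / (k - 1)))

# Observed differences lag further and further behind the true amount of change.
for d in (0.1, 0.5, 1.0, 2.0, 5.0):
    dna = expected_observed_difference(d, 4)
    protein = expected_observed_difference(d, 20)
    print(f"true {d:>3} -> observed DNA {dna:.3f}, protein {protein:.3f}")
```

Under this model the observed difference for DNA can never exceed three quarters, because two completely unrelated sequences still match at a quarter of their sites by chance. Past a certain point, more evolution no longer looks like more difference, and that is where long branches start to mislead.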

Last but not least, perhaps the assumptions we put into our analyses don’t actually fit the data. All phylogenetic analyses are based on a model of evolution. For molecular data, these models specify, for example, how likely different sequence changes are, and which bases or amino acids are commonest and rarest. All analyses also need a way of picking the best tree; these range from simply choosing the one with the fewest changes to choices based on complicated probability theory. Sometimes, models and methods still work reasonably well when their assumptions are violated, but, as you might expect, counting on that is generally a stupid idea.
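The "fewest changes" criterion is usually called maximum parsimony. As a minimal sketch of what counting changes means, here is Fitch's classic counting rule applied to a single site. The four species, the site and the nested-tuple tree format are all made up for illustration; real analyses sum over thousands of sites and search through many candidate trees.

```python
def fitch_changes(tree, tip_states):
    """Return (possible_states, min_changes) for one site on a rooted binary tree.

    tree: a tip name (str) or a pair (left_subtree, right_subtree).
    tip_states: dict mapping tip name -> observed character at this site.
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch_changes(left, tip_states)
    right_states, right_changes = fitch_changes(right, tip_states)
    shared = left_states & right_states
    if shared:
        # The two subtrees can agree on a state: no extra change needed here.
        return shared, left_changes + right_changes
    # No state in common: at least one change must have happened.
    return left_states | right_states, left_changes + right_changes + 1

# Hypothetical site in four made-up species.
site = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

for name, tree in (("tree_1", tree_1), ("tree_2", tree_2)):
    print(name, "needs", fitch_changes(tree, site)[1], "change(s)")
```

Parsimony prefers the tree that needs fewer changes. Note that it has no notion of chance similarity at all, which is one reason simple methods are especially vulnerable to long branches.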

Nosenko et al. (2013) come to the conclusion that the issue of non-bilaterian animal phylogeny is plagued by pretty much the whole package.

Dissecting the problem

First, studies may have increased the size of their datasets by incorporating less than ideal genes. To test the effect of gene sampling, Nosenko et al. (2013) divided their collection of 122 genes into two parts. One consisted of genes involved in protein synthesis, mostly genes encoding ribosomal proteins, which all evolve very slowly. The other was a mixed bag of non-ribosomal genes with all sorts of functions and evolutionary rates.

Perhaps not surprisingly, the latter set displayed a much higher level of saturation. Accordingly, when they analysed the ribosomal dataset with models of evolution that are more prone to errors due to saturation, they got the same trees they’d seen using more accurate models on the non-ribosomal data. Clearly, saturation, gene and model choice are affecting the answers they’re getting, and they are all problems that would affect your average phylogenomic study.

Second, the authors found every indication of a serious long-branch problem. In most phylogenetic trees, the longest branch is the outgroup. Outgroups are organisms outside your group of interest (the ingroup). Similarities between the outgroup and members of the ingroup are likely to have evolved before the origin of the ingroup, therefore they can be used to locate the root of the ingroup tree. However, outgroups are rarely sampled as well as ingroups, hence they tend to form long branches, making them a liability.

In the case of animals, removing the outgroup cleared the disagreements between the different gene sets, demonstrating that some of them had been due to long-branch artefacts. (Of course, without an outgroup you don’t know which animal lineages split first, which makes this solution not much use at all for important evolutionary questions like what the common ancestor of all animals looked like.)

Likewise, using a more distant outgroup changed the trees considerably. Ctenophores are worth special mention here. When Dunn et al. (2008) placed these jellyfish-like creatures as the sister group to all other animals, it was an odd, unexpected result. Well, ctenophore genomes evolve ridiculously fast, and there’s a good chance that their position “way out there” is an artefact of that. In Nosenko et al.’s analyses, they ended up in the Dunn position when the more saturated non-ribosomal data were used – or when the ribosomal dataset was analysed with a more distant outgroup. When everything possible was done to reduce long-branch issues, they stayed deep in the crown of the tree next to cnidarians.

Third, the assumptions of even the best evolutionary model don’t take into account an annoying property of protein sequences: their overall amino acid compositions can differ across lineages. Changing the entire makeup of an organism’s protein complement involves changes in evolutionary patterns that none of the models account for. Once again, those damned ctenophores are one of the problem taxa with “deviant” sequence compositions. (The even worse news is that the closest available outgroups also differ from typical animals in this respect.)

Fourth, taxon sampling is influencing what you get. For example, the more sponges Nosenko et al. included, the more support they got for sponges being a single lineage. Ctenophores probably also suffer from this problem. For one thing, they’re very poorly known in almost every way that is relevant to picking species for phylogenetic analysis.

For another, they may actually have an additional problem that is literally impossible to crack – phylogenetic analysis of ctenophores themselves and a look at their fossil record hint that most ctenophore lineages have died out, with existing species all coming from a relatively recent common ancestor. That would make the entire phylum incurably long-branched no matter how many living species you throw at your datasets!

And finally, the ribosomal dataset that was the least prone to long-branch artefacts and the most informative about the deepest branches in animal phylogeny comes with a big caveat: it’s not a random selection of genes. In fact, all of these genes are interacting parts of a single system, which means they might not evolve independently (in the statistical sense). Are they all affected by a common set of biases, and does it render them unsuitable for recovering the true history of animals? We don’t yet know.

Hope dies last…

Being the phylogeny nut that I am, I really enjoyed this dissection of a thorny problem. At the same time, the results are kind of depressing. (Especially if, like me, you’re interested in early animal evolution.) No matter how carefully you set up your analysis, biases lurk around the corner waiting to jump on you and destroy your conclusions. You have a choice between not knowing where to root the tree of animals and being screwed by the outgroup. Well-worn measures of statistical confidence can support contradictory hypotheses. Ctenophores are fucking hopeless.

Is there anything we can do about this conundrum? Nosenko et al. conclude their paper on a somewhat hopeful note. There are other methods in molecular phylogenetics than simple sequence comparison. Although they’ve been no more helpful so far than traditional sequence analysis, we’re getting more and more full genome sequences from all over the animal kingdom. There’s more to look at than ever. Perhaps, one day, we’ll find a tool that can trim this thorny beast of a bush (or bush of beasts?) into shape.

Meanwhile, the quandary of deep animal phylogeny stands as a reminder that science is not all-powerful. The universe is a puzzle, but we have no reason to assume that nature left us enough information to solve it all. Which, as far as I’m concerned, shouldn’t stop us from trying. 😉

***

References:

Dunn CW et al. (2008) Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452:745-749

Erwin DH et al. (2011) The Cambrian conundrum: early divergence and later ecological success in the early history of animals. Science 334:1091-1097

Nosenko T et al. (2013) Deep metazoan phylogeny: when different genes tell different stories. Molecular Phylogenetics and Evolution (in press), doi: 10.1016/j.ympev.2013.01.010

Philippe H et al. (2009) Phylogenomics revives traditional views on deep animal relationships. Current Biology 19:706-712

Pick KS et al. (2010) Improved phylogenomic taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution 27:1983-1987

Rokas A & Carroll SB (2006) Bushes in the tree of life. PLoS Biology 4:e352

Schierwater B et al. (2009) Concatenated analysis sheds light on early metazoan evolution and fuels a modern “urmetazoon” hypothesis. PLoS Biology 7:e20

Sperling EA et al. (2009) Phylogenetic-signal dissection of nuclear housekeeping genes supports the paraphyly of sponges and the monophyly of Eumetazoa. Molecular Biology and Evolution 26:2261-2274

Another man after my own heart

It’s not terribly hard to turn me into a squealing fangirl. One of the ways is to agree with me eloquently and/or share my pet peeves. Another is to give me lightbulb moments. A third is to disagree with me in a well-reasoned, intelligent way. And finally, if I see you thoughtfully examining your own thinking, you are awesome by definition. Michaël Manuel’s monster review of body symmetry and polarity in animals (Manuel, 2009) did all of the above.

(In case you wondered, that means a long, squeeful meandering >.>)

Manuel writes about the evolution of two fundamental properties of animal body plans [1]: symmetry and polarity. You probably have a good intuitive understanding of symmetry, but here’s a definition anyway. An object is symmetrical if you can perform some transformation (rotation, reflection, shifting etc.) on it and get the same shape. Polarity is a different but equally simple concept – it basically means that one end of an object is different from the other, like the head and tail of a cat or the inner and outer arcs of a rainbow.

I can’t say that I’d thought an awful lot about either before I came across this review, so it’s not really surprising that I had lightbulbs going off in my head left and right while I was reading it. Because I hadn’t thought deeply about symmetry, polarity and complexity, I basically held the mainstream view, which I (and, I suspect, most of the mainstream) mostly picked up by osmosis.

That meant I fell victim to my own biggest pet peeve big time – I believed, without good reason and without even realising, that the body plan symmetries of major lineages of living animals represented successive increases in complexity. Sponges are kind of asymmetrical, cnidarians and ctenophores are radially symmetrical, and bilaterians such as ourselves have (more or less) mirror image symmetry, and these kinds of symmetry increase in complexity in this order. Only… they aren’t, and they don’t.

It turns out that this guy not only shares my pet peeve but uses it to demolish my long-held hidden assumptions. Double fangirl points!

Let there be light(bulbs)!

Problem number one with the traditional view – aside from ignoring that evolution ain’t a ladder – is that the distribution of symmetry types among animals is a little more complicated. Most importantly, most kinds of sponges are not asymmetrical. Most species may be, but that’s not the same thing. You see, most sponge species are demosponges, which make up only one of the four great divisions among sponges. Demosponges do have a tendency towards looking a bit amorphous, but the other three – calcareous sponges, glass sponges and homoscleromorphs – usually are some kind of symmetrical. All in all, the evidence points away from an asymmetrical animal ancestor. (Below: calcareous sponges being blatantly symmetrical, from Haeckel’s Kunstformen der Natur.)

The second problem is that my old view ignores at least one important kind of symmetry. Some “radially” symmetrical animals are actually closer to cylindrical symmetry. To understand the difference, imagine rotating a brick and a straight piece of pipe around their respective long axes. You can rotate the pipe as much or as little as you like, it’ll look exactly the same. In contrast, the only rotation that brings the brick back onto itself is turning it by 180° or multiples thereof. A pipe, with its infinitely many rotational symmetries, is cylindrically symmetrical, while the brick has a finite number of rotational symmetries [2], making it radially symmetrical.
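If you’d rather count symmetries than imagine them, here is a minimal sketch. It looks at a shape end-on (as a set of corner points in the plane) and finds the largest n for which a turn of 360/n degrees puts every point back on top of some point of the shape. The function names and the example shapes are my own; a true circle can’t be written as a finite list of corners, so regular polygons with more and more corners stand in for the pipe.

```python
import math

def rotational_order(points, max_order=12, tol=1e-9):
    """Largest n <= max_order such that a turn of 360/n degrees about the
    origin maps the set of 2-D points onto itself (1 if none does)."""
    def maps_onto_itself(angle):
        c, s = math.cos(angle), math.sin(angle)
        for x, y in points:
            rx, ry = c * x - s * y, s * x + c * y
            # Every rotated point must land on one of the original points.
            if not any(math.hypot(rx - px, ry - py) < tol for px, py in points):
                return False
        return True

    best = 1
    for n in range(2, max_order + 1):
        if maps_onto_itself(2 * math.pi / n):
            best = n
    return best

def regular_polygon(n):
    """Corners of a regular n-gon centred on the origin."""
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]

brick_end_on = [(2, 1), (-2, 1), (-2, -1), (2, -1)]
print("brick:", rotational_order(brick_end_on))
print("octagon:", rotational_order(regular_polygon(8)))
```

The brick only ever gets the half turn, while the polygons gain a symmetry with every corner you add. Cylindrical symmetry is what you get when that number runs off to infinity.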

Problem number three is that bilateral symmetry is actually no more complex than radial symmetry! What does “complexity” mean in this context? Manuel defines it as the number of coordinates required to specify any point in the animal’s body. In an animal with cylindrical symmetry, you only need a maximum of two: where along the main body axis and how far from the main body axis you are. Everything else is irrelevant, since these are the only axes along which the animal may be polarised. (Add any other polarity axis, and you’ve lost the cylindrical symmetry.)

Take a radially symmetrical creature, like a jellyfish. These also have a main rotational axis and an inside-outside axis of polarity. However, now the animal’s circumference is also divided up into regions, like slices in a cake. How does a skin cell around a baby jelly’s mouth know whether it’s to grow out into a tentacle or contribute to the space between tentacles? That is an extra instruction, an extra layer of complexity. We’re up to three. (Incidentally, here’s some jellyfish symmetry from Haeckel’s Kunstformen. [Here are photos of the real animal.] A big cheat he may have been, but ol’ Ernst Haeckel certainly had an eye for beauty!)

And with that, jellies and their kin essentially catch up to the basic bilaterian plan. Because what do you need to specify a worm? You need a head-to-tail coordinate, you need a top-to-bottom one, and you need to say how far from the plane of symmetry you are. Still only three! Many bilaterians, including us, added a fourth coordinate by having different left and right sides, but that’s almost certainly not how we started when we split from the cnidarian lineage. (Below: radial symmetry doesn’t hold a monopoly on beauty! Three-striped flatworm [Pseudoceros tristriatus] by wildsingapore.)
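The bookkeeping from the last few paragraphs fits in a few lines. This is just a tally of the coordinates described above, with axis labels that are my paraphrase rather than Manuel’s own terminology.

```python
# Coordinates needed to locate a point in each kind of body plan,
# following the counting described in the text. Labels are paraphrased.
POLARITY_AXES = {
    "cylindrical": [
        "position along the main axis",
        "distance from the main axis",
    ],
    "radial": [
        "position along the main axis",
        "distance from the main axis",
        "position around the circumference",
    ],
    "bilateral": [
        "head to tail",
        "top to bottom",
        "distance from the plane of symmetry",
    ],
    "bilateral with left-right differences": [
        "head to tail",
        "top to bottom",
        "distance from the plane of symmetry",
        "left or right side",
    ],
}

def complexity(symmetry):
    """Number of coordinates needed under this way of counting."""
    return len(POLARITY_AXES[symmetry])

for symmetry in POLARITY_AXES:
    print(f"{symmetry}: {complexity(symmetry)}")
```

Laid out like this, radial and bilateral body plans tie at three, and the only step up from the basic bilaterian plan is the left-right asymmetry that some of us added later.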

Not only that, but Manuel argues that there’s very little evidence bilateral symmetry evolved from radial symmetry. By his reckoning, the most likely symmetry of the cnidarian-bilaterian common ancestor was cylindrical and not radial (more on this later, though). Thus the (mostly) radial cnidarians and the (mostly) bilateral bilaterians represent separate elaborations of a cylinder rather than stages in the same process.

There were a bunch more smaller lightbulb moments, but I’m already running long, so let’s get on to other things.

Respectful disagreement

I think my disagreements with Manuel’s review are more of degree than of kind. Our fundamental difference of opinion comes back to the symmetries of various ancestors and the evidence for them. He argues that key ancestors in animal phylogeny – that of cnidarians + bilaterians, that of cnidarians + bilaterians + ctenophores, and that of all animals – were cylindrical. (Below is the reference tree Manuel uses for his discussion, with symmetry types indicated by the little icons.)

[Figure: Manuel’s reference phylogeny, with symmetry types indicated by icons]

I think he may well be correct in his conclusions, but I’m not entirely comfortable with his reasons. For example, he infers that the last common ancestor of cnidarians and ctenophores was cylindrical. One of his main arguments is that the repeated structures that “break up the cylinder” to confer radial symmetry are not the same in these two phyla. I think this is an intelligent point a smart guy who knows his zoology would make, so disagreement with it becomes debate as opposed to steamrolling [3].

Why do I still disagree? As I said, it comes down to degrees and not kinds. Manuel considers the above evidence against a radially symmetrical common ancestor. I consider it lack of evidence for same. The situation reminds me of Erwin and Davidson (2002), which is also one of my favourite papers ever. They raise perhaps the most important point one could make about comparative developmental genetics: homologous pathways could have been present in common ancestors without the complex structures now generated by those pathways being there. Likewise, I think, radial symmetry could have been there in the common ancestor of cnidarians and ctenophores while none of the complex radially symmetrical structures (tentacles, stomach pouches, comb rows etc.) in the living animals were. Perhaps there were simpler divisions of cell types or whatnot that gave rise to the more overt radial symmetry of jellyfish, sea anemones and comb jellies.

In a related argument, Manuel discusses the homology (or lack thereof) of the dorsoventral axis in bilaterians and the so-called directive axis in sea anemones. Sea anemones actually show hints of bilateral symmetry, which prompted some authors (e.g. Baguñà et al., 2008) to argue that this bilateral symmetry and ours was inherited from a common ancestor (i.e. the cnidarian-bilaterian ancestor was bilateral).

I agree with Manuel that the developmental genetic evidence for this is equivocal at best. I even agree with him that developmental genetics isn’t decisive evidence for homology even if it matches better than it actually does in this case. But again, once the genetic evidence is dismissed as inconclusive, he relies on the non-homology of bilaterally symmetrical structures to conclude non-homology of bilateral symmetry. Again, I think this is a plausible but premature inference. Since I’m not sure whether homology or independent origin of bilateral symmetry is the better default hypothesis in this case, and I don’t think the evidence for/against either is convincing, I actually wouldn’t come down on either side as of yet.

But I can see his point, and that’s really cool.

Why else you’re awesome, Michaël Manuel…

Because you have a whole rant about “basal lineages”. I grinned like a maniac throughout your penultimate paragraph. Incidentally, you might have given me another favourite paper – anything with “basal baloney” in its title sounds like it’s worth a few squees of its own!

Because you apply critical thinking to your own thinking. See where we disagreed, non-homology of structures vs. symmetries, evidence against vs no evidence for, and all that? After you made the argument from non-homology of structures, I expected you to leave it at that. And you didn’t. You went and acknowledged its limitations, even though you stood by your original conclusions in the end.

Because you reminded me that radial symmetry is similar to metamerism/segmentation. I’d thought of that before, but it sort of went on holiday for a long time. Connections, yay!

Because you were suspicious about sponges’ lack of Hox/ParaHox genes. And how right you were!

*

Phew, that turned out rather longer and less coherent than I intended. And I didn’t even cover half of the stuff in my notes. I obviously really, really loved this paper…

***

[1] Or any body plan, really…

[2] Astute readers might have noticed that a brick has more than one axis of symmetry, plus several planes of symmetry as well. So it’s not only radially but also bilaterally symmetrical. The one thing it certainly isn’t is cylindrical 😉

[3] Not to say I don’t enjoy steamrolling obvious nonsense, but I also like growing intellectually, and steamrolling obvious nonsense rarely stretches the mind muscles…

***

References:

Baguñà J et al. (2008) Back in time: a new systematic proposal for the Bilateria. Philosophical Transactions of the Royal Society B 363:1481-1491

Erwin DH & Davidson EH (2002) The last common bilaterian ancestor. Development 129:3021-3032

Manuel M (2009) Early evolution of symmetry and polarity in metazoan body plans. Comptes Rendus Biologies 332:184-209