In which a “living fossil’s” genome delights me

I promised myself I wouldn’t go on for thousands and thousands of words about the Lingula genome paper (I’ve got things to do, and there is a LOT of stuff in there), but I had to indulge myself a little bit. Four or five years ago when I was a final year undergrad trying to figure out things about Hox gene evolution, I would have killed for a complete brachiopod genome. Or even a complete brachiopod Hox cluster. A year or two ago, when I was trying to sweat out something resembling a PhD thesis, I would have killed for some information about the genetics of brachiopod shells that amounted to more than tables of amino acid abundances. Too late for my poor dissertations, but a brachiopod genome is finally sequenced! The paper is right here, completely free (Luo et al., 2015). Yay for labs who can afford open-access publishing!

In case you’re not familiar with Lingula, it’s this guy (image from Wikipedia):

In a classic case of looks being deceiving, it’s not a mollusc, although it does look a bit like one except for the weird white stalk sticking out of the back of its shell. Brachiopods, the phylum to which Lingula belongs, are one of those strange groups no one really knows where to place, although nowadays we are pretty sure they are somewhere in the general vicinity of molluscs, annelid worms and their ilk. Unlike bivalve molluscs, whose shell valves are on the left and right sides of the animal, the shells of brachiopods like Lingula have top and bottom valves. Lingula’s shell is also made of different materials: while bivalve shells contain calcium carbonate deposited into a mesh of chitin and silk-like proteins,* the subgroup of brachiopods Lingula belongs to uses calcium phosphate, the same mineral that dominates our bones, and a lot of collagen (again like bone). But we’ll come back to that in a moment…

One of the reasons the Lingula genome is particularly interesting is that Lingula is a classic “living fossil”. In the Paleobiology Database, there’s even an entry for a Cambrian fossil classified as Lingula, and there are plenty of entries from the next geological period. If the database is to be believed, the genus Lingula has existed for something like 500 million years, which must be some kind of record for an animal.** Is its genome similarly conservative? Or did the DNA hiding under a deceptively conservative shell design evolve as quickly as anyone’s?

In a heroic feat of self-control, I’m not spending all night poring over the paper, but I did give a couple of interesting sections a look. Naturally, the first thing I dug out was the Hox cluster hiding in the rather large supplement. This was the first clue that Lingula’s genome is definitely “living” and not at all a fossil in any sense of the word. If it were, we’d expect one neat string of Hox genes, all in the order we’re used to from other animals. Instead, what we find is two missing genes, one plucked from the middle of the cluster and tacked onto its “front” end, and two genes totally detached from the rest. It’s not too bad as Hox cluster disintegration goes – six out of nine genes are still neatly ordered – but it certainly doesn’t look like something left over from the dawn of animals.

The bigger clue that caught my eye, though, was this little family tree in Figure 2:

Luo_etal2015-fig2

The red numbers on each branch indicate the number of gene families that expanded or first appeared in that lineage, and the green numbers are the families that shrank or were lost. Note that our “living fossil” takes the lead in both. What I find funny is that it’s miles ahead of not only the animals generally considered “conservative” in terms of genome evolution, like the limpet Lottia and the lancelet Branchiostoma, but also the sea squirt (Ciona). Squirts are notorious for having incredibly fast-evolving genomes; then again, most of that notoriety was based on the crazily divergent sequences and often wildly scrambled order of their genes, not on gene content. A genome can be conservative in some ways and highly innovative in others. In fact, many of the genes involved in basic cellular functions are very slow-evolving in Lingula. (Note also: humans are pretty slow-evolving as far as gene content goes. This is not the first study to find that.)

So, Lingula, living fossil? Not so much.

The last bit I looked at was the section about shell genetics. Although it’s generally foolish to expect the shell-forming gene sets of two animals from different phyla to be similar (see my first footnote), if there are similarities, they could potentially go at least two different ways. First, brachiopods might be quite close to molluscs, which is the hypothesis Luo et al.’s own treebuilding efforts support. Like molluscs, brachiopods also have a specialised mantle that secretes shell material, though having the same name doesn’t mean the two “mantles” actually share a common origin. So who knows, some molluscan shell proteins, or shell regulatory genes, might show up in Lingula, too.

On the other hand, the composition of Lingula’s shell is more similar to our skeletons’. So, since they have to capture the same mineral, could the brachiopods share some of our skeletal proteins? The answer to both questions seems to be “mostly no”.

Molluscan shell matrix proteins, those that are actually built into the structure of the shell, are quite variable even within Mollusca. It’s probably not surprising, then, that most of the relevant genes that are even present in Lingula are not specific to the mantle, and those that are mantle-specific are the kinds of genes that are generally involved in the handling of calcium or the building of the stuff around cells in all kinds of contexts. Some of the regulatory mechanisms might be shared – Luo et al. report that BMP signalling seems to be going on around the edge of the mantle in baby Lingula, and this cellular signalling system is also involved in molluscan shell formation. Then again, a handful of similar signalling systems “are involved” in bloody everything in animal development, so how much we can deduce from this similarity is anyone’s guess.

As for “bone genes” – the ones that are most characteristically tied to bone are missing (disappointingly or reassuringly, take your pick). The SCPP protein family is so far known only from vertebrates, and its various members are involved in the mineralisation of bones and teeth. SCPPs originate from an ancient protein called SPARC, which seems to be generally present wherever collagen is (IIRC, it’s thought to help collagen fibres arrange themselves correctly). Lingula has a gene for SPARC all right, but nothing remotely resembling an SCPP gene.

I mentioned that the shell of Lingula is built largely on collagen, but it turns out that it isn’t “our” kind of collagen. “Collagen” is just a protein with a particular kind of repetitive sequence. Three amino acids (glycine-proline-something else, in case you’re interested) are repeated ad nauseam in the collagen chain, and these repetitive regions let the protein twist into characteristic rope-like fibres that make collagen such a wonderfully tough basis for connective tissue. Aside from the repeats they all share, collagens are a large and diverse bunch. The ones that form most of the organic matrix in bone contain a non-repetitive and rather easily recognised domain at one end, but when Luo et al. analysed the genome and the proteins extracted from the Lingula shell, they found that none of the shell collagens possessed this domain. Instead, most of them had EGF domains, which are pretty widespread in all kinds of extracellular proteins. Based on the genome sequence, Lingula has a whole little cluster of these collagens-with-EGF-domains that probably originated from brachiopod-specific gene duplications.
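For the curious, the "repeated ad nauseam" part is easy to illustrate. Below is a minimal sketch, assuming made-up toy sequences (not real Lingula proteins) and a function name of my own invention. It only checks for glycine at every third position, which is how the collagen repeat is usually summarised (Gly-X-Y); it says nothing about the non-repetitive or EGF domains discussed above, and it is certainly not how Luo et al. annotated their collagens.

```python
import re

# Hypothetical toy sequences in one-letter amino acid code (G = glycine,
# P = proline). These are invented for illustration only.
COLLAGEN_LIKE = "MKT" + "GPP" * 6 + "GPA" * 4 + "LLE"
NOT_COLLAGEN = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"


def longest_gly_triplet_run(seq: str) -> int:
    """Return the length (in triplets) of the longest run of G-x-y repeats.

    The lookahead lets the search start at every position, so the run is
    found whichever reading frame it happens to sit in.
    """
    best = 0
    for match in re.finditer(r"(?=((?:G[A-Z]{2})+))", seq):
        best = max(best, len(match.group(1)) // 3)
    return best


if __name__ == "__main__":
    for name, seq in (("collagen-like", COLLAGEN_LIKE), ("other", NOT_COLLAGEN)):
        print(name, longest_gly_triplet_run(seq))
```

A long run flags a sequence as collagen-like; deciding which kind of collagen it is would need the domains at the ends, which is exactly where the Lingula shell collagens turned out to differ from ours.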

So, to recap: Lingula is not as conservative as its looks would suggest (never judge a living fossil by its cover, right?). We also finally have actual sequences for lots of its shell proteins, which reveal that when it comes to building shells, Lingula does its own thing. Not much of a surprise, but still, knowing is a damn sight better than thinkin’ it’s probably so. We are scientists here, or what.

I am Very Pleased with this genome. (I just wish it was published five years ago 😛 )

***

Notes:

*This, interestingly, doesn’t seem to be the general case for all molluscs. Jackson et al. (2010) compared the genes building the pearly layer of snail (abalone, to be precise) and bivalve (pearl oyster) shells, and found that the snail showed no sign of the chitin-making enzymes and silk-type proteins that were so abundant in its bivalved cousins. It appears that even within molluscs, different groups have found different ways to make often very similar shell structures. However, all mollusc shells, regardless of the underlying genetics, are predominantly composed of calcium carbonate.

**You often hear about sharks, or crocodiles, or coelacanths, existing “unchanged” for 100 or 200 or whatever million years, but in reality, 200-million-year-old crocodiles aren’t even classified in the same families, let alone the same genera, as any of the living species. Again, the living coelacanth is distinct enough from its relatives in the Cretaceous, when they were last seen, to warrant its own genus in the eyes of taxonomists. I’ve no time to check up on sharks, but I’m willing to bet the situation is similar. Whether Lingula’s jaw-dropping 500-million-year tenure on earth is a result of taxonomic lumping or the shells genuinely looking that similar, I don’t know. Anyway, rant over.

***

References:

Jackson DJ et al. (2010) Parallel evolution of nacre building gene sets in molluscs. Molecular Biology and Evolution 27:591-608

Luo Y-J et al. (2015) The Lingula genome provides insights into brachiopod evolution and the origin of phosphate biomineralization. Nature Communications 6:8301

Hi, real world, again!

The Mammal has emerged from a thesis-induced supermassive black hole and a Christmas-induced food coma, only to find that in the month or so that she spent barely functional and buried in chapters covered in the supervisor’s dreaded Red Pen, things actually happened in the world outside. This, naturally, manifested in thousands of items feeling thoroughly neglected in RSS readers and email inboxes. (Jesus. How many times have I vowed never to neglect my RSS feed again? Oh well, it’s not like unemployment is such a busy occupation that I can’t deal with a measly two and a half thousand articles 😛 )

… earlier tonight, the paragraph here said I wasn’t doing a proper post yet, “just pointing out” a couple of the cooler things I’ve missed. Then somehow this thing morphed into a 1000+ word post that goes way beyond “pointing things out”. It’s almost like I’ve been itching to write something that isn’t my thesis. >_>

So the first cool thing I wanted to “point out” is the genome paper of the centipede Strigamia maritima, which is a rather nondescript little beast hiding under rocks on the coasts of Northwest Europe. This is the first sequenced genome of a myriapod – the last great class of arthropods to remain untouched by the genome sequencing craze after many genomes from insects, crustaceans and chelicerates (spiders, mites and co.). The genome sequence itself has been available for years (yay!), but its “official” paper (Chipman et al., 2014) is just recently out.

Part of the appeal of Strigamia – and myriapods in general – is that they are considered evolutionarily conservative for an arthropod. In some respects, the genome analysis confirms this. Compared to its inferred common ancestor with us, Strigamia has lost fewer genes than insects, for example. Quite a lot of its genes are also linked together similarly to their equivalents in distantly related animals, indicating relatively little rearrangement in the last 600 million years or so. But this otherwise conservative genome also has at least one really unique feature.

Specifically, this centipede – which is blind – has not only lost every bit of DNA coding for known light-sensing proteins, but also all known genes specific to the circadian clock. In other animals, genes like clock and period mutually regulate one another in a way that makes the abundance of each gene product oscillate in a regular manner (this is about the simplest graphical representation I could find…). The clock runs on a roughly daily cycle all by itself, but it’s also connected to external light via the aforementioned light-sensing proteins, so we can constantly adjust our internal rhythms according to real day-night cycles.

There are many blind animals, and many that live underground or otherwise find day and night kind of irrelevant, but even these are often found to have a functioning circadian clock or keep some photoreceptor genes around. However, based on the genome data, our favourite centipede may be the first to have completely lost both. The authors of the genome paper hypothesise that this may be related to the length of evolutionary time the animals have spent without light. Things like mole rats are relatively recent “inventions”. However, the geophilomorph order of centipedes, to which Strigamia belongs, is quite old (its most likely sister group is known from the Carboniferous, so they’re probably at least that ancient). Living geophilomorphs are all blind, so chances are they’ve been that way for the last 300+ million years.

Nonetheless, the authors also note that geophilomorphs are still known to avoid light – the question now is how the hell they do it… And, of course, whether Strigamia has a clock is not known – only that it doesn’t have the clock we’re used to. We also have no idea at this point how old the gene losses actually are, since all the authors know is that one other centipede from a different group has perfectly good clock genes and opsins.

In comparison with fruit flies and other insects, the Strigamia genome also reveals that evolutionary cats can be skinned in multiple ways. There is an immune-related gene family we share with arthropods and other animals, called Dscam. The product of this gene is involved in pathogen recognition among other things, and in flies, Dscam genes are divided into roughly 100 chunks or exons, most of which are found in clusters of variant copies. When the gene is transcribed, only one of these copies is used from each such cluster, so in practical terms the handful of fruit fly Dscam genes can encode tens of thousands of different proteins, enough to adapt to a lot of different pathogens.
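The "tens of thousands" figure is just multiplication: one variant is picked per cluster, so the cluster sizes multiply. A minimal sketch, assuming the commonly cited counts of alternative exons for fruit fly Dscam1 (these numbers are not from the Strigamia paper or from this post, and the function name is mine):

```python
from math import prod

# Assumed cluster sizes: commonly cited numbers of mutually exclusive
# variants for the exon 4, 6, 9 and 17 clusters of fruit fly Dscam1.
variant_exons = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(clusters: dict) -> int:
    """One variant is chosen per cluster, so the possibilities multiply."""
    return prod(clusters.values())


total = count_isoforms(variant_exons)
print(total)  # 38016
```

Having many copies of the whole gene only adds possibilities, one per copy, while alternative exons multiply them, which is why a single spliced gene can outnumber dozens of unspliced ones.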

A similar arrangement is seen in the closely related crustaceans, although with fewer potential alternative products. In other groups – the paper uses vertebrates, echinoderms, nematodes and molluscs for comparison – the Dscam family is pretty boring with at most one or two members and none of these duplicated exons and alternative splicing business. However, it looks like insects+crustaceans are not the only arthropods to come up with a lot of Dscam proteins. Strigamia might also make lots of different ones (“only” hundreds in this case), but it achieved this by having dozens of copies of the whole gene instead of performing crazy editing feats on a small number of genes. Convergent evolution FTW!

Before I paraphrase the entire paper in my squeeful enthusiasm (no, seriously, I’ve not even mentioned the Hox genes, and the convergent evolution of chemoreceptors, and I think it’s best if I shut up now), let’s get to something else that I can’t not “point out” at length: a shiny new vetulicolian, and they say it’s related to sea squirts!

Vetulicolians really deserve a proper discussion, but in lieu of a spare week to read up on their messiness, for now, it’s enough to say that these early Cambrian animals have baffled palaeontologists since day one. Reconstructions of various types look like… a balloon with a fin? Inflated grubs without faces? I don’t know. Drawings below (Stanton F. Fink, Wikipedia) show an assortment of the beasts, plus Yunnanozoon, which may or may not have something to do with them. Here are some photos of their fossils, in case you wondered.

Vetulicolians from Wiki

They’re certainly difficult creatures to make sense of. Since their discovery, they’ve been called both arthropods and chordates, and you can’t get much farther than that with bilaterian animals (they’re kind of like the Nectocaris of old, come to think of it…).

The latest one was dug up from the Emu Bay Shale of Australia, the same place that yielded our first good look at anomalocaridid eyes. Its newest treasure has been named Nesonektris aldridgei by its taxonomic parents (García-Bellido et al., 2014), and it looks something like this (Diego García-Bellido’s reconstruction from the paper):

Garcia-Bellido_etal2014-nesonektris_recon

In other words, pretty typical vetulicolian “life but not as we know it”, at first glance. Its main interest lies in the bit labelled “nc” in the specimens shown below (from the same figure):

García-Bellido_etal2014-nesonektris_notochords

This chunky structure in the animal’s… tail or whatever is a notochord, the authors contend. Now, only one kind of animal has a notochord: a chordate. (Suspicious annelid muscle bundles notwithstanding. Oh yeah, I also wanted to post on Lauri et al. 2014. Oops?) So if this thing in the middle of Nesonektris’s tail is a notochord, then at the very least it is more closely related to chordates than anything else.

Why do they think it is one? Well, there are several long paragraphs devoted to just that, so here goes a summary:

1. It’s probably not the gut. A gut would be the other obvious ID, but it doesn’t fit very well in this case. Structures interpreted as guts in other vetulicolians – which sometimes contain stuff that may be half-digested food – (a) start in the front half of the body, where the mouth is, (b) constrict and expand and coil and generally look much floppier than this, (c) don’t look segmented, (d) sometimes occur alongside these tail rod-like thingies, so probably aren’t the same structure.

2. It positively resembles modern half-decayed notochords. The notochords of living chordates are long stacks of (muscular or fluid-filled) discs, which fall apart into big blocks as the animal decomposes after death. Here’s what remains of the notochord of a lamprey after two months for comparison (from Sansom et al. (2013)):

Sansom_etal2013-adult_lamprey_notochord_d63

This one isn’t as regular as the blockiness in the fossils, I think, but that could just be the vetulicolians not being quite as rotten.

There is, of course, a but(t). To be precise, there are also long paragraphs discussing why the structure might not be a notochord after all. It’s much thicker than anything currently interpreted as such in reasonably clear Cambrian chordates, for one thing. Moreover, it ends right where the animal does, in a little notch that looks like a good old-fashioned arsehole. By the way, the paper notes, vetulicolian tails in general don’t go beyond their anuses by any reasonable interpretation of the anus, and a tail behind the anus is kind of a defining feature of chordates, though this study cites a book from the 1970s claiming that sea squirt larvae have a vestigial bit of proto-gut going all the way to the tip of the tail. (I suspect that claim could use the application of some modern cell labelling techniques, but I’ve not actually seen the book…)

… and there is a phylogenetic analysis, in which, if you interpret vetulicolians as deuterostomes (which impacts how you score their various features), they come out specifically as squirt relatives whether or not you count the notochord. I’m never sure how much stock to put in a phylogenetic analysis based on a few bits of anatomy gleaned from highly contentious fossils, but at least we can say that there are other things – like a hefty cuticle – beyond that notochord-or-not linking vetulicolians to a specific group of chordates.

Having reached the end, I don’t feel like this paper solved anything. Nice fossils either way 🙂

And with that, I’m off. Maybe next time I’ll write something that manages to be about the same thing throughout. I’ve been thinking that I should try to do more posts about broader topics rather than one or two papers (like the ones I wrote about ocean acidification or homology versus developmental genetics), but I’ve yet to see whether I’ll have the willpower to handle the necessary reading. I’m remarkably lazy for someone who wants to know everything 😀

(Aside: holy crap, did I ALSO miss a fucking Nature paper about calcisponges’ honest to god ParaHox genes? Oh my god, oh my GOD!!! *sigh* This is also a piece of incredibly exciting information I’ve known for years, and I miss it when it actually comes out in a journal bloody everyone reads. You can tell I’ve been off-planet!)

References:

Chipman AD et al. (2014) The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima. PLoS Biology 12:e1002005

García-Bellido DC et al. (2014) A new vetulicolian from Australia and its bearing on the chordate affinities of an enigmatic Cambrian group. BMC Evolutionary Biology 14:214

Lauri A et al. (2014) Development of the annelid axochord: insights into notochord evolution. Science 345:1365-1368

Sansom RS et al. (2013) Atlas of vertebrate decay: a visual and taphonomic guide to fossil interpretation. Palaeontology 56:457-474

The ctenophore conundrum, by popular demand

So, a new ctenophore genome has just been published in Nature (Moroz et al., 2014), it makes some extraordinary claims, and my resident palaeontologist/web-buddy Dave Bapst wants my opinion 😉

Given that I already planned to have an opinion about the first ctenophore genome back in December (Ryan et al., 2013) and miserably failed to finish the post… the temptation is just too strong. (That thesis chapter draft in the other window of MS Word wasn’t going to be finished today anyway >_>)

Whatever I might seem from words on the internet, I’m not some kind of expert on phylogenetics, so I’m going to use a crutch. I had this idea back when I first read Ryan et al. (2013), because I remember thinking that it was written almost as if Nosenko et al. (2013) had never happened, and I’d really liked Nosenko et al. (as you can guess from the word count of this post), so I was mildly indignant about that. The Nosenko paper is going to be my crutch. (No offence to Hervé Philippe and friends, but there are only so many papers I’m going to reread for an out of the blue blog post 😉 )

Although I’m obviously not writing a public post specifically for a phylogeny nut, I may get somewhat technical, and I’m definitely going to get verbose.

***

Ctenophores. Comb jellies, sea gooseberries, Venus girdles. They are floaty, ethereal, mesmerizingly beautiful creatures, and I have it on good authority that they are also complete pains in the arse.

Here’s some pretty pictures before it gets too painful 😉 Left: Mnemiopsis leidyi from Ryan et al. (2013); right: Pleurobrachia bachei from Moroz et al. (2014). And a bonus video of a Venus girdle making like an ancient nature spirit. I could watch these beasties all day.

mnemi_pleuro

Venus from Sandrine Ruitton on Vimeo.

The problem(s)

And now, the pain. Let’s pull out my trusty old animal phylogeny, because the question marks are once again highly appropriate. (Also, I’m hell-bent on breaking your bandwidth with PICTURES.)

animalPhylogeny

Ryan et al. (2013) helpfully have a figure distilling the ideas people have had about those question marks so far:

ryan_etal2013-ctenophoreHypotheses

Bi = bilaterians, Cn = cnidarians, Ct = ctenophores, Tr = Trichoplax, and Po = sponges (Porifera).

I say “helpfully,” but it’s not all that helpful after all, since pretty much every possible configuration has been proposed. Why is this such a difficult question? Here’s a quick rundown of the problems Nosenko et al.’s study found to affect the question marks:

  1. Fast-evolving protein sequences – these can cause artefacts because too much change overwrites informative changes and creates chance similarities. Excluding faster-evolving sequences from the analysis changes the tree.
  2. Sequence data that don’t conform to the simplifying assumptions of popular evolutionary models – again, this can result in chance similarities and artefacts, and using a poorer model replicates the effects of using less ideal sequences.
  3. Long-branched outgroups – these are the non-animal groups used to place the root of animals. The more distant from animals and less well-sampled the outgroup, the longer the branches it forms, which can attract fast-evolving animal lineages towards the root. In Nosenko et al.’s analyses, even the closest outgroup seemed to cause problems, and removing the outgroup altogether made the conflicts between different models and datasets disappear completely – but this isn’t exactly helpful when you’re looking for the root of the animal tree!
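The "too much change overwrites informative changes" point can be put into numbers. A minimal sketch, assuming the simplest possible equal-rates substitution model with 20 amino acid states (a deliberately crude stand-in, not the CAT or GTR models discussed in these papers; the function name is hypothetical): the fraction of sites that look different between two sequences levels off at 19/20 no matter how much change has really happened.

```python
import math


def expected_observed_difference(true_subs_per_site: float, n_states: int = 20) -> float:
    """Expected fraction of differing sites under an equal-rates model.

    With k equally likely states, repeated substitutions at the same site
    hide earlier ones, so the observed difference saturates at (k - 1) / k.
    """
    k = n_states
    return (k - 1) / k * (1.0 - math.exp(-k * true_subs_per_site / (k - 1)))


if __name__ == "__main__":
    for d in (0.1, 0.5, 1.0, 2.0, 5.0):
        observed = expected_observed_difference(d)
        print(f"{d:>4} true substitutions/site -> {observed:.3f} observed difference")
```

On the flat part of that curve, two long branches look about equally different from everything, and whatever similarity they share by chance starts to count as signal. That is the breeding ground for long-branch artefacts.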

The problem with ctenophores in particular is illustrated by this one of Nosenko et al.’s trees, made from one of their less error-prone datasets:

Nosenko_etal2013-ribosomalCATtree

The ctenophore branch is not only longer overall than pretty much any other in the tree; its length is also very unevenly distributed between the loooong history common to all species and the short unique lineage of each individual species. That is bad news. And it may stay that way forever, because the last common ancestor of living ctenophores may genuinely be very recent, so there’s no way to divide up that long-ass internal branch without a time machine.

Round 1: Nosenko vs. Ryan

In fairness, the Mnemiopsis genome team probably didn’t have a whole lot of time to specifically deal with Nosenko et al.’s points (OTOH, none of those individual points were truly new). The Nosenko paper came out in January 2013, and the Mnemiopsis genome paper was received by Science in July of the same year – I imagine most of the data had been generated way before then, and you can’t just redo all your data analysis and rewrite a paper on short notice.

I’m still going to view Ryan et al. (2013) in the light of Nosenko, because regardless of the genome team’s ability to answer them, some of Nosenko et al.’s points are very relevant to the claims they make. Their biggest claim, of course, being that ctenophores are the sister group to all other animals.

In Nosenko et al.’s experiments, this placement showed up in trees where faster-evolving genes, poorer models or more distant outgroups were used, but not when the slowest-evolving gene set was analysed with the best models and the closest outgroup.

Ryan et al. acknowledge that “supermatrix analyses of the publicly available data are sensitive to gene selection, taxon sampling, model selection, and other factors [cite Nosenko].” Their data are obviously sensitive to such factors. In fact, they behave rather similarly to what I saw in the Nosenko study.

Ryan et al. used two method/model combinations – one of the models was the preferred CAT model of Nosenko et al., and the other was the OK but not great GTR model that CAT beat by miles in terms of actually fitting Nosenko et al.’s data. (Caveat: in the genome paper, the CAT and GTR models were used with different treebuilding methods, so we can’t blame the models for different results with any certainty.) Also, they analysed the data with three different outgroups.

And guess what – the ctenophores-outside-everything tree was best supported with (1) the GTR model, (2) the more distant outgroups. There is not much testing of the effect of gene choice – there were two different data sets, but they were both these massive amalgamations of everything useable, and they also included totally different samples of species.

However, here comes another nod to Nosenko et al. and all the other people who advocated trying things other than “conventional” sequence comparisons through the years. Provided you can securely identify genes across different organisms, you can also try to deduce evolutionary history based on their presences and absences rather than their precise sequences. This is not a foolproof approach because genes can be (commonly) lost or (occasionally) picked up from other organisms, but it is often regarded as less artefact-prone than sequence-based trees.
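To make the idea concrete, here is a minimal sketch of what gene presence/absence data look like and how you might compare them. Everything in it is an assumption for illustration: the matrix is invented, and a plain count of mismatches is a much cruder measure than anything a real gene content analysis would use.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) calls for five gene families.
# The values are made up; they are not from any of the genome papers.
gene_content = {
    "sponge":     [1, 1, 0, 0, 0],
    "ctenophore": [1, 0, 0, 0, 1],
    "cnidarian":  [1, 1, 1, 1, 0],
    "bilaterian": [1, 1, 1, 1, 0],
}


def hamming(a, b):
    """Count the gene families present in one genome but absent in the other."""
    return sum(x != y for x, y in zip(a, b))


# Pairwise distances between all genomes, keyed by (name, name) tuples.
distances = {
    (s1, s2): hamming(gene_content[s1], gene_content[s2])
    for s1, s2 in combinations(gene_content, 2)
}

if __name__ == "__main__":
    for pair, dist in distances.items():
        print(pair, dist)
```

Note that a shared absence counts as similarity here, whether the gene was never there or was lost independently. A lineage that has lost a lot of genes can therefore end up looking "primitive", which is the caveat about gene loss in a nutshell.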

But does it help with ctenophores? Like the GTR model-based sequence trees, the tree based on gene presence/absence (you obviously need complete genomes for this!) supports ctenophores being the outsider among animals:

Ryan_etal2014-RGCtree

My problem with this? Note what else it supports. The white circles indicate groupings that this method had absolutely no doubt about. And these groupings include things that frankly sound like abject nonsense. Here’s one annelid worm (the leech Helobdella) sitting next to a flatworm, while another annelid worm (Capitella) teams up with a limpet right next to a chordate. If anything, that is more controversial than the placement of ctenophores, because we thought we had it settled!

So if we’re concluding that ctenophores are basal to all other animals, why aren’t we also making a fuss about the explosion of phylum Annelida? Surely, if this method gives us strong enough conclusions to arbitrate between different sequence-based hypotheses about ctenophores, it’s strong enough to make those claims too. The cake can’t quite decide if it’s being eaten, I think.

I’m not sure what to think about the sequence trees. I’m far more confident about the presence/absence one. Maybe I’m just demonstrating the Dunning-Kruger effect here, but I’m not buying that tree for a second.

Overall verdict?

Not convinced. Not by a long shot.

Round 2: Nosenko vs. Moroz

The Pleurobrachia genome took me completely by surprise. I’d known Mnemiopsis was sequenced since Ryan et al. (2010). (Three years. Can you imagine the twitching?) I had no idea this other project was happening, so I nearly fell off my chair when Nature dropped it into my RSS reader yesterday. Another ctenophore genome – and another one that supports ctenophore separatism? (This hypothesis is becoming strangely popular…)

Bonus: it’s not just a genome paper, it also describes the transcriptomes of ten different ctenophores. Transcriptomes, the set of all active genes, are a little bit easier to sequence and assemble than genomes, and if you’re thorough they’ll catch most of the genes the organism has, so they can be almost as good for the analysis of gene content.

Which they kind of don’t do properly. There is a discussion of specific gene families that ctenophores lack – including many immune- and nervous system-related genes – but that’s not exactly saying much given that we know even “important” genes can be lost (case in point: the disappearing (Para)Hox genes of Trichoplax). The fact that ctenophores seem to completely lack microRNAs is interesting, but again, it doesn’t mean they never had them. Sponges do have microRNAs but don’t seem to be nearly as big on them as other animals.

As for the global analysis of gene content – I had to chase down a reference (Ptitsyn and Moroz, 2012) to understand what they actually did. As far as I can tell, there is no phylogenetic analysis involved – they just took a tree they already had, and used this method to map gene gains and losses onto that tree. Which is cool if you’re fairly sure about your tree, but pretty much meaningless when the tree is precisely the question. The Mammal is disappointed.

One of the problems with listing genes that aren’t there or don’t work in the “expected” way in ctenophores is that even if they’re not outside everything else, it’s still a distinct possibility that these guys branched off from our lineage before cnidarians did. For example, the Pleurobrachia paper spends a lot of time on “nervous system-specific” genes like elav missing or not being expressed in neurons, and common neurotransmitters like serotonin not being used by ctenophores.

But, assuming that the tree of animals looks something like (sponges + (ctenophores + (cnidarians + bilaterians))), we wouldn’t expect ctenophore nervous systems to share every property that cnidarians and bilaterians share. Remember: (1) sponges don’t have nervous systems, so they’re not much use as a comparison, (2) cnidarians + bilaterians had a longer common ancestry than either did with ctenophores. Genes possessed by sponges PLUS cnidarians and/or bilaterians but missing from ctenophores are more suggestive, but only if you can demonstrate that they weren’t lost. (We’re kind of going in circles here…)

The other problem is that pesky last common ctenophore ancestor. If it really is very recent, then taking even all living ctenophores to represent ctenophore diversity is like taking my close family to represent human diversity. Just like my family contains pale-skinned, lactose tolerant people, it is entirely possible that this lone surviving ctenophore lineage possesses (or lacks) important traits that aren’t at all typical of ctenophores as a whole. Ryan et al.’s (2013) supplementary data are clear that at least the Mnemiopsis genome is horribly scrambled, all trace of conserved gene neighbourhoods erased from it. That’s not exactly promising if you’re hoping for “trustworthy” animals.

The actual phylogenetic trees in Moroz et al. (2014) seem to follow an approach of throwing AAAALLL the genes at the problem. The biggest dataset contains 586 genes, compared to 122 in Nosenko et al.’s largest collection, and there is not much filtering by gene properties other than “we can tell what it is”. I have no idea how the CAT + WAG model they used compares to CAT or WAG or GTR on their own; unfortunately, the Nosenko paper doesn’t test that particular setup and this one doesn’t do any model testing. Moroz et al.’s supplementary methods claim it’s pretty good, cite something, and I’m not gonna chase down that reference. (Sorry, I’ve been poring over this for four hours at this point.)

Interestingly, the support for ctenophores being apart from other animals increases when they start excluding distant outgroups. The only time it’s low is when they add all ten ctenophores and use fewer genes. Hmm. This is where I would like to hear some real experts’ opinions, because on the face of it, I can’t pinpoint anything obviously wrong. (Other than saying that chucking more genes at a problem tree is perfectly capable of making the problem worse.)

TL;DR version: While I’m generally underwhelmed by the gene content stuff, I literally have no idea what to think about the trees.

I’m banking on the hope that someone will do.

***

And… I think that is all the opinion I’m going to have about ctenophores for a long time. Lunch was a long time ago, my brain is completely fried, and I’m not sure how much of the above actually makes sense. To be clear, I don’t really have a horse in this race, though I’d really like to know the truth. (Fat chance of that, by the looks of it…) I think I’m going to need a bit more convincing before I stop looking sideways at this idea that ctenophores are further from us than sponges. If anything is clear from recent phylogenomics papers, it’s that what data you analyse and how you analyse them makes a huge difference to the result you get, and this is happening with data and methods where it’s not necessarily easy to dismiss an approach as clearly inferior.

It’s a mess, damn it, and I’m not qualified to untangle it. Urgh.

***

References

Moroz LL et al. (2014) The ctenophore genome and the evolutionary origin of neural systems. Nature advance online publication, 21/05/2014; doi: 10.1038/nature13400

Nosenko T et al. (2013) Deep metazoan phylogeny: When different genes tell different stories. Molecular Phylogenetics and Evolution 67:223-233

Ptitsyn A & Moroz LL (2012) Computational workflow for analysis of gain and loss of genes in distantly related genomes. BMC Bioinformatics 13:S5

Ryan JF et al. (2010) The homeodomain complement of the ctenophore Mnemiopsis leidyi suggests that Ctenophora and Porifera diverged prior to the ParaHoxozoa. EvoDevo 1:9

Ryan JF et al. (2013) The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science 342:1242592

News bites

Since work is quite frantic lately, and my attention span has gone on holiday, I’ve decided to do something I haven’t done before and just say a few words about papers that caught my interest today without actually reading them. Each of these is probably worth a full-blown meandering of its own, but I know I wouldn’t ever get to them at this rate. Better read their abstracts and give some quick thoughts than let them sink unnoticed into the murk of my “papers” folder!

(1) How many genomes do we (not) have?

Reference: Ali Abbasi A & Hanif H (2012) Phylogenetic history of paralogous gene quartets on human chromosomes 1, 2, 8 and 20 provides no evidence in favor of the vertebrate octoploidy hypothesis. Molecular Phylogenetics and Evolution, in press (doi: 10.1016/j.ympev.2012.02.028)

(How many papers have authors with alliterating names? :D)

In the circles I move in, it’s pretty much canon that the ancestors of living vertebrates doubled their entire genomes twice. It’s still debated exactly when these duplications occurred, but few people doubt that they did. This so-called 2R hypothesis is supported by things like our possession of several (quite often, four) copies of genes that are singletons in our closest living relatives (read: lancelets*), and more importantly, that whole big chunks of lancelet chromosomes can be matched to chunks of four different vertebrate (mainly, human) chromosomes. Genes that are close to one another in lancelets are often also close together in vertebrates.

The relationship is not perfect – in well over 500 million years of evolution, genes inevitably get lost and bits of chromosome scrambled. And, thus, there is always room to question the 2R scenario, which is what this paper clearly does. They propose that those four-gene families originated at all sorts of different times, from small local duplications and rearrangements. If they are right, this is a very important result. It basically uproots every bit of speculation ever proposed on how the genome duplications contributed to the evolution of vertebrates, which, as far as I can tell, is a hell of a lot of speculation. Not having read the whole paper, I would still put my money on 2R, but who knows what the future holds? Maybe we are facing a minor paradigm shift?

(2) The segmentation clock also ticks in insects!

Reference: Sarrazin AF et al. (2012) A segmentation clock with two-segment periodicity in insects. Science, advance online publication (doi: 10.1126/science.1218256)

The evolutionary history of segmentation is one of my random interests, and from my point of view, the above is a good reason to squee in a most fangirlish way. Segmentation is the construction of a body from repeating units. In its purest form, which isn’t that common in modern animals, the animal is essentially made up of identical repeated blocks containing a copy of each key organ like kidneys, nerve centres, limbs and muscles. (Even in the most perfectly segmented creatures, head and tail ends form something of an exception. Ragworms make a nice example.) More commonly, only some components are repeated, and they are repeated with slight differences along the body. Vertebrates’ spine and associated muscles are a good example, and so are the defining traits of arthropods, their jointed exoskeletons equipped with repeated pairs of appendages.

Although traditionally it has been thought that arthropod and vertebrate segmentation have independent origins, parts of the genetic machinery are shared between both groups (as well as segmented worms). Various “segmentation genes” are active in distinct stripes in our embryos, marking out future segments even before we can see the segments themselves. In vertebrates, cells periodically switch “segmentation genes” on and off, and this combined with the growth of the embryo produces a dynamic stripey pattern of gene expression. While segments and stripes of gene expression are darn obvious in arthropods, this is the first time anyone has confirmed that some arthropod segmentation genes actually oscillate like their vertebrate counterparts do, as opposed to, say, the cells expressing them moving about. Whether this is a spectacular example of convergent evolution or evidence of a shared ancestral heritage, I couldn’t say, but it’s really cool either way.
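If "oscillation plus growth gives stripes" sounds a bit hand-wavy, here is a cartoon of it, loosely in the spirit of the clock-and-wavefront idea from vertebrate segmentation. Everything in it is an assumption made for illustration: one shared on/off clock, a front that moves one cell per time step and freezes whatever state the clock is in, and invented names and numbers. It is not the model from Sarrazin et al.

```python
# Cartoon of a segmentation clock: an oscillating gene plus a moving front.
# All parameters are invented for illustration.

def frozen_pattern(n_cells, period):
    """Freeze the clock state in each cell as the front passes it.

    The front reaches cell number t at time t. The clock is "on" for the
    first half of every period and "off" for the second half.
    """
    pattern = []
    for t in range(n_cells):
        clock_on = (t % period) < period // 2
        pattern.append("#" if clock_on else ".")
    return "".join(pattern)

print(frozen_pattern(12, 4))  # ##..##..##..
```

No cell has to move anywhere to make these stripes; each one just records what the clock said when the front went by. That is the distinction the paper is after: genes switching on and off in place, as opposed to cells expressing them shuffling about.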

(3) Old genes are entrenched, new genes are redundant after all?

Reference: Chen WH et al. (2012) Younger genes are less likely to be essential than older genes, and duplicates are less likely to be essential than singletons of the same age. Molecular Biology and Evolution advance online publication (doi: 10.1093/molbev/mss014)

So, this claims to resolve a conundrum I wasn’t even aware of before. Gene duplication is thought to be important for the evolution of new functions because two copies of a gene mean there is a backup if one of them fails at its original function. Hence, theory goes, duplicate genes are much less restricted in the evolutionary paths they can take. Apparently, studies in mice have contradicted this common wisdom by claiming that duplicate genes are just as likely to be indispensable as genes without backup copies. However, Chen et al. are saying that this is wrong, confounded by gene age. Since new genes are less likely to be essential than old genes (which had more time to evolve interactions with the rest of the genome), and mouse duplicates are on average older than mouse singletons, the two effects end up cancelling out. When they factor in gene age, duplicates are indeed less essential than loners. One of the central tenets of current thinking about (genetic) novelty stays in the ring for another round…
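The cancelling-out argument is easier to see with numbers, so here is a toy version. The counts below are completely invented (they are not from Chen et al. or from any mouse dataset); I picked them only so that duplicates are mostly old, singletons are mostly young, and old genes are more often essential.

```python
# Toy illustration of confounding by gene age. ALL COUNTS ARE INVENTED.
# (age class, gene type) -> (number of genes, number of essential genes)
TOY_COUNTS = {
    ("old", "singleton"): (100, 60),
    ("old", "duplicate"): (300, 110),
    ("young", "singleton"): (300, 60),
    ("young", "duplicate"): (100, 10),
}

def pooled_fraction(kind):
    """Fraction of essential genes for one gene type, ignoring age."""
    genes = sum(n for (_, k), (n, _) in TOY_COUNTS.items() if k == kind)
    essential = sum(e for (_, k), (_, e) in TOY_COUNTS.items() if k == kind)
    return essential / genes

def stratum_fraction(age, kind):
    """Fraction of essential genes for one gene type within one age class."""
    genes, essential = TOY_COUNTS[(age, kind)]
    return essential / genes

for kind in ("singleton", "duplicate"):
    print(kind, "pooled:", round(pooled_fraction(kind), 2),
          "old:", round(stratum_fraction("old", kind), 2),
          "young:", round(stratum_fraction("young", kind), 2))
```

Lumped together, both gene types come out 30% essential, so duplicates look no more dispensable than singletons. Split by age, duplicates are less essential in both classes (about 37% vs 60% among old genes, 10% vs 20% among young ones). The pooled comparison hides the effect simply because the two groups have different age mixes.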

(4) Is reducing complexity easier than increasing it?

Reference: Harjunmaa E et al. (2012) On the difficulty of increasing dental complexity. Nature advance online publication (doi: 10.1038/nature10876)

How complexity increases in evolution is more than a breeding ground for creationist incredulity, it’s also quite interesting for bona fide evolutionary biologists. Looking at the development of mouse teeth, Harjunmaa et al. notice that increases and decreases in the complexity of tooth shapes require different sorts of mess-ups. Simpler-than-normal teeth are common in mutants and easy to make in experiments. More complex teeth – i.e. those with more cusps – are rarely if ever seen in natural mutants. Turns out they are perfectly possible – you just need to manipulate several genetic pathways at the same time to produce a clear result.

Can this be generalised? Is greater complexity usually harder to achieve? When does this apply and when does it not? I’ve recently read papers that explore how complexity increases easily and completely by chance (I have a half-written post about them languishing on my hard drive, FWIW). Are the rules different for different levels of organisation? The aforementioned complexity-by-chance papers analyse the molecular level: one is about the architecture of gene switches, the other about a protein machine. Teeth are pretty large pieces of life with thousands upon thousands of such machines participating in their production. Does that make a real difference, or is what I’m seeing just coincidence? Dunno, but it’s fascinating to think about!

***

Heh, it looks like I took rather bigger “bites” of this news than I planned to. I kind of managed to write the equivalent of a full-blown meandering anyway. The only difference is that I didn’t painstakingly reference this one. I hope that doesn’t mean that half of what I wrote off the top of my head is wrong 😀

***

Note:

*Lancelets are no longer considered our closest relatives. Unbelievable as that may seem, that honour goes to sea squirts and their ilk. However, the sea squirt bunch are ridiculously weird in all sorts of respects, and their genomes are jumbled beyond recognition. So… not so great if you want to learn anything about our ancestors.