New genes, new tricks

I’ve previously written about the birth of new genes. Since new genes are cool, and I just found two recent papers on them, you’re getting more of them.

Part 1: how to survive duplication

Technically, the first paper isn’t about new new genes: Assis and Bachtrog (2013) examined recently duplicated genes in fruit flies. But screw technicalities, what they’re saying makes my eyes pop.

When a gene is accidentally copied, a variety of possible fates can await it. Most of the time, the extra copy just dies. Some mechanisms of gene duplication just take the gene without the regulatory elements it needs to function properly. Even if the new copy works, it’s still redundant, so there’s nothing stopping mutations from destroying it over time. However, sometimes redundancy is removed before the new gene breaks irrevocably, and both copies are kept. This can, in theory, happen in a number of ways. Because I’m feeling lazy, let me just quote them from the paper (square brackets are mine, because I hate repeatedly typing out long ugly words :)):

Four processes can result in the evolutionary preservation of duplicate genes: conservation, neofunctionalization, subfunctionalization, and specialization. Under conservation, ancestral functions are maintained in both copies, likely because increased gene dosage is beneficial (1). Under neofunctionalization [NF], one copy retains its ancestral functions, and the other acquires a novel function (1). Under subfunctionalization [SF], mutations damage different functions of each copy, such that both copies are required to preserve all ancestral gene functions (9, 10). Finally, under specialization, subfunctionalization and neofunctionalization act in concert, producing two copies that are functionally distinct from each other and from the ancestral gene (11).

We might add a variation on NF, too: Proulx and Phillips (2006) theorised that differences in function that arise in different alleles (variants) of a single gene can turn duplication into an advantage, turning the conventional duplication-first, new function-next scenario on its head.

Either way, genomes contain lots of duplicated genes, there’s no question about that. What isn’t nearly as well understood is the relative importance of various mechanisms in producing all these duplicates. It’s much easier to theorise about mechanisms than to test the theories. Since evolution doesn’t stop once a new gene has earned its place in the genome, it can be hard to disentangle the mechanism(s) responsible for its preservation from the stuff that happened to it later. Also, to really assess the relative role of different mechanisms, you’ve got to look at whole genomes.

(Assis and Bachtrog say that this hasn’t been done before, and then go right on to cite He and Zhang [2005], which is a genome-wide study of SF and NF. I guess it doesn’t look at all the mechanisms…)

Assis and Bachtrog used the amazing resource that is the 12 Drosophila genomes project, focusing on D. melanogaster and D. pseudoobscura, to find slightly under 300 pairs of genes that duplicated after the divergence of those two species. Since Drosophila genomes are very well-studied, they were able to identify the “parent” and “child” in each pair based on where they sit on their chromosomes. They also extracted thousands of unduplicated genes from the melanogaster and pseudoobscura genomes, to use as a measure of background divergence between the two species.

To measure changes in gene function, they compared the expression of parent and child genes to each other and to the “ancestral” copy (i.e. the unduplicated gene in the other species) in different parts of the body (if a gene is suddenly turned on somewhere it wasn’t before, it’s probably doing something new!).

Long story short, it turned out that in the majority of cases (167/281) the child copy behaved much more differently from the “ancestor” than expected, while the parent copy stayed pretty close. These child copies also showed faster sequence evolution than their parents. This means that NF – and specifically that of the new copy – is the most common fate of newly duplicated genes in these animals. There’s also a fair number of gene pairs where both copies gained new functions or both stuck with the old ones, but only three where both copies lost functions. Pure SF, which very influential studies like Force et al. (1999) championed as the dominant mode of duplicate gene survival, appears to be an incredibly rare occurrence in fruit flies!
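For the programmatically inclined, the logic of the comparison above can be sketched in a few lines of Python. This is only an illustration of the idea, not the paper's actual pipeline: the function names, the use of Euclidean distance on relative expression profiles, the way the two copies are combined, and the single `cutoff` (imagined as coming from the background divergence of unduplicated genes) are all assumptions of mine.

```python
from math import sqrt

def relative(profile):
    """Scale per-tissue expression values so they sum to 1 (assumed normalisation)."""
    total = sum(profile)
    return [x / total for x in profile] if total else list(profile)

def euclidean(a, b):
    """Plain Euclidean distance between two equally long profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_pair(parent, child, ancestor, cutoff):
    """Assign a duplicate pair to one of the fates described in the text.

    parent, child, ancestor: expression levels per tissue (same tissue order).
    cutoff: hypothetical threshold for "more different than expected".
    """
    a = relative(ancestor)
    d_parent = euclidean(relative(parent), a)
    d_child = euclidean(relative(child), a)
    # Combined profile: do the two copies together still look like the ancestor?
    combined = relative([p + c for p, c in zip(parent, child)])
    d_combined = euclidean(combined, a)

    parent_diverged = d_parent > cutoff
    child_diverged = d_child > cutoff
    if not parent_diverged and not child_diverged:
        return "conservation"
    if child_diverged and not parent_diverged:
        return "neofunctionalization (child)"
    if parent_diverged and not child_diverged:
        return "neofunctionalization (parent)"
    # Both copies diverged: SF if together they recover the ancestral pattern.
    if d_combined <= cutoff:
        return "subfunctionalization"
    return "specialization"

# Made-up tissues: head, gut, testis, ovary
ancestor = [10, 10, 0, 10]
print(classify_pair([9, 11, 0, 10], [1, 1, 20, 1], ancestor, cutoff=0.2))
print(classify_pair([10, 0, 0, 0], [0, 10, 0, 10], ancestor, cutoff=0.2))
```

With these invented numbers, the first pair has a child copy that switched on in the testis while the parent stayed put, and the second pair has the ancestral tissues split between the two copies.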

A few paragraphs ago I mentioned the caveat that duplicated genes don’t stop evolving just because they’ve managed to survive. Well, the advantage of having all these Drosophila genomes is that you can further break down “young” duplicates into narrower age groups, using the species that fall between melanogaster and pseudoobscura on the tree. However, looking at this breakdown doesn’t change the general pattern – NF of the child copy is the most common and SF is rare or nonexistent in even the youngest age groups, along both the melanogaster and the pseudoobscura lineages.

So what exactly is going on here?

Part of the difference in expression patterns between parent/ancestral and child copies is because these new genes are turned on in the testicles, which might give us a big clue. Testicles, you see, are a bit anarchical. Things that are normally kept silent in the genome, like various kinds of parasitic DNA, wake up and run wild during the making of sperm. If you remember my throwaway reference to duplication mechanisms that cut the gene off from its old regulatory elements – well, the balls are a place where even such lost and lonely genes get a second chance.

The genomic anarchy of testes is also one of the reasons these duplications happen in the first place; the aforementioned mechanism involves those bits of parasitic DNA that copy and paste themselves via an RNA intermediate. The enzymes they use to reverse transcribe this RNA into DNA and insert it back into the genome aren’t particularly discerning, and they’ll happily do their thing on a piece of RNA that isn’t the parasite. Indeed, slightly more NFed child genes than you’d expect originated via RNA, although it’s worth noting that more than half of them still didn’t. So while the testes look like a good place for new gene copies to find a use, they aren’t totally responsible for their origins.

Why is there so little SF among these genes?

This is the Obvious Question; my jaw nearly landed on my desk when I saw the numbers. The authors have two hypotheses, both of which may be true at the same time.

First, SF assumes that the two copies have the same functions to begin with. This is not necessarily true when just a small segment of DNA is duplicated – even when it’s not just a bare gene you’re copying, the new copy might lose part of its old regulatory elements and/or land next to new ones, not to mention Proulx and Phillips’s idea of new functions appearing before duplication. So maybe SF is more common after wholesale duplications of entire genomes, and Drosophila species didn’t have any of those recently.

Secondly, SF happens by genetic drift, which is a random process that works much better in small populations. Fruit flies aren’t known for their small populations, and therefore the dominant evolutionary force acting on their genomes will be selection.

This makes sense to me, but the degree to which NF dominates the picture is still pretty amazing. I wonder what you’d get if you applied the same methods to different species. Would species with smaller populations, or those that recently duplicated their whole genomes, show more evidence for SF, as you’d expect if the above reasoning is correct? Or would the data slaughter all those seemingly reasonable explanations? What would you see in parthenogenetic species that have no males (and testicles)?

Part two, with really new genes, hopefully coming soon…



Assis R & Bachtrog D (2013) Neofunctionalization of young genes in Drosophila. PNAS 110:17409-17414

He X & Zhang J (2005) Rapid subfunctionalization accompanied by prolonged and substantial neofunctionalization in duplicate gene evolution. Genetics 169:1157-1164

Force A et al. (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-1545

Proulx SR & Phillips PC (2006) Allelic divergence precedes and promotes gene duplication. Evolution 60:881-892

Shining a light on retinoic acid

I was planning to do more bioinformaticky stuff tonight, but then I saw Shimozono et al. (2013), and… SHINY!

I derive a particular joy from seeing neat methods, and what these guys did is pretty damn neat. They used genetic engineering and a clever trick with fluorescence to (almost) directly study an important but rather elusive molecule in vertebrate development.

Retinoic acid (RA) is related to vitamin A; in fact, it is synthesised from vitamin A by enzymes in our cells. It is what developmental biologists call a morphogen: a molecule that spreads through an embryo by diffusion, and influences development depending on its concentration. Among other things, RA is thought to be responsible for the subdivision of the embryonic body axis by Hox genes, and also the correct formation of somites, the basic repeating units that eventually form our spine.

So RA is pretty darn important, but it’s also a bit difficult to investigate. It’s a relatively small and simple molecule that isn’t encoded in the genome, so some of the popular tools for detecting important molecules don’t work on it. Its activity can be monitored indirectly, though. Retinoic acid works by binding to proteins called retinoic acid receptors (RARs), which then latch onto certain DNA sequences that regulate nearby genes. So you can, for example, construct a piece of DNA that responds to an activated RAR by producing a fluorescent protein. You can also examine the distribution of the enzymes that make and break down RA, the assumption being that this corresponds to the distribution of RA itself.

The Japanese team, however, created a modification of retinoic acid receptors that is basically a direct indicator of RA level. Their RARs have been engineered to glow in different ways depending on whether or not RA is bound to them. They were able to zap these miniature RA detectors into zebrafish embryos without affecting the little creatures’ development, creating a gentle way to monitor RA levels in live animals.

They exploited a fascinating phenomenon called fluorescence resonance energy transfer (FRET for short). FRET needs two fluorescent molecules that glow at different wavelengths, such that the wavelength one of them emits is the same as the one that turns the other on. (Wikipedia tells me FRET is actually based on spooky quantum effects involving virtual particles rather than ordinary light travelling from molecule to molecule. Wow, I didn’t know that!)

If the two molecules are very, very close, the excited first molecule can hand the other enough energy to light up. You can detect this by shining the colour of light needed to excite the first molecule on your sample, and then also measuring the fluorescence from the second molecule. The ratio of the glow from Molecule 1 to the glow from Molecule 2 can tell you how much FRETting is going on.
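To put a number on "very, very close": the standard Förster relation says the efficiency of the transfer falls off with the sixth power of the distance between the two molecules. Here is a minimal sketch of that relation. The 5 nm Förster radius is just an illustrative default I picked (it is the distance at which the efficiency is 50%, and it differs from one pair of fluorescent molecules to the next), and the function name is mine.

```python
def fret_efficiency(distance_nm, forster_radius_nm=5.0):
    """Fraction of the first molecule's excitation energy passed to the second.

    Uses the Forster relation E = 1 / (1 + (r / R0)**6).
    forster_radius_nm (R0) is an assumed, illustrative value.
    """
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

# Efficiency at half, exactly, and double the Forster radius
for r in (2.5, 5.0, 10.0):
    print(r, round(fret_efficiency(r), 3))
```

Doubling the distance from the Förster radius drops the efficiency from a half to under 2%, which is why FRET makes such a sensitive ruler for tiny changes in the shape of a protein.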

What Shimozono et al. did was to add the code for a FRET-capable pair of fluorescent proteins to various RAR genes. RARs change shape when retinoic acid binds to them, and in these engineered versions this means that they bring their fluorescent tags close enough for FRET to work. (The above figure, from Carr and Hetherington [2000], illustrates the principle – just substitute “Ca2+” with “RA”.) The scientists calibrated their little RA detectors by measuring how much FRET happened at various controlled RA concentrations first; this allowed them to turn FRET intensities into accurate measurements of RA. They then tested whether the detectors were truly RA-specific (and not activated by, say, vitamin A) by using them in fish embryos with their RA-making enzymes crippled.
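The calibration step is basically a lookup: measure the FRET ratio at known RA concentrations, then read unknown samples off that curve. Below is a rough sketch of the idea using straight-line interpolation between calibration points. Every number in it is invented, the direction of the relationship (higher ratio, more RA) is simply assumed, and a real sensor would more likely follow a smooth binding curve than a set of straight segments.

```python
def ratio_to_concentration(ratio, calibration):
    """Estimate RA concentration from a FRET ratio by linear interpolation.

    calibration: list of (fret_ratio, concentration) pairs measured at known
    concentrations (hypothetical values here). Ratios outside the calibrated
    range are clamped to the nearest calibrated point.
    """
    points = sorted(calibration)
    if ratio <= points[0][0]:
        return points[0][1]
    if ratio >= points[-1][0]:
        return points[-1][1]
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        if r0 <= ratio <= r1:
            return c0 + (c1 - c0) * (ratio - r0) / (r1 - r0)

# Invented calibration: (FRET ratio, RA concentration in nM)
calibration = [(1.0, 0.0), (1.5, 2.0), (2.0, 6.0), (2.5, 20.0)]
print(ratio_to_concentration(1.75, calibration))
```

Clamping outside the calibrated range is a deliberately cautious choice: the sketch refuses to extrapolate beyond concentrations that were actually measured.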

Of course, they also got round to looking at the behaviour of RA during development, which was, after all, the point of their new toys. They did a basic visualisation of RA concentration throughout developing embryos – and confirmed that the established method of looking for the enzymes involved in RA synthesis and degradation is actually a decent substitute for measuring RA itself.

They then interfered with the production of a protein called FGF8 that is thought to regulate RA synthesis, and found that this altered the RA gradient – as well as the expression of the main enzyme that produces RA. Basically – the technique seems to work, and what it shows agrees with what we’ve thought about RA signalling. Hooray!

And, of course, they got pretty pictures like the ones below, coloured according to the amount of FRET (red = high, green = low) they measured. These two compare a normal embryo (left) and an embryo of the same age whose fgf8 gene has been messed with (right). If you have normal colour vision*, it’s pretty clear how the control embryo has this massive band of redness halfway down its body, and how even nearer the head and tail ends it’s more yellow than the sad green of the treated baby.

(I spliced these together from panels in an overwhelmingly massive figure and labelled them for those of you who don’t look at fish embryos much. No copyright infringement and no financial gain is intended, of course ;))

I think this whole thing is waaaaay cool. I wish I could come up with something clever like that. Oh well, at least I get to work with fluorescent things and take pretty glowy pictures every now and then. When I’m not neck deep in protein sequences and ‘omics data. 🙂


*Being a red-green colour blind developmental biologist must be a hell of a lot of fun. I just realised that pretty much everything involving fluorescence in biology is red, green or both – and developmental biologists love sticking fluorescent tags on everything. By the way, this particular figure could have been presented in any old combination of colours – they’re illustrating abstract numbers, not the actual colours of the specimens, which in this case would have been glowing in cyan and yellow. Of course, there’s probably a colour vision deficiency for every combination you can think of, so, uh. I probably overthunk this?


Carr K & Hetherington A (2000) Calcium dynamics in single plant cells. Genome Biology 1:reports024

Shimozono S et al. (2013) Visualization of an endogenous retinoic acid gradient across embryonic development. Nature 496:363-366