Zooming in on mutations

Evolution depends on variation, and variation depends on mutations. The evolution of new features, in particular, wouldn’t be possible without new mutations. Thus, mutation is of great interest to evolutionary biologists. More specifically, how mutations affect an organism’s fitness has been discussed and debated ever since the concept of mutations entered evolutionary theory. Relatively speaking, how many mutations are harmful, beneficial, or neither? What kinds of mutations are likely to be each in which parts of the genome? It’s hard to get a confident picture of such questions, partly because there are so many possible mutations in any given gene, let alone genome, and partly because fitness isn’t always easy to measure (see Eyre-Walker and Keightley [2007] for a review).

How do mutations affect fitness? Top: three theoretical possibilities; bottom: the real thing. (Hietpas et al., 2011)

Hietpas et al. (2011) did something really cool that hadn’t been done before: they took a small piece of an important gene, and examined the fitness consequences of every possible mutation in that sequence. This approach is limited in its own way, of course. Due to the sheer number of possibilities, it’s only feasible for short sequences, which might make it hard to generalise any results. But the unique window it opens on the relationship between a gene’s sequence and its owner’s success is invaluable.

What did they do?

Let’s examine the method in a bit more detail, mainly to understand what “every possible mutation” means in this context, because it’s a little more complicated than it sounds.

The bit of DNA they chose codes for a 9-amino acid region of heat shock protein 90 (Hsp90) in brewer’s yeast. So it really is small, only 27 base pairs altogether (recall that in the genetic code, 3 base pairs [1 codon] translate to 1 amino acid). Hsp90 is a very important protein found all over the tree of life. It’s a so-called chaperone, a protein that helps other proteins fold correctly, and in eukaryotes it’s absolutely required for survival.

The team generated mutant versions of the Hsp90 gene, each of which differed from the “wild type” version in one codon out of these nine. So each “mutation” examined could actually be anywhere between one and three mutations. They generated all possible mutants like that, amounting to over 500 different sequences.
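The arithmetic behind "over 500" can be checked with a quick back-of-the-envelope sketch. It assumes only what is stated above (nine codons, each swapped for every other possible codon); the codon "ATG" is an arbitrary stand-in, not the actual Hsp90 sequence, and the exact tally in the paper may differ depending on how the library was counted.

```python
from itertools import product

BASES = "ACGT"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]  # all 64 codons

def alternatives_by_distance(codon):
    """Count the other 63 codons by how many bases differ from `codon`."""
    counts = {1: 0, 2: 0, 3: 0}
    for other in CODONS:
        distance = sum(a != b for a, b in zip(codon, other))
        if distance:  # skip the codon itself
            counts[distance] += 1
    return counts

REGION_CODONS = 9  # the 9-amino acid region described in the post

# "ATG" is an arbitrary example codon; every codon gives the same counts.
per_position = alternatives_by_distance("ATG")
total_mutants = REGION_CODONS * sum(per_position.values())

print(per_position)   # {1: 9, 2: 27, 3: 27}
print(total_mutants)  # 567
```

So each position has 63 alternative codons, of which only 9 are a single base change away, and nine positions give 567 single-codon mutants in total.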

[NOTE: If you check back at the genetic code, you’ll note that most amino acids are encoded by more than one codon, so not all of the resulting proteins differed from one another. Mutations that don’t change the amino acid are called synonymous. This will become important later.]

Then came the measurement of fitness. The researchers took a strain of yeast whose own Hsp90 gene was engineered not to work at high temperatures, and introduced into the cells small pieces of DNA called plasmids, each carrying either a wild type (temperature-insensitive) Hsp90 gene or one of the 500+ mutants. They then grew all cells together in a common culture. After a while, they raised the growing temperature to let the engineered genes determine the cells’ survival.

They took samples every few hours – wild type yeast populations doubled every 4 hours – and did something that would not have been possible even a few years ago: sequenced the region of interest from this mixed culture, and compared the abundance of different sequence variants. By counting how many times each mutant was sequenced at each time point, they got a very good estimate of the mutants’ relative abundances. The way each mutant prospered or declined relative to the others over time gave a measurement of its fitness.
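Here is a minimal sketch of how read counts over time can be turned into a fitness number. It is a generic textbook-style calculation, not necessarily the exact procedure in the paper: it fits a straight line to the log of the mutant-to-wild-type ratio and rescales the slope by the wild type's 4-hour doubling time. The read counts and the function names are made up for illustration.

```python
import math

WILDTYPE_DOUBLING_H = 4.0  # wild type doubling time quoted in the post

def log_ratio_slope(times_h, mutant_counts, wildtype_counts):
    """Least-squares slope of ln(mutant / wild type) against time (per hour)."""
    ys = [math.log(m / w) for m, w in zip(mutant_counts, wildtype_counts)]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    den = sum((t - mean_t) ** 2 for t in times_h)
    return num / den

def selection_coefficient(times_h, mutant_counts, wildtype_counts):
    """Growth-rate difference relative to wild type.

    0 means the mutant grows like wild type; -1 means it does not grow at all.
    """
    wildtype_rate = math.log(2) / WILDTYPE_DOUBLING_H
    slope = log_ratio_slope(times_h, mutant_counts, wildtype_counts)
    return slope / wildtype_rate

# Hypothetical read counts: the mutant's share halves every 4 hours.
times_h = [0, 4, 8, 12]
wildtype_reads = [1000, 1000, 1000, 1000]
mutant_reads = [1000, 500, 250, 125]

s = selection_coefficient(times_h, mutant_reads, wildtype_reads)
print(round(s, 3))  # -1.0
```

In this invented example the mutant loses half its share with every wild-type doubling, which is what you would see if the mutant cells had stopped dividing altogether.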

What did they find?

There are so many interesting things in this study that I’m not sure where to begin. Let’s start with the result that concerns the first question posed in my introductory paragraph. How are the mutations distributed along the deleterious – beneficial axis?

Perhaps not surprisingly, most non-synonymous mutations were harmful to fitness. I say not surprisingly because this protein has been honed by selection for many, many millions of years. It is probably close to the best it can be, although the researchers tried to pick a region that contained variable as well as highly conserved amino acids.

[ASIDE: They didn’t really succeed in that – among the 400+ species they say they used for comparison, 4 of 9 positions don’t vary at all, 2 are identical in almost all species, another 2 can have two amino acids with roughly equal chance, and only one can hold three different amino acids. I’ve seen more variation in supposedly highly conserved sequences over smaller phylogenetic distances. Perhaps Hsp90 is just that conserved everywhere.]

There were a few mildly beneficial mutations, but no highly beneficial ones. Deleterious mutations could be divided into two large groups, with very few in between: mostly they were either very harmful or close to neutral. This constitutes support for the nearly neutral theory of molecular evolution, but as I said, the sequence they examined is hardly representative of all sequences under all circumstances. It would be interesting to see how (or whether) the distribution changes in sequences under directional selection, or sequences that don’t experience much selection at all. I’m kind of hoping that that’s their next project 😛

The second interesting observation – interesting to me, anyway – is that nonsense mutations, those that introduce an early stop codon in the sequence, were not as unfit as complete deletions of the gene. A stop codon means the end of the protein – an early stop codon eliminates everything that comes after it. Cells making a truncated protein were lousy at survival, but not quite as lousy as cells with no Hsp90 at all. This is a bit strange, given that earlier the paper states that a region of Hsp90 that comes after their 9 amino acids is necessary for its function. A nonsense mutation in the test region removes that supposedly necessary part, so why did those cells do any better than mutants lacking the gene entirely?

Looking at synonymous mutations, the team determined that these don’t affect fitness much. This has practical importance, because synonymous mutations have long been used as a “baseline” to detect signs of selection in other mutations. If they weren’t neutral, the central assumption of that approach would fall down.

Another question the study asked was whether certain positions in the protein require amino acids of a certain type. The twenty amino acids found in proteins can be loosely grouped according to their physical and chemical properties. For example, some of them are positively charged, while others carry no charge at all; some are (relatively speaking) huge and some are tiny. These properties determine how a protein folds and what its different regions can do, so one would expect that in important positions, only amino acids similar in size and chemistry could work.

To find all the amino acids that worked equally well in a given position, Hietpas et al. looked at a subset of amino acid changes: those whose fitness was very close to the wild type. Surprisingly, they found that several positions tolerated radically different amino acids without losing much fitness. Quoting from the paper,

“[t]his type of physical plasticity illustrates the degenerate relationship between physics and biology: Biology is governed by physical interactions, but biological requirements can have multiple physical solutions.”

This is kind of stating the obvious in this context, but it does echo a more general observation about life. In evolution, there is often more than one way to skin a cat.

[ASIDE: Analogous enzymes provide a striking demonstration of that. These are pairs – or even groups – of enzymes that catalyse the same reaction, without bearing any physical resemblance to one another. Their sequences are different, their 3D structures are different, and their catalytic mechanisms are different, yet they do essentially the same thing. But there are also more familiar, if less extreme, examples. For instance, within vertebrates only, we see three different solutions for powered flight and even more variations on gliding (here are some of them).]

The researchers built a “fit amino acid profile” of their test sequence using these “wild type-like” mutations, then compared it to the actual pattern of amino acid substitutions observed in “real” Hsp90 proteins. It turns out the two are quite different: eight out of the nine positions are conspicuously less variable in real life than the fitness profile would predict. The paper lists a few possible explanations. Lab environments are not natural environments, and amino acids that work fine in their very controlled environment may not be so great under harsher or less stable real-world conditions. Wild type-like fitness does not mean the substitution is completely neutral – many of them are slightly deleterious, which may come out more strongly under natural circumstances, especially over the long term. And one of the substitutions would require more than one mutation at the DNA level – with strongly deleterious intermediate steps.

That last point leads me to the part of the study I personally found most interesting. Thus far, we’ve taken the genetic code as a given, and hardly paid any attention to it at all. But, in fact, the genetic code itself is a product of evolution. Most likely, it didn’t spring into existence fully formed when organisms invented protein synthesis. There is a mind-blowingly large number of possible genetic codes – why is it that organisms use this particular one, with only minor variations? We won’t go into all of the hypotheses about that, mostly because I’m not very familiar with them. It’s enough to note that in principle, the genetic code could be accidental (it just happened to be the one some distant ancestor of all living things stumbled on), a chemical inevitability of some sort, or it could have risen to prominence by natural selection.

[ASIDE: The options are not mutually exclusive. For example, it is possible that the only important thing about the genetic code is how easy it is to mutate from particular amino acids to certain others – in other words, that it’s the structure of the code that’s under selection, while its finer details, such as which four codons stand for glycine, may be largely coincidental or determined by chemical necessity.]

For this tiny region of the Hsp90 gene/protein, it looks very much like selection had a hand in it. Hietpas et al. took their theoretical fit amino acid profile and a sample of 1000 randomly generated genetic codes, and asked how many substitutions it would take to switch between equally fit amino acids under each genetic code. Intriguingly, very few genetic codes made it as easy as the real one. In other words, the genetic code seems geared to minimise the number of deleterious mutations.
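To make the random-code comparison more concrete, here is a rough sketch of the general idea. Everything specific in it is an assumption for illustration, not the paper's method: the random codes are made by shuffling which amino acid goes with which block of codons (stop codons stay put), and the list of "interchangeable" amino acid pairs is hypothetical rather than taken from the measured fitness profile.

```python
import itertools
import random

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in itertools.product(BASES, repeat=3)]
STANDARD_CODE = dict(zip(CODONS, AMINO_ACIDS))

def shuffled_code(rng):
    """Random code: permute the 20 amino acid labels, keep stops in place."""
    labels = sorted(set(AMINO_ACIDS) - {"*"})
    permuted = labels[:]
    rng.shuffle(permuted)
    swap = dict(zip(labels, permuted))
    swap["*"] = "*"
    return {codon: swap[aa] for codon, aa in STANDARD_CODE.items()}

def min_changes(code, aa_from, aa_to):
    """Fewest base changes needed to get from one amino acid to another."""
    from_codons = [c for c, aa in code.items() if aa == aa_from]
    to_codons = [c for c, aa in code.items() if aa == aa_to]
    return min(
        sum(x != y for x, y in zip(a, b))
        for a in from_codons
        for b in to_codons
    )

def mean_cost(code, pairs):
    """Average number of base changes across a list of amino acid pairs."""
    return sum(min_changes(code, a, b) for a, b in pairs) / len(pairs)

# Hypothetical pairs of amino acids assumed to be equally fit at some position.
HYPOTHETICAL_PAIRS = [("D", "E"), ("L", "I"), ("K", "R"), ("F", "Y"), ("S", "T")]

rng = random.Random(0)  # fixed seed so the run is repeatable
real_cost = mean_cost(STANDARD_CODE, HYPOTHETICAL_PAIRS)
random_costs = [mean_cost(shuffled_code(rng), HYPOTHETICAL_PAIRS) for _ in range(1000)]
fraction_as_good = sum(c <= real_cost for c in random_costs) / len(random_costs)

print(real_cost)         # 1.0: each pair is one base change apart in the real code
print(fraction_as_good)  # share of random codes that do at least as well
```

The smaller that final fraction, the more unusual the real code is at keeping interchangeable amino acids a single mutation apart. With a hand-picked list of pairs like this one the number says nothing about real biology; the point is only to show the shape of the comparison.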

What’s really fascinating about that result is that it came from an analysis of such a tiny sequence. Earlier, I mentioned that it might be hard to generalise anything from a short sequence. But it’s hard to believe that this particular finding doesn’t have general applicability. The genetic code sets the rules for all proteins – if it weren’t optimised in general, what’s the chance that such strong optimisation would be detected in such a tiny sample? This also suggests that roughly the same amino acids are interchangeable across the board, regardless of which protein we’re talking about. (Which is not necessarily surprising if you’ve ever spent time comparing protein sequences between species, but still, it’s valuable as a new way of looking at a familiar phenomenon).

All in all, this is the kind of paper that makes me all giddy with excitement. It digs deep into fundamental questions in evolutionary theory, and it finds some intriguing answers. It’s also a great reminder of how amazingly far technology has come – merely sequencing 27 base pairs would have been a formidable task at the dawn of molecular biology, and now we can mix 500 different versions together, sequence all of them in a single experiment, and reliably count how many of each variant there are. And that’s nowhere near the limits of current sequencing technology. This is the future, folks, and it’s better than sci-fi.


Eyre-Walker A & Keightley PD (2007) The distribution of fitness effects of new mutations. Nature Reviews Genetics 8:610-618

Hietpas RT et al. (2011) Experimental illumination of a fitness landscape. PNAS 108:7896-7901

Internet science gets it wrong

While I was writing the post about treehoppers, I went in search of a simple diagram of insect anatomy. One of the pictures that Google threw my way was the one here. From my limited experience, Earth Life Web isn’t bad in general, but they got the wings of their generalised insect totally wrong. Insects (without exception AFAIK) have three thoracic segments, and the wings of flying insects are on the second and third. NOT the first and second, which is what the Earth Life diagram shows.

Just goes to show that you have to be very, very careful when mining scientific information from the web. Or from anywhere else, really.

Some funky bugs and the novelty of novelty

These must be some of the craziest-looking animals I’ve ever seen.

An assortment of treehoppers (family Membracidae), from Prud'homme et al. (2011)

(Yes, they are actually bugs, as in they belong to order Hemiptera)

Apparently, those extravagant shapes are all due to one special body part called the helmet – an outgrowth of the first thoracic segment of these insects. (Here’s a little reminder of insect anatomy.) It only occurs in treehoppers, according to Prud’homme et al. (2011). I confess, I know very little about insects in general, and nothing about treehoppers in particular, but talk of evolutionary novelties always gives me a little kick.

[NOTE: I won’t define “novelty” exactly. You can probably figure out what it means, and it’s one of those funny concepts that defies an easy definition. Which is kind of the point of this post, though I didn’t originally intend it to come out that way.]

Evolutionary novelty, at least in complex, multicellular organisms like animals, is usually thought to come from tinkering more than “true” innovation. This is thought to hold on all levels; new genes are often modified versions of old genes, new cell types originate from old cell types, and new body parts are built on old body parts. If you think about it, this makes perfect sense: the old parts are already there, doing jobs that can be used as a starting point, whereas sticking a mutation in a piece of DNA that doesn’t encode anything and stumbling on a useful new gene is not exactly the likeliest event in evolution.

[ASIDE: Whole new body parts practically have to come from old parts on some level – the probability of evolution assembling a complex organ entirely from scratch has many times more zeroes after the decimal point than the probability of accidentally making a new gene. The question is how much of the new part is new. Is it built almost completely from an old structure, such as a whole arm – individual bones, muscles and everything – being modified into a wing, or does it only borrow basic building blocks and put them together in a completely new way?]

The outlandish helmets of treehoppers (sort of) uphold the prevailing view. Prud’homme et al. (2011) tell us that this has been a matter of some controversy – most held that they were “true” novelties that were not homologous to any other body part, but there were clues that there’s more to the story than that. And, indeed.

The first hints were anatomical. Helmets don’t simply grow out of the animal’s back – they are attached by a joint. Beyond that, they share a few other details, including their tissue structure and their veins, with the appendages almost all insects bear on their other thoracic segments: wings. What’s more, although the mature helmet is a single structure, it develops from two precursors that eventually fuse together. Two wings, two helmet primordia, you get the picture.

Prud’homme et al.’s investigation involved more than dismantling the thoraxes of baby treehoppers. Homologous structures often share a common genetic underpinning, so they checked the expression of some “wingy” genes (or, to be precise, their protein products) to see just how deep the similarity between helmets and wings extended. The first of these, Nubbin, is wing-specific in better-studied insects. As expected if helmets are homologous to wings, the developing helmet was chock full of Nubbin. The two other genes they analysed, Distal-less (Dll) and homothorax (hth), are more generally expressed in insect appendages (wings, legs and antennae), defining their different regions from base (hth) to tip (Dll). They showed the same expression pattern in the helmet – which doesn’t necessarily mean that helmets are modified wings, but it does suggest they are based on some kind of appendage. And, given what appendages the other thoracic segments bear in the same position…

[NOTE: Well, I don’t know much about hth, but Dll is a bit problematic in this respect. It’s not just an “appendage gene” in insects, but also in a wide variety of other animals. Were it not for Dll expression, no one would suggest homology between, say, the tube feet of a starfish and the legs of a fly (Panganiban et al., 1997) – it’s pretty likely that Dll was originally more of an “anything that sticks out of the body” gene than an “appendage”, never mind a “wing”, gene proper. Dll/Dlx genes also do other stuff, like making neurons migrate in vertebrate brains (Anderson et al., 1997). So Dll expression alone doesn’t mean something is an appendage, let alone a specific type of appendage. Luckily, it’s not alone here. Incidentally, this is lesson number one of comparative/evolutionary developmental genetics. When the question is homology of a structure or process, always look at combinations of genes.]

This is not too surprising given the evolutionary history of wings, or what the fossil record was kind enough to preserve for posterity. The first known winged insects (link leads to drawing of Stenodictya lobata in Grimaldi and Engel, 2005) actually had winglets on the first thoracic segment as well, but those were lost before the last common ancestor of living insects. (How that happened in genetic terms, and how it may have been reversed in treehoppers, is also discussed in the paper, but it isn’t directly relevant to the novelty issue.) In a way, treehoppers’ “invention” is a giant laugh in the face of Dollo’s Law, which proposes that complex features don’t re-evolve once they are lost (I kind of touched on this “law” here).

Nevertheless, helmets look nothing like wings and function nothing like wings. (To be fair, they look nothing like one another, either.) They are so dissimilar to their proposed evolutionary sisters that apparently their relationship eluded most researchers. How “novel” are they, then? It’s something of a philosophical question. Since, at this level of complexity, literally nothing comes from scratch, at what point do we stop calling something “tinkering” and start calling it “true novelty”?

As with most philosophical questions, I don’t think this one has a correct answer. That doesn’t mean these questions are not worth pondering. The way we word things influences the way we think about them. Exactly where (or even if) we draw a line between two fuzzy concepts isn’t important in my opinion. But to be aware that there is a dilemma about that line, and that other people may draw it in different places, is. Effective communication is one of my Big Issues, and being critical of your own thinking is an issue that ought to be Big for anyone doing science. (Or for anyone, full stop.) Thinking about unanswerable questions like this is a great way of exercising those (self-)critical muscles.

(Originally, I just wanted to gush about the excitement of figuring out the origin of novelties, but I managed to turn it into a philosophical treatise. Whoda thunk that? <.< )


Anderson SA et al. (1997) Interneuron migration from basal forebrain to neocortex: dependence on Dlx genes. Science 278:474-476

Grimaldi D and Engel MS (2005) Evolution of the Insects. Cambridge University Press.

Panganiban G et al. (1997) The origin and evolution of animal appendages. PNAS 94:5162-5166

Prud’homme B et al. (2011) Body plan innovation in treehoppers through the evolution of an extra wing-like appendage. Nature 473:83-86