
Cuts, Cleaves, and Cash: A History of RADseq

If you have read through our dolphin genetics pages, you have probably come across the term “RADseq.” RADseq is shorthand for restriction-site associated DNA sequencing. That may sound intimidating, but it’s a relatively simple process. Essentially, enzymes are used to cut DNA at certain short sequences that are located throughout the genome. The DNA around these cut sites is then sequenced. The enzymes used for this process are called restriction endonucleases, and there are many of them, each recognizing a different sequence and cutting the DNA in a different way. For example, in our work, we used SbfI and MspI. Using two enzymes together is called ddRADseq (with the “dd” standing for double-digest).
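To make "cutting at certain short sequences" concrete, here is a minimal Python sketch that looks for the recognition sequences of the two enzymes mentioned above in a made-up stretch of DNA. The recognition sequences (CCTGCAGG for SbfI, CCGG for MspI) are the standard published ones; the toy sequence and the helper name `find_sites` are assumptions made up for illustration, not part of our actual pipeline.

```python
import re

# Recognition sequences, written 5'->3'. Both are palindromic,
# so searching one strand is enough to find every site.
SITES = {"SbfI": "CCTGCAGG", "MspI": "CCGG"}

def find_sites(seq, site):
    """Return the 0-based start position of every occurrence of `site` in `seq`.

    Hypothetical helper for illustration. A lookahead is used so that
    overlapping matches would also be reported.
    """
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]

# Made-up toy sequence: one SbfI site followed by two MspI sites.
toy = "AAACCTGCAGGTTTTCCGGAAAACCGGTT"

for enzyme, site in SITES.items():
    print(enzyme, find_sites(toy, site))
```

Note how the 8-base SbfI site turns up far less often than the 4-base MspI site. In a real genome the same pattern holds: a longer recognition sequence means a rarer cut.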

ddRADseq is a common technique used today. But what are its origins? What is the history of RADseq itself? Let's find out!

Scientists have been cutting DNA with restriction enzymes for over 50 years. Restriction enzymes themselves were first discovered in 1968 by Werner Arber. He found that bacteria were able to use enzymes to destroy the foreign DNA of bacteriophages, and he hypothesized that this was done by cutting at particular DNA sites. We know now that he was absolutely correct!

Restriction enzymes were not used for population genetics until 1979. That year, new research found that variations in mitochondrial DNA could be detected by cutting the mitochondrial genome with several restriction enzymes (the paper used six) and then separating the resulting fragments by length on a gel. This was done in different species of deer mice, and the scientists found that each species exhibited unique proportions of specific fragment lengths. Essentially, this was an early way to discover mitochondrial haplotypes. However, there was no actual sequencing involved. It would take about 20 more years before restriction enzymes and sequencing were effectively combined.

In 2000, a new method of discovering variants was introduced: reduced representation shotgun sequencing. Previously, discovering SNPs had required either designing PCR primers for specific locations in the genome or shotgun sequencing the whole genome. However, PCR primers required prior knowledge of the genome, and ordinary shotgun sequencing required tremendous coverage of the genome before there could be any confidence in the discovered SNPs. Therefore, Altshuler et al. decided to reduce the portion of the genome being sequenced in order to improve “flexibility and efficiency.” This was done with restriction enzymes. Essentially, restriction enzymes that were known to cut at certain sites were used to isolate specific sections of DNA, and only those sections were sequenced. This cut down on the cost and time of shotgun sequencing and still enabled the discovery of many SNPs. By 2008, this method could also be used for genotyping.

However, reduced representation shotgun sequencing was still not quite “RADseq.” The fragments created by the restriction enzymes were being sequenced, but not necessarily the area on both sides of the restriction site. That would come in October 2008 with Baird et al. This was the advent of RADseq as it is known today: sequencing around restriction sites with the ability to discover vast numbers of SNPs. At the time, this process was called “RADtag.” The paper also introduced barcoding samples so that they could be combined during sequencing, which saves money. However, RADseq was still only being done with one enzyme at a time.

In 2012, double-digest RADseq was introduced by Peterson et al. Using two enzymes enables size selection of specific fragments, reduces the amount of the genome that’s present in the library, makes it more likely that the same regions will be sequenced across individuals, and therefore increases the confidence in any discovered variants. This opened up RADseq to a whole new list of applications (see Fig. 1 from Peterson et al. below). Additionally, ddRADseq is much cheaper and easier to analyze than whole-genome sequencing. Its low cost, relative ease, and power to discover huge numbers of variants have led to an explosion in the use of ddRADseq ever since 2012.
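The idea of a double digest followed by size selection can be sketched in a few lines of Python. This is a simplified illustration, not the Peterson et al. protocol: it assumes each enzyme cuts at the start of its recognition sequence (the real cut falls inside the site), keeps only fragments that have a different enzyme at each end, and uses a made-up toy sequence and size window. The function names are hypothetical.

```python
import re

SBFI, MSPI = "CCTGCAGG", "CCGG"  # recognition sequences, 5'->3'

def cut_positions(seq, site, enzyme):
    """Return (position, enzyme) for every occurrence of `site` in `seq`.

    Simplification: the site's start position stands in for the cut point.
    """
    return [(m.start(), enzyme) for m in re.finditer(f"(?={site})", seq)]

def ddrad_fragments(seq, min_len, max_len):
    """Keep fragments flanked by one cut from each enzyme within a size window."""
    cuts = sorted(cut_positions(seq, SBFI, "SbfI") + cut_positions(seq, MSPI, "MspI"))
    kept = []
    # Walk over neighbouring cut sites; each pair bounds one fragment.
    for (start, enz_a), (end, enz_b) in zip(cuts, cuts[1:]):
        length = end - start
        if enz_a != enz_b and min_len <= length <= max_len:
            kept.append((start, end, length))
    return kept

# Made-up toy sequence with sites in the order SbfI, MspI, MspI, SbfI.
toy = "A" * 10 + SBFI + "T" * 20 + MSPI + "T" * 5 + MSPI + "A" * 40 + SBFI + "C" * 10

# Each kept fragment is reported as (start, end, length).
print(ddrad_fragments(toy, min_len=30, max_len=60))
```

Of the three fragments between neighbouring cut sites, only one survives: the MspI-to-MspI fragment is dropped because both ends come from the same enzyme, and the first SbfI-to-MspI fragment is too short for the window. Because every individual of a species has mostly the same cut sites, the same filter tends to pull out the same regions from each sample, which is why the sequenced regions overlap so well between individuals.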

As an example of the popularity of RADseq, I searched “radseq” on Google Scholar. From 2012 to 2017, there were 3,950 results; from 2017 to 2022, that number ballooned to 11,400. Of course, this is a very rough estimate of the number of papers published about RADseq, but it illustrates my point: RADseq has become a very popular technique in very little time. This has followed the general trend in genetics: improvements in technology and laboratory techniques (such as RADseq) have created a genetics/genomics revolution in the last 10 years. This has enabled a similar explosion in population genetics, as non-model organisms can now be sequenced at a rate that was previously unimaginable. In my own research, I have come across RADseq papers on every kind of organism, including yeast, bees, trees, and, of course, whales. As this revolution continues, who knows what the future of RADseq will look like!

Dr. Werner Arber

Dr. David Altshuler

Threespine stickleback (study animal of Baird et al.)

Dr. Hopi Hoekstra (senior author on Peterson et al.)

Japanese Sugi Tree