Genome editing of human pluripotent stem cells to generate human cellular disease models

Disease modeling with human pluripotent stem cells has come into the public spotlight with the awarding of the Nobel Prize in Physiology or Medicine for 2012 to Drs John Gurdon and Shinya Yamanaka for the discovery that mature cells can be reprogrammed to become pluripotent. This discovery has opened the door for the generation of pluripotent stem cells from individuals with disease and the differentiation of these cells into somatic cell types for the study of disease pathophysiology. The emergence of genome-editing technology over the past few years has made it feasible to generate and investigate human cellular disease models with even greater speed and efficiency. Here, recent technological advances in genome editing, and its utility in human biology and disease studies, are reviewed.


Disease modeling with human pluripotent stem cells
There are two main varieties of human pluripotent stem cells (hPSCs): human embryonic stem cells (hESCs), which are derived directly from embryos (Thomson et al., 1998;Reubinoff et al., 2000) and continue to be considered the gold-standard hPSCs, and induced pluripotent stem cells (iPSCs), which are generated by the introduction of 'reprogramming factors' into fibroblasts or other differentiated somatic cell types (Takahashi et al., 2007;Yu et al., 2007;Park et al., 2008a;Nakagawa et al., 2008). A third type, stem cells derived by somatic cell nuclear transfer (SCNT), i.e. the transfer of a nucleus from a differentiated cell into an enucleated ovum, has recently been successfully generated for humans (Tachibana et al., 2013).
All hPSCs share two useful theoretical properties. First, they can be maintained in culture for a large number of passages without loss of genomic integrity, which distinguishes them from standard cultured cell lines that are transformed or immortalized and have severely abnormal karyotypes. [In reality, upon continued passaging, both hESCs and iPSCs eventually accumulate genetic alterations that confer a growth advantage in culture (Draper et al., 2004;Cowan et al., 2004;Mitalipova et al., 2005;Maitra et al., 2005;Mayshar et al., 2010;Laurent et al., 2011;Taapken et al., 2011;Martins-Taylor et al., 2011;Amps et al., 2011).] Second, hPSCs can be differentiated into any of the myriad of somatic cell types in the human body. [In practice, the ability to differentiate into a desired cell type depends on the availability of an efficient protocol to achieve the differentiation, which at present is only true of a small number of cell types (e.g. Lee et al., 2010;Lian et al., 2013) but will surely expand to cover more in the coming years.] This feature is advantageous because it makes it possible to derive cell types for which standard cultured cell lines do not exist and which are difficult to obtain from patients as primary cells (e.g. neurons).
Owing to recent advances, iPSCs can now be derived from a skin biopsy (Dimos et al., 2008;Park et al., 2008b) or blood sample (Seki et al., 2010;Loh et al., 2010;Staerk et al., 2010) from virtually any given patient, making it possible to derive, expand and differentiate somatic cells that are genetically matched to the patient. In principle, this provides a means by which an investigator can extensively study a patient's pathophysiology without having to touch the patient after the iPSCs are generated.
However, there are several limitations to the utility of iPSC-based studies. First, the disease under study must have a strong genetic component. In the best-case scenario, the disease is monogenic in nature and driven by a single gene mutation (e.g. cystic fibrosis), which would be retained in patient-derived iPSCs and cause disease-related phenotypes to manifest at the cellular level in the appropriate differentiated cell type (e.g. lung epithelial cells). In contrast, for a disease that is driven by numerous genetic and environmental factors (e.g. myocardial infarction), the extent to which studies using patient-derived iPSCs will offer any advantage in understanding the disease process is unclear.
Second, as with any scientific study, the quality of iPSC-based studies depends on the availability of appropriate controls: any phenotypes observed in a patient's iPSC-derived cells should only be interpreted via comparison with control cells (Fig. 1). There are a number of published studies in which one or a few iPSC lines from patients with a disease and one or a few iPSC lines from individuals without the disease have been generated and differentiated, with claims that phenotypic differences observed between the cell lines are relevant to disease (e.g. Ebert et al., 2009;Lee et al., 2009;Ye et al., 2009;Carvajal-Vergara et al., 2010;Rashid et al., 2010;Moretti et al., 2010;Swistowski et al., 2010;Marchetto et al., 2010;Brennand et al., 2011;Sun et al., 2012;HD iPSC Consortium, 2012). However, these studies are potentially flawed because they do not account for possible confounders that might be responsible for the phenotypic differences.
Differences in genetic background are of greatest concern; even in studies in which healthy siblings have been used as controls for disease patients, only ~50% of the genome is shared between
any siblings, and phenotypic differences could be the result of DNA variants in the other ~50% of the genome, rather than the disease-associated mutations. Furthermore, a number of studies have documented that the process of generating, expanding and passaging iPSC lines can lead to the accumulation of a variety of genetic alterations, ranging from single-nucleotide variants to copy-number variants to chromosomal amplifications, deletions and rearrangements (Mayshar et al., 2010;Laurent et al., 2011;Hussein et al., 2011;Gore et al., 2011;Taapken et al., 2011;Howden et al., 2011;Martins-Taylor et al., 2011;Yusa et al., 2011;Amps et al., 2011;Ji et al., 2012). Another significant confounder is epigenetic state. A number of studies have documented that iPSCs vary widely with respect to genomic methylation patterns, in some cases seeming to retain epigenetic 'memory' reflecting the somatic cell of origin from which the iPSCs were reprogrammed; some iPSCs seem to retain this memory indefinitely, whereas others gradually lose this memory as they go through many passages in culture (Kim et al., 2010;Polo et al., 2010;Bock et al., 2011;Lister et al., 2011;Nishino et al., 2011;Bar-Nur et al., 2011;Kim et al., 2011;Nazor et al., 2012;Ruiz et al., 2012). Some of these studies suggest that epigenetic state can affect the differentiation potential of an iPSC line, i.e. differentiation into some cell types over others is favored. Other potential confounders include: unmatched age, gender and ethnicity between the patients and control individuals; differences in methodology used to induce pluripotency (e.g. lentivirus versus RNA transfection); and differences in passage number and adaptation to culture of the iPSC lines.
The most rigorous possible comparisons would be between cell lines that differ only with respect to disease mutations, i.e. otherwise isogenic cell lines. One way to ensure this would be to use wild-type and mutant cell lines derived from the same parental cell line (Fig. 1). This strategy would also eliminate, or at least mitigate, all of the other confounders, allowing investigators to directly connect genotype to phenotype to establish causality. Such a strategy would require the ability to efficiently introduce specific genetic alterations into the genomes of hPSCs at will. Fortunately, the emerging technology known as 'genome editing' is now putting this ability within easy reach.
An elegant demonstration of the superiority of an isogenic cell line study design over a study design comparing iPSC lines from patients versus control individuals was provided by a recent study of a mutation in the LRRK2 gene, which is implicated in Parkinson disease (Reinhardt et al., 2013). The investigators derived iPSC lines from patients with the LRRK2 mutation and healthy individuals without the mutation; they also used genome editing to correct the mutation in the mutant iPSC lines. They then differentiated the cell lines into neurons and compared global gene expression profiles, followed by cluster analysis to assess the degree of similarity among the cell lines. Notably, they found that the healthy iPSC lines and the mutant iPSC lines did not cluster in separate groups, as would be hoped; rather, one of the healthy lines clustered very closely with one particular mutant line, whereas the other healthy line was very different from all of the other lines. Two mutant lines from the same patient were quite different, with one of these mutant lines being more similar to one of the healthy lines and to a mutant line from a different patient. The only cell lines that reliably clustered close together (i.e. were extremely similar) were pairs of mutant lines with and without correction of the mutation by genome editing (Reinhardt et al., 2013).

Fig. 1. A comparison of two study designs for disease modeling using human pluripotent stem cells. (A) Induced pluripotent stem cells (iPSCs) are reprogrammed from an individual(s) with disease and a control individual(s). The iPSCs are differentiated into a cell type of interest; the cell lines are compared for relevant phenotypes. This study design is susceptible to a number of potential confounders emanating from the fact that the cell lines are not matched (genetically, epigenetically, etc.) and could have been derived by different methods and in different circumstances. The study design is also time-consuming and costly. (B) Human pluripotent stem cells (hPSCs), whether human embryonic stem cell lines (hESCs) or pre-existing iPSCs, are modified with genome editing, thereby creating optimally matched cell lines. The wild-type and mutant hPSCs are differentiated into a cell type of interest; the cell lines are compared for relevant phenotypes. This study design minimizes confounders, making it more scientifically rigorous, as well as reducing the associated time and costs.

The emerging promise of genome editing
Classical gene-targeting technology uses homologous recombination to target an investigator-specified gene for disruption or modification (Smithies et al., 1985;Thomas and Capecchi, 1987). This approach has proven to be invaluable through its use in mouse embryonic stem cells to generate germline knockout or knock-in mice. However, homologous recombination is challenging in hPSCs (Zwaka and Thomson, 2003) and, although homologous recombination has become a mainstay for investigating gene function, its use in mammals has been limited primarily to studies in mice. Knockout strategies utilizing homologous recombination in human somatic cells are similarly challenging and, as a result, technologies such as antisense oligonucleotides and short interfering RNAs (nucleic acids that match to sequences in cellular RNA transcripts and result in their degradation) have flourished as a means to knock down gene expression. However, these reagents interfere with gene expression only transiently, and the knockdown effect can be incomplete or extend to off-target genes (Qiu et al., 2005). In light of recent advances in the use of hPSCs for disease modeling, as described in the previous section, the demand for more efficient and rapid methods of gene knockout or modification has only increased.
The emerging technology of genome editing, also known as genome engineering, seeks to meet this need by providing the ability to more efficiently introduce a variety of genetic alterations, ranging from single-nucleotide modifications to whole gene addition or deletion, all with a high degree of target specificity. The key features of the most widely used genome-editing systems, in addition to the major advantages and disadvantages of each (Table 1), are described below.

Zinc finger nucleases
Zinc finger nucleases (ZFNs) are a type of genome-editing technology that is increasingly being used in academic and industry research (Urnov et al., 2010). ZFNs are fusion proteins consisting of an array of site-specific DNA-binding domains, adapted from zinc finger transcription factors, fused to the nuclease domain of the bacterial FokI restriction enzyme. Each zinc finger domain recognizes a 3- to 4-base-pair (bp) DNA sequence, and individual domains can be arranged in tandem to bind to an extended nucleotide sequence that is unique within a genome.
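As a rough illustration of why several zinc finger domains must be arranged in tandem before a binding site is likely to be unique, the sketch below estimates how often one specific sequence of a given length would occur by chance. It is a back-of-the-envelope calculation only: it assumes a haploid genome of about 3 billion bp, independent and equally frequent bases, and 3 bp per zinc finger domain, and the function name is illustrative rather than taken from any published tool.

```python
def expected_random_matches(site_length_bp, genome_size_bp=3_000_000_000):
    """Expected number of chance occurrences of one specific sequence.

    Assumption: bases are independent and equally frequent, and both
    strands are searched. Real genomes are repetitive and compositionally
    biased, so this is only an order-of-magnitude guide.
    """
    searchable_positions = 2 * genome_size_bp  # forward and reverse strands
    return searchable_positions / 4 ** site_length_bp


# Compare a single finger, a three-finger array, and a pair of
# three-finger arrays (taking 3 bp per zinc finger domain).
for n_fingers in (1, 3, 6):
    length_bp = 3 * n_fingers
    print(n_fingers, length_bp, expected_random_matches(length_bp))
```

Under these assumptions, a 9-bp site is expected to recur many thousands of times, whereas the combined 18 bp recognized by a pair of three-finger arrays is expected to occur less than once by chance.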
To cleave a target site of interest, ZFNs are typically designed in pairs that recognize sequences flanking the site; upon binding of the ZFN pair around the site, the FokI nuclease domains dimerize and generate a double-strand break (DSB) (Urnov et al., 2010). DSBs are repaired by the cell using either the error-prone process of non-homologous end-joining (NHEJ) or homology-directed repair (HDR) with the corresponding locus on the sister chromosome serving as a repair template (Fig. 2). NHEJ can be used to introduce frameshifts into the coding sequence of a gene, thereby generating premature truncations that effectively knock out the gene. HDR can be exploited by the introduction of an exogenous repair template that harbors a desired mutation flanked by homology arms, thereby greatly improving upon the efficiency of traditional homologous recombination, in which the initiation of the process (generation of a DSB) must occur spontaneously. The exogenous repair template can be in the form of either a double-stranded DNA vector or a single-stranded DNA oligonucleotide. In the case of the latter, homology arms as short as 20 nucleotides are sufficient for the introduction of DNA sequences into the genomes of hPSCs (Soldner et al., 2011). The efficiency seems to be high enough that antibiotic selection to expedite the screening for correctly targeted clones might be unnecessary in some cases. This removes the subsequent need to excise an antibiotic cassette from the genome using the Cre-lox or FLP-FRT system (which typically leaves a 'scar' behind in the genome), saving a considerable amount of time.
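The knockout logic described above can be made concrete with a small sketch: an insertion or deletion whose length is not a multiple of 3 shifts the reading frame, and the shifted frame frequently runs into a premature stop codon. The toy coding sequence and the helper names below are invented for illustration; they do not correspond to any gene or software mentioned in this review.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def causes_frameshift(indel_length_bp):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length_bp % 3 != 0


def first_stop_codon_index(coding_seq):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(coding_seq) - 2, 3):
        if coding_seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def delete_bases(seq, start, length):
    """Simulate an NHEJ deletion of `length` bases beginning at `start`."""
    return seq[:start] + seq[start + length:]


# Hypothetical 9-codon open reading frame: ATG GCT CTG ACC GTT TAC TGG AAA TAA
wild_type = "ATGGCTCTGACCGTTTACTGGAAATAA"

one_bp_deletion = delete_bases(wild_type, 4, 1)    # frameshift
three_bp_deletion = delete_bases(wild_type, 3, 3)  # in-frame loss of one codon

print(first_stop_codon_index(wild_type))
print(first_stop_codon_index(one_bp_deletion))
print(first_stop_codon_index(three_bp_deletion))
```

In this toy example the 1-bp deletion exposes a stop codon at the third codon, whereas the 3-bp deletion merely removes one codon and leaves the original stop in frame, which is why only a subset of NHEJ-induced indels produce a knockout.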
Despite the advantages offered by ZFN technology, ZFNs have proven difficult for non-specialist investigators to engineer from scratch because it has not been straightforward to successfully assemble zinc finger domains to bind an extended stretch of nucleotides (Ramirez et al., 2008). Although a library of zinc finger components and protocols to perform screens to identify optimized ZFNs has been made freely available to the academic community (Maeder et al., 2008;Maeder et al., 2009), it can take months for non-specialists to engineer ZFNs that target a genomic site with high efficiency. Furthermore, target-site selection is limited: these freely available ZFN components can only be used for binding sites every few hundred bp throughout the genome. Alternative platforms to construct ZFNs have since emerged, and these show variation in speed, site selection and success in generating efficacious ZFNs (Sander et al., 2011;Gupta et al., 2012;Bhakta et al., 2013). A commercial option to obtain optimized ZFNs for a specified target site evidently has a high success rate but remains expensive.

dmm.biologists.org

Transcription activator-like effector nucleases (TALENs)
Recent studies of a class of proteins called transcription activator-like effectors (TALEs) have characterized a newly identified DNA-binding module, termed a TAL repeat, that is used by each protein in a tandem array with 10-30 repeats to recognize extended DNA sequences with a 1-repeat to 1-bp correspondence (Bogdanove and Voytas, 2011). Each repeat has 33-35 amino acids, with two adjacent amino acids [termed the repeat-variable diresidue (RVD)] conferring specificity for one of the four DNA bases (Moscou and Bogdanove, 2009;Boch et al., 2009). Deciphering of the RVD 'code' has led to the creation of a new class of engineered site-specific nucleases comprising an array of TAL repeats fused to the FokI nuclease domain, termed TAL effector nucleases (TALENs) (Christian et al., 2010;Miller et al., 2011). TALENs function in a similar way to ZFNs in that they generate DSBs within a target site, and so they can also be used to knock out genes or knock in mutations (Fig. 2). However, TALENs seem to be far easier to design than ZFNs: the RVD 'code' has been successfully used to create many de novo extended TAL repeat arrays that bind with high affinity to desired genomic DNA sequences (Miller et al., 2011;Zhang et al., 2011;Hockemeyer et al., 2011). Moreover, there seem to be fewer constraints with respect to which sites can be targeted in human cells, with at least a few potential sites available within each 100 bp of genomic DNA, although methylation of the DNA target site can attenuate the binding affinity (

CRISPR/Cas systems
Even more recently, genome-editing tools have been adapted from bacterial adaptive immune systems known as clustered regularly interspaced short palindromic repeats (CRISPRs) and CRISPR-associated (Cas) systems, which use a combination of proteins and short guide RNAs to recognize and cleave complementary DNA sequences. The bacteria accumulate 'protospacers' that correspond to foreign DNA sequences, such as plasmids and phage genomes, which are then targeted for destruction. By early 2013, four groups had shown that heterologous expression of the Streptococcus pyogenes Cas9 protein along with a guide RNA complex (comprising either a single chimeric RNA or two separate RNAs) in mammalian cells results in DSBs at a site with (1) a 20-bp sequence matching the protospacer of the guide RNA and (2) an adjacent NGG nucleotide sequence [termed the protospacer-adjacent motif (PAM)] recognized by Cas9 (Fig. 2) (Cong et al., 2013;Mali et al., 2013;Jinek et al., 2013;Cho et al., 2013). Thus, in principle, a CRISPR/Cas system can be easily adapted to target a genomic sequence by simply changing the guide RNA, which entails switching out only a 20-bp sequence, with the Cas9 protein component unchanged. This makes CRISPRs easier to engineer than ZFNs or even TALENs, particularly if one desires to generate a library of vectors to target numerous sites in the genome. Another potential advantage of a CRISPR/Cas system is that a single vector can accommodate multiple guide RNAs in series, which are then processed into individual RNAs that allow for simultaneous, multiplexed targeting of multiple sites in the same cell (Cong et al., 2013). At present, even with the most versatile CRISPR/Cas system, genomic site selection is limited to 23-bp sequences on either strand that end in an NGG motif (the PAM for S. pyogenes Cas9), which occur on average once every 8 bp (Cong et al., 2013). Studies to determine the relative efficacies, specificities and ease of use of ZFNs, TALENs and CRISPRs in hPSCs are ongoing (Table 1) and will no doubt influence the relative popularity of the genome-editing tools among the biomedical community.

Fig. 2. Genome editing to knock out genes or knock in DNA variants. Engineered nucleases, whether ZFNs or TALENs, are designed to bind to a specific DNA sequence in the genome, typically as a dimer, as depicted at the top left. The DNA-binding domains of the proteins bind to flanking DNA sequences (indicated in bold) and position their nuclease domains such that they dimerize and generate a double-strand break (DSB) between the binding sites. In CRISPR/Cas systems, as depicted at the top right, the guide RNA recognizes and hybridizes to a 20-bp protospacer in the genome (indicated in bold); the Cas9 protein binds the guide RNA, unwinds the DNA, binds to the NGG motif (indicated in blue) and generates the DSB. The consequence of the DSB is variable with respect to the sequence around the break because native enzymes might further process the free DNA ends. The DSB can be repaired by non-homologous end-joining (NHEJ), which usually restores the original sequence (indicated in gray) but occasionally introduces an insertion or deletion (indel) that can cause a frameshift knockout in the coding sequence of a gene. Alternatively, the DSB can be repaired by homology-directed repair (HDR) using a homologous template, either the endogenous sister chromosome or an exogenously introduced DNA repair template, whether a double-stranded vector or a single-stranded DNA oligonucleotide. If the repair template contains a mutation, the mutation (indicated in red) can be stably incorporated into the genome, resulting in site-specific mutagenesis.
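The site-selection rule for S. pyogenes Cas9 described above (a 20-bp protospacer immediately followed by an NGG PAM, on either strand) is simple enough to sketch in a few lines. The code below is a minimal illustration, not a published design tool: the function names are hypothetical, it ignores off-target considerations entirely, and the spacing estimate assumes independent, equally frequent bases.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Zero-width lookahead so that overlapping 23-bp sites are all reported.
_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def find_cas9_sites(seq):
    """List (strand, start, protospacer, pam) for each 20-bp + NGG site.

    `start` is the 0-based forward-strand coordinate of the leftmost base
    of the 23-bp site, whichever strand carries the NGG motif.
    """
    seq = seq.upper()
    sites = []
    for m in _SITE.finditer(seq):
        sites.append(("+", m.start(), m.group(1), m.group(2)))
    for m in _SITE.finditer(reverse_complement(seq)):
        start_on_forward = len(seq) - (m.start() + 23)
        sites.append(("-", start_on_forward, m.group(1), m.group(2)))
    return sites


def expected_pam_spacing_bp():
    """Mean spacing of NGG sites over both strands under random-base assumptions.

    GG follows any given base with probability (1/4)**2 = 1/16 per strand,
    so across two strands a site begins at about 1 in 8 positions.
    """
    return 1 / (2 * 0.25 ** 2)


print(find_cas9_sites("A" * 20 + "TGG"))
print(expected_pam_spacing_bp())
```

The spacing function reproduces the figure of roughly one candidate site every 8 bp quoted above; in a real genome the density varies with local base composition.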

Additional genome-editing tools
Other tools that have successfully been used for genome editing in human cells include meganucleases (Grizot et al., 2009), adeno-associated viruses (Russell and Hirata, 1998;Khan et al., 2010) and adenoviruses (Suzuki et al., 2008;Liu et al., 2012). Although each of these tools carries its own advantages, at the present time none of them offers the same adaptability and ease of use as ZFNs, TALENs and CRISPRs. Nonetheless, given the breathtaking rate of progress in the field, it would not be surprising if an even more attractive genome-editing tool were to emerge in the near future.

Differentiation and phenotyping of cell models
To date, there have been a number of reports demonstrating the feasibility of performing genome editing in hPSCs with ZFNs, TALENs, CRISPRs and other tools, although these studies were largely performed as proof-of-principle exercises (e.g. Lombardo et al., 2007;Suzuki et al., 2008;Zou et al., 2009;Hockemeyer et al., 2009;Hockemeyer et al., 2011;Soldner et al., 2011;Yusa et al., 2011;Zou et al., 2011;Sebastiano et al., 2011;Li et al., 2011;Mali et al., 2013). As discussed below, in only a few cases have genome-editing tools been used to generate isogenic wild-type versus mutant cell lines that have then been differentiated into disease-relevant cell types and shown to display phenotypic differences that give insight into disease pathophysiology.
In landmark studies, iPSC lines from patients with monogenic disorders have been 'cured' via genome-editing-based correction of the causal mutation and then compared with the parental lines.
Fibroblasts obtained from patients with Hutchinson-Gilford progeria syndrome (HGPS) and atypical Werner syndrome (AWS), each caused by mutations in the LMNA gene, were used to generate iPSCs. The investigators used an adenoviral vector containing the wild-type LMNA sequence to correct the mutation in each of the cell lines. The original and corrected HGPS cell lines were differentiated into vascular smooth muscle cells and fibroblasts; large proportions of the original differentiated cells displayed dysmorphic nuclei and senescence that are characteristic of HGPS, in contrast to corrected cells. An iPSC line from a Huntington disease (HD) patient with an allele bearing 72 polyglutamine repeats was altered to carry only 21 polyglutamine repeats. Upon differentiation into neural stem cells, the original HD cells displayed a significant increase in caspase-3/7 activity upon growth factor deprivation and were significantly more susceptible to cell death compared with corrected cells; in addition, the former displayed reduced mitochondrial bioenergetics compared with the latter, consistent with established HD pathophysiology.
Subsequent studies have gone a step further by both introducing disease mutations into wild-type cell lines and, in parallel, correcting the same mutations in patient-derived iPSC lines. Two studies have focused on the G2019S mutation of the LRRK2 gene, which is associated with familial and sporadic Parkinson's disease (PD). In one study (Liu et al., 2012), the investigators generated iPSC lines from patients with the mutation, used adenovirus to correct the mutation in one of the lines, and then differentiated the matched cell lines into neural stem cells (NSCs). The PD NSCs displayed progressive nuclear aberrations after being maintained for more than a dozen passages in culture; these aberrations were absent in the corrected NSCs. Furthermore, when the PD NSCs were propagated for more than a dozen passages, they became impaired in their ability to differentiate into neurons, compared with similarly passaged corrected NSCs. When the investigators used adenovirus to insert the LRRK2 G2019S mutation into wild-type hESCs, they found that NSCs from the mutant hESCs (but not NSCs from the wild-type hESCs) displayed the same nuclear and differentiation defects as the PD iPSCs, thereby establishing the necessity and sufficiency of the G2019S mutation for the disease phenotypes. The coup de grace was the investigators' demonstration of aberrant nuclear morphology in neurons in the hippocampal dentate gyrus of post-mortem human brain samples from individuals with PD, a pathological feature not previously described in PD. Thus, this study stands as the first example of a disease phenotype being initially discovered in an hPSC-based model system and then subsequently confirmed in patients.
In the second study on the G2019S mutation of the LRRK2 gene (Reinhardt et al., 2013), the investigators generated iPSC lines from PD patients with the mutation and from control individuals, and used ZFNs to correct the mutation in three of the patient-derived lines and to insert the mutation into a control iPSC line. The matched cell lines were then differentiated into midbrain dopaminergic (mDA) neurons. Interestingly, the investigators assessed a rather different set of phenotypes than those studied by Liu et al. (Liu et al., 2012), despite focusing on the same mutation. They found that mutant neurons consistently displayed reduced neurite outgrowth, as well as increased apoptosis in response to oxidative stress, when compared with isogenic wild-type neurons.

Expression profiling of pairs of isogenic wild-type and mutant cell lines revealed several genes that are consistently dysregulated by the mutant LRRK2 gene, including CPNE8, CADPS2, MAP7 and UHRF2; remarkably, individual knockdown of each of those genes in mutant neurons modulated their sensitivity to oxidative stress. The investigators also established that the increased sensitivity to stress of the mutant neurons was at least in part due to activation of ERK signaling and could be reversed with an inhibitor of ERK phosphorylation, pointing to a potential new therapeutic approach for individuals with PD.

Shortcomings of genome-edited cell models
A strong rationale for using genome-edited cell models for phenotypic studies is that, by assessing the effect of a DNA variant on an isogenic background, potential confounders will be minimized. However, this presupposes that genome-editing tools will yield cell lines that are truly isogenic. One large concern about the use of ZFNs, TALENs and CRISPRs (all of which are designed, after all, to introduce DSBs into genomic DNA) is that they will not only cleave at 'on-target' sites but also at 'off-target' sites. Thus, there is the possibility that the tools will introduce significant genomic alterations besides the desired DNA variants, rendering the resulting cell lines not truly isogenic and introducing a source of confounding.
Data remain scarce as to the extent of off-target effects of ZFNs, TALENs and CRISPRs. In one study in which ZFN targeting was performed in hPSCs, the investigators searched ten predicted possible off-target genomic sites (based on sequence similarity to the on-target site) for evidence of mutagenesis and identified one event in 184 clones assessed (Hockemeyer et al., 2009). Although this might seem to be a low rate, when extrapolated to the entire diploid genome of 6 billion bp, this result implies that there is a concrete risk of an off-target event in any given targeted clone. Two studies of ZFNs that used unbiased methods to identify off-target genomic sites for several ZFN pairs documented infrequent off-target effects at numerous loci in a cultured human tumor cell line (Gabriel et al., 2011;Pattanayak et al., 2011). A study in which TALENs were introduced into a pool of hPSCs documented low but measurable rates of mutagenesis at some of 19 predicted possible off-target sites (based on sequence similarity to the on-target site) (Hockemeyer et al., 2011). Preliminary results in mammalian cells demonstrate that CRISPR/Cas systems can tolerate single-nucleotide mismatches from the expected target sequence (Cong et al., 2013).
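The studies above nominated candidate off-target sites by sequence similarity to the on-target site. The exact criteria used in those studies are not given here, so the sketch below shows only the simplest version of the idea: scanning a sequence for windows that differ from the target by a small number of mismatches. The function names, the toy sequences and the forward-strand-only scan are all simplifying assumptions.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def candidate_off_targets(genome, target, max_mismatches):
    """Return (position, site, mismatches) for near-matches to `target`.

    Exact matches (the on-target site itself) are excluded. Only the
    forward strand is scanned; a fuller search would also scan the
    reverse complement and would need an indexed genome to be practical.
    """
    n = len(target)
    hits = []
    for i in range(len(genome) - n + 1):
        window = genome[i:i + n]
        mismatches = hamming(window, target)
        if 0 < mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Toy example: one exact site at position 3, one 1-mismatch site at position 14.
toy_genome = "TTT" + "ACGTACGT" + "TTT" + "ACGAACGT" + "TTT"
print(candidate_off_targets(toy_genome, "ACGTACGT", max_mismatches=1))
```

Sites nominated in this way are only predictions; as the unbiased ZFN studies cited above indicate, cleavage can also occur at loci that similarity-based searches do not anticipate.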
Thus, off-target effects produced by genome-editing tools, however infrequent, might make it unrealistic to expect to obtain 100% isogenic wild-type and targeted cell lines in any given experiment. This does not undermine the genome-editing study design; rather, it establishes the degree of rigor that will be needed to be sure that any phenotypic differences observed between wild-type clones and targeted clones are truly related to the DNA variant of interest. Any one-to-one comparison of a wild-type clone and a targeted clone could be confounded by an off-target effect in one of the clones. However, given the low frequency of off-target effects at any given locus, it is unlikely that multiple clones will have the same off-target effect, and so a study in which multiple wild-type clones show consistent phenotypic differences from multiple targeted clones would argue for those differences being due to the DNA variant of interest. Thus, a prudent study design would entail generating and comparing at least two or three of each type of clone to mitigate any concern of genetic heterogeneity (whether off-target effects or other genetic alterations accumulated during passaging of the cells) being a confounder, regardless of which particular genome-editing tool is employed.
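The argument for comparing multiple clones can be put in numerical terms. The sketch below assumes that clones are derived independently and that each carries an off-target mutation at a particular locus with the same probability; the per-clone rate of 1% used in the example is purely hypothetical and is not an estimate from any of the studies cited here.

```python
def prob_all_clones_share_off_target(per_clone_rate, n_clones):
    """Probability that every one of n independent clones is mutated at the same locus."""
    return per_clone_rate ** n_clones


def prob_at_least_one_clone_hit(per_clone_rate, n_clones):
    """Probability that at least one of n independent clones is mutated at that locus."""
    return 1 - (1 - per_clone_rate) ** n_clones


hypothetical_rate = 0.01  # assumed for illustration only
for n in (1, 2, 3):
    print(n,
          prob_all_clones_share_off_target(hypothetical_rate, n),
          prob_at_least_one_clone_hit(hypothetical_rate, n))
```

Under these assumptions the chance that three targeted clones all share the same off-target lesion falls to about one in a million, even though the chance that at least one of them carries it rises slightly. This is why a phenotype that is consistent across several clones is much harder to attribute to an off-target effect than a phenotype seen in a single pair.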

Challenges and outlook
It seems likely that, within a few years, the use of genome editing to generate human cell-based disease models will become a standard, routine approach in the laboratory that could rival the use of genetically modified mice in popularity. Indeed, the former has the advantage of being far more rapid than the latter (involving a timeframe of months instead of years), as well as potentially being better at reflecting human physiology. However, it also carries the disadvantage of being limited to the study of cell-autonomous phenotypes, which will be inadequate for assessing complex physiological conditions. One means of moving beyond 'cells-in-a-dish' studies would entail the incorporation of multiple hPSC-derived differentiated cell types into a single model (Di Giorgio et al., 2008). Another strategy would be to incorporate hPSC-derived cells into chimeric animal models, e.g. replacing a mouse's endogenous hepatocytes with engrafted hPSC-derived hepatocytes (Yusa et al., 2011), thus allowing for interrogation of the effects of human genetic variation in whole-animal models.
As it becomes easier to introduce mutations into hPSCs, it will become feasible to test the effects of the mutations in multiple cell lines with different genetic backgrounds. This will allow investigators to assess the importance of genetic modifiers on disease penetrance, i.e. if a mutation evokes a disease phenotype in some cell lines but not others. In some cases, it might be more informative to start with a patient-specific iPSC line and use genome editing to 'cure' a disease mutation. The most robust possible study design would be to do both: insert a disease mutation into a wild-type cell line (thereby testing for sufficiency of the mutation for disease) and correct the disease mutation in a patient-specific iPSC line (thereby testing for necessity of the mutation for disease) (Liu et al., 2012;Reinhardt et al., 2013).
Finally, genome editing will make it possible to go beyond disease modeling and facilitate the discovery of therapeutics. For example, genome-editing tools should make it straightforward to insert reporters into genomic loci of interest, allowing for RNA-interference screens or small-molecule screens to identify genes and probes that have a desired functional effect. Genome-editing tools might themselves become the therapies, as is the case in the use of ZFNs to disrupt the CCR5 gene in T cells, thereby rendering them resistant to human immunodeficiency virus (HIV) infection and useful for transplantation into HIV-positive individuals (Perez et al., 2008), a strategy that is now in clinical trials. Indeed, the ability to modify the human genome upon demand is so transformative that it certainly will be applied in ways that we can only begin to imagine.