Functional testing of a human PBX3 variant in zebrafish reveals a potential modifier role in congenital heart defects

ABSTRACT Whole-genome and exome sequencing efforts are increasingly identifying candidate genetic variants associated with human disease. However, predicting and testing the pathogenicity of a genetic variant remains challenging. Genome editing allows for the rigorous functional testing of human genetic variants in animal models. Congenital heart defects (CHDs) are a prominent example of a human disorder with complex genetics. An inherited sequence variant in the human PBX3 gene (PBX3 p.A136V) has previously been shown to be enriched in a CHD patient cohort, indicating that the PBX3 p.A136V variant could be a modifier allele for CHDs. Pbx genes encode three-amino-acid loop extension (TALE)-class homeodomain-containing DNA-binding proteins with diverse roles in development and disease, and are required for heart development in mouse and zebrafish. Here, we used CRISPR-Cas9 genome editing to directly test whether this Pbx gene variant acts as a genetic modifier in zebrafish heart development. We used a single-stranded oligodeoxynucleotide to precisely introduce the human PBX3 p.A136V variant in the homologous zebrafish pbx4 gene (pbx4 p.A131V). We observed that zebrafish that are homozygous for pbx4 p.A131V are viable as adults. However, the pbx4 p.A131V variant enhances the embryonic cardiac morphogenesis phenotype caused by loss of the known cardiac specification factor, Hand2. Our study is the first example of using precision genome editing in zebrafish to demonstrate a function for a human disease-associated single nucleotide variant of unknown significance. Our work underscores the importance of testing the roles of inherited variants, not just de novo variants, as genetic modifiers of CHDs. Our study provides a novel approach toward advancing our understanding of the complex genetics of CHDs. Summary: This study uses genome editing in zebrafish to demonstrate that a human DNA sequence variant of unknown significance might contribute to the complex genetics of congenital heart defects.


INTRODUCTION
Whole-genome and exome sequencing efforts are increasingly identifying genetic variation in the general human population as well as candidate genetic variants associated with disease (Lek et al., 2016;Wright et al., 2015). There are several approaches for predicting and testing the pathogenicity of a genetic variant (Cox, 2015;Richards et al., 2015;Starita et al., 2017). However, demonstrating a function of a particular sequence variant remains challenging. Genome editing now allows for the precise engineering of human genetic variants in animal models for rigorous functional testing (Doudna and Charpentier, 2014;Peng et al., 2014). There are examples of the use of CRISPR-Cas9 engineering in mouse and Caenorhabditis elegans animal models to demonstrate functional effects of human disease-associated sequence variants (Arno et al., 2016;DiStasio et al., 2017;Lin et al., 2016;Prior et al., 2017). However, it is not yet clear how effectively human variants of unknown significance can be functionally tested through genome editing in animal models.
Congenital heart defects (CHDs) are a prominent example of a human disorder with complex genetics (Fahed et al., 2013;Gelb and Chung, 2014;Zaidi and Brueckner, 2017). CHDs occur in ∼1% of live births and are the leading cause of infant death owing to birth defects. Intensive studies have uncovered prominent roles for transcription and chromatin factors in heart development and CHDs (Chang and Bruneau, 2011;Fahed et al., 2013;Gelb and Chung, 2014). Cardiac transcription factors of the GATA, HAND, MEF2, NKX, SRF and TBX families are required for heart development in mouse and zebrafish animal models, and mutations in genes encoding these factors can cause human CHDs (Evans et al., 2010;McCulley and Black, 2012;Olson, 2006). Large-scale wholeexome sequencing studies find that CHD cases show an excess of de novo mutations for many transcription and chromatin factors (Homsy et al., 2015;Zaidi et al., 2013). In spite of these efforts, these de novo mutations likely account for only ∼10% of CHDs (Gelb and Chung, 2014;Homsy et al., 2015;Zaidi et al., 2013). Although additional studies are identifying potential contributions of inherited mutations in CHDs (Jin et al., 2017), our understanding of the genetics of CHDs is still incomplete.
The genetics of CHDs is complex, in part, because the same candidate gene, and even the same sequence variant, can be associated with a spectrum of heart malformations and can even be present in control cases (Fahed et al., 2013;Gelb and Chung, 2014;Zaidi and Brueckner, 2017). This is exemplified in studies of NKX2.5 mutations in CHD patients and families (Elliott et al., 2003;McElhinney et al., 2003;Stallmeyer et al., 2010). Thus, genetic risk factors, or modifier genes, likely influence the phenotype of CHDs, but modifier genes are difficult to identify and characterize (Fahed et al., 2013;Gelb and Chung, 2014;Zaidi and Brueckner, 2017). In order to understand the etiology of CHDs and the roles of genetic risk factors in influencing the phenotypic spectrum of CHDs, it is imperative that we increase our understanding of how genetic variants and modifier alleles regulate heart development and contribute to CHDs. Sequence variants in human PBX genes have been identified in patients with CHDs (Arrington et al., 2012;Slavotinek et al., 2017), indicating that these PBX variants could contribute to CHDs. One of these variants, an inherited missense variant in the coding region of the PBX3 gene ( p.A136V; 9:128678097 C>T), occurred at a frequency of 2.6% in a cohort of CHD patients (0.66% in controls; Arrington et al., 2012). These patients exhibited a spectrum of CHDs, particularly outflow tract malformations (Arrington et al., 2012). Pbx genes encode three-amino-acid loop extension (TALE)-class homeodomain-containing DNA-binding proteins, which have diverse roles in development and disease (Cerdá-Esteban and Spagnoli, 2014;Moens and Selleri, 2006). In mouse embryos, Pbx1 is required for heart development, and loss of different combinations of null alleles of Pbx1, Pbx2 and Pbx3 leads to a spectrum of cardiac outflow tract defects Stankunas et al., 2008). Our previous studies have shown that zebrafish Pbx proteins are also needed for outflow tract development, and for early myocardial differentiation and morphogenesis (Kao et al., 2015;Maves et al., 2009). The PBX3 p.A136V variant lies in a highly conserved polyalanine tract, which has been implicated in Pbx binding to histone deacetylase (HDAC) chromatin proteins (Saleh et al., 2000). Although in silico programs predict this variant to be deleterious (Arrington et al., 2012), it is not known whether it affects the function of PBX3 or contributes to CHD. Because this variant is present in controls and has the potential to be inherited, it might represent a modifier or risk factor for CHDs. The PBX3 p.A131V variant is present in the human population with an allele frequency of >0.6% [Arrington et al., 2012;Exome Aggregation Consortium (ExAC), http://exac. broadinstitute.org/], and so it is likely to be excluded from studies of de novo or rare inherited variants associated with CHDs (Homsy et al., 2015;Jin et al., 2017;Sifrim et al., 2016;Zaidi et al., 2013).
Here, we use CRISPR-Cas9 genome editing in zebrafish to test whether the PBX3 p.A131V variant can function as a modifier allele in CHDs. In particular, we test whether this variant enhances the phenotype caused by loss of a cardiac specification factor, Hand2, in zebrafish heart development. Our study is the first example, of which we are aware, of using precision genome editing in zebrafish to demonstrate a function for a human disease-associated single nucleotide variant of unknown significance. Our work provides a proof of principle for using genome editing in zebrafish to test the functions of human DNA variants in complex genetic disease. Our work also underscores the importance of testing the roles of inherited variants as genetic modifiers of CHDs.

RESULTS
Zebrafish pbx4, but not pbx3b, is required for early cardiac morphogenesis An inherited heterozygous variant in PBX3, c.407 C>T, predicting p.Arg136>Val, was previously identified as enriched in a cohort of patients with CHDs (Arrington et al., 2012). The allelic frequency of this variant was significantly less frequent in the study's control population (P=0.047; Arrington et al., 2012), and is also significantly less frequent in the current ExAC database population (P=0.0047, Chi-square test; ExAC, http://exac.broadinstitute.org/). This PBX3 p.A136V variant is present in a highly conserved polyalanine tract, which lies between the two PBC domains that interact with HDAC and Meis proteins ( Fig. 1A; Choe et al., 2009;Saleh et al., 2000). The human and zebrafish Pbx genes that do not show 100% conservation of the polyalanine tract (human PBX4 and zebrafish pbx2 and pbx3a; Fig. 1A) appear to have reduced functional roles in development, as human PBX4 variants in the general population are observed at about the same frequencies as expected by chance (ExAC, http://exac. broadinstitute.org/), and zebrafish pbx2 null mutants are homozygous viable as adults (G.H.F. and L.M., unpublished). However, loss-of-function variants in PBX3 are highly underrepresented in the ExAC database, suggesting that human PBX3 is required for viability. The enrichment of the PBX3 p.A136V variant in a CHD patient cohort suggests that it might contribute a modifier role in the complex genetics of CHDs, and led us to investigate a potential function of this allele in zebrafish.
Engineering zebrafish pbx4 p.A131V variant strains Compared with the human PBX3 p.A136 site, the zebrafish pbx4 gene has an orthologous site of A131 (nucleotide C392; Figs 1A and 2A). To engineer zebrafish carrying the pbx4 p.A131V variant, we turned to the CRISPR-Cas9 system (Blackburn et al., 2013;Hwang et al., 2013aHwang et al., , 2015. Although there have been recent improvements, most studies report a low efficiency of introducing precise base changes using CRISPR-Cas9 in zebrafish (Hoshijima et al., 2016;Shah and Moens, 2016;Zhang et al., 2017Zhang et al., , 2018. Therefore, to optimize our efforts, we first wanted to ensure that we were using efficient CRISPR-component reagents. We began by designing and testing synthetic guide RNAs. We designed guide RNAs that utilized two adjacent potential protospacer adjacent motif (PAM) sequences (NGG) and overlapped the C392 nucleotide ( Fig. 2B,C). Synthetic single-guide RNAs (sgRNAs) were produced in vitro, as previously described (Hwang et al., 2013a), by annealing and ligating pairs of complementary oligonucleotides into a plasmid vector containing a T7 promoter (Table 1, Fig. 2C). To test the ability of each sgRNA to induce insertions or deletions (indels) at the target locus, we injected onecell-stage zebrafish embryos with a mixture of Cas9 mRNA and one of the three sgRNAs, allowed the embryos to develop for 24 h and then collected individual embryos for DNA analysis. Indels were detected by PCR amplifying the region around the target site and digesting the amplicons with PstI, which has a recognition site within the target site ( Fig. 2B,D). We observed a striking difference in the efficacy of the three sgRNAs. The first two sgRNAs produced no or extremely low levels of indels ( Fig. 2D and data not shown). In contrast, 24 of 24 embryos assayed after injection of Cas9+sgRNA-3 had clearly detectable levels of nondigested PCR product, indicating disruption of the PstI restriction site by CRISPR-induced indels (Fig. 2D).
Next, we designed and tested synthetic single-stranded oligodeoxynucleotide donor templates (ssODNs) for introducing the pbx4 p.A131V variant through homology directed repair (HDR) (Hwang et al., 2015). We designed five ssODNs for which the sequences overlap the C392 target site and match the genomic sequence, except at the C392 position, where the ssODNs contain a T residue (corresponding to *C in Fig. 2B; ssODNs listed in Table 1). At the time this work was being done, there was not a clear consensus regarding the design of oligodeoxynucleotides for HDR, with some studies having success with ssODNs with homology arms in the 18-26 nucleotide range (Bedell et al., 2012;Hwang et al., 2013b), and others using ssODNs with homology arms in the 40-49 nucleotide range (Chen et al., 2011;Hruscha et al., 2013). We therefore tested ssODNs with a variety of homology arm lengths. One ssODN had 40 nucleotides matching the genomic sequence on either side of the mismatched T and corresponded to the sense strand (ssODN-1). Two oligodeoxynucleotides had 20-nucleotide homology arms on either side of the base to be changed, with one oligodeoxynucleotide corresponding to the sense strand (ssODN-2) The *C nucleotide is nucleotide 392 in the A131 codon. Two adjacent PAM sequences (TGG and AGG) are underlined and the target site for sgRNA-3 is in uppercase. Surrounding sequence is in lowercase. (C) pbx4 guide RNAs tested. The middle column shows the genomic region for each guide RNA. The target site is in uppercase. The PAM is in bold underlined uppercase. Surrounding sequence is in lowercase. The right column shows the sequence of the variable region of the sgRNA produced after transcription from pDR274, which begins with a 5′ GG dinucleotide. A red 'G' indicates a mismatch with the genomic sequence; a green 'G' indicates a match. (D) DNA restriction fragment length polymorphism (RFLP) analysis of the pbx4 target region after injection of sgRNA-2 or sgRNA-3. The wild-type PCR amplicon is 684 bp and is cleaved to 241 bp and 443 bp fragments by PstI. Embryos were injected with either a low dose (300 pg Cas9+12.5 pg sgRNA) or a high dose (600 pg Cas9+25 pg sgRNA) of RNA. In embryos injected with Cas9+sgRNA-2, only background levels of nondigested PCR product are present, similar to those seen in noninjected embryos. Injection of Cas9+sgRNA-3 results in a large proportion of nondigested PCR product in all embryos assayed (arrow). (E) DNA RFLP analysis of F1 embryos, from CRISPR-Cas9-engineered F0 fish, shows incorporation of the C>T mutation in one embryo (asterisk), owing to F0 germline mosaicism. Digestion of the PCR product is dependent on the C>T base change.
(F) DNA sequencing chromatogram from heterozygous F1 embryo (asterisk by C) showing precise incorporation of the C>T change. (G) Wild-type sequence around the pbx4 C392 site, with the pbx4 sgRNA target site underlined and C392 highlighted. Sequencing of scm14 allele shows precise C392T/A131V editing. and the other to the antisense strand (ssODN-3). Finally, two oligodeoxynucleotides had 25-nucleotide homology arms and corresponded to the sense (ssODN-4) and antisense (ssODN-5) strands. We injected zebrafish embryos with Cas9 mRNA +sgRNA-3+each ssODN, allowed the embryos to develop for 24 h, and then assayed them for the presence of indels (using the PstI digest, as above for the sgRNA assays) and for the introduction of the C>T change. To detect the specific base change, a derived cleaved amplified polymorphic sequences (dCAPS) assay (Neff et al., 1998) was designed, such that a new AccI restriction site is created when C392 is changed to a T. Of the five ssODNs tested, two produced detectable levels of the restriction fragment expected from introduction of the C>T base change in some injected embryos (ssODN-4, 5/12 embryos; ssODN-5, 2/12 embryos). The remaining three oligodeoxynucleotides induced no detectable levels of C>T base change. None of the ssODNs appeared to affect the frequency of induction of indels, with all injection conditions yielding ∼90% of embryos with some level of indels (similar to that shown in Fig. 2D; data not shown).
Based on these results, we then co-injected the pbx4 guide RNA (sgRNA-3), Cas9 mRNA and the ssODN-4 oligodeoxynucleotide into one-cell zebrafish embryos and raised these F0 animals to adulthood. We screened a total of 60 adult F0 fish for those transmitting germline pbx4 p.A131V mutations by assaying their F1 embryos at 24 hpf for incorporation of the C392T mutation, using the same PCR and restriction fragment length polymorphism (RFLP) analysis used in screening the ssODNs (Fig. 2E). Eleven of 60 F0 fish yielded clutches of F1 embryos positive for the C392T change, irrespective of other indels induced at the pbx4 locus. From these 11 F0 fish, we observed that a range of one to six out of 12 F1 embryos was positive for the C392T change (Fig. 2E), yielding an 8-50% rate (average 14%) of germline transmission of the mutation. For these positive F1 embryos, we sequenced the genomic region around the CRISPR target site and found that two of the 11 F0 fish were transmitting the C392T change alone without any indels in pbx4 (9/11 had additional indels). Additional F1 embryos from the two F0 fish that were positive for the precise C392T change were then raised to adulthood and screened for the mutation, using the RFLP assay on F1 adult fin biopsies. The genomic region around the CRISPR target site was again sequenced to identify F1 adult fish heterozygous for the precise desired pbx4 p.A131V variant (Fig. 2F,G).
In addition to identifying F0 and F1 fish carrying the precise C392T change, we identified fish carrying a variety of indels at the pbx4 target site. The mutations we found, consisting of small (1-39 nucleotides) deletions at the target site, are similar to the CRISPR-Cas9-induced indels previously reported in zebrafish (Chang et al., 2013;Gagnon et al., 2014;Hwang et al., 2013a,b;Varshney et al., 2015). However, out of 13 different indels we identified in F1 fish (from 10 different F0 founders) none were insertions (even in combination with a deletion). In addition, a disproportionate number were in-frame deletions (9/13 indels). As we have only examined in detail the mutations resulting from targeting a single gene with CRISPR-Cas9 and an HDR ssODN, we do not know if the ssODN affected the type of indels produced, or if the indels induced were affected by the particular sgRNA used or by the target site sequence.
With this approach, we identified F1 fish, from two independent F0 founder fish, transmitting the precise C392T change. F1 fish carrying the precise change with no other mutations were then bred to wild-type fish to generate a zebrafish strain, pbx4 scm14 , carrying the pbx4 p.A131V variant (Fig. 2G). We sequenced the pbx4 gene in the pbx4 scm14 strain to confirm that there were no other changes introduced in pbx4 (data not shown).
Upon genotyping adult fish derived from a pbx4 scm14/+ ×pbx4 scm14/+ cross, we obtained the expected frequency of pbx4 scm14/scm14 fish (7/32 fish; Chi-square test including all genotypes of the cross, P=0.7788), showing that we obtain viable homozygous adults for the pbx4 p.A131V variant, whereas homozygous null pbx4 b557/b557 animals die at ∼5 days postfertilization (dpf) (Pöpperl et al., 2000). The adult viability suggests that the pbx4 p.A131V variant does not, on its own, lead to a significant defect in heart development (as we further confirm below).
To test the strength of the pbx4 p.A131V allele, we crossed the pbx4 scm14 strain with the null pbx4 b557 strain. From a cross of pbx4 scm14/+ fish with pbx4 b557/+ fish, we observed that transheterozygous pbx4 scm14/b557 fish survive to adulthood and are present at the expected frequency (10/43 fish; Chi-square test including all genotypes of the cross, P=0.2640). Thus, this test does not detect any effect of the pbx4 p.A131V allele. We then used the new pbx4 variant allele strain to test whether the pbx4 p.A131V variant functions as a genetic modifier of other CHD genes. To further test the function of the pbx4 p.A131V allele, we next wanted to additionally perturb the genetic burden of fish carrying the pbx4 p.A131V allele. In particular, we decided to directly test whether the pbx4 p.A131V variant functions as a genetic modifier and enhances the phenotype caused by loss of a known CHD gene, hand2. hand2 encodes a basic helix-loop-helix factor that has critical requirements for embryonic heart development in zebrafish and mice (Srivastava et al., 1997;Yelon et al., 2000), and mutations in the human HAND2 gene have been associated with CHD (Lu et al., 2016;Shen et al., 2010;Sun et al., 2016;Töpf et al., 2014). In zebrafish, hand2 is required for both cardiomyocyte differentiation and early myocardial morphogenesis (Garavito-Aguilar et al., 2010;Schoenebeck et al., 2007;Trinh et al., 2005;Yelon et al., 2000). Furthermore, our previous work used morpholino knockdowns to show that hand2 and pbx4 act together in zebrafish early myocardial morphogenesis (Maves et al., 2009). We therefore crossed our pbx4 p.A131V allele strain ( pbx4 scm14 ) with the established null mutant strain, hand2 s6 (Yelon et al., 2000). We additionally included the pbx4 b557 null allele in these crosses. Fig. 3A shows the genetic crosses used to obtain embryos of all possible homozygous and heterozygous mutant combinations needed for our analyses. We first examined early cardiac development in embryos from these crosses at 24 hpf, using RNA in situ expression of myl7 to examine myocardial precursors. For these experiments, embryos from all crosses were genotyped for pbx4 b557 , pbx4 scm14 and hand2 s6 , using tail tissue from post-in situ-hybridized embryos as described in the Materials and Methods. At 24 hpf, wild-type embryos have formed a heart tube ( Fig. 3B; Staudt and Stainier, 2012), and pbx4 scm14/scm14 embryos also appear to have a normal heart tube (data not shown). As we previously described, pbx4 b557/b557 embryos show an abnormally shaped, medially positioned myocardium ( Fig. 3C; Kao et al., 2015), and hand2 s6/s6 embryos show reduced, medial, crescent-shaped myocardial domains that have not assembled together properly, with defective fusion of the myocardial precursors at the midline ( Fig. 3D; Yelon et al., 2000). We quantified the myocardial fusion defects in these embryos by measuring the distance between the myl7 domains (Fig. 3H). The hand2 s6/s6 myocardial fusion defect is significantly more severe in pbx4 b557/b557 ;hand2 s6/s6 embryos, in that the myl7 domains have greater bilateral separation (Fig. 3D,E,H). We found that pbx4 scm14/b557 ;hand2 s6/s6 embryos also show a more severe effect on myocardial fusion than that seen in hand2 s6/s6 embryos, and, critically, the defect is also more severe than that seen in pbx4 b557/+ ;hand2 s6/s6 embryos, or in pbx4 scm14/scm14 ;hand2 s6/s6 embryos (Fig. 3D-H). These results show that the pbx4 p.A131V variant, combined with a null pbx4 allele, increases the severity of, or enhances, the early myocardial morphogenesis phenotype caused by loss of hand2 function. We analyzed the frequencies of the myocardial phenotypes in embryos from these crosses (Table 2). We classified embryos as having a wild-type heart tube, a medial heart cone (the pbx4 b557/b557 phenotype), a medial crescent (the hand2 s6/s6 phenotype) or a bilateral domain phenotype (the pbx4 b557/b557 ;hand2 s6/s6 phenotype). This analysis also shows that the myocardial fusion defect is more severe in pbx4 scm14/b557 ;hand2 s6/s6 embryos than in hand2 s6/s6 Fig. 3. The pbx4 p.A131V variant enhances myocardial morphogenesis defects caused by loss of hand2. (A) Genetic crosses of zebrafish strains used to obtain embryos for the analyses of pbx4;hand2 mutant embryos. All adult breeder fish used in crosses 1-3 were 'siblings', derived from the same clutch from a group cross. (B-G) Myocardial marker myl7 expression at 24 hpf. Dorsal views; anterior is up. Animal numbers for phenotypic classes are provided in Table 2. (B) The heart tube appears normal in pbx4 +/+ ;hand2 +/+ embryos. (C) pbx4 b557/b557 ;hand2 +/+ embryos have a medial heart cone myl7 domain. (D) pbx4 +/+ ;hand2 s6/s6 embryos show a crescent-shaped myocardial fusion defect of the myl7 domains. (E) A similar phenotype is observed in pbx4 b557/+ ;hand2 s6/s6 . (F,G) The myl7 fusion defect is more severe in pbx4 b557/b557 ;hand2 s6/s6 (F) and pbx4 scm14/b557 ;hand2 s6/s6 (G). (H) Quantitation of fusion defect of myl7 domains in different genetic combinations. myl7 distance measurements were made blind to embryo genotypes. The averages for the myl7 distances among the different genotypes were compared using one-way ANOVA, and P-values were corrected for multiple comparisons using Tukey's test. The boxes extend from the 25th to 75th percentiles, the whiskers are at the minimum and maximum, and the bar within the box represents the median. (I-N) Myocardial marker myl7 expression at 60 hpf. In I-J, ventral views; anterior is up. In K-N, anterior views; dorsal is up. Animal numbers for phenotypic classes are provided in Table 3. (I) The heart appears normal in pbx4 +/+ ;hand2 +/+ embryos. V, ventricle; A, atrium. (J) pbx4 b557/b557 ;hand2 +/+ embryos have dysmorphic hearts with bulges of the ventricular myocardium (arrow). (K) pbx4 +/+ ;hand2 s6/s6 embryos show an abnormally shaped, medial myocardium positioned more caudally between the eyes (asterisks). (L) A similar phenotype is observed in pbx4 b557/+ ;hand2 s6/s6 . (M,N) More severe myl7 bilateral domain phenotypes are observed in pbx4 b557/b557 ; hand2 s6/s6 (M) and pbx4 scm14/b557 ;hand2 s6/s6 (N) embryos. Scale bars: 50 μm. embryos, pbx4 b557/+ ;hand2 s6/s6 embryos or pbx4 scm14/scm14 ;hand2 s6/s6 embryos (Table 2).
Taken together, these results demonstrate that the pbx4 p.A131V variant functions as a genetic modifier and enhances the myocardial morphogenesis phenotypes caused by loss of hand2 function. Therefore, even though the pbx4 p.A131V fish are viable and do not have obvious heart defects in an otherwise wild-type genetic background, our results support our hypothesis that the pbx4 p.A131V allele can function as a genetic modifier in heart development.

DISCUSSION
Here, we used CRISPR-Cas9 precision genome editing to successfully engineer a zebrafish strain with a PBX3 p.A131V variant of unknown significance that was previously identified in a cohort of patients with CHDs. We engineered the human variant into the homologous pbx4 p.A131V site in zebrafish. We used the engineered zebrafish to demonstrate that the pbx4 p.A131V allele acts as a genetic enhancer of the known CHD gene hand2. Our work underscores the importance of testing the roles of inherited variants as genetic modifiers of CHDs. Our study is the first example, of which we are aware, of using precision genome editing in zebrafish to demonstrate a function for a human disease-associated variant of unknown significance. Our work helps advance our understanding of the complex genetics of CHDs, and also provides an example of using genome editing in zebrafish to test the causal roles and genetic interactions of human disease-associated DNA variants.
Many approaches in zebrafish and mammalian models have previously been used to characterize the functions of human heart disease-associated DNA sequence variants. One study showed that Wild-type heart tube  zebrafish embryos could be used to characterize heart function defects caused by a human sodium channel gene SCN5A variant, which is strongly associated with human heart disorders such as arrhythmias (Huttner et al., 2013). However, this study employed overexpression of the human gene in transgenic zebrafish. Zebrafish embryos have often been used to test the functions of potential human disease variants through complementation, in which the orthologous zebrafish gene is typically knocked out through genetic mutation or knocked down with antisense morpholinos, and then mRNA injections of a human wild-type or mutant form are used to attempt rescue of a phenotype (Davis et al., 2014). However, owing to the inherent transient and mosaic nature of the mRNA injection and overexpression approach, these complementation assays have challenges for examining genetic interactions. A CHD-associated GATA4 variant has been functionally characterized in a mouse model (Misra et al., 2012), but this was done with conventional embryonic stem cell targeting, which incorporates a targeting vector into the genome. This same GATA4 variant was also functionally characterized in patient-derived cardiomyocytes induced from pluripotent stem cells, and this analysis provided a deep, systems-level mechanistic understanding of how this particular variant disrupts cardiomyocyte gene expression and function (Ang et al., 2016). However, such cell culture models also have challenges for examining genetic interactions and might not reveal the full effects of a variant on a gene's function in vivo.
Although the development of CRISPR-Cas9 technology has made possible the creation of targeted single-nucleotide changes in zebrafish, the process is still inefficient. Our studies support optimization of the component reagents for successful singlenucleotide editing. We found that different sgRNAs and ssODNs had very different efficacies. Of the three sgRNAs tested, two failed to induce indels at a level we could detect with an RFLP assay, whereas the third resulted in readily detectable indels in nearly all injected embryos. The highly effective sgRNA has a 20-nucleotide targeting region with a single mismatch at the second position, whereas the ineffective sgRNAs had 20-nucleotide targeting regions with two additional mismatched nucleotides at their 5′ ends. Although sgRNAs similar to these (two free Gs 5′ to a 20-nucleotide protospacer) have been used in zebrafish effectively (Hwang et al., 2013b), most published studies have used sgRNAs with 20-nucleotide protospacers and no additional 5′ mismatched bases. One-or two-base mismatches at or near the 5′ end of the protospacer have been shown to be tolerated (Cong et al., 2013;Gagnon et al., 2014;Hwang et al., 2013b), and a guanine immediately 5′ to the PAM sequence has been shown to correlate positively with indel frequencies (Farboud and Meyer, 2015;Gagnon et al., 2014). Thus, the high incidence of indels seen with pbx4 sgRNA-3 might result from a combination of optimal protospacer length, the presence of only a single mismatch near the 5′ end of the protospacer and a G nucleotide in the final position of the protospacer before the PAM. Less is known about the design of homology-directed repair templates to maximize precise editing, and there is conflicting evidence as to whether ssODNs are more efficient at introducing single-nucleotide changes than long doublestranded DNA repair templates (Bedell et al., 2012;Boel et al., 2016;Chen et al., 2011;Hoshijima et al., 2016;Hruscha et al., 2013;Hwang et al., 2013b;Irion et al., 2014;Liang et al., 2017;Zhang et al., 2018). Previous studies have achieved templatemediated repair using ssODNs of a similar length to, or longer than, our ssODN-4 (Boel et al., 2016;Hruscha et al., 2013;Hwang et al., 2013b;Liang et al., 2017). Wild-type heart Numbers indicate the total number of embryos of each genotype with the indicated myocardial phenotype class. Our study provides further support for crucial roles for PBXrelated TALE-class homeodomain transcription factors in heart development and CHDs. Pbx1, Pbx2 and Pbx3 proteins and the Pbx-related factor Meis1 all contribute to outflow tract development in mouse embryos Stankunas et al., 2008). Zebrafish Pbx and Meis genes are also needed for proper heart development (Guerra et al., 2018;Kao et al., 2015;Maves et al., 2009;Paige et al., 2012). In human cardiomyocyte cell culture models, DNA binding sites for PBX and MEIS factors have been identified as enriched in open chromatin and occurring nearby sites for other cardiac transcription factors (He et al., 2011;Paige et al., 2012;Stergachis et al., 2013;Wamstad et al., 2012). Human sequencing studies have found PBX and MEIS gene variants associated with CHDs (Arrington et al., 2012;Crowley et al., 2010;Louw et al., 2015;Slavotinek et al., 2017). Notably, a recent study described five patients, each with a CHD and other congenital anomalies, that all had de novo missense or nonsense sequence variants in PBX1 (Slavotinek et al., 2017). The missense PBX1 variants occurred in or near the homeodomain and likely affect the transcriptional capabilities of PBX1 (Slavotinek et al., 2017). The PBX3 p.A136V variant that we addressed here is in the more N-terminal polyalanine tract, which might bind HDAC proteins (Saleh et al., 2000;Choe et al., 2009). The polyalanine tract is adjacent to the PBC domains, which interact with Meis and HDAC proteins (Choe et al., 2009;Mann et al., 2009;Saleh et al., 2000). Even though one caveat of our study is that we model the human PBX3 variant in the zebrafish pbx4 locus, the polyalanine tract is highly conserved in both PBX3 and Pbx4. As future studies identify additional inherited and de novo variants in PBX genes associated with CHDs, we will gain a better understanding of which human PBX genes, and which PBX protein domains, are needed for human heart development.
Our study also provides further support for the oligogenic basis of CHDs. Previous studies of Pbx genes in mouse heart development supported a multigenic basis for CHDs . Family members carrying the same HAND2 mutation can exhibit different cardiac defects (Sun et al., 2016), supporting a role for additional genetic changes influencing the severity of CHDs associated with HAND2 mutation. In order to reveal a contribution of the pbx4 p.A131V variant in zebrafish heart development, we had to employ an additional null allele of pbx4 as well as homozygous loss of hand2, conditions that are likely to not be present in human CHD cases. Our efforts underscore the challenges of modeling the functions of human disease-associated genetic variants in animal models. Mutations that are haploinsufficient in humans can often be tolerated when heterozygous in mice (Cassa et al., 2017;Georgi et al., 2013). There are increasing numbers of examples showing that to produce disease-related phenotypic effects of a specific locus in mice or zebrafish animal models, additional or more severe alleles need to be utilized (Bradford et al., 2017;Cassa et al., 2017;Distasio et al., 2017;Varga et al., 2018;Zhang et al., 2014). For example, human patients homozygous for a variant in COPB2 exhibited microcephaly, whereas a mouse model required the Copb2 variant allele to be combined with a Copb2 null allele to cause brain malformations (Distasio et al., 2017). Furthermore, we do not yet understand the full genetic burden of variants in human CHD cases. A recent study found evidence for rare inherited variants, in genes with a known association with cardiac malformations and in parent-child trios affected by atrioventricular septal defects, and these particular rare variants were generally not observed in control trios (Priest et al., 2016). This and our study support the idea that inherited variants across multiple genetic loci might contribute to the penetrance and expressivity of CHDs, and provide further support to the oligogenic inheritance of CHDs (Fahed et al., 2013;Gelb and Chung, 2014).
Forward genetics in mice was used recently to identify an interaction between two genes, Sap130 and Pcdha9, in causing hypoplastic left heart syndrome (HLHS) (Liu et al., 2017). Intriguingly, this study also identified one human HLHS patient with variants in both SAP130 and PCDHA13, a Pcdha9 homolog (Liu et al., 2017). In mice, Sap130 and Pcdha9 might work combinatorially, each regulating different pathways important for proper heart development (Liu et al., 2017). Although we previously described pbx4 and hand2 as working together in zebrafish heart development (Maves et al., 2009), it is possible that they are also working through parallel pathways. Future studies that continue to take advantage of animal models, as well as human genetic sequence data, will further enhance our understanding of the complex genetics of human diseases such as CHD.

Zebrafish husbandry
All experiments involving live zebrafish (Danio rerio) were carried out in compliance with Seattle Children's Research Institute Institutional Animal Care and Use Committee guidelines. Zebrafish were raised and staged as previously described (Westerfield, 2000). Staging time refers to hpf at 28.5°C. The wild-type stock and genetic background used was AB. The pbx4 b557 mutant strain was previously described and is likely a null allele (Pöpperl et al., 2000). pbx4 b557 genotyping was performed as previously described (Kao et al., 2015) using forward primer 5′-ACTCGGCGGACT-CTCGCAAGC-3′ and reverse primer 5′-GGCTCTCGTCGGTGATGG-CCATGATCT-3′ primers. The genotyping PCR product is 128bp; digesting with XbaI yields a 98bp product from the mutant allele. In some cases, pbx4 b557 animals were genotyped using a KASP assay (LGC Genomics). Reactions were run and fluorescence was measured in a Bio-Rad CFX96, and genotypes were assigned using the Bio-Rad CFX Manager software, according to instructions provided by LGC Genomics. Details for ordering the pbx4 b557 KASP assay are available upon request. The hand2 s6 mutant strain was previously described and is a deletion of the hand2 locus (Yelon et al., 2000). For hand2 s6 genotyping, we devised a quantitative PCR assay for the number of copies of the hand2 gene. hand2 was amplified with forward primer 5′-ACCAAAGCGTACTCC-GTCTG-3′ and reverse primer 5′-CAGCGAAGGAATAGCCGTCA-3′. pbx4 was also amplified as a normalization control using forward primer 5′-GCCGTTAAAACAGCCGTGG-3′ and reverse primer 5′-GTGTTGC-TGGAGAGTTTGCC-3′. Reactions were run in duplicate for both genes using the KAPA SYBR FAST kit (KAPA Biosystems KK4600) on a Bio-Rad CFX96 machine. The resulting Ct values were averaged, and a ΔCt was calculated to distinguish hand2 homozygous, heterozygous and wild-type embryos.

Pbx phylogenetic analysis
Accession numbers for the sequences aligned in Fig. 1A are provided in Table 4. Sequences were aligned using Clustal Omega (www.EMBL.org).

Quantitative reverse transcription PCR
Total RNA was isolated using TriZol (Ambion; Thermo Fisher Scientific) and reverse-transcribed with a SensiFAST cDNA Synthesis kit (Bioline BIO-65053). Primers were designed using Primer-BLAST such that they either span an intron or one of the pair spans an exon-exon boundary. Primers are listed in Table 5. Quantitative reverse transcription PCR (qRT-PCR) was carried out using a KAPA SYBR FAST kit (KAPA Biosystems KK4600) on a Bio-Rad CFX96 machine. For the analysis of Pbx expression levels in wild-type embryos at different stages, Ct values for the zebrafish Pbx genes were corrected for observed primer efficiencies and normalized to odc1. For the analysis of Pbx gene expression in pbx3b scm8 mutants, Pbx genes were normalized to eef1a1l1 and ΔΔCt values were calculated for four wild-type and four mutant replicates, each consisting of pools of eight 48 hpf embryos. P-values were calculated with an unpaired Student's t-test with Welch's correction for unequal population standard deviations, using the ΔΔCt values. The graph was constructed by log transforming the ΔΔCt values. Statistical analysis was performed and the box plot was made in GraphPad Prism 7. The boxes extend from the 25th to 75th percentiles, the whiskers are at the minimum and maximum, and the bar within the box represents the median.

Generation of mutant zebrafish strains with CRISPR-Cas9
Single-guide constructs were made by annealing pairs of oligonucleotides (listed in Table 1) and ligating them into BsaI-digested pDR274 (Hwang et al., 2013a). The single-guide plasmids were digested with DraI, and guide RNA was transcribed with the T7 Maxiscript kit (Ambion; Thermo Fisher Scientific). Cas9 mRNA was made by transcribing PmeI-digested pMLM3613 (Hwang et al., 2013a) with the T7 Ultra kit (Ambion; Thermo Fisher Scientific). One-cell-stage zebrafish embryos were injected with 600 pg (for pbx3b) or 400 pg (for pbx4) of Cas9 mRNA and 25 pg of an sgRNA in a volume of 2 nl. To generate the desired single-nucleotide change in pbx4, 50 pg of an ssODN (Table 1) was co-injected with Cas9 mRNA and the sgRNA. Zebrafish were screened for Cas9-generated mutations (insertions or deletions, 'indels') by amplifying with primers flanking the target site (indel assay primers, Table 1) followed by a digest to test for loss of a restriction site just upstream of or overlapping the PAM site (HpyAV for pbx3b, PstI for pbx4). To detect the single-nucleotide change in pbx4, a dCAPS assay was designed (http://helix.wustl.edu/dcaps/; Neff et al., 1998) such that PCR amplification of the mutant allele generated an AccI site ( pbx4 dCAPS assay primers in Table 1). Mutant alleles were identified in heterozygous F1 or F2 fish by Sanger sequencing and deconvolving the chromatograms with Poly Peak Parser (http://yosttools. genetics.utah.edu/PolyPeakParser/).

Immunoblotting
Embryos were obtained from an incross of pbx3b scm8/+ ;pbx4 b557/+ fish. At 48 hpf, tail tips were cut from the embryos and arrayed in 96-well plates for genotyping. The remaining portions of the embryos were arrayed in separate 96-well plates, to which sodium dodecyl sulfate (SDS) sample buffer was added and stored at −20°C until genotyping was completed, at which point the embryo lysates were pooled by genotype. Approximately one embryo equivalent was separated by reducing SDS-polyacrylamide gel electrophoresis and blotted as previously described (Maves et al., 2007). Blots were probed with anti-pan-Pbx (rabbit serum, 1:500; Pöpperl et al., 2000) and with anti-Actin as a loading control (1:500; Clone C4, Millipore). The anti-pan-Pbx antibody was raised and validated against multiple zebrafish Pbx proteins (Maves et al., 2007;Pöpperl et al., 2000;Waskiewicz et al., 2002). Infrared dye-labeled secondary antibodies (Rockland) were visualized using a LI-COR Odyssey infrared scanner.
Whole-mount RNA in situ hybridization and measurement analysis The following cDNA probes were used: myl7 (Yelon et al., 1999) and elnb (Miao et al., 2007). Whole-mount in situ hybridization colorimetric and fluorescent in situ staining was performed as previously described (Maves et al., 2007;Talbot et al., 2010), with the following modifications. Embryos at 60 hpf were depigmented in 1 part 0.1% KOH (vol.): 1 part 1× PBS-0.1% Tween (vol.): 0.1 part 30% hydrogen peroxide (vol.) for 2 h at room temperature with gentle agitation. Hybridizations for both colorimetric and fluorescent in situ experiments were performed in hybridization buffer with 5% dextran sulfate. Following staining, tail clips from post-in situhybridized embryos were lysed and genotyped for pbx3b scm8 , pbx4 b557 , pbx4 scm14 and hand2 s6 , as above. Imaging was performed on a Leica SP5 confocal microscope with a 40× water immersion objective. For myl7 measurements at 24 hpf, we imaged genotyped embryos. Sample sizes were not pre-determined, and no animals were excluded from the analysis. Embryos stained for myl7 were individually imaged at a consistent magnification on a stereomicroscope. ImageJ (https://imagej.nih.gov/ij/) was used to measure the area of myl7 expression and the distance between bilateral domains of expression, expressed in pixels. The investigator making the measurements was blinded to the sample genotype. For embryos with separate bilateral myl7 domains, three distance measurements were made for each embryo: between the medial edges of expression at the anterior extent of expression, between the medial edges of expression at the posterior extent of expression, and at the approximate midpoint between the anterior and posterior measurements. These three values were then averaged. For embryos with a single midline domain of myl7 expression, a distance value of 0 was assigned. For embryos with a crescent-shaped domain, the posterior distance measurement was 0 and the middle and anterior measurements were made as for separate bilateral domains. The averages for the myl7 distances among the different genotypes were compared using one-way ANOVA, and P-values were corrected for multiple comparisons using Tukey's test. ANOVA was performed and the box plots were made in GraphPad Prism 7. The boxes extend from the 25th to 75th percentiles, the whiskers are at the minimum and maximum, and the bar within the box represents the median.