Hot Keywords
Rett Syndrome Intellectual disability mitochondrial disease next-generation sequencing language disorders monogenic epilepsies early-onset epilepsy movement disorders autism suicide schizophrenia

Top
J Transl Genet Genom 2021;5:112-123.10.20517/jtgg.2021.09© The Author(s) 2021.
Open AccessOriginal Article

Duo Shared Genomic Segment analysis identifies a genome-wide significant risk locus at 18q21.33 in myeloma pedigrees

1Huntsman Cancer Institute, Salt Lake City, UT 84112, USA.

2University of Utah School of Medicine, Salt Lake City, UT 84112, USA.

3Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905, USA.

Correspondence Address: Dr. Nicola J. Camp, Huntsman Cancer Institute, 2000 Circle of Hope, Rm 4757, Salt Lake City, UT 84112, USA. E-mail: Nicki.Camp@hci.utah.edu

    This article belongs to the Special Issue Genetics of Blood Cancers: from Etiology to Treatment
    Views:409 | Downloads:95 | Cited:0 | Comments:0 | :2
    Academic Editor: Susan L. Slager | Copy Editor: Yue-Yue Zhang | Production Editor: Yue-Yue Zhang
    ...

    © The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

    Abstract

    Aim: High-risk pedigrees (HRPs) are a powerful design to map highly penetrant risk genes. We previously described Shared Genomic Segment (SGS) analysis, a mapping method for single large extended pedigrees that also addresses genetic heterogeneity inherent in complex diseases. SGS identifies shared segregating chromosomal regions that may inherit in only a subset of cases. However, single large pedigrees that are individually powerful (at least 15 meioses between studied cases) are scarce. Here, we expand the SGS strategy to incorporate evidence from two extended HRPs by identifying the same segregating risk locus in both pedigrees and allowing for some relaxation in the size of each HRP.

    Methods: Duo-SGS is a procedure to combine single-pedigree SGS evidence. It implements statistically rigorous duo-pedigree thresholding to determine genome-wide significance levels that account for optimization across pedigree pairs. Single-pedigree SGS identifies optimal segments shared by case subsets at each locus across the genome, with nominal significance assessed empirically. Duo-SGS combines the statistical evidence for SGS segments at the same genomic location in two pedigrees using Fisher’s method. One pedigree is paired with all others and the best duo-SGS evidence at each locus across the genome is established. Genome-wide significance thresholds are determined through distribution-fitting and the Theory of Large Deviations. We applied the duo-SGS strategy to eleven extended, myeloma HRPs.

    Results: We identified one genome-wide significant region at 18q21.33 (0.85 Mb, P = 7.3 × 10-9) which contains one gene, CDH20. Thirteen regions were genome-wide suggestive: 1q42.2, 2p16.1, 3p25.2, 5q21.3, 5q31.1, 6q16.1, 6q26, 7q11.23, 12q24.31, 13q13.3, 18p11.22, 18q22.3 and 19p13.12.

    Conclusion: Our results provide novel risk loci with segregating evidence from multiple HRPs and offer compelling targets and specific segment carriers to focus a future search for functional variants involved in inherited risk formyeloma.

    Introduction

    Multiple myeloma (MM) is the second most common adult-onset lymphoid neoplasm and has the worst 5-year survival[1]. Inherited germline susceptibility for MM is consistently supported[2]: excess MM risk among relatives has been observed in family aggregation[3,4], epidemiologic case-control[5-9], and registry-based[10,11] studies. Disease rarity, short survival, clinical and locus heterogeneity challenge study ascertainment and genetic discovery[12]. Genome-wide association studies have identified 23 loci harboring common-risk single nucleotide polymorphisms (SNPs) for MM[13-19]. Family-based studies have identified rare germline variants in ARID1A and USP45[20], KDM1A[21], and DIS3[22] in exome sequencing. However, considerable missing heritability remains. Additional approaches are needed to aid the detection of the remaining risk loci and genes.

    We recently described a novel strategy to map genes involved in complex disease risk using extremely large high-risk pedigrees and allowing for intra-familial heterogeneity, called Shared Genomic Segment (SGS)[20]. Cases sharing genomic segments from a common ancestor through 15 meioses or more are unexpected at a genome-wide level[23], and hence a single large high-risk pedigree (HRP) can provide the power to identify novel loci with genome-wide significance[24]. Our resource of eleven large myeloma pedigrees included several with 3-4 cases and meioses in the 8-14 range[20]. While these remain extremely large families, they may lack sufficient power individually for genome-wide significance. Also, a multi-pedigree strategy is attractive. Evidence for the same risk locus in two extended pedigrees adds confidence to the locus and can build on the power of both. The remaining challenge for any multi-pedigree approach, however, is to adequately address heterogeneity between pedigrees[25].

    Here, we expand the SGS method based on combining evidence from pairs of HRPs, while still allowing for intra-familial heterogeneity within each pedigree. In our approach, duo-SGS, we fix one pedigree and optimize over all pedigree pairs to balance discovery with multiple testing. Both pedigrees must have a segregating genomic segment at the same risk locus. The method is robust to allelic heterogeneity as different alleles at the same locus may be shared within each pedigree. We apply the duo-SGS approach to eleven MM HRPs to identify novel loci involved in myeloma risk.

    Methods

    Duo-SGS method

    An overview of the duo-SGS approach can be found in Figure 1. After identifying HRPs and genotyping cases, the observed shared genomic segments in single pedigrees are established and compared between pedigrees, and genome-wide thresholds are determined.

    Figure 1. Overview of the duo-SGS method. SGS: Shared genomic segment; SNP: single nucleotide polymorphism.

    Observed duo-SGS sharing

    The single pedigree SGS approach has been described previously[20]. Briefly, the single SGS approach identifies shared observed genomic segments by defining consecutive runs of SNPs that are identity-by-state in a group of cases (Figure 1, Step 2). If the length of an observed segment is significantly longer than it would be by chance, inherited sharing (identity-by-descent) is implied. The nominal significance of each segment is assessed empirically. Expected length sharing under the null hypothesis is generated using a gene-dropping algorithm (Figure 1, Step 3). Chromosomes are assigned to the pedigree founders (those with no parents in the pedigree) randomly and according to a population linkage disequilibrium model. These simulated chromosomes are “dropped” through the pedigree structure using Mendelian inheritance expectations according to a genetic map for recombination. All members of the pedigree receive genotypes under the null hypothesis, and simulated genomic segments from this null configuration are established. These simulations are repeated at least one million times. The empirical P-value for an observed segment is the proportion of simulated segments that are identical or encompass the observed segment to the number of simulations. All subsets of at least two cases within a pedigree are assessed for observed segments. Then, at every position across the genome, the best evidence (lowest empirical P-value) for an excessive length of sharing is established (Figure 1, Step 4). This process results in a final optimized set of shared segments for a single pedigree. Each optimal segment corresponds to a specific subset of cases and has a nominal empirical P-value.

    For two pedigrees, the duo-SGS evidence is the combination of the nominal empirical P-values for the optimal segments at the same genome position in the two pedigrees. Specifically, the Fisher method to combine P-values was used. All possible pedigree pairs could be considered as separate analyses, but there are pedigree pairs (ways to select 2 pedigrees from n total pedigrees), and hence multiple testing can rapidly become an issue. Alternatively, a single analysis comprising optimization across all pedigree pairs could be considered, but this global approach may cloud individual pedigree-pair findings. To balance these two extremes, we propose a fixed-pedigree duo-SGS strategy (Figure 1, Step 5). The procedure is as follows: (1) fix a pedigree of interest; (2) calculate genome-wide duo-SGS evidence for the fixed pedigree with each of the other pedigrees; and (3) optimize across the duo-SGS findings to identify the most significant duo-SGS result at each point across the genome. The optimized findings over pedigree pairs are the duo-SGS results for the fixed pedigree. In this approach, we identify the best two-pedigree results that include the fixed pedigree. The procedure is then repeated for each pedigree, thus producing duo-SGS results for each pedigree.

    Genome-wide thresholds for duo-SGS

    Critical to interpreting the observed duo-SGS results are genome-wide significance duo-SGS thresholds for each pedigree (Figure 1, Step 6). To establish these, we echo the same optimization process in null data. Establishing these thresholds is similar to the calculation described for the single pedigree SGS method[20]. Under the reasonable assumption that the vast majority of the genome represents chance sharing (i.e., most of the genome does not contain a disease risk gene) we model the distribution for null sharing on the distribution of the empirical P-values for each pedigree. To avoid comparing the findings to themselves or skewing to possible true-positives, the empirical-P-values are perturbed, and the distribution-fitting is performed at 1 million simulations. The latter is to avoid inappropriate distribution-fitting to extreme outliers, the few results from the alternate hypothesis if included at their final resolution. To perturb an empirical P-value we determine its Wilson score 95% confidence interval (CI) (Equation 1) and randomly sample a value from within it.

    where p̂ is the empirical P-value, z is 1.96 (for the 95%CI), and n is the number of simulations (here, 1,000,000). The Wilson interval was selected because it always produces non-negative confidence bounds for the P-values. The genome-wide set of perturbed empirical P-values for a pedigree are considered the “null” P-values for that single pedigree. The duo-SGS procedure (described above) is performed using the single pedigree genome-wide null P-values. The result of this process is a set of optimal duo-SGS null P-values.

    Genome-wide significant and suggestive thresholds are determined following our previously described method for single pedigree SGS[20]. Briefly, the null duo-SGS P-values are log-transformed and fitted to a gamma distribution. The shape (k) and rate (σ) parameters of the fitted distribution are applied using the Theory of Large Deviations to calculate the significance thresholds by solving:

    where μ(X) is the genome-wide false positive rate, C is the number of chromosomes, α(X) is the probability of exceeding , and G is the genome length in Morgans[26]. The false-positive rate is set to 0.05 for the genome-wide significant threshold and 1.0 for the genome-wide suggestive threshold. After solving for X, the threshold, T is determined by . Thresholds are specific to each fixed pedigree to assess their duo-SGS results.

    MM high-risk pedigrees

    The statewide Utah Cancer Registry (UCR) has been an NCI-supported Surveillance, Epidemiology, and End Results (SEER) Program registry since its inception in 1966. The UCR was utilized to invite all individuals with myeloma in the state to participate. Peripheral blood was collected for DNA extraction from individuals who completed informed consent.

    The Utah Population Database (UPDB) is a unique resource[27]. It includes a 16-generation genealogy of approximately 5 million people with at least one event in Utah that is record-linked to the UCR and state vital records. Using the UPDB, ancestors whose descendants have an excess of disease based on internal cancer rates and years at risk can be identified and studied as HRPs. The UPDB was used to identify ancestors whose descendants showed a statistical excess of MM (P < 0.05). The expectation was based on internal disease rates based on birth cohort, sex, birthplace (in/outside Utah), and years at risk. The total number of myeloma cases in each HRP identified ranged from 4 to 37 cases. After annotating the pedigrees with those with DNA, 11 pedigrees were identified to contain 3 or 4 myeloma cases with DNA (28 individuals; 8 individuals were in more than one pedigree). In each pedigree, the cases were separated by 8 to 23 meioses.

    DNA from the 28 cases was genotyped on the Illumina Omni Express high-density SNP array at the University of Utah. Only high-quality bi-allelic SNPs and individuals with adequate call rates across the genome were included. The PLINK software[28] was used for quality control. SNPs with < 95% call rate across the 28 individuals were removed. After filtering, 678,447 SNPs remained. These SNPs were transformed to match 1000Genomes strand orientation.

    Individuals were removed if < 90% of the filtered SNPs are called. One myeloma case had a < 90% call rate and was eliminated from the study. We also checked for sex inconsistency based on the genotypes - all cases passed. PLINK relationship estimates were compared with the UPDB pedigree structures - no issues were found.

    The duo-SGS method was applied to the MM pedigrees to identify regions with genome-wide suggestive or significant evidence. Post-hoc, some duo-SGS regions were removed from consideration. Duplicate regions occur when the same pair of pedigrees identify the same region in both their fixed-pedigree results. In these situations, duo-SGS P-values are identical, but thresholds vary by which pedigree is fixed, potentially leading to different significance levels. The most significant result was reported, and the lesser removed. If an individual resided in two pedigrees and also shared the region in both pedigrees, the region was removed. If the region spanned a centromere, it was removed. Forty-two suggestive regions were removed as duplicates, involving an overlap individual or at the centromere.

    Results

    Duo-SGS findings were identified for each of the eleven MM HRPs. The significance thresholds for each fixed pedigree are in Table 1. One region at 18q21.33 reached genome-wide significance and 13 regions were genome-wide suggestive. Table 2 shows the details of the significant or suggestive regions identified, including the duo-SGS P-value, expected rate per genome μ(t), the two pedigrees involved, each segregating shared region in the pedigrees, and the overlapping region.

    Table 1

    Multiple myeloma high-risk pedigrees and duo-SGS thresholds

    PedigreeMultiple myeloma casesDuo-SGS thresholds
    TotalGenotypedMeiosesSignificantSuggestive
    260313163.82 × 10-84.31 × 10-7
    212253183.10 × 10-83.66 × 10-7
    482343131.02 × 10-78.92 × 10-7
    2024543138.56 × 10-87.71 × 10-7
    34955123163.94 × 10-84.35 × 10-7
    48833204231.01 × 10-81.21 × 10-7
    546699142112.21 × 10-71.92 × 10-6
    549917184211.11 × 10-81.29 × 10-7
    571744373202.23 × 10-82.90 × 10-7
    57683494162.80 × 10-82.61 × 10-7
    65162663138.32 × 10-87.50 × 10-7
    Table 2

    Duo-SGS genome-wide significant and suggestive regions

    LocusDuo-SGSFixed pedigreeDuo pedigree
    P-valueμ(t)MBRegionPed IDP-valueMBRegionNPed IDP-valueMBRegionN
    1q42.21.08 × 10-60.4640.34233,695,394-234,039,8575466991.62 × 10-31.22233,288,913-234,508,70425499173.80 × 10-50.34233,695,394-234,039,8574
    2p16.12.43 × 10-70.9110.1156,267,182-56,374,9105768346.70 × 10-50.8356,267,182-57,098,1394488331.89 × 10-41.0555,327,407-56,374,9103
    3p25.22.81 × 10-70.9641.0712,006,892-13,075,7605717441.80 × 10-42.0311,092,264-13,117,9992488338.20 × 10-51.0712,006,892-13,075,7603
    5q21.35.36 × 10-80.3540.02107,141,104-107,158,1585499174.30 × 10-40.36106,800,148-107,158,1583488336.00 × 10-60.85107,141,104-107,994,5654
    5.27 × 10-70.6300.64107,141,104-107,784,6846516264.78 × 10-30.64107,140,556-107,784,6843
    3.15 × 10-70.8400.85107,141,104-107,994,56521222.78 × 10-31.69107,118,121-108,803,3192
    5q31.11.99 × 10-70.3960.30133,295,892-133,591,1272601.83 × 10-40.58133,011,503-133,591,12735499175.60 × 10-51.30133,295,892-134,597,2673
    6q16.18.60 × 10-80.2501.7198,528,404-100,243,0335717442.50 × 10-61.7598,495,952-100,243,03335499171.70 × 10-32.7898,528,404-101,304,4112
    7.81 × 10-80.1211.2399,013,740-100,243,033349551.53 × 10-31.5999,013,740-100,606,56825717442.50 × 10-61.7598,495,952-100,243,0333
    3.99 × 10-70.3410.9898,495,952-99,478,30448238.55 × 10-32.3997,093,259-99,478,3042
    6.03 × 10-70.7520.4898,495,952-98,980,1136516261.32 × 10-20.8098,177,018-98,980,1133
    2.31 × 10-70.8560.3599,893,527-100,243,0335768344.81 × 10-30.4099,893,527-100,293,5313
    6q261.61 × 10-70.3790.48162,474,893-162,957,57521221.53 × 10-31.04162,250,522-163,291,3002488335.33 × 10-60.48162,474,893-162,957,5754
    1.30 × 10-60.5950.23162,724,247-162,957,5755466991.40 × 10-20.82162,724,247-163,544,7052
    6.20 × 10-70.7520.31162,615,606-162,928,898202456.38 × 10-30.31162,615,606-162,928,8983
    8.73 × 10-70.9710.28162,474,893-162,752,05048239.17 × 10-30.30162,450,471-162,752,0503
    7q11.234.09 × 10-80.2551.3775,441,794-76,813,2985499172.90 × 10-51.6675,441,794-77,098,7723488336.70 × 10-51.5975,222,070-76,813,2983
    1.50 × 10-60.7201.6675,442,855-77,098,7725466993.00 × 10-32.2275,442,855-77,663,48625499172.90 × 10-51.6675,441,794-77,098,7723
    12q24.314.50 × 10-70.4940.11121,438,844-121,547,455202459.34 × 10-41.16120,388,858-121,547,45535717442.60 × 10-51.59121,438,844-123,030,7883
    13q13.32.66 × 10-70.6890.1336,297,517-36,424,69821223.60 × 10-51.1235,308,299-36,424,6983488333.87 × 10-40.2436,297,517-36,537,6354
    18p11.226.48 × 10-70.7990.179,693,654-9,866,696202452.90 × 10-30.199,693,654-9,878,67035717441.23 × 10-50.589,285,824-9,866,6963
    18q21.337.30 × 10-90.0290.8558,208,260-59,059,2625499172.80 × 10-51.2257,945,602-59,167,8363488331.14 × 10-50.8558,208,260-59,059,2624
    18q22.32.72 × 10-70.5690.2068,834,971-69,039,896349552.80 × 10-40.6868,357,156-69,039,8963488335.10 × 10-50.4368,834,971-69,260,5664
    19p13.129.82 × 10-70.4060.4415,773,427-16,217,1075466992.64 × 10-31.3215,773,427-17,090,61225717442.10 × 10-50.5715,648,456-16,217,1073

    The genome-wide significant region at 18q21.33 [duo-SGS P = 7.3 × 10-9, μ(t) = 0.029] was found in pedigree pair UT-549917/UT-48833. A 1.2 Mb chromosomal segment (chromosome 18 57,945,602-59,167,836 bp) segregated to three MM cases separated by 17 meioses in pedigree UT-549917 (single pedigree P = 2.8 × 10-5). A nested 0.8 Mb chromosomal segment (58,208,260-59,059,262 bp) was observed in four MM cases separated by 23 meioses in pedigree UT-48833 (single pedigree P = 1.1 × 10-5). The intersecting 0.8 Mb region overlaps one gene: Cadherin 20 (CDH20). Figure 2 shows the two regions and the overlap.

    Figure 2. Duo-SGS genome-wide significant region. +/- indicates genotyped cases and SGS carrier status. Squares indicate male and circles female. Filled in shapes have a MM diagnosis. Pedigrees are trimmed to descendants with a MM case. SGS: Shared genomic segment; MM: multiple myeloma.

    Thirteen loci were found with genome-wide suggestive evidence [Table 2]. In four of these loci, several pedigree pairs provide duo-SGS evidence beyond genome-wide suggestive. The locus at 6q16.1 was previously identified as significant in single pedigree SGS in UT-571744, with risk variants in USP45 implicated[20]. Here, we find five pedigree pairs, all including UT-571744, and provide suggestive evidence, including one pair which achieves the second-highest duo-SGS significance in the study [μ(t) = 0.121, P = 7.8 × 10-8]. The 6q26 region achieves suggestive evidence in four pedigree pairs and harbors the PARK2 gene. At 5q21.3 four pedigree pairs show suggestive evidence and the locus contains the gene FBXL17. The locus at 7q11.23 is also supported by two genome-wide suggestive duo-SGS results. The remaining suggestive loci were supported by one pedigree pair: 1q42.2, 2p16.1, 3p25.2, 5q31.1, 12q24.31, 13q13.3, 18p11.22, 18q22.3 and 19p13.12. Genes in each of the duo-SGS regions are shown in Table 3.

    Table 3

    Protein coding genes in duo-SGS regions by locus

    LocusGene nameStartEnd
    1q42.2KCNK1233,749,750233,808,258
    3p25.2TIMP412,194,55112,200,851
    PPARG12,328,86712,475,855
    TSEN212,525,93112,581,122
    C3orf8312,556,43312,602,558
    MKRN212,598,51312,625,212
    RAF112,625,10012,705,725
    TMEM4012,775,02412,810,956
    CAND212,837,97112,913,415
    RPL3212,875,98412,883,087
    IQSEC112,938,71913,114,617
    5q21.3FBXL17107,194,736107,717,799
    5q31.1C5orf15133,291,201133,304,478
    VDAC1133,307,606133,340,824
    TCF7133,450,402133,487,556
    SKP1133,484,633133,512,729
    CTD-2410N18.5133,502,861133,561,762
    PPP2CA133,530,025133,561,833
    CDKL3133,541,305133,706,738
    6q16.1POU3F299,282,58099,286,660
    FBXL499,316,42099,395,849
    FAXC99,719,04599,797,938
    COQ399,817,27699,842,080
    PNISR99,845,92799,873,207
    USP4599,880,19099,969,604
    CCNC99,990,256100,016,849
    PRDM13100,054,606100,063,454
    6q26PARK2161,768,452163,148,803
    7q11.23CCL2475,440,98375,452,674
    RHBDD275,471,92075,518,244
    POR75,528,51875,616,173
    STYXL175,625,65675,677,322
    MDH275,677,36975,696,826
    SRRM375,831,21675,916,605
    HSPB175,931,86175,933,612
    YWHAG75,956,11675,988,348
    SRCRB4D76,018,65176,039,012
    ZP376,026,83576,071,388
    DTX276,090,99376,135,312
    UPK3B76,139,74576,648,340
    POMZP376,239,30376,256,578
    CCDC14676,751,75176,958,850
    FGL276,822,68876,829,143
    GSAP76,940,06877,045,717
    12q24.31HNF1A121,416,346121,440,315
    C12orf43121,440,225121,454,305
    OASL121,458,095121,477,045
    13q13.3DCLK136,345,47836,705,443
    18p11.22RAB319,708,1629,862,548
    18q21.33CDH2059,000,81559,223,006
    19p13.12CYP4F315,751,70715,773,635
    CYP4F1215,783,56715,807,984
    OR10H215,838,83415,839,862
    OR10H315,852,20315,853,153
    OR10H515,904,76115,905,892
    OR10H115,917,81715,918,936
    CYP4F215,988,83316,008,930
    CYP4F1116,023,17716,045,677
    OR10H416,059,81816,060,768
    TPM416,177,83116,213,813

    Discussion

    We expanded the shared genomic segment method to identify segregating chromosomal segments with overlapping statistical evidence from two HRPs. The strategy allows for genetic heterogeneity within each pedigree and provides formal significance thresholds for interpretation. The approach circumvents issues of intra-familial heterogeneity that can hinder mapping in large pedigrees. For complex diseases, large HRPs are likely enriched for multiple susceptibility variants[24] and sprinkled with sporadic cases; hence methods that require all cases to share to attain discovery power are not suitable. Here we optimize over subsets within pedigrees and consider pairing with all other pedigrees to provide the flexibility required. The method also specifically defines which pedigrees and cases share evidence at a locus, which is imperative for follow-up sequencing. Additional value may be gained by comparing demographic or clinical characteristics of the sharers in each pedigree to nuance the phenotype which may aid future gene mapping and provide insight into the nature of the mechanism of risk at a locus.

    Application of the novel duo-SGS approach to eleven MM HRPs implicated a novel genome-wide significant region at 18q21.33 in MM risk, as well as 13 suggestive regions. Other than 6q16.1, which overlaps with our previous single pedigree SGS study, all loci identified in this study provide novel regions of interest in myeloma. None of the regions overlapped with existing genome-wide association study loci or other prior rare risk variants implicated in MM. A next step would be to investigate the loci for rare and deleterious coding variants or regulatory variants. Pedigree segregation methods can provide statistically compelling regions to concentrate efforts to identify and characterize regulatory risk variants. Also, SGS results can be used as genomic annotations of prior evidence to layer with additional omic information or provide a focused region for interrogating regulatory risk variants.

    The literature supports a role of some of the genes found in our duo-SGS regions in MM. The genome-wide significant region at 18q21.33 contained CDH20, a gene that plays a role in intracellular adhesion by forming cadherin junctions. Cadherins have been suggested in solid tumor invasion, and metastasis as disruption to cell-cell junctions is a prerequisite[29]. Solid tumors co-aggregate in MM families suggesting a shared genetic background[10]. At 6q26, several pedigree pairs were genome-wide suggestive, and the overlapping segments fall in PARK2 which mediates proteasomal degradation. PARK2 is a tumor suppressor[30] and the gene harbors risk variants for lung cancer[31].

    While the duo-SGS approach is useful for analyzing pedigrees smaller than those typically required for the single pedigree SGS approach, a large number of meioses are still required. The HRPs in this study are still substantially larger than those available in most family-based resources (8-23 meioses between sampled cases). Hence the method has practical limitations in other settings. Nonetheless, the interesting regions identified in large pedigrees provide evidence that can be used to narrow the search for risk variants in smaller families as well, as demonstrated in our prior study[20].

    As in all family-based genetic studies, our results could be sensitive to inaccurate pedigree structures. However, relationship and ethnicity checks are standard protocols and mitigate the possibility of error. Another limitation to this study is the observational nature. Additional functional studies will be required to describe causation and characterize the mechanisms involved in these loci and myeloma risk.

    We have identified several novel loci that segregate in at least two myeloma HRPs. These loci are likely to harbor genes and rare risk variants for MM and are compelling new targets for inherited risk to MM.

    In conclusion, we developed a novel strategy for gene mapping in complex traits that uses multiple large high-risk pedigrees. The approach is robust to heterogeneity both within and between pedigrees and formally corrects for multiple testing to allow for statistically rigorous discovery. We applied this strategy to MM, a complex cancer of plasma cells, and identified one novel genome-wide significant locus at 18q21.33 and 13 suggestive loci. Our study offers a new technique for gene mapping and demonstrates its utility to narrow the search for risk variants in complex traits.

    Declarations

    Acknowledgments

    We thank the participants and their families who make this research possible. Data collection was made possible, in part, by the Utah Population Database and the Utah Cancer Registry. Computations were supported by the University of Utah’s Center for High-Performance Computing.

    Authors’ contributions

    Designed the study and wrote the manuscript: Waller RG, Camp NJ

    Contributed to the duo-SGS method development: Waller RG, Madsen MJ, Gardner J, Camp NJ

    Provided analysis support and edited the manuscript: Madsen MJ, Gardner J

    Provided clinical support and reviewed the manuscript: Sborov DW

    Generated figures and tables: Waller RG

    Availability of data and materials

    The Shared Genome Segment (SGS) analysis software is freely available and can be accessed online: https://uofuhealth.utah.edu/huntsman/labs/camp/analysis-tool/shared-genomic-segment.php. Data used in the duo-SGS analysis includes pedigree structures, myeloma diagnoses, and genome-wide SNP genotypes. Pedigree structures necessary for these analyses were acquired from the Utah Population Database (UPDB). These are considered potentially identifiable by the Resource for Genetic and Epidemiologic Research (RGE) - the ethical oversight committee for the UPDB. As a result, access to these data requires review by the RGE committee (contact Jahn Barlow, jahn.barlow@utah.edu). Upon RGE approval, we will provide the genotypes and pedigree structure in a format ready to be used by the SGS software.

    Financial support and sponsorship

    Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number F99CA234943 and K00CA234943. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Methodological development of the duo-SGS method was supported by the University of Utah’s Center for Genomic Medicine. This research was supported by the Utah Cancer Registry, which is funded by the National Cancer Institute’s SEER Program, Contract No. HHSN261201800016I, the US Center for Disease Control and Prevention’s National Program of Cancer Registries, Cooperative Agreement No. NU58DP0063200, with additional support from the University of Utah and the Huntsman Cancer Foundation. Partial support for all datasets within the Utah Population Database is provided by the University of Utah, Huntsman Cancer Institute, and the Huntsman Cancer Institute Cancer Center Support grant, P30CA42014 from the National Cancer Institute.

    Conflicts of interest

    All authors declared that there are no conflicts of interest.

    Ethical approval and consent to participate

    Ethics committees at the University of Utah approved this research. All participants provided written informed consent.

    Consent for publication

    Not applicable.

    Copyright

    © The Author(s) 2021.

    References

    • 1. Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021; doi: 10.3322/caac.21660.

      DOIPubMed
    • 2. Morgan GJ, Johnson DC, Weinhold N, et al. Inherited genetic susceptibility to multiple myeloma. Leukemia 2014;28:518-24.

      DOIPubMed
    • 3. Grosbois B, Jego P, Attal M, et al. Familial multiple myeloma: report of fifteen families. Br J Haematol 1999;105:768-70.

      DOIPubMed
    • 4. Hemminki K, Li X, Czene K. Familial risk of cancer: data for clinical counseling and cancer genetics. Int J Cancer 2004;108:109-14.

      DOIPubMed
    • 5. VanValkenburg ME, Pruitt GI, Brill IK, et al. Family history of hematologic malignancies and risk of multiple myeloma: differences by race and clinical features. Cancer Causes Control 2016;27:81-91.

      DOIPubMedPMC
    • 6. Bourguet CC, Grufferman S, Delzell E, Delong ER, Cohen HJ. Multiple myeloma and family history of cancer a case-control study. Cancer 1985;56:2133-9.

      DOIPubMed
    • 7. Eriksson M, Hållberg B. Familial occurrence of hematologic malignancies and other diseases in multiple myeloma: a case-control study. Cancer Causes Control 1992;3:63-7.

      DOIPubMed
    • 8. Schinasi LH, Brown EE, Camp NJ, et al. Multiple myeloma and family history of lymphohaematopoietic cancers: Results from the International Multiple Myeloma Consortium. Br J Haematol 2016;175:87-101.

      DOIPubMedPMC
    • 9. Landgren O, Linet MS, McMaster ML, Gridley G, Hemminki K, Goldin LR. Familial characteristics of autoimmune and hematologic disorders in 8,406 multiple myeloma patients: a population-based case-control study. Int J Cancer 2006;118:3095-8.

      DOIPubMed
    • 10. Kristinsson SY, Björkholm M, Goldin LR, et al. Patterns of hematologic malignancies and solid tumors among 37,838 first-degree relatives of 13,896 patients with multiple myeloma in Sweden. Int J Cancer 2009;125:2147-50.

      DOIPubMedPMC
    • 11. Altieri A, Chen B, Bermejo JL, Castro F, Hemminki K. Familial risks and temporal incidence trends of multiple myeloma. Eur J Cancer 2006;42:1661-70.

      DOIPubMed
    • 12. Sud A, Kinnersley B, Houlston RS. Genome-wide association studies of cancer: current insights and future perspectives. Nat Rev Cancer 2017;17:692-704.

      DOIPubMed
    • 13. Mitchell JS, Li N, Weinhold N, et al. Genome-wide association study identifies multiple susceptibility loci for multiple myeloma. Nat Commun 2016;7:12050.

      DOIPubMedPMC
    • 14. Weinhold N, Johnson DC, Chubb D, et al. The CCND1 c.870G>A polymorphism is a risk factor for t(11;14)(q13;q32) multiple myeloma. Nat Genet 2013;45:522-5.

      DOIPubMedPMC
    • 15. Went M, Sud A, Försti A, et al; PRACTICAL consortium. Identification of multiple risk loci and regulatory mechanisms influencing susceptibility to multiple myeloma. Nat Commun 2018;9:3707.

      DOIPubMedPMC
    • 16. Swaminathan B, Thorleifsson G, Jöud M, et al. Variants in ELL2 influencing immunoglobulin levels associate with multiple myeloma. Nat Commun 2015;6:7213.

      DOIPubMedPMC
    • 17. Chubb D, Weinhold N, Broderick P, et al. Common variation at 3q26.2, 6p21.33, 17p11.2 and 22q13.1 influences multiple myeloma risk. Nat Genet 2013;45:1221-5.

      DOIPubMedPMC
    • 18. Broderick P, Chubb D, Johnson DC, et al. Common variation at 3p22.1 and 7p15.3 influences multiple myeloma risk. Nat Genet 2011;44:58-61.

      DOIPubMedPMC
    • 19. Halvarsson BM, Wihlborg AK, Ali M, et al. Direct evidence for a polygenic etiology in familial multiple myeloma. Blood Adv 2017;1:619-23.

      DOIPubMedPMC
    • 20. Waller RG, Darlington TM, Wei X, et al. Novel pedigree analysis implicates DNA repair and chromatin remodeling in multiple myeloma risk. PLoS Genet 2018;14:e1007111.

      DOIPubMedPMC
    • 21. Wei X, Calvo-Vidal MN, Chen S, et al. Germline Lysine-Specific Demethylase 1 (LSD1/KDM1A) Mutations Confer Susceptibility to Multiple Myeloma. Cancer Res 2018;78:2747-59.

      DOIPubMedPMC
    • 22. Pertesi M, Vallée M, Wei X, et al. Exome sequencing identifies germline variants in DIS3 in familial multiple myeloma. Leukemia 2019;33:2324-30.

      DOIPubMedPMC
    • 23. Thomas A, Camp NJ, Farnham JM, Allen-Brady K, Cannon-Albright LA. Shared genomic segment analysis. Mapping disease predisposition genes in extended pedigrees using SNP genotype assays. Ann Hum Genet 2008;72:279-87.

      DOIPubMedPMC
    • 24. Knight S, Abo RP, Abel HJ, et al. Shared genomic segment analysis: the power to find rare disease variants. Ann Hum Genet 2012;76:500-9.

      DOIPubMedPMC
    • 25. McClellan J, King MC. Genetic heterogeneity in human disease. Cell 2010;141:210-7.

      DOIPubMed
    • 26. Lander E, Kruglyak L. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet 1995;11:241-7.

      DOIPubMed
    • 27. Hanson HA, Leiser CL, Madsen MJ, et al. Family Study Designs Informed by Tumor Heterogeneity and Multi-Cancer Pleiotropies: The Power of the Utah Population Database. Cancer Epidemiol Biomarkers Prev 2020;29:807-15.

      DOIPubMedPMC
    • 28. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-75.

      DOIPubMedPMC
    • 29. Jeanes A, Gottardi CJ, Yap AS. Cadherins and cancer: how does cadherin dysfunction promote tumor progression? Oncogene 2008;27:6920-9.

      DOIPubMedPMC
    • 30. Xu L, Lin DC, Yin D, Koeffler HP. An emerging role of PARK2 in cancer. J Mol Med (Berl) 2014;92:31-42.

      DOIPubMed
    • 31. Xiong D, Wang Y, Kupert E, et al. A recurrent mutation in PARK2 is associated with familial lung cancer. Am J Hum Genet 2015;96:301-8.

      DOIPubMedPMC

    Cite This Article

    Waller RG, Madsen MJ, Gardner J, Sborov DW, Camp NJ. Duo Shared Genomic Segment analysis identifies a genome-wide significant risk locus at 18q21.33 in myeloma pedigrees. J Transl Genet Genom 2021;5:112-123. http://dx.doi.org/10.20517/jtgg.2021.09

    Views
    409
    Downloads
    95
    Citations
     0
    Comments
    0

    2

    Download and Bookmark

    Download

    Download PDF Add to Bookmark

    Share This Article

    Article Access Statistics

    Full-Text Views Each Month

    PDF Downloads Each Month

    Comments

    Comments must be written in English. Spam, offensive content, impersonation, and private information will not be permitted. If any comment is reported and identified as inappropriate content by OAE staff, the comment will be removed without notice. If you have any queries or need any help, please contact us at support@oaepublish.com.

    Article Access Statistics

    • Viewed: 409
    • Downloaded: 95
    • Cited: Crossref0

    Share This Article

    See Updates

    © 2016-2021 OAE Publishing Inc., except certain content provided by third parties