Assigning single clinical features to their disease-locus in large deletions: the example of chromosome 1q23-25 deletion syndrome

Aim: Assigning a disease-locus within the shortest regions of overlap (SRO) shared by deleted/duplicated subjects presenting this disease is a robust mapping approach, although the presence of different malformation traits and their occurrence in only some of the affected subjects can hinder the interpretation. To overcome the problem of incomplete penetrance, we developed an algorithm that we applied to the deletion region 1q23.3-q25, which contains three SROs, each contributing to the abnormal phenotype without clearly distinguishing between the different malformations. We describe six new subjects, including a healthy father and his daughter, with 1q23.3-q25 deletions of different sizes. The aim of this study was to correlate specific abnormal traits to the haploinsufficiency of specific genes/putative regulatory elements. Methods: Merging cases with those in the literature, we considered four traits, namely intellectual disability (ID), microcephaly, short-hands/feet, and brachydactyly, and conceived a mathematical model to predict with what probability the haploinsufficiency of a specific portion of the deletion region is associated with one of the four


INTRODUCTION
The usual method to identify the shortest regions of overlap (SRO) in contiguous gene syndromes relies on the graphical identification of the area of minimal overlap between deletions in patients sharing the same phenotype. Although this approach is very efficient when dealing with traits present in all the subjects who share the deletion region, it is much less productive when the trait is shared by some of the patients only. The usual way to overcome this uncertain correlation is to attribute an incomplete penetrance to the trait, a definition that may hide multiple factors such as the influence of any other genetic factors necessary for the manifestation of the trait, the differences in the breakpoints of the deletion involving any different dynamics of chromatin interactions between enhancers and promoters, environmental factors, or more simply inaccurate assignment of phenotype. Obviously, "non-penetrant" deletions may either overlap the disease locus (DL) or not include it, so that they constitute a limitation to defining SRO boundaries. However, they still strongly modulate the probability profile of the DL location along the SRO, i.e., the probability for the DL to map at a given position, considering the whole body of experimental data (i.e., all the deletions, either penetrant or non-penetrant, overlapping a given genomic position inside the SRO). In fact, the trait(s) considered in a given genomic region are often de novo and present in restricted numbers of subjects, so that the exclusion of even a single case can really be limiting to a correct locus assignment. Therefore, it is highly desirable to find a probabilistic model that, by considering also the "non-penetrant" cases, makes more reliable the assignment of specific traits to specific genomic portions. 
For this purpose, we propose a new genotype-phenotype correlation approach, applying our statistical procedure to interstitial deletions of 1q23.3-q25, of which more than 30 cases have been reported, with the imbalance being mainly de novo with the exception of three subjects who have inherited the deletion from the affected mother (Patients P10 and P17 [1], Patient A [2], and Case 1 [3]). These deletions are associated with a complex malformation condition consisting of proportionate pre- and postnatal growth deficit, cardiac malformations, small hands and feet with brachydactyly, intellectual disability (ID) of various degrees, and craniofacial dysmorphisms such as microcephaly, micrognathia, short nose with bulbous tip, dysplastic ears, elongated upper lip, and small chin, which have been reported in most subjects [1,3]. The relationship between the size and localization of the copy number variants and phenotypic abnormalities in thirty-five patients [1-9] allowed the identification of three non-overlapping regions whose haploinsufficiency seemed crucial for the manifestation of some specific characteristics [1]. The SRO associated with growth and developmental delay has been progressively narrowed from 1.9 Mb [4] to a 179-kb region (chr1:172,460,683-172,281,412 hg19) [2].
A subregion of 2.5 Mb (chr1:164,501,003-167,022,133 hg19), located proximally to the SRO (SRO-P), adds further complexity to the observed phenotype, being more commonly associated with cardiac and renal malformations. A third distal region of 2.7 Mb (chr1:178,514,910-181,269,712 hg19; SRO-D) could also contribute to intrauterine and postnatal growth retardation [1]. Finally, deletions involving SERPINC1 (chr1:173,872,942-173,886,516; MIM: 107300) result in low antithrombin-III activity, a risk factor for thrombophilia. We present the detailed phenotypic and molecular description of six new cases whose partially overlapping 1q24q25 deletions were identified by chromosomal microarray analysis (CMA). Four cases, each encompassing at least one of the three critical regions, were sporadic and identified by studying unrelated patients with syndromic intellectual disability. The fifth case, with a 1q24.3q25.2 deletion that did not involve any of the three critical regions, was ascertained in a newborn baby after the unexpected detection of the same deletion in the healthy father. The latter was studied as the parent of a previous child carrying two CNVs, neither of them located on chromosome 1, which later turned out to be both inherited from the healthy mother.

Clinical data
All six subjects are described in more detail in the following section and their clinical characteristics are summarized in Table 1. Patient photographs are shown in cases where parents have consented to publication.

Case 1
The patient was a 17-year-old male born after a previous miscarriage. At birth, his mother and father were 29 and 31 years old, respectively. He has a younger healthy brother. Family history was remarkable for cognitive delay in the paternal lineage, not otherwise specified, and bipolar disturbance in the maternal one. The delivery was at term with fetal distress consisting of decreased heart rate patterns and meconium-stained amniotic fluid. Birth weight was 3,450 g (50th percentile), length was 51 cm (50th percentile), and cranial circumference-OFC was 35 cm (25th percentile). Apgar scores were 8/9 at 1'/5', respectively. The perinatal period was remarkable for hypotonia and limb dyskinesia. Early motor milestones were slightly delayed: he sat between 7 and 8 months, crawled at 12 months, and walked autonomously at 18 months. Language and learning difficulties were noticed early. He started babbling at 18 months and language development was delayed. At the age of 13 years, Griffiths scale scores revealed moderate ID (overall IQ: 54) with pragmatic and narrative language difficulties and social skills impairment. Physical examination revealed craniofacial dysmorphisms including short neck with slight pterygium colli and hypoplasia of auricles with flat helix. His OFC was 53.8 cm (25th-50th percentile). In addition, ligamentous laxity, hyperextensibility of the finger joints, bilateral flat foot with sandal gap, bilateral genu valgum, spinal kyphosis, and decreased lumbar lordosis were observed. A supraclavicular cartilage cyst (3-mm diameter) was noted. Brain magnetic resonance imaging, kidney ultrasound, urinalysis, and functional blood tests were normal. The thyroid function test gave normal results. At the age of 16 years, he was classified as suffering from severe intellectual disability with marked repetitive movements, obsessive-compulsive traits, apathy, and abulia with episodes of coprolalia and soliloquy, without any self-hetero-aggressive behavior. Treatment with Risperdal or Abilify was recommended. At the age of 18 2/12 years, his height was 171 cm (< 5th percentile), weight 59.7 kg (10th-25th percentile), and OFC was 55.4 cm (25th-50th percentile). Array-CGH revealed a 4.2-Mb deletion of 1q23.3q24.2.

Case 2
The male child was born to a 40-year-old primigravida and her 43-year-old partner. Due to the father's oligospermia, the couple underwent two cycles of in vitro fertilization (IVF) through ICSI (intracytoplasmic sperm injection), which led to the conception of the patient. The delivery was normal after an unremarkable 40-week pregnancy with a birth weight of 2,380 g (third percentile), length of 45 cm (-2 SD), and cranial circumference-OFC of 33 cm (-1.75 SD). Apgar scores were 10/10 at 1'/5', respectively. At 5.5 months, he began experiencing recurrent episodes of non-febrile seizures when falling asleep or waking up. Electroencephalogram (EEG) recording showed bilateral and rare paroxysmal slow abnormalities in the fronto-temporal region. Therapy with levetiracetam achieved a reduction of seizure frequency. At age 9.5 months, his height was -2.5 SD. At clinical examination, facial dysmorphisms including prominent forehead, hypertelorism, saddle nose, micrognathia, smooth philtrum with vermillion upper lip, small ears with hypoplastic helix, slight neck pterygium, sparse hair [Figure 1A], and micropenis were observed. Audiological testing revealed mild sensorineural hearing impairment. At the age of 16 months, he started walking alone and expressive speech was absent. At the age of 22 months, his height was 74 cm (-3 SD), weight was 10 kg (-2 SD), and OFC was 47 cm (-2 SD). His hands and feet were broad with brachydactyly [Figure 2A]. X-ray showed delayed bone age of 1.3 years. EEG displayed focal epileptiform abnormalities: spike-wave complexes on the left hemisphere. The patient was stable on the therapy with levetiracetam (2 cp × 90 mg). The brain MRI showed an enlarged third ventricle. Aarskog-Scott syndrome (OMIM 305400) was excluded following normal results of FGD1 gene mutation analysis. By 4.5 years of age, his height had decreased to -3.5 SD. His thyroid function and insulin-like growth factor-1 (IGF-1) level were normal. At the last evaluation at the age of five years, Griffiths scale scores revealed moderate ID (IQ: 50), language was absent, and the previously friendly behavior was now characterized by aggression and hyperactivity. Karyotype was normal and array-CGH revealed a 10.3-Mb deletion of 1q24.1q25.2 [Supplementary Figure 1A].

Case 3
The patient was born to an 18-year-old mother and a 20-year-old father, after a pregnancy characterized by the risk of miscarriage. He was delivered vaginally at 36 weeks with weight, length, and OFC far below the third percentile. He was admitted for prematurity to the Neonatal Intensive Care Unit. Peculiar dysmorphic features and hypotonia were noted. A diagnosis of Aarskog syndrome was suggested but not confirmed by the molecular analysis of the FGD1 gene. He had a normal karyotype, 46,XY. His medical history was positive for failure to thrive and psychomotor delay. He started to walk unsupported at the age of three years and never developed verbal language. At the age of four years, he underwent surgical correction for unilateral cryptorchidism. At the last evaluation at the age of seven years, he showed severe psychomotor delay. His weight was 15 kg (< 3rd percentile), height was 99 cm (-4 SD), and OFC was 44 cm (-4.2 SD). Physical examination revealed high frontal hairline, down-slanting palpebral fissures, hypertelorism, depressed nasal bridge, mild malar hypoplasia, anteverted ears, deep philtrum, and macrostomia [Figure 1B]. His hands were small with short fingers, bilateral clinodactyly of the fifth finger, and bilateral single palmar crease. His feet were small with short toes, broad hallux, and bilateral "sandal gap" [Figure 2B]. Other findings included mild hypotonia and joint laxity. Griffiths scale scores revealed severe ID (IQ: 34; Performance: 34) with absent language. EEG recordings showed an excess of fast rhythms, particularly over anterior areas. During sleep, bursts of paroxysmal slow abnormalities were present; they were bilateral, diffuse, and prevalent in anterior left areas. Non-epileptic myoclonus was present both during wakefulness and sleep. His behavior was characterized by impulsiveness. Array-CGH revealed a 13.7-Mb deletion of 1q24.2q25.3 [Supplementary Figure 1B].

Case 4
The 10-year-old patient was the first child born to 34-year-old healthy, non-consanguineous parents. A family history of cleft lip/palate and deaf-mutism was recorded. He has a younger healthy nine-year-old brother. The patient was delivered by cesarean section at 43 weeks of gestation after a pregnancy characterized by IUGR and poor fetal movements. His birth weight was 2,900 g (-3 SD); length and OFC were not recorded. Cleft lip/palate was surgically corrected at the age of one year. At the age of two years, his psychomotor and language development was moderately delayed and characterized by inattentive-hyperactive behavior. At the same age, left-side cryptorchidism was surgically corrected. Mild growth hormone deficiency was documented but without the need for pharmacological treatment. When evaluated at the age of 10 years, his weight was 24 kg (< 10th percentile), height was 120 cm (3rd-10th percentile), and OFC was 49 cm (-3 SD  Figure 1B].

Cases 5 and 6
The pedigree is shown in Figure 3. The index patient (III.2) was a four-year-old child born to a 27-year-old mother, who during pregnancy suffered from preeclampsia and was treated with anticoagulant drugs (aspirin and heparin) for thrombophilia and Eutirox for hypothyroidism. The delivery was induced at 36 weeks of gestation for oligohydramnios. His birth weight was 2,800 g (10th-50th percentile). The perinatal and neonatal period was unremarkable despite feeding difficulties characterized by gastroesophageal reflux until the age of nine months. He crawled at 10 months and walked alone at the age of 18 months. His speech development was delayed and, at the age of four years, he was able to pronounce incomplete words. His behavior was characterized by low frustration tolerance associated with heteroaggressivity and bruxism. A diagnosis of autism spectrum disorder (ASD) was made (QS: 75, F 84.0 ICD 10, 299.00 ICD 9). CMA revealed two  Figure 1C] or other abnormal features but for mild ligamentous hyperlaxity of the fingers [Figure 2C]. His height was 164 cm, at the 25th percentile for the Sardinian population [10], and his cranial circumference-OFC was 54 cm (25th percentile). The 1q24.3q25.2 deletion was established while his wife (Subject II.2) was 27 weeks pregnant (Subject III.4, Case 6) and undergoing therapy for gestational diabetes and with platelet aggregation inhibitors due to a previous miscarriage (III.1) at the 10th week of pregnancy and a subsequent intrauterine fetal death (IUFD) at the 39th week (III.3). The stillborn male weighed 2,850 g (10th percentile), with a cranial circumference-OFC of 29 cm (-3 SD) and a length of 45 cm (< 3rd percentile). The morphological examination did not reveal any congenital malformation, while microscopic examination at autopsy revealed macerated internal organs and venous thrombosis of the umbilical cord, leading to a diagnosis of IUFD consistent with mild-moderate chorioamnionitis and fetoplacental thrombotic vasculopathy. DNA analysis was not performed. CMA on the mother's blood revealed two deletions, at chromosomes 8q24.3 (124 kb) and Xp22.2 (58.9 kb) [Supplementary Figure 2].
Patient III.4 (Case 6) was a female delivered by cesarean section at 37 weeks of gestation because of growth retardation (IUGR) and poor fetal movements. Her birth weight was 2,170 g (3rd percentile), length was 45 cm (10th percentile), and cranial circumference-OFC was 30 cm (-2 SD). Apgar scores were 10/10 at 1'/5', respectively. The perinatal period was unremarkable, although, due to her inability to attach to the breast, she was fed on infant formula. The first neuropediatric assessment occurred at three months of age, showing an OFC of 35.2 cm (-3 SD), weight of 4,600 g (25th percentile), and length of 58 cm (50th-75th percentile). At the same age, cerebral ultrasound gave normal results. At the last evaluation at the age of 8 months, her OFC was 39 cm (-3 SD) and weight was 7.5 kg (10th-25th percentile). Minor facial dysmorphisms were noted [Figure 1D]. CMA analysis, performed at birth in light of her father's CMA finding, highlighted the same 1q24.3q25.2 deletion of 5.8 Mb [Supplementary Figure 1D]. At the age of 41 days, routine chromogenic plasma testing revealed low antithrombin-III activity (32%, normal 80%-120%), similar to what was documented in the father (45%, normal 70%-130%) [3].

Molecular investigations
After obtaining the informed consent approved by the ethics committee for research at the corresponding institutions, DNA samples were prepared from blood of all six subjects and their parents. The study was conducted in accordance with the Declaration of Helsinki and national guidelines.

Gene content analysis
The gene content for each SRO was analyzed taking into account the haploinsufficiency (HI) and loss-of-function intolerance (pLI) scores. The HI score is defined as the predicted probability that a gene is more likely to exhibit haploinsufficiency (0%-10%) or more likely not to exhibit haploinsufficiency (90%-100%) based on differences in characteristics between known haploinsufficient and haplosufficient genes (https://decipher.sanger.ac.uk/).
The pLI score represents the probability that a gene is extremely intolerant of loss-of-function variation (pLI ≥ 0.9). Genes with low pLI scores (≤ 0.1) are loss-of-function tolerant. This score is based on protein-truncating variants in the gnomAD database (https://gnomad.broadinstitute.org/). Moreover, following the gnomAD gene constraint recommendations, we also used the observed/expected score to evaluate genes highly likely to be haploinsufficient.
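As an illustration of how the pLI cut-offs quoted above could be applied when screening the gene content of an SRO, the following minimal Python sketch labels a gene from its pLI value. Only the two thresholds (pLI ≥ 0.9 and pLI ≤ 0.1) come from the text; the function name, the label strings, and the gene names and scores in the example are hypothetical and are not data from this study.

```python
def classify_pli(pli: float) -> str:
    """Label a gene from its pLI score using the cut-offs quoted in the text."""
    if not 0.0 <= pli <= 1.0:
        raise ValueError("pLI is a probability and must lie in [0, 1]")
    if pli >= 0.9:
        return "LoF-intolerant"   # extremely intolerant of loss-of-function variation
    if pli <= 0.1:
        return "LoF-tolerant"     # tolerant of loss-of-function variation
    return "intermediate"         # label chosen for this sketch only


# Hypothetical scores, for illustration only (not values from gnomAD).
example_scores = {"GENE_A": 0.98, "GENE_B": 0.45, "GENE_C": 0.02}
labels = {gene: classify_pli(score) for gene, score in example_scores.items()}
```

In practice the scores would be read from a gnomAD constraint table and combined with the observed/expected ratio, which this sketch does not cover.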

Whole-exome sequencing analysis
Whole-blood samples of all available family members [ Figure 3], except for the newborn baby (Case 6), were collected for WES analysis, which was performed by an external service provider (BGI Genomics, Hong Kong). According to the provider's description, whole-exome enrichment was carried out using Illumina kit and sequenced with the DNB-SEQ500 to generate 100-bp-paired end reads that were aligned to the human genome (UCSC GRCh38), at an average coverage of 150 ×.

Probability profiling of genomic regions linked to selected traits
To computationally infer the genomic segments being most likely associated with selected clinical features, we assumed that a specific trait was predominantly the outcome of the hemizygosity of specific DL, either a protein-coding gene or a putative regulatory element, rather than the synergistic effect of the haploinsufficiency of several genomic elements.
Given this assumption, the probability for a DL to map at a given genomic location essentially depends on the penetrance of its haploinsufficiency and on the causative and non-causative deletions that overlap the genomic position. Briefly, molecular data from patients in whom the clinical status for a specific trait was assessed were grouped and analyzed independently. Clearly, as not all patients were evaluated for every trait, the number of individuals in each group varied. In the first step of the procedure, we identified SRO regions, taking into account only overlaps between deletions associated with the trait. By definition, these SROs contain the DL with probability 1. The next step was to estimate the probability distribution inside the SRO(s). For this purpose, we used a Bayesian approach to calculate, for each non-overlapping sliding window (Δ) of 1 kb within the SRO, the posterior probability of intersecting the DL, conditioned on the experimental data (i.e., all the deletions overlapping the specific window inside the SRO). In this regard, we assumed that the a priori probability P (Δ overlaps DL) was inversely proportional to the SRO size and that the best estimator for the penetrance of the DL was the value that maximizes the likelihood function P (Experimental data given that Δ overlaps DL) (see Supplementary Materials, Mathematical Model).
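The procedure described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation (the actual model is given in their Supplementary Materials): it assumes a single SRO, a uniform prior over the windows of that SRO, and a Bernoulli likelihood in which each deletion overlapping a window is penetrant with probability p, with p set to its maximum-likelihood value k/n for that window. The class and function names and the coordinates in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Deletion:
    start: int       # start coordinate (0-based, inclusive)
    end: int         # end coordinate (exclusive)
    has_trait: bool  # True if the carrier shows the trait (penetrant deletion)


def shortest_region_of_overlap(deletions: List[Deletion]) -> Tuple[int, int]:
    """Intersection of all trait-positive deletions (single-SRO assumption)."""
    positive = [d for d in deletions if d.has_trait]
    if not positive:
        raise ValueError("at least one trait-positive deletion is required")
    start = max(d.start for d in positive)
    end = min(d.end for d in positive)
    if start >= end:
        raise ValueError("trait-positive deletions share no common region")
    return start, end


def window_likelihood(k: int, n: int) -> float:
    """Likelihood of k penetrant carriers among n, at the MLE penetrance k/n."""
    p = k / n
    return (p ** k) * ((1.0 - p) ** (n - k))


def posterior_profile(deletions: List[Deletion], window: int = 1000):
    """Posterior probability that each window of the SRO overlaps the DL."""
    sro_start, sro_end = shortest_region_of_overlap(deletions)
    windows = [(s, min(s + window, sro_end))
               for s in range(sro_start, sro_end, window)]
    prior = 1.0 / len(windows)  # uniform prior: inversely proportional to SRO size
    weights = []
    for w_start, w_end in windows:
        overlapping = [d for d in deletions
                       if d.start < w_end and d.end > w_start]
        k = sum(d.has_trait for d in overlapping)
        weights.append(prior * window_likelihood(k, len(overlapping)))
    total = sum(weights)
    return [(w, weight / total) for w, weight in zip(windows, weights)]


# Hypothetical toy data: two penetrant deletions and one non-penetrant one.
toy = [Deletion(0, 10_000, True),
       Deletion(2_000, 8_000, True),
       Deletion(0, 5_000, False)]
profile = posterior_profile(toy)
```

In this toy example, the windows covered by the non-penetrant deletion receive a lower posterior probability than the rest of the SRO, which is the qualitative effect the text attributes to "non-penetrant" cases.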
In the last phase of the procedure, for each clinical feature (intellectual disability, microcephaly, kidney malformations, dysplastic ears, hypertelorism, short hands and feet, hypotonia, brachydactyly, microretrognathia, speech delay, and walk delay), custom UCSC tracks were automatically built to visualize in their genomic context the set of deletions and the probability profiles, calculated either in absolute or in log-scale. The software is available on request.
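For illustration, a per-window probability profile can be exported as a UCSC custom track in bedGraph format, either on an absolute or on a log scale. The sketch below is an assumption about how such a track could be written and is not the authors' software; the function name, the track name, and the example values are hypothetical.

```python
import math
from typing import List, Tuple

Window = Tuple[int, int]  # (start, end), 0-based half-open, as in bedGraph


def to_bedgraph(chrom: str, profile: List[Tuple[Window, float]],
                name: str, log_scale: bool = False) -> str:
    """Render a probability profile as the text of a bedGraph custom track."""
    lines = [f'track type=bedGraph name="{name}"']
    for (start, end), prob in profile:
        if log_scale:
            if prob <= 0.0:
                continue  # log10 is undefined for zero probability; skip the window
            value = math.log10(prob)
        else:
            value = prob
        lines.append(f"{chrom}\t{start}\t{end}\t{value:.6g}")
    return "\n".join(lines) + "\n"


# Hypothetical two-window profile, for illustration only.
example_profile = [((1000, 2000), 0.25), ((2000, 3000), 0.75)]
track_text = to_bedgraph("chr1", example_profile, "ID_probability")
log_track_text = to_bedgraph("chr1", example_profile, "ID_log10", log_scale=True)
```

The resulting text can be pasted into the UCSC Genome Browser custom track form or saved to a file and uploaded.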

Parental origin analysis
The parent of origin was determined for three subjects, two (Cases 3 and 5) paternal and one (Case 2) maternal [Supplementary Table 1]. The parental DNA samples of the remaining subjects (Cases 1 and 4) were unavailable.

Whole-exome sequencing
Exome sequencing of Subject II.1 and his parents (trio analysis) did not provide a strong candidate variant likely relevant for ASD. Taking advantage of whole-exome sequencing data on the mother, we ruled out the possibility that the paternally inherited 1q24.3q25.2 deletion in Patient III.4 might have unmasked a recessive allele lying on the maternal chromosome. We also explored the hypothesis that variants in genes involved in the coagulation cascade or fibrinolysis could cause inherited predisposition to thrombophilia, possibly linked to the recurrent miscarriages observed in the mother. Interestingly, while we did not identify any candidate variants in genes already associated with thrombophilia, WES analysis demonstrated in the mother two missense variants (NM_001061.6:c.796C>T:p.R266W and c.1279G>A:p.A427T) in the TBXAS1 gene (MIM 274180) [Supplementary Figure 3]. These variants were in trans, as only one of them (c.796C>T) was identified in the son (Subject II.1); they were rare (AF < 0.001 on several databases) and predicted to impact protein function by in silico analysis. Both variants were technically verified by Sanger sequencing. The TBXAS1 gene encodes the enzyme thromboxane synthase (TXAS), which catalyzes the conversion of prostaglandin H2 to thromboxane A2 (TXA2), a potent vasoconstrictor and inducer of platelet aggregation [14]. Biallelic missense mutations in TBXAS1, accounting for a decreased TXAS activity, have been associated with the Ghosal hematodiaphyseal syndrome (MIM 231095), a disease characterized by abnormal bone remodeling and anemia.

Shortest regions of overlap
According to the genotype-phenotype correlations emerging from our study, we defined three new SROs, each associated with specific phenotypic traits, such as ID, microcephaly (MCH), and skeletal anomalies including short hands and feet and brachydactyly. Their details are summarized in Table 2 and visualized in Figure 5.

Probability profiling of genomic regions linked to selected traits
Computational prediction of the DLs localization allowed us to identify in the eleven traits examined a total of 26 SROs with sizes ranging from 0.038 to 22 Mb (mean: 3.5 Mb) [Supplementary Table 2].
Importantly, genomic intervals having a cumulative probability to contain the DL > 85% are considerably shorter than their corresponding SROs, reducing the number of high-priority candidate genes. Striking examples of this reduction concern SROs related, respectively, to ID (Peaks 1 and 2), brachydactyly (Peaks 1 and 2), and microcephaly (Peak 1) [Supplementary Table 2].
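One simple way to extract such an interval from a per-window probability profile is to look for the shortest contiguous run of windows whose probabilities sum to at least the chosen threshold. The sketch below illustrates this with a two-pointer scan; it is an assumption about how the interval could be obtained, not a description of the authors' procedure, and the function name and the example probabilities are hypothetical.

```python
from typing import List, Optional, Tuple


def shortest_interval(probs: List[float],
                      threshold: float = 0.85) -> Optional[Tuple[int, int]]:
    """Indices (first, last) of the shortest contiguous run of windows whose
    probabilities sum to at least `threshold`; None if no run reaches it."""
    best: Optional[Tuple[int, int]] = None
    left = 0
    running = 0.0
    for right, p in enumerate(probs):
        running += p
        # Drop windows from the left while the run still reaches the threshold.
        while left < right and running - probs[left] >= threshold:
            running -= probs[left]
            left += 1
        if running >= threshold:
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
    return best


# Hypothetical five-window profile: most of the mass sits in windows 2 and 3.
example_probs = [0.02, 0.03, 0.5, 0.4, 0.05]
interval = shortest_interval(example_probs)
```

Here two of the five windows are enough to reach the threshold, mirroring the observation that high-probability intervals can be much shorter than the SRO that contains them.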
To check whether the intrinsic limitation of CMA to precisely map the breakpoints could affect the results, we decided to perform the analysis taking into account for each rearrangement either the smallest or the largest deletion regions, as defined by CMA. Interestingly, while nine out of eleven trait-related analyses remained unaffected, results concerning the traits "dysplastic ears" and "speech delay" markedly differed in the localization and size of the intermediate SRO [ Supplementary Figures 4 and 5], essentially due to a different set of overlapping deletions involved in the definition of that SRO. This finding suggests that caution should be applied in drawing firm conclusions when interpreting the results, as uncertainty about exact breakpoint localization as well as inaccuracy of clinical assessment may lead to erroneous SRO localization and probability profiles.
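The sensitivity check described above can be illustrated by computing the SRO twice, once from the minimal and once from the maximal extent of each CMA call. The sketch below assumes that each call carries both extents (minimal: first to last deleted probe; maximal: flanking non-deleted probes); the type and function names and the coordinates are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple

Interval = Tuple[int, int]


class CmaCall(NamedTuple):
    min_start: int  # first deleted probe
    min_end: int    # last deleted probe
    max_start: int  # last non-deleted probe before the deletion
    max_end: int    # first non-deleted probe after the deletion


def overlap(intervals: List[Interval]) -> Optional[Interval]:
    """Common region of all intervals, or None if they do not all overlap."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start < end else None


def sro_min_max(calls: List[CmaCall]) -> Tuple[Optional[Interval], Optional[Interval]]:
    """SRO computed from the minimal and from the maximal deletion extents."""
    minimal = overlap([(c.min_start, c.min_end) for c in calls])
    maximal = overlap([(c.max_start, c.max_end) for c in calls])
    return minimal, maximal


# Hypothetical calls: the two extents shift the SRO boundaries slightly...
stable = [CmaCall(100, 500, 90, 520), CmaCall(300, 800, 280, 810)]
# ...or decide whether the deletions overlap at all.
unstable = [CmaCall(100, 300, 90, 320), CmaCall(310, 600, 295, 610)]
stable_sros = sro_min_max(stable)
unstable_sros = sro_min_max(unstable)
```

The second example shows how breakpoint uncertainty alone can change which deletions take part in defining an SRO, which is the situation reported for "dysplastic ears" and "speech delay".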

DISCUSSION
In this paper, we describe six individuals with deletions scattered within chromosome 1q23.1q25 where three specific SRO deletion syndromes are reported: a proximal one of 2.  In particular, two SROs (SRO-I and -D) are described as associated with IUGR resulting in short stature up to -5 SD, microcephaly up to -4 SD, small hands and feet with fifth finger clino-brachydactyly, and variable degree of ID, in addition to peculiar facial dysmorphisms [1,2,4,6] .
Indeed, our subjects with deletion including both SRO-I and -D (Figure 4, Cases 2-4) exhibited all these features [Figures 1 and 2, Table 1]. On the contrary, our Case 5, with a de novo 1q24.3q25.2 deletion (5.9 Mb, chr1:172,667,560-178,548,677) that did not contain SRO-I and included only part of the proximal portion of SRO-D [Figure 4], has no ID or microcephaly, works in a qualified profession, and has a stature at the 25th percentile for the Sardinian population [7]. He also does not show any of the dysmorphic features [Figure 3A and Table 1] reported for the overlapping 1q24q25 deletions. However, his nine-month-old daughter (Case 6) with the same 1q deletion [Figure 4] had a history of IUGR and showed mild craniofacial dysmorphisms, including microcephaly (-3 SD), micro-retrognathia, and short neck [Figure 3B], as observed in our Cases
[Figure legend fragment: [1], while the light blue vertical box represents the new SROs, indicated by 1-3, defined by this study (see Table 2 for the details). (Bottom) A graph showing the estimated probability distribution of the genomic location of the disease loci associated with the traits]
Figure 6A and B is indicated as N.1.
This deletion is common to the 17 cases, including our Case 1, listed with an asterisk in Figure 5A, with deletions ranging from 276 kb to 14.1 Mb. All have moderate to severe ID but not microcephaly [Figure 5B]. The probability mass distribution at this region, computed as described in the Methods [Supplementary Table 2], shows the highest values for ID, kidney abnormalities, dysplastic ears, hypotonia, and speech delay [Figure 6A, Supplementary Figures 4 and 5].
Consistent with these findings, de novo deleterious PBX1 sequence variants result in a highly variable syndromic form of intellectual disability, which includes external ear abnormalities and congenital defects of the kidney and urinary tract. In fact, most patients with a 1q deletion including PBX1 have renal abnormalities, and the association between congenital anomalies of the kidney and urinary tract (CAKUT) and PBX1 variants/deletions is well demonstrated [15,16]. PBX1 alterations may also contribute to severe behavioral traits such as autism and obsessive-compulsive disorder [15]. Indeed, in addition to moderate ID, the phenotype of our Case 1 was characterized by repetitive movements and psychiatric traits resembling Tourette syndrome, such as obsessive-compulsive behavior with episodes of coprolalia and soliloquy [Table 1]. Interestingly, the PBX1 gene has recently been identified among pleiotropic risk loci that play important roles in the neurodevelopmental processes associated with psychiatric disorders [17].
Within this region, the ATP1B1 gene [Figure 6C] shows the highest pLI value, with observed/expected scores indicating haploinsufficiency intolerance [Supplementary Table 3]. ATP1B1 (OMIM 182330) encodes the β1 subunit of the Na,K-ATPase, which is responsible for the homeostasis of the electrochemical gradients of Na and K ions across the plasma membrane. While mutations of the Na,K-ATPase α-subunits have already been associated with neurological diseases (reviewed by Clausen et al. [18]), no confirmed mutations in any of the β-subunits have yet been correlated with human disorders. Interestingly, ATP1B1, as part of an Na,K-ATPase multiprotein complex, interacts with the calcium channel TRPV4 (MIM *605427) [19] whose heterozygous de novo variants, either or missense, associate with skeletal disorders. As in our patients (Cases 2 and 3), the features of TRPV4-related skeletal dysplasias include short stature, small hands and feet, and brachydactyly [20-22], suggesting a role for ATP1B1 in these disorders.
Deletions within SRO-I (chr1:172,460,683-172,281,412) show high probability to include a disease-locus for ID, microcephaly, and, as previously defined by Lefroy [2] , skeletal anomalies including short hands and feet and brachydactyly [ Figures 5 and 6A].
Specifically, SRO-I includes Dynamin-3 (DNM3, OMIM *611445), a gene harboring a 7.9-kb antisense transcript for the miR199-214 genes [23]. These two miRs are involved in vertebrate skeletogenesis [24,25], thus suggesting a role in the skeletal phenotype of 1q24-deleted patients [2,6]. Indeed, Cases 2-4, whose 1q24q25 deletions fully include DNM3 together with the miR199-214 genes it hosts [Figure 6D], share significant pre- and postnatal growth deficiency, microcephaly, and small hands and feet with fifth finger clino-brachydactyly [ Figures 1A and B and 2A  BRINP2/FAM5B is the only gene of the region that is intolerant to haploinsufficiency but its role is unknown [Figure 6E and Supplementary Table 4].
In contrast, our probability distribution profiles indicate that deletions for the distal portion of SRO-D are significantly associated only with ID and brachydactyly [ Figures 5A and 6A], although it should be noted that the microcephaly area of probability does not encompass this region due to the dubious effect of the deletion in our Patient 5. This region includes CEP350, RALGPS2, TDRD5, and XPR1 genes that are intolerant to haploinsufficiency; all have a low brain expression but none of them is thus far recognized as a disease gene [Supplementary Table 4]. In addition, LIM homeobox 4 (LHX4, OMIM 602146), a gene implicated in the etiology of congenital hypopituitarism [26] (OMIM #262700), was previously evoked as possible candidate gene for growth deficiency [7,27] .
Altogether, we have to assume that deletion of the proximal region 1q23.3q24.1 (SRO-P [1]) is associated with kidney anomalies, with high penetrance for total PBX1 loss [Supplementary Figures 4 and 5], and with ID but not microcephaly [Figure 5]. Microcephaly is fully associated with deletions of more than one SRO (SRO-2, -I, and -D; Figures 6 and 7), and the most promising new candidate gene is ATP1B1 in SRO-2.
Cases 5 and 6 are puzzling because of their apparently different phenotypes in the presence of an identical deletion. Indeed, Case 5 is a healthy adult, whereas Case 6 is still a newborn with a nuanced disorder and perhaps would have been considered healthy if we had not incidentally identified the deletion in the father. Among the genes mapping in their 5.9-Mb deleted region, between 172,667,560 and 178,548,677 [Supplementary Table 5], at least two are associated with systemic diseases: TNFSF4 (MIM *603594), related to systemic lupus erythematosus (OMIM #152700), and DARS2 (MIM *610956), involved in recessive leukoencephalopathy with brainstem and spinal cord involvement and lactate elevation (MIM #611115). Neither of these conditions was consistent with the clinical presentation of our Cases 5 and 6. We reasoned that the deletion might have unmasked a maternally inherited recessive variant in that region. However, WES analysis on the mother did not support this hypothesis, suggesting that other genetic or environmental factors may modulate the phenotype associated with this deletion. Indeed, microcephaly, which is the main feature observed in our Case 6, is a neurological sign that may be caused by a multitude of disease-causing genes with recessive or dominant inheritance [28]. However, WES did not reveal any possibly pathogenic variant with a frequency < 1% in any of the genes associated with microcephaly [28] that could explain this trait in our Case 6.
Similarly, IUGR, which characterized the prenatal life of our Case 6, is an end result of various etiologies that include maternal, placental, fetal, and genetic factors [29]. Looking at the clinical history of our family [Figure 3], IUGR was documented not only in the child carrying the 1q deletion, but also in her brother without this deletion (Subject III.2), and even in the IUFD at the 39th week of gestation (Subject III.3). Indeed, from WES data, we identified in the mother's genome two heterozygous missense variants, c.796C>T:p.R266W and c.1279G>A:p.A427T, in TBXAS1 [Supplementary Figure 3], a gene having a possible role in thrombotic events [30,31]. Since both variants are rare (AF < 0.001%), predicted deleterious in several databases, and almost certainly in trans, only one being present in the first son of the couple, the two variants most likely represent a risk factor for the recurrent pregnancy losses and IUGR observed in this family.
Taken together, the link between 1q deletion identified in this family and the phenotype in our patient remains elusive. A long-term clinical follow-up of our newborn patient will help to clarify whether this deletion represents a benign CNV or a rearrangement showing incomplete penetrance.
In conclusion, we confirmed and identified several genes whose haploinsufficiency appears crucial in the manifestation of the main phenotypic abnormalities associated with 1q23.3q25.2 deletions [Supplementary Table 6]. In particular, PBX1, in addition to its well-known role in kidney abnormalities, is strongly associated with ID and contributes to the behavioral traits along with psychiatric disorders. DNM3 and LHX4 are hereby confirmed as responsible for growth retardation [7,27] while ATP1B1 represents a new candidate gene for microcephaly.
It should however be underlined that, apart from SRO-1, the other three SROs contain genes belonging to different topologically associating domains (TADs), some of which are interrupted by the deletion (http://3dgenome.org).
We cannot therefore rule out that some phenotypic abnormalities are due to an altered expression of some of the non-deleted genes following the breakdown of the TADs, rather than the haploinsufficiency of specific genes [32] .
Finally, we propose a method to computationally predict the probability that a given DL lies in a specific genomic segment. Although this approach may be hampered by long-term position effects of regulatory elements, synergistic cooperation of several genes, and incomplete clinical assessment, it can be useful, especially for contiguous gene syndromes that show a complex pattern of clinical characteristics. Obviously, functional approaches are needed to warrant its reliability.