Manuscript #11597

eLife Assessment

This important study links allelic expression imbalance with replication timing, suggesting a stochastic model for haploinsufficiency in dosage-sensitive disease. The integration of allele-specific RNA-seq and replication timing in clonal systems provides solid evidence for an association between asynchronous replication and allelic imbalance, although the scope and generality of some conclusions require more cautious interpretation. This study will interest epigeneticists and genome regulation researchers studying replication timing and monoallelic expression, as well as developmental biologists and human geneticists concerned with clonal heterogeneity, haploinsufficiency, and variable disease penetrance.

[Editors' note: this paper was reviewed by Review Commons.]

Reviewer #1 (Public review):

Summary:

The existence of VERT regions is well supported, but the number of regions called as ISCs may be inflated by permissive thresholds (e.g., AEI ≥ 0.8 or ≤ 0.2 in a single clone). This risks conflating transient stochastic differences with stable ISCs. Similarly, the claim of cell-type specificity is not convincingly demonstrated given the small sample size (n=4) and strong batch confounding between lymphoblastoid and cartilage progenitors. While syntenic VERT regions across mouse and human are intriguing, they complicate interpretation of strong clustering by cell type. Sampling depth may also have exaggerated allelic imbalance calls.

The proposed role of ISCs in haploinsufficiency is conceptually interesting but remains speculative; developmental stochasticity and founder population size may play larger roles than replication timing. The claim that autosomal inactivation is mechanistically distinct from XCI, however, is reasonable and supported.

Some conclusions should be more explicitly qualified as preliminary. Cell-type specificity and mitotic stability both require stronger evidence; the latter is inferred indirectly from clonal expansion rather than shown directly, and orthogonal experiments (e.g., allele-specific ChIP-seq, DNA methylation) would be required. Estimated genomic coverage of ISCs should also be re-evaluated, as single-clone observations may inflate counts.

Replication is limited. Hierarchical clustering is confounded by batch and based on presence/absence calls that lack quantitative resolution. More robust approaches would include using magnitude of imbalance, annotating VERTs by genomic location, applying stricter thresholds for replication timing, and benchmarking AEI distributions against the X chromosome. These are realistic re-analyses requiring no new data and could be completed in ~1 month.

Methods are generally well described and reproducible. Figures and text would benefit from improved clarity: axis labels are missing in places (e.g., Fig. 1c, Fig. 2g), legends should explain chromosome arm colors, and cluttered figures such as Fig. 1j could be re-visualized for interpretability. Gene set enrichment analysis should be restricted to avoid inflated significance from overly broad categories. A useful citation for XCI timing (pmid=39420003) could be added to strengthen background.

Significance:

Conceptually, this work introduces ISC-like phenomena in human and mouse progenitor lines, coupling allelic expression imbalance with replication timing. Technically, it combines allele-specific RNA-seq with Repli-seq in genotyped, clonal, single-cell-derived lines. Clinically, it suggests an alternative model for haploinsufficiency, relevant to dosage-sensitive diseases where stochastic transcriptional delays could shape penetrance.

The study builds on prior work in allelic exclusion (e.g., HLA, olfactory receptors) and random monoallelic expression, generalizing these phenomena into ISC/VERT frameworks and proposing mitotic stability of allele choice. By extending beyond expression to replication timing, the authors suggest a broader paradigm for epigenetic regulation at autosomal loci.

The paper will be of interest to epigeneticists studying XCI, allelic exclusion, and monoallelic expression; to developmental biologists examining replication timing and differentiation; and to clinicians concerned with dosage-sensitive and haploinsufficient disorders.

Reviewer #2 (Public review):

Summary:

- This is a complicated research topic that touches on a few sub-fields of biology, and thus to make the paper more approachable I would recommend a careful edit of the text for clarity and precision of language.
- Authors point out that this is a decades-old field; it would make sense to use terminology established within the field rather than inventing their own. Allelic imbalance has been referred to as AI, MAE (monoallelic expression), RMAE (random monoallelic expression) etc. The paper whose mouse data the authors make use of uses Asynchronous Stochastic Replication Timing (ASRT) instead of VERT to refer to the same phenomenon. Creating unnecessary jargon makes the paper more difficult to read and adds needless complexity to an already complex field.
- Methods do not provide sufficient detail to fully evaluate or reproduce these experiments.
- It is helpful to show representative loci as the authors do in Fig 1F and G and Fig 2, but these panels are very densely rendered and thus difficult to process visually - even the cartoon version (1D) is thick with overlapping lines. The point that allelic imbalance is enriched in VERTs would be enhanced if the authors could present the allelic ratio for all genes found in all VERTs, demonstrating how replication timing on either chromosome affects the allelic ratio.
- The authors make the important point that VERTs are unlikely to be shared among different cell types and tissues (Fig 1i) but then find an enrichment for neuronal and immune genes in VERT regions identified in ACPs. It follows that these same genes are unlikely to be in such regions in the tissues where they are relevant. Some of the GO terms presented are too broad to suggest any biological significance to the result, even if there is statistical significance (for example, the top term for LCL clones 'Cytoplasm' is associated with 12,000 genes, and the second term for mouse clones 'Membrane' is associated with 10,000). It would be helpful to focus on GO terms lower in the GO hierarchy.
- Figure 3 highlights the association of related gene clusters with VERTs but the VERTs are assigned based on variable replication timing in just 1 or 2 clones. This is an interesting observation, but to make the point that "VERT regions frequently coincide with gene clusters in the human genome" there needs to be a systematic assessment of replication timing at all gene clusters across all clones, and a statistical test for significance.
- It is an interesting hypothesis that VERTs are conserved between species at syntenic loci. If such regions are really conserved, one would expect that replication timing at these sites would be consistently asynchronous. However, the data presented shows that in human clones these VERTs can be specific to an individual donor (as in 5A) or an individual clone (as in 5H).
- Again, the finding that VERTs coincide with neurodevelopmental disease genes in immune and cartilage cells is at odds with the previous statements and data about the tissue specificity of VERTs. In order to support the claim that neurodevelopmental disease associated genes reside in asynchronously replicating regions, and are thus more prone to allelic imbalance, the authors would need to demonstrate this phenomenon in neuronal cells.

Significance:

The authors pair analysis of replication timing and allele-specific expression in clonal populations of primary human cells. They combine these data with previously published data on clones from transformed human cell lines. They identify a number of genomic regions that display asynchronous replication timing in at least one clone and correlate these regions with allele-specific expression of genes within them. They also observe that several interesting gene sets, including genes that are associated with human diseases, map to asynchronously replicating regions. This is a good experimental approach that builds on already published data demonstrating the connection between allelic imbalance and replication timing. However, the authors consistently lean on thin evidence (i.e. a single clone) within a modestly sized dataset (4 clones from 2 donors each) to propose a new model for haploinsufficiency in human disease. The consistent focus on limited elements in the data and perhaps an overreach in the interpretation makes it difficult to appreciate what is in fact a very good experiment.

Author response:

 

General Statements

 

We thank the reviewers for their thoughtful and constructive comments, which will substantially improve our manuscript. In response, we will revise the text and figures throughout to address the points raised. Specifically, we will:

 

i. Refine our definition of Inactivation/Stability Centers (I/SCs): We will limit this designation to loci where both Allelic Expression Imbalance (AEI) and Variable Epigenetic Replication Timing (VERT) are detected, either in the present study or in previously published work.

 

ii. Expand methodological clarity: We will provide detailed descriptions of how VERT regions were identified, annotated, and quantified, including thresholds for allelic imbalance, replication timing variability, and sampling depth. We will also justify the ≥80% AEI cutoff, which is based on recent studies showing that modest allelic biases can have biological and clinical significance.

 

iii. Enhance benchmarking and validation: In addition to the analysis of X inactivation in female ACP cells, we will include comparisons between imprinted and non-imprinted regions to benchmark the magnitude of allelic replication timing imbalance, demonstrating that the magnitude of imbalance observed at imprinted loci is comparable to that at the non-imprinted VERT regions.

 

iv. Address tissue specificity and sampling limitations: We will discuss the limited number of clones, tissues, and individuals analyzed, emphasizing that while our data identify robust AEI and VERT patterns, additional tissues and individuals will be required to capture the full diversity of I/SC regulation.

 

v. Clarify biological relevance: We will expand our discussion to highlight the consistency of AEI findings across cell types, including examples of genes implicated in neurodevelopmental and neurodegenerative disorders, and we will clarify our model of how I/SC regulation may contribute to haploinsufficiency, variable expressivity, and incomplete penetrance in human disease.

 

vi. Improve figures and supplemental data: We will update figure legends for clarity, add a new supplementary figure comparing imprinted and non-imprinted regions, and cross-reference all supplemental tables.

 

We believe these revisions strengthen the manuscript conceptually and experimentally, and we thank the reviewers and editors for their valuable feedback.

 

Description of the planned revisions

 

Reviewer #1:

The existence of VERT regions is well supported, but the number of regions called as ISCs may be inflated by permissive thresholds (e.g., AEI ≥ 0.8 or ≤ 0.2 in a single clone). This risks conflating transient stochastic differences with stable ISCs.

 

We selected the >80% (or <20%) allelic imbalance threshold, along with the requirement of at least one biallelic clone, as our criterion for significant AEI. This choice was guided by a recent study demonstrating that allelic imbalance as low as 65%/35% is enough to affect disease penetrance in humans (Nature 2025; 637:1186–1197). For completeness, results obtained using more stringent thresholds (>90% and >95% imbalance) are presented in Supplementary Table 2.

Furthermore, it is unlikely that transient stochastic differences in allelic expression, such as those detected by single-cell RNA sequencing assays (Nat. Rev. Genet. 2015; 16:653–664), would be captured by our approach. Each clone in our study was expanded from a single cell to over one million cells before both RNA-seq and Repli-seq analysis, effectively averaging out transient transcriptional and/or replication fluctuations, and thus reflecting stable, mitotically heritable epigenetic states.
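The across-clone criterion described above (an imbalance threshold plus at least one biallelic clone) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 0.8/0.2 cutoffs come from the response, while the numeric band used to call a clone "biallelic", the function names, and the input layout are assumptions introduced here.

```python
from typing import Dict

# The 0.8 / 0.2 imbalance cutoffs come from the text above.
IMBALANCE_HI = 0.8
IMBALANCE_LO = 0.2
# ASSUMPTION: the response does not define "biallelic" numerically;
# this band is a hypothetical choice for illustration only.
BIALLELIC_BAND = (0.35, 0.65)


def classify_clone(ratio: float) -> str:
    """Label one clone by the fraction of reads from haplotype 1."""
    if ratio > IMBALANCE_HI or ratio < IMBALANCE_LO:
        return "imbalanced"
    if BIALLELIC_BAND[0] <= ratio <= BIALLELIC_BAND[1]:
        return "biallelic"
    return "intermediate"


def gene_shows_aei(ratios_by_clone: Dict[str, float]) -> bool:
    """True if at least one clone is imbalanced and at least one is biallelic."""
    labels = {classify_clone(r) for r in ratios_by_clone.values()}
    return "imbalanced" in labels and "biallelic" in labels
```

Under this sketch, a gene that is imbalanced in every clone is not called, because a uniform bias across clones would not show the clone-to-clone variability the criterion is meant to capture.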

 

More robust approaches would include using magnitude of imbalance, annotating VERTs by genomic location, applying stricter thresholds for replication timing, and benchmarking AEI distributions against the X chromosome.

 

All VERT regions identified in this study were annotated according to both the magnitude of allelic imbalance and their genomic coordinates, using 250 kb windows for the human samples and 50 kb windows for the mouse samples (see Supplementary Tables 1 and 6). Figure 1c directly compares the magnitude of imbalance, defined as outliers in the standard deviation, for both allelic replication timing and allelic expression across autosomal and X-linked loci in female ACP cells.

In addition, we will benchmark the magnitude of replication timing imbalance using autosomal imprinted regions as a second internal control. We detected allelic replication imbalance at 13 known imprinted loci, and the standard deviation of replication timing at these loci, measured in 250 kb windows, is comparable to that observed across the >350 VERT regions detected at non-imprinted sites. To illustrate this comparison, we will include a supplementary figure directly comparing imprinted and non-imprinted regions.
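As a rough illustration of what "outliers in the standard deviation" across fixed windows could mean in practice, the sketch below computes a per-window standard deviation of allelic replication timing values and flags windows whose standard deviation is unusually high. The 250 kb window size is taken from the response; the z-score outlier rule, its cutoff, the function names, and the input layout are assumptions made here and may differ from the authors' actual procedure.

```python
from statistics import mean, pstdev
from typing import Dict, List

WINDOW_BP = 250_000  # from the text: window size used for human samples
Z_CUTOFF = 2.5       # ASSUMPTION: hypothetical outlier cutoff, not from the source


def window_index(position_bp: int) -> int:
    """Map a genomic position to its fixed-size window index."""
    return position_bp // WINDOW_BP


def flag_variable_windows(rt_by_window: Dict[int, List[float]]) -> List[int]:
    """Flag windows whose across-allele replication timing SD is an outlier.

    rt_by_window maps window index -> replication timing values,
    one value per allele per clone (hypothetical input layout).
    """
    sd_by_window = {w: pstdev(vals) for w, vals in rt_by_window.items()}
    sds = list(sd_by_window.values())
    mu = mean(sds)
    sigma = pstdev(sds)
    if sigma == 0:
        return []  # no window varies more than any other
    return sorted(
        w for w, sd in sd_by_window.items() if (sd - mu) / sigma > Z_CUTOFF
    )
```

The same function could be run separately on imprinted and non-imprinted windows to compare their standard deviation distributions, which is the kind of benchmark described above.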

 

Figures and text would benefit from improved clarity: axis labels are missing in places (e.g., Fig. 1c, Fig. 2g), legends should explain chromosome arm colors, and cluttered figures such as Fig. 1j could be re-visualized for interpretability.

 

Figure labels will be added to Figs. 1c and 2g, and legends will be modified for clarity.

 

“the claim of cell-type specificity is not convincingly demonstrated given the small sample size (n=4) and strong batch confounding between lymphoblastoid and cartilage progenitors.” And “Hierarchical clustering is confounded by batch and based on presence/absence calls that lack quantitative resolution.”

 

We agree that the limited number of individuals and clones, and the comparison between only two distinct tissue types (LCLs and ACPs), limit the quantitative conclusions that can be drawn. Our primary intent was to evaluate whether any I/SCs were shared between independently derived clonal datasets and to determine whether there is evidence of tissue-specific I/SC usage, rather than to make quantitative claims about global cell-type specificity.

To address this concern, we will replace the hierarchical clustering analysis currently shown in Figure 1i with a Venn diagram that more directly illustrates the overlap and tissue-specific distribution of VERT regions detected in the different clonal sets. This revised representation avoids assumptions about clustering relationships and removes batch-driven bias, while still conveying the key observation that many VERT regions are shared across tissues and others appear tissue-restricted.

 

While syntenic VERT regions across mouse and human are intriguing, they complicate interpretation of strong clustering by cell type. Sampling depth may also have exaggerated allelic imbalance calls.

 

We note that the human LCLs used in our study are B cells, and immunoglobulin gene rearrangements were used to confirm the clonal uniqueness of each line. Similarly, the mouse replication timing data analyzed here was generated from pre-B cells, which also undergo immunoglobulin gene rearrangement. Thus, both the human LCL and mouse pre-B cell datasets were derived from B-cell lineages, providing a consistent cellular context for comparative analysis.

Sequencing depth is an important consideration for all variant base calls. Without fully haplotype-resolved genomes, previous studies relied on calculating per-SNP calls of allelic imbalance based on reads covering a single nucleotide locus. To improve sequencing depth supporting the identification of VERT and AEI regions, we utilized fully haplotype-resolved genomes that allowed all informative allele-specific reads to be pooled across all heterozygous SNPs within genomic windows or expressed genes. For AEI, we required a minimum of 20 informative allele-specific reads per gene, an FDR-corrected p-value ≤0.05, and a minimum of 80% vs 20% allelic imbalance. Importantly, a recent study has shown that allelic imbalance as low as 65%/35% is enough to affect disease penetrance in humans (Nature 2025; 637:1186–1197). We reiterate that more stringent thresholds (>90% and >95% imbalance) are presented in Supplementary Table 2.
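The per-gene filters listed above can be sketched in a few lines. The three cutoffs (20 reads, FDR ≤0.05, 80% imbalance) are taken from the response. The statistical test is not named in the response, so the exact two-sided binomial test against a 50/50 expectation and the Benjamini-Hochberg correction used here are assumptions, as are the function names and the input layout.

```python
from math import comb
from typing import Dict, List, Tuple

MIN_READS = 20       # from the text: minimum informative allele-specific reads
MAX_FDR = 0.05       # from the text: FDR-corrected p-value cutoff
MIN_IMBALANCE = 0.8  # from the text: 80% vs 20% allelic imbalance


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial p-value against p = 0.5 (ASSUMED test)."""
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_aei(snp_counts: Dict[str, List[Tuple[int, int]]]) -> Dict[str, bool]:
    """snp_counts maps gene -> [(hap1_reads, hap2_reads), ...] per het SNP.

    Genes below the pooled depth filter are omitted from the result.
    """
    pooled = {}
    for gene, counts in snp_counts.items():
        h1 = sum(a for a, _ in counts)  # pool reads across heterozygous SNPs
        h2 = sum(b for _, b in counts)
        if h1 + h2 >= MIN_READS:
            pooled[gene] = (h1, h2)
    genes = list(pooled)
    pvals = [binomial_two_sided_p(pooled[g][0], sum(pooled[g])) for g in genes]
    qvals = benjamini_hochberg(pvals)
    calls = {}
    for g, q in zip(genes, qvals):
        h1, h2 = pooled[g]
        major_fraction = max(h1, h2) / (h1 + h2)
        calls[g] = q <= MAX_FDR and major_fraction >= MIN_IMBALANCE
    return calls
```

Pooling across SNPs before testing is what distinguishes this from a per-SNP call: a gene with three SNPs at roughly 10 reads each passes the depth filter even though no single SNP would.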

 

Gene set enrichment analysis should be restricted to avoid inflated significance from overly broad categories.

 

Reviewer #2:

Some of the GO terms presented are too broad to suggest any biological significance to the result, even if there is statistical significance (for example, the top term for LCL clones 'Cytoplasm' is associated with 12,000 genes, and the second term for mouse clones 'Membrane' is associated with 10,000). It would be helpful to focus on GO terms lower in the GO hierarchy.

 

We will include our complete Gene Ontology analysis, with more specific biological categories, in Supplemental Table 5.

 

Allelic imbalance has been referred to as AI, MAE (monoallelic expression), RMAE (random monoallelic expression) etc. The paper whose mouse data the authors make use of uses Asynchronous Stochastic Replication Timing (ASRT) instead of VERT to refer to the same phenomenon. Creating unnecessary jargon makes the paper more difficult to read and adds needless complexity to an already complex field.

 

While we agree that allelic expression imbalance has been described by different investigators using many different phrases, we believe that MAE, RMAE and AI do not represent an accurate description of the phenomenon. In our study [and our previous study; Nat Commun. 2022; 13(1):6301] we used clonal analysis of allele-specific expression and found that while some clones display equivalent levels of expression between alleles of a given gene (i.e. bi-allelic expression) other clones express only one allele (i.e. mono-allelic expression), and yet other clones have undetectable expression (i.e. silent on both alleles). This pattern of allele-restricted expression indicates that each allele independently adopts either an expressed or silent state. Importantly, because these expression states are mitotically stable, allele-autonomous, and independent of parental origin, we refer to the choice of the expressed allele as stochastic. Given this variability, we believe that the phrase “Allelic Expression Imbalance” (AEI) represents a more accurate descriptor for this phenomenon. We also point out that “Allelic Expression Imbalance” has been used >120 times in the PubMed database.

In addition, the replication asynchrony that exists at these loci is not consistent with purely ASynchronous Replication Timing (ASRT) between alleles. We found that each allele can independently adopt either earlier or later replication timing in different clones. This variability results in some clones exhibiting pronounced asynchrony between alleles, while in others, the two alleles replicate synchronously, with both adopting either the earlier or later timing state. As reported in our previous study (Nat. Commun. 2022; 13:6301), this behavior reflects a stochastic and allele-autonomous process, leading us to describe these loci as exhibiting Variable Epigenetic Replication Timing (VERT), which we believe is a more accurate descriptor of this phenomenon.

 

The point that allelic imbalance is enriched in VERTs would be enhanced if the authors could present the allelic ratio for all genes found in all VERTs, demonstrating how replication timing on either chromosome affects the allelic ratio.

 

The stochastic nature of allelic expression and replication timing observed at VERT loci indicates that each allele independently acquires its epigenetic state. Specifically, the expressed or silent status of one allele does not predict the replication timing or expression status of the opposite allele. Accordingly, the Early/Late pattern of replication timing that we detect, both in this study and in our previous work (Nat. Commun. 2022; 13:6301), is not correlated with which allele is transcriptionally active. This supports our conclusion that asynchronous replication timing is not a downstream consequence of monoallelic transcription, but rather an independent epigenetic feature of I/SCs. Regardless, we will provide the combined expression ratios for all transcripts that are located within the VERT regions in a Supplemental Table.

In addition, our analysis of imprinted loci reveals that even at genomic regions with parent-of-origin–specific expression, replication timing does not align with allelic activity: both early- and late-replicating alleles can be transcriptionally active, depending on the gene. This observation is consistent with the complex organization of many imprinted domains, where genes on opposite alleles exhibit reciprocal expression patterns. To illustrate this point, we will include a new supplemental figure demonstrating that imprinted loci harbor genes expressed from both the earlier- and later-replicating alleles.

           

Figure 3 highlights the association of related gene clusters with VERTs but the VERTs are assigned based on variable replication timing in just 1 or 2 clones. This is an interesting observation, but to make the point that "VERT regions frequently coincide with gene clusters in the human genome" there needs to be a systematic assessment of replication timing at all gene clusters across all clones, and a statistical test for significance.

 

Our intent in Figure 3 was not to suggest that all gene clusters are subject to VERT and AEI, but rather to highlight that several well-characterized multigene families that are known to exhibit random AEI, such as olfactory receptor and HLA gene clusters, coincide with VERT regions at their genomic locations. These examples serve as representative illustrations demonstrating that I/SC-associated regulation occurs at established AEI loci organized in gene clusters.

To clarify this point, we will revise the text to explicitly state that Figure 3 presents illustrative examples of known AEI-associated gene clusters overlapping with VERT regions, rather than a comprehensive or statistically exhaustive analysis of all gene clusters across the genome.

 

It is an interesting hypothesis that VERTs are conserved between species at syntenic loci. If such regions are really conserved, one would expect that replication timing at these sites would be consistently asynchronous. However, the data presented shows that in human clones these VERTs can be specific to an individual donor (as in 5A) or an individual clone (as in 5H).

 

As discussed in our Limitations section, our analysis was restricted to a limited number of cell types, clones, and individuals, which may not capture the full diversity of I/SC usage across tissues and populations. While our dataset was sufficient to identify robust patterns of AEI and VERT, it likely represents only a subset of the broader landscape of I/SC regulation in both humans and mice. We anticipate that future studies incorporating a wider range of tissues, individuals, and clonal analyses will uncover an even greater degree of conservation and diversity in I/SC usage across genomes.

 

In order to support the claim that neurodevelopmental disease associated genes reside in asynchronously replicating regions, and are thus more prone to allelic imbalance, the authors would need to demonstrate this phenomenon in neuronal cells.

 

We make two points that address this critique: First, many of the neurodevelopmental disease genes located within or adjacent to VERT regions are not exclusively expressed in neuronal cells and have already been shown to exhibit AEI in non-neuronal contexts. For example, Gimelbrant and Chess (Science, 2007; 318:1136–1140) demonstrated AEI of the Parkinson disease genes SNCA and LRRK2 in lymphoblastoid cell lines (LCLs), and in our previous study, we detected AEI of DNAJC6, another Parkinson disease gene, in LCL cells (Nat. Commun. 2022; 13:6301). In the present study that used ACP cells, we identified VERT and AEI of several epilepsy-associated genes, including SCN1A, SCN2A (Fig. 6b), GABRA1 (Fig. 6e), and SAMD12 (Fig. 6j), as well as a gene implicated in autism and neurodevelopmental disorders, SEMA5A (Fig. 5c).

Second, independent studies from the E. Heard laboratory have provided further evidence that AEI occurs in neuronal lineages. Using mouse neural progenitor cells (NPCs), they identified genes subject to AEI (Dev. Cell, 2014; 28:366–380) and they later evaluated AEI of syntenic human neurodevelopmental disease genes, including Snca, App, Eya4, and Grik2 (Nat. Commun. 2021; 12:5330). In addition, they used the phrase “Allelic Expression Imbalance” to describe the epigenetic expression biases at these genes.

Together, these findings reinforce that AEI, and by extension I/SC regulation, is not restricted to specific cell types, but rather represents a generalizable mechanism of stochastic epigenetic regulation that includes genes relevant to neurodevelopment and disease.

 

However, the authors consistently lean on thin evidence (i.e. a single clone) within a modestly sized dataset (4 clones from 2 donors each) to propose a new model for haploinsufficiency in human disease. The consistent focus on limited elements in the data and perhaps an overreach in the interpretation makes it difficult to appreciate what is in fact a very good experiment.

 

We agree that our analysis was conducted on a modest number of clones and individuals, which we explicitly acknowledge as a limitation of the present study. However, several key points support the robustness and broader relevance of our conclusions:

i. Clonal Design and Replication: The strength of our approach lies in its clonal resolution. Each clone represents a single-cell–derived population expanded to over a million cells, enabling direct detection of stable, mitotically heritable allele-specific epigenetic states that would not be apparent in population-averaged data. Importantly, many of the VERT regions we identified are shared between independent clones from different donors and across distinct cell types (ACP and LCL), demonstrating reproducibility and biological consistency.

ii. Cross-Species Validation: We further identified syntenic VERT regions in mouse pre-B cell clones, including at loci known to exhibit AEI in prior studies, providing independent validation and evolutionary conservation of the phenomenon.

iii. Integration with Published Evidence: Our findings extend prior observations of AEI and variable replication timing (e.g. Gimelbrant et al. Science 2007; Heskett et al. Nat. Commun. 2022) and are fully consistent with known stochastic allelic expression imbalance of autosomal genes. We also draw parallels with the absence of cellular selection mechanisms that dictate dominant inheritance patterns for loss-of-function alleles for X-linked disease genes (reviewed in: J Clin Invest, 2008, 20-23; and Nat Rev Genet. 2025, 26, 571–580). Our proposed model linking I/SC regulation to haploinsufficiency is therefore a synthesis of our results with an extensive body of published data, not an inference drawn from isolated observations.

iv. Scope and Framing: We will revise the manuscript to clarify that our proposed model represents a mechanistic framework, not a definitive or exclusive explanation, for how stochastic allelic regulation could contribute to dosage-sensitive disease phenotypes. We will also explicitly discuss the need for larger datasets and additional tissues to refine and test this model.

 

In summary, while we recognize the limited sampling inherent to clonal analyses, the consistency of our observations across donors, cell types, and species, together with prior corroborating studies, supports the validity of the conclusions and justifies the broader conceptual implications.

 

Description of analyses that authors prefer not to carry out

 

Reviewer #1:

Cell-type specificity and mitotic stability both require stronger evidence; the latter is inferred indirectly from clonal expansion rather than shown directly, and orthogonal experiments (e.g., allele-specific ChIP-seq, DNA methylation) would be required.

 

We disagree with this reviewer that the mitotic stability of the epigenetic states is “inferred indirectly from clonal expansion rather than shown directly”. Our experimental design inherently captures mitotically stable, allele-specific states because each clonal line is derived from a single progenitor cell and expanded to millions of cells before analysis. The allele-specific replication timing and expression profiles observed in these clones therefore reflect epigenetic states that are stably inherited across many cell divisions, rather than transient or stochastic fluctuations. This approach was also validated in our previous study (Nat. Commun. 2022; 13:6301), where the same clonal strategy demonstrated stable allele-restricted replication and expression patterns over extended passages.

We agree that orthogonal assays such as allele-specific ChIP-seq or DNA methylation analyses would provide additional mechanistic detail on the nature of I/SC-associated regulation. However, these experiments fall outside the scope of the present study, which was designed specifically to identify and map autosomal loci that exhibit coordinated AEI and VERT, the defining epigenetic features of I/SCs. While we fully acknowledge that defining the precise molecular marks (e.g., histone modifications, DNA methylation, chromatin accessibility) that underlie I/SC regulation will be an important future direction, our current data provide a genome-wide, allele-resolved foundation upon which such mechanistic studies can build.

In summary, the current dataset achieves the central goal of defining the genomic distribution and conservation of I/SCs based on functional readouts of replication timing and expression. Future work will extend these findings using allele-specific epigenomic profiling to characterize the epigenetic modifications associated with I/SC stability and cell-type specificity.