How can duplication of a chromosome




















This is a cause of some birth defects. Each chromosome has many segments. These are usually divided into a "short arm" and a "long arm" of the chromosome.

The short arm, which is the upper half of the chromosome, is known as the "p arm." The term "deletion" simply means that a part of a chromosome is missing or "deleted." When genes are missing, there may be errors in the development of a baby, since some of the "instructions" are missing.

One example of a genetic syndrome caused by a deletion is "Cri du Chat," in which part of chromosome 5 is missing or deleted. Cri du Chat, or "Cat Cry syndrome," is found in approximately 1 in 20, to 50, live births in the U. Cri du Chat is caused by a deletion of chromosome 5p, which is written "5p-." Affected individuals also have problems with language, and may express themselves by using a small number of words or sign language.

Other abnormalities that can occur with a 16p However, there is no particular pattern of physical abnormalities that characterizes 16p These changes are present in about 4 in 10, people who have mental health problems or difficulties with speech and language.

Many people with the duplication are likely never diagnosed because there are many causes of these problems, and some people with the duplication have no related health or developmental problems. People with a 16p This duplication affects one of the two copies of chromosome 16 in each cell. The length of the duplicated segment is most often about , DNA building blocks (base pairs), also written as kilobases (kb).

The kb region contains more than 25 genes, and little is known about the function of most of these genes. Researchers are working to determine how the extra genetic material contributes to the features of 16p Most affected individuals inherit the duplication from one affected parent; they may have similar characteristics of the condition as the parent, or they may be either more or less severely affected.

However, in some cases 16p Instead, they occur as random events during the formation of reproductive cells (eggs and sperm) or in early fetal development. People with a new duplication typically have no history of related signs or symptoms in their family, although their children can inherit the chromosomal change.


Paleopolyploidization is rampant in the plant kingdom and is the dominant feature of plant genome evolution, but not of the evolution of animals and fungi (Moghe and Shiu; Michael and VanBuren; Wendel; Salman-Minkov et al.). In addition to whole-genome duplication (WGD), single-gene duplication is also prevalent in plant genomes over long evolutionary time periods (Freeling; Wang Y.). However, gene loss after gene or genome duplication is very common in plant genomes (Lynch and Conery). Tandem duplications often occur as a result of unequal crossing over and are often followed by inversion events (Freeling; Hahn). A proximal gene pair comprises two gene copies that are closely located on the chromosome but separated by a few genes (Wang et al.).

Two contiguous gene duplicates that originated from ancient tandem duplication events can be disrupted by the insertion of other genes (Freeling et al.). In addition, localized transposon activity can result in proximal duplications (Zhao et al.). Transposed duplication events can take place through DNA-based transposition or RNA-based transposition (retrotransposition), in which the duplicated gene is relocated to a new chromosomal position (Freeling; Hahn; Wang et al.). However, the mechanism underlying the abundance of dispersed duplicated genes remains unclear.

Because of the various genetic mechanisms for generating different modes of gene duplications, we can speculate that different types of gene duplications may evolve along distinct evolutionary trajectories, and may have been retained in a biased manner over long evolutionary time periods.

The preservation of duplicate genes can be attributed to the interactions of multiple factors, such as gene features, gene expression level, alternative splicing and protein-protein interactions (Du et al.). The evolutionary rate, structural complexity, and GC3 content may be strongly correlated with the retention of WGD-derived duplicated genes (Jiang et al.). Expression divergence between duplicated genes occurred ubiquitously after gene duplication in plant genomes (Blanc and Wolfe; Renny-Byfield et al.).

A positive correlation between structural divergence and gene expression divergence has been observed in Arabidopsis (Wang et al.). Following gene duplication, divergence of the promoter sequences of duplicated genes may lead to their expression divergence (Zhang; Hahn). The frequent gain and loss of cis-regulatory elements in the promoters of parent and daughter genes occurred shortly after gene duplication, resulting in subfunctionalization (Force et al.).

Another important model underlying duplicated gene retention following WGD is the gene dosage balance model (Birchler and Veitia). This model states that duplicated genes that are dosage-sensitive or frequently interact with other genes tend to be retained, because the loss of one of the duplicates causes dosage imbalance and decreases fitness. Many other evolutionary models have also been proposed to elucidate the mechanisms underlying the short- and long-term retention of duplicated genes (Freeling; Conant et al.).

However, the relationships among structural, expression and regulatory divergence between duplicated genes are not well understood. Which factors maintain genetic redundancy over long time periods is still controversial. Second, we attempted to explore the relationships among sequence, expression and regulatory divergence. Third, we addressed whether different modes of duplicated genes evolved toward biased functional roles. In addition, the contribution of gene duplication to biological innovation was evaluated by investigating the expansion patterns of gene families involved in key fruit traits.

Chinese white pear (Pyrus bretschneideri) genome sequences and annotation files were downloaded from the Pear Genome Project (Wu et al.). Chinese plum (Prunus mume) genome sequences and annotation information were downloaded from the Prunus mume Genome Project (Zhang et al.). The other 32 plant genome data sets were downloaded from Phytozome v9. First, an all-vs.

The modes of gene duplication were determined using the algorithm within MCScanX according to the following procedure: all genes were initially ranked according to their order along chromosomes and were labeled as singletons. A transposed duplicate pair must meet the following criteria: one gene exists in its ancestral locus, and the other is located in a non-ancestral locus (Wang Y.).

Therefore, ancestral gene locations were first discerned by synteny alignment. The synteny analyses between pear and 34 other plant genomes were conducted locally using a method similar to that developed for the Plant Genome Duplication Database (PGDD; Tang et al.). Then, all syntenic blocks between pear and the 34 other species were identified. Finally, pear genes located in these syntenic blocks were deemed to be at ancestral loci.

If a pair of transposed duplicated genes comprised an ancestral gene with more than two exons and a novel transposed copy without an intron, then this pair was inferred to be derived from RNA-based transposition (retrotransposition).

If both genes in a transposed duplicated pair had a single exon, the pair was set aside temporarily. The remaining pairs of transposed duplicated genes were inferred to have originated from DNA-based transposition (Wang Y.). In the present study, because multiple ancestral loci may be found for a transposed duplicate, the ancestral locus with the highest similarity was identified as the parental duplicate (Wang et al.).
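The exon-count rules above can be written as a small decision function. This is a minimal sketch that assumes "without an intron" means a single exon; the `Gene` record, the function name and the returned labels are hypothetical and are not taken from MCScanX or from the study's own scripts.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical minimal gene record: an identifier and its exon count."""
    gene_id: str
    exon_count: int


def classify_transposed_pair(ancestral: Gene, transposed: Gene) -> str:
    """Label a transposed duplicate pair following the rules stated in the text."""
    # Both copies have a single exon: the text sets these pairs aside temporarily.
    if ancestral.exon_count == 1 and transposed.exon_count == 1:
        return "set_aside"
    # Ancestral copy with more than two exons and an intronless (single-exon)
    # transposed copy: inferred RNA-based transposition (retrotransposition).
    if ancestral.exon_count > 2 and transposed.exon_count == 1:
        return "RNA-based"
    # All remaining pairs: inferred DNA-based transposition.
    return "DNA-based"
```

For example, `classify_transposed_pair(Gene("parent", 5), Gene("copy", 1))` returns `"RNA-based"`, while a pair in which both copies keep several exons falls through to `"DNA-based"`.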

After all duplicated pairs were classified into different patterns, each duplicated gene was assigned to a unique mode. The valid duplicate gene pairs originating from different duplication modes were used to calculate the nonsynonymous (Ka) and synonymous (Ks) substitution rates. This method averages parameters across 14 candidate models (Zhang et al.). Information on the RNA-seq samples used in this study is given in Supplementary Table 6.

The raw reads were filtered using Trimmomatic version 0. The high-quality clean reads were adopted in the downstream analysis. The reference transcripts obtained from pear genome annotation files were used to construct a Kallisto index.

Then, the Kallisto quantification algorithm was run with default parameters (for single-end reads, -l -s 20) to process single-end or paired-end reads. The output included the normalized count estimates and TPM values for each transcript. The TPM value was used as the measure of gene expression level in different tissues and developmental stages. Furthermore, we extracted all intergenic regions at the whole-genome level for pear and quantified the expression abundance of the intergenic sequences using the same procedure and RNA-seq reads as for the exonic regions.

We used the mean value 0. Here, we only used those duplicated pairs in which both gene copies were expressed in at least one tissue (Makova and Li; Wang et al.). Then, we established a cutoff r-value below which two duplicate genes were considered divergent in expression.

We randomly selected 10, gene pairs and computed r-values for their expression profiles. As the putative promoter sequence, bp upstream of the transcriptional start site of each gene was extracted using BEDTools (Quinlan and Hall). Then, we used SharMot (-l 16) to estimate the promoter-sequence divergence (dSM; shared-motif divergence) for each gene pair (Castillo-Davis et al.). We randomly selected 10, gene pairs and computed their sLS values.

In this study, we investigated whole-gene conversion for each gene pair generated by different modes of gene duplication in pear. First, we determined the homologous gene quartets, each comprising two paralogs in pear and their respective orthologs in apple (the outgroup species). Then, we compared the gene similarity or tree topology between homologs in quartets by estimating their Ks values.

Bootstrap tests with repeated random sampling were performed to evaluate the significance of putative gene conversions. Because the genome duplication occurred before the divergence of pear and apple, we hypothesized that the pear-apple orthologs would be more similar to one another than to their respective paralogs in each species.

However, if the paralogs had experienced gene conversion after speciation, we would expect them to be more similar to each other than to their respective orthologs (Wang et al.). Then, we used hmmpress and Pfam-A. For each domain, we calculated the percentage of the domains represented in the proteins of each duplication mode or among the total proteins. The GO annotation for pear genes was obtained from the pear genome project (Wu et al.).

The reference IDs for the sugar- and acid-related metabolism genes in Arabidopsis were obtained from previous studies (Shangguan et al.). The corresponding Arabidopsis protein sequences were downloaded from Phytozome v11.

Finally, the gene family members involved in the sugar and organic acid metabolism pathways were determined in pear. The local all-vs. The MCScanX package was used to detect WGD- and tandem duplication (TD)-derived gene pairs, while the other modes of duplicated gene pairs were determined according to the procedures described in the Methods section. Additionally, gene pairs derived from proximal duplication (PD) were further identified according to the chromosomal interval (10 or fewer genes) between the two genes of a BLASTP hit.
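The interval rule for local duplicates can be sketched as follows. This is an illustration only: it assumes that each gene has an integer rank giving its order along its chromosome, that tandem pairs are immediately adjacent, and that "10 or fewer genes" counts the genes lying between the two hits. The function name and labels are hypothetical.

```python
def classify_local_pair(chrom_a, rank_a, chrom_b, rank_b, max_interval=10):
    """Classify a BLASTP hit pair as tandem (TD), proximal (PD) or other.

    rank_a and rank_b are the genes' positions in chromosome order.
    max_interval is the largest number of intervening genes allowed for PD.
    """
    # Genes on different chromosomes can be neither tandem nor proximal.
    if chrom_a != chrom_b:
        return "other"
    intervening = abs(rank_a - rank_b) - 1
    if intervening < 0:
        raise ValueError("a gene cannot be paired with itself")
    # Adjacent genes: tandem duplication.
    if intervening == 0:
        return "TD"
    # Separated by a few genes: proximal duplication.
    if intervening <= max_interval:
        return "PD"
    return "other"
```

Pairs labeled `"other"` here would still need the synteny-based steps described earlier to be assigned to the WGD, transposed or dispersed modes.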

Moreover, the density levels of different modes of duplicated genes fluctuated greatly along each chromosome (Supplementary Figure 2). However, the density levels of the dispersed duplication (DSD)-derived genes in the pericentromeric and chromosomal arm regions were similar. The chromosomal distribution of different modes of duplicated genes. Some chromosomal regions with low frequencies of WGD-derived genes often showed high frequencies of DSD-derived genes (Figure 1 and Supplementary Figures 2, 3).

In particular, a significant positive correlation between the density levels of WGD- and TD-derived genes was detected on 10 out of 17 chromosomes. In addition, the distributions of TD- and PD-derived genes overlapped to some extent on each chromosome. A positive correlation was found between the genomic densities of TD- and PD-derived genes on 15 out of 17 chromosomes. In addition, we investigated the gene features of different modes of duplicated genes, including the GC content, GC3 content, average exon length and coding-region length (Supplementary Figure 4).

Moreover, the RNA-based transposed duplication (RD)-derived genes showed a strong trend toward longer average exon lengths and shorter coding-region lengths. In contrast, the DNA-based transposed duplication (DD)-derived genes had shorter average exon lengths and longer coding-region lengths, suggesting that these genes possessed more exons.

Different gene duplication modes exhibited divergent Ka and Ks distributions. The boxplot further revealed that the RD-, DD-, and DSD-derived pairs had higher median Ka values than the other three modes, suggesting that they were more extensively mutated over the long evolutionary time period (Figure 2D). Evolutionary patterns of gene pairs duplicated by different modes in pear. We further classified the duplicated gene pairs into three groups based on their different selection pressures (Figure 3A).
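The text does not state the thresholds used for the three selection groups. A common convention, assumed here, reads the Ka/Ks ratio directly: above 1 as positive selection, equal to 1 as neutral, and below 1 as negative (purifying) selection. The sketch below follows that convention; the function name and the optional `tolerance` band around 1 are hypothetical.

```python
def classify_selection(ka, ks, tolerance=0.0):
    """Classify a duplicate pair by its Ka/Ks ratio (conventional reading).

    Returns None when Ks is not positive, because the ratio is undefined.
    tolerance widens the band around 1 that is treated as neutral.
    """
    if ks <= 0:
        return None
    ratio = ka / ks
    if ratio > 1 + tolerance:
        return "positive"
    if ratio < 1 - tolerance:
        return "negative"
    return "neutral"
```

In practice a statistical test on the ratio is often preferred to a bare threshold, since few pairs have a ratio of exactly 1.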

The percentage of PD-derived pairs 7. Furthermore, we performed a GO analysis of the duplicated genes undergoing positive selection to explore their functional roles (Figure 3B and Supplementary Table 4).

Protein binding GO was overrepresented in all modes of duplicate genes under positive selection. The characteristics of duplicated genes that experienced positive selection. (A) The proportion of duplicated gene pairs under different selection pressures.

Green dot: positive selection; red dot: neutral selection; blue dot: negative selection. (B) GO analysis of duplicated genes that experienced positive selection. A larger circle indicates a higher frequency of occurrence of a GO term. In addition, we investigated the whole-gene conversion events that occurred in different modes of duplicated gene pairs. RD- and DD-derived pairs were excluded from this analysis because their homologous gene quartets were not identified. The high frequency of gene conversion in WGD-derived pairs may partially account for their lower sequence divergence levels.

The functional roles of converted gene pairs were further analyzed (Supplementary Figure 6). Additionally, apoptotic process GO and defense response GO were overrepresented in converted PD-derived pairs.

RNA-seq data from different pear tissues and developmental stages were collected to comprehensively measure the expression divergence between duplicated genes (Supplementary Table 6).

Here, we only analyzed the duplicated pairs in which both gene copies were expressed in at least one tissue or developmental stage. The r-value was calculated between the expression profiles of the two copies of each gene pair, and 1 - r was used to measure the expression divergence between duplicated genes. To determine the cutoff indicating that the two copies of a pair had diverged in expression, we randomly selected 10, gene pairs and computed r between their expression profiles.
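The divergence measure can be illustrated with a short, dependency-free sketch. It assumes that r is the Pearson correlation between the two copies' TPM profiles across the same ordered set of samples. The cutoff is passed in as a parameter because the rule for deriving it from the random pairs is not given in this text; all function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def expression_divergence(tpm_a, tpm_b):
    """Expression divergence between two duplicates, measured as 1 - r."""
    return 1.0 - pearson_r(tpm_a, tpm_b)


def is_divergent(tpm_a, tpm_b, r_cutoff):
    """A pair counts as divergent when r falls below the chosen cutoff."""
    return pearson_r(tpm_a, tpm_b) < r_cutoff
```

Two copies whose profiles rise and fall together across tissues give a divergence near 0, whereas copies with opposite profiles approach the maximum of 2.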

Moreover, we investigated the dynamic process of expression divergence with increasing Ks values for different modes of gene duplication in pear. We used the Python NumPy module to fit a smooth curve between expression divergence and Ks for each mode of duplicated gene pairs with 10 degrees of freedom (Figure 4D).

The RD-derived pairs appear to have experienced more dramatic expression divergence than the other classes of duplicated genes. The abnormal curve for RD-derived pairs may result from the small number of RD pairs available when fitting the curve between Pearson r and Ks with a smooth spline with 10 degrees of freedom (Figure 4D).

Initially, RD-derived pairs were identified in this study. After filtering out the RD-derived pairs with abnormal or null r or Ks values, only 56 RD-derived pairs were retained. The expression divergence between duplicate genes in pear. (A) The distributions of TPM values in different tissues and conditions for intergenic sequences. The horizontal green line indicates the mean value of the medians in different boxplots. (C) The proportion of gene pairs with divergent and with conserved (undifferentiated) expression.

(D) The dynamic expression divergence between duplicate genes with increasing Ks values. Furthermore, we extracted bp upstream of the transcription start site of each gene as the putative promoter sequence. The RD-, DD-, and DSD-derived pairs, which had undergone extensive expression divergence, had also diverged dramatically in their promoter regions. Furthermore, the dynamic process of promoter divergence with Ks was dissected for different modes of gene duplication in pear.

The smooth curve between promoter divergence and Ks was fitted with 10 degrees of freedom for each mode of duplicated gene pairs (Figure 5C). In addition, the smooth curve between expression divergence and promoter divergence was fitted with 10 degrees of freedom for each mode of duplicated gene pairs. However, the promoter divergence between duplicated genes showed no significant correlation with expression divergence (Figure 5D). The promoter region divergence between duplicate genes in pear.

(A) The density distribution of the shared-motif similarity in the promoter sequence (sLS) between two duplicated genes for the different modes of duplicated pairs.

(B) The proportion of divergent and conserved gene pairs in the promoter region. (C) The dynamic promoter divergence between duplicate genes with increasing Ks values.

(D) The dynamic expression divergence between duplicated genes with increasing dSM values. Additionally, we identified the duplicated gene pairs retained from the recent and ancient WGD events in the pear genome and compared the patterns of divergence between these two sets of genes. Furthermore, two Ks peaks corresponding to the two WGD events were fitted from the Ks distribution by using mixture models with two components (Tang et al.).
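As an illustration of how two Ks peaks can be separated, the sketch below fits a two-component mixture by expectation-maximization. It assumes Gaussian components and a simple initialization from the lower and upper halves of the sorted values; the component family and software actually used in the study are not stated here, so the function names and defaults are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=200, min_sd=1e-3):
    """Fit a two-component Gaussian mixture to Ks values with EM.

    Returns two (mean, sd, weight) tuples sorted by mean, so the first
    corresponds to the younger (lower-Ks) peak.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least four values")
    half = n // 2
    # Initialize the means from the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    overall = sum(xs) / n
    sd0 = max(math.sqrt(sum((x - overall) ** 2 for x in xs) / n), min_sd)
    sds = [sd0, sd0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = p[0] + p[1]
            if total == 0.0:
                resp.append((0.5, 0.5))
            else:
                resp.append((p[0] / total, p[1] / total))
        # M-step: update weight, mean and standard deviation per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), min_sd)
            weights[k] = nk / n
    return sorted(zip(means, sds, weights))
```

The two fitted means then serve as estimates of the Ks peaks for the recent and ancient WGD events, and each pair can be assigned to the event whose component gives it the higher responsibility.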

The nonsynonymous substitution rates (Ka) were used to measure the sequence divergence between duplicated genes. The comparison of sequence, expression and promoter divergence between duplicated genes derived from recent and ancient whole-genome duplication (WGD) events.

(B) The comparison of sequence divergence between duplicated genes from two different WGD events. (C) The comparison of expression divergence between duplicate genes from two different WGD events.

(D) The comparison of promoter divergence between duplicate genes from two different WGD events. The conserved domains contained in a protein sequence may be related to the protein's functions. Therefore, we identified the Pfam domains of the protein sequences encoded by different modes of duplicated genes to resolve their biased functional roles (Supplementary Table 7).

The proportion of different domains detected in each mode of duplicate genes was calculated. We also estimated the proportion of different domains in whole-genome proteins as the control. The 10 most frequent domains in each mode of gene duplication were selected for a comparative analysis (Supplementary Figure 7). The enriched domains for different modes of duplicate genes were biased. Protein kinases function in a multitude of cellular processes, including metabolism, transcription, signal transduction, cell cycle progression, cytoskeletal rearrangement and cell movement, apoptosis, and differentiation (Manning et al.).

Therefore, the WGD-derived genes may play important roles in basal metabolism and biological regulation. Several domains involved in plant resistance and defense response, such as leucine rich repeat PF Prior studies revealed that PPR proteins play important roles in organellar gene expression, organelle e.

Notably, the ankyrin repeats domain PF Ankyrin repeat proteins are associated with plant organogenesis, male-female gamete recognition, and plant defense (Dong; Huang et al.). Moreover, we investigated the functional roles of different modes of duplicated genes through a GO enrichment analysis. First, we assigned pear genes into these three GO categories according to their GO annotations, and then we estimated the proportion of different GO categories detected in each mode of duplicate genes.

Interestingly, the results showed that different modes of duplicated genes were biased toward particular categories (Supplementary Figure 8). Strikingly, RD- and DD-derived genes may contribute substantially to the biosynthesis of cellular components, given the higher proportion of these genes in the cellular component category. Secondly, we performed the GO enrichment analysis with strict statistical tests.


