Introduction
Even though the DNA double helix is a symmetric structure, many biological processes, such as replication, transcription and transcription factor binding, are directional. The directionality of these processes results in an inhomogeneous distribution of genomic sequences between the two complementary DNA strands. Reflecting these directional biases, strong compositional strand asymmetries have been observed across the entire tree of life, from viral to eukaryotic genomes.
DNA mutations can be oriented relative to transcription and replication, using as reference the template/non-template and leading/lagging strands, respectively. If the reference nucleotide or motif at the site of the mutation is found more frequently in one strand relative to the other, following correction for background strand preferences, it indicates a mutational strand asymmetry. This mutational strand imbalance can have a major impact on disease, development and evolution.
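As a rough illustration of the definition above, the following sketch orients mutations relative to transcription and compares the observed template fraction with a background expectation. It is a minimal example under stated assumptions, not the method of any particular tool: the input format (pairs of reference-base strand and gene strand), the function names, and the choice of an exact two-sided binomial test are all hypothetical choices made for this example.

```python
from math import comb


def orient(ref_strand, gene_strand):
    """Orient a mutation relative to transcription.

    Assumption: for a gene on a given strand, that strand is the
    non-template (coding) strand and the opposite one is the template.
    """
    return "non-template" if ref_strand == gene_strand else "template"


def strand_asymmetry(mutations, background_template, background_nontemplate):
    """Return (counts, enrichment, p_value) for template-strand bias.

    mutations: iterable of (ref_strand, gene_strand) pairs, each "+" or "-"
               (hypothetical input format).
    background_*: how often the reference nucleotide or motif occurs on
               each strand, used to correct for background preference.
    """
    counts = {"template": 0, "non-template": 0}
    for ref_strand, gene_strand in mutations:
        counts[orient(ref_strand, gene_strand)] += 1

    n = counts["template"] + counts["non-template"]
    if n == 0:
        raise ValueError("no mutations supplied")

    # Expected template fraction if mutations simply followed the background.
    p_expected = background_template / (background_template + background_nontemplate)
    k = counts["template"]

    # Exact two-sided binomial test: sum the probabilities of all outcomes
    # that are no more likely than the observed one.
    pmf = [comb(n, i) * p_expected**i * (1 - p_expected) ** (n - i) for i in range(n + 1)]
    p_value = min(1.0, sum(p for p in pmf if p <= pmf[k] * (1 + 1e-9)))

    enrichment = (k / n) / p_expected
    return counts, enrichment, p_value
```

An enrichment above 1 suggests an excess on the template strand relative to background, and a value below 1 suggests an excess on the non-template strand. The same pattern would apply to leading/lagging orientation, given a replication-direction annotation in place of the gene strand.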
Sequences that are non-palindromic can be oriented relative to one another. An exemplar case reflecting the contribution of strand asymmetries to biological processes is the transcription factor CCCTC-binding factor (CTCF). The orientation of the CTCF motif dictates chromatin looping and three-dimensional genome topology. Other cases of strand asymmetries include endogenous repetitive elements, which show preferences in their orientation relative to each other and relative to the direction of transcription and replication, and these preferences can influence their jumping activity.
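To make the notion of relative orientation concrete, here is a small sketch that checks whether a motif is palindromic (equal to its own reverse complement), locates exact matches on both strands, and labels the orientation of a pair of hits. The helper names are hypothetical, and the exact-string matching with a short toy motif is an assumption for illustration only. Real transcription factor motifs such as CTCF are longer and are usually modelled with position weight matrices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """A motif is palindromic if it reads the same on both strands."""
    return seq == reverse_complement(seq)


def find_hits(sequence, motif):
    """Return (position, strand) for exact motif matches on either strand.

    Positions are 0-based on the forward sequence. A "-" hit means the
    reverse complement of the motif was found at that position.
    """
    if is_palindromic(motif):
        raise ValueError("a palindromic motif has no strand orientation")
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def pair_orientation(upstream_strand, downstream_strand):
    """Label the relative orientation of two hits, ordered by position."""
    if upstream_strand == downstream_strand:
        return "same"
    # "+" followed by "-": the two motifs point towards each other.
    return "convergent" if upstream_strand == "+" else "divergent"
```

For example, calling `find_hits` on a sequence and passing the strands of two consecutive hits to `pair_orientation` gives a simple per-pair label, and counting these labels across a genome is one way to look for orientation preferences.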
By systematically studying strand asymmetries, we can identify novel DNA elements, better understand how they interact with one another, and clarify the contribution of the underlying processes to mutagenesis and evolution. To date, there is no versatile tool for analyzing strand asymmetries across biological problems.