Cracking the Genomic Language of Somatic Hypermutation Targeting

(Collaboration with Dr. Jukka Alinikula, Turku University, Finland)

In mature B cells, the affinity of antibodies for their antigen is improved by the introduction of point mutations into the immunoglobulin (Ig) genes. This process, termed somatic hypermutation (SHM), is initiated by activation-induced deaminase (AID). AID-induced SHM is linked to transcription, but its preferential targeting to Ig genes over non-Ig genes is not well understood. Recent studies demonstrated that SHM targeting to Ig loci depends on cis-acting elements residing in the Ig enhancers (IgE), collectively termed DIVAC (DIversification ACtivators). These studies highlighted some transcription factor binding sites (TFBS) that are important for DIVAC activity, but none of them is unique to the Ig enhancers. Nevertheless, to date, no non-Ig enhancer has demonstrated SHM targeting activity similar to that of an Ig enhancer. We hypothesize that SHM targeting is enabled by a unique regulatory “language” that, like any other language, contains a highly defined combination of “letters” (individual TFBS), “words” (specific combinations of adjacent binding sites), and “sentences” (DIVAC).

In this project, we apply machine learning to data from millions of synthetic regulatory elements that have been tested for their capacity to induce SHM, in order to crack the genomic language of AID targeting.