<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>OAR@UM Collection:</title>
  <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/111132" />
  <subtitle />
  <id>https://www.um.edu.mt/library/oar/handle/123456789/111132</id>
  <updated>2026-04-07T18:44:35Z</updated>
  <dc:date>2026-04-07T18:44:35Z</dc:date>
  <entry>
    <title>An annotation framework for variants that alter promoter transcription factor binding sites (nCODREG)</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/119660" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/119660</id>
    <updated>2024-03-12T08:43:47Z</updated>
    <published>2022-01-01T00:00:00Z</published>
    <summary type="text">Title: An annotation framework for variants that alter promoter transcription factor binding sites (nCODREG)
Abstract: Transcriptional regulation is a complex biological process requiring the combined activity
of numerous molecules, including transcription factors, cofactors and chromatin
regulators. Transcription factors recognise and bind to short non-coding sequences known
as motifs found in genes’ regulatory regions, such as promoter, enhancer and silencer
regions. This allows transcription factors to modulate the recruitment and activation of
RNA polymerase II, the multiprotein complex responsible for the transcription of all
protein-coding genes. The presence of genetic variants in regulatory regions may disrupt
transcription factor binding, culminating in altered gene expression and protein
production. Indeed, genome-wide association studies (GWAS) have flagged several
variants in regulatory regions associated with disease development and traits. Hence, the
annotation of variants residing in regulatory sites has become increasingly important in
genomic studies and disease interpretation.
This study describes the implementation of an annotation framework for variants residing
in gene promoter regions which may create, delete or alter the binding affinity
of transcription factor binding sites. Variants are annotated by querying a publicly available
RESTful web service called VEP, and a Biopython module called Bio.motifs which computes
the position weight matrix (PWM) scores from two locally saved motif collections called
JASPAR and HOCOMOCO. The outcome is a list of promoter variants annotated with
the transcription factors which may be affected by the variants, and the expected binding
ability at the variants’ site. Used together, VEP and the motif collections can strengthen
the evidence for a particular variant’s annotation. Results on our dataset show that on average 12% of
Whole Exome Sequencing (WES) variant locations and 8.5% of Whole Genome Sequencing
(WGS) variant locations flagged by VEP were also flagged by JASPAR’s motif collection.
Compared to other motif finding tools, the implemented annotation framework automates
the whole annotation process by building the required nucleotide sequences adjacent to
the promoter variants, while ensuring the variants are always within the nucleotide
sequence being scanned by the motifs. In addition, the annotation process is able to scale
up according to the number of CPU cores available on the running machine. Enabling multi-core
execution on a 4-core processor resulted in a 66% decrease in execution time on the dataset
compared to single-core execution, thus speeding up the annotation of millions
of variants within high-throughput sequencing data files.
Description: M.Sc.(Melit.)</summary>
    <dc:date>2022-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Recalibration of minor alleles in the human reference sequence</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/111302" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/111302</id>
    <updated>2023-07-05T11:01:09Z</updated>
    <published>2022-01-01T00:00:00Z</published>
    <summary type="text">Title: Recalibration of minor alleles in the human reference sequence
Abstract: The intrinsic problem of minor alleles occupying reference positions in the Human Reference
Sequence build 37 may challenge the notion of accurate variant calling and result in variant
misinterpretation in clinical practice. In this research study, a bioinformatics pipeline,
RecAl, was developed with the primary aim of detecting all reference minor alleles and generating
three VCF files during sample analysis. These files include the false-positive variants, the
false-negative variants, and a separate corrected sample VCF file with the false-positive
variants removed and the false-negative variants incorporated.
When the sample files were processed through RecAl, the percentages of false-positive
variants detected at alternate allele frequency thresholds of 0.90, 0.95 and 0.99 were 9.7%,
7.5% and 5.4% respectively. For the false-negative variants, RecAl identified 0.013%, 0.007%
and 0.005% respectively. Each of these variants was annotated using popular pathogenicity
prediction tools including CADD (Kircher M et al., 2014), PolyPhen-2 (Adzhubei I et al., 2010)
and SIFT (Ng, P. and Henikoff, S., 2001). The results showed that 1.24% of the
false-positive variants and 0.87% of the false-negative variants are deleterious, with the
sequence variation having a significant impact.
Additionally, the list of reference minor alleles generated by RecAl was compared to
the study by Fuentes F et al. (2012), which focused on false-positive calls due to
reference minor alleles in exome regions. In this evaluation, 90% of the variants matched,
which indicates that the problem of minor alleles occupying reference positions is still
prevalent and that the list of reference minor alleles generated by RecAl is reliable. Lastly, the
reference minor alleles in Human Reference Sequence build 37 were
compared to those in build 38 to assess how many had been
corrected; only 9% were.
Description: M.Sc.(Melit.)</summary>
    <dc:date>2022-01-01T00:00:00Z</dc:date>
  </entry>
</feed>

