TRAN: Transitions and Transversions | Ben Cunningham

Problem by Rosalind · on December 4, 2012

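For DNA strings s1 and s2 having the same length, their transition/transversion ratio R(s1, s2) is the ratio of the total number of transitions to the total number of transversions, where symbol substitutions are inferred from mismatched corresponding symbols as when calculating Hamming distance (see “Counting Point Mutations”). A transition replaces a purine with a purine (A ↔ G) or a pyrimidine with a pyrimidine (C ↔ T); a transversion swaps a purine for a pyrimidine or vice versa.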

Given: Two DNA strings s1 and s2 of equal length (at most 1 kbp).

Return: The transition/transversion ratio R(s1, s2).
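
As a rough illustration of the definition, here is a minimal base R sketch that compares two strings position by position and classifies each mismatch. The function name `ts_tv_ratio` and its arguments are hypothetical, chosen only for this example; it assumes the inputs contain nothing but A, C, G and T.

```r
# Hypothetical helper (not part of the solution further down): classify each
# mismatched position as a transition or a transversion and return the ratio.
ts_tv_ratio <- function(s1, s2) {
  a <- strsplit(toupper(s1), "")[[1]]
  b <- strsplit(toupper(s2), "")[[1]]
  stopifnot(length(a) == length(b))

  # Keep only mismatched positions, as when computing Hamming distance
  mismatch <- a != b
  pair <- paste0(a[mismatch], b[mismatch])

  # Transitions stay within purines (A <-> G) or within pyrimidines (C <-> T)
  transitions <- sum(pair %in% c("AG", "GA", "CT", "TC"))
  transversions <- length(pair) - transitions

  transitions / transversions
}

# Toy input: A->G and C->T are transitions, A->C is a transversion
ts_tv_ratio("ACAG", "GTCG")
# 2
```

If the two strings happen to contain no transversions, the division yields `Inf` (or `NaN` when there are no mismatches at all), so a caller may want to handle those cases separately.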

Sample Dataset

>Rosalind_0209
GCAACGCACAACGAAAACCCTTAGGGACTGGATTATTTCGTGATCGTTGTAGTTATTGGA
AGTACGGGCATCAACCCAGTT
>Rosalind_2200
TTATCTGACAAAGAAAGCCGTCAACGGCTGGATAATTTCGCGATCGTGCTGGTTACTGGC
GGTACGAGTGTTCCTTTGGGT

Sample Output

1.21428571429
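
The sample output can be checked by hand: the two sample strings differ at 31 positions, of which 17 are transitions and 14 are transversions, and 17 / 14 ≈ 1.2143. The sketch below reproduces those counts in base R with the sample strings hard-coded; the helper `is_purine` and the variable names are assumptions made for this illustration.

```r
# Sample strings from the dataset above, joined across their line breaks
s1 <- paste0(
  "GCAACGCACAACGAAAACCCTTAGGGACTGGATTATTTCGTGATCGTTGTAGTTATTGGA",
  "AGTACGGGCATCAACCCAGTT"
)
s2 <- paste0(
  "TTATCTGACAAAGAAAGCCGTCAACGGCTGGATAATTTCGCGATCGTGCTGGTTACTGGC",
  "GGTACGAGTGTTCCTTTGGGT"
)

a <- strsplit(s1, "")[[1]]
b <- strsplit(s2, "")[[1]]

# Hypothetical helper: TRUE for purines (A, G), FALSE for pyrimidines (C, T)
is_purine <- function(x) x %in% c("A", "G")

mismatch <- a != b
# A mismatch within the same chemical class is a transition
transition <- mismatch & (is_purine(a) == is_purine(b))

n_ts <- sum(transition)       # 17 transitions
n_tv <- sum(mismatch) - n_ts  # 14 transversions
print(n_ts / n_tv, digits = 12)
```

Comparing chemical classes rather than listing every base pair keeps the check vectorized, which is one reasonable design choice among several.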

R

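library(dplyr)
library(purrr)
library(seqinr)

f <- "tran.txt"

# Read both FASTA records as strings and split each into single bases
# (tibble() replaces the deprecated data_frame())
dna <- tibble(
  raw = read.fasta(f, as.string = TRUE),
  dna =
    toupper(raw) %>%
    strsplit(split = "")
)

# Pair up the bases, flag transitions, and keep only mismatched positions
s <-
  tibble(
    a = unlist(dna$dna[1]),
    b = unlist(dna$dna[2]),
    # TRUE when both bases are purines (A, G) or both are pyrimidines (C, T)
    t = map2_lgl(a, b, function(x, y) {
      all(c(x, y) %in% c("A", "G")) || all(c(x, y) %in% c("C", "T"))
    })
  ) %>%
  filter(a != b)

# Transitions divided by transversions among the mismatches
cat(sum(s$t) / (nrow(s) - sum(s$t)))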
#> 1.214286