Source: Rosalind("Consensus and Profile")
Brief summary
    DNA Strings   A T C C A G C T
                  G G G C A A C T
                  A T G G A T C T
                  A A G C A A C C
                  T T G G A A C T
                  A T G C C A T T
                  A T G G C A C T

    Profile       A   5 1 0 0 5 5 0 0
                  C   0 0 1 4 2 0 6 1
                  G   1 1 6 3 0 1 0 0
                  T   1 5 0 0 0 1 1 6

    Consensus     A T G C A A C T

Given: A collection of at most 10 DNA strings of equal length (at most 1 kbp) in FASTA format.
Return: A consensus string and profile matrix for the collection. (If several possible consensus strings exist, then you may return any one of them.)
Model (cons.rb):
    #!/usr/bin/env ruby
    require_relative '../ie_module'

    class DnaConsensus
      include ImportExport

      DNA_BASES = %w(A C G T)

      attr_reader :dna_strings, :consensus, :profile

      def initialize(source = "rosalind_#{current_dir_name}.txt")
        @dna_strings = (source =~ /txt$/ ? import_lines(source) : source).values
        @profile = build_profile
        @consensus = build_consensus
      end

      def to_s
        "#{consensus.join}\n#{stringify(profile)}"
      end

      private

      def build_profile
        prof = DNA_BASES.map{|b| [b, []]}.to_h
        dna_strings.map(&:chars).transpose.each.with_object(prof) do |arr, hsh|
          hsh.merge!(hashed(arr)){ |_, oldval, newval| oldval << newval }
        end
      end

      def hashed(arr)
        hsh = arr.group_by(&:chr).map{ |k,v| [k, v.size] }.to_h
        (DNA_BASES - hsh.keys).each { |b| hsh[b] = 0 }
        hsh
      end

      def build_consensus
        dna_strings.first.length.times.with_object([]) do |index, arr|
          arr << profile.max_by{|_, list| list[index]}.first
        end
      end
    end

    a = DnaConsensus.new
    a.export_to_file([a.to_s])

File read/write logic (ie_module.rb):
    module ImportExport
      def export_to_file(result, file = "result_#{current_dir_name}.txt")
        File.open(file, 'w') do |f|
          result.each{ |val| f << "%s" % val }
        end
      end

      private

      def current_dir_name
        File.basename(Dir.getwd)
      end

      def stringify(obj)
        if obj.is_a?(Hash) then obj.map{|k,v| "#{k}: #{v.join(' ')}"}
        else obj.map{|e| e.join(' ')}
        end.join("\n")
      end

      def import_lines(file)
        File.foreach(file).with_object({}) do |line, hsh|
          line = line.strip.sub(/^>/, '')
          $' ? hsh[line] = '' : hsh[hsh.keys.last] << line
        end
      end
    end

That is a lot of code, but #build_profile is the most "complicated" part. I know that an alternative way exists. All suggestions are welcome.
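For comparison, here is a minimal sketch of what one such alternative might look like. It assumes the DNA strings are already available as a plain array of equal-length strings. The helper names `profile_for` and `consensus_for` are hypothetical and are not part of cons.rb. Instead of merging per-column hashes, it counts each base directly in every transposed column.

```ruby
DNA_BASES = %w(A C G T)

# Hypothetical helper (not in cons.rb): for each base, count its
# occurrences in every column of the transposed strings.
def profile_for(dna_strings)
  columns = dna_strings.map(&:chars).transpose
  DNA_BASES.map { |base| [base, columns.map { |col| col.count(base) }] }.to_h
end

# Hypothetical helper: for each column, pick the base with the highest
# count (the first such base in A, C, G, T order when there is a tie).
def consensus_for(profile)
  profile.values.transpose.map { |counts| DNA_BASES[counts.index(counts.max)] }.join
end

# Sample dataset from the problem statement.
strings = %w(ATCCAGCT GGGCAACT ATGGATCT AAGCAACC TTGGAACT ATGCCATT ATGGCACT)
profile = profile_for(strings)

puts consensus_for(profile) # prints ATGCAACT
profile.each { |base, counts| puts "#{base}: #{counts.join(' ')}" }
```

This version scans every column once per base, which is more passes than the merge-based approach, but with at most 10 strings of at most 1 kbp that cost should be negligible, and the zero counts fall out without the extra step in #hashed.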