I have a string of characters and a list of characters. I wish to create a dictionary in which the keys are the characters of the string and the values are the list without the key character.

A string of characters:

sequence = 'ATGCG' 

The list:

bases = ['C', 'T', 'A', 'G'] 

The resulting dictionary would be:

{'A': ['C', 'T', 'G'], 'T': ['C', 'A', 'G'], 'G': ['C', 'T', 'A'], 'C': ['T', 'A', 'G'], 'G': ['C', 'T', 'A'], } 

I tried using the following code but got a dictionary with only 4 items:

variations = {current_base: [base for base in bases if current_base != base] for current_base in sequence} 

I'd love to get ideas regarding what I'm doing wrong. Thanks.

3
  • Does this answer your question? All combinations of a list of lists Commented Nov 9, 2021 at 12:42
  • Works for me: the dict has 4 elements, but each list has 3. Commented Nov 9, 2021 at 12:43
  • Dictionaries do not support duplicate keys. Commented Nov 9, 2021 at 12:44

2 Answers

3

What you want to do is impossible: a dictionary cannot have duplicate keys.

{'A': ['C', 'T', 'G'],
 'T': ['C', 'A', 'G'],
 'G': ['C', 'T', 'A'],
 'C': ['T', 'A', 'G'],
 'G': ['C', 'T', 'A'],  ## this is impossible
}
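To see why the comprehension from the question ends up with 4 entries rather than 5, here is a minimal sketch that reruns it on the question's data and inspects the result. The printed values assume Python 3.7+, where dictionaries keep insertion order.

```python
sequence = 'ATGCG'
bases = ['C', 'T', 'A', 'G']

# Same comprehension as in the question. 'G' occurs twice in the sequence,
# so the second 'G' assignment overwrites the first instead of adding a key.
variations = {
    current_base: [base for base in bases if base != current_base]
    for current_base in sequence
}

print(len(sequence))     # 5 characters in the input
print(len(variations))   # 4 keys in the dictionary
print(list(variations))  # ['A', 'T', 'G', 'C']
```

Each value does have the expected 3 items; only the number of keys is smaller than the length of the sequence.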

You can use a list of tuples instead. I am taking the opportunity to show you a more efficient method using Python sets:

sequence = 'ATGCG'
bases = set('ACGT')
[(b, list(bases.difference(b))) for b in sequence]

NB. It is even more efficient to pre-compute the differences, since the DNA sequence is potentially very long but there are only 4 bases:

sequence = 'ATGCG'
bases = set('ACGT')
diffs = {b: list(bases.difference(b)) for b in bases}
[(b, diffs[b]) for b in sequence]

output:

[('A', ['T', 'C', 'G']), ('T', ['A', 'C', 'G']), ('G', ['T', 'A', 'C']), ('C', ['T', 'A', 'G']), ('G', ['T', 'A', 'C'])] 

Alternative output using the position as key:

{i: list(bases.difference(b)) for i,b in enumerate(sequence)} 

output:

{0: ['T', 'C', 'G'], 1: ['A', 'C', 'G'], 2: ['T', 'A', 'C'], 3: ['T', 'A', 'G'], 4: ['T', 'A', 'C']} 
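Because sets are unordered, the lists in the outputs above come out in an arbitrary order. If the order of the original `bases` list matters, a possible variant is to pre-compute the lists with a plain list comprehension instead of a set difference. This is only a sketch: the names `others` and `pairs` are made up for illustration, and it assumes `bases` contains no duplicates.

```python
sequence = 'ATGCG'
bases = ['C', 'T', 'A', 'G']

# Pre-compute one list per distinct base, keeping the order of `bases`.
others = {b: [x for x in bases if x != b] for b in bases}

# One (base, other_bases) pair per position in the sequence.
pairs = [(b, others[b]) for b in sequence]

print(pairs)
# [('A', ['C', 'T', 'G']), ('T', ['C', 'A', 'G']), ('G', ['C', 'T', 'A']),
#  ('C', ['T', 'A', 'G']), ('G', ['C', 'T', 'A'])]
```

Note that repeated bases share the same list object here (both 'G' entries point to `others['G']`), so copy the list with `list(others[b])` if you plan to modify the entries independently.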

0

Try this:

sequence = 'ATGCG'
d = {c: list(''.join(set(list(sequence))).replace(c,'')) for c in set(list(sequence))}

1 Comment

This is invalid. Duplicate keys will be overwritten by the last value.
