Case study with biological populations: a list of lists of lists

Question

I have a population (Pop) with an attribute that is a list of individuals (Ind), and each individual has an attribute that is a list of chromosomes (Chromo). Each chromosome is a list of numbers that determine the fitness of the individual (fitness = reproductive success; it is obtained by multiplying all the numbers on all the chromosomes). At any given position on a chromosome, different individuals carry different values. I'd like to set the greatest value to 1 and scale the others so that they keep their value relative to the greatest one.

For example, suppose that at one position on the n-th chromosome the individuals in the population have the values [3,4,0.4,12,5,6] (a population of 6 individuals).

I'd like to set these values to:

    a = [3,4,0.4,12,5,6]
    [i/float(max(a)) for i in a]

I tried to write this function, but I got lost in all these lists and can hardly find a solution!

You can reach the 14th position of the 4th chromosome in the 25th individual by writing:

population.inds[24].chromosomes[3].alleles[13]

To make sure my aim is understood: I'd like to create a function that takes an instance of Pop as argument and returns the same or another Pop in which every value on the chromosomes is replaced by a number in the range [0,1], respecting the relative values of all numbers at the same position on the same chromosome in the population.

Below is my code. It is long, but reading the constructors of the classes Chromo, Ind and Pop (which are all very basic) should, I hope, be enough. The class WalkerRandomSampling only serves to perform a weighted random sampling.

You may want to look at what I tried: the method is called set_best_fitness_to_one and is in the Pop class.

    from random import uniform, gauss, choice, expovariate, shuffle
    from numpy import arange, array, bincount, ndarray, ones, where
    from numpy.random import seed, random, randint, binomial
    from operator import mul, methodcaller


    class WalkerRandomSampling(object):
        """Walker's alias method for random objects with different probablities.

        Based on the implementation of Denis Bzowy at the following URL:
        http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/
        """

        def __init__(self, weights, keys=None):
            """Builds the Walker tables ``prob`` and ``inx`` for calls to `random()`.
            The weights (a list or tuple or iterable) can be in any order and they
            do not even have to sum to 1."""
            n = self.n = len(weights)
            if keys is None:
                self.keys = keys
            else:
                self.keys = array(keys)

            if isinstance(weights, (list, tuple)):
                weights = array(weights, dtype=float)
            elif isinstance(weights, ndarray):
                if weights.dtype != float:
                    weights = weights.astype(float)
            else:
                weights = array(list(weights), dtype=float)

            if weights.ndim != 1:
                raise ValueError("weights must be a vector")

            weights = weights * n / weights.sum()

            inx = -ones(n, dtype=int)
            short = where(weights < 1)[0].tolist()
            long = where(weights > 1)[0].tolist()
            while short and long:
                j = short.pop()
                k = long[-1]

                inx[j] = k
                weights[k] -= (1 - weights[j])
                if weights[k] < 1:
                    short.append( k )
                    long.pop()

            self.prob = weights
            self.inx = inx

        def random(self, count=None):
            """Returns a given number of random integers or keys, with probabilities
            being proportional to the weights supplied in the constructor.

            When `count` is ``None``, returns a single integer or key, otherwise
            returns a NumPy array with a length given in `count`.
            """
            if count is None:
                u = random()
                j = randint(self.n)
                k = j if u <= self.prob[j] else self.inx[j]
                return self.keys[k] if self.keys is not None else k

            u = random(count)
            j = randint(self.n, size=count)
            k = where(u <= self.prob[j], j, self.inx[j])
            return self.keys[k] if self.keys is not None else k

        def test(self):
            weights = [12,3,2,0,5]
            test = WalkerRandomSampling(weights=weights)
            a = []
            for i in xrange(10000):
                a.append(test.random())
            b = []
            for value in range(4):
                b.append(len([i for i in a if i == value])/float(len(a)))
            print b
            print weights


    class Chromo(object):
        def __init__(self, alleles):
            self.alleles=alleles

        def mutations(self):
            nb_mut = binomial(chromo_size, mut_rate)
            for one_mut in xrange(nb_mut):
                self.alleles[choice(range(chromo_size))] *= pdf_mutation(pdf_mut_scale)
            return self


    class Ind(object):
        def __init__(self, chromosomes):
            self.chromosomes = chromosomes

        def fitness(self):
            if nb_chromosomes == 1:
                return reduce(mul, self.chromosomes[0].alleles)
            fit = 1
            for gene_pos in xrange(chromo_size):
                alleles = []
                for chromo_pos in range(len(self.chromosomes)):
                    alleles.append(self.chromosomes[chromo.pos].alleles[gene_pos])
                fit *= sum(alleles)/len(alleles) # + dominance effect. Epistasis?!
            return fit

        def reprod(self,other):
            off = Ind(chromosomes = [])
            for one_chromo in xrange(nb_chromosomes):
                # recombination. Because the population has been shuffled, it is not necessary to create two recombined chromosomes and that select one (segragation). I construct only one recombined chromosome where self construct the first part of the chromosome.
                nb_cross = binomial(chromo_size, recombination)
                cross_pos = WalkerRandomSampling([1]*(chromo_size-1)).random(count=nb_cross).sort()
                recombined_chromo = Chromo([])
                previous_cross = 0
                for sex, one_cross in enumerate(cross_pos):
                    if sex%2 == 0:
                        recombined_cromo.alleles.append(self.chromosomes.alleles[previous_cross:(one_cross+1)])
                    else:
                        recombined_cromo.alleles.append(other.chromosomes.alleles[previous_cross:(one_cross+1)])
                    previous_cross = one_cross
                off.chromosomes.append(recombined_chromo)
            return off


    class Pop(object):
        def __init__(self, inds):
            self.inds = inds

        def reproduction(self):
            "First chose those that reproduce and then simulate mutations in offsprings"
            # chosing those who reproduce - Creating the offspring population
            new_pop = Pop(inds=[])
            fitness = []
            for one_ind in self.inds:
                fitness.append(one_ind.fitness())
            min_fitness = min(fitness)
            if min_fitness < 0:
                fitness = [one_ind - min_fitness for one_ind in fitness]
            pick = WalkerRandomSampling(weights = fitness)
            if nb_chromosomes == 1 and recombination == 0:
                for i in xrange(pop_size):
                    new_pop.inds.append(self.inds[pick.random()])
            else:
                for i in xrange(pop_size):
                    father = self.inds[pick.random()]
                    mother = self.inds[pick.random()]
                    off = father.reprod(mother)
                    new_pop.inds.append(off)
                    nb_off += 1
            # Mutations
            for one_ind in new_pop.inds:
                for chromo_number in xrange(nb_chromosomes):
                    one_ind.chromosomes[chromo_number].mutations()
            return new_pop

        def create_population(self):
            one_chromo = Chromo(alleles = [1]*chromo_size)
            one_ind = Ind(chromosomes = [one_chromo for i in range(nb_chromosomes)])
            return Pop(inds=[one_ind for i in xrange(pop_size)])

        def stats(self, generation):
            line_to_write = str(generation) + '\t' + str(replicat) + '\t' + str(mut_rate) + '\t' + str(pdf_mut_scale) + '\t' + str(pdf_mutation.__name__)\
                + '\t' + str(pop_size) + '\t' + str(nb_chromosomes) + '\t' + str(chromo_size) + '\t' + str(recombination) + '\t' + str(dominance) + '\t'
            if output_type == 'mean fitness':
                add = sum([ind.fitness() for ind in self.inds])/pop_size
                output_file.write(line_to_write + str(add) + '\n')

        def set_best_fitness_to_one(self):
            list_chromo = zip(*[map(fun_for_set_fitness,[ind.chromosomes[chromo_number] for ind in self.inds]) for chromo_number in xrange(nb_chromosomes)])
            new_pop = Pop([])
            for one_ind in xrange(0,pop_size,nb_chromosomes):
                new_pop.inds.append(list_chromo[one_ind:(ond_ind+nb_chromosomes)])
            return new_pop


    def fun_for_set_fitness(list_one_chromo_number):
        l = zip(*list_one_chromo_number)
        for locus_pos, one_locus in enumerate(l):
            max_one_locus = max(one_locus)
            one_locus = [i/float(max_one_locus) for i in one_locus]
            l[locus_pos] = one_locus
        return zip(*l)


    ######### Main #############

    def main_run():
        population = Pop([]).create_population()
        for generation in xrange(Nb_generations):
            population.stats(generation)
            population = population.reproduction()
            shuffle(population.inds)
            population = population.set_best_fitness_to_one()


    ####### PARAMETERS ##########

    # Parameters
    Nb_generations = 120
    # output_type = 'all individuals fitness'
    output_type = 'mean fitness'
    max_pop_size = 100 # this is only used to create the first (title, header) line!

    # Output file
    file_name = 'stats3.txt'
    path = '/Users/remimatthey-doret/Documents/Biologie/programmation/Python/Fitness distribution in the population/' + file_name
    output_file = open(path,'w')
    first_line = 'Generation\treplicat\tmut_rate\tpdf_mut_scale\tpdf_mutation\tpop_size\tnb_chromosomes\tchromo_size\trecombination\tdominance\t'
    if output_type == 'mean fitness':
        first_line += 'mean_fitness'
    if output_type == 'all individuals fitness':
        for ind in xrange(max_pop_size):
            first_line += 'fit_ind_' + str(ind) + '\t'
    output_file.write(first_line + '\n')

    # Parameters that iterate
    total_nb_runs = 3 * 10 # just enter the total number of iteration of the main_run function
    nb_runs_performed = 0
    for mut_rate in [0.0001]:
        for pdf_mut_scale in [0.01,0.1,0.3]: # Note: with an negative exponential distribution (expovariate) the expected value is 1/lambda
            for pdf_mutation in [expovariate]:
                for pop_size in [1000]:
                    for nb_chromosomes in [1]:
                        for chromo_size in [1000]:
                            for recombination in [0]:
                                for dominance in [0]:
                                    for replicat in xrange(10):
                                        main_run()
                                        nb_runs_performed += 1
                                        print str(float(nb_runs_performed)/total_nb_runs * 100) + '%'

Jamal · Accepted Answer · 2015-01-18 18:43:14Z

Some advice here:

You should try to follow the PEP 8 guidelines whenever it's possible. In your case, the naming convention is not followed.
You should try to keep things simple. For instance, WalkerRandomSampling.__init__() contains logic to handle both the case where keys is None and the case where it is not, but you never create a WalkerRandomSampling with keys, so it is hard to tell whether this is useful at all.
Don't repeat yourself. It seems like the beginning of WalkerRandomSampling.random() is roughly the same no matter if count is None or not. The common code could be factorised out.

Don't repeat yourself. The branches in:

    previous_cross = 0
    for sex, one_cross in enumerate(cross_pos):
        if sex%2 == 0:
            recombined_cromo.alleles.append(self.chromosomes.alleles[previous_cross:(one_cross+1)])
        else:
            recombined_cromo.alleles.append(other.chromosomes.alleles[previous_cross:(one_cross+1)])
        previous_cross = one_cross

look way too similar. It would probably be better to write:

    previous_cross = 0
    for sex, one_cross in enumerate(cross_pos):
        relevant_ind = other if sex%2 else self
        recombined_cromo.alleles.append(relevant_ind.chromosomes.alleles[previous_cross:(one_cross+1)])
        previous_cross = one_cross
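Note that the loop quoted from the question would not run as written: `.sort()` on a NumPy array sorts in place and returns None, `self.chromosomes` is a list and has no `alleles` attribute, and appending slices builds a list of lists rather than a flat list of alleles. Here is a minimal sketch of the alternating-parent idea on plain lists. The function name `recombine`, its parameters, and the choice of half-open slices are assumptions of mine, not something taken from your code, so the boundary convention may need adjusting.

```python
def recombine(first, second, cross_positions):
    """Build one recombined allele list, switching parent at each crossover.

    `first` and `second` are plain lists of alleles of equal length
    (hypothetical stand-ins for two parental chromosomes).
    """
    parents = (first, second)
    recombined = []
    previous = 0
    # The chromosome length is added as a last boundary so that the
    # segment after the final crossover is copied as well.
    boundaries = sorted(cross_positions) + [len(first)]
    for index, boundary in enumerate(boundaries):
        source = parents[index % 2]  # even segments come from `first`
        # extend (not append) keeps the result a flat list of alleles
        recombined.extend(source[previous:boundary])
        previous = boundary
    return recombined
```

With no crossover positions the function simply returns a copy of `first`, which matches the case `recombination == 0`.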

Use list (or set or dict) comprehension whenever you can.

    a = []
    for i in xrange(10000):
        a.append(test.random())
    b = []
    for value in range(4):
        b.append(len([i for i in a if i == value])/float(len(a)))

could be written :

    a = [test.random() for i in xrange(10000)]
    b = [len([value for i in a if i == value])/float(len(a)) for value in range(4)]

and the second line can actually be written differently so that we have :

    a = [test.random() for i in xrange(10000)]
    b = [a.count(value)/float(len(a)) for value in range(4)]
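Since `a.count(value)` walks the whole list once per value, another option is to count everything in a single pass with `collections.Counter`. This is only a sketch: the helper name `frequencies` and its parameters are hypothetical, and it assumes the samples are plain hashable values such as integers.

```python
from collections import Counter


def frequencies(samples, values):
    """Return the observed frequency of each entry of `values` in `samples`."""
    counts = Counter(samples)      # one pass over the samples
    total = float(len(samples))    # float keeps the division exact in Python 2 too
    # A Counter returns 0 for values that never occurred.
    return [counts[value] / total for value in values]
```

In the test method this could be called as `frequencies(a, range(4))`.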

Similarly :

    alleles = []
    for chromo_pos in range(len(self.chromosomes)):
        alleles.append(self.chromosomes[chromo.pos].alleles[gene_pos])

can be written :

    alleles = [self.chromosomes[chromo.pos].alleles[gene_pos] for chromo_pos in range(len(self.chromosomes))]

and you could simplify the way you iterate :

    alleles = [c.alleles[gene_pos] for c in self.chromosomes]
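Going one step further, `zip(*...)` can group the alleles by position so that no index is needed at all. Below is a rough sketch of the multi-chromosome fitness written that way. It is a free function taking plain lists of alleles rather than your Chromo objects, which is an assumption made to keep the example self-contained, and it uses float division, whereas `sum(alleles)/len(alleles)` truncates under Python 2 when all the alleles are integers.

```python
from functools import reduce
from operator import mul


def fitness(chromosomes):
    """Product over positions of the mean allele value across chromosomes.

    `chromosomes` is a list of equally long allele lists
    (hypothetical stand-in for [c.alleles for c in self.chromosomes]).
    """
    # zip(*chromosomes) yields one tuple per position, holding the allele
    # carried by each chromosome at that position.
    per_locus_means = [sum(locus) / float(len(locus)) for locus in zip(*chromosomes)]
    return reduce(mul, per_locus_means, 1)
```

With a single chromosome each mean is the allele itself, so the special case for `nb_chromosomes == 1` might no longer be needed, apart from the result being a float.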

And :

    fitness = []
    for one_ind in self.inds:
        fitness.append(one_ind.fitness())

could become :

fitness = [one_ind.fitness() for one_ind in self.inds]

Do not build strings using concatenations multiple times. The recommended way is to use join:
```
for ind in xrange(max_pop_size):
    first_line += 'fit_ind_' + str(ind) + '\t'
```
could be written as:
```
first_line += '\t'.join('fit_ind_' + str(ind) for ind in xrange(max_pop_size)) 
```
(Also, that will not add the useless \t at the end)
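As for the actual goal of set_best_fitness_to_one, here is a minimal sketch of one way it could be done. It assumes that every individual has the same number of chromosomes, that all chromosomes have the same length, and that the largest value at each position is strictly positive (otherwise the division fails or flips signs). The three classes are trimmed-down copies of yours, reduced to their constructors so the example stands on its own, and the function is written as a free function instead of a method.

```python
class Chromo(object):
    def __init__(self, alleles):
        self.alleles = alleles


class Ind(object):
    def __init__(self, chromosomes):
        self.chromosomes = chromosomes


class Pop(object):
    def __init__(self, inds):
        self.inds = inds


def set_best_fitness_to_one(population):
    """Return a new Pop where, at each position of each chromosome,
    every value is divided by the largest value found at that position."""
    nb_chromosomes = len(population.inds[0].chromosomes)
    # maxima[c][p] is the largest allele at position p of chromosome c,
    # taken over all individuals of the population.
    maxima = [
        [max(locus)
         for locus in zip(*[ind.chromosomes[c].alleles for ind in population.inds])]
        for c in range(nb_chromosomes)
    ]
    # Build fresh Chromo and Ind objects so the input population is untouched.
    return Pop([
        Ind([
            Chromo([allele / float(biggest)
                    for allele, biggest in zip(chromo.alleles, chromo_maxima)])
            for chromo, chromo_maxima in zip(ind.chromosomes, maxima)
        ])
        for ind in population.inds
    ])
```

For the example in the question, a position holding [3,4,0.4,12,5,6] across six individuals would be divided by 12, so the individual carrying 12 ends up with 1 and the one carrying 3 with 0.25.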
