Original document: http://kodomo.fbb.msu.ru/hg/allpy/file/9369dbad919d/sandbox/common.py
allpy: 9369dbad919d sandbox/common.py

view sandbox/common.py @ 251:9369dbad919d

Incompatible changes to Monomer interfaces.

This branch does not work!

- (!!) only changed allpy._monomer, not uses
- (!!) removed (temporarily) classes for specific monomer types (DNAMonomer, etc)
- refurbished allpy.data.AAcodes to allpy.data.codes with much cleaner interface
- refurbished allpy._monomer for simplicity and more friendly interface

Now it will (someday) be possible to say:

    a = Monomer.from_name("alanine")
    b = protein.Monomer.from_code1("a")
    c = protein.MonomerType.from_code3("ala")
    d = dna.Monomer.from_code3("DA")

but impossible to say:

    d = protein.Monomer.from_code3("DA")

author Daniil Alexeyevsky <me.dendik@gmail.com>
date Mon, 13 Dec 2010 20:12:11 +0300
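
The interface promised in the commit message can be pictured with a small sketch. Everything below is hypothetical: the `CODES` table, the class layout, the `ProteinMonomer` and `DnaMonomer` stand-ins for `protein.Monomer` and `dna.Monomer`, and the use of `KeyError` are assumptions made for illustration, not code from allpy._monomer or allpy.data.codes. Only the method names `from_name`, `from_code1` and `from_code3` come from the message.

```python
# Hypothetical per-alphabet table of (code1, code3, name); not allpy's real data.
CODES = {
    "protein": [("A", "ALA", "alanine"), ("G", "GLY", "glycine")],
    "dna": [("A", "DA", "deoxyadenosine"), ("T", "DT", "thymidine")],
}


class Monomer(object):
    kind = None  # None means "search every alphabet"

    def __init__(self, kind, code1, code3, name):
        self.kind, self.code1, self.code3, self.name = kind, code1, code3, name

    @classmethod
    def _lookup(cls, field, value):
        # Restrict the search to this class's alphabet, if it has one.
        kinds = [cls.kind] if cls.kind else sorted(CODES)
        for kind in kinds:
            for row in CODES[kind]:
                if row[field].lower() == value.lower():
                    return cls(kind, *row)
        raise KeyError(value)

    @classmethod
    def from_code1(cls, code):
        return cls._lookup(0, code)

    @classmethod
    def from_code3(cls, code):
        return cls._lookup(1, code)

    @classmethod
    def from_name(cls, name):
        return cls._lookup(2, name)


class ProteinMonomer(Monomer):  # stand-in for protein.Monomer
    kind = "protein"


class DnaMonomer(Monomer):  # stand-in for dna.Monomer
    kind = "dna"


a = Monomer.from_name("alanine")
b = ProteinMonomer.from_code1("a")
d = DnaMonomer.from_code3("DA")
try:
    ProteinMonomer.from_code3("DA")  # wrong alphabet: rejected
    rejected = False
except KeyError:
    rejected = True
```

Binding each subclass to a single alphabet is what makes the last call fail: the lookup never sees the DNA rows, so a DNA code cannot be resolved as a protein monomer.
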
def autoload(filename):
    # Load a FASTA file and annotate every sequence with per-column
    # conservation scores and grey-level colors.
    seqs = load(filename)
    maxlen, seqs = measure(seqs)
    stats, seqs = stat(maxlen, seqs)
    seqs = color(maxlen, stats, seqs)
    return seqs

def load(filename):
    # Parse a FASTA file into a list of (name, body) pairs.
    seqs = []
    with open(filename) as f:  # close the file once it has been read
        text = f.read()
    for block in text.split('\n>'):
        lines = block.split('\n')
        name = lines[0].lstrip('>').strip()
        body = "".join(lines[1:])
        seqs.append((name, body))
    return seqs

def measure(seqs):
    # Pad every sequence with gaps ("-") to the length of the longest one.
    maxlen = max([len(body) for name, body in seqs])
    for i, (name, body) in enumerate(seqs):
        body += "-" * (maxlen - len(body))
        seqs[i] = name, body
    return maxlen, seqs

def stat(maxlen, seqs):
    # For each column, count how many sequences carry each character.
    stats = []
    for x in range(maxlen):  # range works on both Python 2 and Python 3
        stat = {}
        for name, body in seqs:
            char = body[x]
            stat[char] = stat.get(char, 0) + 1
        stats.append(stat)
    return stats, seqs

def color(maxlen, stats, seqs):
    # Attach to each sequence a 0..10 conservation score per column
    # and the matching grey-level color.
    full = len(seqs)
    for i, (name, body) in enumerate(seqs):
        ids = []
        colors = []
        for x in range(maxlen):
            id = stats[x][body[x]] * 10 // full
            # Scale by 255, not 256: id == 10 would otherwise give 256,
            # which does not fit into two hex digits.
            norm = id * 255 // 10
            color = '#%02x%02x%02x' % (norm, norm, norm)
            ids.append(id)
            colors.append(color)
        seqs[i] = name, body, ids, colors
    return seqs
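
To see what the pipeline produces without a FASTA file at hand, the following minimal sketch repeats the measure, stat and color steps on two in-memory sequences. The helper name `conservation_colors` and the sample sequences are made up for illustration. The sketch scales grey levels by 255, which is an assumption chosen so that a fully conserved column maps to `#ffffff`.

```python
def conservation_colors(seqs):
    """Return (name, body, ids, colors) tuples for a list of (name, body) pairs."""
    # Pad every sequence with gaps to the length of the longest one.
    maxlen = max(len(body) for _, body in seqs)
    padded = [(name, body.ljust(maxlen, "-")) for name, body in seqs]

    # Count how often each character occurs in each column.
    stats = []
    for x in range(maxlen):
        counts = {}
        for _, body in padded:
            counts[body[x]] = counts.get(body[x], 0) + 1
        stats.append(counts)

    full = len(padded)
    result = []
    for name, body in padded:
        # Score 0..10: share of sequences that agree with this one in the column.
        ids = [stats[x][body[x]] * 10 // full for x in range(maxlen)]
        # Grey level per column; 255 is this sketch's assumption (see lead-in).
        colors = ["#%02x%02x%02x" % ((i * 255 // 10,) * 3) for i in ids]
        result.append((name, body, ids, colors))
    return result


example = conservation_colors([("seq1", "ACGT"), ("seq2", "ACT")])
for name, body, ids, colors in example:
    print(name, body, ids, colors)
```

The first two columns agree in both sequences and score 10, while the mismatched and gap-padded columns score 5, so conserved positions come out lighter than variable ones.
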