allpy
view lib/block.py @ 151:675b402094be
day commit -- a lot of changes
fasta.py:
universal save_fasta()
determine_long_line -- determines the line length of fasta sequence strings
in user input
everywhere: standard long_line=60 --> 70
block.sequences_chains: returns sequences in the same order as in the project
added monomer pdb_secstr to store secondary structure
pdb adding: some improvements and fixes
fix in from_pdb_chain: use all peptides, not only first
Sequence.pdb_files added to store information about pdb file for each chain
dssp bindings to get secondary structure
/sec_str -- tool to map secondary structure on each sequence of alignment
author | boris (netbook) <bnagaev@gmail.com> |
---|---|
date | Tue, 26 Oct 2010 00:40:36 +0400 |
parents | f7dead025719 |
children | 0c7f6117481b |
line source
1 #!/usr/bin/python
16 """ Block of alignment
18 Mandatory data:
19 * self.project -- project object, which the block belongs to
20 * self.sequences -- set of sequence objects that contain monomers
21 and/or gaps, that constitute the block
22 * self.positions -- sorted list of positions of the project.alignment that
23 are included in the block
25 Don't change self.sequences -- it may be a link to other block.sequences
27 How to create a new block:
28 >>> import project
29 >>> import block
30 >>> proj = project.Project(open("test.fasta"))
31 >>> block1 = block.Block(proj)
32 """
35 """ Builds new block from project
37 if sequences is None, all sequences are used
38 if positions is None, all positions are used
39 """
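The default handling described in this docstring can be sketched as follows. This is only an illustration: `SketchProject` and `SketchBlock` are hypothetical stand-ins rather than the classes in this file, and the project attributes used here (`alignment`, `sequences`, `length`) are assumptions.

```python
class SketchProject:
    """Hypothetical stand-in for project.Project (attributes are assumed)."""

    def __init__(self, alignment):
        # alignment: dict mapping sequence name -> aligned string with gaps
        self.alignment = alignment
        self.sequences = set(alignment)
        self.length = max((len(row) for row in alignment.values()), default=0)


class SketchBlock:
    """Hypothetical block: a subset of sequences and alignment positions."""

    def __init__(self, project, sequences=None, positions=None):
        self.project = project
        # None means "use everything"; test with `is None`, not `== None`
        if sequences is None:
            sequences = project.sequences
        if positions is None:
            positions = range(project.length)
        self.sequences = sequences
        # the block keeps its positions as a sorted list
        self.positions = sorted(positions)


proj = SketchProject({"seq1": "AC-GT", "seq2": "A-CGT"})
whole = SketchBlock(proj)
part = SketchBlock(proj, sequences={"seq1"}, positions=[3, 1])
```

Note that `whole.sequences` is the same object as `proj.sequences`, which is why the module docstring warns against modifying `self.sequences` in place.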
49 """ Saves alignment to given file in fasta-format
51 No changes in the names, descriptions or order of the sequences
52 are made.
53 """
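A minimal sketch of writing fasta output with a fixed line length, in the spirit of this docstring and the `long_line=70` default mentioned in the commit message. The function name `save_fasta_sketch` and the record layout `(name, description, sequence)` are assumptions made for illustration, not the signature used in fasta.py.

```python
import io


def save_fasta_sketch(out_file, records, long_line=70):
    """Write (name, description, sequence) records in the order given.

    long_line is the maximum number of sequence characters per output line.
    Names, descriptions and record order are written unchanged.
    """
    for name, description, sequence in records:
        header = ">" + name
        if description:
            header += " " + description
        out_file.write(header + "\n")
        # split the sequence into chunks of at most long_line characters
        for start in range(0, len(sequence), long_line):
            out_file.write(sequence[start:start + long_line] + "\n")


buffer = io.StringIO()
save_fasta_sketch(buffer, [("seq1", "demo", "ACDE-FGH")], long_line=5)
fasta_text = buffer.getvalue()
```

Here `fasta_text` contains the header line followed by the sequence split into lines of at most five characters.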
64 """ Returns length-sorted list of blocks, representing GCs
66 max_delta -- threshold of distance spreading
67 timeout -- Bron-Kerbosch timeout (after it, the fast O(n ln n) algorithm is used)
68 minsize -- min size of each core
69 ac_new_atoms -- min part of new atoms in new alternative core
70 current GC is compared with each of already selected GCs
71 if difference is less than ac_new_atoms, current GC is skipped
72 difference = part of new atoms in current core
73 ac_count -- max number of cores (including main core)
74 -1 means infinity
75 If more than one pdb chain for some sequence provided, consider all of them
76 cost is calculated as 1 / (delta + 1)
77 delta in [0, +inf) => cost in (0, 1]
78 """
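The cost formula and the alternative-core filter described in this docstring can be sketched as follows. This is one illustrative reading of the docstring, not the code in this file: the function names, the default values, and the representation of a core as a set of atoms are assumptions.

```python
def cost(delta):
    """Map a distance spread delta in [0, +inf) to a cost in (0, 1]."""
    return 1.0 / (delta + 1.0)


def select_cores(candidates, minsize=20, ac_new_atoms=0.5, ac_count=-1):
    """Filter candidate cores (sets of atoms), largest first.

    A candidate is skipped if, compared with any already selected core,
    the part of its atoms that is new is less than ac_new_atoms.
    ac_count == -1 means no limit on the number of cores.
    """
    selected = []
    for core in sorted(candidates, key=len, reverse=True):
        if ac_count != -1 and len(selected) >= ac_count:
            break
        if len(core) < max(minsize, 1):
            break  # candidates are size-sorted, so the rest are too small
        # part of atoms in `core` that are absent from each selected core
        is_new = all(
            len(core - other) / len(core) >= ac_new_atoms for other in selected
        )
        if is_new:
            selected.append(core)
    return selected


candidates = [{1, 2, 3, 4}, {1, 2, 3, 5}, {6, 7, 8}]
cores = select_cores(candidates, minsize=3, ac_new_atoms=0.5)
```

In this example the second candidate shares three of its four atoms with the first, so only a quarter of it is new and it is skipped; the third candidate is entirely new and is kept.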
104 break
108 break
112 """ Returns string consisting of gap chars and chars x at self.positions
114 Length of the returned string = length of project
115 """
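A minimal sketch of building such a string. The function name `xstring_sketch` and the default characters `X` and `-` are assumptions; the method in this file may take different arguments.

```python
def xstring_sketch(length, positions, x="X", gap="-"):
    """Return `length` characters: `x` at `positions`, `gap` elsewhere."""
    marked = set(positions)
    return "".join(x if i in marked else gap for i in range(length))


mask = xstring_sketch(8, [1, 2, 5])
```

For a project of length 8 and block positions 1, 2 and 5, `mask` is `-XX--X--`.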
122 """ Save xstring and name in fasta format """
126 """ Iterates monomers of this sequence from this block """
131 """ Iterates Ca-atoms of monomers of this sequence from this block """
135 """ Iterates pairs (sequence, chain) """
142 """ Superimpose all pdb_chains in this block """
151 # Apply rotation/translation to the moving atoms
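What applying a rotation and translation to a set of atoms amounts to can be sketched in plain Python. The helper `transform` is hypothetical, and the row-vector convention (coordinate times matrix, then add the translation) is an assumption; the file itself may rely on a library routine for this step.

```python
def transform(coords, rotation, translation):
    """Apply new = old * rotation + translation to each (x, y, z) row vector."""
    moved = []
    for point in coords:
        rotated = [
            sum(point[k] * rotation[k][j] for k in range(3)) for j in range(3)
        ]
        moved.append(tuple(r + t for r, t in zip(rotated, translation)))
    return moved


# 90-degree rotation about the z axis, written for row vectors
rot_z = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]
moved = transform([(1, 0, 0), (0, 1, 0)], rot_z, (0, 0, 5))
```

The point on the x axis moves to the y axis, the point on the y axis moves to the negative x axis, and both are shifted by 5 along z.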
155 """ Save all sequences
157 Returns {(sequence, chain): CHAIN}
158 CHAIN is chain letter in new file
159 """
165 # TODO: read from tmp_file.name
166 # change CHAIN
167 # add to out_file
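The `{(sequence, chain): CHAIN}` mapping described in the docstring above could be produced along these lines. This is a sketch only: `assign_chain_letters` is a hypothetical helper, and taking letters from the uppercase alphabet in iteration order is an assumption.

```python
import string


def assign_chain_letters(pairs, alphabet=string.ascii_uppercase):
    """Map each (sequence, chain) pair to a unique chain letter, in order."""
    pairs = list(pairs)
    if len(pairs) > len(alphabet):
        raise ValueError("more chains than available chain letters")
    return {pair: letter for pair, letter in zip(pairs, alphabet)}


letters = assign_chain_letters([("seq1", "A"), ("seq1", "B"), ("seq2", "A")])
```

Each pair gets its own letter in the combined file, so two chains that were both called `A` in their original files no longer collide.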