allpy
view lib/project.py @ 151:675b402094be
day commit -- a lot of changes
fasta.py:
universal save_fasta()
determine_long_line -- to determine the length of the fasta sequence string
in user input
everywhere: standard long_line=60 --> 70
block.sequences_chains: returns sequences in the same order as in the project
added monomer pdb_secstr to store secondary structure
pdb adding: some improvements and fixes
fix in from_pdb_chain: use all peptides, not only first
Sequence.pdb_files added to store information about pdb file for each chain
dssp bindings to get secondary structure
/sec_str -- tool to map secondary structure on each sequence of alignment
author | boris (netbook) <bnagaev@gmail.com> |
---|---|
date | Tue, 26 Oct 2010 00:40:36 +0400 |
parents | f7dead025719 |
children | 0c7f6117481b |
line source
1 #!/usr/bin/python
3 """
4 "I will not use abbrev."
5 "I will always finish what I st"
6 - Bart Simpson
8 """
22 """ Alignment representing class
24 Mandatory data:
25 * sequences -- list of Sequence objects. Sequences don't contain gaps
26 - see sequence.py module
27 * alignment -- dict
28 {<Sequence object>:[<Monomer object>,None,<Monomer object>]}
29 keys are the Sequence objects, values are the lists, which
30 contain monomers of those sequences or None for gaps in the
31 corresponding sequence of
32 alignment
34 """
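To make the data layout in the docstring above concrete, here is a minimal sketch of the `sequences` list and the `alignment` dict. The `Monomer` and `Sequence` classes and the `build_alignment` helper are simplified, hypothetical stand-ins written for this illustration; they are not the real classes from sequence.py.

```python
class Monomer:
    """Hypothetical stand-in for a monomer: a single residue."""
    def __init__(self, code):
        self.code = code


class Sequence:
    """Hypothetical stand-in for a sequence: ungapped list of monomers."""
    def __init__(self, name, monomers):
        self.name = name
        self.monomers = monomers


def build_alignment(sequence, gapped):
    """Map a gapped string ('-' marks a gap) onto the sequence's monomers.

    Returns a list holding the sequence's own Monomer objects, with None
    in every gap position.
    """
    monomers = iter(sequence.monomers)
    return [None if ch == "-" else next(monomers) for ch in gapped]


seq_a = Sequence("a", [Monomer(c) for c in "ACD"])
seq_b = Sequence("b", [Monomer(c) for c in "AD"])

# sequences: list of Sequence objects, no gaps inside them
sequences = [seq_a, seq_b]

# alignment: {<Sequence object>: [<Monomer object>, None, <Monomer object>]}
alignment = {
    seq_a: build_alignment(seq_a, "ACD"),
    seq_b: build_alignment(seq_b, "A-D"),
}
```

The rows share monomer objects with the sequences rather than copying them, so a gap is the only thing the alignment adds on top of the sequence data.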
36 """overloaded constructor
38 Project() -> new empty Project
39 Project(sequences, alignment) -> new Project with sequences and
40 alignment initialized from arguments
41 Project(fasta_file) -> new Project, read alignment and sequences
42 from fasta file
44 """
55 """ Returns width, i.e. length of each sequence with gaps """
59 """ The number of sequences in alignment (its thickness). """
63 """ Calculate the identity of alignment positions for colouring.
65 For every (row, column) in the alignment, the percentage of exactly
66 the same residue in the same column of the alignment is calculated.
67 The data structure is just like Project.alignment, but instead of
68 monomers it contains float percentages.
69 """
70 # Oh, God, that's awful! Absolutely not understandable.
71 # First, calculate percentages of amino acids in every column
87 # Second, map these percentages onto the alignment
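The two steps named in the comments above can be sketched as follows. This is not the code from project.py: the function name `column_identity` is hypothetical, rows are plain lists of residue codes instead of monomers, and gaps (None) are assumed to be excluded from the percentage denominator.

```python
from collections import Counter


def column_identity(rows):
    """Return per-cell identity percentages for a list of aligned rows.

    rows -- list of equal-length lists; None marks a gap.
    The result has the same shape; gap cells stay None.
    """
    width = len(rows[0]) if rows else 0

    # First, calculate the percentage of every residue in every column.
    # Assumption: gaps are not counted in the column total.
    column_percent = []
    for i in range(width):
        residues = [row[i] for row in rows if row[i] is not None]
        total = len(residues)
        counts = Counter(residues)
        column_percent.append(
            {res: 100.0 * n / total for res, n in counts.items()}
        )

    # Second, map these percentages back onto the alignment cells.
    return [
        [None if row[i] is None else column_percent[i][row[i]]
         for i in range(width)]
        for row in rows
    ]
```

For example, with rows `["A", "C"]`, `["A", None]` and `["G", "C"]`, both `A` cells in the first column get two thirds of 100, and both `C` cells in the second column get 100.0.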
101 @staticmethod
103 """ Import data from fasta file
105 monomer_kind is a class inherited from MonomerType
107 >>> import project
108 >>> sequences,alignment=project.Project.from_fasta(open("test.fasta"))
109 """
128 #if there is description
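As a rough illustration of the fasta import described above, including the optional description after the sequence name, here is a minimal parsing sketch. The function `parse_fasta` is hypothetical and returns plain tuples; the real `from_fasta` builds Sequence and Monomer objects and is not reproduced here.

```python
import io


def parse_fasta(handle):
    """Read fasta records from an open file-like object.

    Returns a list of (name, description, gapped_string) tuples.
    The description is an empty string when the header has none.
    """
    records = []
    name = description = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, description, "".join(chunks)))
            header = line[1:].split(None, 1)
            name = header[0] if header else ""
            # The description is whatever follows the name, if present.
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, description, "".join(chunks)))
    return records


example = io.StringIO(">a first seq\nAC-\nD\n>b\nA--D\n")
records = parse_fasta(example)
```

Sequence text split over several lines is joined back into one string, which is the inverse of the line splitting done when saving.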
156 @staticmethod
158 """ Constructs new alignment from sequences
160 Adds Nones to the right end so that all alignment sequences have equal length
161 """
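The right-padding described above can be sketched as below. The helper name `pad_rows` is hypothetical, and rows are plain lists here rather than lists of monomers.

```python
def pad_rows(rows):
    """Pad every row with None on the right up to the longest row's length."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [None] * (width - len(row)) for row in rows]
```

The input rows are copied, not modified, so the original ungapped sequences stay intact.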
171 """ Saves alignment to given file
173 Splits long lines into substrings of length=long_line
174 To prevent this, set long_line=None
175 """
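A minimal sketch of the line splitting described above, assuming one record is written per call. The function name `write_fasta` and its signature are hypothetical; the default of 70 follows the commit message, and `long_line=None` disables splitting as the docstring states.

```python
import io


def write_fasta(out, name, string, long_line=70):
    """Write one fasta record, wrapping the sequence at long_line characters.

    out -- writable file-like object
    long_line -- maximum line length, or None to write a single line
    """
    out.write(">%s\n" % name)
    if long_line is None:
        out.write(string + "\n")
        return
    for start in range(0, len(string), long_line):
        out.write(string[start:start + long_line] + "\n")


buf = io.StringIO()
write_fasta(buf, "s", "A" * 150, long_line=70)
wrapped_lines = buf.getvalue().splitlines()
```

Here `wrapped_lines` holds the header followed by sequence lines of 70, 70 and 10 characters.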
179 """ Simply aligns this alignment using sequences (muscle)
181 uses the old Monomer and Sequence objects
182 """
208 """ Returns a list of the columns of the alignment
210 sequence or sequences:
211 if sequence is given, then column is (original_monomer, monomer)
212 if sequences is given, then column is (original_monomer, {sequence: monomer})
213 if both of them are given, it is an error
214 original (Sequence type):
215 if given, only columns where the original sequence has a monomer are kept
216 """
232 """ Adds pdb information to each sequence
234 TODO: conformity_file
235 """
246 """ Returns string representing secondary structure """
253 """ Save secondary structure and name in fasta format """