allpy
view allpy/fileio.py @ 825:4f896db3531d
allpy.fileio. markup filetype now allows either - or _ to be used in headers interchangeably [closes #89]
author | Daniil Alexeyevsky <dendik@kodomo.fbb.msu.ru> |
---|---|
date | Fri, 15 Jul 2011 18:01:31 +0400 |
parents | c76ccff11df5 |
children | 18119191a4c8 |
"""This ugly helper is to avoid bad untimely import loops."""
"""Automatic file I/O."""
"""Some helpers."""
"""Append alignment to the file."""
"""Read alignment from the file."""
"""FASTA parser & writer."""
"""Append one sequence to file."""
"""Write sequences to file.

Sequences are given as a list of tuples (string, name, description).
"""
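The tuple layout described in that docstring can be illustrated with a short sketch. This is not the module's own code: `write_fasta` and its `line_width` parameter are hypothetical names, and wrapping at 60 characters is an assumption.

```python
import io


def write_fasta(file, sequences, line_width=60):
    """Write (string, name, description) tuples as FASTA records."""
    for string, name, description in sequences:
        header = ">" + name
        if description:
            header += " " + description
        file.write(header + "\n")
        # Wrap the sequence so that long entries stay readable.
        for start in range(0, len(string), line_width):
            file.write(string[start:start + line_width] + "\n")


buffer = io.StringIO()
write_fasta(buffer, [("seqvencemouse", "cyb5_mouse", "")])
```

Any object with a `write` method works as `file`, so the same function can target an open file or an in-memory buffer.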
"""Read parts beginning with > in a FASTA file.

This is a drop-in replacement for self.file.read().split("\n>").
It is required for the markup format, which combines parts read with
different parsers. Python prohibits combining iterators and file.read
methods on the same file.
"""
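A minimal sketch of such a reader, assuming it may consume the file purely by iteration. The name `read_parts` is hypothetical, and the chunks it yields are not byte-identical to what `split("\n>")` returns: every chunk loses its leading `>` and keeps its trailing newline.

```python
import io


def read_parts(file):
    """Yield the chunks of a FASTA file that start with '>'.

    Only line iteration is used, so the caller can keep iterating over
    the same file object before and after this generator runs.
    """
    part = None
    for line in file:
        if line.startswith(">"):
            if part is not None:
                yield "".join(part)
            part = [line[1:]]  # start a new chunk without the '>'
        elif part is not None:
            part.append(line)
    if part is not None:
        yield "".join(part)


parts = list(read_parts(io.StringIO(">a desc\nAAA\nCCC\n>b\nBBB\n")))
```

Lines that precede the first `>` are skipped in this sketch; whether the real parser does the same is not stated in the source.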
"""Parser & writer for our own marked alignment file format.

Marked alignment file consists of a list of records, separated with one or
more empty lines. Each record consists of type name, header and optional
contents. Type name is a line, containing just one word, describing the
record type. Header is a sequence of lines, each in format `key: value`.
Content, if present, is separated from header with an empty line.

Type names and header key names are case-insensitive and '-' and '_' in
them are equivalent.

Known record types now are:

- `alignment` -- this must be the last record in file for now
- `sequence_markup`
- `alignment_markup`

Example::

    sequence-markup
    sequence-name: cyb5_mouse
    sequence-description:
    name: pdb_residue_number
    type: SequencePDBResidueNumberMarkup
    markup: -,12,121,122,123,124,13,14,15,-,-,16

    alignment-markup
    name: geometrical_core
    type: AlignmentGeometricalCoreMarkup
    markup: -,-,-,-,+,+,+,-,-,-,+,+,-,-,-,-

    alignment
    format: fasta

    > cyb5_mouse
    seqvencemouse
"""
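The record layout and the name-equivalence rule described above can be sketched as follows. This is an illustration, not the module's parser: `normalize`, `parse_records` and the `record_type` dictionary key are hypothetical names. Because the format alone does not say whether an empty line starts content or a new record, the sketch assumes that only the final `alignment` record carries content.

```python
def normalize(name):
    """Lower-case a type or key name and treat '-' and '_' as equivalent."""
    return name.strip().lower().replace("-", "_")


def parse_records(text):
    """Return the records in `text` as a list of dicts (hypothetical helper)."""
    records = []
    lines = iter(text.splitlines())
    for line in lines:
        if not line.strip():
            continue  # empty lines separate records
        record = {"record_type": normalize(line)}
        for header in lines:
            if not header.strip():
                break  # an empty line ends the header
            key, _, value = header.partition(":")
            record[normalize(key)] = value.strip()
        if record["record_type"] == "alignment":
            # Assumption: the alignment record is last, so the rest is its content.
            record["content"] = "\n".join(lines)
        records.append(record)
    return records


example = "\n".join([
    "Sequence-Markup",
    "sequence-name: cyb5_mouse",
    "type: SequencePDBResidueNumberMarkup",
    "",
    "alignment",
    "format: fasta",
    "",
    "> cyb5_mouse",
    "seqvencemouse",
])
records = parse_records(example)
```

The record type is stored under `record_type` rather than `type` because markup records already have a `type` header key, which would otherwise overwrite it.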
"""Helper attribute for write_empty_line."""
"""Write alignment to file."""
"""Write a dictionary of markups as a series of records."""
"""Write record to file. Add an empty line before every record except the first."""
"""Add an empty line every time except the first call."""
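The "empty line before every record except the first" behaviour can be sketched with a small flag-based writer. The class name `RecordWriter`, the `first_call` attribute and the method signatures are assumptions made for illustration; only the behaviour comes from the docstrings.

```python
import io


class RecordWriter:
    """Hypothetical writer that separates records with one empty line."""

    def __init__(self, file):
        self.file = file
        self.first_call = True  # stands in for the helper attribute

    def write_empty_line(self):
        """Add an empty line every time except the first call."""
        if self.first_call:
            self.first_call = False
            return
        self.file.write("\n")

    def write_record(self, record_type, headers):
        """Write one record: the type name, then `key: value` header lines."""
        self.write_empty_line()
        self.file.write(record_type + "\n")
        for key, value in headers.items():
            self.file.write("%s: %s\n" % (key, value))


out = io.StringIO()
writer = RecordWriter(out)
writer.write_record("alignment_markup", {"name": "geometrical_core"})
writer.write_record("alignment", {"format": "fasta"})
```

Keeping the separator logic in one method means the file never starts or ends with a stray empty line, however many records are written.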
"""Read alignment from file."""
"""Found sequence markup record in file. Do something about it."""
"""Found alignment markup record in file. Do something about it."""
"""Found alignment record. It has been handled in read_payload."""
"""Read records and return them as a list of dicts."""
"""Read record headers and record payload."""
"""Read record payload, if necessary."""
"""Parser & writer for file formats supported by EMBOSS."""
"""Write sequences to file."""
"""EMBOSS does not permit : in file names. Fix sequences for that."""
"""Read sequences from file."""
# vim: set et ts=4 sts=4 sw=4: