This program can analyse alignments in two formats: .fasta
and .aln. There is no need to describe them in detail, so only a
brief outline of each is given below.
Every alignment to be analysed must be in either
.fasta or .aln format!
Fasta format
>mlr8250
-----MPTGYADGRSMTDTVETIDYSKTLYLPQTDFPMRAGLP----EKEPVLVKRWQDM
D---LYAKLRES---AAG-----RTKYVLHDGPPYANG-NIHIGHALNKILKDVITRSFQ
MRGYDSTYVPG-WDCHGLPIEWKIEEQYRAKGKNKDEVPV----NEFRKECREFAAHWIT
>CC0701
---------MAD-----DATTARDYRETVFLPDTPFPMRAGLP----KKEPEILEGWAAL
SEKGLYGAVRQKR-QAAG-----APLFVFHDGPPYANG-AIHIGHALNKILKDFVVRSRF
ALGYDVDYVPG-WDCHGLPIEWKIEEQFRAKGRRKDEVPA----EEFRRECRAYAGGWIE
>ileS
---------------------MSDYKSTLNLPETGFPMRGDLA----KREPGMLARWTDD
D---LYGIIRAA---KKG-----KKTFILHDGPPYANG-SIHIGHSVNKILKDIIVKSKG
LSGYDSPYVPG-WDCHGLPIELKVEQEYGKPG---EKFTA----AEFRAKCREYAATQVD
>ZileS
---------------------MSDYKSTLNLPETGFPMRGDLA----KREPGMLARWTDD
D---LYGIIRAA---KKG-----KKTFILHDGPPYANG-SIHIGHSVNKILKDIIVKSKG
LSGYDSPYVPG-WDCHGLPIELKVEQEYGKPD---EKFTA----AEFRAKCREYAATQVD
>VC0682
---------------------MSEYKDTLNLPETGFPMRGDLA----KREPEMLQRWYQE
D---LYGAIRQA---KKG-----KKSFVLHDGPPYANG-DIHIGHALNKILKDVIIKSKT
LSGFDAPYIPG-WDCHGLPIELMVEKKVGKPG---QKVTA----AEFREKCREYAAGQVE
>HI0962
--------------------MTVDYKNTLNLPETSFPMRGDLA----KREPDKXKNWYEK
N---LYQKIRKA---SKG-----KKSFILHDGPPYANG-NIHIGHAVNKILKDIIIKSKT
ALGFDSPYIPG-WDCHGLPIELKVEGLVGKPN---EKISA----AEFRQKCREYAAEQVE
>PM1662
--------------------MTVDYKNTLNLPETGFPMRGDLA----KREPNMLKSWYEK
D---LYQKIRQA---SKG-----KKSFILHDGPPYANG-TIHIGHAVNKILKDIIVKSKT
ALGYDSPYIPG-WDCHGLPIELKVEGLVGKPN---QNISA----AQFREACRQYAAEQVE
>PA4560
---------------------MTDYKATLNLPETAFPMKAGLP----QREPETLKFWNDI
G---LYQKLRAI---GGD-----RPKFVLHDGPPYANG-SIHIGHAVNKILKDIIVRSKT
LAGYDAPYVPG-WDCHGLPIEHKVETTHGK------NLPA----DKTRELCREYAAEQIE
>NMB1833
---------------------MTDYSKTVNLLESPFPMRGNLA----KREPAWLKSWYEQ
K---RYQKLREI---AKG-----RPKFILHDGPPYANG-DIHIGHAVNKILKDIIIRSKT
QAGFDAPYVPG-WDCHGLPIEVMVEKLHGK------DMPK----ARFRELCREYAAEQIA
>NMA0622
---------------------MTDYSKTVNLLESPFPMRGNLA----KREPAWLKSWYEQ
K---RYQKLREI---AKG-----RPKFILHDGPPYANG-DIHIGHAVNKILKDIIIRSKT
QAGFDAPYVPG-WDCHGLPIEVMVEKLHGK------DMPK----ARFRELCREYAAEQIA
The sequences follow one another. Each sequence record begins with
a line giving the name of the sequence, which is marked by the
leading symbol ">". The sequence itself starts on the next line
and may span several lines.
No additional information is expected in the file.
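
The rules above can be sketched in a few lines of Python. This is only
an illustration of the format as described here, not the program's own
reader; the function name parse_fasta and the list-of-pairs return value
are assumptions made for the example.

```python
def parse_fasta(text):
    """Return a list of (name, aligned_sequence) pairs in file order."""
    records = []
    name = None      # name of the record currently being read
    chunks = []      # sequence lines collected for that record
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                      # tolerate blank lines
        if line.startswith(">"):
            # A new name line closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Shortened, made-up input in the same layout as the example above.
example = """>ileS
------MSDY
KSTLNLPETG
>VC0682
------MSEY
KDTLNLPETG
"""
records = parse_fasta(example)
```

Here the lines of each record are joined into one string, so every
sequence of an alignment should end up with the same length.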
Aln format
CLUSTAL W (1.74) multiple sequence alignment
mlr8250 -----MPTGYADGRSMTDTVETIDYSKTLYLPQTDFPMRAGLP----EKEPVLVKRWQDM
CC0701 ---------MAD-----DATTARDYRETVFLPDTPFPMRAGLP----KKEPEILEGWAAL
ileS ---------------------MSDYKSTLNLPETGFPMRGDLA----KREPGMLARWTDD
ZileS ---------------------MSDYKSTLNLPETGFPMRGDLA----KREPGMLARWTDD
VC0682 ---------------------MSEYKDTLNLPETGFPMRGDLA----KREPEMLQRWYQE
HI0962 --------------------MTVDYKNTLNLPETSFPMRGDLA----KREPDKXKNWYEK
PM1662 --------------------MTVDYKNTLNLPETGFPMRGDLA----KREPNMLKSWYEK
PA4560 ---------------------MTDYKATLNLPETAFPMKAGLP----QREPETLKFWNDI
NMB1833 ---------------------MTDYSKTVNLLESPFPMRGNLA----KREPAWLKSWYEQ
NMA0622 ---------------------MTDYSKTVNLLESPFPMRGNLA----KREPAWLKSWYEQ
mlr8250 D---LYAKLRES---AAG-----RTKYVLHDGPPYANG-NIHIGHALNKILKDVITRSFQ
CC0701 SEKGLYGAVRQKR-QAAG-----APLFVFHDGPPYANG-AIHIGHALNKILKDFVVRSRF
ileS D---LYGIIRAA---KKG-----KKTFILHDGPPYANG-SIHIGHSVNKILKDIIVKSKG
ZileS D---LYGIIRAA---KKG-----KKTFILHDGPPYANG-SIHIGHSVNKILKDIIVKSKG
VC0682 D---LYGAIRQA---KKG-----KKSFVLHDGPPYANG-DIHIGHALNKILKDVIIKSKT
HI0962 N---LYQKIRKA---SKG-----KKSFILHDGPPYANG-NIHIGHAVNKILKDIIIKSKT
PM1662 D---LYQKIRQA---SKG-----KKSFILHDGPPYANG-TIHIGHAVNKILKDIIVKSKT
PA4560 G---LYQKLRAI---GGD-----RPKFVLHDGPPYANG-SIHIGHAVNKILKDIIVRSKT
NMB1833 K---RYQKLREI---AKG-----RPKFILHDGPPYANG-DIHIGHAVNKILKDIIIRSKT
NMA0622 K---RYQKLREI---AKG-----RPKFILHDGPPYANG-DIHIGHAVNKILKDIIIRSKT
: :*** .. *:
mlr8250 MRGYDSTYVPG-WDCHGLPIEWKIEEQYRAKGKNKDEVPV----NEFRKECREFAAHWIT
CC0701 ALGYDVDYVPG-WDCHGLPIEWKIEEQFRAKGRRKDEVPA----EEFRRECRAYAGGWIE
ileS LSGYDSPYVPG-WDCHGLPIELKVEQEYGKPG---EKFTA----AEFRAKCREYAATQVD
ZileS LSGYDSPYVPG-WDCHGLPIELKVEQEYGKPD---EKFTA----AEFRAKCREYAATQVD
VC0682 LSGFDAPYIPG-WDCHGLPIELMVEKKVGKPG---QKVTA----AEFREKCREYAAGQVE
HI0962 ALGFDSPYIPG-WDCHGLPIELKVEGLVGKPN---EKISA----AEFRQKCREYAAEQVE
PM1662 ALGYDSPYIPG-WDCHGLPIELKVEGLVGKPN---QNISA----AQFREACRQYAAEQVE
PA4560 LAGYDAPYVPG-WDCHGLPIEHKVETTHGK------NLPA----DKTRELCREYAAEQIE
NMB1833 QAGFDAPYVPG-WDCHGLPIEVMVEKLHGK------DMPK----ARFRELCREYAAEQIA
NMA0622 QAGFDAPYVPG-WDCHGLPIEVMVEKLHGK------DMPK----ARFRELCREYAAEQIA
. . :* **:* * .
An .aln file begins with the word "CLUSTAL".
Some additional information may be included at the very beginning.
Every line with such information begins with the symbol "#".
Additional information can appear only at the beginning of the
.aln file.
After this come the blocks of the alignment. Every block
is a part of the alignment. All sequences are present in every
block; their list and order are the same in all
blocks. Every block is 60 alignment positions long.
In every block, a line of symbols may be present under the
alignment. These symbols usually mark the type of each position
(conserved or semi-conserved). This program does not take
these symbols into account.
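
A minimal Python sketch of reading such a file, following the description
above, is shown here. It is not the program's own reader. The function
name parse_aln is hypothetical, and the sketch makes two assumptions:
lines starting with "#" are skipped wherever they occur, and a line made
up only of spaces, "*", ":" and "." is treated as the line of
conservation symbols and ignored.

```python
def parse_aln(text):
    """Return a list of (name, aligned_sequence) pairs in file order."""
    # Drop blank lines and '#' information lines (assumption: skipped anywhere).
    content = [ln for ln in text.splitlines()
               if ln.strip() and not ln.startswith("#")]
    if not content or not content[0].startswith("CLUSTAL"):
        raise ValueError("not an .aln file: missing CLUSTAL header")

    order = []    # sequence names in order of first appearance
    chunks = {}   # name -> list of per-block sequence fragments
    for line in content[1:]:
        # A line holding only conservation symbols strips to nothing.
        if line.strip(" \t*:.") == "":
            continue
        fields = line.split()
        if len(fields) < 2:
            raise ValueError(f"malformed alignment line: {line!r}")
        name, part = fields[0], fields[1]
        if name not in chunks:
            order.append(name)
            chunks[name] = []
        chunks[name].append(part)

    result = [(name, "".join(chunks[name])) for name in order]
    # All sequences must appear in every block, so lengths must agree.
    if len({len(seq) for _, seq in result}) > 1:
        raise ValueError("sequences have different aligned lengths")
    return result


# Shortened, made-up input with two blocks of 10 positions each.
example = """CLUSTAL W (1.74) multiple sequence alignment

ileS    ------MSDY
VC0682  ------MSEY
              *:*

ileS    KSTLNLPETG
VC0682  KDTLNLPETG
        *.********
"""
alignment = parse_aln(example)
```

The fragments of each sequence are concatenated block by block, which
gives the same name and sequence pairs as the .fasta form of the same
alignment.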