These files consist of XML tags and attribute values, and a DOCTYPE header may or may not be present. The MEGA input converter for XML file formats does not implement a full XML parser; it only looks for a few specific tags. For example, an XML file might contain the following data:
<Bioseq-set>
<Bioseq>
<name>G019uabh</name>
<length>240</length>
<mol>DNA</mol>
<cksum>302C447C</cksum>
<seq-data>ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATT
AAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTGATTGATTGATTGATTGATGGTT
TACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGCAGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGT
CAAAGCATGTACTTAGAGTT</seq-data>
</Bioseq>
</Bioseq-set>
The MEGA format converter looks for the following two tags:
<name>G019uabh</name>
<seq-data>ATACATCATAACACTAC. . .</seq-data>
If it finds these tags, it uses the text between the <name>. . .</name> tags as the sequence name, and the text between the <seq-data>. . .</seq-data> tags as the sequence data for that name. The other tags in the example (<length>, <mol>, and <cksum>) do not appear in the output. The conversion of the above XML block into MEGA format would look like this:
#Mega
Title: filename.xml
#G019uabh
ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATT
AAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTGATTGATTGATTGATTGATGGTT
TACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGCAGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGT
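
The tag-scanning approach described in this section can be illustrated with a short Python sketch. This is not MEGA's own code: the function name xml_to_mega, the use of regular expressions, and the rule that the Nth <name> is paired with the Nth <seq-data> are all assumptions made for illustration. The sketch skips full XML parsing, pulls out only the text between the two tags, and writes it in the layout shown above.

```python
import re

# Illustrative sketch only; MEGA's actual converter may work differently.
# Both patterns capture the text between an opening and a closing tag,
# and DOTALL lets the sequence data span several lines.
NAME_RE = re.compile(r"<name>(.*?)</name>", re.DOTALL)
SEQ_RE = re.compile(r"<seq-data>(.*?)</seq-data>", re.DOTALL)


def xml_to_mega(xml_text, title):
    """Convert <name>/<seq-data> pairs found in xml_text to MEGA-style text.

    Assumption: names and sequences are paired in order of appearance.
    Every other tag is ignored, since no real XML parsing is done.
    """
    names = [n.strip() for n in NAME_RE.findall(xml_text)]
    seqs = [s.strip() for s in SEQ_RE.findall(xml_text)]

    lines = ["#Mega", f"Title: {title}"]
    for name, seq in zip(names, seqs):
        lines.append(f"#{name}")
        # Keep the original line breaks of the sequence, dropping blank lines.
        lines.extend(part.strip() for part in seq.splitlines() if part.strip())
    return "\n".join(lines) + "\n"
```

Calling xml_to_mega(text, "filename.xml") on the contents of the XML example produces the header lines, one #name line per sequence, and the sequence text with its line breaks preserved. Because the sketch only matches tag text, it would not cope with attributes on the two tags or with nested markup inside them.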