Converting XML Format

XML files consist of tags and attribute values, and may or may not begin with a DOCTYPE header. The MEGA input converter for XML does not implement a full XML parser; it looks only for a few specific tags. For example, an XML file might contain the following data:

 

<Bioseq-set>
<Bioseq>
<name>G019uabh</name>
<length>240</length>
<mol>DNA</mol>
<cksum>302C447C</cksum>
<seq-data>ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATT
AAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTGATTGATTGATTGATTGATGGTT
TACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGCAGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGT
CAAAGCATGTACTTAGAGTT</seq-data>
</Bioseq>
</Bioseq-set>

 

The MEGA format converter looks for the following two tags:

 

<name>G019uabh</name>

<seq-data>ATACATCATAACACTAC. . .</seq-data>

 

If it finds these tags, it uses the text between the <name>...</name> tags as the sequence name, and the text between the <seq-data>...</seq-data> tags as the sequence data for that name. Converted to MEGA format, the XML block above would look like this:

 

#Mega
Title: filename.xml

#G019uabh
ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATT
AAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTGATTGATTGATTGATTGATGGTT
TACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGCAGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGT
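
The tag lookup described above can be illustrated with a short Python sketch. This is not MEGA's actual converter code. The function names (extract_sequences, to_mega), the pairing of names with sequences by order of appearance, and the 60-character wrap width are all assumptions made for illustration, and the sample sequence is shortened. Like the converter, the sketch does not parse the XML; it only scans for the two tags and ignores everything else.

```python
import re

# Hypothetical helpers for illustration; not MEGA's implementation.
NAME_RE = re.compile(r"<name>(.*?)</name>", re.DOTALL)
SEQ_RE = re.compile(r"<seq-data>(.*?)</seq-data>", re.DOTALL)


def extract_sequences(xml_text):
    """Return (name, sequence) pairs found by scanning for the two tags."""
    names = [n.strip() for n in NAME_RE.findall(xml_text)]
    # Drop line breaks and spaces that wrap the sequence text.
    seqs = ["".join(s.split()) for s in SEQ_RE.findall(xml_text)]
    # Assumption: the i-th <name> belongs to the i-th <seq-data>.
    return list(zip(names, seqs))


def to_mega(pairs, title, width=60):
    """Lay out the pairs as in the example above (wrap width is assumed)."""
    lines = ["#Mega", "Title: " + title, ""]
    for name, seq in pairs:
        lines.append("#" + name)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
        lines.append("")
    return "\n".join(lines)


# Shortened sample; tags other than <name> and <seq-data> are ignored.
sample = """<Bioseq-set>
<Bioseq>
<name>G019uabh</name>
<length>240</length>
<mol>DNA</mol>
<seq-data>ATACATCATAAC
ACTACTTCC</seq-data>
</Bioseq>
</Bioseq-set>"""

pairs = extract_sequences(sample)
print(to_mega(pairs, "filename.xml"))
```

Because the scan ignores document structure, a <name> tag that appears outside a sequence record would also be picked up by this sketch, which is one practical consequence of not using a full parser.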