The Sequence Data Explorer shows the aligned sequence data. You can scroll along the alignment using the scrollbar at the bottom right hand side of the explorer window. The Sequence Data Explorer provides a number of utilities for exploring the statistical attributes of the data and also for selecting data subsets.
This explorer consists of a number of regions as follows:
Menu Bar
Help: This item brings up the help file for the Sequence Data Explorer.
Tool Bar
The tool bar provides quick access to the following menu items:
General Utilities
: This brings up the Exporting Sequence Data dialog box, which contains options that control how MEGA writes the output data. The available output formats are Text, MEGA, CSV, and Excel.
: This brings up the Exporting Sequence Data dialog box and sets the default output format to MEGA.
: This brings up the Exporting Sequence Data dialog box and sets the default output format to Excel.
: This brings up the Exporting Sequence Data dialog box and sets the default output format to CSV (comma-separated values).
: This brings up the dialog box for setting up and selecting domains and genes.
: This brings up the dialog box for setting up, editing, and selecting taxa and groups of taxa.
: This toggle replaces the nucleotide/amino acid at a site with the identical symbol (e.g., a dot) wherever a sequence contains the same nucleotide/amino acid as the first sequence at that site.
: This button translates codons in the sequence data into amino acid sequences and back. All protein-coding regions are automatically identified and translated for display. If the translated sequence is already displayed, issuing this command displays the original nucleotide sequences (including all coding and non-coding regions). Depending on the data displayed (translated or nucleotide), the relevant menu options in the Sequence Data Explorer become enabled. Note that the translated/untranslated status in this data explorer has no impact on the analysis options available in MEGA (e.g., the Distances or Phylogeny menus), as MEGA provides all possible options for your dataset at all times.
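The identical-symbol toggle described above can be illustrated with a short Python sketch. This is not MEGA's code; the function name mask_identical is hypothetical, and the sketch assumes that every sequence is compared against the first sequence and that all rows have the same aligned length.

```python
def mask_identical(alignment, symbol="."):
    """Replace residues that match the first sequence with `symbol`.

    Assumption: the first row is the reference and is always shown in full.
    """
    if not alignment:
        return []
    reference = alignment[0]
    masked = [reference]
    for seq in alignment[1:]:
        # Compare each residue with the reference residue at the same site.
        row = "".join(symbol if a == b else a for a, b in zip(seq, reference))
        masked.append(row)
    return masked


alignment = ["ATGCCA", "ATGTCA", "ACGCCA"]
for row in mask_identical(alignment):
    print(row)
```

With the three sequences above, only the residues that differ from the first sequence remain visible in the second and third rows.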
Highlighting Sites
C: If this button is pressed, then all constant sites will be highlighted. A count of the highlighted sites will be displayed on the status bar.
V: If this button is pressed, then all variable sites will be highlighted. A count of the highlighted sites will be displayed on the status bar.
Pi: If this button is pressed, then all parsimony-informative sites will be highlighted. A count of the highlighted sites will be displayed on the status bar.
S: If this button is pressed, then all singleton sites will be highlighted. A count of the highlighted sites will be displayed on the status bar.
L: If this button is pressed, then all labelled sites will be highlighted and a count of highlighted sites will be displayed on the status bar (see also labelled sites).
0: If this button is pressed, then sites will be highlighted only if they are zero-fold degenerate sites in all sequences displayed. A count of highlighted sites will be displayed on the status bar. (This button is available only if the dataset contains protein-coding DNA sequences.)
2: If this button is pressed, then sites will be highlighted only if they are two-fold degenerate sites in all sequences displayed. A count of highlighted sites will be displayed on the status bar. (This button is available only if the dataset contains protein-coding DNA sequences.)
4: If this button is pressed, then sites will be highlighted only if they are four-fold degenerate sites in all sequences displayed. A count of highlighted sites will be displayed on the status bar. (This button is available only if the dataset contains protein-coding DNA sequences.)
Special: This dropdown allows for the selection of a special highlighting option.
CpG/TpG/CpA: If this option is selected, then all sites that have a C followed by a G, a T followed by a G, or a C followed by an A will be highlighted. You may also specify the percentage of sequences that must have these properties for a site to be counted.
Coverage: If this option is selected, you will be asked to enter a percentage. All sites where the proportion of ambiguous data is at or below this percentage will be highlighted.
: This button allows you to quickly navigate between highlighted sites by jumping to the previous or next highlighted site.
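The C, V, Pi, and S categories above can be sketched in Python as follows. The help text does not define these categories, so the sketch assumes the commonly used definitions: a constant site has one state, a variable site has at least two, a parsimony-informative site has at least two states that each occur at least twice, and a singleton site is a variable site where at most one state occurs more than once. The function names and the choice to ignore the characters -, ?, and N are assumptions for nucleotide data, not MEGA's actual behavior.

```python
from collections import Counter

# Assumption: gaps, missing data, and N are ignored when classifying a site.
MISSING = set("-?N")


def classify_site(column):
    """Return the set of highlight categories that apply to one alignment column."""
    counts = Counter(ch for ch in column.upper() if ch not in MISSING)
    if not counts:
        return set()  # nothing but gaps or missing data
    if len(counts) == 1:
        return {"constant"}
    labels = {"variable"}
    # Number of distinct states that occur in at least two sequences.
    repeated = sum(1 for n in counts.values() if n >= 2)
    labels.add("parsimony-informative" if repeated >= 2 else "singleton")
    return labels


def count_highlighted(alignment):
    """Count sites per category, like the totals shown on the status bar."""
    totals = Counter()
    for column in zip(*alignment):
        totals.update(classify_site("".join(column)))
    return totals


alignment = ["ACGTA", "ACGAA", "ATGAC", "ATGTA"]
print(dict(count_highlighted(alignment)))
```

Under these definitions every variable site is either parsimony-informative or a singleton, so the V count equals the sum of the Pi and S counts.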
Searching
: This button allows you to specify a sequence name to find. Search results are shown in bold and the row is highlighted in blue. MEGA first looks for an exact match to the name you specified. If none exists, it looks for names that start with the search term. If no names start with the search term, MEGA looks for the search term anywhere in the names (rather than just at the start).
: This button allows you to specify a motif to search for in the sequence data. The motif may contain IUPAC codes such as R (for A or G) and Y (for T or C). MEGA highlights (in yellow) the first instance of the motif that it finds.
and : These buttons are enabled only after you have searched for a sequence name or motif. Clicking the forward or backward button makes MEGA jump to the next or previous search result (assuming there is more than one match).
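The two searches described above can be sketched in Python. This is an illustration, not MEGA's implementation: the function names are hypothetical, the name search is assumed to be case-insensitive, and the motif search is assumed to expand each IUPAC nucleotide code into a regular-expression character class.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_name(names, term):
    """Return the index of the first name matching `term`, or None.

    Tries an exact match first, then a prefix match, then a substring match.
    """
    term = term.lower()  # assumption: matching ignores case
    tests = (
        lambda name: name == term,
        lambda name: name.startswith(term),
        lambda name: term in name,
    )
    for test in tests:
        for index, name in enumerate(names):
            if test(name.lower()):
                return index
    return None


def find_motif(sequence, motif, start=0):
    """Return the position of the first motif match at or after `start`, or None."""
    pattern = re.compile("".join(f"[{IUPAC[ch]}]" for ch in motif.upper()))
    match = pattern.search(sequence.upper(), start)
    return match.start() if match else None


names = ["Human", "Chimpanzee", "Chimp_2", "Old_chimp"]
print(find_name(names, "chimp"))    # no exact match, so the prefix match wins

sequence = "TTGACGTAAGCT"
first = find_motif(sequence, "RY")
print(first, find_motif(sequence, "RY", first + 1))
```

Passing the previous hit position plus one as start mimics the forward button by moving to the next match.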
The 2-Dimensional Data Grid
Fixed Row: This is the first row in the data grid. It is used to display the nucleotides (or amino acids) in the first sequence when you have chosen to show their identity using a special character. For protein coding regions, it also clearly marks the first, second, and the third codon positions.
Fixed Column: This is the first and the leftmost column in the data grid. It is always visible, even when you are scrolling through sites. The column contains the sequence names and an associated check box. You can check or uncheck this box to include or exclude a sequence from analysis. Also in this column, you can drag-and-drop sequences to sort them.
Rest of the Grid: Cells to the right of the first column and below the first row contain the nucleotides or amino acids of the input data. Note that cells are drawn in a light color if they contain data belonging to unselected sequences, genes, or domains.
Status Bar
This section displays the location of the focused site and the total sequence length. It also shows the site label, if any, and a count of the highlighted sites.