Aligning Sequences

In this tutorial, we show how to create a multiple sequence alignment from protein sequence data imported into the Alignment Explorer using different methods. All of the data files used in this tutorial can be found in the MEGA\Examples\ folder. The default location for Windows users is C:\Program Files\MEGA\Examples\. The location for Mac users is $HOME/MEGA/Examples, where $HOME is the user’s home directory.

Opening an Alignment

The Alignment Explorer is the tool for building and editing multiple sequence alignments in MEGA.

Example 2.1:

Launch the Alignment Explorer by selecting Align | Edit/Build Alignment on the launch bar of the main MEGA window.

Select Create New Alignment and click Ok. A dialog will appear asking “Are you building a DNA or Protein sequence alignment?” Click the button labeled “DNA”.

From the Alignment Explorer main menu, select Data | Open | Retrieve sequences from File. Select the "hsp20.fas" file from the MEGA/Examples directory.
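
For readers unfamiliar with the FASTA format used by files such as hsp20.fas: each record consists of a header line beginning with ">" followed by one or more lines of sequence. The Python sketch below is illustrative only and is not part of MEGA. The function name parse_fasta and the sample records are hypothetical and are not taken from hsp20.fas.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)  # sequence may span several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Hypothetical sample data for illustration.
sample = """>seq_A
ATGGCT
GATTAC
>seq_B
ATGGCTGAT
"""
records = parse_fasta(sample)
```

Here records holds one (name, sequence) pair per header line, with multi-line sequences joined into a single string.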


Aligning Sequences by ClustalW

You can create a multiple sequence alignment in MEGA using either the ClustalW or the Muscle algorithm. Here we align a set of sequences using the ClustalW option.

Example 2.2:

Open the hsp20.fas file, following the instructions in Example 2.1.

Select the Edit | Select All menu command to select all sites for every sequence in the data set.

Select Alignment | Align by ClustalW from the main menu to align the selected sequence data using the ClustalW algorithm. Click the “Ok” button to accept the default settings for ClustalW.

Once the alignment is complete, save the current alignment session by selecting Data | Save Session from the main menu. Give the file an appropriate name, such as "hsp20_Test.mas". This will allow the current alignment session to be restored for future editing.

Exit the Alignment Explorer by selecting Data | Exit Aln Explorer from the main menu.
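
ClustalW is a progressive method: it builds the multiple alignment step by step from pairwise comparisons of the sequences. To give a feel for the pairwise step, the Python sketch below implements a basic Needleman-Wunsch global alignment. It is a simplified illustration and not ClustalW itself: the scoring values (match, mismatch, gap) are arbitrary assumptions, and it uses a single linear gap penalty rather than the separate gap opening and gap extension penalties found in the ClustalW settings.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (aligned_a, aligned_b, score). Scoring values are illustrative.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


aligned_a, aligned_b, total = global_align("GATTACA", "GATACA")
```

Both returned strings have the same length, with "-" marking the inserted gaps, which is the same convention the Alignment Explorer uses to display aligned sequences.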


Aligning Sequences Using Muscle

Here we describe how to create a multiple sequence alignment using the Muscle option.

Example 2.3:

Starting from the main MEGA window, select Align | Edit/Build Alignment from the launch bar. Select Create a new alignment and then select DNA.

From the Alignment Explorer window, select Data | Open | Retrieve sequences from a file and select the “Chloroplast_Martin.meg” file from the MEGA/Examples directory.

On the Alignment Explorer main menu, select Edit | Select All.

On the Alignment Explorer launch bar, you will find an icon that looks like a flexing arm. Click on it and select Align DNA.

Near the bottom of the MUSCLE - AppLink window, you will see a row called Alignment Info, where you can read information about the Muscle program.

Click on the Compute button (accept the default settings). A Progress window will keep you informed of the Muscle alignment status. In this window, you can click on the Command Line Output tab to see the command-line parameters that were passed to the Muscle program. Note: The analysis may complete so quickly that you won’t be able to click on this tab or read it. The information in this tab isn’t essential; it is provided for interest only.

When the Muscle program has finished, the aligned sequences will be passed back to MEGA and displayed in the Alignment Explorer window.

Close the Alignment Explorer by selecting Data | Exit Aln Explorer. Select No when asked if you would like to save the current alignment session to file.
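
The Command Line Output tab mentioned above shows the parameters handed to the Muscle program. As a rough illustration of what such a command line can look like, the Python sketch below assembles one for a standalone MUSCLE 3.x installation, which accepts -in, -out and -maxiters. This is an assumption for illustration: the exact parameters MEGA passes are not given in this tutorial and may differ, MUSCLE 5 uses different option names, and the file names and the helper muscle_command are hypothetical.

```python
import shlex


def muscle_command(infile, outfile, maxiters=None, exe="muscle"):
    """Build an argument list for a MUSCLE 3.x style invocation (assumed syntax)."""
    cmd = [exe, "-in", infile, "-out", outfile]
    if maxiters is not None:
        cmd += ["-maxiters", str(maxiters)]
    return cmd


# Hypothetical file names; nothing is executed here.
cmd = muscle_command("input.fas", "aligned.fas", maxiters=16)
command_line = " ".join(shlex.quote(part) for part in cmd)
```

If a compatible muscle executable is installed, the list in cmd could be passed to subprocess.run(cmd, check=True) to perform the alignment outside MEGA.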


Obtaining Sequence Data from the Internet (GenBank)

Using MEGA’s integrated browser, you can fetch GenBank sequence data from the NCBI website if you have an active internet connection.

Example 2.4:

From the main MEGA window, select Align | Edit/Build Alignment from the main menu.

When prompted, select Create New Alignment and click Ok. Select DNA.

Activate MEGA’s integrated browser by selecting Web | Query Genbank from the main menu.

When the NCBI: Nucleotide site is loaded, enter CFS as a search term into the search box at the top of the screen. Press the Search button.

When the search results are displayed, check the box next to any item(s) you wish to import into MEGA.

If you have checked more than one box, locate the Display Settings dropdown (near the top left-hand side of the page, directly under the tab headings). Change the value to FASTA (Text) and click the Apply button. This will output all the sequences you selected as text in FASTA format.

Press the Add to Alignment button (with the red + sign) located above the web address bar. This will import the sequences into the Alignment Explorer.

With the data now displayed in the Alignment Explorer, you can close the Web Browser window.

Align the new data using the steps detailed in the previous examples.

Close the Alignment Explorer window by clicking Data | Exit Aln Explorer. Select No when asked if you would like to save the current alignment session to file.
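
Outside MEGA, the same kind of FASTA text can be requested from NCBI programmatically through the Entrez E-utilities efetch service. The Python sketch below only builds the request URL; it is an alternative shown for illustration and is not how MEGA's integrated browser works. The helper name efetch_fasta_url and the accession identifiers are hypothetical placeholders.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions, db="nuccore"):
    """Build an efetch URL that requests the given records as FASTA text."""
    params = {
        "db": db,                     # nuccore is the nucleotide database
        "id": ",".join(accessions),   # comma-separated list of identifiers
        "rettype": "fasta",
        "retmode": "text",
    }
    # Keep commas unescaped so the identifier list stays readable.
    return EFETCH_URL + "?" + urlencode(params, safe=",")


# Placeholder identifiers; replace with real accession numbers.
url = efetch_fasta_url(["ACC0001", "ACC0002"])
```

The resulting URL can be opened with urllib.request.urlopen(url) when an internet connection is available, and the downloaded text saved to a file for import into the Alignment Explorer.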

Note: We have aligned some sequences, and they are now ready to be analyzed. Whenever you need to edit or change your sequence data, open it in the Alignment Explorer and edit or align it there. Then export it to the MEGA format and open the resulting file.
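
To give an idea of what an exported file contains, the Python sketch below writes aligned sequences in a minimal MEGA-style layout: a #mega line, a !Title statement, a !Format statement, and then each sequence name prefixed with "#" followed by its residues. This is a simplified illustration based on the general layout of MEGA data files; the exact header MEGA writes on export may contain additional fields, and the function name to_mega, the title, and the sample records are hypothetical.

```python
def to_mega(records, title="Untitled alignment", width=60):
    """Format aligned (name, sequence) pairs as minimal MEGA-style text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError("sequences must be aligned to the same length")
    lines = [
        "#mega",
        "!Title {};".format(title),
        "!Format DataType=DNA indel=-;",  # assumed minimal format statement
        "",
    ]
    for name, seq in records:
        lines.append("#" + name.replace(" ", "_"))  # avoid spaces in names
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])  # wrap long sequences
    return "\n".join(lines) + "\n"


# Hypothetical aligned records for illustration.
mega_text = to_mega([("seq_A", "ATG-CT"), ("seq_B", "ATGGCT")], title="Demo")
```

The length check reflects the requirement that every sequence in an alignment has the same number of sites, gaps included.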