This example shows how to identify gene duplications (and optionally speciation events) in MEGA. For this analysis, MEGA provides a Gene Duplication Wizard window that walks you through the necessary steps. The data files used in this example can be found in the MEGA/Examples folder. The default location for Windows users is C:\Users\UserName\Documents\MEGA7\Examples\. The default location for Mac users is $HOME/MEGA/Examples, where $HOME is the user’s home directory.
Setting up the analysis
From the main MEGA window, select User Tree | Find Gene Duplications. The Gene Duplication Wizard window, which outlines the six steps for identifying gene duplications in MEGA, will be displayed.
Step 1: First, we will load a gene tree file. In the Gene Duplication Wizard window, click the Load Gene Tree... button and then, using the file open dialog, find and select the “gene_tree.nwk” tree file in the MEGA\Examples directory. After the tree file is parsed by MEGA, the Map Species To Taxa action in Step 2 will become enabled.
Step 2: Second, we will provide a species name for each taxon in the gene tree. Click the Map Species Names… button and the species mapping dialog will be displayed. Species names can be mapped manually using the grid displayed in this dialog, but we will load the names from a text file that specifies the mapping as taxon_name=species_name for each taxon in the gene tree. Click File | Import and then find the “taxa_to_species_map.txt” file. Once MEGA loads the file, the grid will be populated with species names for each taxon. Click the Save button to complete this step, and Step 3 will become enabled.
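To make the taxon_name=species_name format concrete, here is a minimal Python sketch that reads such a mapping into a dictionary. It is an illustration only, not MEGA's own parser: the function name parse_species_map, the example taxon names, and the handling of blank lines and surrounding whitespace are all assumptions.

```python
def parse_species_map(text):
    """Parse lines of the form taxon_name=species_name into a dict.

    Assumptions (not taken from MEGA): blank lines are ignored and
    whitespace around names is stripped.
    """
    mapping = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if "=" not in line:
            raise ValueError(f"line {line_number}: expected taxon_name=species_name")
        # Split on the first '=' only, so the species name may contain '='.
        taxon, species = (part.strip() for part in line.split("=", 1))
        if not taxon or not species:
            raise ValueError(f"line {line_number}: empty taxon or species name")
        if taxon in mapping:
            raise ValueError(f"line {line_number}: taxon {taxon!r} listed twice")
        mapping[taxon] = species
    return mapping


# Hypothetical taxon names, for illustration only.
example = """geneA_human=human
geneA_mouse=mouse
geneB_human=human
"""
species_of = parse_species_map(example)
```

A check like this can catch a taxon that is listed twice or a line with a missing species name before the file is imported into the species mapping dialog.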
Step 3: Next, we can optionally load a trusted species tree file. Click the Load Species Tree… button and then, using the file open dialog, find and select the “species_tree.nwk” file in the MEGA\Examples directory. After the species tree file is parsed by MEGA, the Gene Duplication Wizard will jump to Step 5. This is because the tree in the “gene_tree.nwk” file is already rooted, so we don’t need to specify its root to MEGA.
Step 4: We skip this step for brevity (but don’t worry, it is done exactly as in Step 5). Note that even if our gene tree were not rooted, we could still skip this step. In that case, MEGA would execute the analysis with all possible placements of the root and keep the result(s) that minimize the number of gene duplications found.
Step 5: Next, we must specify the placement of the root for the species tree, as this is required for the analysis. Click the Set Species Tree Root… button. The species tree will be displayed in the Tree Explorer window and the cursor will be adorned with the root placement tool icon. Click on the branch to “puffer fish” in the tree and then click the Finished button on the toolbar at the top of the window. MEGA will set the placement of the root internally and advance to the last step.
Step 6: Finally, in the Gene Duplication Wizard window, click the Launch Analysis button. Progress will be displayed as the analysis runs. When the analysis completes, the Tree Explorer window will return and display the gene tree.
Viewing the results
In the Tree Explorer window, the gene tree will be displayed with gene duplications and speciation events shown. Closed blue diamonds mark nodes that represent gene duplication events. Open red diamonds mark speciation events. To display species names instead of taxa names, click View | Show/Hide | Species Names. You can change back to taxa names by clicking View | Show/Hide | Taxa Names. You can toggle the display of markers for gene duplications and speciation events by clicking View | Show/Hide | Gene Duplication Markers (or Speciation Markers). You can also traverse gene duplications or speciations throughout the tree by clicking Search | Gene Duplication/Speciation Events.
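For intuition about how a node in a rooted gene tree can be labeled as a duplication or a speciation, the following Python sketch applies the widely used LCA-mapping criterion. Each gene-tree node is mapped to the smallest species-tree clade containing all of its species, and the node is called a duplication if one of its children maps to that same clade. This tutorial does not state which algorithm MEGA uses, so treat this as an assumed illustration rather than a description of MEGA's implementation. The nested-tuple tree representation, the function names, and the taxon names are all hypothetical.

```python
def collect_clades(tree, out):
    """Append the leaf set of every node in `tree` to `out`; return this node's leaf set."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.append(leaves)
    return leaves


def lca_clade(species_set, all_clades):
    """Smallest species-tree clade that contains every species in `species_set`."""
    return min((c for c in all_clades if species_set <= c), key=len)


def classify(gene_tree, species_of, all_clades, events):
    """Map a gene-tree node to a species clade, recording an event per internal node."""
    if isinstance(gene_tree, str):
        return frozenset([species_of[gene_tree]])
    child_maps = [classify(child, species_of, all_clades, events) for child in gene_tree]
    here = lca_clade(frozenset().union(*child_maps), all_clades)
    # Duplication if a child maps to the same species clade as its parent.
    kind = "duplication" if here in child_maps else "speciation"
    events.append((gene_tree, kind))
    return here


# Hypothetical rooted trees as nested tuples (leaves are names).
species_tree = (("human", "mouse"), "pufferfish")
gene_tree = ((("a_human", "a_mouse"), ("b_human", "b_mouse")), "a_puffer")
species_of = {
    "a_human": "human", "b_human": "human",
    "a_mouse": "mouse", "b_mouse": "mouse",
    "a_puffer": "pufferfish",
}

all_clades = []
collect_clades(species_tree, all_clades)
events = []
classify(gene_tree, species_of, all_clades, events)
kinds = [kind for _, kind in events]  # one entry per internal node, in post-order
```

In this toy example, the node joining the "a" and "b" human/mouse pairs is labeled a duplication because both of its children already span the human/mouse clade, while the other internal nodes are labeled speciations.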