View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update | ||||
0001158 | MEGA | Alignment Explorer | public | 2019-03-19 18:26 | 2019-03-21 13:11 | ||||
Reporter | guest | ||||||||
Assigned To | gstecher | ||||||||
Priority | normal | Severity | minor | Reproducibility | have not tried | ||||
Status | resolved | Resolution | won't fix | ||||||
Platform | PC | OS | Linux | ||||||
Product Version | MEGA-CC 11 (command line version) | ||||||||
Target Version | Fixed in Version | ||||||||
Summary | 0001158: error in TClustalThread.SetSeqArray: Out of memory | ||||||||
Description | I am trying to run multiple sequence alignments on very large data sets containing 100,000 to 1,000,000 sequences, using the megacc program on a high-performance computing cluster associated with my school. The cluster runs Linux. My alignments are aborted about 5 minutes after starting, with an error message that reads: "error in TClustalThread.SetSeqArray: Out of memory". The issue may be internal to the module itself rather than a problem with memory usage on the cluster (see image "cpu memory usage.jpg"). | ||||||||
Steps To Reproduce | For my data set (P02c.fas, too large to upload with this error report; it contains ~1.6 million sequences), I am using ClustalW alignment (clustal_align_NC.mao), with the output set to P02_align. I run this command using the following script: megacc -a clustal_align_NC.mao -d P02c.fas -o P02_align. After I submit this job to the high-throughput scheduler, it sits in qw (waiting) for a minute or two before the run begins. The run lasts about 5 minutes before the error occurs. This worked in the past with a different data set, but that one contained only ~5,000 sequences. I don't know whether this is a problem with the size of the input or a software bug, but it definitely isn't a problem with the HPCC. | ||||||||
Additional Information | See "error_report.txt" for the error message I received when running this script. The input file is ~527 MB. | ||||||||
Tags | No tags attached. | ||||||||
First Name | Tom | ||||||||
Last Name | Caldwell | ||||||||
Email | tcaldwel@uci.edu | ||||||||
Attached Files | mega bug report.zip (198,878 bytes) 2019-03-19 18:26 | ||||||||
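The report gives two different figures for the input size (100,000 to 1,000,000 sequences in the description, ~1.6 million for P02c.fas), so it may help to confirm the record count before submitting a job. The following is a minimal sketch, assuming a standard FASTA file in which every sequence starts with a `>` header line; the helper name `count_fasta_records` is hypothetical and is not part of megacc.

```python
import io

def count_fasta_records(handle):
    """Count sequences by counting header lines, which start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))

# Tiny in-memory stand-in for a real file such as P02c.fas.
sample = io.StringIO(">seq1\nACGT\nACGT\n>seq2\nACGA\n>seq3\nTTGA\n")
n_records = count_fasta_records(sample)
print(n_records)
```

For a file on disk, the same function can be called as `count_fasta_records(open("P02c.fas"))`; it reads line by line, so a ~527 MB input does not need to fit in memory.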
Notes | |
(0004211) gstecher (administrator) 2019-03-21 13:11 |
Hi Tom,

I am writing in response to the bug report you recently submitted regarding the megacc software. The ClustalW implementation in megacc is not an appropriate tool for aligning such a large data set: it was written over 20 years ago, before data sets of this size were available. I am not sure aligning that many sequences makes sense (see https://www.drive5.com/muscle/manual/bigalignments.html). If you need software that might work with that data set, you might try MAFFT (https://mafft.cbrc.jp/alignment/software/).

--
Best regards,
Glen Stecher
Institute for Genomics and Evolutionary Medicine
igem.temple.edu
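To put rough numbers on why a ClustalW-style method struggles at this scale: progressive aligners of this kind build a guide tree from a distance for every pair of sequences, so memory for that step grows with the square of the sequence count. The sketch below is a back-of-the-envelope estimate only. It assumes one 8-byte value per unordered pair, which is an assumption rather than a description of megacc's internals, and it does not claim that this is the allocation that actually failed in TClustalThread.SetSeqArray. The function name `pairwise_matrix_bytes` is hypothetical.

```python
def pairwise_matrix_bytes(n_sequences, bytes_per_entry=8):
    """Bytes needed to hold one distance per unordered pair of sequences.

    bytes_per_entry=8 assumes double-precision values (an assumption).
    """
    n_pairs = n_sequences * (n_sequences - 1) // 2
    return n_pairs * bytes_per_entry

# Sequence counts taken from the report: the run that worked (~5,000),
# the lower bound mentioned (100,000), and P02c.fas (~1.6 million).
for n in (5_000, 100_000, 1_600_000):
    gb = pairwise_matrix_bytes(n) / 1e9
    print(f"{n:>9,} sequences -> {gb:,.1f} GB")
```

Under these assumptions, the ~5,000-sequence data set needs on the order of 0.1 GB for pairwise distances, while ~1.6 million sequences would need on the order of 10 TB, which is consistent with the earlier run succeeding and this one running out of memory.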
Issue History | |||
Date Modified | Username | Field | Change |
2019-03-19 18:26 | guest | New Issue | |
2019-03-19 18:26 | guest | File Added: mega bug report.zip | |
2019-03-21 13:11 | gstecher | Note Added: 0004211 | |
2019-03-21 13:11 | gstecher | Status | new => resolved |
2019-03-21 13:11 | gstecher | Resolution | open => won't fix |
2019-03-21 13:11 | gstecher | Assigned To | => gstecher |