I'm trying to align the output I got previously against the swissprot database, and I need the output in tabular form with the fields qseqid, sacc, qlen, slen, length, nident, pident, evalue and stitle. I also want to keep only hits with an e-value below 1e-10. Here is my code:

#!/usr/bin/env bash
blastp -query Trinity.fasta.transdecoder.pep \
-db swissprot \
-outfmt 6 qseqid sacc qlen slen length nident pident evalue stitle -evalue 1e-10 1>Predict.txt \
2>wrongPredicted.err
However, these are the first several lines of output I got in the txt file:

TRINITY_DN0_c0_g1::TRINITY_DN0_c0_g1_i1::g.132::m.132 Q964E0 400 376 376 364 96.81 0.0 RecName: Full=Actin, cytoplasmic; Contains: RecName: Full=Actin, cytoplasmic, intermediate form; Flags: Precursor
TRINITY_DN0_c0_g1::TRINITY_DN0_c0_g1_i1::g.132::m.132 Q964D9 400 376 376 364 96.81 0.0 RecName: Full=Actin, cytoplasmic; Contains: RecName: Full=Actin, cytoplasmic, intermediate form; Flags: Precursor
TRINITY_DN0_c0_g1::TRINITY_DN0_c0_g1_i1::g.132::m.132 P53472 400 376 376 364 96.81 0.0 RecName: Full=Actin, cytoskeletal 1A; AltName: Full=Actin, cytoskeletal IA; Flags: Precursor
TRINITY_DN0_c0_g1::TRINITY_DN0_c0_g1_i1::g.132::m.132 P92179 400 376 376 364 96.81 0.0 RecName: Full=Actin, cytoplasmic; Contains: RecName: Full=Actin, cytoplasmic, intermediate form; Flags: Precursor
TRINITY_DN0_c0_g1::TRINITY_DN0_c0_g1_i1::g.132::m.132 Q964E1 400 376 376 363 96.54 0.0 RecName: Full=Actin, cytoplasmic; Contains: RecName: Full=Actin, cytoplasmic, intermediate form; Flags: Precursor
TRINITY_DN0_c0_g1::TRINITY_DN0_c0_g1_i1::g.132::m.132 Q964E2 400 376 376 364 96.81 0.0 RecName: Full=Actin, cytoplasmic; Contains: RecName: Full=Actin, cytoplasmic, intermediate form; Flags: Precursor
TRINITY_DN0_c0_g1::TRINITY_DN0_c0_g1_i1::g.132::m.132 P69004 400 376 376 364 96.81 0.0 RecName: Full=Actin-15B; Flags: Precursor
The problem is that every e-value is 0.0 for some reason. What I want is something like this:

TRINITY_DN8_c0_g1_i1 Q5ZKK7 283 788 64 53 82.81 1e-30 RecName: Full=General transcription and DNA repair factor IIH helicase subunit XPB; Short=TFIIH subunit XPB; AltName: Full=DNA excision repair protein ERCC-3
TRINITY_DN8_c0_g1_i1 Q7ZVV1 283 782 64 53 82.81 3e-30 RecName: Full=General transcription and DNA repair factor IIH helicase subunit XPB; Short=TFIIH subunit XPB; AltName: Full=DNA excision repair protein ERCC-3
TRINITY_DN8_c0_g1_i1 Q1RMT1 283 782 64 52 81.25 6e-30 RecName: Full=General transcription and DNA repair factor IIH helicase subunit XPB; Short=TFIIH subunit XPB; AltName: Full=DNA excision repair protein ERCC-3
TRINITY_DN8_c0_g1_i1 Q5RA62 283 782 64 52 81.25 7e-30 RecName: Full=General transcription and DNA repair factor IIH helicase subunit XPB; Short=TFIIH subunit XPB; AltName: Full=DNA excision repair protein ERCC-3
TRINITY_DN8_c0_g1_i1 Q60HG1 283 782 64 52 81.25 7e-30 RecName: Full=General transcription and DNA repair factor IIH helicase subunit XPB; Short=TFIIH subunit XPB; AltName: Full=DNA excision repair protein ERCC-3
The 8th column should contain a valid e-value.

User Xdite
by
8.5k points

1 Answer

3 votes

Final answer:

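The e-values in your output are valid. BLAST reports 0.0 when an e-value is too small to display, and -evalue 1e-10 only sets the largest e-value a hit may have, so 0.0 passes the cutoff. The one thing to fix in the command as posted is to quote the -outfmt string.

Step-by-step explanation:

The e-value is the number of alignments with an equal or better score that you would expect to find by chance in a database of this size. The closer the e-value is to zero, the more significant the alignment. It is a property of each hit, not a threshold.

Your actin hits align over 376 residues at about 97% identity. A match that long and that similar has an e-value far below the smallest number BLAST prints, so it is shown as 0.0. The hits in your second example show values such as 1e-30 only because those alignments are 64 residues long and score much lower. Both sets of values are correct.

The -evalue 1e-10 option is an upper limit: BLAST keeps hits whose e-value is at or below 1e-10 and drops the rest. It does not change how the surviving e-values are printed, so a column full of 0.0 for very strong hits is expected.

The format specifiers after -outfmt 6 must be passed as a single quoted argument, otherwise the shell hands qseqid, sacc and the rest to blastp as separate arguments:

blastp -query Trinity.fasta.transdecoder.pep -db swissprot -outfmt "6 qseqid sacc qlen slen length nident pident evalue stitle" -evalue 1e-10 1>Predict.txt 2>wrongPredicted.err

With the quotes in place, Predict.txt contains the nine requested columns separated by tabs, with the e-value in column 8.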
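
One way to convince yourself that 0.0 in column 8 is an ordinary numeric e-value that satisfies a 1e-10 cutoff is to filter a few rows with awk. This is only a sketch: the three sample rows are made up for illustration (the third accession and its title are hypothetical, and the titles are shortened), and it assumes tab-separated columns with the e-value in column 8, as in the layout requested in the question.

```shell
#!/usr/bin/env bash
# Hypothetical sample rows in the 9-column layout from the question
# (tab-separated, e-value in column 8). printf reuses the format per row.
sample=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  q1 Q964E0 400 376 376 364 96.81 0.0   'Actin, cytoplasmic' \
  q2 Q5ZKK7 283 788 64  53  82.81 1e-30 'TFIIH subunit XPB' \
  q3 HYPO01 120 300 40  15  37.50 2e-05 'Weak hit (made up)')

# Keep rows whose e-value is numerically <= the cutoff.
# Adding 0 forces awk to compare the values as numbers, not strings.
cutoff=1e-10
kept=$(printf '%s\n' "$sample" | awk -F'\t' -v max="$cutoff" '($8 + 0) <= (max + 0)')

# Count the surviving rows, and how many of them report exactly zero.
kept_count=$(printf '%s\n' "$kept" | awk 'END { print NR }')
zero_hits=$(printf '%s\n' "$kept" | awk -F'\t' '($8 + 0) == 0 { n++ } END { print n + 0 }')

printf '%s\n' "$kept"
echo "rows kept: $kept_count, rows with e-value 0.0: $zero_hits"
```

To run the same check on real output, replace the sample with the contents of Predict.txt, for example by piping the file into the awk filter.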

User Rob Buhler
by
8.0k points