I'm new to HH-suite and I'm trying to reproduce, with my local installation, the results I get from the web versions of HHblits and HHpred. I suspect the web server does some "hidden" preprocessing when it prepares the input files. I'm currently running HH-suite 3.3.0 with BLAST 2.2.26, PSIPRED 4.02, and the latest versions of pdb70 and UniRef30. My workflow is:

1) Start from a fasta sequence (file: isoform1.txt)

2) Run hhblits:

hhblits -cpu 4 -i isoform1.txt -d Uniref/UniRef30_2020_03 -o out_hhblits.hhr -oa3m out_hhblits.a3m -e 1e-3 -n 3 -p 20 -Z 250 -z 1 -b 1 -B 250

The results from the web server and from my local run already differ at this step, which makes me think I'm missing something, probably in how the input file for hhblits is generated.

3) Run addss.pl:

addss.pl out_hhblits.a3m

4) Run hhmake:

hhmake -i out_hhblits.a3m

5) Run hhsearch:

hhsearch -cpu 4 -i out_hhblits.a3m -d ../../pdb70/pdb70 -o hras_hhpred.hhr -oa3m hras_hhpred.a3m -p 20 -Z 250 -loc -z 1 -b 1 -B 250 -ssm 2 -sc 1 -seq 1 -dbstrlen 10000 -norealign -maxres 32000 -contxt /usr/local/src/hh-suite-master/data/context_data.crf

Local outputs:

hhblits:

Query         sp|P01112|RASH_HUMAN GTPase HRas OS=Homo sapiens OX=9606 GN=HRAS PE=1 SV=1
Match_columns 189
No_of_seqs    1380 out of 7256
Neff          11.8887
Searched_HMMs 28721
Date          Wed Sep 30 10:04:06 2020
Command       hhblits -cpu 4 -i isoform1.txt -d ../../UniRef30_2020_03_hhsuite/UniRef30_2020_03 -o out_hhblits.hhr -oa3m out_hhblits.a3m -e 1e-3 -n 3 -p 20 -Z 250 -z 1 -b 1 -B 250

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 UniRef100_A0A3M0JH82 Uncharact 100.0 2.6E-49 5.9E-55  252.6   0.0  189     1-189     97-285 (285)
  2 UniRef100_A0A061I953 GTPase HR 100.0   6E-49 1.4E-54  250.6   0.0  188     1-188      1-188 (289)
  3 UniRef100_UPI0012625CA5 GTPase 100.0   7E-47 1.6E-52  237.7   0.0  189     1-189    131-319 (319)
  4 UniRef100_A0A023FX11 Uncharact 100.0 1.3E-46 2.9E-52  238.2   0.0  188     1-189     65-252 (272)
  5 UniRef100_A0A022QBX4 Uncharact 100.0 2.7E-45 5.8E-51  238.0   0.0  162     2-164     27-190 (218)
  6 UniRef100_A0A015IFZ5 Rsr1p n=3 100.0   3E-44 6.5E-50  235.7   0.0  166     3-168     21-188 (258)

hhsearch:

Query         sp|P01112|RASH_HUMAN GTPase HRas OS=Homo sapiens OX=9606 GN=HRAS PE=1 SV=1
Match_columns 189
No_of_seqs    672 out of 48006
Neff          14.1061
Searched_HMMs 83244
Date          Thu Oct 1 09:58:32 2020
Command       hhsearch -cpu 4 -i hras_hhblits.a3m -d ../../pdb70/pdb70 -o hras_hhpred.hhr -oa3m hras_hhpred.a3m -p 20 -Z 250 -loc -z 1 -b 1 -B 250 -ssm 2 -sc 1 -seq 1 -dbstrlen 10000 -norealign -maxres 32000 -contxt /usr/local/src/hh-suite-master/data/context_data.crf

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 5XCO_A B-cell lymphoma 6 prote  99.9   2E-22 2.4E-27  121.3  22.7  166     1-166      3-168 (171)
  2 5UQW_A GTPase KRas (E.C.3.6.5.  99.9 2.9E-22 3.4E-27  122.8  23.1  166     1-166     21-186 (189)
  3 2CE2_X GTPASE HRAS; SIGNALING   99.9 2.8E-22 3.3E-27  120.0  22.1  164     1-164      1-164 (166)
  4 6BOF_A GTPase KRas; HYDROLASE,  99.9 4.5E-22 5.4E-27  119.4  22.7  168     2-169      1-168 (168)
  5 5E95_A Mb(NS1), GTPase HRas; H  99.9 4.3E-22 5.2E-27  119.2  22.5  165     1-165      3-167 (168)
  6 4KLZ_A GTP-binding protein Rit  99.9   1E-21 1.2E-26  118.5  23.3  170     1-170      3-173 (173)

Web outputs:

hhblits:

Query         hras_hhblits
Match_columns 189
No_of_seqs    1 out of 1
Neff          1
Searched_HMMs 20000
Date          Tue Sep 29 14:33:42 2020
Command       hhblits -cpu 8 -i ../results/hras_hhblits.in.a3m -d /cluster/toolkit/production/databases/hhblits/UniRef30 -o /ebio/toolkit_rye/user/toolkit/production/jobs/hras_hhblits/results/hras_hhblits.hhr -oa3m /ebio/toolkit_rye/user/toolkit/production/jobs/hras_hhblits/results/hras_hhblits.a3m -e 1e-3 -n 1 -p 20 -Z 250 -z 1 -b 1 -B 250

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 UniRef100_A0A061I953 GTPase HR 100.0  7E-130  2E-135  837.4   0.0  188     1-188      1-188 (289)
  2 UniRef100_A0A3M0JH82 Uncharact 100.0  1E-126  4E-132  816.7   0.0  189     1-189     97-285 (285)
  3 UniRef100_UPI0012625CA5 GTPase 100.0  1E-119  3E-125  777.3   0.0  189     1-189    131-319 (319)
  4 UniRef100_A0A023FX11 Uncharact 100.0  4E-118  1E-123  760.8   0.0  188     1-189     65-252 (272)
  5 UniRef100_A0A3L7HWJ0 H-RAS (Fr 100.0  9E-116  2E-121  757.3   0.0  184     1-184    155-338 (338)
  6 UniRef100_UPI000C29D954 GTPase 100.0  2E-114  4E-120  758.7   0.0  187     1-187      1-187 (394)

hhsearch:

Query         Q_hras
Match_columns 189
No_of_seqs    196 out of 1455
Neff          12.736
Searched_HMMs 52941
Date          Mon Sep 28 11:28:02 2020
Command       hhsearch -cpu 8 -i ../results/full.a3m -d /cluster/toolkit/production/databases/hh-suite/mmcif70/pdb70 -o ../results/hras.hhr -oa3m ../results/hras.a3m -p 20 -Z 250 -loc -z 1 -b 1 -B 250 -ssm 2 -sc 1 -seq 1 -dbstrlen 10000 -norealign -maxres 32000 -contxt /cluster/toolkit/production/bioprogs/tools/hh-suite-build-new/data/context_data.crf

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 2CE2_X GTPASE HRAS; SIGNALING   99.9 2.6E-22 4.8E-27  120.1  22.2  164     1-164      1-164 (166)
  2 6MS9_A GTPase KRas; GTPASE KRA  99.9 4.3E-22 8.2E-27  119.3  23.1  166     1-166      1-166 (169)
  3 5XCO_A B-cell lymphoma 6 prote  99.9 4.3E-22 8.1E-27  120.0  22.4  166     1-166      3-168 (171)
  4 6MQT_H GTPase KRas; GTPASE KRA  99.9 7.1E-22 1.3E-26  118.3  23.0  165     1-165      2-166 (167)
  5 3LVQ_E Arf-GAP with SH3 domain  99.9 2.1E-22   4E-27  141.0  20.7  170     1-174    320-493 (497)
  6 6H47_A GTPase KRas, darpin K19  99.9 2.8E-21 5.3E-26  115.9  22.1  165     1-165      4-168 (169)
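
To compare the local and web reports without reading the tables by eye, a small script along these lines could list which hits the two .hhr files share. This is only a sketch: it assumes the hit-table layout shown in the outputs above (a " No Hit ..." header line, one hit per line, table ended by a blank line), and the names parse_hhr_hits and compare_hits are hypothetical helpers, not part of HH-suite.

```python
import re
import sys


def parse_hhr_hits(text):
    """Return the hit table of an .hhr report as a list of dicts.

    Assumes the layout shown above: a " No Hit ..." header line,
    then one hit per line, terminated by a blank line.
    """
    hits = []
    in_table = False
    for line in text.splitlines():
        if not in_table:
            # The table starts right after the column-header line.
            in_table = line.split()[:2] == ["No", "Hit"]
            continue
        if not line.strip():
            break  # a blank line ends the hit table
        # Detach the trailing "(template length)" so it splits cleanly.
        tokens = re.sub(r"\((\d+)\)\s*$", r" \1", line).split()
        if len(tokens) < 11 or not tokens[0].isdigit():
            continue  # not a hit row
        # Numeric columns are read from the right, because the
        # description in the middle may itself contain spaces.
        hits.append({
            "rank": int(tokens[0]),
            "id": tokens[1],
            "description": " ".join(tokens[2:-9]),
            "prob": float(tokens[-9]),
            "evalue": float(tokens[-8]),
            "score": float(tokens[-6]),
        })
    return hits


def compare_hits(local_hits, web_hits):
    """Split hit IDs into shared, local-only and web-only lists."""
    local_ids = [h["id"] for h in local_hits]
    web_ids = [h["id"] for h in web_hits]
    return {
        "shared": [i for i in local_ids if i in web_ids],
        "local_only": [i for i in local_ids if i not in web_ids],
        "web_only": [i for i in web_ids if i not in local_ids],
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_hhr.py local.hhr web.hhr
    with open(sys.argv[1]) as fh:
        local = parse_hhr_hits(fh.read())
    with open(sys.argv[2]) as fh:
        web = parse_hhr_hits(fh.read())
    for key, ids in compare_hits(local, web).items():
        print(f"{key}: {', '.join(ids) or '-'}")
```

Run on the two hhsearch reports, this would print the template IDs found in both runs and those unique to each, which makes ranking and database differences easier to spot.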


