Instructions
Version
You can choose which version of the software you want to use from the drop-down list.
Input data type
Select the format of your input file. For now, PathogenFinder2 only accepts input files in
FASTA format.
The FASTA file must contain the genomic data of a single bacterial isolate. To process more than one input, consider running the software locally from the GitHub repository. The files must not be compressed.
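The input requirements can be checked locally before uploading. The following is a minimal sketch in Python, not the server's own validation code: the function names are hypothetical, the compression check only detects gzip, and only the first non-empty line is inspected. It also tests the file name against the permitted characters listed just below.

```python
import re

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip file
# Permitted file name characters: a-z, A-Z, 0-9, "_", "-", "."
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_allowed_file_name(name):
    """Return True if the name uses only the permitted ASCII characters."""
    return ALLOWED_NAME.fullmatch(name) is not None


def looks_like_plain_fasta(path):
    """Return True if the file is not gzip-compressed and begins with a FASTA header.

    Hypothetical helper: it does not validate the whole file and cannot
    tell whether the file holds one isolate or several.
    """
    with open(path, "rb") as handle:
        if handle.read(2) == GZIP_MAGIC:
            return False  # compressed files are not accepted
        handle.seek(0)
        for line in handle:
            if line.strip():
                # FASTA records start with a ">" header line
                return line.startswith(b">")
    return False  # empty file
```

For example, is_allowed_file_name("isolate_01.fasta") returns True, while a name containing a space or an accented letter is rejected.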
To avoid problems caused by file names, only a limited selection of
ASCII characters is allowed in them: a-z, A-Z, 0-9, "_" (underscore),
"-" (hyphen) and "." (full stop).
Upload and submit job
After attaching the file, click on the 'Submit job' button to submit your
job. A waiting page will be displayed and updated continuously until the job
finishes and the server output page appears in your browser. You also
have the option to enter your email address and be notified as soon as the
results are ready. The data is available for one week from the moment the
results are created.
Output
The PathogenFinder2 prediction comes from an ensemble of four neural networks. Therefore,
four different predictions are reported, each one a number between 0 (without
pathogenic capacity) and 1 (with pathogenic capacity). This number does not reflect the degree of
pathogenic capacity per se, but rather how confident the prediction is (the closer to 0.5, the
more unsure the neural network is about the pathogenic capacity). It is valid to
use the mean of the four values, but it is advisable to take the separate predictions into account
when making decisions about the nature of the bacteria.
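One way to look at the mean together with the separate predictions is sketched below in Python. This is only an illustration: the function name and the returned fields are hypothetical, and treating 0.5 as the dividing point between the two classes is an assumption based on the description above.

```python
from statistics import mean


def summarize_predictions(predictions, threshold=0.5):
    """Summarize the four ensemble predictions (each between 0 and 1).

    The threshold of 0.5 is an assumption, not a documented cut-off.
    """
    if len(predictions) != 4:
        raise ValueError("expected exactly four predictions")
    # One vote per network: True if it leans towards pathogenic capacity
    votes = [p > threshold for p in predictions]
    return {
        "mean": mean(predictions),
        "min": min(predictions),
        "max": max(predictions),
        "pathogenic_votes": sum(votes),
        # True only when all four networks fall on the same side of the threshold
        "unanimous": all(votes) or not any(votes),
    }
```

A mean close to 0.5, or a result where "unanimous" is False, suggests the individual values deserve a closer look before drawing conclusions.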
By default, PathogenFinder2 reports a results file ("results.tsv"), as well as the embeddings file
and the attention scores file ("embeddings.npz" and "attentions.npz", respectively). Intermediate files,
such as the predicted proteins and/or the embeddings file, are also reported if they were produced during
the PathogenFinder2 run.
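The downloaded files can be inspected with a few lines of Python. The sketch below uses only the standard library and makes no assumption about the columns of "results.tsv" or the array names stored in the ".npz" files, since neither is described here; the function names are hypothetical. It relies on the fact that a ".npz" file is a zip archive containing one ".npy" file per array.

```python
import csv
import zipfile


def read_results_table(path):
    """Return the rows of a tab-separated file as lists of strings."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))


def list_npz_arrays(path):
    """Return the names of the arrays stored in a .npz archive."""
    with zipfile.ZipFile(path) as archive:
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]
```

Calling list_npz_arrays("embeddings.npz") shows which arrays are present. If NumPy is installed, numpy.load("embeddings.npz") then gives access to each array by name.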
If the option for mapping the top proteins highlighted by the attention scores to UniRef50 is selected,
a table with the results will also be displayed (unavailable at the moment) and made available for download
("meh.tsv").
If the option for mapping the embeddings to the Pathogenic Bacterial Landscape is selected,
the image and the closest neighbours will be available for download...