Add kMetaShot and DM for kMetaShot #7421
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| categories: | ||
| - Data Managers | ||
| - Metagenomics | ||
| homepage_url: https://github.com/gdefazio/kMetaShot | ||
| description: Data manager for kMetaShot reference data | ||
| long_description: Data manager for kMetaShot reference data | ||
| name: kmetashot_build_database | ||
| owner: iuc | ||
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/data_managers/data_manager_kmetashot | ||
| type: unrestricted |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| <tool id="kmetashot_build_database" name="kMetaShot" tool_type="manage_data" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
| <description>database builder</description> | ||
| <macros> | ||
| <token name="@TOOL_VERSION@">2.0</token> | ||
| <token name="@VERSION_SUFFIX@">0</token> | ||
| <token name="@PROFILE@">25.0</token> | ||
| </macros> | ||
| <requirements> | ||
| <requirement type="package" version="@TOOL_VERSION@">kmetashot</requirement> | ||
| </requirements> | ||
| <command detect_errors="exit_code"><![CDATA[ | ||
| mkdir -p '$out_file.extra_files_path' && | ||
| #if $test != "true" | ||
| wget 'https://zenodo.org/records/17375120/files/kMetaShot_bacteria_archaea_2025-05-22.h5' && | ||
| mv 'kMetaShot_bacteria_archaea_2025-05-22.h5' '$out_file.extra_files_path' && | ||
| #else | ||
| touch '$out_file.extra_files_path'/kMetaShot_bacteria_archaea_2025-05-22.h5 && | ||
| #end if | ||
| cp '$dmjson' '$out_file' | ||
| ]]></command> | ||
| <configfiles> | ||
| <configfile name="dmjson"><![CDATA[ | ||
| { | ||
| "data_tables":{ | ||
| "kmetashot":[ | ||
| { | ||
| "path":"kMetaShot_bacteria_archaea_2025-05-22.h5", | ||
| "dbkey":"kMetaShot-25-05-22", | ||
| "name":"kMetaShot reference data 2025-05-22", | ||
| "version":"2", | ||
| "value":"25-05-22" | ||
| } | ||
| ] | ||
| } | ||
| }]]> | ||
| </configfile> | ||
| </configfiles> | ||
| <inputs> | ||
| <param name="test" type="hidden"/> | ||
| </inputs> | ||
| <outputs> | ||
| <data name="out_file" format="data_manager_json" /> | ||
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="1"> | ||
| <param name="test" value="true"/> | ||
| <output name="out_file"> | ||
| <assert_contents> | ||
| <has_text text='"value":"25-05-22"'/> | ||
| <has_text text='"name":"kMetaShot reference data 2025-05-22"'/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| </tests> | ||
| <help><![CDATA[ | ||
| Download the kMetaShot reference data and register it in the kmetashot data table. | ||
| ]]></help> | ||
| <citations> | ||
| <citation type="doi">10.1093/bib/bbae680</citation> | ||
| </citations> | ||
| </tool> | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| <data_managers> | ||
| <data_manager tool_file="data_manager/kmetashot_datamanager.xml" id="kmetashot_build_database"> | ||
| <data_table name="kmetashot"> | ||
| <output> | ||
| <column name="value"/> | ||
| <column name="dbkey"/> | ||
| <column name="name"/> | ||
| <column name="version"/> | ||
| <column name="path" output_ref="out_file"> | ||
| <move type="file"> | ||
| <source>${path}</source> | ||
| <target base="${GALAXY_DATA_MANAGER_DATA_PATH}">kmetashot/${value}/${path}</target> | ||
| </move> | ||
| <value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/kmetashot/${value}/${path}</value_translation> | ||
| <value_translation type="function">abspath</value_translation> | ||
| </column> | ||
| </output> | ||
| </data_table> | ||
| </data_manager> | ||
| </data_managers> | ||
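
To make the path handling in this config concrete, here is a small Python sketch of how the `value_translation` template would expand for the entry written by the data manager. It only mimics the substitution with `string.Template` and is not Galaxy's own code; the base directory `/galaxy/tool-data` is an assumed placeholder.

```python
import os.path
from string import Template

# Template copied from the <value_translation> element of the data manager config.
TRANSLATION = Template("${GALAXY_DATA_MANAGER_DATA_PATH}/kmetashot/${value}/${path}")


def translate_path(entry, data_path):
    """Expand the template for one data table entry and make the result absolute."""
    expanded = TRANSLATION.substitute(
        GALAXY_DATA_MANAGER_DATA_PATH=data_path,
        value=entry["value"],
        path=entry["path"],
    )
    return os.path.abspath(expanded)


# Values taken from the data manager JSON in this PR.
entry = {
    "value": "25-05-22",
    "path": "kMetaShot_bacteria_archaea_2025-05-22.h5",
}

# "/galaxy/tool-data" is an assumed base directory, used only for illustration.
print(translate_path(entry, "/galaxy/tool-data"))
```

The `<move>` target uses the same `kmetashot/${value}/${path}` layout, so the file and the path stored in the data table end up pointing at the same location.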
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
|
|
||
| 25-05-22 kMetaShot-25-05-22 kMetaShot reference data 2025-05-22 2 /tmp/tmpf_hplx2a/galaxy-dev/tool-data/kmetashot/2/kMetaShot_bacteria_archaea_2025-05-22.h5 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| #This is a sample file distributed with Galaxy that enables tools | ||
| #to use the kMetaShot database. | ||
| #You will need to create these data files using the following command | ||
|
|
||
| #wget [download_url_of_selected_version] | ||
|
|
||
| #The <version> column indicates the version of the kMetaShot reference data that was downloaded | ||
|
|
||
| #25-05-22 kMetaShot-25-05-22 kMetaShot reference data 2025-05-22 2 /mnt/galaxyIndices/kMetaShot_database/kMetaShot_bacteria_archaea_2025-05-22.h5 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| <tables> | ||
| <table name="kmetashot" comment_char="#" allow_duplicate_entries="False"> | ||
| <columns>value, dbkey, name, version, path</columns> | ||
| <file path="tool-data/kmetashot.loc" /> | ||
| </table> | ||
| </tables> |
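
As a rough illustration of how the five columns declared here map onto a line of `kmetashot.loc`, the following Python sketch parses a tab-separated entry into a dictionary. The function name `parse_loc` is hypothetical and this is not how Galaxy itself loads data tables; the sample row reuses the commented example from the sample `.loc` file.

```python
# Column order as declared in the <columns> element of the table config.
COLUMNS = ["value", "dbkey", "name", "version", "path"]


def parse_loc(text, comment_char="#"):
    """Turn the non-comment lines of a .loc file into a list of dicts."""
    entries = []
    for line in text.splitlines():
        # Skip blank lines and lines starting with the comment character.
        if not line.strip() or line.startswith(comment_char):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        entries.append(dict(zip(COLUMNS, fields)))
    return entries


SAMPLE = "\n".join([
    "#value\tdbkey\tname\tversion\tpath",
    "\t".join([
        "25-05-22",
        "kMetaShot-25-05-22",
        "kMetaShot reference data 2025-05-22",
        "2",
        "/mnt/galaxyIndices/kMetaShot_database/kMetaShot_bacteria_archaea_2025-05-22.h5",
    ]),
])

for entry in parse_loc(SAMPLE):
    print(entry["value"], "->", entry["path"])
```

The `path` column is what the tool reads through `$reference.fields.path`.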
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| <tables> | ||
| <!-- Location of kmetashot indexes for testing --> | ||
| <table name="kmetashot" comment_char="#" allow_duplicate_entries="False"> | ||
| <columns>value, dbkey, name, version, path</columns> | ||
| <file path="${__HERE__}/test-data/kmetashot.loc" /> | ||
| </table> | ||
| </tables> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| name: kmetashot | ||
| owner: iuc | ||
| description: an alignment-free taxonomic classifier based on k-mer/minimizer counting | ||
| long_description: | | ||
| kMetaShot, a bioinformatic approach relying on k-mer/minimizer profiling | ||
| from the reference prokaryotic genomes, in order to build a concise | ||
| representation of genomic diversity and perform MAG taxonomic | ||
| classification up to the strain level | ||
| homepage_url: https://github.com/gdefazio/kMetaShot | ||
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/kmetashot | ||
| categories: | ||
| - Metagenomics | ||
| type: unrestricted |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,96 @@ | ||
| <tool id="kmetashot" name="kMetaShot" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
| <description>an alignment-free taxonomic classifier based on k-mer/minimizer counting</description> | ||
| <macros> | ||
| <token name="@TOOL_VERSION@">2.0</token> | ||
| <token name="@VERSION_SUFFIX@">0</token> | ||
| <token name="@PROFILE@">25.0</token> | ||
| </macros> | ||
| <requirements> | ||
| <requirement type="package" version="@TOOL_VERSION@">kmetashot</requirement> | ||
| </requirements> | ||
| <command detect_errors="exit_code"> | ||
| <![CDATA[ | ||
| #import re | ||
|
|
||
| mkdir 'output' 'bins' && | ||
|
|
||
| #for $file in $bins_dir: | ||
| #set $safe_name = re.sub('[^\w\-\.]', '_', str($file.element_identifier)) | ||
| ln -s '$file' 'bins/$safe_name' && | ||
| #end for | ||
|
|
||
| kMetaShot_classifier_NV.py | ||
| -b 'bins' | ||
| -o 'output' | ||
| -r '$reference.fields.path' | ||
| -p "\${GALAXY_SLOTS:-1}" | ||
| -a ${ass2ref} | ||
|
|
||
| ]]> | ||
| </command> | ||
| <inputs> | ||
| <param argument="--bins_dir" type="data" multiple="true" format="fasta,fasta.gz" label="Bin(s)/MAG(s) fasta file"/> | ||
| <param argument="--reference" type="select" label="Select reference"> | ||
| <options from_data_table="kmetashot"> | ||
| <filter type="sort_by" column="2"/> | ||
| </options> | ||
| <validator type="no_options" message="No reference data for kMetaShot is installed. Please contact the Galaxy administrators to request one be installed."/> | ||
| </param> | ||
| <param argument="--ass2ref" type="float" min="0.0" value="0.0" max="1.0" label="Set ass2ref parameter" help="Ass2ref is a ratio between the number of MAG minimizers and the reference minimizers related to the assigned strain"/> | ||
|
||
| </inputs> | ||
| <outputs> | ||
| <collection name="result" type="list" label="${tool.name} on ${on_string}: RESULTS"> | ||
| <discover_datasets pattern="(?P&lt;designation&gt;.*)\.csv" format="tabular" directory="output"/> | ||
| </collection> | ||
| </outputs> | ||
| <tests> | ||
| <!-- This tool needs its reference data to run, so it cannot be tested end to end. These tests only check that the command line is built as expected and that the tool starts. --> | ||
| <test expect_exit_code="1" expect_failure="true"> | ||
| <param name="bins_dir" value="all_contig.fasta.gz" ftype="fasta.gz"/> | ||
| <param name="ass2ref" value="0.2"/> | ||
| <assert_command> | ||
| <has_text text="kMetaShot_classifier_NV.py -b bins"/> | ||
| <has_text text="-o output"/> | ||
| <has_text text="-a 0.2"/> | ||
| </assert_command> | ||
| </test> | ||
| <test expect_exit_code="1" expect_failure="true"> | ||
| <param name="bins_dir" value="all_contig.fasta.gz" ftype="fasta.gz"/> | ||
| <param name="ass2ref" value="0.3"/> | ||
| <assert_command> | ||
| <has_text text="kMetaShot_classifier_NV.py -b bins"/> | ||
| <has_text text="-o output"/> | ||
| <has_text text="-a 0.3"/> | ||
| </assert_command> | ||
| </test> | ||
| </tests> | ||
| <help> | ||
| <![CDATA[ | ||
|
|
||
| kMetaShot is a taxonomic classifier for bins/MAGs that uses an alignment-free algorithm based on k-mer/minimizer counting. | ||
| The sequences from the input files are converted to bits, with each base represented by 2 bits: | ||
|
||
|
|
||
| A = 00 | ||
| C = 01 | ||
| G = 10 | ||
| T = 11 | ||
|
|
||
| The reference data was built with the same encoding, so the encoded input can be matched against it to classify the bin(s)/MAG(s). | ||
|
|
||
| **Input** | ||
|
|
||
| One or more FASTA files, plain or gzip-compressed (allowed extensions: .fa, .fasta, .fna, .fa.gz, .fasta.gz, .fna.gz). | ||
|
|
||
| **Reference** | ||
|
|
||
| The reference data needed by this tool is installed by the kMetaShot data manager. If no reference is listed, ask your Galaxy administrators to install one. | ||
|
|
||
| **Output** | ||
|
|
||
| A collection of CSV files containing the classification of the input bin(s)/MAG(s). | ||
|
||
|
|
||
| ]]> | ||
| </help> | ||
| <citations> | ||
| <citation type="doi">10.1093/bib/bbae680</citation> | ||
| </citations> | ||
| </tool> | ||
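
The 2-bit base encoding described in the help text can be sketched in Python as follows. This only illustrates the A/C/G/T mapping given there; how kMetaShot actually packs k-mers internally (bit order, handling of ambiguous bases such as N) is not stated in this PR, so the function names and the most-significant-first packing are assumptions.

```python
# Mapping taken from the help text: A = 00, C = 01, G = 10, T = 11.
ENCODING = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index i decodes the 2-bit value i


def encode_kmer(kmer):
    """Pack a k-mer into an integer, 2 bits per base, first base most significant."""
    value = 0
    for base in kmer.upper():
        # Raises KeyError for anything other than A, C, G or T.
        value = (value << 2) | ENCODING[base]
    return value


def decode_kmer(value, k):
    """Unpack an integer produced by encode_kmer back into a k-mer of length k."""
    chars = []
    for _ in range(k):
        chars.append(BASES[value & 0b11])
        value >>= 2
    return "".join(reversed(chars))


encoded = encode_kmer("ACGT")
print(format(encoded, "08b"))   # 00011011
print(decode_kmer(encoded, 4))  # ACGT
```

With this layout a k-mer of length k fits in 2k bits, which is what makes integer comparison against a precomputed reference cheap.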
Since there seem to be stable releases:
https://github.com/gdefazio/kMetaShot?tab=readme-ov-file#kmetashot-reference
I think it would be better if the admins could choose between those; that's better for reproducibility.
Done in 0acaf08 and 4d0b6eb: I made the DM dynamic, so each new version can be added there and downloaded.
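
For illustration only, one way a version-selectable data manager could derive its JSON entry from a chosen release is sketched below in Python. This is not the code from the commits mentioned above. The `RELEASES` mapping and the function `build_dm_json` are hypothetical, the only URL listed is the one used in this PR, and the `"version": "2"` field is copied from the static JSON in the diff.

```python
import json

# Release date -> download URL. Only this entry appears in the PR; further
# releases would be added here (hypothetical structure).
RELEASES = {
    "2025-05-22": "https://zenodo.org/records/17375120/files/kMetaShot_bacteria_archaea_2025-05-22.h5",
}


def build_dm_json(release):
    """Build the data manager JSON for one release of the reference data."""
    url = RELEASES[release]            # KeyError if the release is unknown
    filename = url.rsplit("/", 1)[-1]  # file name as stored in extra_files_path
    short = release[2:]                # "2025-05-22" -> "25-05-22"
    return {
        "data_tables": {
            "kmetashot": [
                {
                    "path": filename,
                    "dbkey": f"kMetaShot-{short}",
                    "name": f"kMetaShot reference data {release}",
                    "version": "2",    # copied from the static JSON in the diff
                    "value": short,
                }
            ]
        }
    }


print(json.dumps(build_dm_json("2025-05-22"), indent=2))
```

Keeping one entry per release means admins can install several versions side by side and users can pin the one they need.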