convert

What

The purpose of the convert command is to convert AncientMetagenomeDir tables to input sheets for different pipelines and software. It will also download citation information for you.

It is a partial command-line equivalent to the viewer graphical user interface.

When

You typically will use convert if you are a command-line power user, and have already performed table filtering and sample selection programmatically (e.g. in R or python), rather than using the AMDirT viewer graphical-user-interface.

How

The following description assumes that you have already have a AncientMetagenomeDir samples table that has been filtered to the samples you wish to run through a given pipeline(s).

⚠️ The header, and present columns etc. should match exactly that on AncientMetagenomeDir, only rows may be removed.

To generate sample sheets of the support pipelines: open a terminal (activating software environments if necessary, see AMDirT Installation Page), and there you can run the following command (an example being a filtered ancientmetagenome-hostassociated_samples_warinnersamplesonly.tsv table):

mkdir samplesheets/
AMDirT convert ancientmetagenome-hostassociated_samples_warinnersamplesonly.tsv ancientmetagenome-hostassociated -o samplesheets/ --<tool>

where you provide the filtered TSV, which AncientMetagenomeDir samples table the filtered table is derived from,then the output directory where the samplesheets should be saved into, and which tool to generate a samplesheet from.

See Output for descriptions of all output files.

⚠️ When using a pipeline input samplesheet, you should always double check the sheet is correctly configured. We cannot guarantee accuracy between metadata and sequencing files.

Once you have validated it, you can directly supply it to the appropriate pipeline as follows (using nf-core/eager as an example):

nextflow run nf-core/eager <...> --input ancientMetagenomeDir_eager_input.csv

The citations BibTex file contains all the citation information of your selected samples in a widely used format called BibTex. You can import this file into most referencing or bibliography managing tools (Zotero, Paperpile etc.).

⚠️ Occasionally the cross-ref databases do not have citation information for certain publications. You will receive a warning if so, with the relevant DOIs for you to manually get this information.

Output

⚠️ We highly recommend generating and reviewing AncientMetagenomeDir_filtered_libraries.tsv before downloading or running any pipelines to ensure you have in the download scripts and/or pipeline input sheets only the actual library types you wish to use (e.g. you may only want paired-end data, or non-UDG treated data).

⚠️ To use a pipeline input samplesheet, you should always double check the sheet is correctly configured. We cannot guarantee accuracy between metadata and sequencing files.

All possible output is as follows:

  • <outdir>: where all the pipeline samplesheets are placed (by default .)

  • AncientMetagenomeDir_bibliography.bib:

    • A BibTex format citation information file with all references (where available) present in the filtered sample table.

  • AncientMetagenomeDir_filtered_libraries.tsv:

    • The associated AncientMetagenomeDir curated metadata for all libraries of the samples in the input table.

  • AncientMetagenomeDir_curl_download_script.sh:

    • A bash script containing curl commands for all libraries in the input samples list.

  • AncientMetagenomeDir_aspera_download_script.sh:

    • A bash script containing Aspera commands for all libraries in the input samples list. See How Tos for Aspera configuration information.

  • AncientMetagenomeDir_nf_core_fetchngs_input_table.tsv:

    • An input sheet containing ERS/SRS accession numbers in a format compatible with the nf-core/fetchngs input samplesheet.

  • AncientMetagenomeDir_nf_core_eager_input_table.tsv:

    • An input sheet with metadata in a format compatible with the nf-core/eager input samplesheet.

    • Contained paths are relative to the directory output when using the curl and aspera download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).

  • AncientMetagenomeDir_nf_core_taxprofiler_input_table.csv:

    • An input sheet with metadata in a format compatible with the nf-core/taxprofiler input samplesheet.

    • Contained paths are relative to the directory output when using the curl and aspera download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).

  • AncientMetagenomeDir_aMeta_input_table.tsv:

    • An input sheet with metadata in a format compatible with the aMeta input samplesheet.

    • Contained paths are relative to the directory output when using the curl and aspera download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).

  • AncientMetagenomeDir_nf_core_mag_input_{single,paired}_table.csv:

    • An input sheet with metadata in a format compatible with the nf-core/mag input samplesheet.

    • Contained paths are relative to the directory output when using the curl and aspera download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).

    • nf-core/mag does not support paired- and single-end data in the same run, therefore two sheets will be generated if your selected samples contain both types of libraries.