Frequently Asked Questions

This page answers some common questions about the schema, including how to fill in certain sets of metadata fields.

Under construction

This page is currently under construction, and will be updated with more information in the future.

We are currently gathering potential sections to provide guidance on.

What is MInAS?

TODO

What do we define as ancient DNA?

TODO

What are the limits of the MInAS schema with regard to project stage?

TODO

How to fill in geographic metadata

TODO

  • Lat:lon
  • geo_loc_names
  • Includes country, town/city/village, region etc.

How to describe sample environment and site types

TODO

How to fill in taxonomy ID fields

Description

Within the MIxS and MInAS schemas, there are multiple places where you can specify taxonomy IDs.

These represent different types of taxonomic information for different purposes.

For all ancient DNA sequencing data (e.g. FASTQ or BAM files) that you upload to the INSDC databases (ENA, DDBJ, NCBI), you should use the following guidance:

  • samp_taxon_id
      • This should always be an NCBI taxonomy ID for a metagenome: either the 'special' generic metagenome ID (256318) or a more specific metagenome taxon ID, such as one for a specific environment (e.g. soil, marine, etc.).
      • This is because all ancient DNA is intrinsically metagenomic in nature: a sample contains DNA from other organisms from the burial environment, and not just from the host organism.
  • host_taxid
      • This taxon ID should be the NCBI taxonomy ID for the host organism that the sample was taken from.
      • For example, if your bone sample was taken from a human, you would use the NCBI taxonomy ID for Homo sapiens (9606).
  • genomic_probe_capture_id
      • This taxon ID should be used to describe which genomes are represented within the probe oligos.
      • For example, if your library was 'captured' for Yersinia pestis, you should specify a taxon ID of 632.
      • If you have more than one strain or species, you can either specify multiple taxon IDs (depending on the interface), or a higher-level (e.g. genus) taxon ID.
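As an illustration of the guidance above, the following minimal Python sketch assembles the three taxonomy fields into one record. The function name `build_taxon_fields` and the use of a plain dictionary are assumptions made for this example and are not part of the MIxS or MInAS schemas. How multiple capture IDs are actually entered depends on the submission interface, so they are simply kept as a list here.

```python
from typing import Dict, Iterable, List, Optional, Union

# Generic NCBI 'metagenome' taxon ID, as given in the guidance above.
GENERIC_METAGENOME_TAXID = 256318


def build_taxon_fields(
    samp_taxon_id: int = GENERIC_METAGENOME_TAXID,
    host_taxid: Optional[int] = None,
    capture_taxids: Iterable[int] = (),
) -> Dict[str, Union[int, List[int]]]:
    """Collect taxonomy ID fields for one library (hypothetical helper)."""
    # samp_taxon_id is always set, defaulting to the generic metagenome ID.
    fields: Dict[str, Union[int, List[int]]] = {"samp_taxon_id": samp_taxon_id}

    # host_taxid is only added when the sample came from a host organism.
    if host_taxid is not None:
        fields["host_taxid"] = host_taxid

    # genomic_probe_capture_id is only added for captured libraries.
    capture = list(capture_taxids)
    if capture:
        fields["genomic_probe_capture_id"] = capture

    return fields


# A Yersinia pestis capture library from a human sample.
record = build_taxon_fields(host_taxid=9606, capture_taxids=[632])
print(record)
```

Leaving `capture_taxids` empty, as for a shotgun library, omits `genomic_probe_capture_id` from the record entirely rather than filling it with a placeholder.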

Example

I have a 1240k capture library from a human petrous bone

| Metadata term | Taxon ID Name | Taxon ID |
|---|---|---|
| samp_taxon_id | human skeleton metagenome | 1892068 |
| host_taxid | Homo sapiens | 9606 |
| genomic_probe_capture_id | Homo sapiens | 9606 |

I have a shotgun sequenced library from a human petrous bone

| Metadata term | Taxon ID Name | Taxon ID |
|---|---|---|
| samp_taxon_id | human skeleton metagenome | 1892068 |
| host_taxid | Homo sapiens | 9606 |

This is not a captured library, so genomic_probe_capture_id is not required.

I have a Yersinia pestis capture library from a human tooth

| Metadata term | Taxon ID Name | Taxon ID |
|---|---|---|
| samp_taxon_id | human skeleton metagenome | 1892068 |
| host_taxid | Homo sapiens | 9606 |
| genomic_probe_capture_id | Yersinia pestis | 632 |

I have a shotgun sequenced library of dental calculus from a Eurasian brown bear

| Metadata term | Taxon ID Name | Taxon ID |
|---|---|---|
| samp_taxon_id | oral metagenome | 1227552 |
| host_taxid | Ursus arctos arctos | 563924 |

This is not a captured library, so genomic_probe_capture_id is not required.

I have a sediment capture library for Narwhal DNA

| Metadata term | Taxon ID Name | Taxon ID |
|---|---|---|
| samp_taxon_id | marine sediment metagenome | 412755 |
| genomic_probe_capture_id | Monodon monoceros | 40151 |

There is no host_taxid field in the MIxS sediment checklist, so it is not used here.
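The pattern running through these examples can be expressed as a small consistency check. The sketch below is illustrative only. The function `check_taxon_fields` and its `is_capture` flag are hypothetical names, not part of any MIxS or INSDC tooling. It encodes just two rules from this page: samp_taxon_id is always present, and genomic_probe_capture_id appears only for captured libraries.

```python
from typing import Dict, List


def check_taxon_fields(record: Dict[str, int], is_capture: bool) -> List[str]:
    """Return a list of problems found in one record (hypothetical helper)."""
    problems = []

    # Every ancient DNA sample needs a metagenome samp_taxon_id.
    if "samp_taxon_id" not in record:
        problems.append("samp_taxon_id is missing")

    # The capture ID should be present for captured libraries only.
    has_capture_id = "genomic_probe_capture_id" in record
    if is_capture and not has_capture_id:
        problems.append("capture library without genomic_probe_capture_id")
    if not is_capture and has_capture_id:
        problems.append("genomic_probe_capture_id set on a non-capture library")

    return problems


# Two of the worked examples from this page, using the taxon IDs listed above.
plague_capture = {
    "samp_taxon_id": 1892068,
    "host_taxid": 9606,
    "genomic_probe_capture_id": 632,
}
bear_calculus = {"samp_taxon_id": 1227552, "host_taxid": 563924}

print(check_taxon_fields(plague_capture, is_capture=True))
print(check_taxon_fields(bear_calculus, is_capture=False))
```

Both example records pass with an empty list of problems. host_taxid is deliberately not checked, because whether it applies depends on the checklist in use, as the sediment example shows.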

How to fill in sample age information

TODO

  • Wider description
  • What to do if there is no age information?
      • E.g. give a very wide range, and note this in the description
  • Describe each method of dating

How to fill in sample collection date information

TODO

How to fill in metadata for capture data

TODO

How to correctly prepare your ancient data for submission

TODO

If you wish to associate your data with other contextual information, such as anthropological or osteological reports, or archaeological site reports, the DOIs or URLs should go in the relevant_electronic_resource field.

If you wish to associate your data with other non-genetic contextual data (such as imaging or scan data), give the DOIs or URLs of this data in the relevant_electronic_resource field.

How to fill in age of death information

This information is already represented in the HostAssociated and HumanAssociated checklists, and can be filled in using the host_age and host_life_stage fields.

How to indicate 'merged' individuals

  • In context of INSDC: Re-use ENA codes to link together!

How to reference extinct species

  • Check whether the species is already in the NCBI taxonomy (many already are!)
  • If not: request an NCBI Taxonomy ID from NCBI!
  • If that is not allowed or recognised: use a generic ID (e.g. Mammuthus sp.) plus host_common_name
  • Specify this under the permit_authority and/or 'curating_institution' fields.