Seed Microbiota Database (doi:10.15454/2ANNJM)

Document Description

Citation

Title:

Seed Microbiota Database

Identification Number:

doi:10.15454/2ANNJM

Distributor:

Portail Data INRAE

Date of Distribution:

2021-06-04

Version:

2

Bibliographic Citation:

Simonin, Marie; Barret, Matthieu, 2021, "Seed Microbiota Database", https://doi.org/10.15454/2ANNJM, Portail Data INRAE, V2, UNF:6:qmyXzIL+/FEATj0EoRC3lA== [fileUNF]

Study Description

Citation

Title:

Seed Microbiota Database

Identification Number:

doi:10.15454/2ANNJM

Authoring Entity:

Simonin, Marie (INRAE - l'Institut national de recherche pour l’agriculture, l’alimentation et l’environnement)

Barret, Matthieu (INRAE - l'Institut national de recherche pour l’agriculture, l’alimentation et l’environnement)

Distributor:

Portail Data INRAE

Access Authority:

Simonin, Marie

Depositor:

Simonin, Marie

Date of Deposit:

2021-06-04

Study Scope

Keywords:

Microorganisms, Plant Health and Pathology

Abstract:

This dataset compiles all the data of the Seed Microbiota Database associated with the publication Simonin et al. 2021 (bioRxiv), "Seed microbiota revealed by a large-scale meta-analysis including 50 plant species". The database includes metabarcoding data from 63 seed microbiota studies on 50 plant species (3190 seed samples in total) based on 5 molecular markers (16S rRNA gene - V4 region, 16S rRNA gene - V5-V6 region, gyrB gene, ITS1 region, ITS2 region). All studies were re-processed from the raw fastq files using DADA2 and QIIME 2, then merged into 5 datasets, one per targeted molecular marker. The README file presents the structure of the database (Subsets) and the files available.

Kind of Data:

Dataset

Methodology and Processing

Sources Statement

Data Access

Notes:

Licence Ouverte / Open Licence Version 2.0 (https://www.etalab.gouv.fr/licence-ouverte-open-licence), compatible CC BY

Other Study Description Materials

File Description--f113943

File: Subset1-2_All_studies_merged_16S-rep-seqs-FINAL-V4-MiSeq-taxonomy.tab

  • Number of cases: 31433

  • No. of variables per record: 11

  • Type of File: text/tab-separated-values

Notes:

UNF:6:u+JJ/7BbzWKk3hLFfI3ZkA==

File Description--f113942

File: Subset1-2_All_studies_merged_16S_table-FINAL-V5V6-MiSeq-taxonomy.tab

  • Number of cases: 4225

  • No. of variables per record: 11

  • Type of File: text/tab-separated-values

Notes:

UNF:6:ckeC6MZJr7TU7nJsJBkD1g==

File Description--f113939

File: Subset1-2_All_studies_merged_gyrB-rep-seqs-FINAL-filtered-taxonomy-final.tab

  • Number of cases: 27373

  • No. of variables per record: 10

  • Type of File: text/tab-separated-values

Notes:

UNF:6:egkHr/w8kDUYf5I6eyX7Lw==

File Description--f113951

File: Subset1-2_All_studies_merged_ITS1_taxonomy.tab

  • Number of cases: 8333

  • No. of variables per record: 11

  • Type of File: text/tab-separated-values

Notes:

UNF:6:jDf35n15q+TraP067P8RoQ==

File Description--f113945

File: Subset1-2_All_studies_merged_ITS2-taxonomy.tab

  • Number of cases: 901

  • No. of variables per record: 11

  • Type of File: text/tab-separated-values

Notes:

UNF:6:Sk+tr0oNaZCQ5tGsRe6PuA==
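
The case and variable counts listed for each file above can be checked after download. The following is a minimal sketch, assuming each .tab file carries a single header row of variable names followed by one row per case; the helper name `describe_tab_file` and the demo values are hypothetical and not part of the dataset.

```python
import csv
import io

def describe_tab_file(handle):
    """Return (number of cases, number of variables) for a tab-separated file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)  # assumption: the first row holds the variable names
    n_cases = sum(1 for row in reader if row)  # blank lines are not counted
    return n_cases, len(header)

# Tiny in-memory stand-in for a taxonomy table (values are invented).
sample = io.StringIO(
    "SV\tGene\tTaxonomy\tConfidence\n"
    "ASV1\t16S-V4\tBacteria;Proteobacteria\t0.99\n"
    "ASV2\t16S-V4\tBacteria;Firmicutes\t0.85\n"
)
n_cases, n_vars = describe_tab_file(sample)
print(n_cases, n_vars)
```

For a real file, open it with `open(path, newline="")` and pass the handle; if the header assumption holds, Subset1-2_All_studies_merged_ITS2-taxonomy.tab should report 901 cases and 11 variables, matching the description above.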

Variable Description

List of Variables:

Variables

SV

f113943 Location:

Variable Format: character

Notes: UNF:6:0kPhe7VqQ8Ml0/pgI5RoWw==

Gene

f113943 Location:

Variable Format: character

Notes: UNF:6:+jP3fOUVQc+7H39nbQL3UQ==

Taxonomy

f113943 Location:

Variable Format: character

Notes: UNF:6:JVlVC759JU3yrwXFN0FNOA==

Confidence

f113943 Location:

Summary Statistics: StDev 0.08715217331925784; Mean 0.9393406238671778; Max. 1.0; Valid 31433.0; Min. 0.700009905

Variable Format: numeric

Notes: UNF:6:wrq0QYHFCVESWZ4fVuIbYA==

Kingdom

f113943 Location:

Variable Format: character

Notes: UNF:6:gzbGNlkjOvPEFBBFa4ooYw==

Phylum

f113943 Location:

Variable Format: character

Notes: UNF:6:tV+UgntLEUT4Aowwt7ktmA==

Class

f113943 Location:

Variable Format: character

Notes: UNF:6:cJ4K9EMYhI76S5dqEE6X+w==

Order

f113943 Location:

Variable Format: character

Notes: UNF:6:sTOkgSwpkdnGX8dCA7sVoA==

Family

f113943 Location:

Variable Format: character

Notes: UNF:6:mptsCysKIZBMtLbIIvDIOg==

Genus

f113943 Location:

Variable Format: character

Notes: UNF:6:LFW2GIvHGwIJnvCl5rJEbA==

Species

f113943 Location:

Variable Format: character

Notes: UNF:6:iO6qffRP+3m/FhK6iJV48w==

SV

f113942 Location:

Variable Format: character

Notes: UNF:6:yvV1g7qkftd8ApZmOwU0wQ==

Gene

f113942 Location:

Variable Format: character

Notes: UNF:6:v+XGgyBKwMos3pZZR81O+Q==

Taxonomy

f113942 Location:

Variable Format: character

Notes: UNF:6:AyvP3Y81oGrm+07BF5Ah3A==

Confidence

f113942 Location:

Summary Statistics: StDev 0.08118860336580708; Min. 0.700082522; Mean 0.9446077843671006; Valid 4225.0; Max. 1.0

Variable Format: numeric

Notes: UNF:6:TP1Dn4wTzi8ksJqIESwDtw==

Kingdom

f113942 Location:

Variable Format: character

Notes: UNF:6:kayyZF2hFHyPwdJKhf0XmQ==

Phylum

f113942 Location:

Variable Format: character

Notes: UNF:6:/w9Zh8Gi44qReq+Gza4Hng==

Class

f113942 Location:

Variable Format: character

Notes: UNF:6:QTuJkql5JvxMZ2t/IzC/EA==

Order

f113942 Location:

Variable Format: character

Notes: UNF:6:QYGX30NZd8CWHVl2eYQa6A==

Family

f113942 Location:

Variable Format: character

Notes: UNF:6:k01RTW0APe8dJK9jRtUa+w==

Genus

f113942 Location:

Variable Format: character

Notes: UNF:6:tlIDEDK9ZuVNQ+E/2derpg==

Species

f113942 Location:

Variable Format: character

Notes: UNF:6:Ywor7yjwwedpc/srzsD6Jg==

SV

f113939 Location:

Variable Format: character

Notes: UNF:6:5it12h8oz8XEDiaoVOiUVA==

Taxonomy

f113939 Location:

Variable Format: character

Notes: UNF:6:Pq97iYK0rNwRWbjCFBTxsg==

Confidence

f113939 Location:

Summary Statistics: Valid 27373.0; Max. 1.0; Min. 0.70001243; Mean 0.9407210396468417; StDev 0.08415415227592508

Variable Format: numeric

Notes: UNF:6:lZUKysL3WDjZcvEW4HHnKQ==

Gene

f113939 Location:

Variable Format: character

Notes: UNF:6:8PkiCC0Eugw1jWJNX1Jg2Q==

Phylum

f113939 Location:

Variable Format: character

Notes: UNF:6:jm0br41yvYQKg2TTsVS19Q==

Class

f113939 Location:

Variable Format: character

Notes: UNF:6:bZsH8BAIPEXyz961XlCeKw==

Order

f113939 Location:

Variable Format: character

Notes: UNF:6:qkBNMftUcuWh1dhCE4yHDQ==

Family

f113939 Location:

Variable Format: character

Notes: UNF:6:ldbH67H3yfvCTApJgGCjRg==

Genus

f113939 Location:

Variable Format: character

Notes: UNF:6:DUlLkkSwadNxEdwutLzovg==

Species

f113939 Location:

Variable Format: character

Notes: UNF:6:r76idpN2OqKTzT4Vpp4tPA==

SV

f113951 Location:

Variable Format: character

Notes: UNF:6:aUTxmS3Bd159QAhAGsejsA==

Taxonomy

f113951 Location:

Variable Format: character

Notes: UNF:6:m0j0MirwsOUyu3pgq1kkmA==

Confidence

f113951 Location:

Summary Statistics: Mean 0.9293037215711029; StDev 0.09180509266014666; Max. 1.0; Valid 8333.0; Min. 0.700094124

Variable Format: numeric

Notes: UNF:6:27C9KoTV9hCndEVGikjgyQ==

Gene

f113951 Location:

Variable Format: character

Notes: UNF:6:OAFDwoFw+LJgB5bCF9RAjA==

Kingdom

f113951 Location:

Variable Format: character

Notes: UNF:6:RNAsKoVrihj8pid1/ZZm6g==

Phylum

f113951 Location:

Variable Format: character

Notes: UNF:6:QY2UcPwx78ZkQNI8N9u5iw==

Class

f113951 Location:

Variable Format: character

Notes: UNF:6:g/vNrSYGQwfVPbC/WKiXjA==

Order

f113951 Location:

Variable Format: character

Notes: UNF:6:b4H6kcquthpPPHAB9uGSWw==

Family

f113951 Location:

Variable Format: character

Notes: UNF:6:RwOxyhjYZr+1fa5Gq1r+KA==

Genus

f113951 Location:

Variable Format: character

Notes: UNF:6:+7/EmXmMZZTg+T5NSqYA8Q==

Species

f113951 Location:

Variable Format: character

Notes: UNF:6:Bd4gwMpGET6SmihAK6Vy4w==

SV

f113945 Location:

Variable Format: character

Notes: UNF:6:9bRJabPwOf/fP9Y2O6AOgg==

Taxon

f113945 Location:

Variable Format: character

Notes: UNF:6:cUjFek0VVK/VpraP7VbhLA==

Confidence

f113945 Location:

Summary Statistics: Max. 1.0; Valid 901.0; StDev 0.15807395240497044; Mean 0.8943345149256382; Min. 0.308993448

Variable Format: numeric

Notes: UNF:6:ahfFs1sLDJjZk1pCJV31JQ==

Gene

f113945 Location:

Variable Format: character

Notes: UNF:6:SPpuvqlRQ4zvjml5UA35HQ==

Kingdom

f113945 Location:

Variable Format: character

Notes: UNF:6:/rRvfU+0Z6poMBw6GT0m/w==

Phylum

f113945 Location:

Variable Format: character

Notes: UNF:6:RgiM6Q+PdZ6lxPur+ApEtg==

Class

f113945 Location:

Variable Format: character

Notes: UNF:6:0GU9GW+sXxs8q8GaXl/8bA==

Order

f113945 Location:

Variable Format: character

Notes: UNF:6:IytzprDKUoXnUwTPF3HCJg==

Family

f113945 Location:

Variable Format: character

Notes: UNF:6:wnb/Ns187ulA3WT43/Jk0w==

Genus

f113945 Location:

Variable Format: character

Notes: UNF:6:OJI/6OM1drNH/S6TqV7CUQ==

Species

f113945 Location:

Variable Format: character

Notes: UNF:6:3w0r31aFDda6l14ZiKqb7Q==
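
The summary statistics reported for each Confidence variable (Valid, Min., Max., Mean, StDev) can be recomputed from the column values. This is a rough sketch rather than the repository's own procedure: the function name `summarize_confidence` is hypothetical, and the use of the sample standard deviation is an assumption about how StDev was calculated.

```python
import statistics

def summarize_confidence(values):
    """Recompute codebook-style summary statistics for a Confidence column."""
    values = [float(v) for v in values]  # values arrive as text when read from a .tab file
    return {
        "Valid": len(values),
        "Min": min(values),
        "Max": max(values),
        "Mean": statistics.fmean(values),
        "StDev": statistics.stdev(values),  # assumption: sample (n - 1) standard deviation
    }

# Invented values, for illustration only.
demo = summarize_confidence(["0.70", "0.90", "1.0", "1.0"])
print(demo)
```

In the statistics listed above, four of the five files have a minimum Confidence just above 0.70, while the ITS2 file (f113945) has a minimum of 0.308993448, which is worth keeping in mind if one confidence threshold is applied to all markers.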

Other Study-Related Materials

Label:

getASVFoundOnSpecies.pl

Text:

Query script that lists all the ASVs detected in a given plant species

Notes:

text/x-perl-script

Other Study-Related Materials

Label:

getASVInfosInDb.pl

Text:

Query script to search for a specific ASV (sequence) in the database

Notes:

text/x-perl-script
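
For readers without Perl, the kind of lookup described for getASVInfosInDb.pl can be approximated in a few lines. The sketch below is not a port of that script, whose logic is not documented in this codebook; it only assumes that the fasta files hold one record per ASV, and it reports the headers of records whose sequence exactly matches a query. The names `read_fasta` and `find_asv` are hypothetical.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)  # sequences may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)

def find_asv(handle, query):
    """Return headers of records whose sequence equals the query (case-insensitive)."""
    query = query.strip().upper()
    return [h for h, seq in read_fasta(handle) if seq.upper() == query]

# Invented two-record FASTA, for illustration only.
demo = io.StringIO(">asv_a\nACGT\nACGT\n>asv_b\nTTGA\n")
hits = find_asv(demo, "acgtacgt")
print(hits)
```

Exact matching is the simplest choice; amplicons trimmed to a different length than those in the database would need an alignment-based search instead.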

Other Study-Related Materials

Label:

Metadata_16S_V4_V5V6_withDivSubset2_Jan2021.txt

Text:

16S rRNA gene - V4 and V5-V6, metadata file (info on samples)

Notes:

text/plain

Other Study-Related Materials

Label:

Metadata_gyrB_withDivSubset2_Jan2021.txt

Text:

gyrB gene, metadata table (info on samples)

Notes:

text/plain

Other Study-Related Materials

Label:

Metadata_ITS1_ITS2_withDivSubset2_Jan2021.txt

Text:

ITS1 and ITS2, metadata file (info on samples)

Notes:

text/plain

Other Study-Related Materials

Label:

README - Database Meta-analysis Seed Microbiome-1.txt

Text:

Readme that provides details on the structure of the database and the query scripts

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S-rep-seqs-FINAL-V4-MiSeq-filtered.fasta

Text:

16S rRNA gene - V4, fasta file of Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S-rep-seqs-FINAL-V4-MiSeq-filtered-rooted-tree.nwk

Text:

16S rRNA gene - V4, rooted phylogenetic tree based on Subset 1

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S-rep-seqs-FINAL-V4-MiSeq-filtered-unrooted-tree.nwk

Text:

16S rRNA gene - V4, unrooted phylogenetic tree based on Subset 1

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S-rep-seqs-FINAL-V5V6-MiSeq-filtered.fasta

Text:

16S rRNA gene - V5-V6, fasta file of the Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S-rep-seqs-FINAL-V5V6-MiSeq-rooted-tree.nwk

Text:

16S rRNA gene - V5-V6, rooted phylogenetic tree

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S-rep-seqs-FINAL-V5V6-MiSeq-unrooted-tree.nwk

Text:

16S rRNA gene - V5-V6, unrooted phylogenetic tree

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S_table-FINAL-V4-MiSeq-filtered.txt

Text:

16S rRNA gene - V4, ASV table of clean database unrarefied = Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_16S_table-FINAL-V5V6-MiSeq2-filtered.txt

Text:

16S rRNA gene - V5-V6, ASV table of the clean database unrarefied = Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1_All_studies_merged_gyrB-rep-seqs-FINAL-filtered3.fasta

Text:

gyrB gene, fasta file on Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1_All_studies_merged_gyrB-rep-seqs-FINAL-filtered3-rooted-tree.nwk

Text:

gyrB gene, rooted phylogenetic tree on Subset 1

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Subset1_All_studies_merged_gyrB-rep-seqs-FINAL-filtered3-unrooted-tree.nwk

Text:

gyrB gene, unrooted phylogenetic tree on Subset 1

Notes:

application/octet-stream

Other Study-Related Materials

Label:

Subset1_All_studies_merged_gyrB_table-FINAL-filtered3.txt

Text:

gyrB gene, ASV table of the clean database unrarefied = Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_ITS1-rep-seqs-FINAL-filtered.fasta

Text:

ITS1, fasta file of the Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_ITS1_table-FINAL2-filtered.txt

Text:

ITS1, ASV table of the clean database unrarefied = Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_ITS2-rep-seqs-FINAL-filtered2.fasta

Text:

ITS2, fasta file of Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset1-All_studies_merged_ITS2_table-FINAL2-filtered2.txt

Text:

ITS2, ASV table of the clean database unrarefied = Subset 1

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2-All_studies_merged_16S-rep-seqs-FINAL-V4-MiSeq-filtered.fasta

Text:

16S rRNA gene - V4, fasta file of Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2-All_studies_merged_16S-rep-seqs-FINAL-V5V6-MiSeq-rarefied.fasta

Text:

16S rRNA gene - V5-V6, fasta file of the Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2-All_studies_merged_16S_table-FINAL-V4-MiSeq-filtered.txt

Text:

16S rRNA gene - V4, ASV table rarefied by study = Subset 2

Notes:

text/plain
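
As background for the rarefied tables (Subsets 2 and 3), rarefaction is commonly done by drawing a fixed number of reads per sample at random without replacement. The sketch below illustrates that general idea only; the rarefaction depths and the tool used by the authors are not stated in this codebook, and the function name `rarefy`, the depth, and the counts are hypothetical.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from an {ASV: count} mapping."""
    total = sum(counts.values())
    if total < depth:
        return None  # sample is shallower than the target depth
    # Expand counts into one entry per read, then draw `depth` of them.
    reads = [asv for asv, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(reads, depth)
    return dict(Counter(picked))

# Invented counts for a single sample, for illustration only.
sample = {"ASV1": 50, "ASV2": 30, "ASV3": 20}
rarefied = rarefy(sample, depth=40)
print(sum(rarefied.values()))
```

Applying one depth per study would correspond to the "rarefied by study" description, and a single depth for every sample to "rarefied across all studies", under the assumptions stated above.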

Other Study-Related Materials

Label:

Subset2-All_studies_merged_16S_table-FINAL-V5V6-MiSeq-rarefied.txt

Text:

16S rRNA gene - V5-V6, ASV table rarefied by study = Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2_All_studies_merged_gyrB-rep-seqs-FINAL-rarefied-filtered2.fasta

Text:

gyrB gene, fasta file of Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2_All_studies_merged_gyrB_table-FINAL-rarefied-filtered2.txt

Text:

gyrB gene, ASV table rarefied by study = Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2-All_studies_merged_ITS2_table-FINAL.txt

Text:

ITS2, ASV table rarefied by study = Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2-ITS1region-rep-seqs-FINAL-rarefied.fasta

Text:

ITS1, fasta file of the Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2-ITS1region_table-FINAL-rarefied.txt

Text:

ITS1, ASV table rarefied by study = Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset2-ITS2region-rep-seqs-FINAL-rarefied.fasta

Text:

ITS2, fasta file of Subset 2

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3_16S_V4_rep-seq-FINAL-rarefied.fasta

Text:

16S rRNA gene - V4, fasta file of Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3-16S-V4-table-FINAL-rarefied.txt

Text:

16S rRNA gene - V4, ASV table rarefied across all studies = Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3_16S_V5V6_rep-seq-FINAL-rarefied.fasta

Text:

16S rRNA gene - V5-V6, fasta file of the Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3-16S-V5V6-table-FINAL-rarefied.txt

Text:

16S rRNA gene - V5-V6, ASV table rarefied across all studies = Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3_gyrB-MiSeq_rep-seq-FINAL-rarefied.fasta

Text:

gyrB gene, fasta file on Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3-gyrB-MiSeq_table-FINAL-rarefied.txt

Text:

gyrB gene, ASV table rarefied across all studies = Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3_ITS1_rep-seq-FINAL-rarefied.fasta

Text:

ITS1, fasta file of the Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3-ITS1_table-FINAL-rarefied.txt

Text:

ITS1, ASV table rarefied across all studies = Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3_ITS2_rep-seq-FINAL-rarefied.fasta

Text:

ITS2, fasta file of Subset 3

Notes:

text/plain

Other Study-Related Materials

Label:

Subset3-ITS2_table-FINAL-rarefied.txt

Text:

ITS2, ASV table rarefied across all studies = Subset 3

Notes:

text/plain