Transposable Elements prediction and annotation in the M. incognita genome (ICPSR doi:10.15454/EPTDOS)

Document Description

Citation

Title:

Transposable Elements prediction and annotation in the M. incognita genome

Identification Number:

doi:10.15454/EPTDOS

Distributor:

Portail Data INRAE

Date of Distribution:

2020-06-11

Version:

2

Bibliographic Citation:

Kozlowski, Djampa, 2020, "Transposable Elements prediction and annotation in the M. incognita genome", https://doi.org/10.15454/EPTDOS, Portail Data INRAE, V2

Study Description

Citation

Title:

Transposable Elements prediction and annotation in the M. incognita genome

Identification Number:

doi:10.15454/EPTDOS

Authoring Entity:

Kozlowski, Djampa (INRAE/Universite Cote d'Azur)

Distributor:

Portail Data INRAE

Access Authority:

Danchin, Etienne

Access Authority:

Kozlowski, Djampa

Depositor:

Kozlowski, Djampa

Date of Deposit:

2020-04-29

Study Scope

Keywords:

Biodiversity and Ecology, Omics, Plant Health and Pathology

Abstract:

Summary: This dataset contains all essential files produced during TE prediction, annotation, and post-processing in the M. incognita genome (e.g. TE consensus library, TE annotations, and associated statistics). It also contains the global workflow (the command lines used), the REPET configuration files (with parameters) used for this analysis, and the in-house Python script used to identify canonical TE annotations from the TE consensus library and the draft TE annotation (REPET output).

Kind of Data:

Dataset

Kind of Data:

Software

Methodology and Processing

Sources Statement

Data Access

Notes:

CC0 Waiver

Other Study Description Materials

Other Study-Related Materials

Label:

finalAnnot.perConsensusStats.txt

Text:

Per-TE-consensus statistics, e.g. number of copies, TE class and order, copy length/identity statistics, etc.

Notes:

text/plain

Other Study-Related Materials

Label:

finalAnnot.perCopyStats.txt

Text:

Per-TE-copy statistics, e.g. linked TE consensus, copy length/identity statistics, etc.

Notes:

text/plain

Other Study-Related Materials

Label:

logs_REPETpostAnal-V1.0.5_minc_v3.TEannotation.19-07-19.txt

Text:

Logs from the M. incognita TE annotation post-processing. Summarizes annotation statistics for each post-processing step.

Notes:

text/plain

Other Study-Related Materials

Label:

minc_v3_cleanedTEConsensusLibrary.19-07-19.fa

Text:

TE-consensus sequence library (REPET analysis: TEannot + automated cleaning with 1 loose round of TEannot)

Notes:

application/octet-stream

Other Study-Related Materials

Label:

minc_v3.TEannotations.19-07-19.filtered_classiffiedOnly.minLen250.minId85.minConsCov33.blastVsCons.noOverlap.bed

Text:

Canonical TE annotation file (BED format) extracted from the draft TE annotation using REPETpostAnal-V1.0.5.py

Notes:

application/vnd.realvnc.bed
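
The file name above appears to encode the filtering thresholds (minLen250, minId85, minConsCov33). The Python sketch below illustrates that kind of threshold filter, assuming these tokens mean a minimum copy length of 250 bp, a minimum identity of 85% to the consensus, and a minimum consensus coverage of 33%. That reading of the tokens is an assumption, and the record fields (length, identity, cons_cov) and the function name are hypothetical: they are not taken from REPETpostAnal-V1.0.5.py. The other steps suggested by the file name (classified-only selection, blast against the consensus, overlap removal) are not reproduced here.

```python
from typing import Dict, Iterable, List

# Assumed thresholds, read off the file name (interpretation not confirmed by the source).
MIN_LEN = 250        # minimum copy length, in bp
MIN_ID = 85.0        # minimum percent identity to the consensus
MIN_CONS_COV = 33.0  # minimum percent of the consensus covered by the copy


def filter_copies(copies: Iterable[Dict[str, float]]) -> List[Dict[str, float]]:
    """Keep only the copies that meet all three thresholds."""
    return [
        c for c in copies
        if c["length"] >= MIN_LEN
        and c["identity"] >= MIN_ID
        and c["cons_cov"] >= MIN_CONS_COV
    ]


if __name__ == "__main__":
    # Made-up records, for illustration only (hypothetical field names).
    example = [
        {"length": 1200, "identity": 92.5, "cons_cov": 60.0},  # passes all thresholds
        {"length": 180, "identity": 95.0, "cons_cov": 40.0},   # too short
        {"length": 900, "identity": 80.0, "cons_cov": 70.0},   # identity too low
    ]
    print(len(filter_copies(example)))
```

To see which filters were actually applied, and in what order, refer to the script and the workflow file distributed with this dataset.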

Other Study-Related Materials

Label:

mincV3XA2_draftAnnot.19-07-19.sorted.gff3

Text:

Draft TE annotation (REPET output; gff3 format)

Notes:

application/octet-stream

Other Study-Related Materials

Label:

REPETpostAnal-V1.0.5.py

Text:

In-house Python (>= 3) script used to parse the REPET draft annotation (repeatome) and isolate canonical TE annotations. Requires as input i) the REPET draft TE-annotation output (.gff3), ii) the TE-consensus library used for the annotation, iii) the genome fasta file. Default parameters are set for stringent post-processing filtering (i.e. canonical TE annotations). External dependencies: bedtools (>= 2), blast+, bash. Required Python libraries: subprocess, os, sys, re, pandas, argparse, numpy, Bio (SeqIO).

Notes:

text/x-python
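
Before running the script, it can help to confirm that the dependencies listed above are available. The following is a minimal sketch, not part of the dataset: the function name is hypothetical, and checking blast+ through the blastn and makeblastdb executables is an assumption, since the description does not say which blast+ programs the script calls. Version requirements (bedtools >= 2, Python >= 3) are not checked.

```python
import importlib.util
import shutil

# Executable names are assumptions: blast+ is probed via "blastn" and "makeblastdb".
EXECUTABLES = ["bash", "bedtools", "blastn", "makeblastdb"]
# Third-party Python packages named in the description ("Bio" is Biopython).
MODULES = ["pandas", "numpy", "Bio"]


def missing_dependencies(executables=EXECUTABLES, modules=MODULES):
    """Return the names of executables and modules that cannot be found."""
    missing = [exe for exe in executables if shutil.which(exe) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

The remaining libraries in the list (subprocess, os, sys, re, argparse) belong to the Python standard library and need no separate installation.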

Other Study-Related Materials

Label:

TEannot.cfg

Text:

TEannot pipeline configuration file with parameter values. Mandatory for TEannot pipeline execution. For more information, see: https://urgi.versailles.inra.fr/Tools/REPET https://urgi.versailles.inra.fr/Tools/REPET/TEannot-tuto

Notes:

application/octet-stream

Other Study-Related Materials

Label:

TEdenovo.cfg

Text:

TEdenovo pipeline configuration file with parameter values. Mandatory for TEdenovo pipeline execution. For more information, see: https://urgi.versailles.inra.fr/Tools/REPET https://urgi.versailles.inra.fr/Tools/REPET/TEdenovo-tuto

Notes:

application/octet-stream

Other Study-Related Materials

Label:

TE_prediction_and_annotation_Minc_v3_19-07-19.lastVersion.Rmd

Text:

M. incognita TE prediction and annotation workflow (R Markdown file). Contains the command lines executed to perform the analysis.

Notes:

application/octet-stream