Practical bioinformatics topics / NGS

I’m developing a survey instrument that I can use to assess bioinformatics training needs at UC Davis, with a particular emphasis on practical sequencing data analysis. (Please see my blog post on training for more information and background.)

A few notes –

  1. I intend this survey to be filled out by biologists, so I’m leaving out technical and foundational skills (cloud computing, Linux/UNIX, R, Python, managing large data sets).
  2. I’m also avoiding sequence analysis approaches for which there are no established pipelines.

Below is my list so far. I welcome comments, additions, and critiques! The live site is at http://ngs-training-needs-survey.rtfd.org/.

Please feel free to copy, fork, and modify this - the source is on GitHub at https://github.com/ngs-docs/ngs-training-needs-survey.

Genome assembly and annotation:

  • Assembling and annotating bacterial and archaeal genomes (w/Illumina, PacBio)
  • Assembling and annotating non-plant/animal eukaryotic genomes (w/Illumina)
  • Assembling animal genomes (w/Illumina)
  • Annotating animal genomes
  • Assembling plant genomes (w/Illumina)
  • Annotating plant genomes
  • Annotating bacterial genomes
  • Annotating fungal genomes
  • Long-read technologies for large genomes (PacBio, Moleculo)
  • Emerging technologies for genome sequencing, assembly, and annotation in plants
  • Emerging technologies for genome sequencing, assembly, and annotation in bacteria and archaea
  • Emerging technologies for genome sequencing, assembly, and annotation in animals

Resequencing and variant calling:

  • Variant calling on bacterial, archaeal, and fungal genomes
  • Variant calling on plant and animal genomes
  • Genotyping by sequencing

Transcriptomics:

  • mRNAseq expression analysis in major model organisms (human, mouse, zebrafish, Arabidopsis, yeast, worm, Drosophila)
  • Ab initio transcriptome assembly, annotation, and expression analysis (semi-model animals, plants, and fungi)
  • De novo transcriptome assembly, annotation, and expression analysis (non-model eukaryotes)
  • Reference-genome-based bacterial and archaeal transcriptomics
  • De novo mRNAseq in bacteria and archaea (no reference genome)

Metagenomics and microbial ecology:

  • Amplicon analysis of populations and population structure
  • Reference-based metagenomics (e.g. human microbiome)
  • De novo shotgun metagenome and metatranscriptome assembly and analysis

Other:

  • ChIP-seq analysis
  • Reduced representation analysis of genomes and populations
  • Marker development
  • Genome-wide association studies (GWAS)

More open-ended questions:

What bioinformatics software/programs are you using right now?

  • CLC Workbench
  • Galaxy
  • Other (please specify)

What compute resources are you using, if any?

  • Laptop or lab computer
  • iPlant
  • XSEDE
  • DIAG
  • Amazon cloud
  • Davis Genome Center
  • Other cloud (please specify)
  • Other (please specify)

What scripting or programming languages are you using, if any?

  • MATLAB
  • R
  • Python
  • Perl
  • SAS
  • Other (please specify)

What do you feel is your major bioinformatics or sequence analysis-related obstacle, i.e. what is getting in the way of doing your data analysis?


LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons - 0 License (CC0) -- fork @ github. Presentations (PPT/PDF) and PDFs are the property of their respective owners and are under the terms indicated within the presentation.