The aim of this project was to compute key summary statistics from whole-genome DNA resequencing datasets in order to assess sample quality, and to develop methods for quickly identifying sub-par samples within a large population.
- Identify and implement summary metrics that can be used to describe the quality of a sample:
  - mean read coverage?
- Classify samples as good or bad by selecting the metrics that separate the two groups well
- Report and visualize selected metrics across large sequencing collections
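
As a rough illustration of the first two objectives, the sketch below computes mean read coverage per sample and flags samples whose coverage deviates strongly from the rest of the collection. It is not the project's implementation: the function names, the toy depth values, the robust z-score (median and MAD) and the 3.5 cutoff are all assumptions chosen for the example.

```python
from statistics import mean, median


def mean_coverage(depths):
    """Mean read depth over the per-position depths of one sample."""
    return mean(depths)


def flag_outliers(metric_by_sample, threshold=3.5):
    """Return sorted names of samples whose metric is a robust outlier.

    Uses a modified z-score based on the median and the median absolute
    deviation (MAD). The 3.5 threshold is an assumed, commonly used cutoff.
    """
    values = list(metric_by_sample.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All samples (or most of them) share one value; nothing to separate.
        return []
    return sorted(
        name
        for name, value in metric_by_sample.items()
        if 0.6745 * abs(value - med) / mad > threshold
    )


# Hypothetical per-position depths for five samples (toy data).
depths_by_sample = {
    "s1": [30, 32, 29, 31],
    "s2": [28, 30, 31, 29],
    "s3": [31, 33, 30, 30],
    "s4": [5, 4, 6, 5],
    "s5": [29, 31, 30, 30],
}

coverage = {name: mean_coverage(d) for name, d in depths_by_sample.items()}
print(flag_outliers(coverage))
```

A median-based score is used here rather than mean and standard deviation because a few very poor samples would otherwise inflate the spread and hide themselves. In practice the per-position depths would come from the alignment files rather than from hard-coded lists.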