awk

Subsampling a fastq file with awk

What you are going to find here

- A minimal introduction to the awk command on Linux and Mac. (Mac users may need to install GNU awk, which adds some new functions such as asort() for sorting an array.)
- An awk command that randomly subsamples k reads from a given fastq file from a paired-end sequencing run.

Why I am making this note

In single-cell RNA sequencing, there seems to be no good way, to date, to tell how deep you should sequence.