samtools ampliconclip [-o out.file] [-f stat.file] [--soft-clip] [--hard-clip] [--both-ends] [--strand] [--clipped] [--fail] [--filter-len INT] [--fail-len INT] [--no-excluded] [--rejects-file rejects.file] [--original] [--keep-tag] [--no-PG] [-u] -b bed.file in.file
Clip reads in a SAM-compatible file based on data from a BED file. By default the reads are soft clipped and clipping is only done from the 5' end.
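To make the default behaviour concrete, the following is a minimal sketch of a 5' soft clip on a forward-strand read. It is not the samtools implementation. The helper names (parse_cigar, soft_clip_left) and the single clip_to boundary are assumptions made for illustration, and coordinates are taken to be 0-based.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def soft_clip_left(pos, cigar, clip_to):
    """Soft clip the left end of an alignment up to reference position clip_to.

    Hypothetical helper: pos and clip_to are 0-based reference coordinates.
    Returns (new_pos, new_cigar).
    """
    ops = parse_cigar(cigar)
    # Preserve a leading hard clip unchanged.
    hard = [ops.pop(0)] if ops and ops[0][1] == "H" else []
    ref = pos      # reference position of the next unprocessed operation
    clipped = 0    # read bases moved into the soft clip
    while ops:
        n, op = ops[0]
        if op == "H":
            break  # trailing hard clip: no aligned bases remain
        if op in "M=X":
            take = min(n, max(0, clip_to - ref))
            if take < n:
                # Boundary falls inside (or before) this operation: split it.
                ops[0] = (n - take, op)
                ref += take
                clipped += take
                break
            ref += n
            clipped += n
        elif op in "SI":
            clipped += n   # read-only bases join the soft clip
        elif op in "DN":
            ref += n       # reference-only operations just advance POS
        ops.pop(0)
    new_ops = hard + ([(clipped, "S")] if clipped else []) + ops
    return ref, "".join(f"{n}{op}" for n, op in new_ops)


print(soft_clip_left(100, "50M", 120))  # -> (120, '20S30M')
```

The returned position differs from the input position whenever bases are clipped, which is why coordinate-sorted input can end up out of order after clipping.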
Some things to be aware of. While ordering of the input is not significant, adjustments to the leftmost mapping position (POS) mean that coordinate-sorted files will need re-sorting. In such cases the sorting order in the header is set to unknown. Clipping of reads results in the template length (TLEN) being incorrect. This can be corrected by samtools fixmate. Any MD and NM aux tags will also be incorrect, which can be fixed by samtools calmd. By default MD and NM tags are removed, though if the output is in CRAM format these tags will be automatically regenerated.
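One possible order for the repair steps mentioned above is sketched here as a list of command lines. The function name repair_commands, the file names and the intermediate-file layout are assumptions for illustration. The sketch only builds the commands and does not run them. It relies on samtools fixmate needing name-grouped input, so a name sort comes first, and on samtools calmd needing the reference FASTA.

```python
import shlex


def repair_commands(clipped_bam, reference_fasta, prefix):
    """Build (but do not run) commands to repair a clipped file.

    Hypothetical helper: all file names are placeholders.
    """
    name_sorted = f"{prefix}.namesorted.bam"
    fixed = f"{prefix}.fixmate.bam"
    coord_sorted = f"{prefix}.sorted.bam"
    return [
        # fixmate expects reads grouped by name
        ["samtools", "sort", "-n", "-o", name_sorted, clipped_bam],
        # recompute mate information, including TLEN
        ["samtools", "fixmate", name_sorted, fixed],
        # restore coordinate order after POS adjustments
        ["samtools", "sort", "-o", coord_sorted, fixed],
        # regenerate MD and NM; calmd writes to stdout, so the caller
        # must redirect the output to a file
        ["samtools", "calmd", "-b", coord_sorted, reference_fasta],
    ]


for cmd in repair_commands("clipped.bam", "ref.fa", "sample"):
    print(shlex.join(cmd))
```

The calmd step is only needed when MD and NM tags are wanted in non-CRAM output, since they are removed by default.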
-b bed.file
    BED file of amplicons to be removed.
-o out.file
    Output file name (defaults to stdout).
-f stat.file
    File to write stats to (defaults to stderr).
-u
    Output uncompressed SAM, BAM or CRAM.
--soft-clip
    Soft clip reads (default).
--hard-clip
    Hard clip reads.
--both-ends
    Clip at both ends as opposed to just the 5' end.
--strand
    Use strand entry from the BED file.
--clipped
    Only output clipped reads. Filter all others.
--fail
    Mark unclipped reads as QC fail.
--filter-len INT
    Filter out reads of INT size or shorter. In this case soft clips are not counted toward read length. An INT of 0 will filter out reads with no matching bases.
--fail-len INT
    As --filter-len but mark as QC fail rather than filter out.
--no-excluded
    Filter out any reads that are marked as QCFAIL or are unmapped. This works on the state of the reads before clipping takes place.
--rejects-file rejects.file
    Write any filtered reads out to a file.
--original
    Add an OA tag with the original data for clipped reads.
--keep-tag
    In clipped reads, keep the possibly invalid NM and MD tags. By default these tags are deleted.
--no-PG
    Do not add a PG line to the header.
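The length test behind --filter-len and --fail-len can be sketched as follows. This illustrates the rule as described above and is not the samtools code. The function names are hypothetical, and counting inserted bases toward the read length is an assumption.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def unclipped_read_length(cigar):
    """Count read bases that are not soft clipped.

    Assumption: M, I, = and X operations count toward read length,
    while S (soft clip) and H (hard clip) do not.
    """
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MI=X")


def too_short(cigar, limit):
    """True if a read is of 'limit' size or shorter, ignoring soft clips."""
    return unclipped_read_length(cigar) <= limit


print(unclipped_read_length("20S30M"))  # -> 30
print(too_short("20S30M", 30))          # -> True
print(too_short("50S", 0))              # -> True
```

With a limit of 0, only reads left with no unclipped bases, such as a fully soft-clipped read, meet the condition.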
Written by Andrew Whitwham and Rob Davies, both from the Sanger Institute.
Samtools website: <http://www.htslib.org/>
Copyright © 2023 Genome Research Limited (reg no. 2742969) is a charity registered in England with number 1021457.