I was gzipping files in a Slurm job, but the job was cancelled before it finished. Compression is slow (~10 min per file, and there are 200 files). How can I gzip only the remaining files that have not been compressed yet?
$ ls
SRR7121484_1.fastq
SRR7121484_1.fastq.gz
SRR7121484_2.fastq
SRR7121484_2.fastq.gz
SRR7121485_1.fastq
SRR7121485_2.fastq
SRR7121488_1.fastq
SRR7121488_2.fastq
....
As you can see, all the remaining files are numbered 7121485 or higher. I tried to extract this number and use conditionals, but with no success so far.
Thanks in advance!
You can just run gzip on all the fastq files directly. By default, it will ask whether you want to overwrite the existing files:
$ gzip -k *.fastq
SRR7121484_1.fastq.gz already exists -- do you wish to overwrite (y or n)?
If gzip can't read anything from standard input, it simply skips these files:
$ gzip -k *.fastq -v < /dev/null
gzip: SRR7121484_1.fastq.gz already exists -- skipping
gzip: SRR7121484_2.fastq.gz already exists -- skipping
SRR7121485_1.fastq: -99.9% -- replaced with SRR7121485_1.fastq.gz
SRR7121485_2.fastq: -99.9% -- replaced with SRR7121485_2.fastq.gz
SRR7121488_1.fastq: -99.9% -- replaced with SRR7121488_1.fastq.gz
SRR7121488_2.fastq: -99.9% -- replaced with SRR7121488_2.fastq.gz
So, just run:
gzip -k *.fastq < /dev/null
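One caveat: if the cancelled job was killed while gzip was still writing, a truncated .gz file may have been left behind, and the command above would skip it as "already exists". As a rough sketch (not something from the question), the loop below checks each existing archive with gzip -t and only skips files whose archive passes that test. The function name compress_missing and the "skip:"/"done:" messages are made up for illustration, and it assumes a gzip that supports -k (GNU gzip 1.6 or later, or a BSD gzip).

```shell
# Hypothetical helper: compress each .fastq in a directory unless an intact .gz already exists.
compress_missing() {
    dir=${1:-.}
    for f in "$dir"/*.fastq; do
        # If the glob matched nothing, "$f" is the literal pattern; skip it.
        [ -e "$f" ] || continue

        # gzip -t verifies the archive; a truncated or corrupt file fails the test.
        if [ -e "$f.gz" ] && gzip -t "$f.gz" 2>/dev/null; then
            echo "skip: $f"
            continue
        fi

        # -k keeps the original .fastq, -f overwrites a broken leftover .gz.
        gzip -kf "$f" && echo "done: $f"
    done
}

# Process the current directory.
compress_missing .
```

Testing every existing archive costs some time as well, since gzip -t has to read the whole file, so if you are sure the finished archives are intact, the one-liner above is the faster option.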