Assume I have a gzip-compressed tarball, compressedArchive.tgz (100+ files, totaling 5+ GB).
What would be the fastest way to remove all entries matching a given filename pattern, for example prefix*.jpg, and then store what remains in a gzipped tarball again?
Whether the old archive is replaced or a new one is created is not important; whichever is fastest.
With GNU tar, you can do:
pigz -d < file.tgz |
tar --delete --wildcards -f - '*/prefix*.jpg' |
pigz > newfile.tgz
With bsdtar:
pigz -d < file.tgz |
bsdtar -cf - --exclude='*/prefix*.jpg' @- |
pigz > newfile.tgz
(pigz being the multi-threaded version of gzip.)
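Before discarding the original, it may be worth checking that no matching member survived in the new archive. A minimal sketch, assuming member names contain no newlines; `list_matching` is a hypothetical helper name, and plain `gzip -d` is used here only because it is available everywhere (`pigz -d` can be swapped in):

```shell
# Hypothetical helper: list_matching ARCHIVE PATTERN
# Prints every member of a gzip-compressed tar archive whose path matches
# the shell glob PATTERN. As in a case pattern, "*" also matches "/".
list_matching() {
    archive=$1
    pattern=$2
    gzip -d < "$archive" | tar -tf - | while IFS= read -r member; do
        case $member in
            ($pattern) printf '%s\n' "$member" ;;
        esac
    done
}
```

For example, `list_matching newfile.tgz '*/prefix*.jpg'` should print nothing if the removal worked.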
You could also overwrite the file in place, like:
{ pigz -d < file.tgz |
tar --delete --wildcards -f - '*/prefix*.jpg' |
pigz &&
perl -e 'truncate STDOUT, tell STDOUT'
} 1<> file.tgz
But that's quite risky, especially if the result ends up being less compressed than the original file (in which case the second pigz may end up overwriting areas of the file which the first one has not read yet).
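If disk space allows a second copy, a less risky way to replace the archive is to write to a temporary file beside it and rename only on success. This is a sketch under assumptions, not a tested recipe for every system: it relies on GNU tar, `strip_members` and `compressor` are hypothetical names, and it falls back to gzip when pigz is not installed.

```shell
# Use pigz when available, otherwise gzip (assumption: one of them is installed).
compressor=$(command -v pigz || command -v gzip)

# Enable pipefail where the shell supports it (bash, ksh93, zsh), so that a
# failure in any pipeline stage is detected, not only in the last one.
if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi

# Hypothetical helper: strip_members ARCHIVE PATTERN
strip_members() {
    archive=$1
    pattern=$2
    # Keep the temporary file next to the archive so that the final mv is
    # a rename within one filesystem.
    tmp=$(mktemp "$archive.XXXXXX") || return 1
    if "$compressor" -d < "$archive" |
        tar --delete --wildcards -f - "$pattern" |
        "$compressor" > "$tmp"
    then
        mv -f -- "$tmp" "$archive"
    else
        # Something failed: leave the original archive untouched.
        rm -f -- "$tmp"
        return 1
    fi
}
```

Usage would be `strip_members file.tgz '*/prefix*.jpg'`. Note that mktemp creates the file with restrictive permissions, so a chmod may be needed afterwards, and that GNU tar should exit with an error when the pattern matches nothing, in which case (with pipefail active) the original is kept.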