How do I uniq on only part of a file name?

Nortine Lotfi

So, I have a list of files that I renamed to filename:hash.

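What I want to do is match on the hash only, while keeping the filename:hash combination intact and without computing the hash again, since the files have not changed.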

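While doing this, I need to either move or delete the duplicates. However, the filename part keeps the names from being "uniq" enough for that tool, so this will not work if I simply pipe the names through it.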

Is there any way to do this without using anything other than POSIX tools (awk, bash and so on), and without using a list or database file?

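Details: no, this is technically not a duplicate of this post, and yes, the end goal is technically the same (that is, deleting or moving duplicates, using the method and situation I already described in another post, or here).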

Kusalananda

Using bash (which is not actually a POSIX tool, but since you explicitly mention it):

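#!/bin/bash

# All names in the current directory that contain a colon.
names=( *:* )

# Strip everything up to the last colon, then count each hash.
printf '%s\n' "${names[@]##*:}" | sort | uniq -c |
while read -r count hash; do
    # A count above 1 means the hash occurs in several names.
    if [[ $count -gt 1 ]]; then
        echo 'Would delete/move these:'
        printf '%s\n' *:"$hash"
    fi
done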

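This collects all names in the current directory that contain a : character into the array names. It assumes that the pattern *:* matches only the files we are interested in, and that no other files have names like that.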

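The expansion "${names[@]##*:}" produces a list of only the hashes (everything up to and including the last : is removed from each name), which we then sort and count using sort | uniq -c.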
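As a small illustration of why the expansion uses ## rather than #, here is a sketch with a made-up name that itself contains a colon (the name and the variable names are only for illustration). ## removes the longest prefix matching *:, so only the text after the last colon is left, whereas # would stop at the first colon.

```shell
#!/bin/sh
# Hypothetical file name whose "filename" part contains a colon itself.
name='my:file.txt:abc123'

# Longest matching prefix removed: leaves only the hash.
longest=${name##*:}

# Shortest matching prefix removed: leaves part of the file name too.
shortest=${name#*:}

printf '%s\n' "$longest"     # prints abc123
printf '%s\n' "$shortest"    # prints file.txt:abc123
```

The same expansion is what bash applies to every element when it is written with names[@].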

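The result is read into count and hash in a while read loop. If the count is greater than one, we know that the hash is duplicated, and the pattern *:"$hash" will then match all names that carry that hash.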
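To see what the loop receives, the following sketch feeds three made-up hashes through the same pipeline. uniq -c prefixes each distinct line with its count, and read splits that line into the two variables. The sample hashes and the dups variable are assumptions for the demonstration only.

```shell
#!/bin/sh
# Hypothetical hashes standing in for the real list of names.
dups=$(
    printf '%s\n' abc123 def456 abc123 | sort | uniq -c |
    while read -r count hash; do
        # Only hashes seen more than once are printed.
        if [ "$count" -gt 1 ]; then
            printf '%s\n' "$hash"
        fi
    done
)

printf 'duplicated: %s\n' "$dups"
```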

If you want to delete all files that have a duplicated hash, you can do that with

rm -f ./*:"$hash"

If you want to keep one of the files, you can do so with, for example,

dupnames=( ./*:"$hash" )
rm -f "${dupnames[@]:1}"

This sets the array dupnames to the matching names and deletes all but the first of them from the file system.
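Which file counts as "the first" depends on the order in which the shell expands the pattern, which is sorted by name. A minimal sketch in a throwaway directory, assuming three hypothetical files that share one hash and that mktemp -d is available:

```shell
#!/bin/bash
# Work in a temporary directory with hypothetical sample files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch 'a.txt:abc123' 'b.txt:abc123' 'c.txt:abc123'

hash=abc123
dupnames=( ./*:"$hash" )      # pattern expands in sorted order
rm -f "${dupnames[@]:1}"      # remove every element after index 0

# Whatever still matches is the file that was kept.
remaining=( ./*:"$hash" )
printf 'kept: %s\n' "${remaining[@]}"

# Clean up the temporary directory.
cd / && rm -rf "$tmpdir"
```

Here the name that sorts first, a.txt:abc123, is the one left behind, so if a particular copy should survive, the choice has to be made before calling rm.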

You may want to enable some debugging output first, and keep the rm disabled until you are confident that the script does the right thing:

#!/bin/bash

names=( *:* )

printf '%s\n' "${names[@]##*:}" | sort | uniq -c |
while read -r count hash; do
    if [[ $count -gt 1 ]]; then
        echo 'Would delete/move these:'
        dupnames=( ./*:"$hash" )
        # Dry run: echo prints the rm command instead of running it.
        echo rm -f "${dupnames[@]:1}"
    fi
done

A POSIX sh variant of the above:

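#!/bin/sh

# Print the hash part (the text after the last colon) of each name.
for name in *:*; do
    printf '%s\n' "${name##*:}"
done | sort | uniq -c |
while read -r count hash; do
    if [ "$count" -gt 1 ]; then
        echo 'Would delete/move these:'
        # Load the matching names into "$@", then drop the first one.
        set -- ./*:"$hash"
        shift
        echo rm -f "$@"
    fi
done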

A variation of the last version that replaces sort | uniq -c with awk:

#!/bin/sh

# Emit each hash; awk then prints only the hashes seen more than once.
for name in *:*; do
    printf '%s\n' "${name##*:}"
done |
awk '    { count[$0]++ }
     END { for (hash in count) if (count[hash] > 1) print hash }' |
while read -r hash; do
    echo 'Would delete/move these:'
    set -- ./*:"$hash"
    shift
    echo rm -f "$@"
done

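The awk snippet can also replace sort | uniq -c in the other pieces of code in this answer, but note that the final loop then no longer needs to test whether the count is greater than 1 and only reads the hash.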
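Since the question also mentions moving the files rather than deleting them, here is a sketch of how the awk variant might be adapted for that. It assumes a hypothetical destination directory called duplicates, made-up sample files in a temporary directory, and that mktemp -d is available. It is an illustration rather than a tested replacement for the scripts above.

```shell
#!/bin/sh
# Work in a temporary directory with hypothetical sample files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch 'a.txt:abc123' 'b.txt:abc123' 'c.txt:def456'

# Hypothetical destination; its name has no colon, so *:* skips it.
mkdir duplicates

for name in *:*; do
    printf '%s\n' "${name##*:}"
done |
awk '    { count[$0]++ }
     END { for (hash in count) if (count[hash] > 1) print hash }' |
while read -r hash; do
    # Keep the first matching name in place, move the rest.
    set -- ./*:"$hash"
    shift
    mv "$@" duplicates/
done

ls duplicates
```

As with rm, putting echo in front of mv first gives a dry run that only prints what would be moved.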
