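Following up on my previous question:
I have multiple text files that may or may not contain repeated groups of text enclosed by dashed lines. None of the lorem ipsum text should appear in the output.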
$ cat /tmp/testAwk/file1.txt
--------------
important text one
important text two
--------------
Lorem ipsum dolor sit amet
consectetur adipiscing elit
--------------
important text three
important text four
--------------
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua
Ut enim ad minim veniam
quis nostrud exercitation ullamco laboris nisi ut aliquip
ex ea commodo consequat
$ cat /tmp/testAwk/file2.txt
Duis aute irure dolor in reprehenderit
--------------
important text one
important text two
--------------
in voluptate velit esse cillum dolore
eu fugiat nulla pariatur
non proident, sunt
--------------
important text three
important text four
--------------
Excepteur sint occaecat cupidatat
$ cat /tmp/testAwk/file3.txt
consequuntur magni dolores
sed quia non numquam
Quis autem vel eum iure reprehenderit
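I am trying to use awk to capture the text between two -------------- lines and to print the names of the files that match the pattern.
I took the excellent reply from @Ed Morton to my previous question: https://stackoverflow.com/a/55507707/257233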
awk '{x=sub(/^-+$/,"")} f; x{f=!f}' *.txt
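For readers unfamiliar with the one-liner, here is how I read it, spelled out one step per line. This is only a sketch: the function name extract_blocks and the variable names isDelim and inBlock are mine (the original uses x and f), and the logic is meant to be identical.

```shell
# Expanded form of the one-liner above; extract_blocks is a hypothetical name.
extract_blocks() {
    awk '
        # sub() returns 1 when the whole line is dashes (and blanks it), else 0
        { isDelim = sub(/^-+$/, "") }
        # a bare pattern prints the current line while we are inside a block
        inBlock
        # toggle the state only after the print decision has been made
        isDelim { inBlock = !inBlock }
    ' "$@"
}
```

Called as extract_blocks /tmp/testAwk/*.txt it should behave like the original command. Because the toggle runs after the print, the opening delimiter is skipped while the closing one is printed as an empty line, which is what separates the blocks in the output.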
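I tried to adapt it to print the name of each file that matches the pattern and to indent the results. I could not work out how to do the whole job in awk, so I ended up with some grep and sed in there.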
$ awk 'FNR==1{print FILENAME} {x=sub(/^-+$/,"---")} f; x{f=!f}' $(grep -E '^-+$' /tmp/testAwk/*.txt -l) | sed -re 's/^([^\/])/ \1/'
/tmp/testAwk/file1.txt
 important text one
 important text two
 ---
 important text three
 important text four
 ---
/tmp/testAwk/file2.txt
 important text one
 important text two
 ---
 important text three
 important text four
 ---
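Can I do the above using only awk?
Here is how I would do it, especially since your use case seems to be evolving to need more functionality, so cramming it into a brief one-liner is not the best approach: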
$ cat tst.awk
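FNR==1 { delimCnt=inBlock=block="" }    # reset state at the start of each file
/^-+$/ {
    # odd delimiter count: a block just opened; even: it just closed
    inBlock = (++delimCnt % 2)
    if ( !inBlock ) {
        if (delimCnt > 1) {
            if (delimCnt == 2) {
                print FILENAME          # first completed block: print the file name once
            }
            print block " ---"
        }
        block = ""
    }
    next
}
inBlock { block = block " " $0 ORS }    # collect indented lines while inside a block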
$ awk -f tst.awk file1.txt file2.txt file3.txt
file1.txt
 important text one
 important text two
 ---
 important text three
 important text four
 ---
file2.txt
 important text one
 important text two
 ---
 important text three
 important text four
 ---