How do I delete a group of lines from a file?

Scavenger

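I am writing a ksh script that analyzes a log file and sends an email when it finds important messages. Some messages are purely informational, and I want to ignore those.

The log file has this format: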

2018-01-24.08.24.35.875675    some text

    more text
    more text
    more text
    more text

2018-01-24.08.24.37.164538    some text

    more text
    more text
    INF9999W        <-- informational text
    more text

2018-01-24.08.24.46.8602545    some text

    more text
    more text
    more text

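The timestamps act as message separators, and each timestamp belongs to the message that follows it. I want to search the file for every occurrence of the informational text and then delete the whole message from the file (from the preceding timestamp up to, but not including, the next timestamp).

How can I easily determine the line numbers of the preceding and following timestamps, so that I can delete those lines with something like: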

awk -v from="$preceding_ts" -v to="$following_ts" 'NR < from || NR >= to'

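My approach is to put the line numbers of all timestamp lines into a file and then loop through that file until I find the timestamp lines just before and after the line number of the informational text. That seems like a lot of work, especially for large files. Is there a more efficient way?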

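integer inf_line
integer last_ts_line
integer cur_ts
cp "$error_log" "$copy_log"
while true
do
   # Line number of the first remaining informational marker (0 if none is left)
   inf_line=$(grep -n "INF99999W" "$copy_log" | head -1 | cut -f1 -d":")
   if [[ $inf_line -eq 0 ]]
   then
      break
   fi
   # Line numbers of all timestamp lines
   grep -n -E "^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]" "$copy_log" | cut -f1 -d":" > "$ts_lines"
   last_ts_line=99999999
   while read cur_ts
   do
      # The marker lies between the previous timestamp and this one: drop that message
      if [[ $cur_ts -gt $inf_line && $last_ts_line -lt $inf_line ]]
      then
         awk -v from="$last_ts_line" -v to="$cur_ts" 'NR < from || NR >= to' "$copy_log" > "$temp_log"
         cp "$temp_log" "$copy_log"
         last_ts_line=$cur_ts
         break
      fi
      last_ts_line=$cur_ts
   done < "$ts_lines"
   # The marker is in the last message: drop everything from its timestamp onward
   if [[ $last_ts_line -lt $inf_line ]]
   then
      awk -v from="$last_ts_line" 'NR < from' "$copy_log" > "$temp_log"
      cp "$temp_log" "$copy_log"
   fi
done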

Thanks.

ilkkachu

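I would handle this by storing the lines of the current message and, when the message ends, printing the stored batch only if no INF marker was seen. Here, d holds the lines of the current message (d for data), and p tells us whether the stored lines should be printed.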

awk -vinfo='INF99+' \
    '/^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]/ {
         if (p) printf "%s", d; d = $0 ORS; p=1; next } 
     $0 ~ info {p=0} 
     {d = d $0 ORS} 
     END {if (p) printf "%s", d}' < log

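The first rule matches timestamp lines: it prints any stored lines if p is true, stores the current line, and sets p to one. The second rule resets p to zero if a line matching the info pattern is seen; the pattern is passed in as a variable with -vinfo=.... The third rule appends the current line to the collected ones, and the END rule prints the collected lines one last time, again only if p is set.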
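As a rough check of the rules described above, the following sketch runs the same program on a three-message sample shaped like the one in the question. The file name sample.log and the shortened message bodies are made up for illustration; only the middle message carries the INF marker.

```shell
# Build a small sample log (hypothetical file name and contents).
cat > sample.log <<'EOF'
2018-01-24.08.24.35.875675    first message
    more text
2018-01-24.08.24.37.164538    second message
    INF9999W        informational text
    more text
2018-01-24.08.24.46.860254    third message
    more text
EOF

# Buffer each message in d; print it at the next timestamp (or at END)
# only if no line matching info was seen, i.e. p is still 1.
filtered=$(awk -v info='INF99+' \
    '/^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]/ {
         if (p) printf "%s", d; d = $0 ORS; p = 1; next }
     $0 ~ info { p = 0 }
     { d = d $0 ORS }
     END { if (p) printf "%s", d }' < sample.log)

printf '%s\n' "$filtered"
```

With this input, the first and third messages should come out unchanged and the second one should be dropped in full, including its timestamp line.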

We could also write it like this, which checks the info pattern on the timestamp lines as well:

awk -vinfo='INF99+' \
    '/^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]/ {
         if (p) { printf "%s", d }; d = ""; p=1; } 
     $0 ~ info {p=0} 
     {d = d $0 ORS} 
     END {if (p) printf "%s", d}' < log
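The difference between the two variants only shows up when the marker sits on a timestamp line itself. The sketch below feeds both programs such an input and counts how many messages survive; the file name marker_on_ts.log, its contents, and the variable names kept_v1 and kept_v2 are invented for the comparison.

```shell
# Hypothetical input: the INF marker is on the first timestamp line.
cat > marker_on_ts.log <<'EOF'
2018-01-24.08.24.35.875675    INF9999W on the timestamp line
    more text
2018-01-24.08.24.37.164538    some text
    more text
EOF

# Variant 1: "next" means timestamp lines never reach the info check.
kept_v1=$(awk -v info='INF99+' \
    '/^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]/ {
         if (p) printf "%s", d; d = $0 ORS; p = 1; next }
     $0 ~ info { p = 0 }
     { d = d $0 ORS }
     END { if (p) printf "%s", d }' < marker_on_ts.log | grep -c '^20')

# Variant 2: timestamp lines fall through to the info check.
kept_v2=$(awk -v info='INF99+' \
    '/^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]/ {
         if (p) { printf "%s", d }; d = ""; p = 1; }
     $0 ~ info { p = 0 }
     { d = d $0 ORS }
     END { if (p) printf "%s", d }' < marker_on_ts.log | grep -c '^20')

echo "variant 1 keeps $kept_v1 message(s), variant 2 keeps $kept_v2"
```

The first variant keeps both messages here, while the second drops the one whose timestamp line matches the pattern.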

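In general, writing something like this in awk or Perl is probably a good idea. The result should at least be much faster than a shell script that forks off dozens of copies of grep, awk, cut, and so on.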
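To tie this back to the script in the question, the whole loop could be replaced by a single awk run that writes the filtered copy. This is only a sketch: the function name drop_info_messages is hypothetical, the two paths are placeholders for the question's $error_log and $copy_log, and the tiny log is made up.

```shell
# Hypothetical helper: print the log given as $2, leaving out every
# message that contains a line matching the pattern given as $1.
drop_info_messages() {
    awk -v info="$1" \
        '/^20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]/ {
             if (p) printf "%s", d; d = $0 ORS; p = 1; next }
         $0 ~ info { p = 0 }
         { d = d $0 ORS }
         END { if (p) printf "%s", d }' < "$2"
}

# Placeholder paths standing in for $error_log and $copy_log.
error_log=./error.log
copy_log=./error.filtered.log

# Tiny made-up log: one informational message, one that should be kept.
printf '%s\n' \
    '2018-01-24.08.24.35.875675    some text' \
    '    INF9999W' \
    '2018-01-24.08.24.37.164538    some text' \
    '    more text' > "$error_log"

# One awk process instead of repeated grep/cut/awk passes over the file.
drop_info_messages 'INF99+' "$error_log" > "$copy_log"

# Anything left in the copy is a candidate for the email.
if [ -s "$copy_log" ]
then
    status="messages to report"
else
    status="nothing to report"
fi
echo "$status"
```

Writing to a separate file rather than back onto the input matters, because redirecting awk's output to the file it is still reading would truncate that file before it has been read.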
