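I have a TCPDUMP file that contains many uses of the words USER and PASS, and I need to write a regex that finds all of them and then prints the count of each. (Any other approach is fine too, though regex was my first choice.) I think my split isn't working. I'm not sure what I'm doing wrong here, so any ideas? Thanks in advance!
Here is a sample of the input file (note: this is only the first line of a 2006-line file. The format is the same throughout, but the numbers, symbols, and letters change on every line):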
22:28:28.374595 IP 98.114.205.102.1821 > 192.150.11.111.445: Flags [S], seq 147554406, win 64240, options [mss 1460,nop,nop,sackOK], length 0E...<[email protected].... ...\.bfP....Y..echo open 0.0.0.0 8884 > USER 1 1 >>
Code:
#!/usr/bin/perl -w
use strict;
use warnings;
use diagnostics;
#opens txt file: read mode
open MYFILE, '<', 'source_file.txt' or die $!;
#opens output txt file: write mode
open OUT, '>', 'Summary_Report.txt' or die $!;
#open output txt file: write mode
#used to store header 'split' info
open OUTFILE, '>', 'Header.txt' or die $!;
my $start_time = undef;
my $end_time;
my $linenum = 0;
my $user;
my $pass;
while (<MYFILE>) {
chomp;
$linenum++;
#print ": $_\n"; ###if I need to see the lines (check)###
#separate pieces of information from TCPDUMP into list
my @header = split (' ',$_);
print OUTFILE "$linenum: @header\n\n";
if (/^22:28/ && !defined($start_time)) {
$start_time = $header[0];
#print "$start_time\n"; ###used as a check###
}
if ($_ = /22:28/) {
$end_time = $header[0];
}
if ($_ =~ m/USER/i) {
$user = $header[10];
}
}
print OUT "Total # of times phrases were used:\n\n
USER (variations thereof) = $user\n\n
PASS (variations thereof) = $pass\n\n\n";
my @lines = (<MYFILE>);
my @matches = grep { $_ =~ /(PASS|USER)/i } @lines;
should work?
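Since the question asks for a count of each word rather than just the matching lines, here is a minimal sketch of one way to tally them. It assumes "variations thereof" means case variations, and it counts every occurrence, including ones embedded in longer words; wrap the alternation in `\b` if whole words are wanted. The helper name `count_words` and the sample lines are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count case-insensitive occurrences of each word
# across a list of lines. Returns a hash keyed by the upper-cased word.
sub count_words {
    my ($lines, @words) = @_;
    my %count;
    $count{ uc $_ } = 0 for @words;
    my $alt = join '|', map { quotemeta($_) } @words;
    for my $line (@$lines) {
        # With /g in list context the match returns every captured hit on the line
        $count{ uc $_ }++ for $line =~ /($alt)/gi;
    }
    return %count;
}

# Made-up sample lines, loosely modeled on the dump line in the question.
my @sample = (
    'echo open 0.0.0.0 8884 > USER 1 1 >>',
    'user anonymous pass guest',
    'no credentials on this line',
);

my %count = count_words(\@sample, 'USER', 'PASS');
print "USER (variations thereof) = $count{USER}\n";
print "PASS (variations thereof) = $count{PASS}\n";
```

With the real file, the list passed in would be the lines read from `MYFILE` instead of `@sample`.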
With line numbers:
my @lines = (<MYFILE>);
my %results;
map {
if ($lines[$_] =~ /(pass|user)/i) {
$results{$_} = $lines[$_];
}
} 0..$#lines;
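%results will have the line number as the key and the line as the value (the keys are 0-based indexes into @lines). grep is the more concise option; both it and this map make a single pass over the lines, so each is O(n).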
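If 1-based line numbers are preferred (to match `$linenum` in the question), a possible alternative is to read line by line and use Perl's built-in `$.` variable instead of slurping the file. This is only a sketch: `matching_lines` is a hypothetical helper, and the input is a made-up string read through an in-memory filehandle so that no file is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect matching lines keyed by 1-based line number,
# reading one line at a time from the given filehandle.
sub matching_lines {
    my ($fh) = @_;
    my %results;
    while (my $line = <$fh>) {
        chomp $line;
        # $. holds the current line number of the handle last read from
        $results{$.} = $line if $line =~ /pass|user/i;
    }
    return %results;
}

# Made-up input, opened as an in-memory filehandle.
my $text = "USER 1 1\nnothing here\npass guest\n";
open my $fh, '<', \$text or die $!;
my %results = matching_lines($fh);
close $fh;

# Hash keys are unordered, so sort numerically before printing.
print "$_: $results{$_}\n" for sort { $a <=> $b } keys %results;
```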
Now...
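# %results is a hash, so look values up with $results{...};
# sort the keys numerically because hash order is not defined
for my $n (sort { $a <=> $b } keys %results) {
    my $line = $results{$n};
    #separate pieces of information from TCPDUMP into list
    my @header = split ' ', $line;
    print OUTFILE "$n: @header\n\n";
    if ($line =~ /^22:28/ && !defined($start_time)) {
        $start_time = $header[0];
        #print "$start_time\n"; ###used as a check###
    }
    # =~ matches against the line; a plain = would overwrite it
    if ($line =~ /22:28/) {
        $end_time = $header[0];
    }
    if ($line =~ m/USER/i) {
        $user = $header[10];
    }
}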
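One detail in the question's loop is worth flagging: `if ($_ = /22:28/)` uses assignment rather than the binding operator `=~`. It matches `/22:28/` against `$_` and then overwrites `$_` with the result (1, or an empty string on failure), so the `USER` test that follows never sees the original line. The sketch below isolates that behavior; the two sub names and the shortened sample line are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Buggy form: the match result is assigned back over the line.
sub check_with_assignment {
    my ($line) = @_;
    local $_ = $line;
    $_ = /22:28/;              # what `if ($_ = /22:28/)` evaluates: $_ becomes 1 or ''
    return /USER/i ? 1 : 0;    # tests the overwritten value, not the line
}

# Fixed form: =~ only tests the line and leaves $_ untouched.
sub check_with_binding {
    my ($line) = @_;
    local $_ = $line;
    my $in_window = $_ =~ /22:28/;
    return /USER/i ? 1 : 0;    # still sees the original line
}

# Made-up, shortened version of the sample line.
my $line = '22:28:28.374595 IP ... > USER 1 1 >>';
print "assignment finds USER: ", check_with_assignment($line), "\n";
print "binding finds USER:    ", check_with_binding($line), "\n";
```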