I'm trying to extract the non-overlapping intervals from two files that each contain (unique) intervals. For example:
file1.txt
Start End
1 3
5 9
13 24
34 57
file2.txt
Start End
6 7
10 12
16 28
45 68
Expected result: an array holding the intervals whose elements are present in only one of the files:
1-3 , 10-12
That's all... Thanks a lot!
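Process the two files line by line, as in a merge. If the current intervals do not overlap, report the one that starts earlier and advance its file. If they overlap, report neither and advance past the overlapping intervals.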
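#!/usr/bin/perl
use warnings;
use strict;

my @F;
open $F[0], '<', 'file1.txt' or die "file1.txt: $!";
open $F[1], '<', 'file2.txt' or die "file2.txt: $!";

# Skip headers.
readline $_ for @F;

my @boundaries;   # current [start, end] per file; [] once a file is exhausted
my @overlapped;   # true if the current interval of a file already overlapped one
my @results;

# Read the next interval from file $x; at end of file the interval becomes [].
sub advance {
    my ($x) = @_;
    my $line = readline $F[$x];
    $boundaries[$x] = [ split ' ', defined $line ? $line : '' ];
    $overlapped[$x] = 0;
}

# Interval $x ends before interval $y starts (or file $y is exhausted):
# report it unless it overlapped an earlier interval, then advance file $x.
sub earlier {
    my ($x, $y) = @_;
    return 0 unless @{ $boundaries[$x] };
    if (! @{ $boundaries[$y] }
        or $boundaries[$x][1] < $boundaries[$y][0]
    ) {
        push @results, $boundaries[$x] unless $overlapped[$x];
        advance($x);
        return 1
    }
    return 0
}

# Only called when the two intervals overlap. If $x ends first, drop it and
# remember that $y has been overlapped, so $y is not reported later.
sub overlap {
    my ($x, $y) = @_;
    if ($boundaries[$x][1] < $boundaries[$y][1]) {
        advance($x);
        $overlapped[$y] = 1;
        return 1
    }
    return 0
}

sub advance_both {
    advance($_) for 0, 1;
}

# init.
advance_both();
while (@{ $boundaries[0] } or @{ $boundaries[1] }) {
    earlier(0, 1)
        or earlier(1, 0)
        or overlap(0, 1)
        or overlap(1, 0)
        or advance_both();   # overlapping intervals with the same end
}

print join(' , ', map { join '-', @$_ } @results), "\n";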
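
As a cross-check for the streaming merge above, the same result can be computed by brute force. The following is only a minimal sketch under a few assumptions: both interval lists fit in memory, the intervals are closed (touching endpoints count as an overlap), and the helper names `overlaps` and `unmatched` are made up for illustration. The sample data is copied from the question instead of being read from the files.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Two closed intervals overlap when each one starts
# no later than the other one ends. (Hypothetical helper.)
sub overlaps {
    my ($p, $q) = @_;
    return $p->[0] <= $q->[1] && $q->[0] <= $p->[1];
}

# Return the intervals of @$mine that overlap nothing in @$other.
# (Hypothetical helper; compares every pair, so it is quadratic.)
sub unmatched {
    my ($mine, $other) = @_;
    return grep {
        my $interval = $_;
        ! grep { overlaps($interval, $_) } @$other
    } @$mine;
}

# Sample data copied from file1.txt and file2.txt in the question.
my @file1 = ([1, 3], [5, 9], [13, 24], [34, 57]);
my @file2 = ([6, 7], [10, 12], [16, 28], [45, 68]);

# Collect the survivors from both sides and order them by start.
my @results = sort { $a->[0] <=> $b->[0] }
    unmatched(\@file1, \@file2), unmatched(\@file2, \@file1);

print join(' , ', map { join '-', @$_ } @results), "\n";
```

For the sample data this prints `1-3 , 10-12`, matching the expected result. Because every interval is compared with every interval of the other file, this version is best kept for small inputs or for checking the line-by-line merge.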