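I have a CSV file that is consumed by a destination system. The sixth field contains words, but the destination limits that field to 16 characters. If the field is longer than 16 characters, I want to duplicate the row and split the field across the copies without breaking any word.
Current file: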
"5","4","3","2","1","XYZ ABCD E"
"1","2","3","4","5","AB CDE F GHI JK LMNOP Q RS TUV W XYZ 12 3456 7890"
"9","8","7","6","5","LMN O PQ R"
Desired output:
"5","4","3","2","1","XYZ ABCD E"
"1","2","3","4","5","AB CDE F GHI JK"
"1","2","3","4","5","LMNOP Q RS TUV W"
"1","2","3","4","5","XYZ 12 3456 7890"
"9","8","7","6","5","LMN O PQ R"
With GNU awk (gawk), you can run fold as a coprocess and read its output back into a variable with getline:
gawk -F, '
  BEGIN {
    OFS = FS;
    cmd = "fold -sw 16";
  }
  # if total length (16 + 2 for quotes) is within limit, print as-is
  length($NF) <= 18 { print; next }
  # else
  {
    # trim the quotes, then fold
    print substr($NF, 2, length($NF) - 2) |& cmd;
    close(cmd, "to");
    NF--;
    while ((cmd |& getline var) > 0) {
      # (optional) trim trailing whitespace
      sub(/[ \t]+$/, "", var);
      print $0, "\"" var "\"";
    }
    close(cmd, "from");
  }
' file.csv
The sub call removes the trailing whitespace that fold leaves at the end of each folded line.
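Note that to get the exact output shown, you would need to use fold -sw17, so that each line breaks at 16 characters plus the trailing space (which is subsequently removed). However, doing so may allow the last line of the folded output to exceed the 16-character limit.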
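If the 16-character limit has to hold for every output line, one alternative is to do the word wrapping in awk itself instead of calling fold. The following is a minimal sketch, not part of the original answer. The shell function name wrap_field6 and the awk variable max are hypothetical names chosen for illustration. It assumes that fields contain no embedded commas, that every record has at least two fields, and that no single word is longer than the limit (such a word would be printed unbroken on a line of its own).

```shell
# Hypothetical helper: reads CSV on stdin, writes the wrapped CSV to stdout.
wrap_field6() {
  awk -F, -v max=16 '
    BEGIN { OFS = FS }
    {
      # Strip the surrounding quotes from the last field.
      text = substr($NF, 2, length($NF) - 2)

      # Short enough already: print the record unchanged.
      if (length(text) <= max) { print; next }

      # Everything before the last field, without the separating comma.
      prefix = substr($0, 1, length($0) - length($NF) - 1)

      # Greedy wrap: append words while the line stays within max characters.
      n = split(text, words, " ")
      line = ""
      for (i = 1; i <= n; i++) {
        if (line == "") {
          line = words[i]
        } else if (length(line) + 1 + length(words[i]) <= max) {
          line = line " " words[i]
        } else {
          print prefix, "\"" line "\""
          line = words[i]
        }
      }
      if (line != "") print prefix, "\"" line "\""
    }
  '
}
```

Call it as wrap_field6 < file.csv. Because a word is appended only when the result still fits within max characters, no output line exceeds the limit under the assumptions above, and no trailing whitespace needs to be trimmed afterwards. On the sample file it reproduces the desired output shown earlier.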