How do I find and replace a pattern like "West" (that is, the stray quotes) in the following sample line from a CSV file?
"LastName","FirstName","","","890","","6G","","S "West" AVENUE","","CITY","ZIP"
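You can't read this with the CSV class because it's a malformed CSV string: the quotes around West sit inside an already-quoted field without being escaped. Sometimes this happens because whoever generated the file didn't know what they were doing: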
require 'csv'
foo = '"LastName","FirstName","","","890","","6G","","S "West" AVENUE","","CITY","ZIP"'
arr_of_arrs = CSV.parse(foo)
That results in an exception:
Missing or stray quote in line 1 (CSV::MalformedCSVError)
Instead, to work around this, you have to fix the data first and then parse it. Here's a starting point:
/(?<=\s)("[^"]+")(?=\s)/
http://rubular.com/r/sWEkx07Zyo
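The pattern looks for text between double quotes that is preceded and followed by whitespace. The whitespace is checked by a lookbehind and a lookahead, so it is not part of the match.
Here's some code that works for this particular example: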
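To see what the pattern actually picks up, here is a minimal sketch that runs it against the sample line and inspects the characters on either side of the match. The variable names are only for illustration.

```ruby
# Minimal sketch: inspect what the pattern matches in the sample line.
line = '"LastName","FirstName","","","890","","6G","","S "West" AVENUE","","CITY","ZIP"'
pattern = /(?<=\s)("[^"]+")(?=\s)/

match = line.match(pattern)

matched_text = match[0]             # the quoted word, quotes included
preceding    = match.pre_match[-1]  # character just before the match
following    = match.post_match[0]  # character just after the match

puts matched_text       # "West"
puts preceding.inspect  # " "
puts following.inspect  # " "
```

The surrounding spaces stay in pre_match and post_match, which is why replacing the match leaves the spacing of the field untouched.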
foo = '"LastName","FirstName","","","890","","6G","","S "West" AVENUE","","CITY","ZIP"'
REGEX = /(?<=\s)("[^"]+")(?=\s)/
word = foo[REGEX]
foo[REGEX] = word[1..-2]
puts foo
# >> "LastName","FirstName","","","890","","6G","","S West AVENUE","","CITY","ZIP"
At this point CSV can parse it:
require 'csv'
arr_of_arrs = CSV.parse(foo)
# => [["LastName",
# "FirstName",
# "",
# "",
# "890",
# "",
# "6G",
# "",
# "S West AVENUE",
# "",
# "CITY",
# "ZIP"]]
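The code above fixes a single occurrence in a single line. For a file with several lines, or several quoted words per line, a gsub-based variant may be easier to apply. This is only a sketch: strip_inner_quotes and the sample data are hypothetical, and it still assumes that every stray quoted word has whitespace on both sides.

```ruby
require 'csv'

# Same idea as the pattern above, but the capture group holds only the
# text between the quotes so it can be reused in the replacement.
INNER_QUOTED = /(?<=\s)"([^"]+)"(?=\s)/

# Hypothetical helper: drop the quotes around every whitespace-delimited
# quoted word in one line of text.
def strip_inner_quotes(line)
  line.gsub(INNER_QUOTED, '\1')
end

# Made-up sample data, two malformed lines.
raw = <<~DATA
  "Doe","Jane","S "West" AVENUE","CITY"
  "Roe","Rick","N "East" and "Main" STREET","TOWN"
DATA

# Repair each line on its own so a match can never span two records.
rows = raw.each_line.map { |line| CSV.parse_line(strip_inner_quotes(line)) }

rows.each { |row| p row }
```

A quoted word at the very start or end of a field would not be caught by this pattern, so real data may need a different rule.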
These lines may be confusing:
word = foo[REGEX]
foo[REGEX] = word[1..-2]
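foo[...] is part of the String class (String#[] for reading and String#[]= for assigning) and is a convenient way to find and replace text in a string.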
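As a small illustration of those two methods, here is a sketch on a made-up string. When the pattern does not match, String#[] returns nil and String#[]= raises IndexError, so the last lines show sub with a block as a variant that leaves the string alone in that case.

```ruby
# Made-up sample string; .dup keeps it mutable even under frozen string literals.
text = 'hello "big" world'.dup
quoted = /"[^"]+"/

found = text[quoted]          # String#[] returns the first match: "\"big\""
text[quoted] = found[1..-2]   # String#[]= replaces that match in place
puts text                     # hello big world

# When nothing matches, text[quoted] is nil, so found[1..-2] would fail.
# sub with a block simply returns an unchanged copy instead.
unchanged = 'no quotes here'.sub(quoted) { |m| m[1..-2] }
puts unchanged                # no quotes here
```

found[1..-2] slices from the second character to the second-to-last one, which is what removes the surrounding quotes.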
The CSV parser can be made to accept embedded quotes, so if throwing them away is too drastic, you can do this instead:
word = foo[REGEX]
foo[REGEX] = '"%s"' % word
require 'csv'
arr_of_arrs = CSV.parse(foo)
# => [["LastName",
# "FirstName",
# "",
# "",
# "890",
# "",
# "6G",
# "",
# "S \"West\" AVENUE",
# "",
# "CITY",
# "ZIP"]]
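That simply follows the rules of the CSV specification: wrapping the quoted word in a second pair of double quotes doubles each embedded quote, which is how CSV escapes a quote inside a quoted field.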
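The doubling rule can be checked in both directions with the CSV library itself. The sketch below uses a shortened, made-up two-field line: it doubles the stray quotes with gsub, parses the result, and then lets CSV.generate_line write the row back out, which should produce the same doubled quotes.

```ruby
require 'csv'

# Shortened, made-up version of the malformed line.
malformed = '"S "West" AVENUE","CITY"'

# Wrap each whitespace-delimited quoted word in one more pair of quotes,
# turning "West" into ""West"".
fixed = malformed.gsub(/(?<=\s)"[^"]+"(?=\s)/) { |quoted| %Q("#{quoted}") }
puts fixed                   # "S ""West"" AVENUE","CITY"

parsed = CSV.parse_line(fixed)
p parsed                     # ["S \"West\" AVENUE", "CITY"]

# Writing the row back out: CSV quotes the field that needs it and
# doubles the embedded quotes on its own.
regenerated = CSV.generate_line(parsed)
puts regenerated             # "S ""West"" AVENUE",CITY
```

With the default options generate_line quotes only the fields that need it, which is why CITY comes back unquoted.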