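I'm trying to identify the keywords to search for in a user's entry, so I thought of filtering out certain parts of speech to extract keywords for a database query. Currently, I use the code below to remove the word "of" from a string: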
let rawString = "I’m jealous of my parents. I’ll never have a kid as cool as theirs, one who is smart, has devilishly good looks, and knows all sorts of funny phrases."
var filtered = self.rawString.replacingOccurrences(of: "of", with: "")
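What I'd like to do now is extend this to remove all prepositions from the string.
My idea was to create a large list of known prepositions, for example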
let prepositions = ["in","through","after","under","beneath","before"......]
then split the string on spaces
var WordList : [String] = filtered.components(separatedBy: " ")
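and then loop over the word list, looking for preposition matches and removing them. Building that list by hand would be ugly, and the approach is probably inefficient.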
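The word-list approach described above could be sketched roughly as follows. This is only an illustration: `samplePrepositions` and `removingListedWords(from:prepositions:)` are hypothetical names, the list is deliberately incomplete, and storing it in a `Set` is an assumption made so that each lookup is constant time instead of a scan over an array.

```swift
import Foundation

// Hypothetical, deliberately incomplete sample list; a real list would be much longer.
let samplePrepositions: Set<String> = ["of", "in", "through", "after", "under", "beneath", "before"]

// Splits on spaces, drops every word found in `prepositions`, and rejoins with single spaces.
func removingListedWords(from text: String, prepositions: Set<String>) -> String {
    let words = text.components(separatedBy: " ")
    let kept = words.filter { word in
        // Compare case-insensitively and ignore attached punctuation such as "of,".
        let key = word.lowercased().trimmingCharacters(in: .punctuationCharacters)
        return !prepositions.contains(key)
    }
    return kept.joined(separator: " ")
}

print(removingListedWords(from: "I’m jealous of my parents.", prepositions: samplePrepositions))
// prints "I’m jealous my parents."
```

Note that a word written as "of," would be dropped together with its comma, and a word that is only sometimes a preposition is removed every time it appears, since a plain list cannot take context into account.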
What is the best way to detect and remove prepositions from a string?
Use NaturalLanguage:
import NaturalLanguage
let text = "The ripe taste of cheese improves with age."
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text
let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace]
var newSentence = [String]()
tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
guard let tag = tag, tag != .preposition else { return true }
newSentence.append("\(text[tokenRange])")
return true
}
print("Input: \(text)")
print("Output: \(newSentence.joined(separator: " "))")
This prints:
Input: The ripe taste of cheese improves with age.
Output: The ripe taste cheese improves age
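Note that the two prepositions, "of" and "with", were removed. My approach also removes the punctuation. You can adjust that with the .omitPunctuation option.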
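One way to keep the punctuation, as a minimal sketch: leave `.omitPunctuation` out of the options and glue punctuation tokens back onto the preceding word. The function name `removingPrepositions(from:)` is hypothetical, and the `isPunctuation` check is a simplification that assumes punctuation always attaches to the word before it, so opening quotes or brackets would end up in the wrong place. Unlike the code above, this version also keeps tokens the tagger returns without a tag.

```swift
import NaturalLanguage

// Hypothetical helper: removes tokens tagged as prepositions but keeps punctuation.
func removingPrepositions(from text: String) -> String {
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    // Only whitespace is omitted, so punctuation tokens are enumerated too.
    let options: NLTagger.Options = [.omitWhitespace]
    var result = ""
    tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
        // Skip prepositions; every other token (including untagged ones) is kept.
        if let tag = tag, tag == NLTag.preposition { return true }
        let token = String(text[tokenRange])
        // Simplification: punctuation is appended directly to the previous word.
        let isPunctuation = token.allSatisfy { $0.isPunctuation }
        if !result.isEmpty && !isPunctuation { result += " " }
        result += token
        return true
    }
    return result
}

print(removingPrepositions(from: "The ripe taste of cheese improves with age."))
```

The result still depends on how the tagger classifies each word in context, so it is worth checking the output on a sample of your own data.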