Check whether a string contains an item from a list

Posted by OP in Dev

I have the following script that checks whether a string contains an item from a list:

word = ['one',
        'two',
        'three']
string = 'my favorite number is two'
if any(word_item in string.split() for word_item in word):
    print 'string contains a word from the word list: %s' % (word_item)

This works, but I am trying to print the list item that the string contains. What am I doing wrong?

Brendan Long

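The problem is that you're using an if statement instead of a for statement, so your print runs at most once (if at least one word matches), and by that point any has already finished its loop. The loop variable word_item also belongs to the generator expression inside any, so it is not available in the body of the if.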

This is the simplest way to do what you want:

words = ['one',
         'two',
         'three']
string = 'my favorite number is two'
for word in words:
    if word in string.split():
        print('string contains a word from the word list: %s' % (word))

If for some reason you want this to be more functional in style, you could do this:

for word in filter(string.split().__contains__, words):
    print('string contains a word from the word list: %s' % (word))

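Since someone always has to give a performance-related answer even when the question isn't about performance: it would probably be more efficient to split the string once, and, depending on how many words you are checking, converting the result to a set may also help.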
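
A minimal sketch of that idea, assuming the same words and string as above. The names tokens and matches are introduced here only for illustration:

```python
words = ['one', 'two', 'three']
string = 'my favorite number is two'

# Split the string once and keep the tokens in a set, so that each
# membership test is a hash lookup instead of a scan over a list.
tokens = set(string.split())

# Collect the matching words, in the order they appear in the word list.
matches = [word for word in words if word in tokens]

for word in matches:
    print('string contains a word from the word list: %s' % word)
```

For three words and a short sentence the difference is unlikely to be measurable. The set only starts to pay off when the word list or the text is large.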

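Regarding the question in the comments: if you want to use multi-word "words", there are two easy options: add spaces and then search for the word in the full string, or use a regular expression with word boundaries.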

The simplest way is to add a space character before and after the text being searched, then search for ' ' + word + ' ':

phrases = ['one',
           'two',
           'two words']
text = "this has two words in it"

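# Pad the text as well, so a phrase at the very start or end still matches.
padded_text = " %s " % text
for phrase in phrases:
    if " %s " % phrase in padded_text:
        print("text '%s' contains phrase '%s'" % (text, phrase))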

For the regular expression, just use the \b word boundary:

import re

for phrase in phrases:
    if re.search(r"\b%s\b" % re.escape(phrase), text):
        print("text '%s' contains phrase '%s'" % (text, phrase))

It's hard to say which one is "nicer", but the regular expression is probably significantly less efficient (if that matters to you).
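
Apart from speed, the two approaches also differ in what they count as a match, which may matter when choosing between them. The sketch below wraps each check in a helper; contains_padded and contains_bounded are hypothetical names used only for this illustration, and the sample text is made up to include punctuation:

```python
import re

def contains_padded(text, phrase):
    # Space-padding check: the phrase must be surrounded by spaces,
    # or sit at the very start or end of the text.
    return " %s " % phrase in " %s " % text

def contains_bounded(text, phrase):
    # Regex check: the phrase must be surrounded by \b word boundaries,
    # so adjacent punctuation also counts as a delimiter.
    return re.search(r"\b%s\b" % re.escape(phrase), text) is not None

text = "I asked for two, not three."
for phrase in ['two', 'three', 'one']:
    print(phrase, contains_padded(text, phrase), contains_bounded(text, phrase))
```

Here the space-padding check misses both 'two' and 'three' because each is followed by punctuation, while the \b version finds them. Neither check reports 'one'.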

If you don't care about word boundaries, you can just do this:

phrases = ['one',
           'two',
           'two words']
text = "the word 'tone' will be matched, but so will 'two words'"

for phrase in phrases:
    if phrase in text:
        print("text '%s' contains phrase '%s'" % (text, phrase))
