I am trying to find the correct regular expression.
import re
line = "The Boeing AH-64 Apache is an American four-blade,"
print(re.findall(r'(A.+)\s', line))
This is what I want:
['AH-64', 'Apache', 'American']
And this is what I'm getting:
['AH-64 Apache is an American']
You may use a word boundary (\b) before A and then match one or more non-whitespace chars after it (\S+):
import re
line = "The Boeing AH-64 Apache is an American four-blade,"
print(re.findall(r'\bA\S+', line))
NOTE: to also match a standalone A as a whole word, replace + (1 or more occurrences) with * (0 or more occurrences): r'\bA\S*'. I assume you want to match longer sequences, though.
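To illustrate the note above, here is a small sketch comparing the two quantifiers. The sample sentence is made up for illustration and is not from the question.

```python
import re

# Hypothetical sample text (not from the question) containing a standalone "A".
sample = "A Boeing AH-64 Apache"

# + requires at least one non-whitespace char after "A", so the lone "A" is skipped.
plus_matches = re.findall(r'\bA\S+', sample)

# * allows zero chars after "A", so the lone "A" is matched too.
star_matches = re.findall(r'\bA\S*', sample)

print(plus_matches)  # ['AH-64', 'Apache']
print(star_matches)  # ['A', 'AH-64', 'Apache']
```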
Or, since \S matches any non-whitespace character, including symbols and punctuation, you may make your regex more precise and use
print(re.findall(r'\bA[\w-]+', line))
where [\w-]+ matches 1 or more letters, digits, _ or - characters.
See the Python demo, which outputs ['AH-64', 'Apache', 'American'].
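The difference between the two patterns only shows up when punctuation touches the words. A minimal sketch, assuming a made-up sentence that is not from the question:

```python
import re

# Hypothetical sample text (not from the question) with punctuation after the words.
sample = "The Apache, an American (AH-64) design."

# \S+ keeps everything up to the next whitespace, including punctuation.
loose = re.findall(r'\bA\S+', sample)

# [\w-]+ stops at the first char that is not a letter, digit, _ or -.
strict = re.findall(r'\bA[\w-]+', sample)

print(loose)   # ['Apache,', 'American', 'AH-64)']
print(strict)  # ['Apache', 'American', 'AH-64']
```

If the input can contain commas, brackets or full stops next to the words, the [\w-]+ variant is probably the safer choice.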