I think I have an infinite loop. I made a dictionary, string_dict, with search terms as keys and, for each key, a single index where that key was found in my_string. I'd like to build a search_dict that maps each search term to a list of ALL the indices where it matches in my_string.
Right now search_dict stays empty except for one key, whose list has millions of items.
my_string='Shall I compare thee to a summer\'s day?'
#string_dict has only a single index as a value where its key was found in my_string
string_dict={'a': 36, ' ': 34, 'e': 30, '': 39, 'h': 17, 'm': 29, 'l': 4, 'o': 22, 'e ': 19, 's': 33, 'r': 31, 't': 21, ' t': 20, 'e t': 19}
#I'd like search_dict to have all indices for key matches in my_string
search_dict=dict()
for key in string_dict:
    search_dict[key]=list()
for item in search_dict:
    start=0
    end=len(my_string)
    found=my_string.find(item,start,end)
    while start<end:
        if found>=0:
            search_dict[key].append(found)
            start=found+len(item)
            found=my_string.find(item,start,end)
        else:
            break
print search_dict
I've also tried the change below. I'm still not sure why the loop doesn't break and move on to the next search key when my_string.find returns -1 (not found).
        else:
            break
#with
        if found<0:
            break
If you're searching for substrings rather than single characters, I think a regex would work best.
>>> import re
>>> my_string='Shall I compare thee to a summer\'s day?'
>>> search_items = ['a', ' ', 'e', 'h', 'm', 'l', 'o', 'e ', 's', 'r', 't', ' t', 'e t']
>>> results_dict = {}
>>> for search_item in search_items:
...     results_dict[search_item] = [m.start() for m in re.finditer(search_item, my_string)]
...
>>> for elem in results_dict:
...     print("%s: %s" % (elem, results_dict[elem]))
...
a: [2, 12, 24, 36]
 : [5, 7, 15, 20, 23, 25, 34]
e: [14, 18, 19, 30]
h: [1, 17]
m: [10, 28, 29]
l: [3, 4]
o: [9, 22]
e : [14, 19]
s: [26, 33]
r: [13, 31]
t: [16, 21]
 t: [15, 20]
e t: [14, 19]
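Although your question doesn't specify it, each value in the results is the starting index of a match. Keep in mind that re.finditer treats each search term as a regex pattern, so a term containing special characters such as ? would need to go through re.escape first.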
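As for why the original loop never finishes, as far as I can tell there are two things going on. First, the inner loop appends to search_dict[key], but key is just whatever was left over from the first for loop, so every match ends up in that one list. Second, string_dict contains the empty string '' as a key: my_string.find('', start, end) always returns start, and len('') is 0, so start never advances and the while loop runs forever. Here is a minimal sketch of a find-based version that avoids both. The find_all helper and the shortened search_terms list are my own names for illustration, not something from your code, and skipping the empty string is an assumption about what you want.

```python
my_string = "Shall I compare thee to a summer's day?"
# Sample of the original keys, including the problematic empty string
search_terms = ['a', ' ', 'e', '', 'h', 'e ', ' t', 'e t']

def find_all(text, term):
    """Return the start index of every non-overlapping match of term in text."""
    if not term:
        # An empty term matches at every position and has length 0,
        # so the search position would never advance; skip it.
        return []
    indices = []
    found = text.find(term)
    while found >= 0:
        indices.append(found)
        # Resume the search just past the end of this match
        found = text.find(term, found + len(term))
    return indices

# Note the list is stored under the term being searched, not a leftover loop variable
search_dict = {term: find_all(my_string, term) for term in search_terms}
print(search_dict['e t'])  # [14, 19]
```

Looping directly on the result of find (while found >= 0) also removes the need for a separate break branch.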