Find common substring from a list of strings

thiruvenkadam

How can I extract the common prefix from a list of strings? The caveat is that I do not know the prefix beforehand; I will only learn it by running this function.

For example:
string_list = ["test11", "test12", "test13"]
# Prefix is test1
string_list = ["test-a", "test-b", "test-c"]
# Prefix is test-
string_list = ["test1", "test1a", "test12"]
# Prefix is test1
string_list = ["testa-1", "testb-1", "testc-1"]
# Prefix is test

If the strings in the list share no common prefix, the result should be an empty string.

Mike Müller

Solution

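This function works. It walks through the strings position by position and stops at the first position where the characters differ:

def find_prefix(string_list):
    """Return the longest prefix shared by all strings in string_list."""
    prefix = []
    # zip(*string_list) yields one tuple per position, stopping at the shortest string.
    for chars in zip(*string_list):
        # A single distinct character means every string agrees at this position.
        if len(set(chars)) == 1:
            prefix.append(chars[0])
        else:
            break
    return ''.join(prefix)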
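
To see why the loop has to stop at the first mismatch, it can help to look at what `zip(*string_list)` produces. A small sketch using the last example list from the question (the names `columns` and `sizes` are only for illustration):

```python
string_list = ["testa-1", "testb-1", "testc-1"]

# Each tuple holds the characters found at one position in every string.
columns = list(zip(*string_list))

# Number of distinct characters at each position.
sizes = [len(set(column)) for column in columns]
print(sizes)
# [1, 1, 1, 1, 3, 1, 1]

# Position 4 is the first mismatch; the matching "-1" after it is not
# part of the prefix, which is why the function breaks out of the loop.
print(columns[4])
# ('a', 'b', 'c')
```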

Tests

string_lists = [["test11", "test12", "test13"],
                ["test-a", "test-b", "test-c"],
                ["test1", "test1a", "test12"],
                ["testa-1", "testb-1", "testc-1"]]


for string_list in string_lists:
    print(string_list)
    print(find_prefix(string_list))

Output:

['test11', 'test12', 'test13']
test1
['test-a', 'test-b', 'test-c']
test-
['test1', 'test1a', 'test12']
test1
['testa-1', 'testb-1', 'testc-1']
test
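
The tests above do not cover the case from the question where the strings have nothing in common. A quick check of that case and a few other edge cases, with the function repeated so that the snippet runs on its own:

```python
def find_prefix(string_list):
    prefix = []
    for chars in zip(*string_list):
        if len(set(chars)) == 1:
            prefix.append(chars[0])
        else:
            break
    return ''.join(prefix)

print(repr(find_prefix(["alpha", "beta", "gamma"])))  # ''
print(repr(find_prefix(["test", "test"])))            # 'test'
print(repr(find_prefix(["only"])))                    # 'only'
print(repr(find_prefix([])))                          # ''
```

An empty list gives an empty string because `zip()` without arguments yields nothing, so the loop body never runs.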

Speed

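It is always fun to time things. Here find_prefix is compared with get_large_subset, a function that is not defined in this text (presumably it comes from another answer to the same question):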

string_list = ["test11", "test12", "test13"]

%timeit get_large_subset(string_list)
100000 loops, best of 3: 14.3 µs per loop

%timeit find_prefix(string_list)
100000 loops, best of 3: 6.19 µs per loop

long_string_list = ['test{}'.format(x) for x in range(int(1e4))]

%timeit get_large_subset(long_string_list)
100 loops, best of 3: 7.44 ms per loop

%timeit find_prefix(long_string_list)
100 loops, best of 3: 2.38 ms per loop

very_long_string_list = ['test{}'.format(x) for x in range(int(1e6))]

%timeit get_large_subset(very_long_string_list)
1 loops, best of 3: 761 ms per loop

%timeit find_prefix(very_long_string_list)
1 loops, best of 3: 354 ms per loop
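
The `%timeit` lines are IPython magic commands. Outside IPython, a similar measurement could be sketched with the standard `timeit` module. The loop counts below are arbitrary choices, and the numbers will differ from machine to machine, so no expected output is shown:

```python
import timeit

def find_prefix(string_list):
    prefix = []
    for chars in zip(*string_list):
        if len(set(chars)) == 1:
            prefix.append(chars[0])
        else:
            break
    return ''.join(prefix)

long_string_list = ['test{}'.format(x) for x in range(int(1e4))]

# 20 calls per run, 3 runs; report the best run, as %timeit did above.
calls = 20
runs = timeit.repeat(lambda: find_prefix(long_string_list),
                     number=calls, repeat=3)
best_ms = min(runs) / calls * 1e3
print('{:.3f} ms per loop'.format(best_ms))
```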

Conclusion: Using sets in this way is fast. In these measurements find_prefix was roughly two to three times as fast as get_large_subset.
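
As an aside, the standard library has `os.path.commonprefix`, which compares character by character rather than by path component, so it may serve as a ready-made alternative. It was not part of the timings above, so nothing is claimed here about its speed:

```python
import os.path

string_lists = [["test11", "test12", "test13"],
                ["test-a", "test-b", "test-c"],
                ["test1", "test1a", "test12"],
                ["testa-1", "testb-1", "testc-1"]]

for string_list in string_lists:
    print(os.path.commonprefix(string_list))
# test1
# test-
# test1
# test

# Nothing in common gives an empty string.
print(repr(os.path.commonprefix(["alpha", "beta"])))  # ''
```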
