How can I find most frequent combinations of numbers in a list?

KutaBeach

Imagine you have a list of numbers (or letters), such as

1177783777297461145777267337774652113777236237118777

I want to find the most frequent combinations of numbers in this list:

for 1-digit-long combinations - it is the most frequent number in this list

for 2-digit-long combinations - probably '11'

for 3-digits-long combinations - probably '777' etc

Is there some special algorythm for such a tasks?

UPDATE Well, by myself I have coded the following (Java). Looks like execution time is proportional to data size multiplied by pattern size:

public static void main(String[] args)
{
    int DATA_SIZE = 10000;
    int[] data = new int[DATA_SIZE];
    for (int i = 0; i < DATA_SIZE; i++)
    {
        data[i] = (int) (10 * Math.random()) % 10;
        System.out.print(data[i]);
    }

    int[] pattern1 = new int[]{1, 2, 3};
    int[] pattern2 = new int[]{7, 7, 7};
    int[] pattern3 = new int[]{7, 7};

    System.out.println();
    System.out.println(match(data, pattern1));
    System.out.println(match(data, pattern2));
    System.out.println(match(data, pattern3));
}

static int match(int[] data, int[] pattern)
{
    int matches = 0;
    int i = 0;
    while (i < data.length)
    {
        matches = isEqual(data, i, pattern) ? matches + 1 : matches;
        i++;
    }
    return matches;
}

static boolean isEqual(int[] a, int startIndex, int[] a2)
{
    if (a == a2)
    {
        return true;
    }
    if (a == null || a2 == null)
    {
        return false;
    }

    for (int i = 0; i < a2.length; i++)
    {
        if (a[startIndex + i] != a2[i])
        {
            return false;
        }
    }

    return true;
}
Paul Hankin

To find the largest number of repeats of a sequence of length at least k in a string of length n, you can build a suffix tree (http://en.wikipedia.org/wiki/Suffix_tree) in linear time, then find the node that has the most children that describes sequences of length k (or more) from the root.

Overall, this is linear time in the length of the input string.

For small k, you're better off with a naive algorithm:

from collections import Counter

def input(s, k):
    c = Counter()
    for i in xrange(len(s) - k):
        c[s[i : i + k]] += 1
    return c.most_common(1)[0][0]


for k in xrange(1, 4):
    print input('1177783777297461145777267337774652113777236237118777', k)

本文收集自互联网,转载请注明来源。

如有侵权,请联系[email protected] 删除。

编辑于
0

我来说两句

0条评论
登录后参与评论

相关文章

来自分类Dev

How to find the most frequent words before and after a given word in a given text in python?

来自分类Dev

How can I efficiently find the accuracy of a classifier

来自分类Dev

How can I pass floating point numbers as template parameters?

来自分类Dev

Column - Most frequent letter in a group of 4 rows

来自分类Dev

How to read numbers in text file as numbers in to list?

来自分类Dev

How can I find who pushed a tag(s) to BitBucket

来自分类Dev

How can I recursively find the *directories* containing a file of a certain type?

来自分类Dev

How can I use find() on a map inside a vector of maps?

来自分类Dev

How can I find executables installed by NuGet packages?

来自分类Dev

How can I find the string value of a property given it's value

来自分类Dev

How can I find my browser web log file?

来自分类Dev

How can I get a list of PHP errors occurred in a page?

来自分类Dev

how can i prevent nested list's click event

来自分类Dev

How can I get list of users of a specific domain?

来自分类Dev

How can I use sed to find the lines that start with "X", then find and replace inside this line using Perl?

来自分类Dev

How can I find out how much physical space a record (BLOB) in a sqlite database uses?

来自分类Dev

How can I find out what version is passed to setup in setup.py?

来自分类Dev

How can I find out if an data- attribute is set to an empty value?

来自分类Dev

How can I find all text within a word document that matches a pattern and apply formatting on it?

来自分类Dev

How do I enable line numbers in Kakoune?

来自分类Dev

How can I retrieve data from array list properly for the first action

来自分类Dev

How can I get the list of the currently installed certificate authority on my ubuntu?

来自分类Dev

Why can't I split this python list?

来自分类Dev

Can I sort a list of objects by 2 keys?

来自分类Dev

r language: how to find rows with match in list?

来自分类Dev

Can i use nested linearlayouts instead of list view, for a big list?

来自分类Dev

How to create combinations of values in Java?

来自分类Dev

How can I cancel an ngEvent?

来自分类Dev

How can I return a function?

Related 相关文章

  1. 1

    How to find the most frequent words before and after a given word in a given text in python?

  2. 2

    How can I efficiently find the accuracy of a classifier

  3. 3

    How can I pass floating point numbers as template parameters?

  4. 4

    Column - Most frequent letter in a group of 4 rows

  5. 5

    How to read numbers in text file as numbers in to list?

  6. 6

    How can I find who pushed a tag(s) to BitBucket

  7. 7

    How can I recursively find the *directories* containing a file of a certain type?

  8. 8

    How can I use find() on a map inside a vector of maps?

  9. 9

    How can I find executables installed by NuGet packages?

  10. 10

    How can I find the string value of a property given it's value

  11. 11

    How can I find my browser web log file?

  12. 12

    How can I get a list of PHP errors occurred in a page?

  13. 13

    how can i prevent nested list's click event

  14. 14

    How can I get list of users of a specific domain?

  15. 15

    How can I use sed to find the lines that start with "X", then find and replace inside this line using Perl?

  16. 16

    How can I find out how much physical space a record (BLOB) in a sqlite database uses?

  17. 17

    How can I find out what version is passed to setup in setup.py?

  18. 18

    How can I find out if an data- attribute is set to an empty value?

  19. 19

    How can I find all text within a word document that matches a pattern and apply formatting on it?

  20. 20

    How do I enable line numbers in Kakoune?

  21. 21

    How can I retrieve data from array list properly for the first action

  22. 22

    How can I get the list of the currently installed certificate authority on my ubuntu?

  23. 23

    Why can't I split this python list?

  24. 24

    Can I sort a list of objects by 2 keys?

  25. 25

    r language: how to find rows with match in list?

  26. 26

    Can i use nested linearlayouts instead of list view, for a big list?

  27. 27

    How to create combinations of values in Java?

  28. 28

    How can I cancel an ngEvent?

  29. 29

    How can I return a function?

热门标签

归档