Storing duplicate array elements

Vlad Andrei

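I'm desperately trying to solve the following problem: in a list of sentences/news titles, I want to find the ones that are very similar (3 or 4 words in common) and put them into a new array. So, for this original array/list: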

'Title1: Hackers expose trove of snagged Snapchat images',
'Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine',
'Title3: Family says goodbye at funeral for 16-year-old',
'Title4: New Jersey officials talk about Ebola quarantine',
'Title5: New Far Cry 4 Trailer Welcomes You to Kyrat Lowlands',
'Title6: Hackers expose Snapchat images'

The result should be:

Array
(
    [0] => Title1: Hackers expose trove of snagged Snapchat images
    [1] => Array
        (
            [duplicate] => Title6: Hackers expose Snapchat images
        )

    [2] => Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine
    [3] => Array
        (
            [duplicate] => Title4: New Jersey officials talk about Ebola quarantine
        )
    [4] => Title3: Family says goodbye at funeral for 16-year-old
    [5] => Title5: New Far Cry 4 Trailer Welcomes You to Kyrat Lowlands
)

This is my code:

$titles = array(
    'Title1: Hackers expose trove of snagged Snapchat images',
    'Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine',
    'Title3: Family says goodbye at funeral for 16-year-old',
    'Title4: New Jersey officials talk about Ebola quarantine',
    'Title5: New Far Cry 4 Trailer Welcomes You to Kyrat Lowlands',
    'Title6: Hackers expose Snapchat images'
);
$z = 1;
foreach ($titles as $feed)
{
    $feed_A = explode(' ', $feed);
    for ($i=$z; $i<count($titles); $i++)
    {
        $feed_B = explode(' ', $titles[$i]);
        $intersect_A_B = array_intersect($feed_A, $feed_B);
        if(count($intersect_A_B)>3)
        {
            $titluri[] = $feed;
            $titluri[]['duplicate'] = $titles[$i]; 
        }
        else 
        {
            $titluri[] = $feed;
        }
    }
    $z++;
}

It outputs this result (clumsy, but somewhat close to the desired result):

Array
(
    [0] => Title1: Hackers expose trove of snagged Snapchat images
    [1] => Title1: Hackers expose trove of snagged Snapchat images
    [2] => Title1: Hackers expose trove of snagged Snapchat images
    [3] => Title1: Hackers expose trove of snagged Snapchat images
    [4] => Title1: Hackers expose trove of snagged Snapchat images
    [5] => Array
        (
            [duplicate] => Title6: Hackers expose Snapchat images
        )

    [6] => Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine
    [7] => Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine
    [8] => Array
        (
            [duplicate] => Title4: New Jersey officials talk about Ebola quarantine
        )

    [9] => Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine
    [10] => Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine
    [11] => Title3: Family says goodbye at funeral for 16-year-old
    [12] => Title3: Family says goodbye at funeral for 16-year-old
    [13] => Title3: Family says goodbye at funeral for 16-year-old
    [14] => Title4: New Jersey officials talk about Ebola quarantine
    [15] => Title4: New Jersey officials talk about Ebola quarantine
    [16] => Title5: New Far Cry 4 Trailer Welcomes You to Kyrat Lowlands
)

All suggestions are welcome!

Alban Pommeret

Here is my solution, inspired by @DomWeldon, with no repeated entries:

<?php
$titles = array(
    'Title1: Hackers expose trove of snagged Snapchat images',
    'Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine',
    'Title3: Family says goodbye at funeral for 16-year-old',
    'Title4: New Jersey officials talk about Ebola quarantine',
    'Title5: New Far Cry 4 Trailer Welcomes You to Kyrat Lowlands',
    'Title6: Hackers expose Snapchat images'
);
$titluri = array(); // unless it's declared elsewhere
$duplicateTitles = array();
// loop through each line of the array
foreach ($titles as $key => $originalFeed)
{
    if(!in_array($key, $duplicateTitles)){
        $titluri[] = $originalFeed; // all feeds are listed in the new array
        $feed_A = explode(' ', $originalFeed);
        foreach ($titles as $newKey => $comparisonFeed)
        {
            // iterate through the array again and see if they intersect
            if ($key != $newKey) { // but don't compare a line against itself!
                $feed_B = explode(' ', $comparisonFeed);
                $intersect_A_B = array_intersect($feed_A, $feed_B);
                // do they share more than three words?
                if(count($intersect_A_B)>3)
                {
                    // yes, add a duplicate entry
                    $titluri[]['duplicate'] = $comparisonFeed;
                    $duplicateTitles[] = $newKey;
                }
            }
        }
    }
}
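
As a further sketch (not part of the answer above), the same idea can be wrapped in a reusable function. The names `groupSimilarTitles` and `$minShared` are made up for illustration, and the type declarations assume PHP 7 or later. Unlike the loop above, it only compares each title against the ones after it and skips titles already reported as duplicates, so every title ends up in the result exactly once.

```php
<?php
// Hypothetical helper: lists each title, followed by the later titles that
// share more than $minShared words with it, each wrapped as ['duplicate' => ...].
function groupSimilarTitles(array $titles, int $minShared = 3): array
{
    $titles = array_values($titles); // make sure keys run 0..n-1

    // Split every title into words once instead of on every comparison.
    $words = array_map(function ($title) {
        return explode(' ', $title);
    }, $titles);

    $used   = array(); // indexes already reported as duplicates
    $result = array();
    $n      = count($titles);

    for ($i = 0; $i < $n; $i++) {
        if (isset($used[$i])) {
            continue; // already listed under an earlier title
        }
        $result[] = $titles[$i];

        // Only look at later titles that have not been reported yet.
        for ($j = $i + 1; $j < $n; $j++) {
            if (isset($used[$j])) {
                continue;
            }
            // Same test as above: words of title $i that also occur in title $j.
            if (count(array_intersect($words[$i], $words[$j])) > $minShared) {
                $result[] = array('duplicate' => $titles[$j]);
                $used[$j] = true;
            }
        }
    }

    return $result;
}

$titles = array(
    'Title1: Hackers expose trove of snagged Snapchat images',
    'Title2: New Jersey officials order symptom-less NBC News crew into Ebola quarantine',
    'Title3: Family says goodbye at funeral for 16-year-old',
    'Title4: New Jersey officials talk about Ebola quarantine',
    'Title5: New Far Cry 4 Trailer Welcomes You to Kyrat Lowlands',
    'Title6: Hackers expose Snapchat images'
);

$titluri = groupSimilarTitles($titles);
print_r($titluri);
```

With the six sample titles this should give the six-entry structure from the question: Title1 followed by Title6 as its duplicate, Title2 followed by Title4, then Title3 and Title5. Note that the word comparison is still case-sensitive and counts a repeated word once per occurrence in the first title, exactly as `array_intersect` does in the code above.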
