Multithreading with Python and pymongo

Posted by philmckendry in Dev

philmckendry

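Hello, I'm hoping to build a program that classifies tweets as positive or negative. The tweets are about companies and are already stored in MongoDB; once a tweet has been classified, an integer counter is updated according to the result.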

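I've written code that does this, but I'd like to make the program multithreaded. I have no experience with threading in Python and have been trying to follow a tutorial, without luck: the program just starts and exits without running any of the code.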

If anyone could help me, it would be greatly appreciated. The program and my attempted multithreading code are below.

from textblob.classifiers import NaiveBayesClassifier
import pymongo
from datetime import datetime
from threading import Thread

train = [
('I love this sandwich.', 'pos'),
('This is an amazing place!', 'pos'),
('I feel very good about these beers.', 'pos'),
('This is my best work.', 'pos'),
("What an awesome view", 'pos'),
('I do not like this restaurant', 'neg'),
('I am tired of this stuff.', 'neg'),
("I can't deal with this", 'neg'),
('He is my sworn enemy!', 'neg'),
('My boss is horrible.', 'neg'),
(':)', 'pos'),
(':(', 'neg'),
('gr8', 'pos'),
('gr8t', 'pos'),
('lol', 'pos'),
('bff', 'neg'),
]

test = [
'The beer was good.',
'I do not enjoy my job',
"I ain't feeling dandy today.",
"I feel amazing!",
'Gary is a friend of mine.',
"I can't believe I'm doing this.",
]

filterKeywords = ['IBM', 'Microsoft', 'Facebook', 'Yahoo', 'Apple',   'Google', 'Amazon', 'EBay', 'Diageo',
              'General Motors', 'General Electric', 'Telefonica', 'Rolls Royce', 'Walmart', 'HSBC', 'BP',
              'Investec', 'WWE', 'Time Warner', 'Santander Group']

# Create pos/neg counter variables for each company using dicts
vars = {}
for word in filterKeywords:
    vars[word + "SentimentOverall"] = 0


# Initialising the classifier
cl = NaiveBayesClassifier(train)


class TrainingClassification():
    def __init__(self):
        # Create the MongoDB connection
        try:
            conn = pymongo.MongoClient('localhost', 27017)
            print("Connected successfully!!!")
            global db
            db = conn.TwitterDB
        except pymongo.errors.ConnectionFailure as e:
            print("Could not connect to MongoDB: %s" % e)

        # Start the worker thread, then block until it has finished
        thread1 = Thread(target=self.apple_thread, args=())
        thread1.start()
        thread1.join()
        print("thread finished...exiting")

    def apple_thread(self):
        appleSentimentText = []
        for record in db.Apple.find():
            if record.get('created_at'):
                created_at = record.get('created_at')
                dt = datetime.strptime(created_at, '%a %b %d %H:%M:%S +0000 %Y')
                if record.get('text') and dt > datetime.today():
                    appleSentimentText.append(record.get("text"))
        for targetText in appleSentimentText:
            classificationApple = cl.classify(targetText)
            if classificationApple == "pos":
                vars["AppleSentimentOverall"] = vars["AppleSentimentOverall"] + 1
            elif classificationApple == "neg":
                vars["AppleSentimentOverall"] = vars["AppleSentimentOverall"] - 1

DevShark

The main problem with your code is here:

thread1.start()
thread1.join()

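When you call join on a thread, the currently running thread (in your case, the main thread) waits until that thread (here, thread1) has finished. So you can see that your code won't actually be any faster: it just starts one thread and waits for it. In fact, it will be slightly slower because of the cost of creating the thread.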
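To make the blocking behaviour of join concrete, here is a minimal sketch. The worker function and the 0.2 second sleep are assumptions made for illustration; the sleep merely stands in for a slow task such as a database query.

```python
import threading
import time

def worker():
    # Hypothetical slow task; sleep simulates waiting on a database query.
    time.sleep(0.2)

t = threading.Thread(target=worker)

start = time.perf_counter()
t.start()
t.join()  # the main thread blocks here until worker() returns
waited = time.perf_counter() - start

# join() only returns once the thread has finished, so it is no longer alive.
still_running = t.is_alive()
print("waited %.2fs, still running: %s" % (waited, still_running))
```

With a single thread that is started and joined straight away, the main thread does nothing useful while it waits, so the total time is the same as calling worker() directly.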

Here is the correct way to do multithreading:

thread1.start()
thread2.start()
thread1.join()
thread2.join()

In this code, threads 1 and 2 both run concurrently.
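The start-all-then-join-all pattern extends to one thread per company. The following is only a sketch under stated assumptions: score_company and run_all are hypothetical names, the sleep stands in for the MongoDB query, and the stored value is a placeholder rather than a real sentiment score.

```python
import threading
import time

def score_company(name, results, lock):
    # Hypothetical stand-in for "query MongoDB, then classify the tweets".
    time.sleep(0.2)
    with lock:  # guard the shared dict while updating it
        results[name] = len(name)  # placeholder value, not a real sentiment

def run_all(companies):
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=score_company, args=(name, results, lock))
        for name in companies
    ]
    for t in threads:  # start every thread first ...
        t.start()
    for t in threads:  # ... and only then wait for each of them
        t.join()
    return results

start = time.perf_counter()
results = run_all(['Apple', 'Google', 'IBM'])
elapsed = time.perf_counter() - start
print(results, "in %.2fs" % elapsed)
```

Because the three simulated waits overlap, the whole run should take roughly as long as one wait rather than three. The lock is a cautious choice for the shared dictionary; per-company counters like the ones in the question could be updated the same way.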

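Important note: in Python (CPython), this is "simulated" parallelism. Because the interpreter core is not thread-safe (mainly because of the way it handles memory management and garbage collection), it uses the GIL (global interpreter lock), so only one thread in a process executes Python code at a time and all the threads effectively share a single core. If you need true parallelism (for example, if your two threads are CPU-bound rather than I/O-bound), take a look at the multiprocessing module.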
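For CPU-bound work, a process pool from the multiprocessing module is one option. This is a rough sketch rather than a drop-in replacement for the code in the question: count_primes is a hypothetical CPU-heavy task standing in for classifying a batch of tweets, and the pool size of 2 is an arbitrary choice.

```python
from multiprocessing import Pool

def count_primes(limit):
    """Count the primes below `limit` by trial division (deliberately slow)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # Each input is handed to a separate worker process, so the work can
    # spread across CPU cores instead of being serialized by the GIL.
    with Pool(processes=2) as pool:
        print(pool.map(count_primes, [10, 100, 1000]))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start workers by re-importing the main module, leaving it out would make every worker try to create its own pool.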

Edited on 2021-02-21
