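I am building a crawler-like application that finds images in web pages. The producer generates links, and the consumer connects to each link to look for images. Because the producer generates a large number of links, the consumer takes a long time to get through them. I therefore put the consumer in an ExecutorService, but the time the consumer takes has not gone down. Please help. My code is below.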
@Service
@Qualifier("crawlerService")
public class CrawlerService {

    @Autowired
    @Qualifier("loggerService")
    LoggerService loggerService;

    @Autowired
    @Qualifier("imageTypeExtensionCombo")
    ImageTypeExtensionCombo imageTypeExtensionCombo;

    public List<String> startCrawler(List<String> links, List<String> images, URL url, String protocol, String protocolHost) throws Exception {
        LinkQueue queue = new LinkQueue(links);
        LinkProducer producer = new LinkProducer(links, url, protocol, protocolHost, queue, loggerService);
        LinkConsumer consumer = new LinkConsumer(links, images, url, protocol, protocolHost, loggerService, queue);
        ExecutorService executorService = Executors.newFixedThreadPool(4);
        executorService.submit(consumer);
        producer.start();
        //consumer.start();
        Thread.currentThread().join();
        executorService.shutdown();
        return images;
    }
}
LinkProducer class
public class LinkProducer extends Thread {

    private List<String> anchorList;
    private URL url;
    private String protocol;
    private String protocolHost;
    private UrlValidator urlValidator = new UrlValidator();
    private LinkQueue queue;
    private LoggerService loggerService;
    private int MAX_QUEUE_SIZE = 2;
    private int counter = 0;
    private boolean stopThread = false;
    private String HTML_TYPE = "HTML";
    private String HTML_CONTENT_TYPE = "text/html";
    private String IMAGE_TYPE = "IMAGE";
    private String NON_HTML_NON_IMAGE_TYPE = "OTHERS";

    public LinkProducer(List<String> anchorList, URL url, String protocol, String protocolHost, LinkQueue queue, LoggerService loggerService) {
        super(protocolHost.replace(protocol, "").replaceAll("/", ""));
        this.anchorList = anchorList;
        this.url = url;
        this.protocol = protocol;
        this.protocolHost = protocolHost;
        this.queue = queue;
        this.loggerService = loggerService;
    }

    public void run() {
        int i = 0;
        while (true) {
            List<String> anchors = null;
            loggerService.log("Producer Thread : " + (++i));
            try {
                anchors = produce();
            } catch (Exception ex) {
                loggerService.log("Exception occurred in producer thread : " + ex.getMessage());
                ex.printStackTrace();
                if (stopThread) {
                    break;
                }
            }
            if (stopThread) {
                break;
            }
            if (anchors != null && anchors.size() > 0) {
                Iterator<String> iter = anchors.iterator();
                while (iter.hasNext()) {
                    synchronized (queue) {
                        queue.enQueue(iter.next());
                    }
                }
            }
        }
    }
}
LinkConsumer class
public class LinkConsumer extends Thread {

    private List<String> anchorList;
    private List<String> imageList;
    private URL url;
    private String protocol;
    private String protocolHost;
    private LinkQueue queue;
    private LoggerService loggerService;
    private UrlValidator urlValidator = new UrlValidator();
    private String HTML_TYPE = "HTML";
    private String HTML_CONTENT_TYPE = "text/html";
    private String IMAGE_TYPE = "IMAGE";
    private String NON_HTML_NON_IMAGE_TYPE = "OTHERS";

    public LinkConsumer(List<String> anchorList, List<String> imageList, URL url, String protocol, String protocolHost, LoggerService loggerService, LinkQueue queue) {
        super(protocolHost.replace(protocol, "").replaceAll("/", ""));
        this.anchorList = anchorList;
        this.imageList = imageList;
        this.url = url;
        this.protocol = protocol;
        this.protocolHost = protocolHost;
        this.queue = queue;
        this.loggerService = loggerService;
    }

    public void run() {
        int i = 0;
        while (!queue.isEmpty()) {
            List<String> images = null;
            loggerService.log("Consumer Thread : " + (++i));
            try {
                images = consume();
            } catch (Exception ex) {
                loggerService.log("Exception occurred in consumer thread : " + ex.getMessage());
                ex.printStackTrace();
            }
            if (images != null && images.size() > 0) {
                Iterator<String> iter = images.iterator();
                while (iter.hasNext()) {
                    imageList.add(iter.next());
                }
            }
        }
    }
}
Thanks.
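You create and submit only one LinkConsumer, so you have only one worker: the pool has four threads, but three of them sit idle. To get real parallelism, you need to create and submit more LinkConsumer instances.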
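As a rough illustration of submitting several consumers, here is a minimal, self-contained sketch. It does not use your classes: `ParallelConsumersSketch`, `crawl`, `findImages` and the `POISON_PILL` marker are all hypothetical names, `findImages` only simulates a slow page fetch with `Thread.sleep`, and a standard `LinkedBlockingQueue` stands in for your `LinkQueue`. Treat it as one possible shape under those assumptions, not as a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ParallelConsumersSketch {

    // Hypothetical end-of-input marker; one is enqueued per consumer.
    private static final String POISON_PILL = "__END__";

    // Stand-in for the real consume() step (assumption: it is slow and I/O bound).
    public static List<String> findImages(String link) throws InterruptedException {
        Thread.sleep(50); // simulate network latency
        List<String> found = new ArrayList<>();
        found.add(link + "/image.png");
        return found;
    }

    public static List<String> crawl(List<String> links, int workers) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Thread-safe list, because several consumers append to it concurrently.
        List<String> images = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // Each consumer blocks on take() until a link or the end marker arrives.
        Runnable consumer = () -> {
            try {
                while (true) {
                    String link = queue.take();
                    if (POISON_PILL.equals(link)) {
                        break;
                    }
                    images.addAll(findImages(link));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // Submit one consumer per pool thread, not just one overall.
        for (int w = 0; w < workers; w++) {
            pool.submit(consumer);
        }

        // The calling thread plays the producer here.
        for (String link : links) {
            queue.put(link);
        }
        for (int w = 0; w < workers; w++) {
            queue.put(POISON_PILL);
        }

        // Stop accepting tasks, then wait for the consumers to drain the queue.
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return images;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> links = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            links.add("http://example.com/page" + i);
        }
        System.out.println(crawl(links, 4).size() + " images found");
    }
}
```

Two things in your code are worth checking against this sketch. `Thread.currentThread().join()` makes the calling thread wait for itself, so `startCrawler` never gets to `shutdown()` or `return images`; waiting on the pool with `awaitTermination` avoids that. Also, `while (!queue.isEmpty())` lets a consumer exit as soon as the queue is momentarily empty, which may happen before the producer has added anything; a blocking `take()` with an explicit end marker is one way around it. With several consumers writing to the same list, the list also needs to be thread-safe.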