How can I print Scrapy's error count and other counts as integer variables?

Posted on debugcn in Dev

Murat Demir

I'm trying to print the Scrapy log output so that I can write it to my db.

Spider

class NigdeBotSpider(scrapy.Spider):

    name = 'nigdehal'  # Spider name used by Scrapy
    allowed_domains = ['www.halfiyatlari.net']  # Main domain
    start_urls = ['https://www.halfiyatlari.net/nigde-hal-fiyatlari']  # The URL to scrape on that site

    def parse(self, response):
        location = "nigdehal"
        url = response.url
        urls_ = url.split("/")
        url_tag = Urls.objects.get(url_tag=urls_[3])
        spiderobj = Spiders.objects.get(spider_name=location)

        date = str(response.xpath('//div[@class="col-lg-12"]/span/text()').extract())  # Parse the date separately
        last_date = dateFixer(date)
        day = int(last_date[0])
        month = int(last_date[1])
        year = int(last_date[2])

        market_date = Markets.objects.get(spider=spiderobj)

        if day==market_date.market_day and month==market_date.market_month and year==market_date.market_year: 
            raise CloseSpider('Dates are same!')
        
        i = 2 #Table name and header passed
        
        path_lenght = len(response.xpath('//*[@class="table-responsive"]//tr'))
        
        for products in response.xpath('//*[@class="table-responsive"]//tr'):  # Parse the table product by product
        
            if path_lenght==1:
                break
            
            product_name = wordFixer(products.xpath('//tr['+str(i)+']/td[1]/text()').get())
            high_price = priceFixer(products.xpath('//tr['+str(i)+']/td[3]/text()').get())
            low_price = priceFixer(products.xpath('//tr['+str(i)+']/td[2]/text()').get())   

            if not product_name: continue
            try:
                market = Markets.objects.get(market_name=location)
            except:
                market = Markets(market_name=location, spider=spiderobj)
                market.save()
                
            new_data = Products(
            product_name=wordFixer(product_name),
            price=priceFixer(str(low_price)),
            low_price=priceFixer(str(low_price)),
            high_price=priceFixer(str(high_price)),
            product_URL=url,
            market = market,
            day = day,
            month = month,
            year = year,
            )
            
            market.market_day = day
            market.market_month = month
            market.market_year = year
            market.save()
            new_data.save()
            i+=1
            path_lenght-=1
        url_tag.url_status = response.status
        url_tag.save()

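Starting the crawler

import datetime
import importlib
import logging

from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings

def startSpider(spider_name,spider_class):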
    name = 'first_bot.first_bot.spiders.'+spider_name
    now = datetime.datetime.today()
    now_time = now.strftime("%d-%m-%y")
    i = importlib.import_module(name)
    class_ = getattr(i, spider_class)
    configure_logging(install_root_handler=False)

    logging.basicConfig(
        filename='scrapy-log-'+now_time+'.txt',
        format='%(levelname)s: %(message)s',
        level=logging.INFO
    )
    try:
        runner = CrawlerRunner(get_project_settings())
        runner.crawl(class_)
    except OSError as e:
        print("Failed with:", e.strerror)
        print("Error code:", e.errno)  # concatenating str + OSError would raise TypeError

I want to add something to my spider or to my start script so that I can print my error count and crawled count, like the values in the following log:

'downloader/request_bytes': 498,
 'downloader/request_count': 2,
 'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 8728,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 1,
 'downloader/response_status_count/301': 1,
 'elapsed_time_seconds': 12.462258,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2020, 9, 29, 10, 34, 20, 855907),
 'log_count/DEBUG': 2,
 'log_count/ERROR': 1,
 'log_count/INFO': 10,

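I have Scrapy 2.3.0 and Python 3. I don't know which library and logger I should use. What do you suggest?

Moein Kameli

To find out how many errors occur at run time, register `error_count` as a stat and increment it on every occurrence. To do that, wrap the body of your `parse()` method in a `try`/`except` block and increment the stat each time the `except` block runs: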

def parse(self, response):
    stats = self.crawler.stats
    # register error_count with an initial value of 0, without resetting it on later calls
    if stats.get_value('error_count') is None:
        stats.set_value('error_count', 0)
    try:
        pass  # try block action
    except Exception:
        # except block action
        stats.inc_value('error_count')  # increment every time an error occurs

The log will then look like this:

'error_count': n,
'downloader/request_bytes': 498,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
...
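
To get these values as plain integers rather than only as log lines, one option is to read the stats dict that Scrapy's stats collector exposes through `crawler.stats.get_stats()`. The sketch below is only an illustration: the helper name `extract_counts` and the keys of the dict it returns are made up for this example, and the sample dict reuses values from the log in the question instead of coming from a real crawl.

```python
from typing import Any, Dict


def extract_counts(stats: Dict[str, Any]) -> Dict[str, int]:
    """Pick a few counters out of a Scrapy stats dict as ints.

    Missing keys default to 0, because Scrapy only creates a stat
    the first time it is set or incremented.
    """
    return {
        "log_errors": int(stats.get("log_count/ERROR", 0)),
        "requests": int(stats.get("downloader/request_count", 0)),
        "responses": int(stats.get("downloader/response_count", 0)),
        # custom stat from the answer above; absent in this sample
        "parse_errors": int(stats.get("error_count", 0)),
    }


# Sample values copied from the log shown in the question.
sample_stats = {
    "downloader/request_count": 2,
    "downloader/response_count": 2,
    "log_count/DEBUG": 2,
    "log_count/ERROR": 1,
    "log_count/INFO": 10,
}

counts = extract_counts(sample_stats)
print(counts["log_errors"], counts["requests"])

# Hypothetical hook inside the spider (Scrapy calls closed() when the
# spider finishes, and self.crawler.stats is the stats collector):
#
#     def closed(self, reason):
#         counts = extract_counts(self.crawler.stats.get_stats())
#         # counts now holds plain ints that can be written to the db
```

Because the helper only takes a dict, it can be called from the spider's `closed()` method or from the start script, wherever the crawler object is available.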


Edited on 2021-04-5
