[Help] Setting up a Python crawler environment, hit AttributeError: 'X509' object has no attribute 'get_extensi
The code is simple: on the first page of Douban Top 250, start by finding the tag that holds the rank number.
from scrapy import Request
from scrapy.spiders import Spider


class m250Spider(Spider):
    name = 'm250'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36',
    }

    def start_requests(self):
        url = 'http://movie.'
        yield Request(url, headers=self.headers)

    def parse(self, response):
        # Extract the rank number held in each entry's <em> tag
        movie_list = response.xpath('.//div[@class="pic"]/em/text()').extract()
        print(movie_list)
Running it gives this error:
2017-03-29 14:57:09 [scrapy] INFO: Scrapy 1.2.1 started (bot: firstspider)
2017-03-29 14:57:09 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'firstspider.spiders', 'SPIDER_MODULES': ['firstspider.spiders'], 'ROBOTSTXT_OBEY': True, 'BOT_NAME': 'firstspider'}
2017-03-29 14:57:09 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
2017-03-29 14:57:09 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-03-29 14:57:09 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-03-29 14:57:09 [scrapy] INFO: Enabled item pipelines:
[]
2017-03-29 14:57:09 [scrapy] INFO: Spider opened
2017-03-29 14:57:09 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-03-29 14:57:09 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-03-29 14:57:09 [scrapy] DEBUG: Crawled (403) <GET http://movie. (referer: None)
2017-03-29 14:57:09 [scrapy] DEBUG: Redirecting (301) to <GET https://movie. from <GET http://movie.
Error during info_callback
Traceback (most recent call last):
File "D:\Python27\lib\site-packages\twisted\internet\tcp.py", line 208, in doRead
return self._dataReceived(data)
File "D:\Python27\lib\site-packages\twisted\internet\tcp.py", line 214, in _dataReceived
rval = self.protocol.dataReceived(data)
File "D:\Python27\lib\site-packages\twisted\protocols\tls.py", line 415, in dataReceived
self._checkHandshakeStatus()
File "D:\Python27\lib\site-packages\twisted\protocols\tls.py", line 335, in _checkHandshakeStatus
self._tlsConnection.do_handshake()
--- <exception caught here> ---
File "D:\Python27\lib\site-packages\twisted\internet\_sslverify.py", line 1148, in infoCallback
return wrapped(connection, where, ret)
File "D:\Python27\lib\site-packages\scrapy\core\downloader\tls.py", line 52, in _identityVerifyingInfoCallback
verifyHostname(connection, self._hostnameASCII)
File "D:\Python27\lib\site-packages\service_identity\pyopenssl.py", line 44, in verify_hostname
cert_patterns=extract_ids(connection.get_peer_certificate()),
File "D:\Python27\lib\site-packages\service_identity\pyopenssl.py", line 66, in extract_ids
for i in range(cert.get_extension_count()):
exceptions.AttributeError: 'X509' object has no attribute 'get_extension_count'
2017-03-29 14:57:09 [twisted] CRITICAL: Error during info_callback
2017-03-29 14:57:09 [scrapy] DEBUG: Retrying <GET https://movie. (failed 1 times): [<twisted.python.failure.Failure exceptions.AttributeError: 'X509' object has no attribute 'get_extension_count'>]
2017-03-29 14:57:10 [twisted] CRITICAL: Error during info_callback
2017-03-29 14:57:10 [scrapy] DEBUG: Retrying <GET https://movie. (failed 2 times): [<twisted.python.failure.Failure exceptions.AttributeError: 'X509' object has no attribute 'get_extension_count'>]
2017-03-29 14:57:10 [twisted] CRITICAL: Error during info_callback
2017-03-29 14:57:10 [scrapy] DEBUG: Gave up retrying <GET https://movie. (failed 3 times): [<twisted.python.failure.Failure exceptions.AttributeError: 'X509' object has no attribute 'get_extension_count'>]
2017-03-29 14:57:10 [scrapy] ERROR: Error downloading <GET https://movie. [<twisted.python.failure.Failure exceptions.AttributeError: 'X509' object has no attribute 'get_extension_count'>]
2017-03-29 14:57:10 [scrapy] INFO: Closing spider (finished)
2017-03-29 14:57:10 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
'downloader/exception_type_count/twisted.web._newclient.ResponseNeverReceived': 3,
'downloader/request_bytes': 1428,
'downloader/request_count': 5,
'downloader/request_method_count/GET': 5,
'downloader/response_bytes': 611,
'downloader/response_count': 2,
'downloader/response_status_count/301': 1,
'downloader/response_status_count/403': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2017, 3, 29, 6, 57, 10, 515000),
'log_count/CRITICAL': 3,
'log_count/DEBUG': 6,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'response_received_count': 1,
'scheduler/dequeued': 4,
'scheduler/dequeued/memory': 4,
'scheduler/enqueued': 4,
'scheduler/enqueued/memory': 4,
'start_time': datetime.datetime(2017, 3, 29, 6, 57, 9, 375000)}
2017-03-29 14:57:10 [scrapy] INFO: Spider closed (finished)
I think the main problem is this:
File "D:\Python27\lib\site-packages\service_identity\pyopenssl.py", line 66, in extract_ids
for i in range(cert.get_extension_count()):
exceptions.AttributeError: 'X509' object has no attribute 'get_extension_count'
pyopenssl was installed with pip, version 16.2.0. I don't know why it raises this error. Any pointers would be much appreciated, thanks.
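One thing that could be checked here is which copies of the libraries Python actually imports, and whether the imported X509 class has the method the traceback complains about. The snippet below is only a rough sketch of that check. It assumes the mismatch lies between the installed pyOpenSSL and service_identity, which is not confirmed, and `describe_module` and `class_has_attribute` are helper names made up for this example.

```python
import importlib


def describe_module(name):
    """Return (version, path) for an importable module, or None if it is missing."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, '__version__', 'unknown')
    path = getattr(module, '__file__', 'built-in')
    return version, path


def class_has_attribute(module_name, class_name, attribute):
    """Report whether module_name.class_name exists and defines the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    cls = getattr(module, class_name, None)
    return cls is not None and hasattr(cls, attribute)


if __name__ == '__main__':
    # Packages that appear in the traceback; each may or may not be installed.
    for name in ('OpenSSL', 'service_identity', 'twisted', 'scrapy'):
        print('%s: %s' % (name, describe_module(name)))
    # The method that service_identity tries to call in extract_ids.
    found = class_has_attribute('OpenSSL.crypto', 'X509', 'get_extension_count')
    print('X509.get_extension_count present: %s' % found)
```

If the path printed for OpenSSL is not under D:\Python27\lib\site-packages, or the version is not 16.2.0, a different copy is being imported than the one pip installed. Comparing the service_identity version against its latest release would also be worth doing.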
[This post was edited by the author on 2017-3-29 15:11]