Warning: Selector got both text and root, root is being ignored
I wrote a Scrapy spider that used to work fine, but it suddenly started emitting this warning:
/home/user/github-repos/scrapper/scrapper/env/lib/python3.8/site-packages/scrapy/selector/unified.py:83: UserWarning: Selector got both text and root, root is being ignored.
  super().__init__(text=text, type=st, root=root, **kwargs)
On closer inspection, the part that produces the warning looks like this:
    __slots__ = ["response"]
    selectorlist_cls = SelectorList

    def __init__(self, response=None, text=None, type=None, root=None, **kwargs):
        if response is not None and text is not None:
            raise ValueError(
                f"{self.__class__.__name__}.__init__() received "
                "both response and text"
            )

        st = _st(response, type)

        if text is not None:
            response = _response_from_text(text, st)

        if response is not None:
            text = response.text
            kwargs.setdefault("base_url", response.url)

        self.response = response
        super().__init__(text=text, type=st, root=root, **kwargs)
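The warning says that root is being ignored, even though the constructor accepts it. This class belongs to the Scrapy package, so the warning may be related to an update on their side.
This is the only part of my code that interacts with selectors: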
    def load_item(self, response: TextResponse, app_id, db_id, urls):
        loader = AppLoader(response=response)
        loader.add_value("app_id", app_id)
        loader.add_value("db_id", db_id)
        loader.add_value("url", response.url)
        loader.add_css("game_title", "#appHubAppName::text")
        loader.add_css("publisher", "#game_highlights .dev_row+ .dev_row a::text")
        loader.add_css("developer", "#developers_list a::text")
        loader.add_css("publish_date", ".date::text")
        loader.add_css("tags", "#glanceCtnResponsiveRight a::text")
        loader.add_css(
            "review_count", "#review_type_all+ label .user_reviews_count::text"
        )
        loader.add_css(
            "positive_review_count",
            "#review_type_positive+ label .user_reviews_count::text",
        )
        loader.add_css(
            "negative_review_count",
            "#review_type_negative+ label .user_reviews_count::text",
        )
        loader.add_value("file_urls", urls)
        return loader.load_item()
I also get this warning with the quotes example whenever I call a css or xpath selector, for instance response.css('div.quote'). Can anyone confirm?
- Mike42 2023-04-28
The dependency package parsel (https://github.com/scrapy/parsel/blob/master/parsel/selector.py) changed in version 1.8.1. Compared with version 1.7.0 (commit 3b3ec90), the kwarg root: Optional[Any] = None in the __init__ of class Selector was changed to root: Optional[Any] = _NOT_SET.
The class Selector in the scrapy package (https://github.com/scrapy/scrapy/blob/master/scrapy/selector/unified.py) passes root=None to its superclass in the parsel package by default. That is what triggers the warning in the __init__ of parsel's class Selector. I will open an issue there.
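The mechanism described above can be reproduced without either library. The sketch below is only an illustration: BaseSelector and WrapperSelector are hypothetical stand-ins, not the real parsel or Scrapy classes, and the parent's check is an assumption about how a sentinel default behaves, not a copy of parsel's code.

```python
import warnings

# Sentinel that distinguishes "argument omitted" from "None passed explicitly".
_NOT_SET = object()


class BaseSelector:
    """Hypothetical stand-in for the parent class (not parsel's actual code)."""

    def __init__(self, text=None, root=_NOT_SET):
        # Assumed check: warn whenever text is given and root was passed at all.
        if text is not None and root is not _NOT_SET:
            warnings.warn("Selector got both text and root, root is being ignored")
        self.text = text


class WrapperSelector(BaseSelector):
    """Hypothetical stand-in for a subclass that still defaults root to None."""

    def __init__(self, text=None, root=None):
        # root=None is forwarded explicitly, so the parent treats it as "set".
        super().__init__(text=text, root=root)


def count_warnings(factory):
    """Return how many warnings a call to factory() emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        factory()
    return len(caught)
```

Under these assumptions, building WrapperSelector(text=...) warns even though the caller never supplied root, while building BaseSelector(text=...) directly stays silent.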
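Issue opened for this problem: https://github.com/scrapy/scrapy/issues/5913
The warning is harmless.
I have run into this before. In my case it was caused by a missing quotation mark in one element of my xpath.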
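If the warning really is harmless in your setup, one possible stopgap is to filter it out with the standard library's warnings module until the dependencies are updated. This is a sketch, not an official Scrapy recommendation; the helper name silence_selector_root_warning is made up for illustration, and the message prefix is taken from the warning text quoted in the question.

```python
import warnings


def silence_selector_root_warning():
    """Ignore only the 'text and root' UserWarning; other warnings still show."""
    warnings.filterwarnings(
        "ignore",
        # Regex matched against the start of the warning message.
        message="Selector got both text and root",
        category=UserWarning,
    )
```

Calling the helper once before the spider starts, for example at the top of the spider module, should be enough, since the filter is process-wide.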