python3 xpath can't reach a child node (AttributeError: 'NoneType' object has no attribute 'text')

I need help with an issue I haven't been able to solve.

I have an XML document like this:

<forecast xmlns="http://weather.yandex.ru/forecast" country_id="8996ba26eb0edf7ea5a055dc16c2ccbd" part="Лен Стокгольм" link="http://pogoda.yandex.ru/stockholm/" part_id="53f767b78d8f180c28d55ebda1d07e0c" lat="59.381981" slug="stockholm" city="Стокгольм" climate="1" country="Швеция" region="10519" lon="17.956846" zoom="12" id="2464" source="Station" exactname="Стокгольм" geoid="10519">
<fact>...</fact>
<yesterday id="435077826">...</yesterday>
<informer>...</informer>
<day date="2016-04-18">
    <sunrise>05:22</sunrise>
    <sunset>20:12</sunset>
    <moon_phase code="growing-moon">14</moon_phase>
    <moonrise>15:53</moonrise>
    <moonset>04:37</moonset>
    <biomet index="3" geomag="2" low_press="1" uv="1">...</biomet>
    <day_part typeid="1" type="morning">...</day_part>
    <day_part typeid="2" type="day">...</day_part>
    <day_part typeid="3" type="evening">...</day_part>
    <day_part typeid="4" type="night">...</day_part>
    <day_part typeid="5" type="day_short">
        <temperature>11</temperature>
    </day_part>
</day>
</forecast>

(The entire XML is available at https://export.yandex.ru/weather-ng/forecasts/2464.xml.) I need to get the temperature text (11), and I'm trying this code:

import urllib.request
import codecs
import lxml
from xml.etree import ElementTree as ET

def gen_ns(tag):
    if tag.startswith('{'):
        ns, tag = tag.split('}') 
        return ns[1:]
    else:
        return ''
with codecs.open(fname, 'r', encoding = 'utf-8') as t:
        town_tree = ET.parse(t)
        town_root = town_tree.getroot() 
        print (town_root)

        namespaces = {'ns': gen_ns(town_root.tag)}
        print (namespaces)

        for day in town_root.iterfind('ns:day', namespaces):
            date = (day.get('date'))
            print (date)
            day_temp = day.find('.//*[@type="day_short"]/temperature')  
            print (day_temp.text)

I get:

Traceback (most recent call last):
  File "weather.py", line 154, in <module>
    print (day_temp.text)
AttributeError: 'NoneType' object has no attribute 'text'

What's wrong with my XPath? I can get the element matched by './/*[@type="day_short"]' and read its attributes, but I can't get the text of its child (temperature). Thanks, everyone!
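
For reference, here is a minimal sketch of one likely cause, assuming the default namespace declared on forecast is the culprit. In ElementTree every descendant inherits that namespace, so the child's real tag is {http://weather.yandex.ru/forecast}temperature. The wildcard * with an attribute test still matches, because the type attribute itself carries no namespace, but the bare name temperature does not. The trimmed SAMPLE_XML and the helper names below are made up for illustration and are not part of the original script.

```python
from xml.etree import ElementTree as ET

# Trimmed-down copy of the sample document from the question (illustrative only).
SAMPLE_XML = """<forecast xmlns="http://weather.yandex.ru/forecast" id="2464">
<day date="2016-04-18">
    <sunrise>05:22</sunrise>
    <day_part typeid="4" type="night"/>
    <day_part typeid="5" type="day_short">
        <temperature>11</temperature>
    </day_part>
</day>
</forecast>"""

# Prefix-to-URI mapping; 'ns' is an arbitrary prefix chosen for the queries below.
NAMESPACES = {'ns': 'http://weather.yandex.ru/forecast'}


def unqualified_lookup(xml_text):
    """Repeat the lookup from the question: the child name has no prefix."""
    root = ET.fromstring(xml_text)
    day = root.find('ns:day', NAMESPACES)
    # 'temperature' without a prefix means "no namespace", so nothing matches.
    return day.find('.//*[@type="day_short"]/temperature')


def day_short_temperatures(xml_text):
    """Return (date, temperature text) pairs using a namespace-qualified child."""
    root = ET.fromstring(xml_text)
    results = []
    for day in root.iterfind('ns:day', NAMESPACES):
        # Qualify the child with the same prefix and pass the mapping to find().
        temp = day.find('.//*[@type="day_short"]/ns:temperature', NAMESPACES)
        if temp is not None:  # guard against days without a day_short part
            results.append((day.get('date'), temp.text))
    return results


print(unqualified_lookup(SAMPLE_XML))
print(day_short_temperatures(SAMPLE_XML))
```

The None check also avoids the AttributeError when a day element has no matching day_part.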


Source: https://stackoverflow.com/questions/36726768
Updated: 2024-04-13 09:04

Best answer

  • dict{...} is wrong; it should be dict(...). The same goes for OrderedDict{...}.
  • dict and OrderedDict take a sequence of key/value pairs as their argument.
  • You have ('e':8,'data[1]','9') within your list of tuples. It should probably be ('e',8),('data[1]','9').

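This produces a dict (equivalent to the working dict literal you posted). A dict does not preserve insertion order before Python 3.7, and in any version it keeps only the last value for a repeated key such as 'f', 's', or 't':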

payload = dict([('f','1'),('s','2'),('t','3'),('f','4'),('ft','5'),('s','6'),('se','7'),('e', 8),('data[1]','9'),('t','10'),('el','1q'),('data[2]','12'),('data[3]','13'),('data[4]','14'),('htmldata[5]','15')])

This produces a tuple of tuples, which requests doesn't accept as the data argument:

payload = (('f','1'),('s','2'),('t','3'),('f','4'),('ft','5'),('s','6'),('se','7'),('e', 8),('data[1]','9'),('t','10'),('el','1q'),('data[2]','12'),('data[3]','13'),('data[4]','14'),('htmldata[5]','15'))

The remaining two (an ordered dictionary and a list of tuples) will produce what you want:

from collections import OrderedDict

# Option 1: an ordered dictionary built from a list of (key, value) tuples
payload = OrderedDict([('f','1'),('s','2'),('t','3'),('f','4'),('ft','5'),('s','6'),('se','7'),('e',8),('data[1]','9'),('t','10'),('el','1q'),('data[2]','12'),('data[3]','13'),('data[4]','14'),('htmldata[5]','15')])

# Option 2: a plain list of (key, value) tuples
payload = [('f','1'),('s','2'),('t','3'),('f','4'),('ft','5'),('s','6'),('se','7'),('e', 8),('data[1]','9'),('t','10'),('el','1q'),('data[2]','12'),('data[3]','13'),('data[4]','14'),('htmldata[5]','15')]
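
To see how these containers differ once they are turned into form data, here is a small sketch that uses the standard library's urllib.parse.urlencode as a stand-in for the encoding step. That requests encodes the data argument the same way is an assumption here, and the shortened pairs list is illustrative, not the full payload. Note that an OrderedDict is still a mapping, so it holds one value per key.

```python
from collections import OrderedDict
from urllib.parse import urlencode

# Shortened, illustrative payload with a repeated key ('f').
pairs = [('f', '1'), ('s', '2'), ('t', '3'), ('f', '4')]

# A list of tuples keeps every pair, in the order given.
as_list = urlencode(pairs)

# A dict keeps only the last value stored under a repeated key.
as_dict = urlencode(dict(pairs))

# An OrderedDict remembers insertion order, but a repeated key
# still overwrites the earlier value.
as_ordered = urlencode(OrderedDict(pairs))

print(as_list)
print(as_dict)
print(as_ordered)
```

If the repeated keys have to reach the server, the list of tuples is the only one of the three that carries them all.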
