Python BeautifulSoup: Finding Elements
Introduction
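- find searches by tag name (and other filters) and returns the first matching element, or None if nothing matches.
- find_all searches the same way and returns all matching elements as a list, which is empty if nothing matches.
- select_one takes a CSS selector and returns the first matching element, or None if nothing matches.
- select takes a CSS selector and returns all matching elements.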
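The four calls can be compared side by side in a minimal sketch. The sample markup and the variable names below are assumptions made for illustration only; the point is the difference between the single-element calls, which give None when nothing matches, and the multi-element calls, which give an empty result.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for this comparison
html_content = '<div>A</div><div>B</div>'
soup = BeautifulSoup(html_content, 'html.parser')

first_div = soup.find('div')        # first matching element
all_divs = soup.find_all('div')     # every matching element
first_sel = soup.select_one('div')  # first match for a CSS selector
all_sel = soup.select('div')        # every match for a CSS selector

# No <span> exists in the sample markup
missing_one = soup.find('span')        # single-element lookup with no match
missing_all = soup.find_all('span')    # multi-element lookup with no match

print(first_div, len(all_divs), first_sel, len(all_sel))
print(missing_one, len(missing_all))
```

Because find and select_one can return None, it is usually worth checking the result before accessing attributes on it.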
find and find_all examples
Example: specifying the tag name with the name argument
Code:
from bs4 import BeautifulSoup
html_content = '''
<div>Test 01</div>
<div>Test 02</div>
<h1>Test H1</h1>
'''
soup = BeautifulSoup(html_content, 'html.parser')
print('--------- find_all ---------')
# find_all returns every <div>, in document order
for ele in soup.find_all(name='div'):
    print(ele)
print('--------- find ---------')
# find returns only the first <div>
print(soup.find(name='div'))
Output:
--------- find_all ---------
<div>Test 01</div>
<div>Test 02</div>
--------- find ---------
<div>Test 01</div>
Example: passing the tag name directly
Code:
from bs4 import BeautifulSoup
html_content = '''
<div>Test 01</div>
<div>Test 02</div>
<h1>Test H1</h1>
'''
soup = BeautifulSoup(html_content, 'html.parser')
print('--------- find_all ---------')
# The tag name can be passed as the first positional argument
for ele in soup.find_all('div'):
    print(ele)
print('--------- find ---------')
print(soup.find('div'))
Output:
--------- find_all ---------
<div>Test 01</div>
<div>Test 02</div>
--------- find ---------
<div>Test 01</div>
Example: specifying multiple tags
Code:
from bs4 import BeautifulSoup
html_content = '''
<div>Test 01</div>
<div>Test 02</div>
<h1>Test H1</h1>
'''
soup = BeautifulSoup(html_content, 'html.parser')
print('--------- find_all ---------')
# A list matches any of the listed tag names; results follow document order
for ele in soup.find_all(['h1', 'h2', 'div']):
    print(ele)
print('--------- find ---------')
# The first match in the document is a <div>, even though 'h1' is listed first
print(soup.find(['h1', 'h2', 'div']))
Output:
--------- find_all ---------
<div>Test 01</div>
<div>Test 02</div>
<h1>Test H1</h1>
--------- find ---------
<div>Test 01</div>
select and select_one examples
Code:
from bs4 import BeautifulSoup
html_content = '''
<div>Test 01</div>
<div>Test 02</div>
'''
soup = BeautifulSoup(html_content, 'html.parser')
print('--------- select ---------')
# select takes a CSS selector and returns every match
for ele in soup.select('div'):
    print(ele)
print('--------- select_one ---------')
# select_one returns only the first match
print(soup.select_one('div'))
Output:
--------- select ---------
<div>Test 01</div>
<div>Test 02</div>
--------- select_one ---------
<div>Test 01</div>
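The select example uses only a bare tag name. As a rough sketch of how the multi-tag lookup from the find_all section might be expressed as a CSS selector instead, the snippet below uses a comma-separated selector list. The sample markup and the variable names are assumptions chosen for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup assumed for this sketch
html_content = '''
<div>Test 01</div>
<div>Test 02</div>
<h1>Test H1</h1>
'''
soup = BeautifulSoup(html_content, 'html.parser')

# A comma-separated selector list matches any of the listed tags
by_selector = soup.select('h1, h2, div')
# The same lookup written with a list of tag names
by_name_list = soup.find_all(['h1', 'h2', 'div'])

for ele in by_selector:
    print(ele)

# select_one with the same selector list returns the first match in the document
first_match = soup.select_one('h1, h2, div')
print(first_match)
```

Which form to use is largely a matter of taste for simple tag lookups; CSS selectors tend to pay off once the condition involves nesting, classes, or ids.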