Beautiful Soup | find_all method
Beautiful Soup's find_all(~) method returns a list of all the tags or strings that match the given criteria.
Parameters
1. name | string | optional
The name of the tag to match.
2. attrs | dict | optional
A dictionary mapping attribute names to the values to filter by.
3. recursive | boolean | optional
Whether to search all descendants of the tag rather than only its direct children. Defaults to recursive=True.
4. string | string | optional
The string to search for (rather than a tag).
5. limit | number | optional
The maximum number of elements to return. By default, all matches are returned.
Examples
Consider the following HTML document:
from bs4 import BeautifulSoup

my_html = """
<div>
  <p id="alex">Alex</p>
  <p class="Bob">Bob</p>
  <p id="cathy">Cathy</p>
</div>
"""
soup = BeautifulSoup(my_html, "html.parser")
Find by Tag Name
To return a list of all the <p> tags:
soup.find_all("p")
[<p id="alex">Alex</p>, <p class="Bob">Bob</p>, <p id="cathy">Cathy</p>]
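The name argument is not limited to a single string. As a minimal sketch, passing a list of tag names matches any tag whose name appears in the list, and results come back in document order. The HTML is the same document used above; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

my_html = """
<div>
  <p id="alex">Alex</p>
  <p class="Bob">Bob</p>
  <p id="cathy">Cathy</p>
</div>
"""
soup = BeautifulSoup(my_html, "html.parser")

# A list of names matches tags with any of those names.
matches = soup.find_all(["div", "p"])

# Collect just the tag names to see the order of the results.
tag_names = [tag.name for tag in matches]
print(tag_names)  # ['div', 'p', 'p', 'p']
```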
Find by Attribute
To find all tags with id="cathy":
soup.find_all(id="cathy")
[<p id="cathy">Cathy</p>]
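Some attribute names, such as those containing a hyphen, cannot be written as Python keyword arguments. In that case the attrs dictionary can be used instead. The sketch below assumes a made-up data-role attribute purely for illustration; it is not part of the document used elsewhere on this page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: the data-role attribute is invented for this example.
html = """
<div>
  <p id="alex" data-role="admin">Alex</p>
  <p id="bob" data-role="guest">Bob</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# data-role is not a valid keyword argument name, so pass it through attrs.
admins = soup.find_all(attrs={"data-role": "admin"})

admin_names = [tag.get_text() for tag in admins]
print(admin_names)  # ['Alex']
```

For attributes with ordinary names, such as id, the keyword form soup.find_all(id="alex") and the dictionary form soup.find_all(attrs={"id": "alex"}) select the same tags.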
Find by Class
To find all tags with class="Bob":
soup.find_all(class_="Bob")
[<p class="Bob">Bob</p>]
Notice that we have to use class_ rather than class, because class is a reserved word in Python.
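A tag can carry several CSS classes at once. As a small sketch, class_ matches a tag if the given value is any one of its classes. The class names member and active below are hypothetical and chosen only for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: the first <p> has two classes, the second has one.
html = '<p class="member active">Alex</p><p class="member">Bob</p>'
soup = BeautifulSoup(html, "html.parser")

# Both tags have the class "member".
members = [tag.get_text() for tag in soup.find_all(class_="member")]

# Only the first tag also has the class "active".
active = [tag.get_text() for tag in soup.find_all(class_="active")]

print(members)  # ['Alex', 'Bob']
print(active)   # ['Alex']
```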
Recursive
Consider the following HTML:
my_html = """
<div id="people">
  <p>Alex</p>
  <div>
    <p>Bob</p>
    <p>Cathy</p>
  </div>
</div>
"""
soup = BeautifulSoup(my_html, "html.parser")
To recursively look for <p> tags under the <div id="people"> tag:
soup.find(id="people").find_all("p")
[<p>Alex</p>, <p>Bob</p>, <p>Cathy</p>]
To look only for <p> tags directly under the <div id="people"> tag:
soup.find(id="people").find_all("p", recursive=False)
[<p>Alex</p>]
Note that only the <p> tag that is a direct child of the <div id="people"> tag is returned.
Find by String
As a reminder, here is the HTML we are working with:
my_html = """
<div>
  <p id="alex">Alex</p>
  <p class="Bob">Bob</p>
  <p id="cathy">Cathy</p>
</div>
"""
soup = BeautifulSoup(my_html, "html.parser")
To find all the strings "Alex" and "Cathy":
soup.find_all(string=["Alex", "Cathy"])
['Alex', 'Cathy']
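The string argument can also be combined with a tag name, or given as a compiled regular expression. The following is a rough sketch of both uses on a compact version of the same HTML. When a tag name is supplied, the result contains the tags whose text matches rather than the strings themselves.

```python
import re

from bs4 import BeautifulSoup

my_html = '<div><p id="alex">Alex</p><p class="Bob">Bob</p><p id="cathy">Cathy</p></div>'
soup = BeautifulSoup(my_html, "html.parser")

# With a tag name, find_all returns the <p> tags whose string is "Alex".
tags = soup.find_all("p", string="Alex")
print(tags)  # [<p id="alex">Alex</p>]

# With a regular expression, find_all returns strings starting with A or C.
strings = soup.find_all(string=re.compile("^[AC]"))
print(strings)  # ['Alex', 'Cathy']
```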
Limit
To limit the number of returned results to 2:
soup.find_all("p", limit=2)
[<p id="alex">Alex</p>, <p class="Bob">Bob</p>]
Note that only the first two <p> tags are returned.
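When only the first match is needed, the related find method may be more convenient than limit=1. The sketch below contrasts the two on a compact version of the same HTML: find_all always returns a list, while find returns the tag itself, or None when nothing matches.

```python
from bs4 import BeautifulSoup

my_html = '<div><p id="alex">Alex</p><p class="Bob">Bob</p><p id="cathy">Cathy</p></div>'
soup = BeautifulSoup(my_html, "html.parser")

# find_all with limit=1 still returns a list, here with a single tag.
first_as_list = soup.find_all("p", limit=1)

# find returns the first matching tag directly.
first_tag = soup.find("p")

# When nothing matches, find_all gives an empty list and find gives None.
missing_list = soup.find_all("span")
missing_tag = soup.find("span")

print(first_as_list)  # [<p id="alex">Alex</p>]
print(first_tag)      # <p id="alex">Alex</p>
print(missing_list)   # []
print(missing_tag)    # None
```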