python-regexHow to match HTML tags with regex in Python?
Matching HTML tags with regex in Python can be done using the re
module. The re.findall()
function can be used to find all occurrences of a pattern in a string. For example, the following code will find all HTML tags in a string:
import re
html_string = "<p>This is a paragraph</p><h1>This is a heading</h1>"
tags = re.findall(r"<[^>]*>", html_string)
print(tags)
Output example
['<p>', '</p>', '<h1>', '</h1>']
The code works by using the re.findall()
function to search for all occurrences of a pattern in a string. The pattern used is r"<[^>]*>"
, which matches any HTML tag. The [^>]
part of the pattern means that any character that is not a >
character can be matched.
The output of the code is a list of all the HTML tags found in the string.
Code explanation
re.findall()
: This function searches for all occurrences of a pattern in a string.r"<[^>]*>"
: This is the pattern used to match HTML tags. The[^>]
part of the pattern means that any character that is not a>
character can be matched.
Helpful links
More of Python Regex
- How to replace all using Python regex?
- How to match a YYYY-MM-DD date with Python Regex?
- How to match a URL path using Python regex?
- How to ignore case in Python regex?
- How to split using Python regex?
- How to remove numbers from a string using Python regex?
- How to replace using Python regex?
- How to match a plus sign in Python regex?
- How to get all matches from a regex in Python?
- How to get a group from a regex in Python?
See more codes...