How to get the href attribute value from a regex in Python?
The href attribute value can be extracted in Python with the re.findall() function. It takes a regular expression pattern and a string as arguments and returns a list of all non-overlapping matches in the string. When the pattern contains a single capturing group, the list holds the text captured by that group rather than the whole match.
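To make the return value of re.findall() concrete, here is a minimal sketch comparing a pattern without a capturing group to one with a single group. The sample string and the variable names are made up for illustration.

```python
import re

# Hypothetical sample input with two links.
text = '<a href="/one">1</a> <a href="/two">2</a>'

# No capturing group: findall returns the full text of each match.
whole_matches = re.findall(r'href="[^"]*"', text)

# One capturing group: findall returns only what the group captured.
group_values = re.findall(r'href="([^"]*)"', text)

print(whole_matches)
print(group_values)
```

The second form is the one that yields bare attribute values, which is why the pattern in the example wraps the value in parentheses.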
Example code
import re
html_string = '<a href="https://www.example.com">Example</a>'
href_value = re.findall(r'href="(.*?)"', html_string)
print(href_value)
Output example
['https://www.example.com']
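Note that the output is a list, even when there is only one link. If only the first href is needed as a plain string, re.search() is one possible alternative. The sketch below assumes the same html_string as the example; first_href is an illustrative name.

```python
import re

html_string = '<a href="https://www.example.com">Example</a>'

# re.search returns a match object for the first match, or None if nothing matches.
match = re.search(r'href="(.*?)"', html_string)

# group(1) is the text captured by the first parenthesized group.
first_href = match.group(1) if match else None

print(first_href)
```

Checking for None before calling group() avoids an AttributeError when the string contains no href attribute.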
Code explanation
- import re: imports the re module, which provides regular expression matching operations.
- html_string = '<a href="https://www.example.com">Example</a>': the HTML string from which the href attribute value is extracted.
- href_value = re.findall(r'href="(.*?)"', html_string): calls re.findall() to extract the href attribute value from the HTML string. The pattern r'href="(.*?)"' matches the literal text href=" and the closing ", and the non-greedy group (.*?) captures the value between them.
- print(href_value): prints the list of captured href values.
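The pattern explained above only matches double-quoted, lowercase href attributes with no spaces around the equals sign. As a hedged sketch of one way to loosen it, the variant below also accepts single quotes, mixed case, and optional whitespace. The sample input is hypothetical, and unquoted attribute values are still not handled; a regex is not a full HTML parser.

```python
import re

# Hypothetical sample input mixing quote styles, letter case, and spacing.
html_string = """<a href='/about'>About</a> <A HREF = "https://www.example.com">Example</A>"""

# (["\']) captures the opening quote; \1 requires the same quote to close the value.
pattern = re.compile(r'href\s*=\s*(["\'])(.*?)\1', re.IGNORECASE)

# With two groups, findall would return tuples, so finditer is used to pick group 2.
href_values = [m.group(2) for m in pattern.finditer(html_string)]

print(href_values)
```

The backreference keeps a value such as href="it's" intact, because the value only ends at the same quote character that opened it.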