How To Use Beautifulsoup To Get The Same Result Obtained By Regex?

June 16, 2024

I'm trying to extract all the values (which are links) of the attribute data-src-mp3 in the content1 generated from the URL. The link is contained in [...]

mp3s = [tag.attrs['data-src-mp3'] for tag in soup.select('.cB.cB-def.dictionary.biling [data-src-mp3]')]

mp3s = list(map(lambda tag: tag.attrs['data-src-mp3'],
                soup.select('.cB.cB-def.dictionary.biling [data-src-mp3]')))

[data-src-mp3] selects only elements that have the data-src-mp3 attribute (with any value).
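To make the selector's behaviour concrete, here is a minimal sketch run against a made-up HTML fragment. The markup and the example.com URLs are assumptions for illustration only; the real page will differ. It shows that the attribute only has to be present (an empty value still matches) and that the space before [data-src-mp3] restricts the match to descendants of the container.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; not the real page.
html = """
<div class="cB cB-def dictionary biling">
  <span data-src-mp3="https://example.com/audio/one.mp3"></span>
  <span data-src-mp3=""></span>
  <span data-src-ogg="https://example.com/audio/one.ogg"></span>
</div>
<div class="other">
  <span data-src-mp3="https://example.com/audio/outside.mp3"></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The space is a descendant combinator: only elements inside the
# container that carry the attribute are selected, whatever its value.
tags = soup.select(".cB.cB-def.dictionary.biling [data-src-mp3]")
values = [tag["data-src-mp3"] for tag in tags]

print(values)  # ['https://example.com/audio/one.mp3', '']
```

The span with data-src-ogg and the span inside the "other" div are both skipped, the first because it lacks the attribute and the second because it sits outside the container.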

With a small change, the attribute name 'data-src-mp3' can be kept in a single place:

mp3_tag = 'data-src-mp3'
mp3s = list(map(lambda tag: tag.attrs[mp3_tag],
                soup.select('.cB.cB-def.dictionary.biling [{}]'.format(mp3_tag))))

This solution might look more intimidating at first, but it is much better than relying on the wrong tool (such as regex for parsing HTML).
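As a rough illustration of why regex is the wrong tool here, the sketch below compares the two approaches on a made-up fragment. The markup, the example.com URLs, and the regex pattern are all assumptions; the pattern is a typical naive attempt, not the one from the original question.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; not the real page.
html = """
<div class="cB cB-def dictionary biling">
  <span data-src-mp3="https://example.com/audio/one.mp3"></span>
  <!-- <span data-src-mp3="https://example.com/audio/commented-out.mp3"></span> -->
</div>
<div class="sidebar">
  <span data-src-mp3="https://example.com/audio/sidebar.mp3"></span>
</div>
"""

# Naive regex (an assumed pattern): it sees raw text only, so it also
# matches inside HTML comments and outside the intended container.
regex_links = re.findall(r'data-src-mp3="(.*?)"', html)

# BeautifulSoup works on the parsed tree: comments are not tags, and
# the CSS selector limits the search to the dictionary container.
soup = BeautifulSoup(html, "html.parser")
soup_links = [
    tag["data-src-mp3"]
    for tag in soup.select(".cB.cB-def.dictionary.biling [data-src-mp3]")
]

print(len(regex_links))  # 3
print(soup_links)        # ['https://example.com/audio/one.mp3']
```

On a page where every occurrence of the attribute is wanted, the two results can coincide. The difference appears once the markup contains comments, other containers, or different quoting.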
