How To Use BeautifulSoup To Get The Same Result Obtained By Regex?
I'm trying to extract all the values (which are links) of the attribute data-src-mp3 in the content1 generated from the url. The link is contained in [...]
mp3s = [tag.attrs['data-src-mp3'] for tag in soup.select('.cB.cB-def.dictionary.biling [data-src-mp3]')]
or
mp3s = list(map(lambda tag: tag.attrs['data-src-mp3'],
                soup.select('.cB.cB-def.dictionary.biling [data-src-mp3]')))
The [data-src-mp3] selector matches only elements that have the data-src-mp3 attribute (with any value).
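To see what the selector picks up, here is a minimal sketch run against made-up markup. The HTML below is an assumption standing in for content1 (the real page is not shown), and the file paths in it are hypothetical; only the class names and the attribute name come from the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for content1; not the real page.
html = """
<div class="cB cB-def dictionary biling">
  <span class="audio" data-src-mp3="/media/hello.mp3"></span>
  <span class="audio" data-src-ogg="/media/hello.ogg"></span>
  <span class="audio" data-src-mp3=""></span>
</div>
<div class="other">
  <span data-src-mp3="/media/ignored.mp3"></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Descendants of the .cB.cB-def.dictionary.biling container that carry
# a data-src-mp3 attribute, whatever its value (even an empty one).
mp3s = [tag.attrs["data-src-mp3"]
        for tag in soup.select(".cB.cB-def.dictionary.biling [data-src-mp3]")]

print(mp3s)  # ['/media/hello.mp3', '']
```

The element with only data-src-ogg is skipped, the one outside the container is skipped, and the one with an empty data-src-mp3 is still matched, since the selector tests for the attribute's presence and not its value.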
With a small change, 'data-src-mp3' can be kept in a single place:
mp3_tag = 'data-src-mp3'
mp3s = list(map(lambda tag: tag.attrs[mp3_tag],
                soup.select('.cB.cB-def.dictionary.biling [{}]'.format(mp3_tag))))
This solution might look more intimidating at first, but it is much better than relying on the wrong tool for the job, such as regex for parsing HTML.
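As a rough illustration of why regex is fragile here, the sketch below compares the two approaches on made-up markup. Both the HTML and the regex pattern are assumptions for demonstration; the pattern is not the one from the question, just a typical first attempt.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: one live element using single quotes,
# and one commented-out element using double quotes.
html = """
<div class="cB cB-def dictionary biling">
  <span data-src-mp3='/media/a.mp3'></span>
  <!-- <span data-src-mp3="/media/old.mp3"></span> -->
</div>
"""

# Hypothetical regex: assumes double quotes and knows nothing about comments.
regex_hits = re.findall(r'data-src-mp3="([^"]+)"', html)

# The parser handles either quote style and treats the comment as a comment.
soup = BeautifulSoup(html, "html.parser")
parsed = [tag.attrs["data-src-mp3"]
          for tag in soup.select(".cB.cB-def.dictionary.biling [data-src-mp3]")]

print(regex_hits)  # ['/media/old.mp3']
print(parsed)      # ['/media/a.mp3']
```

The regex misses the live link and returns the commented-out one, while the selector returns only the link that is actually part of the document tree.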