Why Can't The Underscore Be Matched By '\W'?
I know that _ cannot be matched by \W while any other punctuation can. As the docs state: \w is a set of alphanumeric characters and the underscore. At the same time: I have alway
Solution 1:
Much of the regular expression syntax in Python's re module comes from Perl, which was in turn influenced by sed and awk. \w comes from there and has a long history.
In the original regex module (which was deprecated in Python 1.5), \w did not include _, as is evident from the Python 1.4 documentation:
\w    Matches any alphanumeric character; this is equivalent to the set [a-zA-Z0-9].
P.S. While it is not very convenient, you can match every non-\w character plus _ with the character class [\W_].
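As a minimal sketch of that character class with the current re module: the sample string and the variable names below are made up for illustration, and the snippet only shows that \W skips the underscore while [\W_] catches it.

```python
import re

# Hypothetical sample input, chosen only for illustration.
text = "foo_bar-baz qux"

# \W alone skips the underscore, because _ belongs to \w.
non_word = re.findall(r"\W", text)

# Adding _ to the class catches the underscore as well.
non_word_or_underscore = re.findall(r"[\W_]", text)

# Splitting on the combined class breaks at underscores too.
parts = re.split(r"[\W_]+", text)

print(non_word)                # ['-', ' ']
print(non_word_or_underscore)  # ['_', '-', ' ']
print(parts)                   # ['foo', 'bar', 'baz', 'qux']
```

If the input can start or end with a separator, re.split will return empty strings at the edges, so you may want to filter those out.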