How To Write A Regex To Match Multiple File Paths

I want to match files located in multiple directories. The file path could be local (C:/users/path/image.png) or on a system (//home/user/web/image.png). For the first case,

Solution 1:

What you're trying to get from the match is not clear - maybe you just want the full string?

((?:(?:[cC]:)|//home)[^\.]+\.[A-Za-z]{3})

An unescaped dot (.) matches almost any character (by default, everything except a newline). To match a literal dot, escape it as \. (inside a character class such as [^.] the escape is optional).
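To make the difference concrete, here is a small sketch in Python 3 syntax. The sample strings are made up for illustration and are not from the question:

```python
import re

# Unescaped dot: matches any character except a newline (default behaviour).
unescaped = re.compile(r"image.png")
# Escaped dot: matches only a literal ".".
escaped = re.compile(r"image\.png")

samples = ["image.png", "imageXpng"]  # made-up test strings

unescaped_hits = [s for s in samples if unescaped.fullmatch(s)]
escaped_hits = [s for s in samples if escaped.fullmatch(s)]

print(unescaped_hits)  # ['image.png', 'imageXpng']
print(escaped_hits)    # ['image.png']
```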

Test runs:

>>> import re
>>> print(re.match(r"((?:(?:[cC]:)|//home)[^\.]+\.[A-Za-z]{3})", "//home/user/web/image.png").groups())
('//home/user/web/image.png',)

>>> print(re.match(r"((?:(?:[cC]:)|//home)[^\.]+\.[A-Za-z]{3})", "C:/users/path/image.png").groups())
('C:/users/path/image.png',)

And one for the usual Windows path syntax (a raw string keeps the backslashes literal):

>>> print(re.match(r"((?:(?:[cC]:)|//home)[^\.]+\.[A-Za-z]{3})", r"C:\users\path\image.png").groups())
('C:\\users\\path\\image.png',)

If there's a need to support .jpeg, increase the max allowed occurrences for the extensions from {3} to {3,4}.
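As a rough sketch of that change (Python 3 syntax; the .jpeg path is a made-up example), the snippet below compares the original {3} quantifier with {3,4}. Because re.match() does not anchor the end of the string, the {3} version still matches a .jpeg path but cuts the extension short:

```python
import re

# Solution 1's pattern with the original 3-letter extension limit.
strict = re.compile(r"((?:(?:[cC]:)|//home)[^.]+\.[A-Za-z]{3})")
# Same pattern with the extension relaxed to 3 or 4 letters.
relaxed = re.compile(r"((?:(?:[cC]:)|//home)[^.]+\.[A-Za-z]{3,4})")

jpeg_path = "//home/user/web/photo.jpeg"  # hypothetical path for illustration

strict_result = strict.match(jpeg_path).group(1)
relaxed_result = relaxed.match(jpeg_path).group(1)
png_result = relaxed.match("C:/users/path/image.png").group(1)

print(strict_result)   # //home/user/web/photo.jpe
print(relaxed_result)  # //home/user/web/photo.jpeg
print(png_result)      # C:/users/path/image.png
```

If partial matches like the first one are unwanted, re.fullmatch() is one way to require that the whole string matches.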

Solution 2:

Try

((c|C|//home)[^.]+[.][A-Za-z]{3})

If you use findall() with a pattern that contains groups, the matches are returned as a list of tuples, one tuple per match, holding the text captured by each group. That is why the regex above wraps the whole expression in an outer group: only captured groups show up in the return value of findall(). See the following code:

import re

smth = "//home/user/web/image.png C:/users/path/image.png c:/web/image.png"
ip = re.findall(r"((c|C|//home)[^.]+[.][A-Za-z]{3})", smth)
print(ip)
[('//home/user/web/image.png', '//home'), ('C:/users/path/image.png', 'C'), ('c:/web/image.png', 'c')]
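If only the full paths are needed, one alternative (a sketch, not part of the original answer) is to make the prefix group non-capturing with (?:...). With no capturing groups in the pattern, findall() returns the whole matches as plain strings:

```python
import re

smth = "//home/user/web/image.png C:/users/path/image.png c:/web/image.png"

# (?:...) groups without capturing, so findall() returns each whole match
# as a string instead of a tuple of groups.
paths = re.findall(r"(?:c|C|//home)[^.]+[.][A-Za-z]{3}", smth)
print(paths)
# ['//home/user/web/image.png', 'C:/users/path/image.png', 'c:/web/image.png']
```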
