-----
isHidden: false
menupriority: 1
kind: article
created_at: 2010-02-16T10:33:21+02:00
title: Pragmatic Regular Expression Exclude (2)
tags:
  - regexp
  - regular expression
-----

In my [previous post](previouspost) I gave a trick to match everything except something. Along the same lines, here is a trick to match the smallest possible string. Say you want to match the string between 'a' and 'b'. For example, you want to match:

    a.....a......b..b..a....a....b...

Here are two common errors and a solution:

    /a.*b/
    a.....a......b..b..a....a....b...

The first error is to use the *evil* `.*`: because it is greedy, you will match from the first `a` to the last `b`.
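
The greedy behaviour is easy to check for yourself. Here is a minimal sketch in Python (my choice of language, the article itself is language-agnostic), using the sample string from above; the variable names are only for illustration.

```python
import re

# Sample string from the article.
text = "a.....a......b..b..a....a....b..."

# Greedy: `.*` grabs as much as it can, so the match runs
# from the first `a` all the way to the last `b`.
greedy = re.search(r"a.*b", text).group()
print(greedy)
```

Only the three trailing dots after the last `b` are left out of the match.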

    /a.*?b/
    a.....a......b..b..a....a....b...

The next natural step is to change the *greediness*. But that is not enough, as you will still match from the first `a` to the first `b`. A simple observation is that the string we want to match shouldn't contain any `a` or `b`, which leads to the last, elegant solution.

    /a[^ab]*b/
    a.....a......b..b..a....a....b...

So far, that was easy. Now consider the case where you need to match not between `a` and `b`, but between strings. For example:
...

    <li>[anything not containing <li>]</li>

A first try is to exclude `<li>` one character at a time:

    <li>([^<]|<[^l]|<l[^i]|<li[^>])*</li>

...
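
The first try fails when the content ends with `<`, `<l` or `<li` right before the closing `</li>`, so these partial prefixes have to be allowed explicitly:

    <li>([^<]|<[^l]|<l[^i]|<li[^>])*(|<|<l|<li)</li>
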
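
To see the difference between the two patterns, here is a small Python sketch. It assumes the delimiters are `<li>` and `</li>` and that the content must not contain `<li>`; the sample inputs and the names `BODY`, `first_try` and `second_try` are made up for illustration.

```python
import re

# Content: any run of characters that never spells out "<li>".
BODY = r"(?:[^<]|<[^l]|<l[^i]|<li[^>])*"

# First try: opening tag, content, closing tag.
first_try = re.compile(r"<li>" + BODY + r"</li>")

# Second try: also allow a partial "<li" prefix just before "</li>".
second_try = re.compile(r"<li>" + BODY + r"(?:|<|<l|<li)</li>")

samples = [
    "<li>plain text</li>",
    "<li>a <b>bold</b> word</li>",
    "<li>ends with <li</li>",  # content ends with a partial "<li"
]
for s in samples:
    # Print whether each pattern matches the whole sample.
    print(s, bool(first_try.fullmatch(s)), bool(second_try.fullmatch(s)))

# With nested openers, only the innermost item is matched.
inner = second_try.search("<li>outer <li>inner</li>").group()
print(inner)
```

The third sample is the interesting one: its content never contains `<li>`, yet only the second pattern accepts it.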

    # replace two arbitrarily chosen single characters
    # with unique IDs
    # (you should verify each identifier is REALLY unique)
    # beware: the unique IDs must not contain the
    # chosen characters
    s/X/_was_x_/g
    s/Y/_was_y_/g
    # replace the long strings with these single characters
    s/<li>/X/g
    s/<\/li>/Y/g
    # use the first method
    s/X([^X]*)Y//g
    # turn the chosen characters back into the long strings
    s/X/<li>/g
    s/Y/<\/li>/g
    # restore the original occurrences of the chosen characters
    s/_was_x_/X/g
    s/_was_y_/Y/g
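
The same sequence of substitutions can be sketched in Python. This is only an illustration under assumptions: the function name `remove_innermost` and its parameters are hypothetical, the delimiters are taken to be `<li>` and `</li>`, and the input is assumed not to contain `_was_x_` or `_was_y_` already.

```python
import re


def remove_innermost(text, opener="<li>", closer="</li>"):
    """Delete each opener...closer span whose content has no opener."""
    # Step 1: free the characters X and Y by replacing them with IDs.
    # (Assumes the IDs do not already occur in the text.)
    text = text.replace("X", "_was_x_").replace("Y", "_was_y_")
    # Step 2: shrink the long delimiters to one character each.
    text = text.replace(opener, "X").replace(closer, "Y")
    # Step 3: apply the single-character trick, as in the script above.
    text = re.sub(r"X[^X]*Y", "", text)
    # Step 4: turn the placeholders back into the long delimiters.
    text = text.replace("X", opener).replace("Y", closer)
    # Step 5: restore the original X and Y characters.
    text = text.replace("_was_x_", "X").replace("_was_y_", "Y")
    return text


print(remove_innermost("<ul><li>one</li><li>X and Y</li></ul>"))
```

Note that `[^X]*` is greedy and only excludes the opener, so a span may run up to the last closer that follows it. Using `[^XY]*` instead, in the spirit of `/a[^ab]*b/`, would stop at the first closer.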