Arbitrary Regular Expressions

Submitted by: dylanr
Last Edited: 2006-03-22

Category: Python(External Method)

Average rating is: 4.5 out of 5 (2 ratings)

Description:
There may well be a good reason you can't use regexes in Python Scripts,
but it's hugely inconvenient.

Granting that uncontrolled use of regexes is a legitimate concern, it
doesn't make much sense to permit them globally. Instead, a thin set of
External Methods gives you regex goodness while keeping control through
the standard Zope security tools.

Save the following as regexes.py in your INSTANCE_HOME/Extensions folder.
You'll need file system access for this.


Source (Text):
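import re

def rgx_sub(pattern, sub_text, source_text):
    # Return a copy of source_text with every match of pattern
    # replaced by sub_text.
    return re.sub(pattern, sub_text, source_text)

def rgx_search(pattern, source_text):
    # Return the text of the first match of pattern in source_text,
    # or an empty list if nothing matches.
    matches = re.search(pattern, source_text)
    if matches:
        return matches.group()
    else:
        return []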

Explanation:
Once you've got that installed, add some external methods (in the ZMI):

---
id: regex_search
module name: regexes
function name: rgx_search
---
id: regex_sub
module name: regexes
function name: rgx_sub
---

Using these from a Python Script is simple:

result = context.regex_search(pattern, input_string)
result = context.regex_sub(pattern, repl_text, input_string)

Bear in mind that in both cases input_string is not changed; your
results are *returned*.
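
To see that behaviour outside Zope, here is a minimal sketch that calls the
two functions from regexes.py directly in plain Python. The sample string and
patterns are made up for illustration; inside a Python Script you would go
through context.regex_sub and context.regex_search instead.

```python
import re

# Same two functions as in regexes.py, repeated so this runs on its own.
def rgx_sub(pattern, sub_text, source_text):
    return re.sub(pattern, sub_text, source_text)

def rgx_search(pattern, source_text):
    matches = re.search(pattern, source_text)
    if matches:
        return matches.group()
    else:
        return []

# Illustrative input only.
input_string = "Zope 2.7 and Zope 2.8"

# Every version number is replaced in the returned copy.
replaced = rgx_sub(r"\d+\.\d+", "X", input_string)

# Only the first match is returned by the search wrapper.
found = rgx_search(r"\d+\.\d+", input_string)

# No match gives back an empty list, which is false in an "if" test.
missing = rgx_search(r"Plone", input_string)

print(replaced)      # Zope X and Zope X
print(found)         # 2.7
print(missing)       # []
print(input_string)  # Zope 2.7 and Zope 2.8 (unchanged)
```

Since a failed search returns an empty list, "if result:" is enough to tell
a hit from a miss in the calling script.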

Because you're passing your patterns directly to re, all the standard
metacharacters, parens, etc. will work.
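
A short sketch of what that means in practice, again in plain Python with
the functions from regexes.py repeated. The sample strings are invented for
illustration; the point is only that anchors, character classes and
backreferences in the replacement text pass straight through to re.

```python
import re

# Same two functions as in regexes.py, repeated so this runs on its own.
def rgx_sub(pattern, sub_text, source_text):
    return re.sub(pattern, sub_text, source_text)

def rgx_search(pattern, source_text):
    matches = re.search(pattern, source_text)
    if matches:
        return matches.group()
    else:
        return []

# Parenthesized groups can be referred to in the replacement as \1, \2, ...
swapped = rgx_sub(r"(\w+), (\w+)", r"\2 \1", "Doe, Jane")

# Character classes and repetition counts work as usual.
acronym = rgx_search(r"[A-Z]{2,}", "see the ZMI docs")

# Anchors and word boundaries work as well.
anchored = rgx_search(r"^Zope\b", "Zope rocks")

print(swapped)   # Jane Doe
print(acronym)   # ZMI
print(anchored)  # Zope
```

Raw strings (the r"..." form) keep the backslashes intact on their way to
re, which saves doubling them up in patterns and replacement text.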

The regex_search method returns the group() of the resulting match
object, that is, the full text of the first match as a single string.
When nothing matches it returns an empty list.

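--IMPORTANT--
Per the comments below, group() may not be what you are expecting: it
returns the whole match as one string, while groups() returns a tuple of
the parenthesized subgroups. Either ensure that your code is expecting
these results or change the external method to use groups().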
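
The difference is easiest to see side by side. The sketch below uses a
made-up date pattern, and rgx_groups is a hypothetical extra function (it is
not part of the recipe above) showing one way a groups() variant could look.
Returning None on a miss is an assumption of this sketch, chosen because
groups() itself yields an empty tuple when the pattern has no parentheses.

```python
import re

# Hypothetical variant of rgx_search; not part of the original recipe.
def rgx_groups(pattern, source_text):
    match = re.search(pattern, source_text)
    if match:
        # Tuple holding the text of each parenthesized subgroup.
        return match.groups()
    # None on a miss, so it cannot be confused with a match
    # on a pattern that simply has no groups.
    return None

pattern = r"(\d{4})-(\d{2})-(\d{2})"
text = "Last Edited: 2006-03-22"

m = re.search(pattern, text)

whole = m.group()    # the entire matched text
parts = m.groups()   # one entry per parenthesized subgroup
year = m.group(1)    # a single subgroup, numbered from 1

print(whole)  # 2006-03-22
print(parts)  # ('2006', '03', '22')
print(year)   # 2006

print(rgx_groups(pattern, text))       # ('2006', '03', '22')
print(rgx_groups(pattern, "no date"))  # None
```

If you add a function like this to regexes.py, it needs its own External
Method in the ZMI, just like the other two.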

Once everything's working, don't forget to set permissions on these
external methods in accordance with whatever security policy you want
enforced.

Enjoy!


Comments:

SPELLING MISTAKE by tra5is - 2004-10-14
OMG. You have no idea how frustrated I got with this thing.

The line: "return matches.group()"

must read: "return matches.groups()"

This will make the grouping mechanism provided in regular expressions available to the calling script.
 
Re: SPELLING MISTAKE by dylanr - 2004-10-14
I'm certainly sorry you had trouble using this recipe.

For what it's worth, a regular expression match object has *both* a 
group() and groups() method.  I intended the former, but it's valid 
to prefer the latter. Either way, I didn't continue the recipe beyond 
the point where you get *something* back because getting any results 
*at all* is the part that seems to cause the most difficulty.  

Anyone who intends to use this recipe would be well advised to check
out the docs on regular expression match objects and determine if what 
this recipe gives them is, in fact, what they actually want.  

Some links of potential interest:
http://docs.python.org/lib/match-objects.html
http://www.amk.ca/python/howto/regex/

HTH & sorry for any confusion.  I've edited the recipe to flag the 
potential for confusion.


Great article by actionslacks - 2006-03-22
This saved me a lot of time.

Good to outline the difference between group and groups but getting secure access to regular expressions is key.