Tech Off Thread

11 posts

ASP 3 Regexp - poor performance

  • Maurits

    I've noticed that ASP's RegExp object seems to have very poor performance when large strings are pushed through it.  I whipped up a test case:

    ASP code:

    n = Request("n")
    t = String(n, "a")
    Set r = New RegExp
    r.Global = True
    r.IgnoreCase = True
    r.Pattern = ".*?b"
    r.Test(t)
    


    Compare to the equivalent Perl code:
    ('a' x shift) =~ /.*?b/;
    

    where "shift" pulls the value from the first argument to the script.

    I ran both for different values of n and measured the execution times, which differed dramatically (times in seconds):

    n        Perl (s)   ASP (s)
    1000     0.015      0
    2000     0.015      0
    5000     0.015      1
    10000    0.016      6
    20000    0.016      24
    50000    0.016      149
    100000   0.015      654

    It seems the ASP execution time rises roughly quadratically in n: each doubling of n roughly quadruples the time...
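One way to check the growth rate from the table above is to compute the exponent k in t ≈ c·n^k between consecutive measurements. This is only a sketch: the `timings` array is copied from the ASP column, the rows reading 0 or 1 second are left out because the timer resolution is too coarse, and `growthExponents` is a hypothetical helper name.

```javascript
// Timings copied from the table above: [n, ASP seconds].
const timings = [
  [10000, 6],
  [20000, 24],
  [50000, 149],
  [100000, 654],
];

// For t ~ c * n^k, the exponent between two measurements is
// k = log(t2 / t1) / log(n2 / n1).
function growthExponents(rows) {
  const exponents = [];
  for (let i = 1; i < rows.length; i++) {
    const [n1, t1] = rows[i - 1];
    const [n2, t2] = rows[i];
    exponents.push(Math.log(t2 / t1) / Math.log(n2 / n1));
  }
  return exponents;
}

const exponents = growthExponents(timings);
console.log(exponents.map((k) => k.toFixed(2)));
```

All three exponents come out close to 2, which points to polynomial (quadratic) growth rather than exponential growth.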

    Wonder if it's related to this Ladybug bug...

  • sbc

    Have you tried it in JScript? Perhaps the performance is better (regular expressions have been around a while in JScript, plus there is a 'compile' method, which the VBScript version lacks).

    re = /.*?b/ig;
    re.test(mystringvar);
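For anyone who wants to try the same pattern outside ASP, here is a rough timing sketch. It assumes a modern JavaScript runtime such as Node.js rather than the classic JScript engine discussed in this thread, so the numbers will not be comparable to the ASP figures. `timeRegexTest` is a hypothetical helper, and the `g` flag is dropped because it makes `test()` stateful.

```javascript
// Build a string of n "a" characters and time one regex test against it.
function timeRegexTest(n) {
  const subject = "a".repeat(n);
  const re = /.*?b/i; // same pattern as the VBScript test, case-insensitive
  const start = Date.now();
  const matched = re.test(subject);
  return { n, matched, elapsedMs: Date.now() - start };
}

// Modest sizes keep the run short even on an engine with no shortcut.
for (const n of [1000, 2000, 5000, 10000]) {
  console.log(timeRegexTest(n));
}
```

The subject contains no "b", so `matched` is always false. The only interesting output is how `elapsedMs` changes as n grows.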
    

  • Maurits

    Interesting idea.  I'll also try it in PerlScript.  W3bbo suggested I use a COM regexp object too.

  • W3bbo

    Maurits wrote:
    W3bbo suggested I use a COM regexp object too.


    I did a bit of research, and I found that it was a false memory. To my knowledge, there is no COM RegEx object built into Windows.

  • Mike Dimmick

    The VBScript RegExp object is implemented as a COM object in VBScript.dll. You can instantiate and use it from VB6 code. Add a reference to the 'Microsoft VBScript Regular Expressions 5.5' type library - there's also a 1.0 type library on my system.

    Regular expressions are, IIRC, a native capability of JScript/JavaScript/ECMAScript. I'm not sure if they're implemented in terms of the VBScript implementation.

    Since the RegExp object first appeared in VBScript 5.0, and therefore shipped along with IE 5.0, it's a safe assumption that any machine running Windows 2000 or higher has this object.

  • Maurits

    Quick experimentation reveals that native regexp handling in both JScript and PerlScript is lightning-fast for expressions and data of this kind.  Only VBScript is slow.
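One possible explanation for the gap, offered as an assumption and not something confirmed in this thread: a backtracking matcher with no shortcut for the required literal "b" has to retry the lazy `.*?` from every start position. The sketch below counts the character comparisons such a naive matcher would make. `countNaiveSteps` is a hypothetical illustration, not the VBScript engine's actual algorithm.

```javascript
// Count character comparisons made by a naive backtracking matcher for the
// pattern .*?b, with no required-literal shortcut (an assumption for illustration).
function countNaiveSteps(subject) {
  let steps = 0;
  for (let start = 0; start < subject.length; start++) {
    // Lazy .*? : try "b" at the current position first, then consume one more char.
    for (let pos = start; pos < subject.length; pos++) {
      steps++;
      if (subject[pos] === "b") return { matched: true, steps };
      if (subject[pos] === "\n") break; // "." does not match a newline
    }
  }
  return { matched: false, steps };
}

for (const n of [1000, 2000, 4000]) {
  console.log(n, countNaiveSteps("a".repeat(n)).steps);
}
```

For a string of n "a" characters this makes n(n+1)/2 comparisons, so doubling n roughly quadruples the work. An engine that first scans the subject for the literal it needs can reject the same input in a single linear pass, which would be consistent with the fast results.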

  • W3bbo

    Maurits wrote:
    Quick experimentation reveals that native regexp handling in both JScript and PerlScript is lightning-fast for expressions and data of this kind.  Only VBScript is slow.


    Wait!

    There's a workaround.

    Remember how ASP3 allows multiple languages per page?

    Like so:

    <script runat="server" language="Vbscript">
    
    </script>
    
    <script runat="server" language="JScript">
    
    </script>

  • Maurits

    Yup, that's the plan :)

  • Maurits

    Weirder and weirder.  I tried all nine combinations of page language and <script>-block language, and these are the results:

    page \ script   VBScript   JScript   PerlScript
    VBScript        slow       slow*     fast^
    JScript         fast**     fast      fast
    PerlScript      fast^^     fast      fast


    The source for these tests is available in this .zip file but basically it looks like

    <%@ Language="(one of the rows)" %>
    <script runat=server language="(one of the columns)">
    #//' some "test" function that takes a string and runs against a hardcoded regex
    </script>
    #//' make a string and call the test function against it

    * My theory is that the JScript <script> block is auto-converted to VBScript and that is why this test is slow.

    ** My theory is that the VBScript <script> block is auto-converted to JScript and that is why this test is fast.

    ^, ^^ I'm at a bit of a loss to explain these.  I would have thought at least one would be slow.

    The other combinations avoid the slow VBScript issue by not using VBScript.

    EDIT: In the interest of completeness - PerlScript refers to ActiveState Perl 5.8.6.811
    EDIT2: Since I'm stuck with the first row, I guess I'm also stuck with PerlScript for the <script> language.

  • sbc

    Have you tried using a Windows Scripting Component / Scriptlet? These are XML files containing scripts that are referenced like COM objects in your web application.

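    <%
    ' If the component is registered:
    Set oComponent = CreateObject("MyScriptlet")

    ' If it is not registered (e.g. on a shared hosting service), load it by URL instead:
    Set oComponent = GetObject("script:http://myserver/MyScriptlet.wsc")

    ' No parentheses when calling a method as a statement with several arguments
    oComponent.DoStuff mystring, mypattern
    %>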


    Building Windows Scripting Components


  • Maurits

    In the end I just changed the code to use a series of simpler s///'s rather than one monster s///.  Just to be safe I used PerlScript instead of VBScript.

    Even with over 80KB of data being matched, the simpler s///'s are orders of magnitude faster than the monster s/// - which actually consumed ALL the available resources on the server for a particular real-world chunk of data. (EDIT: Even in PerlScript.)

    In case anyone's curious, the idea was to take a large chunk of HTML and break it up into page-size bits.
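As a rough illustration of the "several simple steps" approach, here is a sketch that splits HTML after each closing `</p>` tag and then packs the pieces into pages. It is not the code used in this thread (that was PerlScript with a series of s/// substitutions). The function name `paginateHtml`, the choice of `</p>` as the break point, and the `maxChars` limit are all assumptions.

```javascript
// Split HTML after each closing </p> tag, then pack the pieces into pages
// of at most maxChars characters (a single oversized piece gets its own page).
function paginateHtml(html, maxChars) {
  // Step 1: one simple split; the capture group keeps each "</p>" in the result,
  // giving [text, "</p>", text, "</p>", ..., tail].
  const parts = html.split(/(<\/p>)/i);
  const blocks = [];
  for (let i = 0; i < parts.length; i += 2) {
    const block = parts[i] + (parts[i + 1] || "");
    if (block !== "") blocks.push(block);
  }

  // Step 2: greedy packing, starting a new page when the next block would overflow.
  const pages = [];
  let current = "";
  for (const block of blocks) {
    if (current !== "" && current.length + block.length > maxChars) {
      pages.push(current);
      current = "";
    }
    current += block;
  }
  if (current !== "") pages.push(current);
  return pages;
}

console.log(paginateHtml("<p>one</p><p>two</p><p>three</p>", 20));
```

Each step does a single pass over the input, so the total work grows linearly with the size of the HTML instead of depending on how one large pattern backtracks.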
