Last modified: 2009-03-26 04:48:54 UTC
The way the code currently works its way through the rule set has an unnecessary major inefficiency. For each rule, AbuseFilter::checkAllFilters calls AbuseFilter::checkConditions with the $conds and $vars. $vars is of course the spiffy variable class preloaded with a combination of already fetched and buildable data. checkConditions then creates a new AbuseFilterParserClass and loads it with $vars. The problem here is that a new parser is being created for every rule evaluation. As currently implemented that means the subsequent rules aren't getting the benefit of the work done by prior rules. In particular, the funcCache is being cleared and the variable holder reset to its initial state. What you want is for a single parser instance to be used for the processing of the entire rule set, that way the evaluation of functions and loading of variables is cached the whole way through, and not just within a single rule.
(In reply to comment #0) > The problem here is that a new parser is being created for every rule > evaluation. As currently implemented that means the subsequent rules aren't > getting the benefit of the work done by prior rules. In particular, the > funcCache is being cleared and the variable holder reset to its initial state. The function cache is a static variable, and therefore not cleared. The variable holder is not being reset to its original state. It's an object, so it's passed by reference, so you'd need specific code to reset it to its original state.
(In reply to comment #1) > The function cache is a static variable, and therefore not cleared. By declaring it static you create a globally accessible variable AbuseFilterParser::$funcCache, that has one value, but every instance of the class has it's own instantiation of $funcCache that is separate from the global variable and separate from each other. So strictly speaking you didn't clear it, rather by calling new you created a new one that is blank. Test this yourself. $funcCache is not retained across rules. > The variable holder is not being reset to its original state. It's an object, > so it's passed by reference, so you'd need specific code to reset it to its > original state. You have such code. More precisely, each time AbuseFilterParser is created it in turn creates a new AbuseFilterVariableHolder and then calls addHolder to copy the parameters passed to it in setVars. Those parameters consist of an array of AFPData and AFPComputedVariable objects. During the course of operation you replace the AFPComputedVariable objects which AFPData objects as you fetch the values. The problem here is that because you copied the array into a local holder, and didn't simply attach the original VariableHolder by reference, that means that the data array in the original VariableHolder is unaware of the actions occuring in the local copy. Specifically, it doesn't know that you replaced ComputedVariables with Data. So each time you create a new parser you end up passing it the original VariableHolder that still contains the unprocessed ComputedVariables. The end result is that both the $funcCache and the ComputedVariables are being lost upon the creation of a new AbuseFilterParser.
(In reply to comment #2) > (In reply to comment #1) > > The function cache is a static variable, and therefore not cleared. > > By declaring it static you create a globally accessible variable > AbuseFilterParser::$funcCache, that has one value, but every instance of the > class has it's own instantiation of $funcCache that is separate from the global > variable and separate from each other. > > So strictly speaking you didn't clear it, rather by calling new you created a > new one that is blank. > > Test this yourself. $funcCache is not retained across rules. I did: > $afp = new AbuseFilterParser(); > $afp->evaluateExpression( 'ccnorm("FooBarBaz")' ) > print_r( AbuseFilterParser::$funcCache ); Array ( [8437e92172f797748a5f891b6952c110] => AFPData Object ( [type] => string [data] => F00BARBAZ ) ) > $afp = new AbuseFilterParser(); Seems to work for me. > > The variable holder is not being reset to its original state. It's an object, > > so it's passed by reference, so you'd need specific code to reset it to its > > original state. > > You have such code. More precisely, each time AbuseFilterParser is created it > in turn creates a new AbuseFilterVariableHolder and then calls addHolder to > copy the parameters passed to it in setVars. Those parameters consist of an > array of AFPData and AFPComputedVariable objects. During the course of > operation you replace the AFPComputedVariable objects which AFPData objects as > you fetch the values. > > The problem here is that because you copied the array into a local holder, and > didn't simply attach the original VariableHolder by reference, that means that > the data array in the original VariableHolder is unaware of the actions > occuring in the local copy. Specifically, it doesn't know that you replaced > ComputedVariables with Data. > > So each time you create a new parser you end up passing it the original > VariableHolder that still contains the unprocessed ComputedVariables. You're right here, and I've made the Parser a lazy-initialised static variable in r48855.
(In reply to comment #3) > You're right here, and I've made the Parser a lazy-initialised static variable > in r48855. > Doesn't that break batch testing? You want the variables to carry over when working with a single edit, but you need them to be different when each check corresponds to a different edit.
(In reply to comment #4) > (In reply to comment #3) > > > You're right here, and I've made the Parser a lazy-initialised static variable > > in r48855. > > > > Doesn't that break batch testing? You want the variables to carry over when > working with a single edit, but you need them to be different when each check > corresponds to a different edit. > Should be sorted out in r48856. Note that that commit included a bunch of unrelated global filters work, which I reverted in r48857.
(In reply to comment #3) > I did: > Seems to work for me. Okay, works for me too. $afp->funcCache and self::$funcCache are distinct variables, and somewhere in performance testing I got them crossed. new does clear $afp->funcCache but not the global self::$funcCache which is what you are correctly using.