2014-01-22 11:45 AM
I'm not sure if I'm just impatient, but lookup_and_add seems really slow when you add it to a report. I understand that it runs additional queries against the data, but it still seems to be taking too long.
An example is looking up Windows failed logins for the past hour; our report picks up 47 unique users. I want to look up and add both the event computer and the source IP.
When I run the report with no lookup_and_add I get the report back in 3.95 seconds. When I do add the lookup_and_add I get it back in 222 seconds. Is that really how long it would take or is something misconfigured? Some reports are taking so long that the server errors out and I get no results back.
2014-01-24 08:12 PM
It works by running sub-queries for each of the values returned from the initial query, so the run time will increase linearly as the initial query returns more results. The more values you return, the more followup queries it runs... AND the subqueries are executed serially, not in parallel. So the first query runs and finishes, second query runs and finishes, third query runs and finishes... lather, rinse, repeat.
You can watch /var/log/messages live, or look through it afterward, to see this behavior.
So if your initial query returns 47 values, one instance of lookup_and_add on the rule is going to perform 47 subqueries to look up whatever value you're trying to add based on the returned values from the initial query. If each query takes 3 seconds, that's over two minutes (47 x 3 = 141 seconds). Now multiply by two since you're doing two lookup_and_add's on 47 values. (The timing is not an exact match, but in the real world each individual subquery isn't going to take exactly 3 seconds.)
This also highlights the importance of ensuring that the field you're lookup_and_add'ing off of is set to IndexValues, and why it's a good idea to use the "limit" setting when you're building and testing a new rule.
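For what it's worth, the back-of-the-envelope math above can be sketched in a few lines of Python. This is only a rough model, assuming every subquery runs serially and takes about the same time; the function names are made up for illustration and are not part of the product. The 3-second figure is the guess from above, while the 3.95 and 222 second figures are the timings reported in the original post.

```python
def estimate_runtime(base_seconds, unique_values, lookups, seconds_per_subquery):
    """Rough serial estimate: one subquery per returned value per lookup_and_add."""
    subqueries = unique_values * lookups
    return base_seconds + subqueries * seconds_per_subquery


def implied_subquery_seconds(total_seconds, base_seconds, unique_values, lookups):
    """Work backwards from an observed run time to the average subquery time."""
    return (total_seconds - base_seconds) / (unique_values * lookups)


# Guess from this thread: 47 values, two lookup_and_add's, ~3 s per subquery.
print(estimate_runtime(3.95, 47, 2, 3.0))

# Observed: 222 s with the lookups, 3.95 s without.
print(round(implied_subquery_seconds(222, 3.95, 47, 2), 2))
```

Working backwards from the reported timings, each subquery averages a bit over 2 seconds under this model, so 222 seconds is about what serial execution would predict rather than a sign of a misconfiguration.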
2014-01-25 12:16 AM
How many values are you asking to add to the initial query? I rarely use more than 4-5 in the subquery. Also, make sure you are querying indexed values.
2014-01-27 10:08 AM
I do not believe there is a limit set on the lookup_and_add; I will have to add limits and see what they do. Also, is it best practice to run the report off the log concentrator instead of the broker, considering the packet won't have any of that data?
2014-01-27 12:25 PM
In your case, it may be best to query the concentrator directly. The query format for the lookup and add would be:
select: your_primary_key
where: your conditions statement
then: lookup_and_add('secondary_lookup','your_primary_key',4);
And you can add multiple lookup_and_adds.