2014-01-07 07:49 AM
I am attempting to use custom feeds, but they seem really slow to search on. Due to the nature of our company, we use almost every 10.1-255.0.0 CIDR range. The feed is designed just to segment out each range so I can search on it. But when I attempt to search on the feed, or just view the meta for feed.name, it takes more than ten minutes to load an hour of data. Is anyone else seeing this?
2014-03-17 07:40 PM
Thanks guys.
So if I understand correctly:
IndexValues means indexing the blue color values, which makes queries for the green color values faster?
2014-03-17 07:52 PM
To use your example – IndexValues means index the blue colour values which makes queries for the blue colour values faster.
2014-03-18 10:14 AM
Does that mean that when we click the green number, the concentrator asks the decoder to load the actual data to display?
Where does the green number come from, then?
Thanks.
2014-03-18 10:19 AM
That is simply the number of times that particular meta value has been seen. They are distinct sessions.
2014-03-18 10:24 AM
So that means it's in the index?
2014-03-18 10:40 AM
Not exactly. When we talk about indexing, it is meta only. The sessions counted in green are tagged with meta, but those sessions are tracked on the decoders by session ID numbers.
The decoder captures traffic.
It creates the session in memory.
It creates meta tags according to its parsers for each session, and links the session ID to those meta values.
It dumps those meta values locally to await pickup by the concentrator.
The decoder then writes the session to disk for safekeeping until it is needed later.
The concentrator logs into the decoder and asks if it has new meta.
The concentrator pulls the meta values with their linked session IDs and stores them in tables on the concentrator.
If the table key is IndexKeys, it stores those values in the key without indexing the values.
If the table key is IndexValues, it creates a searchable index for each of the values within that key up to its ValueMax setting.
If the ValueMax setting for a key becomes full, the concentrator drops the least used 10% of values out of its index to make room for more values.
Then wash, rinse and repeat.
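The IndexKeys / IndexValues / ValueMax behaviour described in the steps above can be sketched as a toy model. This is only an illustration of the description in this post, not how the concentrator is actually implemented: the class name ToyKeyIndex, its fields, and the choice to measure "least used" by how often a value has been seen are all assumptions made for the example.

```python
from collections import defaultdict

class ToyKeyIndex:
    """Toy model of one meta key on a concentrator (hypothetical, for illustration)."""

    def __init__(self, index_values, value_max):
        self.index_values = index_values      # True ~ IndexValues, False ~ IndexKeys
        self.value_max = value_max            # cap on distinct indexed values
        self.stored = []                      # (session_id, value) pairs, not searchable by value
        self.value_index = defaultdict(set)   # value -> session IDs (IndexValues only)
        self.seen = defaultdict(int)          # assumption: "use" = times the value was seen

    def add(self, session_id, value):
        self.stored.append((session_id, value))   # both levels keep the meta itself
        if not self.index_values:
            return                                # IndexKeys: no per-value index
        if value not in self.value_index and len(self.value_index) >= self.value_max:
            self._evict()
        self.value_index[value].add(session_id)
        self.seen[value] += 1

    def _evict(self):
        # Drop the least used 10% of indexed values (at least one) to make room.
        drop = max(1, len(self.value_index) // 10)
        for value in sorted(self.value_index, key=lambda v: self.seen[v])[:drop]:
            del self.value_index[value]
            del self.seen[value]

idx = ToyKeyIndex(index_values=True, value_max=10)
for sid in range(10):
    idx.add(sid, f"file{sid}.txt")
idx.add(10, "file0.txt")   # file0.txt has now been seen twice
idx.add(11, "new.txt")     # index is full, so one rarely seen value is dropped first

keys_only = ToyKeyIndex(index_values=False, value_max=10)
keys_only.add(1, "a.txt")  # stored, but there is no value index to search
```

In this toy, a volatile key such as filename would keep evicting values once it reaches value_max, which is one way to picture why such keys are usually left at IndexKeys.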
2014-03-18 10:49 AM
Here is the way I see it; fielder can comment if I am wrong.
You are writing a rule to look for an action from a source IP.
Scenario one: action is using IndexKeys and ip.src is using IndexValues.
ip.src = '1.1.1.1' && action = 'get'
Because action is only IndexKeys, it is going to go, "Okay, let me get the ip.src and any log/packet that has action attached to it." It does not care that you want it to equal 'get'.
Scenario two: action is IndexValues and ip.src is IndexValues.
ip.src = '1.1.1.1' && action = 'get'
Because action is now IndexValues, it looks for 'get' and the ip.src, and only pulls those values.
Using IndexKeys can be very helpful if you just want the key to provide useful information during an investigation but you don't really care about querying on it.
So maybe you want to label your 1.1.1.0/24 network as Critical Web Servers.
You might never want to query on that label, but during an investigation it can be helpful to see that you are dealing with a critical web server, and you are not wasting index space on a key you never query.
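As a rough picture of what such a labelling feed does, here is a minimal sketch using Python's standard ipaddress module. The FEED table and the label_for function are hypothetical names for this example only; a real feed is defined in the product's own feed format, which is not shown here.

```python
import ipaddress

# Hypothetical feed rows: CIDR range -> label to attach as meta
FEED = {
    ipaddress.ip_network("1.1.1.0/24"): "Critical Web Servers",
}

def label_for(ip):
    """Return the label of the first feed range containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for network, label in FEED.items():
        if addr in network:
            return label
    return None

print(label_for("1.1.1.1"))   # inside 1.1.1.0/24
print(label_for("1.1.2.1"))   # outside every range in the toy feed
```

The label is attached for context during an investigation; whether it is worth a value index depends on whether anyone ever queries on it.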
2014-03-18 11:54 AM
So Sean brings up a good point: querying for values, some of which are indexed and others not.
In Sean's scenario, he is querying IP address first, which is indexed, followed by an action that is not.
ip.src=1.1.1.1 && action = get. Both values get queried, but the IP goes first, followed by the action. Since IP is indexed, the first portion of the query returns quickly, and the action of get takes a little longer, but the initial drill into the IP narrows down the search parameters quite a bit.
Now let's look at the reverse:
action=get && ip.src=1.1.1.1
The action clause runs first, and it takes a very long time since action is not indexed. Only once that part of the result comes back does the IP get queried.
So whenever you craft your drills and queries, always put indexed items up front and leave non-indexed items for the very end. If a query becomes one you need to run often, look at indexing that key's values to get results faster.
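The effect of clause order can be sketched with a toy model that counts how many sessions have to be examined in each case. It assumes, as described in this post, that clauses are evaluated left to right; the data, the ip_index dictionary, and both function names are made up for the example and do not come from the product.

```python
# Toy data: session ID -> meta. Only ip.src has a value index in this model.
sessions = {
    sid: {"ip.src": "1.1.1.1" if sid % 100 == 0 else "2.2.2.2",
          "action": "get" if sid % 2 == 0 else "put"}
    for sid in range(10_000)
}

ip_index = {}
for sid, meta in sessions.items():
    ip_index.setdefault(meta["ip.src"], set()).add(sid)

def indexed_first(ip, action):
    """ip.src = ip && action = action: index lookup, then check the few candidates."""
    candidates = ip_index.get(ip, set())
    examined = len(candidates)
    hits = {s for s in candidates if sessions[s]["action"] == action}
    return hits, examined

def unindexed_first(ip, action):
    """action = action && ip.src = ip: every session is examined for the action."""
    examined = len(sessions)
    by_action = {s for s, m in sessions.items() if m["action"] == action}
    hits = by_action & ip_index.get(ip, set())
    return hits, examined

fast_hits, fast_examined = indexed_first("1.1.1.1", "get")
slow_hits, slow_examined = unindexed_first("1.1.1.1", "get")
print(fast_examined, slow_examined)
```

Both orders return the same sessions; the difference is only in how much unindexed meta has to be read to get there.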
2014-03-18 12:11 PM
Thanks, guys, for all the helpful replies. Now I understand what IndexKeys and IndexValues mean.
We have an issue with the filename meta: it takes a long time to load. Below is the benchmark.
Last 24 hours data – click Show more
SA UI: 2 minutes and 25 seconds
Nw Investigator: 2 minutes and 30 seconds.
I tried changing from IndexKeys to IndexValues and then reset the index, but the performance (display in the SA UI) is similar.
Is there anything we can do to make it faster?
I understand that filename is volatile, so only IndexKeys is recommended, but why is it so slow?
Thanks.