2014-01-07 07:49 AM
I am attempting to use custom feeds, but they seem really slow to search on. Due to the nature of our company we use almost every CIDR range from 10.1.0.0 through 10.255.0.0. The feed is designed just to segment out each range so I can search on it. But when I attempt to search on the feed, or just view the meta for feed.name, it takes more than ten minutes to load an hour of data. Anyone else seeing this?
2014-01-07 09:45 PM
What you have described sounds like an index problem during investigation rather than a problem with the feeds themselves.
Check the index settings for the meta key that you are writing the data to. Any changes to index settings, and any custom meta keys you have created, should be in index-concentrator-custom.xml on the concentrator. Make sure the key is set to level="IndexValues" and give it an appropriate valueMax setting. Here is an example of what the entry should look like:
<key description="Threat Source" format="Text" level="IndexValues" valueMax="250000" name="threat.source"/>
2014-01-07 08:32 AM
I think it depends on how much data you are viewing. Check the query finished time.
2014-01-07 09:43 PM
Check the index level of the feed.name key. It's probably set to IndexKeys, which means searching that key for specific values will be slow. If this is something you will be doing frequently, you might consider changing the indexing level of this key to IndexValues instead.
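One way to check the level of a key without reading the file by eye is a small script. This is only a sketch: it assumes you have copied the index file somewhere Python can read it, the `<language>` wrapper element and the feed.name entry in the sample are made up for illustration, and only the `<key>` attributes (name, level, valueMax) are taken from the example entry in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content. The <language> wrapper and the feed.name
# entry are assumptions; the threat.source entry is the one from this thread.
SAMPLE = """<language>
  <key description="Threat Source" format="Text" level="IndexValues" valueMax="250000" name="threat.source"/>
  <key description="Feed Name" format="Text" level="IndexKeys" name="feed.name"/>
</language>"""


def index_levels(xml_text):
    """Return {meta key name: index level} for every <key> element found."""
    root = ET.fromstring(xml_text)
    return {key.get("name"): key.get("level") for key in root.iter("key")}


def keys_not_value_indexed(xml_text):
    """List keys whose level is anything other than IndexValues."""
    levels = index_levels(xml_text)
    return sorted(name for name, level in levels.items() if level != "IndexValues")


if __name__ == "__main__":
    print(index_levels(SAMPLE))
    print(keys_not_value_indexed(SAMPLE))
```

Any key the second function reports is a candidate for slow value searches, so it is worth checking whether you actually query it by value before changing its level.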
2014-01-08 11:09 AM
I also use custom meta keys for my own network ranges. I create keys called netname.src and netname.dst, then slice up the CSV with CIDR notation and some meta. I try to keep the meta value small to save space in the database.
Ex.
192.168.1.1/24,internal
10.1.1.1/24,dmz
172.16.0.1/24,vpn
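A feed file like this can be generated and sanity-checked with a short script. The sketch below assumes Python is used to prepare the CSV before it is loaded as a feed; the function names are made up for illustration. It also clears the host bits (10.1.1.1/24 becomes 10.1.1.0/24), which is strict CIDR form. Whether the feed parser requires that is not something this thread confirms.

```python
import csv
import io
import ipaddress

# The three example rows from the post above.
RANGES = [
    ("192.168.1.1/24", "internal"),
    ("10.1.1.1/24", "dmz"),
    ("172.16.0.1/24", "vpn"),
]


def normalize_feed(rows):
    """Yield (network, label) pairs with host bits cleared."""
    for cidr, label in rows:
        # strict=False accepts an address with host bits set and masks them off;
        # a malformed entry raises ValueError, which catches typos early.
        network = ipaddress.ip_network(cidr.strip(), strict=False)
        yield str(network), label.strip()


def to_csv(rows):
    """Render the normalized rows as CSV text, one range per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(normalize_feed(rows))
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(RANGES), end="")
```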
Then, I create some app rules around this to figure out direction to either include or exclude. Furthermore, you could create app rules to fire when certain segments are observed communicating with each other.
It does sound like an indexing problem, but I think there may be other ways to get at the data you want.
Chris
2014-02-04 03:07 PM
Do you know what the difference is between IndexValues and IndexKeys? Just curious; switching it to IndexValues did work.
2014-02-04 04:04 PM
One of the RSA guys may be able to provide a more technically accurate answer, but in general IndexValues actually indexes the unique values in each meta key/table. This means you can search for specific values and leverage the benefit of the index. At a really abstracted level, an index contains pointers to where the different unique values are, so when you search for one it can find it in the index which is generally much smaller than the whole table.
IndexKeys doesn't index the individual values, but only whether a value exists for that key. So you can do "exists" comparisons efficiently (like "alias.host exists"), but hunting for specific values still requires iterating over every record in the table and comparing against each one, which is very slow.
Indexing each individual value is more resource intensive to create and maintain as the database engine has more work to do at consumption time because it has to update the index ... but the indexing allows you to search through data faster after the fact. So it's a trade-off... if you try to set IndexValues for everything, you may have performance problems depending on how much data you're consuming.
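The trade-off described above can be illustrated with a toy model. To be clear, this is not how the product stores or indexes meta; the class and method names are invented, and it only shows the general idea: a key-level index answers "exists" cheaply, while matching a specific value needs either a value-level index or a scan that compares records one by one.

```python
from collections import defaultdict


class ToyMetaStore:
    """Toy model only: every key gets a key-level index, and only the keys
    listed in value_indexed_keys also get a per-value index."""

    def __init__(self, value_indexed_keys):
        self.value_indexed_keys = set(value_indexed_keys)
        self.records = []                    # record id is the list position
        self.key_index = defaultdict(set)    # key -> ids of records having it
        self.value_index = defaultdict(set)  # (key, value) -> record ids

    def add(self, record):
        rid = len(self.records)
        self.records.append(record)
        for key, value in record.items():
            self.key_index[key].add(rid)
            if key in self.value_indexed_keys:
                # Extra work at ingest time: the cost of value indexing.
                self.value_index[(key, value)].add(rid)
        return rid

    def exists(self, key):
        """'key exists' is answered from the key-level index alone."""
        return sorted(self.key_index.get(key, ()))

    def equals(self, key, value):
        """Return (matching ids, number of records compared one by one)."""
        if key in self.value_indexed_keys:
            return sorted(self.value_index.get((key, value), ())), 0
        # No value index: compare every record that carries the key.
        candidates = sorted(self.key_index.get(key, ()))
        hits = [rid for rid in candidates if self.records[rid][key] == value]
        return hits, len(candidates)


if __name__ == "__main__":
    store = ToyMetaStore(value_indexed_keys={"action"})
    store.add({"action": "get", "service": 80})
    store.add({"action": "put", "service": 80})
    store.add({"action": "get", "service": 443})
    store.add({"service": 80})
    print(store.equals("action", "put"))  # value-indexed key, no scan
    print(store.equals("service", 80))    # key-only index, scans 4 records
    print(store.exists("action"))
```

In the toy, the value-indexed key answers a match with zero comparisons, while the key-only lookup has to touch every record that has the key. The extra dictionary maintained in `add` is the ingest-time cost mentioned above.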
2014-03-17 12:50 PM
still confusing:
for example:
Action Event | (5 values) |
get (6,070) - put (252) - set attributes (42) - get attributes (42) - head (4)
If set to IndexValues, which values is it indexing?
If set to IndexKeys, which keys is it indexing?
thanks.
2014-03-17 12:56 PM
I actually got a response from RSA support on this. The article is for 9.8 and lower so it might be different for SA but I think it helps understand them more.
https://knowledge.rsasecurity.com/scolcms/knowledge.aspx#a59809
The main difference between IndexValues and IndexKeys is as follows:
Hence there is space saving in the index and some increased performance for initial drills if you set a key to IndexKeys over using IndexValues.
Where you will see the most difference is in Investigator, where keys set to IndexKeys will always come up in a closed state regardless of whether there are values or not.
2014-03-17 04:06 PM
Patriot, if the action key is set to IndexValues, everything in blue is indexed. This means that queries that specify a value for the action key will return very quickly. For instance: action=put
If that meta key were only set to IndexKeys, any query matching on values would take a while to return, but the EXISTENCE of the key is what would return quickly. For instance: service=80 && action EXISTS
So why the distinction?
There are tons of meta keys that just don't need to be queried by an analyst. For instance, with the GEOIP integration, most analysts don't care about latitude and longitude, or city, but those values are still created. In most instances, the SQL key is not indexed, but if I had SA near to my SQL servers, I'd want to turn indexing on to capture and alert against certain SQL commands and queries.
The reason behind these settings is that indexing is rarely turned on for everything. The index file takes up quite a bit of room on the disk, and in fact, many concentrators have fast reading solid state drives that only host the index, and if the index grows beyond that SSD's capacity, it will ruin the performance gain of having the SSD in the first place.
Of course, technology improves and one day this distinction might not be necessary. Everything will eventually be able to be indexed.
But for now, make sure your indexing supports your use cases. Need to analyze malware and hunt for malicious known filenames? Gotta index filenames. Need to be on the lookout for known malicious referral strings? Gotta index referral.