2015-07-11 10:58 AM
Here is a quick guide to help users identify parsing issues within their SA instance. Because SA parsers rely on static text matching, new or rarely seen events may not be identified properly, so you may want to schedule some reports/queries to catch them.
Detecting unidentified events within your SA instance.
Description : This will identify events where either A) your decoders do not have the relevant parser enabled or B) SA doesn't support this log type
Query : device.type=unknown
Detecting events that are identified, but for which no tagging exists
Description : This will identify events that are caught by a parser but have no tagging/normalization. Either the log was matched by the wrong parser (a header parser issue), and/or that parser has no message definition for this particular message
Query : device.type exists && msg.id !exists
Tip1 :
Run these reports over a period of time rather than as a one-off, and make sure the parsers on your decoders are cleaned up (remove the ones you don't use). Unused parsers add processing overhead and may match logs that belong to a different parser.
Tip2 :
You will also want to keep an eye on device types that are outliers, i.e. a device.type that is generating an extremely high or low amount of traffic. Chances are you will want to dig into these logs and make sure the proper parser is being used. Create daily/weekly reports and/or leverage rules within real-time charts on dashboards to keep an eye on your log sources.
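As a rough illustration of the outlier check in Tip2, here is a minimal Python sketch that flags device types whose event count is far above or below the median. It assumes you have already exported per-device.type counts from a report; the `flag_outliers` function, the factor of 10 and the sample counts are all made up for the example and are not anything SA provides.

```python
from statistics import median

def flag_outliers(counts, factor=10.0):
    """Flag device types whose event count is far from the median.

    counts: dict mapping a device.type value to its event count for the period.
    factor: hypothetical threshold; a type is flagged "high" when its count is
            more than factor times the median, "low" when it is less than
            the median divided by factor.
    """
    if not counts:
        return {}
    mid = median(counts.values())
    flagged = {}
    for device_type, n in counts.items():
        if n > mid * factor:
            flagged[device_type] = "high"
        elif n * factor < mid:
            flagged[device_type] = "low"
    return flagged

# Illustrative counts only, not real data.
sample = {
    "rhlinux": 950000,
    "ciscoasa": 42000,
    "winevent_nic": 38000,
    "apache": 51000,
    "checkpointfw1": 12,
}
print(flag_outliers(sample))
```

With these sample numbers, rhlinux is flagged as high and checkpointfw1 as low, which are the ones you would dig into first.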
Tip3:
Rhlinux is usually a catch-all: it identifies traffic from a bunch of logs for which it has no message definition (use the second query above to catch this). I have seen some Cisco parsers also catch a bunch of logs that aren't properly identified either.
2015-08-15 11:18 AM
Also, I just found that users should keep an eye on the meta key "parse.error". It shows problems the parser ran into while normalizing the event.
2015-08-17 07:54 AM
I run two reports a few minutes apart each week.
The first one takes the 100 devices with the most unknown messages in the last 24 hours and uses them to create a list. The second uses this list to show which device types are being reported against each of these IPs. This gives us a good way to track the worst devices.
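To make the two-step idea concrete, here is a small Python sketch of the same logic run over exported events. It is not the Reporting Engine configuration itself; the function names are hypothetical, and it assumes each event is a dict carrying the device.ip and device.type meta keys.

```python
from collections import Counter, defaultdict

def top_unknown_sources(events, limit=100):
    """Step 1: the `limit` device IPs with the most device.type=unknown events."""
    counts = Counter(
        e["device.ip"] for e in events if e["device.type"] == "unknown"
    )
    return [ip for ip, _ in counts.most_common(limit)]

def device_types_for(events, ips):
    """Step 2: for each listed IP, count events per device.type."""
    wanted = set(ips)
    result = defaultdict(Counter)
    for e in events:
        if e["device.ip"] in wanted:
            result[e["device.ip"]][e["device.type"]] += 1
    return {ip: dict(types) for ip, types in result.items()}

# Illustrative events only.
events = [
    {"device.ip": "10.0.0.1", "device.type": "unknown"},
    {"device.ip": "10.0.0.1", "device.type": "unknown"},
    {"device.ip": "10.0.0.1", "device.type": "rhlinux"},
    {"device.ip": "10.0.0.2", "device.type": "unknown"},
    {"device.ip": "10.0.0.3", "device.type": "ciscoasa"},
]
worst = top_unknown_sources(events, limit=100)
print(device_types_for(events, worst))
```

An IP that shows both unknown and a real device type usually means the parser is enabled but missing some messages; an IP that shows only unknown usually means no parser matches at all.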
I also have a script that runs the SDK timeline query for all logs, and logs where device.type = unknown, and I can use this to generate a 'parsing success' figure which we target being above 99%. Something I would like to see is an option for unparsed messages to be stored in a single meta key so that they can be retained by the archiver.
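The 'parsing success' figure is just arithmetic on the two counts. A minimal Python sketch, assuming you already have the total event count and the device.type = unknown count for the same time range (the SDK call itself is not shown, and the function name and sample numbers are made up):

```python
def parsing_success(total_events, unknown_events):
    """Percentage of events that were identified by some parser."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= unknown_events <= total_events:
        raise ValueError("unknown_events must be between 0 and total_events")
    return 100.0 * (total_events - unknown_events) / total_events

TARGET = 99.0  # the target mentioned above

# Illustrative numbers only.
rate = parsing_success(total_events=2_000_000, unknown_events=15_000)
status = "OK" if rate >= TARGET else "below target"
print(f"parsing success: {rate:.2f}% ({status})")
```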
Good tip on the second query though - will have to investigate that further.
Another tip from me is that we have a couple of in-house parsers. For example, our load balancers generate custom messages that aren't recognized by the standard parser, so I have an additional parser to sort those. I also have some generic (e.g., firewallxx, unixxx) parsers which clean up other unparsed messages in the logs.
Most of the time there's little value in these messages, but parsing them out at least means we can see the remaining unparsed messages.
2017-08-03 04:31 AM
We have a couple of in-house parsers for different IPS devices, but we keep getting parse.error on some of the fields. Can you point out some of the common mistakes that cause this?
Regards,
Karan
2017-08-03 02:24 PM
Take a close look at the wording of the parse.error message.
A "convert fail" message means the system is trying to stuff a string of characters (risk unknown...) into the reputation.num meta key. My guess is that the table map file defines that key with a numeric format (uint or some other non-text type) and the parser is trying to put a string value in there, which fails (the system is working as it should).
What should be done is to update the reputation.num entry in the table-map-custom.xml file and add a failureKey to it, so that when a non-numeric value comes in, the system tries to insert it into the key listed in failureKey instead (a fallback kind of process).
You can check the default file table-map.xml for examples of how that works and looks. Also, if you copied your keys into the custom table map file a while ago, the failureKey attributes may not have existed at the time; simply replacing your entry with a recent version of the same line from table-map.xml may be your solution. For example:
<mapping envisionName="daddr_v6" nwName="ipv6.dst" flags="None" format="IPv6" envisionDisplayName="DestinationAddressv6|ClientAddressv6" failureKey="host.dst" nullTokens="(null)|-"/>
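To show what the failureKey fallback amounts to, here is a small Python model of the behavior described above. It is only a sketch of the idea, not the decoder's actual code: the `store_meta` function, the simplified error text and the fallback key name "reputation" are all assumptions made for the example.

```python
def store_meta(session, key, value, numeric_keys, failure_keys):
    """Store value under key, falling back to the failure key on convert failure.

    numeric_keys: set of meta keys whose format is numeric.
    failure_keys: dict mapping a meta key to its fallback (failureKey) name.
    Returns the name of the key the value (or error) ended up under.
    """
    if key not in numeric_keys:
        session[key] = value
        return key
    try:
        session[key] = int(value)
        return key
    except ValueError:
        fallback = failure_keys.get(key)
        if fallback is None:
            # No failureKey defined: the value is lost and an error is recorded.
            session.setdefault("parse.error", []).append(f"convert fail: {key}")
            return "parse.error"
        session[fallback] = value
        return fallback

numeric = {"reputation.num"}

# Without a failure key the string value produces a parse.error.
without = {}
store_meta(without, "reputation.num", "risk unknown", numeric, {})
print(without)

# With a (hypothetical) failure key the string value is kept as text.
with_fallback = {}
store_meta(with_fallback, "reputation.num", "risk unknown", numeric,
           {"reputation.num": "reputation"})
print(with_fallback)
```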
2017-08-03 02:26 PM
You can also make nice dashboards to show a summary over time for these items.