2016-09-27 11:02 AM
Background:
We have several packet decoders that receive traffic from Gigamon. That system is fed by aggregators and taps that provide us the raw packet traffic. Our IDS sensors receive the same feeds.
Problem:
We receive fairly consistent reports (several per week) that an IDS sensor fired, but when we use SA to try to identify the session related to that event, we're unable to find it.
Does anyone in the RSA Link universe have tips for tackling this problem? Management wants 100% certainty that our packet decoders have all the traffic that the IDS has. If our feeds are the same, the only problem I can see is with SA, but I've investigated the following for each incident:
Checked decoder performance stats (assembler pages, dropped packets, capture rate, pool pages)
Decoder system stats (CPU, memory, disk I/O, network bandwidth)
Reviewed local logs (app and system) for any irregularities
Checked app rules
Checked parsers
Checked taps are correct
Checked Gigamon feeds are correct
I've simply run out of ideas, and have no clue how to increase the reliability of our decoder processing.
Thanks
2016-09-27 11:27 AM
Just a few questions for things to think about:
Is there anything special about the IDS alerts that you are not getting the packets for?
Does the IDS system see exactly the same packets as the NetWitness packet decoders?
Can you generate some sort of heartbeat packets and check that you receive them all the time in both the IDS system and the NetWitness Packet Decoder?
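A minimal heartbeat generator could be as simple as the following scapy-based sketch (the destination, port, and payload here are made up, and sending crafted packets requires root):

    import time
    from scapy.all import IP, UDP, Raw, send

    # Hypothetical sink address and port; pick something both the IDS
    # and the packet decoder are guaranteed to see on their feeds.
    HEARTBEAT = IP(dst="10.99.99.99") / UDP(sport=40000, dport=9999) / Raw(load=b"NW-HEARTBEAT")

    while True:
        send(HEARTBEAT, verbose=False)  # one beat per minute
        time.sleep(60)

If a beat shows up on the IDS but not on the decoder (or vice versa), you have narrowed the problem to the feed rather than to either tool.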
2016-09-27 11:32 AM
I usually find this is an issue with timezones.
Is the IDS logging in its local timezone or UTC (GMT)?
NetWitness captures in UTC, but if your profile preferences are set to display a different timezone, such as CST6CDT (GMT-05:00), you may be querying the wrong time frame.
Also, what components of the IDS message are you using in the NetWitness query?
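For what it's worth, the skew is easy to demonstrate with a few lines of Python (3.9+ for zoneinfo; the alert time here is just an example):

    from datetime import datetime
    from zoneinfo import ZoneInfo

    # An IDS alert stamped in local time vs. the UTC time NetWitness captured.
    local = datetime(2016, 9, 27, 10, 51, tzinfo=ZoneInfo("CST6CDT"))
    utc = local.astimezone(ZoneInfo("UTC"))
    print(local, "->", utc)  # 2016-09-27 10:51:00-05:00 -> 2016-09-27 15:51:00+00:00

Querying SA around 10:51 UTC when the event actually landed at 15:51 UTC will come up empty every time.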
2016-09-27 11:51 AM
The problem with the IDS is that it doesn't always store the packets related to the alert. And because the IDS is packet based (i.e., it runs each packet through signatures and flags when a match is found), even when we do get the PCAP from the IDS, looking at it in Wireshark shows it isn't always a full TCP handshake.
In the past I've usually gotten the pcap from the IDS when it has one, imported it into Wireshark to verify that it parses the content the way I'd expect, and then injected that pcap into our UAT environment to see what SA parses.
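As a shortcut for the Wireshark step, something like this scapy sketch (the pcap name is just an example) can flag a partial capture before I bother injecting it:

    from scapy.all import rdpcap, IP, TCP

    def has_full_handshake(pcap_path):
        """Return True if any TCP session completes SYN / SYN-ACK / ACK."""
        syn, synack = set(), set()
        for pkt in rdpcap(pcap_path):
            if not (pkt.haslayer(IP) and pkt.haslayer(TCP)):
                continue
            ip, tcp = pkt[IP], pkt[TCP]
            fwd = (ip.src, tcp.sport, ip.dst, tcp.dport)
            rev = (ip.dst, tcp.dport, ip.src, tcp.sport)
            flags = str(tcp.flags)
            if flags == "S":
                syn.add(fwd)                       # client SYN
            elif "S" in flags and "A" in flags and rev in syn:
                synack.add(rev)                    # server SYN-ACK
            elif flags == "A" and fwd in synack:
                return True                        # client final ACK
        return False

    print(has_full_handshake("ids_alert.pcap"))  # False => partial capture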
I'll gather more information about the IDS alert and look at the packets for this instance. I don't have access to the IDS console either, so I just get whatever is logged from the IDS to compare metadata against.
2016-09-27 11:58 AM
Our IDS logs in local time; SA we have in UTC.
Usually when I investigate these issues, SA generates the alert from the IDS log and feeds that alert via SAIM through UCF over to Archer for the creation of the incident.
That lets me pivot from the link in Archer directly to the incident created in SAIM. I look at that incident, the raw alert, and the log details. The raw alert shows me the UTC time it fired, so I can use that as my baseline when running queries against our packet concentrators/brokers. I run the queries against both the brokers and the concentrator that should have the data, just to rule out potential points of failure.
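Those ad hoc queries can also be scripted against the service's REST interface. A rough sketch of what I mean (the host, credentials, the 50103 broker REST port, and the meta values are all examples; check the query syntax against your version):

    import requests

    # UTC window around the raw alert time, +/- 5 minutes.
    query = ('select sessionid, time, ip.src, ip.dst, service '
             'where time="2016-09-27 15:46:00"-"2016-09-27 15:56:00" '
             '&& ip.src=10.1.2.3')

    resp = requests.get(
        "https://broker.example.com:50103/sdk",
        params={"msg": "query", "query": query,
                "force-content-type": "application/json"},
        auth=("admin", "netwitness"),
        verify=False,  # sketch only; use proper certificates in production
    )
    print(resp.json())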
2016-12-27 04:24 PM
Update:
Thank you David Waugh and John Snider for your assistance earlier.
David, I like the idea of heartbeat packets; that is cool. We could set up ESA alerts to trigger but not be ingested by our SAIM system, just to make sure we're always getting end-to-end streams from the segments we expect, though setting this up in practice may be a little interesting.
I haven't had as many issues with this now as I used to. Often I see issues with time skew, as mentioned, and it is also difficult to always have confidence that we have a 1:1 mapping of IDS data feeds to the data feeds we have in NetWitness. Our SDAN infrastructure could use better monitoring and dashboards for visibility. I'm looking at ways to extract via API the Gigamon feed names for my decoders and those for the IDS, so I can at least do a diff and know if things suddenly change.
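The diff itself is trivial once the names can be pulled; the hard part is the API. Something along these lines (the GigaVUE-FM endpoint and JSON field names here are hypothetical placeholders, since they vary by version):

    import requests

    def feed_names(fm_host, tool_port_group):
        # Hypothetical endpoint and response shape; substitute the real
        # GigaVUE-FM REST call for listing maps by destination port group.
        r = requests.get(f"https://{fm_host}/api/maps",
                         params={"dest": tool_port_group},
                         auth=("apiuser", "secret"), verify=False)
        return {m["alias"] for m in r.json()["maps"]}

    decoder_feeds = feed_names("gigamon-fm.example.com", "decoder-ports")
    ids_feeds = feed_names("gigamon-fm.example.com", "ids-ports")

    if ids_feeds != decoder_feeds:
        print("Feed drift! IDS-only:", ids_feeds - decoder_feeds)
        print("Decoder-only:", decoder_feeds - ids_feeds)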
I'm also looking at improving an RSA-provided Python script that extracts packet performance data, so in the future we can run it ad hoc to see if our assembler is getting backed up or if there is a large number of dropped packets for whatever reason.
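The polling piece could look something like this (50104 is the usual decoder REST port, but the stat paths below are examples matching what I check by hand; browse /decoder/stats on your build to confirm the names):

    import requests

    DECODER = "https://decoder.example.com:50104"
    # Example stat names; verify against what your decoder exposes.
    STATS = ["capture.rate", "assembler.packet.pages", "packet.drop"]

    for stat in STATS:
        r = requests.get(f"{DECODER}/decoder/stats/{stat}",
                         params={"msg": "get",
                                 "force-content-type": "text/plain"},
                         auth=("admin", "netwitness"), verify=False)
        print(stat, "=>", r.text.strip())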
If I come up with anything cool related to this stuff I'll reply back.
2016-12-28 02:27 PM
Kevin,
I agree with John; the only time I have seen this issue is when the various devices getting the same data feed were logging in different timezones. As David mentioned, I have also in the past run an event at regular intervals (one per minute) that would get detected by the IDS, and compared that against the number of packets received by NetWitness to see if both devices were getting the same number of packets. This way you can check whether the Gigamon is sending the same traffic to all capture devices.
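The comparison step itself can stay very simple. A sketch, with both retrieval sides left as placeholders for whatever export each product offers:

    def compare_counts(ids_counts, nw_counts):
        """Both args: dict mapping a minute bucket to a packet count."""
        for minute in sorted(set(ids_counts) | set(nw_counts)):
            i = ids_counts.get(minute, 0)
            n = nw_counts.get(minute, 0)
            if i != n:
                print(f"{minute}: IDS saw {i}, NetWitness saw {n}")

    # Toy data; in practice fill these from the IDS and NetWitness exports.
    compare_counts({"18:00": 1, "18:01": 1}, {"18:00": 1, "18:01": 0})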
2016-12-28 02:46 PM
Also remember that the device itself may be filtering. A common example is an ASA that has TCP builds and teardowns filtered. Since many appliances have their own filtering mechanisms, it is important to decide up front whether you want to filter on the appliance or in NetWitness, but not both, due to the confusion that can create.