2015-06-11 08:55 AM
I am trying to write a message definition for a log message that starts with the following string:
"The Windows Filtering Platform has permitted a connection. Application Information:"
When I try to match it against
"<event_description>. Application Information:" it will not match, but with the definition
"<event_description> Application Information:" it does match. Does a dot character (".") get some special treatment in enVision/SA parsers? I was under the impression that it does not, but clearly something is wrong here. I would not like to include the dots in the stored value if that can be avoided.
I am using ESI to test for matches.
2015-06-12 02:24 AM
Figured it out. The dot is not a special character; the parse failed because there are multiple whitespace characters between the dot and the start of the word "Application". Even though I exported the log from SA and used that exact export as a starting point, my definition still did not match the amount of whitespace in the message.
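To illustrate the cause, here is a minimal Python sketch of a toy matcher. It is not the enVision/SA parsing engine: `pattern_to_regex` is a hypothetical helper, and the sample log assumes four spaces after the dot (all that is known here is that there are several whitespace characters). In this toy model, literal text, including whitespace, must match exactly, while a `<name>` placeholder matches any run of characters.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Convert a '<name>'-style pattern into a compiled regex (toy model only).

    Literal text is escaped, so its whitespace must match exactly.
    Each <name> becomes a non-greedy named group. Placeholder names must be
    unique within one pattern, because regex group names cannot repeat.
    """
    parts = re.split(r"<(\w+)>", pattern)  # alternates: literal, name, literal, ...
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pieces.append(re.escape(part))       # literal text
        else:
            pieces.append(f"(?P<{part}>.*?)")    # placeholder
    return re.compile("^" + "".join(pieces))


# Assumed sample: four spaces between the dot and "Application".
log = ("The Windows Filtering Platform has permitted a connection."
       "    Application Information:")

for pattern in (
    "<event_description>. Application Information:",
    "<event_description> Application Information:",
):
    match = pattern_to_regex(pattern).match(log)
    print(repr(pattern), "->", match.groupdict() if match else "no match")
```

In this model the first pattern fails because a dot followed by exactly one space and then "Application" never occurs in the log, while the second pattern matches only because the placeholder swallows the dot and the extra spaces.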
What I did instead was add a dummy variable into the whitespace area:
"<event_description>.<space>Application Information:"
2015-06-17 01:35 PM
Thanks for the update - glad it's working now.
2015-06-18 11:36 AM
I revisited the issue and noticed that there are still issues with parsing Windows logs.
This is the original log message, where single spaces have been replaced with <space> and single tabs with <tab> for clarity. Whitespace inside tokens, such as "Service Information", has not been replaced; only whitespace between tokens has.
A Kerberos authentication ticket (TGT) was requested.<space><space><space><space>Account Information:<space><space><tab>Account Name:<tab><tab>bob.user<space><space><tab>Supplied Realm Name:<tab>ACMEINC<space><space><tab>User ID:<tab><tab><tab>S-1-5-21-0000000-0000000-0000000-2965<space><space><space><space>Service Information:<space><space><tab>Service Name:<tab><tab>krbtgt<space><space><tab>Service ID:<tab><tab>S-1-5-21-000000000-000000000-000000000-502<space><space><space><space>Network Information:<space><space><tab>Client Address:<tab><tab>::ffff:192.168.1.10<space><space><tab>Client Port:<tab><tab>50895<space><space><space><space>Additional Information:<space><space><tab>Ticket Options:<tab><tab>0x40810010<space><space><tab>Result Code:<tab><tab>0x0<space><space><tab>Ticket Encryption Type:<tab>0x17<space><space><tab>Pre-Authentication Type:<tab>2<space><space><space><space>Certificate Information:<space><space><tab>Certificate Issuer Name:<tab><tab><space><space><tab>Certificate Serial Number:<tab><space><space><tab>Certificate Thumbprint:<tab><tab><space><space><space><space>Certificate information is only provided if a certificate was used for pre-authentication.<space><space><space><space>Pre-authentication types ticket options encryption types and result codes are defined in RFC 4120.,107340008
The log message is collected via nxlog (which pushes the logs out via Syslog in Snare format) and then exported from Security Analytics. In that export there are always four space characters (<space>) wherever the original Windows log message had a line break.
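For anyone who wants to reproduce the marker notation used above, here is a rough Python sketch. The helper names `mark_whitespace` and `unmark_whitespace` are made up for illustration, and the rule "a lone single space belongs to a token" is an assumption that happens to fit this log; it is not anything the SA export guarantees.

```python
import re

MARKERS = {" ": "<space>", "\t": "<tab>"}


def mark_whitespace(text: str) -> str:
    """Make separator whitespace visible.

    Any tab, and any run of two or more spaces/tabs, is rewritten one
    character at a time as <space> or <tab>. A lone single space is assumed
    to be part of a token (e.g. "Service Information") and is left alone.
    """
    def replace_run(match: re.Match) -> str:
        return "".join(MARKERS[char] for char in match.group(0))

    return re.sub(r"[ \t]{2,}|\t", replace_run, text)


def unmark_whitespace(marked: str) -> str:
    """Convert a marked raw log line back to real spaces and tabs."""
    return marked.replace("<space>", " ").replace("<tab>", "\t")


raw = "requested.    Account Information:  \tAccount Name:\t\tbob.user"
marked = mark_whitespace(raw)
print(marked)
print(unmark_whitespace(marked) == raw)
```

The reverse conversion is only safe for the raw log line. In a signature, <space> can also stand for a temporary variable, which a plain text replacement cannot tell apart from a literal space.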
Below is my first attempt, which ESI reports it can parse. The logic is:
1. Space characters between tokens are not stored in variables; they are left as is.
2. One or more tabs preceding a variable are left as is.
3. One or more tabs preceding static text ("0x0" in this example) are stored in a temporary variable.
<event_description>.<space><space><space><space>Account Information:<space><space><space>Account Name:<tab><tab><username><space><space><space>Supplied Realm Name:<tab><ddomain><space><space><space>User ID:<tab><tab><tab><sid><space><space><space><space>Service Information:<space><space><space>Service Name:<tab><tab><service><space><space><space>Service ID:<tab><tab><service_id><space><space><space><space>Network Information:<space><space><space>Client Address:<tab><tab>{::ffff:<saddr>|<saddr>}<space><space><space>Client Port:<tab><tab><sport><space><space><space><space>Additional Information:<space><space><space>Ticket Options:<tab><tab><ticket_options><space><space><space>Result Code:<space>0x0<space><space><space>Ticket Encryption Type:<tab><encryption_type><space><space><space>Pre-Authentication Type:<tab><preauth_type><space><space><space><space>Certificate Information:<space><space><space>Certificate Issuer Name:<tab><tab><cert_issuer><space><space><space>Certificate Serial Number:<tab><cert_sn><space><space><space>Certificate Thumbprint:<tab><tab><cert_thumbprint><space><space><space><space><fld0>
As stated, ESI claims that it is happy to parse events with this signature. Unfortunately, Security Analytics is not.
Below is the signature for Event ID 4768 Audit Success from the default enVision XML parser. To my eyes it behaves less logically than the original log message and my own signature. For example, the four space characters that separate lines or sections in the log (e.g. before "Service Information" or "Certificate Information") are sometimes represented in the enVision parser by a single space character, as in "<sid><space>Service Information:", and sometimes by two, as in "<fld93><space><space>Certificate Information". Tabs are also converted strangely: the original log has three tabs before the User ID value but only a single tab before the Pre-Authentication Type value, yet both variables are preceded by only a single space character in the enVision signature below.
<event_description><space><space><space>Account Information:<space><space><space>Account Name:<space><space><username><space><space><space><space>Supplied Realm Name:<space><ddomain><space>User ID:<space><sid><space>Service Information:<space><space><space>Service Name: <service><space>Service ID:<space><fld60><space><space>Network Information:<space><space><space>Client Address:<space><saddr><space>Client Port:<space><sport><space><space>Additional Information:<space><space><space>Ticket Options:<space><fld92><space>Result Code:<space>0x0<space>Ticket Encryption Type:<space><encryption_type><space>Pre-Authentication Type:<space><fld93><space><space>Certificate Information:<space><space><space>Certificate Issuer Name:<space><space><cert_issuer><space><space>Certificate Serial Number: <fld94><space><space>Certificate Thumbprint:<space><fld95><space>Certificate information <fld50>
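The separator mismatch can be checked mechanically. The following Python sketch counts the <space>/<tab> markers that directly precede a given label in a marked string. `separator_before` is a hypothetical helper, the two excerpts are copied from the raw log and the enVision signature above, and the sketch treats every <space> as a literal space, which may not hold for a signature.

```python
import re


def separator_before(marked: str, label: str) -> list[str]:
    """Return the run of <space>/<tab> markers directly before `label`.

    Returns an empty list if the label is not found or has no markers
    immediately in front of it.
    """
    match = re.search(r"((?:<space>|<tab>)*)" + re.escape(label), marked)
    if match is None:
        return []
    return re.findall(r"<space>|<tab>", match.group(1))


# Excerpts copied from the raw log and the default enVision signature.
log_excerpt = ("S-1-5-21-0000000-0000000-0000000-2965"
               "<space><space><space><space>Service Information:"
               "<space><space><tab>Service Name:")
sig_excerpt = ("<sid><space>Service Information:"
               "<space><space><space>Service Name:")

for name, text in (("raw log", log_excerpt), ("enVision", sig_excerpt)):
    for label in ("Service Information:", "Service Name:"):
        print(f"{name:9} {label:21} {separator_before(text, label)}")
```

Running the same check over every section label in both strings gives a quick list of where the signature and the exported log disagree.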
Does anyone know whether there is some logic to how whitespace is handled in the parsers? I applied the logic of the second signature above to multiple Event IDs; some of them get parsed by Security Analytics and some do not, even though ESI claims it can parse them all equally well. Also, is Security Analytics giving me the log export exactly as the payload arrives, or is it changing the original format and thus giving me this headache with parsing the logs?
Luckily this problem only affects Windows logs, because of their use of spaces and tabs. However, in my opinion quite a few of the default Windows signatures are not good enough, and some simply ignore large chunks of the message by using temporary variables with arbitrary limits.
I apologize for the hard-to-read syntax above, but I figured that this way the signatures and the raw log are quite easy to convert back to their original form. Please note that the above signatures contain both <space> and <space>, where <space> just marks a single space character and <space> is a temporary variable that has no match in table-map*.xml.