2013-09-03 10:02 AM
This thread is a follow-on to my previous question, found here: Mail From (email.src) Custom Parser
Firstly, special thanks to Fielder, who provided the base code for the parser I am sharing. Unfortunately, the parser Fielder shared didn't work for me in its initial state, nor did it do the bit I wanted, which was to let me identify whether an email was inbound or outbound.
I have been struggling for quite some time with not being able to simply identify all the emails that went from an address while ignoring the emails that went to the same address. This parser finally addresses that problem, and much more.
The two major differences from the parser Fielder posted are that it now tags the direction of each email and that it registers just the domain for both inbound and outbound traffic, so you can see how many emails went to or from a domain without having to do a custom drill for any particular domain.
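To illustrate the domain-only idea outside the parser, here is a rough Python sketch of the same logic: take everything after the first "@" and count messages per domain. The function names and sample addresses are made up for illustration; this is not part of the parser itself.

```python
from collections import Counter


def email_domain(address):
    """Return the text after the first '@', or '' when there is no '@'."""
    _, separator, domain = address.partition("@")
    return domain if separator else ""


def count_domains(addresses):
    """Count how many addresses belong to each domain, skipping malformed ones."""
    return Counter(d for d in map(email_domain, addresses) if d)


if __name__ == "__main__":
    # Hypothetical sample addresses, not taken from real traffic.
    sample = ["alice@example.com", "bob@example.com", "carol@example.org"]
    for domain, hits in count_domains(sample).items():
        print(domain, hits)
```

Like the parser, the sketch does no case folding, so Example.com and example.com would be counted separately.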
To get it working, you should be able to save the contents of this parser into an email.parser file and upload it straight to a decoder, or you can edit the flex.parser file that is already on the decoders. You will also need to add the following to the index files for each decoder, concentrator and broker.
!!!! ###########
I strongly suggest you test the index and parser using Investigator prior to deploying them on your decoder et al. During development I killed my concentrator at least twice due to simple syntax errors.
!!! ##########
<key description="From Email Address" name="email.fromaddy" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="To Email Address" name="email.toaddy" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email IP Address" name="email.ip" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Domain" name="email.domain" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Mailer Client" name="email.mailer" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Content" name="email.content" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Encoding" name="email.encoding" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Description" name="email.desc" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Disposition" name="email.disp" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Message ID" name="email.messageid" format="Text" level="IndexNone" valueMax="2500000"/>
Note: for Concentrator and Broker the "IndexNone" should be set to "IndexValues"
Prior to deployment, do a find and replace to swap "mycompany.com" for your own domain name, either in full or in part, e.g. rsa.com, rsa.co.uk, or just rsa (there should be two instances to update, one in each find element that checks for the domain).
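If you would rather script the find and replace, something along these lines should work. It is only a sketch: the function name is hypothetical, and it assumes the placeholder appears as value="mycompany.com" inside the find elements, as it does in the parser in this post. It leaves the mentions in the comments alone and returns the number of replacements so you can confirm it is two.

```python
import re

PLACEHOLDER = "mycompany.com"


def set_company_domain(parser_xml, domain):
    """Swap the placeholder inside <find ... value="..."> attributes only.

    Returns (new_text, number_of_replacements).
    """
    pattern = re.compile(
        r'(<find\b[^>]*\bvalue=")' + re.escape(PLACEHOLDER) + r'(")'
    )
    # A function replacement avoids backslash escapes in `domain` being interpreted.
    return pattern.subn(lambda m: m.group(1) + domain + m.group(2), parser_xml)


if __name__ == "__main__":
    # Two lines shaped like the ones in the parser, for a quick check.
    sample = (
        '<find in="$myString_From" value="mycompany.com" name="myNum_offset2" >\n'
        '<find in="$myString_Temp" value="mycompany.com" name="myNum_offset2" >\n'
    )
    updated, count = set_company_domain(sample, "rsa.com")
    print(count)
    print(updated)
```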
The possibilities this parser offers are untold, but some simple Informer rule / Investigator drill examples could be:
soc.misc !=FromInternal && filetype = 'zip','windows executable'
This would give all inbound emails with zip files or Windows executables.
soc.misc =FromInternal && soc.misc !=ToInternal
This would give you all email that went external to your company. (A minor flaw in this drill is that if the email also cc'd an internal address, it would not be captured here.)
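To make the cc flaw concrete, here is a small Python sketch of how the two soc.misc values get attached to a session and how the second drill then evaluates. It is an approximation under stated assumptions: the function names are hypothetical, the domain check is a plain substring match like the parser's find, and "!=" is modelled as "no value on the session equals this".

```python
COMPANY_DOMAIN = "mycompany.com"  # placeholder, same as in the parser


def direction_tags(mail_from, rcpt_to):
    """Return the soc.misc values the parser would register for one session."""
    tags = set()
    if COMPANY_DOMAIN in mail_from:
        tags.add("FromInternal")
    if any(COMPANY_DOMAIN in recipient for recipient in rcpt_to):
        tags.add("ToInternal")
    return tags


def went_external(tags):
    """Model of: soc.misc = FromInternal && soc.misc != ToInternal"""
    return "FromInternal" in tags and "ToInternal" not in tags


if __name__ == "__main__":
    outbound = direction_tags("bob@mycompany.com", ["x@partner.org"])
    with_cc = direction_tags(
        "bob@mycompany.com", ["x@partner.org", "amy@mycompany.com"]
    )
    print(went_external(outbound))  # plain outbound mail
    print(went_external(with_cc))   # outbound mail that also cc'd a colleague
```

The second session carries both values, so the drill drops it even though the mail did leave the company.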
-- The parser --
<?xml version="1.0" encoding="utf-8"?>
<parsers xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="parsers.xsd">
<parser name="WP_Email" desc="version 2.0">
<!--
****
KEYS NEEDED FOR PARSER TO WORK
****
<key description="Miscellaneous" format="Text" level="IndexValues" name="soc.misc" valueMax="250000" defaultAction="Open"/>
<key description="From Email Address" name="email.fromaddy" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="To Email Address" name="email.toaddy" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email IP Address" name="email.ip" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Domain" name="email.domain" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Mailer Client" name="email.mailer" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Content" name="email.content" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Encoding" name="email.encoding" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Description" name="email.desc" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Disposition" name="email.disp" format="Text" level="IndexNone" valueMax="2500000"/>
<key description="Email Message ID" name="email.messageid" format="Text" level="IndexNone" valueMax="2500000"/>
Note: for Concentrator and Broker the "IndexNone" should be set to "IndexValues"
-->
<declaration>
<!--
****
META DECLARATIONS
****
-->
<meta name="fromaddy" key="email.fromaddy" format="Text" />
<meta name="toaddy" key="email.toaddy" format="Text" />
<meta name="misc" key="soc.misc" format="Text" />
<meta name="emaildomain" key="email.domain" format="Text" />
<meta name="mailer" key="email.mailer" format="Text" />
<meta name="content" key="email.content" format="Text" />
<meta name="desc" key="email.desc" format="Text" />
<meta name="disp" key="email.disp" format="Text" />
<!--
****
STRINGS
****
-->
<string name="myString_From" />
<string name="myString_Temp" />
<string name="myString_Temp2" />
<string name="myString_Empty" scope="constant"/>
<!--
****
NUMBERS
****
-->
<number name="myNum_offset"/>
<number name="myNum_offset2"/>
<number name="myNum_offset3"/>
<number name="myNum_FoundEHLO" scope="stream"/>
<!--
****
TOKENS
****
-->
<token name="myToken_EHLO" value="EHLO" options="linestart" />
<token name="myToken_EHLO" value="HELO" options="linestart" />
<token name="myToken_FROM" value="MAIL FROM:&lt;" options="linestart"/>
<token name="myToken_TO" value="RCPT TO:&lt;" options="linestart"/>
<token name="myToken_Return" value="Return-Path:" options="linestart"/>
<token name="myToken_Mailer" value="X-mailer: " options="linestart" />
<token name="myToken_Mailer" value="X-Mailer: " options="linestart" />
<token name="myToken_ContentType" value="Content-Type:" options="linestart" />
<token name="myToken_DescType" value="Content-Description:" options="linestart" />
<token name="myToken_DispType" value="Content-Disposition:" options="linestart" />
</declaration>
<!--
****
FILTER MATCH
****
-->
<match name="myToken_EHLO">
<assign name="myNum_FoundEHLO" value="1"/>
</match>
<!--
****
REMAINING MATCHES
****
-->
<match name="myToken_FROM">
<if name="myNum_FoundEHLO" notequal="1">
<end/>
</if>
<find name="myNum_offset" value=">" length="255">
<read name="myString_From" length="$myNum_offset">
<if name="myString_From" notequal="$myString_Empty">
<register name="fromaddy" value="$myString_From" />
<!--
###########################
Find your corporate email domain and register direction
This part of the parser checks for the existence of your company domain. You need to replace "mycompany.com" below with either a keyword or the full domain name,
e.g. google.com or google.co.uk. If your email server hosts multiple domains, you will need to shorten the search to something common, in this case just "google".
###########################
-->
<find in="$myString_From" value="mycompany.com" name="myNum_offset2" >
<register name="misc" value="FromInternal" />
</find>
<!-- This section extracts just the part that follows the @ sign. Very handy if you want to see all emails going to a domain but don't care who they were to or from. -->
<move value="$myNum_offset" direction="reverse">
<find in="$myString_From" value="@" name="myNum_offset2" >
<increment name="myNum_offset2" value="1" />
<move value="$myNum_offset2" />
<decrement name="myNum_offset" value="$myNum_offset2" />
<read name="myString_Temp2" length="$myNum_offset" >
<register name="emaildomain" value="$myString_Temp2" />
</read>
</find>
</move>
<assign name="myString_Temp" value="$myString_Empty"/>
<assign name="myString_Temp2" value="$myString_Empty"/>
</if>
</read>
</find>
</match>
<match name="myToken_TO">
<if name="myNum_FoundEHLO" notequal="1">
<end/>
</if>
<find name="myNum_offset" value=">" length="255">
<read name="myString_Temp" length="$myNum_offset">
<if name="myString_Temp" notequal="$myString_Empty">
<register name="toaddy" value="$myString_Temp" />
<!--
###########################
Find your corporate email domain and register direction
This part of the parser checks for the existence of your company domain. You need to replace "mycompany.com" below with either a keyword or the full domain name,
e.g. google.com or google.co.uk. If your email server hosts multiple domains, you will need to shorten the search to something common, in this case just "google".
###########################
-->
<find in="$myString_Temp" value="mycompany.com" name="myNum_offset2" >
<register name="misc" value="ToInternal" />
</find>
<!-- This section extracts just the part that follows the @ sign. Very handy if you want to see all emails going to a domain but don't care who they were to or from. -->
<move value="$myNum_offset" direction="reverse">
<find in="$myString_Temp" value="@" name="myNum_offset2" >
<increment name="myNum_offset2" value="1" />
<move value="$myNum_offset2" />
<decrement name="myNum_offset" value="$myNum_offset2" />
<read name="myString_Temp2" length="$myNum_offset" >
<register name="emaildomain" value="$myString_Temp2" />
</read>
</find>
</move>
<assign name="myString_Temp" value="$myString_Empty"/>
</if>
</read>
</find>
</match>
<!--
Extracts the mail client used if preceded by "X-Mailer:"
-->
<match name="myToken_Mailer">
<if name="myNum_FoundEHLO" notequal="1">
<end/>
</if>
<find name="myNum_offset" value="&#x0d;&#x0a;" length="64">
<read name="myString_Temp" length="$myNum_offset">
<if name="myString_Temp" notequal="$myString_Empty">
<register name="mailer" value="$myString_Temp" />
<assign name="myString_Temp" value="$myString_Empty"/>
</if>
</read>
</find>
</match>
<!--
Extracts the content type used if preceded by "Content-Type:"
-->
<match name="myToken_ContentType">
<if name="myNum_FoundEHLO" notequal="1">
<end/>
</if>
<find name="myNum_offset" value="&#x0d;&#x0a;" length="64">
<read name="myString_Temp" length="$myNum_offset">
<if name="myString_Temp" notequal="$myString_Empty">
<register name="content" value="$myString_Temp" />
<assign name="myString_Temp" value="$myString_Empty"/>
</if>
</read>
</find>
</match>
<!--
Extracts the email content description used if preceded by "Content-Description:"
Note: in most cases this will be a file name
-->
<match name="myToken_DescType">
<if name="myNum_FoundEHLO" notequal="1">
<end/>
</if>
<find name="myNum_offset" value="&#x0d;&#x0a;" length="64">
<read name="myString_Temp" length="$myNum_offset">
<if name="myString_Temp" notequal="$myString_Empty">
<register name="desc" value="$myString_Temp" />
<assign name="myString_Temp" value="$myString_Empty"/>
</if>
</read>
</find>
</match>
<!--
Extracts the content disposition used if preceded by "Content-Disposition:"
-->
<match name="myToken_DispType">
<if name="myNum_FoundEHLO" notequal="1">
<end/>
</if>
<find name="myNum_offset" value="&#x0d;&#x0a;" length="64">
<read name="myString_Temp" length="$myNum_offset">
<if name="myString_Temp" notequal="$myString_Empty">
<register name="disp" value="$myString_Temp" />
<assign name="myString_Temp" value="$myString_Empty"/>
</if>
</read>
</find>
</match>
</parser>
</parsers>
This parser has already given me a massive improvement in the visibility of email to and from the company I work at. I hope it does the same for you.
2013-09-03 11:17 AM
Thanks for sharing! It is a good solution for getting information from devices that write a single event across multiple lines of a log file.