Microsoft has been converting customers to O365 for a while; as a result, more and more traffic is being routed from on-premises networks out to Microsoft's cloud, potentially putting it into the visibility of NetWitness. Being able to group that traffic into a bucket for potential whitelisting, or at the very least identification, could be a useful ability.
Microsoft used to provide an XML file listing all of the IPv4 ranges, IPv6 ranges, and URLs required for accessing their O365 services. That file is being deprecated in October of 2018 in favor of API access.
This page gives a great explanation of the data in the API and how to interact with it, along with Python and PowerShell scripts to grab the data for use in firewalls and similar devices.
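For context, the new web service is a simple JSON-over-HTTPS API. The sketch below shows one way to pull the "worldwide" endpoint set and split it into the three buckets this content uses; the endpoints.office.com URL and the `clientrequestid` GUID parameter are from Microsoft's documented service, but the splitting helper is just an illustration, not part of the published scripts.

```python
import json
import urllib.request
import uuid

ENDPOINTS_URL = "https://endpoints.office.com/endpoints/worldwide"

def fetch_endpoints(client_request_id=None):
    """Download the current O365 endpoint set (a list of dicts).
    The service requires a clientrequestid GUID on every call."""
    client_request_id = client_request_id or str(uuid.uuid4())
    url = f"{ENDPOINTS_URL}?clientrequestid={client_request_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def split_endpoints(endpoint_set):
    """Split an endpoint set into IPv4 CIDRs, IPv6 CIDRs, and URLs/hostnames,
    mirroring the three output files described below."""
    ipv4, ipv6, urls = set(), set(), set()
    for entry in endpoint_set:
        for cidr in entry.get("ips", []):
            (ipv6 if ":" in cidr else ipv4).add(cidr)
        urls.update(entry.get("urls", []))
    return sorted(ipv4), sorted(ipv6), sorted(urls)
```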
The PowerShell script is where I started, so that a script could be run on a client workstation to determine if there were any updates and then apply the relevant data to the NW environment. Eventually, hopefully this gets into the generic whitelisting process that is being developed so that it is programmatically delivered to NW environments.
The script provided by Microsoft was modified to create three output files for use in NetWitness.
The IP output files are in a format that can be used as feeds in NetWitness; the GitHub link with the code provides the feed XML definitions that map them to the same keys as the Lua parser, so there is alignment between the three.
The o365urlOut.txt file is used by a Lua parser to match against the alias.host key. A Lua parser was used because of a limitation in the feeds engine that prevents wildcard matching: matches in feeds need to be exact, and some of the hosts provided by the API are of the form *.domain.com. The Lua parser attempts a direct exact match first, then falls back to subdomain matching to see if there are any hits there.
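The exact-then-wildcard fallback is easiest to see in a few lines of code. This is a Python sketch of the same lookup idea, not the Lua parser itself: try the full hostname, then walk up the label chain so that mail.outlook.com can hit a *.outlook.com entry.

```python
def match_host(host, exact_hosts, wildcard_hosts):
    """Return the matching list entry for a hostname, or None.
    Exact entries are plain hostnames; wildcard entries look like
    '*.outlook.com'."""
    if host in exact_hosts:
        return host
    labels = host.split(".")
    # Strip one leading label at a time and test the wildcard form,
    # stopping before the bare TLD.
    for i in range(1, len(labels) - 1):
        candidate = "*." + ".".join(labels[i:])
        if candidate in wildcard_hosts:
            return candidate
    return None
```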
The Lua parser ships with the host list that was current as of the published version; as Microsoft updates their API, the list needs to be refreshed. That's where the PS1 script comes in. It can be run from a client workstation; if there are changes, open the output txt file, copy the text into the Decoder > Config > Files tab, and replace the text in the parser to include the published changes. The decoder then needs to have its parsers reloaded, which can be done from the REST interface or the explore menu to reload the content into the decoder. You can also push the updated parser to all your other Log and Packet Decoders to keep them up to date as well.
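For the REST route, something along these lines should trigger the reload; the hostname is hypothetical, and you should confirm the port for your service type before relying on it (packet decoders typically expose REST on 50104, log decoders on 50102).

```shell
# Hypothetical hostname; sends the 'reload' message to the decoder's
# parsers node over the NetWitness REST interface.
curl -u admin 'http://packet-decoder.example.com:50104/decoder/parsers?msg=reload'
```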
All of this content writes its output into the filter meta key.
My knowledge of powershell is pretty close to 0 at the beginning of this exercise, now it's closer to 0.5.
To Do Items you can help with:
Ideally I would like the script to output the serviceArea of each URL or IP network, so that you can tell which O365 service the content belongs to, giving you more granular data on what part of the suite is being used.
If you know how to modify the script to do this, I'm more than happy to update the script to include it. Ideally, three to four levels of filter would be sufficient granularity, I think.
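To make the idea concrete, here is one way layered filter values could be derived from the serviceArea field in the API response. The "office365" prefix and dot-separated layering are my assumptions about what those levels might look like, not something the current script produces.

```python
def filter_values(entry, max_depth=3):
    """Build layered filter values from an O365 endpoint entry, e.g.
    {'serviceArea': 'Exchange'} -> ['office365', 'office365.exchange'].
    Prefix and layering scheme are illustrative assumptions."""
    values = ["office365"]
    area = entry.get("serviceArea", "").lower()
    if area:
        values.append("office365." + area)
    return values[:max_depth]
```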
Changes you might make:
The key to read from is alias.host. If you have logs that write values into domain.dst or host.dst that you want considered, and you are on NW11, you can change the key to host.all to include all of those at once in the filtering (just make sure that key is in your index-decoder-custom.xml).
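If host.all is not already defined there, the entry would look something like the fragment below; the description and level values are assumptions, so adjust them to match the conventions in your own custom index files.

```xml
<!-- index-decoder-custom.xml: example entry, values are illustrative -->
<key description="Hostname All" format="Text" level="IndexNone" name="host.all"/>
```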
Benefits of using this:
Ability to reduce the noise on the network from known or trusted communications to Microsoft, which could be treated as lower priority. This is especially useful when investigating outbound traffic, where you can remove known O365 traffic (PowerShell from an endpoint to the internet != Microsoft).
As an FYI, so far all of the test data I have lists the outbound traffic as heading to org.dst='Microsoft Hosting'. I'm sure at a wider scale of data that isn't true, but so far the whitelist lines up 100% with that org.dst.