2016-03-20 07:48 AM
Hello,
I have some usefull data in user.dst and OS metakeys in SA from one event souce (this data I receive from 3rd feed). I wish feed this data (user.dst and OS) some events where user.dst exist. How I can do it? How I can extract this data like CSV from SA and the feed some events like new feed?
2016-03-24 12:22 PM
Here is version 2
This will generate up to 5 different feeds (configurable) so multiple values for a metakey can be captured. I also dont output any lines that have a blank result.
You can monitor progress of the script either my uncommenting all the #echo lines or taking at the files in /tmp/*result.txt
If the command to get the result times out with a 408 response, then the script will keep retrying until it gets an answer.
I might be a bit quiet for the next two weeks or so as I am off on my Easter break. Post any feedback and I will pick it up on my return.
#!/bin/bash
#
# SCRIPT: MakeFeed.sh
# AUTHOR: David Waugh RSA Technical Support
# DATE: 24 March 2016
# REV: 2
#
# PLATFORM: Centos 6
#
# PURPOSE:
# LICENSE: GNU Public License v2 (http://gnu.org/licenses/)
# Copyright (C) 2016
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE.
#
# See the GNU General Public License for more details.
#
# Changes:
# 1.2 adds &expirey=600 to prevent queries giving a 408 TimeOut error Message
# adds the ability for up to 5 different values of a metakey. We will create a seperate feed for each set of values
# For example output myfeed1.csv will contain the first set of values. myfeed2.csv will contain the second set of values as so on.
#
###################################################################################
##### PARAMETERS THAT CAN BE CHANGED
MAXRESULTS=200
USER=admin
PASSWORD=netwitness
BROKER=192.168.123.249:50103
OUTPUTFEEDFILE=/var/www/html/myfeed
OUTPUTFEEDFILEEXT=.csv
MAXVALUESPERKEY=5
# The Callback key we will use in our feed
LINKFIELD="ip.src"
#The Meta Keys that we want in our feed
METAKEYS="os.name os.platform os.sname os.type os.ver brand model bot.info clientinfo.type clientinfo.name clientinfo.sname clientinfo.ver clientinfo.plat"
###### END PARAMETERS THAT CAN BE CHANGED
rawurlencode() {
local string="${1}"
local strlen=${#string}
local encoded=""
local pos c o
for (( pos=0 ; pos<strlen ; pos++ )); do
c=${string:$pos:1}
case "$c" in
[-_=.:/\&\?~a-zA-Z0-9] ) o="${c}" ;;
* ) printf -v o '%%%02x' "'$c"
esac
encoded+="${o}"
done
echo "${encoded}" # You can either set a return variable (FASTER)
REPLY="${encoded}" #+or echo the result (EASIER)... or both... :p
}
selectdistinct() {
# This function takes a string of words "the lazy brown fox"
# It outputs "disinct(the),distinct(lazy),distinct(brown),distinct(fox)"
OUTPUT=""
for word in ${1}
do
OUTPUT+=$(echo "distinct($word),")
done
#Remove the last comma
SELECTDISTINCT=$(echo "${OUTPUT%?}")
}
blankresult() {
#This function takes a string of words "the lazy brown fox"
# It output "^^^" basically a ^ for each space
OUTPUT=""
for word in ${1}
do
OUTPUT+=$(echo "\^")
done
BLANKRESULT=$OUTPUT
}
CurrentTime=$(date -u +"%Y-%b-%d %H:%M:%S")
#echo "Current Time is: " $CurrentTime
PreviousTime=$(date -u -d '1 hour ago' +"%Y-%b-%d %H:%M:%S")
#echo "1 Hour Ago was: " $PreviousTime
EXISTS=$(echo "$METAKEYS" |sed 's/ / exists ||/g')
SELECT=$(echo "$METAKEYS" |sed 's/ /,/g')
selectdistinct "$METAKEYS"
EXISTS="$EXISTS exists"
#echo "EXISTS: " $EXISTS
#echo "SELECT: " $SELECT
#echo "SELECTDISTINCT:" $SELECTDISTINCT
blankresult "$METAKEYS"
#echo "BLANKRESULT: " $BLANKRESULT
# First Get All Link Fields where our Meta Keys Exist
#URL="http://$BROKER/sdk?msg=query&expirey=600&size=$MAXRESULTS&query=select distinct($LINKFIELD) where $EXISTS&time='$PreviousTime'-'$CurrentTime'&force-content-type=text/plain"
#echo "URL: " $URL
URL=$( rawurlencode "$URL")
#echo "Encoded URL: " $URL
curl -s --user "$USER:$PASSWORD" "$URL" |grep value |cut -d " " -f 9 |cut -d "=" -f 2 >/tmp/linkfields.txt
#curl -s --user "$USER:$PASSWORD" "$URL" >/tmp/linkfields.output
# Print out the header for our output feed files
COUNTER=1
while [ $COUNTER -le $MAXVALUESPERKEY ]; do
echo "#"$LINKFIELD,$SELECT > $OUTPUTFEEDFILE$COUNTER$OUTPUTFEEDFILEEXT
let COUNTER=COUNTER+1
done
while read link; do
# We know that our linkfield has values for out METAKEYS so lets get the values over the last hour
URL="http://$BROKER/sdk?msg=query&expirey=600&size=$MAXRESULTS&query=select $SELECTDISTINCT where $LINKFIELD=$link&time='$PreviousTime'-'$CurrentTime'&force-content-type=text/plain"
URL=$( rawurlencode "$URL")
# Our Result will contain ^M Line endings so we replace these with newline characters
(curl -s --user "$USER:$PASSWORD" "$URL" >/tmp/$link.result.txt)
# We shouldnt, but sometimes we are seeing 408 Request Timeout so resubmit query if we do
while grep -Fxq "408 Request Timeout: Request Timeout" /tmp/$link.result.txt;
do
#echo "408 Request Timeout: Request Timeout FOUND: " $link " Will retry "
(curl -s --user "$USER:$PASSWORD" "$URL" >/tmp/$link.result.txt)
done
dos2unix -q -o /tmp/$link.result.txt
COUNTER=1
while [ $COUNTER -le $MAXVALUESPERKEY ]; do
OUTPUT=$link
OUTPUT+="^"
for meta in $METAKEYS
do
#cat /tmp/$link.result.txt |grep value |awk -v var=$meta -v count=$COUNTER '$0 ~ var {i++}i==count{print; exit}' >/tmp/$link.$meta.$COUNTER.output.txt
OUTPUT+=$(cat /tmp/$link.result.txt |grep value |awk -v var=$meta -v count=$COUNTER '$0 ~ var {i++}i==count{print; exit}' |cut -d "=" -f 6 |sed 's/ type//' |xargs )
OUTPUT+="^"
done
echo $OUTPUT |grep -v "$BLANKRESULT" >> $OUTPUTFEEDFILE$COUNTER$OUTPUTFEEDFILEEXT
let COUNTER=COUNTER+1
done
rm -rf /tmp/$link.result.txt
done </tmp/linkfields.txt
2016-03-24 12:27 PM
I've found a problem where the metakeys are getting jumbled up. Taking a look....
In feed1.csv you may get Chrome and Version 11
In feed2.csv you may get Internet Explorer and 49.0
This isnt really a problem, but make sure that you use all the feeds generated so that the IP address will be associated with Chrome Version 49 and Internet Explorer Version 11.
This is because my results output looks as follows
more /tmp/192.168.202.1.result.txt
[id1=23129796050 id2=23129796049
id1=22880915678 id2=23096608016 count=5264 format=65 value=GNU/Linux type=os.name flags=6 group=0
id1=20351595283 id2=23129793200 count=816154 format=65 value=Windows type=os.name flags=6 group=0
id1=20351595286 id2=23129793201 count=821271 format=65 value=x64 type=os.platform flags=6 group=0
id1=22665076930 id2=23114088008 count=15 format=65 value=x86 type=os.platform flags=6 group=0
id1=22880915680 id2=23096608018 count=5264 format=65 value=LIN type=os.sname flags=6 group=0
id1=20351595284 id2=23129793202 count=816154 format=65 value=WIN type=os.sname flags=6 group=0
id1=22872698884 id2=23129793203 count=25402 format=65 value=7 type=os.ver flags=6 group=0
id1=20351595285 id2=23129793196 count=795860 format=65 value=8.1 type=os.ver flags=6 group=0
id1=22665076929 id2=23114088007 count=15 format=65 value=95 type=os.ver flags=6 group=0
id1=23070028924 id2=23096608020 count=5255 format=65 value=Generic Bot type=bot.info flags=6 group=0
id1=20351595279 id2=23129793204 count=821262 format=65 value=browser type=clientinfo.type flags=6 group=0
id1=20572506069 id2=22934192278 count=23 format=65 value=library type=clientinfo.type flags=6 group=0
id1=20351595280 id2=23129793198 count=800252 format=65 value=Chrome type=clientinfo.name flags=6 group=0
id1=20890236649 id2=23102677369 count=665 format=65 value=Firefox type=clientinfo.name flags=6 group=0
id1=20355831800 id2=23129793205 count=20345 format=65 value=Internet Explorer type=clientinfo.name flags=6 group=0
id1=22880915682 id2=22909726904 count=9 format=65 value=Java type=clientinfo.name flags=6 group=0
id1=20572506070 id2=22934192279 count=14 format=65 value=Python urllib type=clientinfo.name flags=6 group=0
id1=20351595281 id2=23129793199 count=800252 format=65 value=CH type=clientinfo.sname flags=6 group=0
id1=20890236650 id2=23102677370 count=665 format=65 value=FF type=clientinfo.sname flags=6 group=0
id1=20355831801 id2=23129793206 count=20345 format=65 value=IE type=clientinfo.sname flags=6 group=0
id1=23070028928 id2=23096608024 count=5255 format=65 value=1.7.0 type=clientinfo.ver flags=6 group=0
id1=22880915683 id2=22909726905 count=9 format=65 value=1.8.0 type=clientinfo.ver flags=6 group=0
id1=20355831802 id2=23129793207 count=21146 format=65 value=11.0 type=clientinfo.ver flags=6 group=0
id1=20572506071 id2=22934192280 count=14 format=65 value=2.1 type=clientinfo.ver flags=6 group=0
id1=20890236651 id2=22606314181 count=583 format=65 value=43.0 type=clientinfo.ver flags=6 group=0
id1=22837443520 id2=23102677371 count=82 format=65 value=45.0 type=clientinfo.ver flags=6 group=0
id1=20355463257 id2=23129791638 count=179529 format=65 value=48.0.2564.116 type=clientinfo.ver flags=6 group=0
id1=20901869436 id2=23129791579 count=495248 format=65 value=49.0.2623.87 type=clientinfo.ver flags=6 group=0
devices="192.168.123.240:56005=on 192.168.123.6:56005=on"
offlineDeviceCount=0
onlineDeviceCount=2
deviceElapsedTime="192.168.123.240:56005=00:00:08 192.168.123.6:56005=00:00:13"
]
2016-03-24 01:15 PM
Nope this isnt producing the correct information. I'm going to have a think...
2016-03-24 01:25 PM
Looks like there is something wrong in my select distinct query.
2016-03-24 02:26 PM
David,
Are you sure that this way (through script) is a better way to create this feed? Maybe take a look from another side? Maybe create report and save result as CSV and then via Other options put CSV on the URL or network share?
When we receive a correct feed how we can upload this feed and not duplicate the data in metakeys from microsoftiis? We will use a Transient flag?
2016-03-24 04:08 PM
2016-03-25 05:44 AM
Hello David,
I have no result. Something wrong. Could you try it with only one metakey like OS?
2016-03-25 08:03 AM
2016-03-25 10:42 AM
nice, simple and a lot more user friendly than mine!
2016-03-25 03:33 PM
David,
Do you know how prevent duplicate data on events where this metakey with same data exist? Maybe I can feed date with some clauses?