Query Aggregates 

This section explains the supported aggregate functions.

Supported Aggregate Functions

The following table lists the supported Aggregate Functions.

Aggregate Function Description Input data types Output data types
count Returns the count of meta values, which includes duplicate values as well. Numeric Numeric
countdistinct Returns the total number of distinct or unique values. Numeric Numeric
distinct Returns all the unique values. Any Any
first Returns the first occurrence of the meta value. Any Same as input
last Returns the last occurrence of the meta value. Any Same as input
sum Returns a sum of all non-NULL values of metaKey in a group. Numeric Numeric
avg (Average) Returns the average value of all non-NULL values of the metaKey within a group. Numeric Numeric
min (Minimum) Returns the minimum for all values of metaKey in each group. This value is based on order by field. Any Any
max (Maximum) Returns the maximum for all values of metaKey in each group. The maximum value is the value that is returned by order by field. Any Any
length Returns the length of the values of metakey. This is called a "scalar function" in SQL. Any Numeric

Examples of Queries and Results per Function

Count

This function returns the number of values for a specified meta key, that exclude null values but include duplicate ones. .

Example

The following figure shows a sample query for count function used for the destination IP and the respective source IP.

netwitness_110_count_function.png

The following figure shows the result for the above query.

netwitness_110_count_ip.png

Here, for each unique ip.src (source IP), the page returns the total number or count of ip.dst (destination IP) values, which include the duplicate values as well.

Note: If your NetWitness is currently on 10.5 or newer version and any of the NetWitness Core devices are on 10.3 or 10.4 versions, then some of the aggregate functions may display unexpected errors. However, aggregate functions such as sum() and count() are supported in 10.4 version.

Countdistinct

The countdistinct function returns the count of unique or distinct values for the metakey. In other words, countdistinct function can be used to retrieve a number of distinct values for the specified metakey.

The following figure shows a sample query where the countdistinct function is used along with IP source (ip.src) and data size(size).

Example

netwitness_110_countdistinct_func.png

The following figure shows the result for the above query.
netwitness_110_countdist_result.png

Here, the page displays the data size along with the total number or count of distinct filenames from the respective IP source. Unlike the count function, the countdistinct excludes the duplicate values from the result.

Distinct

This function returns all the unique or distinct values of the metakey.

Example

The following figure shows a sample query for distinct function used to retrieve e-mails, between various source and destination IP (ip.dst).

netwitness_110_distnct_func_query.png

The following figure shows the result for the above query.

netwitness_110_distinct_function.png

Here, the page displays the list of unique e-mails that were exchanged between the respective IP source and destination.

First

This function is used to retrieve the first value from an ordered sequence of values for a specified metakey.

Example

The following figure shows a sample query for first function used to retrieve the first destination city name.

netwitness_110_firstfunc_city.png

The following figure shows the result for the above query.

netwitness_110_first_city.png

Here, the page displays the the first destination city for the corresponding source and destination IP. You can use the first function to isolate a particular value from a search result.

Last

This function is used to retrieve the last value from an ordered sequence of values for a specified metakey.

Example

The following figure shows a sample query for last function used to retrieve the most recent user name.

netwitness_110_lastfunc_meta.png

The following figure shows the result for the above query.

netwitness_110_last_fullname.png

Here, the page displays the list of most recent or last usernames in full, that were exchanged between the source and destination IP.

Sum

This function returns the total of the non-NULL values of the metaKey within a group.

Example

The following figure shows the query for the Sum function used for packets.

netwitness_110_sum_packs.png

The following figure shows the result of the above query.

netwitness_110_sum_packets.png

Here the page displays the total or sum of the packets along with the size of the data for the respective destination country.

Avg

The average function returns the average of non-NULL values of the meta within a group.

Example

The following figure shows a sample query for average data size transmitted between a source and destination IP.

netwitness_110_avg_thresh.png

The following figure shows the result for the above query.

netwitness_110_avgmeta_val.png

Here, the page displays the average size of data exchanged between source and destination IP:

Max and Min

Max and Min functions provide the maximum and minimum for given values of a meta respectively.

The following figure shows a sample query for max and min functions for various data sizes, for source IP and destination country.

Example

netwitness_110_maxmin_meta.png

The following figure shows the result for the above query.

netwitness_110_max_min_res.png

Here, the page displays the max(size) and min(size) columns, along with the list of source IP and destination country. The max(size) column lists the maximum data sizes exchanged while the min(size) column lists the minimum data sizes that were exchanged.

Filter aggregate meta results with Max_threshold

You can further filter the results of any function by using the threshold rule action.

Example
Following is a sample query for max_threshold used along with the Max function in the Then field is:
max_threshold(5000,max(size))

The following figure shows the Build Rule screen for the above query.

netwitness_110_max_thresh_qry.png

Here the max_threshold is applied for data size with an upper limit of 5000. The following figure shows the result.

netwitness_110_max_thresh.png

Here, the result page displays the max(size) column, that lists the data sizes lesser than 5000 as this is the maximum threshold in the query, along with the corresponding IP source and the respective directory.

Filter aggregate meta results withMin_threshold

Similarly, min_threshold is used to filter the results for any function. A similar scenario as max_threshold is considered to explain this.

Example
Query for min_threshold used along with the Max function in the Then field is:
min_threshold(5000,max(size))

The following figure shows the Build Rule screen for the above query.

netwitness_110_min_thresh_qry.png

Here the min_threshold is applied for data size with a lower limit of 5000. The following figure shows the result.

netwitness_110_min_thresh.png

Here, the result page displays the max(size) column, that lists the data sizes greater than 5000 as this is the minimum threshold in the query, along with the corresponding IP source and the respective directory.

Note: Max_threshold and Min_threshold rule actions are common across all the functions, and can be used along with the other queries in the Then field to retrieve the respective output.

Length

This function returns the length of a meta value. In other words, Length function returns the number of bytes used to store the actual value.
For instance, for the value "Analytics" it returns the length as 9. Similarly, for an IPv4 ip.src, it returns 4 (representing 4 bytes).

Example

The following figure shows a sample query for the length function used for usernames.

netwitness_110_len_func.png

The following figure shows the result for the above query.

netwitness_110_len_meta.png

Here, the page displays the length of the usernames associated with the user account and their respective source IP.

Additional Information

When you query for aggregates (E.g. sum(size)) with Group By on a meta which has multiple values in a session, then the session with multiple values is accounted for aggregate calculation for each value of that meta.

Example

When you query for the Count aggregate function with Group By on Alias.host and if the column has multiple values in a session, then the session is counted for each occurrence, including the duplicate values.

Consider the following table.

SessionID Alias.host Ip.src Size

1

host-a, host-b, host-a

a

10

2

host-b, host-c, host-a, host-c

c

20

3

host-b, host-c, host-d

b

30

4

host-c, host-a

a

40

In the above table, alias.host for host-a and host-c has duplicate values listed for a single session. Let us consider the following query:

Select : alias.host, count(ip.src), sum(size)
Group By : alias.host

Here, host-a and host-c are present in 3 sessions and they are duplicated for two different sessions. However, the output is as shown below.

Alias.host count(Ip.src) Sum (size)

host-a

4

80

host-b

3

60

host-c

4

110

host-d

1

30

Output table shows that the count for host-a and host-c is 4. This is because for each alias.host value, the entire session is considered. Similarly to calculate sum (size), the same sessions are considered for each alias.host value.

In the report output if the number of rows has reached NWDB maximum aggregate rows defined in RE configuration, then a message Max Aggregate Row Limit Reached is displayed to indicate that there is more information to be displayed. The default limit is 1000, and you can change this value as per your requirement, in the Reporting Engine Configuration page .

110_NWDBagglimit.png