This section provides information about possible issues when using NetWitness UEBA.

Task Failure Issues in Airflow

Problem

The userId_output_entities task fails when the username contains a backslash.

Cause

When events with usernames containing a backslash character are passed through UEBA, the userId_output_entities task fails.

Solution

To resolve this issue, contact Customer Success to obtain the relevant files, and then perform the following steps:

  • Stop the airflow-scheduler service.
  • Remove all MongoDB documents in the "aggr", "accm", and "input" collections that contain a context.userId with a backslash. These documents can be located using the FindCollecionsContainsBackslash.js script.
  • Replace the /var/netwitness/presidio/flume/conf/adapter/transformers/authentication.json file with the updated authentication.json file.
  • Restart the airflow-scheduler service.
  • Validate that the next run of the userId_output_entities task completes successfully.
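For reference, the lookup in step 2 amounts to a filter on context.userId. The script below is only a sketch of what such a lookup can look like, not the actual FindCollecionsContainsBackslash.js (which must be obtained from Customer Success); the collection-name prefixes are assumptions.

```shell
# Hypothetical sketch only: the real FindCollecionsContainsBackslash.js is
# supplied by Customer Success. This writes a mongo-shell script that counts
# documents whose context.userId contains a literal backslash in collections
# whose names start with "aggr", "accm", or "input" (assumed naming).
cat > /tmp/find_backslash_userids.js <<'EOF'
["aggr", "accm", "input"].forEach(function (prefix) {
  db.getCollectionNames()
    .filter(function (name) { return name.indexOf(prefix) === 0; })
    .forEach(function (name) {
      var count = db.getCollection(name).count({ "context.userId": /\\/ });
      if (count > 0) { print(name + ": " + count + " document(s)"); }
    });
});
EOF
# Example invocation against the presidio database (not run here):
# mongo presidio /tmp/find_backslash_userids.js
```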

 

Problem

The AUTHENTICATION_userId_build_feature_historical_data task fails when the username contains a hashtag.

Cause

When events with usernames containing a hashtag character are passed through UEBA, the AUTHENTICATION_userId_build_feature_historical_data task fails.

Solution

To resolve this issue, contact Customer Success to obtain the relevant files, and then perform the following steps:

  • Stop the airflow-scheduler service.
  • Remove all MongoDB documents in the "aggr", "accm", and "input" collections that contain a context.userId with a hashtag. These documents can be located using the FindCollecionsContainsHashtagContextUserId.js script.
  • Replace the /var/netwitness/presidio/flume/conf/adapter/transformers/authentication.json file with the updated authentication.json file.
  • Restart the airflow-scheduler service.
  • Validate that the next run of the AUTHENTICATION_userId_build_feature_historical_data task completes successfully.

 

Problem

The output_forwarding_task task fails in the Airflow UI for the userId_hourly_ueba_flow DAG due to an Elasticsearch 'too many clauses' exception.

Cause

The failure is caused by an Elasticsearch exception with the following message: "caused_by":{"type":"too_many_clauses","reason":"maxClauseCount is set to 1024"}. The too_many_clauses error occurs when the number of clauses in an Elasticsearch query exceeds the configured maximum, which in this case was 1024. The output_forwarding_task exceeded this limit, which caused the failure.

Solution

To increase the max clause count value, execute the following steps:

  1. SSH to the UEBA server.

  2. Open the /etc/elasticsearch/elasticsearch.yml file.

  3. Update the max clause count parameter value:

    indices.query.bool.max_clause_count: 1500

  4. Restart the elasticsearch service using the following command:

    systemctl restart elasticsearch

    Note: After restarting, the task may fail and will be automatically retried.
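Steps 2 and 3 can be scripted so they handle both the add and the update case. The sketch below runs against a stand-in copy of the file for illustration; on the UEBA server, point CFG at /etc/elasticsearch/elasticsearch.yml instead.

```shell
# Sketch: set indices.query.bool.max_clause_count to 1500, covering both the
# "key already present" and "key missing" cases.
CFG=/tmp/elasticsearch.yml
printf 'cluster.name: ueba\n' > "$CFG"   # stand-in content for illustration
KEY='indices.query.bool.max_clause_count'
if grep -q "^${KEY}:" "$CFG"; then
  sed -i "s/^${KEY}:.*/${KEY}: 1500/" "$CFG"   # update the existing value
else
  printf '%s: 1500\n' "$KEY" >> "$CFG"          # append the setting
fi
grep "$KEY" "$CFG"
```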

 

Problem

The TLS model is taking too long to complete tasks.

Cause

The TLS_raw_model_task and TLS_aggr_model tasks take more than 20 hours to process the data.

Solution

To improve the processing time, enable the configuration in the application.properties file by performing the following steps.

  1. SSH to the UEBA server.

  2. Navigate to the following path:

    /etc/netwitness/presidio/configserver/configurations/application.properties

  3. Add or update the following configuration:

    presidio.ade.enable.short.time.range=true

Note: This is applicable only to version 12.3.

 

Problem

Out of Memory exception in Model DAG.

Cause

When you encounter an out of memory exception in any model DAG, it may be because of I/O operations with the database in a single thread. It could also be due to insufficient memory allocated for the respective operator of the task.

Solution

To prevent out of memory exceptions in any of the model DAGs, add or modify the configuration in the application.properties file. Follow these steps to make the required changes:

  1. SSH to the UEBA server.

  2. Navigate to the following path:

    /etc/netwitness/presidio/configserver/configurations/application.properties

  3. Do one of the following based on your requirement:

    • If the parameter is not present, add presidio.model.store.size=5000.

    • If the parameter is present, change the value of presidio.model.store.size to 5000.

Note: This is applicable only to version 12.3.1.
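The add-or-change choice in step 3 can be scripted idempotently. The sketch below works on a stand-in copy for illustration; on the UEBA server use /etc/netwitness/presidio/configserver/configurations/application.properties.

```shell
# Sketch of step 3 as an idempotent edit: change the value if the parameter
# exists, otherwise add it.
PROPS=/tmp/application.properties
: > "$PROPS"                                   # stand-in file for illustration
KEY='presidio.model.store.size'
if grep -q "^${KEY}=" "$PROPS"; then
  sed -i "s/^${KEY}=.*/${KEY}=5000/" "$PROPS"  # change the existing value
else
  printf '%s=5000\n' "$KEY" >> "$PROPS"        # add the parameter
fi
grep "$KEY" "$PROPS"
```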

 

Problem

An invalid username and password error is displayed in the Adapter DAGs after upgrading UEBA to version 12.3.1.

Cause

After upgrading UEBA to version 12.3.1, UEBA is unable to retrieve data from the Broker because the Broker password contains special characters, such as @, /, or !, which prevent it from being parsed.

Solution

Complete the following steps to reset the password of the Broker service.

Note: NetWitness recommends using only alphanumeric characters (no special characters) in the password for the Broker service.

  1. Log in to the NetWitness Platform.

  2. Go to the AdminIcon.png (Admin) > Services page.

  3. In the Services list, select the Broker service and click actions_button.png > View > Explore.

  4. On the Explore page of Broker service, in the left panel, click users > accounts > uebauser > config and set the new password in the Password (password) field.

  5. Go to Broker > Security, select the uebauser, and enter the new password in the Password and Confirm Password fields.

  6. Restart the Broker service.

  7. Run the ueba-server-config script with a new username and password:

    sh /opt/rsa/saTools/bin/ueba-server-config -u uebauser -p <password> -h <Admin Server/Head node IP> -t <Start time> -s '<Schemas>' -o broker -e -v

    For example, sh /opt/rsa/saTools/bin/ueba-server-config -u uebauser -p Netwitness12 -h 10.0.0.0 -t 2023-11-20T00:00:00Z -s 'AUTHENTICATION FILE ACTIVE_DIRECTORY PROCESS REGISTRY TLS' -o broker -e -v

 

Problem

Task failure in root DAG due to Airflow issue.

Cause

In the root DAG, one of the tasks fails unexpectedly due to an existing issue with the Airflow system.

task_root.png

Solution

Whenever this issue occurs, examine the logs of the specific failed task and verify whether the following log entry is present: dagrun.py:465} INFO - (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "task_instance_pkey". If it is, you can safely ignore the failure; it does not require further action.

This issue does not impact the functionality of UEBA.
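The log check above is a simple pattern match. The sketch below writes a stand-in log line for illustration; on the UEBA server, grep the real log file that Airflow keeps for the failed task.

```shell
# Sketch: detect the benign duplicate-key entry in a failed task's log.
# /tmp/task.log is a stand-in written here for illustration only.
LOG=/tmp/task.log
echo 'dagrun.py:465} INFO - (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "task_instance_pkey"' > "$LOG"
if grep -q 'UniqueViolation.*task_instance_pkey' "$LOG"; then
  echo 'Known benign Airflow issue; safe to ignore'
fi
```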

MongoDB I/O Operations Slowness Issue

 

Problem

Increased execution time for DAGs with Mongo I/O Operations.

Cause

Some of the DAGs in the system experienced increased execution time due to slow MongoDB Input/Output (I/O) operations.

Solution

To increase the MongoDB blocking-sort memory limit in the MongoDB configuration file, perform the following steps:

  1. SSH to the UEBA server.

  2. Open the /etc/mongod.conf file.

  3. Update the internalQueryMaxBlockingSortMemoryUsageBytes value to approximately 1 GB (1053554432 bytes).

    internalQueryMaxBlockingSortMemoryUsageBytes: 1053554432

  4. Restart the Mongod service using the following command:

    systemctl restart mongod

    Note: After restarting, the task may fail and will be automatically retried.
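In mongod.conf's YAML layout, this setting normally lives under a setParameter section (an assumption about the file's structure; merge with any existing setParameter section rather than adding a duplicate key). The sketch writes the fragment to a stand-in file; on the UEBA server, edit /etc/mongod.conf itself.

```shell
# Sketch: the YAML fragment that carries the blocking-sort memory setting.
cat > /tmp/mongod.conf.fragment <<'EOF'
setParameter:
  internalQueryMaxBlockingSortMemoryUsageBytes: 1053554432
EOF
cat /tmp/mongod.conf.fragment
```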

 

User Interface Inaccessible

Problem

The User Interface is not accessible.

Cause

You have more than one NetWitness UEBA service in your NetWitness deployment, and only one NetWitness UEBA service is supported per deployment.
Solution

Complete the following steps to remove the extra NetWitness UEBA service.

    1. SSH to the NW Server and run the following command to query the list of installed NetWitness UEBA services.
      # orchestration-cli-client --list-services|grep presidio-airflow
      ... Service: ID=7e682892-b913-4dee-ac84-ca2438e522bf, NAME=presidio-airflow, HOST=xxx.xxx.xxx.xxx:null, TLS=true
      ... Service: ID=3ba35fbe-7220-4e26-a2ad-9e14ab5e9e15, NAME=presidio-airflow, HOST=xxx.xxx.xxx.xxx:null, TLS=true
    2. From the list of services, determine which instance of the presidio-airflow service should be removed (by looking at the host addresses).

    3. Run the following command to remove the extra service from Orchestration (use the matching service ID from the list of services):
      # orchestration-cli-client --remove-service --id <ID-for-presidio-airflow-form-previous-output>
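The service IDs in step 3 can be extracted from the listing in step 1 with a one-line sed. The sketch below uses a stand-in file holding the sample output shown above; on the NW Server, pipe the orchestration-cli-client output instead.

```shell
# Sketch: extract presidio-airflow service IDs from the listing.
# /tmp/services.txt is a stand-in for the output of:
#   orchestration-cli-client --list-services | grep presidio-airflow
cat > /tmp/services.txt <<'EOF'
... Service: ID=7e682892-b913-4dee-ac84-ca2438e522bf, NAME=presidio-airflow, HOST=xxx.xxx.xxx.xxx:null, TLS=true
... Service: ID=3ba35fbe-7220-4e26-a2ad-9e14ab5e9e15, NAME=presidio-airflow, HOST=xxx.xxx.xxx.xxx:null, TLS=true
EOF
# Print one service ID per line.
sed -n 's/.*ID=\([0-9a-f-]*\),.*/\1/p' /tmp/services.txt
```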

Note: Run the following command to update NW Server to restore NGINX:
# orchestration-cli-client --update-admin-node

    4. Log in to NetWitness, go to AdminIcon_25x22.png (Admin) > Hosts, and remove the extra NetWitness UEBA host.

Get UEBA Configuration Parameters

Issue

How to get UEBA configuration parameters?

Explanation

To get the main UEBA configuration parameters, run the curl http://localhost:8888/application-default.properties command from the UEBA machine.
ParaConf_1453x584.png
The main parameters returned are the following:

  • uiIntegration.brokerId: The Service ID of the NW data source (Broker / Concentrator)
  • dataPipeline.schemas: List of schemas processed by the UEBA
  • dataPipeline.startTime: The date the UEBA started consuming data from the NW data source
  • outputForwarding.enableForwarding: The UEBA Forwarder status
Resolution

See the resolution for these statistics in the Troubleshooting UEBA Configurations section.
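The four main parameters can be pulled out of the properties dump with a grep. The values below are illustrative stand-ins; on the UEBA machine, populate the file with the curl command shown above instead.

```shell
# Sketch: extract the four main parameters from the properties dump.
# Stand-in values for illustration; on the UEBA machine use:
#   curl -s http://localhost:8888/application-default.properties > /tmp/ueba.props
cat > /tmp/ueba.props <<'EOF'
uiIntegration.brokerId=00000000-0000-0000-0000-000000000000
dataPipeline.schemas=AUTHENTICATION,FILE,TLS
dataPipeline.startTime=2023-11-20T00:00:00Z
outputForwarding.enableForwarding=true
EOF
grep -E '^(uiIntegration\.brokerId|dataPipeline\.(schemas|startTime)|outputForwarding\.enableForwarding)=' /tmp/ueba.props
```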

 

Check UEBA Progress Status Using Airflow

Issue

How to check UEBA progress status using Airflow?

 

Note: To access the Airflow UI, you must use the deploy_admin credentials.

  1. Navigate to https://<UEBA-host-name>/admin and enter the admin username and the deploy_admin password. The following image of the Airflow home page shows the system working as expected.
    UEBAArFlw_531x269.png
  2. Make sure that no red or yellow circles appear in the main page:

    • A red circle indicates that a task has failed.
    • A yellow circle indicates that a task has failed and is awaiting a retry.
    • If a failed or up-for-retry task appears, investigate the root cause of the problem.
  3. Make sure the system continues to run.

  4. Click the Browse button and select Task Instance.

  5. Add the following filters: State = running and Pool = spring_boot_jar_pool.
    The Task Instance page is displayed.
    UEBAStatus_403x117.png

The Execution Date column shows the current time window for each running task. Make sure the execution date is later than the UEBA start date and that new tasks with an updated date are added to the table.


 

Check if Data is Received on the UEBA Using Kibana

Issue

How to check if data is received on the UEBA using Kibana?
Explanation

Note: To access the Kibana UI, you must use the deploy_admin credentials.

Navigate to https://<UEBA-host-name>/kibana and enter the admin username and deploy_admin password. To check that data is flowing to the UEBA, go to the Adapter Dashboard:

  1. Click the Dashboard tab in the left menu.
  2. Click Adapter Dashboard in the right menu.
  3. Select the relevant time range in the top bar.

The charts on this dashboard present the data already fetched by the UEBA.
adapter_dashboard_411x180.png

Scaling Limitation Issue

When installed on a Virtual Machine, you can determine the number of network events to be processed by referring to the latest version of the Learning Period Per Scale topic.

Note: If the scaling limits are exceeded, NetWitness recommends provisioning the UEBA on a physical appliance.

Issue

How to determine whether the scale of network events currently available exceeds the UEBA limitation.

Solution

To determine the network data volume, perform the following:

  • Run the query on the Broker or Concentrator that connects to UEBA using NetWitness UI:

service=443 && direction='outbound' && analysis.service!='quic' && ip.src exists && ip.dst exists && tcp.srcport!=443

Calculate the total number of events for the selected days (including weekdays with standard workload). To determine the number of network events that can be processed on a virtual machine in your environment, always refer to the latest version of the Learning Period Per Scale topic.
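The calculation above is a plain average over the per-day totals. The sketch below uses illustrative stand-in counts; substitute the daily totals read from the query results.

```shell
# Sketch: average the daily event counts collected from the query above.
# The counts are illustrative stand-ins for per-day totals.
COUNTS='18500000 21000000 19750000'
total=0; days=0
for c in $COUNTS; do
  total=$((total + c))
  days=$((days + 1))
done
echo "average per day: $((total / days))"
```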

 

Issue

Can UEBA for Packets be used if UEBA's supported scale is exceeded?

Solution

You must create or choose a Broker that is connected to a subset of Concentrators that does not exceed the supported limit.

To determine the network data volume, perform the following:

  • Run the query on the Concentrator that connects to UEBA using NetWitness UI:

service=443 && direction='outbound' && analysis.service!='quic' && ip.src exists && ip.dst exists && tcp.srcport!=443

Calculate the total number of events for the selected days (including weekdays with standard workload). If the average is above 20 million events per day, UEBA's supported scale is exceeded.

Note: The Broker must query all the available and required data, such as logs, endpoint, and network (packets). UEBA packet models are based on the whole environment; hence, make sure that the data parsed from the subset of Concentrators is consistent.

UEBA Policy Issue

Issue

After you create a rule under UEBA policy, duplicate values are displayed in the Statistics drop-down.
Solution

To remove the duplicate values, perform the following:

  1. Log in to MongoDB using the following command:
    mongo admin -u deploy_admin -p <password>
  2. Run the following commands on MongoDB:
    use sms;
    db.getCollection('sms_statdefinition').find({componentId :"presidioairflow"})
    db.getCollection('sms_statdefinition').deleteMany({componentId :"presidioairflow"})

Troubleshoot Using Kibana

Issue

After you deploy NetWitness UEBA, the connection between NetWitness and NetWitness UEBA is successful, but there are very few or no events in the Users > OVERVIEW tab.

  1. Log in to Kibana.
  2. Go to Table of Content > Dashboards > Adapter Dashboard.
  3. Adjust the Time Range on the top-right corner of the page and review the following:
    • Check whether new events are flowing.
    • In the Saved Events Per Schema graph, see the number of successful events per schema per hour.
    • In the Total Events vs. Success Events graph, see the total number of events and the number of successful events. The number of successful events should increase every hour.

    For example, in an environment with 1000 users or more, there should be thousands of authentication and file access events and more than 10 Active Directory events. If there are very few events, there is likely an issue with Windows auditing.

Solution

You must identify the missing events and reconfigure the Windows auditing.

  1. Go to INVESTIGATE > Navigate.
  2. Filter by device.type = "winevent_snare" or device.type = "winevent_nic".
  3. Review the events using the reference.id meta key to identify the missing events.
  4. Reconfigure the Windows auditing. For more information, see the NetWitness UEBA Windows Audit Policy topic.

 

Issue

The historical load is complete and events are arriving in the Adapter dashboard, but no alerts are displayed in the Users > OVERVIEW tab.
Solution
  1. Go to Kibana > Table of content > Scoring and model cache.
  2. Adjust the Time Range from the top-right corner of the page, and see if the events are scored.

 

Issue

The historical load is complete but no alerts are displayed in the Investigate > Users tab.
Solution
  1. Go to Kibana > Dashboard > Overview.

  2. Adjust the Time Range from the top-right corner of the page, and see how many users are analyzed and if any anomalies are found.

Troubleshoot Using Airflow

Issue

After you start running UEBA, you cannot remove a data source during the run process; doing so stops the process.
Solution

You must either continue the process until it completes, or remove the required data source from UEBA and rerun the process.

 

Issue

After you deploy UEBA, no events are displayed in Kibana > Table of content > Adapter dashboard even though Airflow has already processed the hours. This is due to a communication issue.
Solution

You must check the logs and resolve the issue.

  1. Log in to Airflow.
  2. Go to Admin > REST API Plugin.
  3. In the Failed Tasks Logs, click execute.
    A zip file is downloaded.
  4. Unzip the file and open the log file to view and resolve the error.
  5. In the DAGs > reset_presidio, click Trigger Dag.
    This deletes all the data and computes all the alerts from the beginning.

Note: During initial installation, if the hours are processed successfully but there are no events, you must trigger reset_presidio after fixing the data in the Broker. Do not reset if there are alerts.