Appendix C. Troubleshooting Version Installations and Upgrades

This section describes the error messages that the Hosts view displays when it encounters problems updating host versions or installing services on hosts. If you cannot resolve an update or installation issue using the following troubleshooting solutions, contact Customer Support.

This section first provides troubleshooting instructions for errors that may occur during the upgrade, and then for errors on specific hosts and services that may occur during or after an upgrade.

Problem Unable to boot the appliance after upgrading
Workaround
  1. Manually modify the GRUB boot line to set FIPS=0 so that the appliance boots.

  2. After the appliance boots, disable FIPS using the following command:

    manage-stig-controls --disable-control-groups 3 --host-all

  3. Verify that the line FIPS=1 has been removed from /boot/grub2/grub.cfg.

    • If it is still present, run the following command:

      grub2-mkconfig -o /boot/grub2/grub.cfg

  4. Reboot.

  5. Run the following command to enable FIPS:

    manage-stig-controls --enable-control-groups 3 --host-all

  6. Reboot again.
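
The check in step 3 can be scripted. The following is a minimal sketch, assuming the setting appears in the file as the literal string FIPS=1. The function name fips_line_present is hypothetical and is not part of NetWitness.

```shell
#!/bin/bash
# Hypothetical helper (not a NetWitness command): succeeds (exit 0) if the
# literal string FIPS=1 still appears in the given GRUB configuration file.
# The default path is the one named in step 3.
fips_line_present() {
    local cfg="${1:-/boot/grub2/grub.cfg}"
    grep -q 'FIPS=1' "$cfg"
}

# Example use on the appliance (regenerates the file only when needed):
#   if fips_line_present; then
#       grub2-mkconfig -o /boot/grub2/grub.cfg
#   fi
```

The function only reads the file. Regenerating the configuration remains a separate, explicit step.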

deploy_admin User Password Has Expired Error

Error Message

netwitness_credential-expired.png

Cause The deploy_admin user password has expired.
Solution

Reset your deploy_admin password.

  1. On the NW Server host only, run the following command.
    nw-manage --update-deploy-admin-pw
    Please enter the new deploy_admin account password: <new-deploy-admin-password>
    Please confirm the new deploy_admin account password: <new-deploy-admin-password>
  2. Review the output of the nw-manage --update-deploy-admin-pw command to verify that the deploy_admin password was successfully updated on all hosts. If the output shows that an NW host is down or that the update failed on it for any reason, resolve the communication failure, and then run nw-manage --sync-deploy-admin-pw --host-key <host-identifier> to synchronize the password between the NW Server and the host that failed.
  3. On the host that failed installation or orchestration, run the nwsetup-tui command and use the new deploy_admin password in response to the Deployment Password prompt.

Downloading Error

Error Message

netwitness_download_error.png

Problem When you select an update version and click Update > Update Host, the download starts but fails to complete.
Cause Version download files can be large and take a long time to download. If there are communication issues during the download, it fails.
Solution
  1. Try to update again.
  2. If it fails again with the same error, try to update using the offline methods as described in "Offline Method from Hosts View" or "Offline Method Using Command Line Interface" in the Upgrade Guide for NetWitness Platform. Go to the NetWitness All Versions Documents page and find NetWitness Platform guides to troubleshoot issues.

  3. If you are still not able to update, contact Customer Support.

Error Message

If you are upgrading from NetWitness Platform 11.x.x.x to 11.6.x.x or later, the offline UI upgrade fails with the Download error message.

Solution
  1. In the Command Line Interface (CLI):

    1. SSH to NW Server.

    2. Run the following command:
      upgrade-cli-client --upgrade --host-key <ID, IP address, hostname or display name of host> --version <version number>
      For example:
      upgrade-cli-client --upgrade --host-key <ID, IP address, hostname or display name of host> --version 11.6.0.0
  2. After the NW Server is successfully updated, log in to the NW Server user interface and go to (Admin) > Hosts, where you are prompted to reboot the host.
  3. Click Reboot Host from the toolbar.

    You can upgrade all the other hosts directly from the user interface:

    1. Click Begin Update from the Update Available dialog.
      After the host is upgraded, it prompts you to reboot the host.
    2. Click Reboot Host from the toolbar.

Error Deploying Version <version-number> Missing Update Packages

Error Message

netwitness_offline-ui-update-errordeployingversion.png

Problem

Error deploying version <version-number> is displayed in the Initialize Update Package for NetWitness Platform dialog after you click on Initialize Update if the update package is corrupted.

Solution
  1. Click Close to close the dialog.

  2. Remove the version folder from the staging folder.

  3. Make sure that the salt-master service is running.

  4. Recopy the update package zip file to the staging folder.
  5. In the Hosts view toolbar, select Check for Updates again.
    netwitness_chk4upds.png

  6. Click Initialize Update.
  7. Click Update > Update Hosts from the toolbar.
  8. Click Begin Update from the Update Available dialog.
    After the host is updated, it prompts you to reboot the host.
  9. Click Reboot from the toolbar.

Upgrade Failed Error

Error Message

When you update or install a device to version 11.2 or later, the following error can occur. It is recorded in /var/log/netwitness/config-management/chef-solo.log:

netwitness_image_1.png

Cause

A possible cause is that the target host cannot communicate with the Admin Server on port 53. The target host attempts to use the dnsmasq service on the Admin Server to resolve, in this case, 889e5752-6ae3-4286-a944-c18233f4ccbc, which is the salt minion ID of the Admin Server. To compare, run "cat /etc/salt/minion" on the Admin Server. Example output:

netwitness_image_2.png

Solution If possible, configure any firewalls between the target host and the Admin Server host to allow communication on port 53. If this is not possible, the workaround is to add the minion ID to the /etc/hosts file on the component hosts and, starting in the 11.4 release, modify the chef recipe so that it does not overwrite this workaround.
Workaround Refer to the Knowledge Base article "Install/Upgrade fails in NetWitness Platform because Resolv::ResolvError: no address for a particular host".
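
Before retrying the upgrade, it can help to confirm whether the minion ID is already present in the hosts file of a component host. The following is a minimal sketch of such a check. The function name hosts_has_name is hypothetical, and the sketch only reads the file. The Knowledge Base article remains the reference for the workaround itself.

```shell
#!/bin/bash
# Hypothetical helper (not a NetWitness command): succeeds if <name> appears
# as a host name or alias (any field after the address) on a non-comment
# line of a hosts-format file. The file defaults to /etc/hosts.
hosts_has_name() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }                             # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }  # compare each name field
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Example, using the minion ID from the error described above:
#   hosts_has_name 889e5752-6ae3-4286-a944-c18233f4ccbc && echo "entry present"
```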

Error Message

An error similar to the following appears in the error log when you try to update to version 11.6 or later:
netwitness_error_log.png

Cause Custom builds or RPMs are installed for certain components on the hosts, for example when hotfixes have been installed.
Solution

To resolve the issue, follow these steps.

  1. SSH to Admin Server.
  2. Locate the component descriptor file by running the following command.
    cd /etc/netwitness/component-descriptor/
  3. Open the component descriptor file by running the following command.
    vi nw-component-descriptor.json
  4. Search for the "packages" section of the component that has a custom build or RPM. For example, the following shows the package details for a "concentrator" host that has a custom build or RPM.
    "concentrator": {
      "cookbook_name": "rsa-concentrator",
      "service_names": ["rsa-nw-concentrator"],
      "family": "launch",
      "default_port": xxxx,
      "description": "Concentrator",
      "packages": [{
        "name": "rsa-nw-concentrator",
        "version": "11.6.0.0-2003001075220.5.cecf24b.e.17.centos"
      },
  5. Delete the entire version entry, including the comma (,) character that precedes it, from the packages section. For example, the section should look like the following after you delete the version details.
    "packages": [{
      "name": "rsa-nw-concentrator"
    },

Note: You must delete the version details for every host that has custom builds or RPMs in the component descriptor of the Admin Server.

  6. Run the upgrade process again.
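
To find every version entry in the component descriptor before editing it, a search such as the following can help. This is a minimal sketch, assuming the file uses standard JSON with a "version" key as in the example above. The function name list_version_entries is hypothetical. The sketch only prints matching lines with their line numbers, and you still have to decide which entries belong to custom builds.

```shell
#!/bin/bash
# Hypothetical helper (not a NetWitness command): print each line of the
# component descriptor that contains a "version" key, prefixed with its
# line number. The default path combines the directory and file name used
# in steps 2 and 3.
list_version_entries() {
    local file="${1:-/etc/netwitness/component-descriptor/nw-component-descriptor.json}"
    grep -n '"version"[[:space:]]*:' "$file"
}

# Example: keep a copy of the file, then list the entries to review.
#   cp -p nw-component-descriptor.json nw-component-descriptor.json.bak
#   list_version_entries nw-component-descriptor.json
```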

External Repo Update Error

Error Message

An error similar to the following is displayed when you try to update to a new version:
.Repository 'nw-rsa-base': Error parsing config: Error parsing "baseurl = 'https://nw-node-zero/nwrpmrepo /<version-number>/RSA'": URL must be http, ftp, file or https not ""

Cause There is an error in the path you specified.
Solution

Make sure that:

  • the URL exists on the NW Server host.
  • you used the correct path and removed any spaces from it.
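
The second check can be done on the URL string before you retry. The following is a minimal sketch. The function name baseurl_looks_valid is hypothetical, and the function inspects only the form of the URL (scheme and embedded spaces). It does not confirm that the URL exists on the NW Server host.

```shell
#!/bin/bash
# Hypothetical helper (not a NetWitness command): succeeds only if the URL
# contains no space and starts with one of the schemes named in the error
# message (http, ftp, file, or https).
baseurl_looks_valid() {
    local url="$1"
    case "$url" in
        *" "*) return 1 ;;                                # embedded space
        http://*|https://*|ftp://*|file://*) return 0 ;;  # accepted schemes
        *) return 1 ;;
    esac
}

# Example with a placeholder version number:
#   baseurl_looks_valid 'https://nw-node-zero/nwrpmrepo/<version-number>/RSA'
```

With the URL from the error message above, the check fails because of the space after nwrpmrepo.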

Host Update Failed Error

Error Message


netwitness_hstupdfailed.png

Problem When you select an update version and click Update > Update Host, the download process is successful, but the update process fails.
Solution
  1. Try to apply the version update to the host again.
    Often this is all you need to do.
  2. If you still cannot apply the new version update:
    Monitor the following logs on the NW Server as the update progresses (for example, run the tail -f command from the command line):
    /var/netwitness/uax/logs/sa.log
    /var/log/netwitness/orchestration-server/orchestration-server.log
    /var/log/netwitness/deployment-upgrade/chef-solo.log
    /var/log/netwitness/config-management/chef-solo.log

    /var/lib/netwitness/config-management/cache/chef-stacktrace.out
    The error appears in one or more of these logs.
  3. If you still cannot apply the update, gather the logs from step 2 and contact Customer Support.
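
The monitoring in step 2 can be started with one command. The following is a minimal sketch, assuming a tail implementation that accepts several files at once (GNU tail does). The names NW_UPDATE_LOGS and existing_files are hypothetical. The paths are the ones listed in step 2.

```shell
#!/bin/bash
# Log files named in step 2.
NW_UPDATE_LOGS="
/var/netwitness/uax/logs/sa.log
/var/log/netwitness/orchestration-server/orchestration-server.log
/var/log/netwitness/deployment-upgrade/chef-solo.log
/var/log/netwitness/config-management/chef-solo.log
/var/lib/netwitness/config-management/cache/chef-stacktrace.out
"

# Hypothetical helper (not a NetWitness command): print, one per line, only
# those of the given paths that exist as regular files.
existing_files() {
    local f
    for f in "$@"; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Example: follow whichever of the logs exist on this host.
#   tail -f $(existing_files $NW_UPDATE_LOGS)
```

Filtering first avoids an error from tail for a log that has not been created yet. If none of the files exist, do not run the tail command, because it would then wait for input on standard input.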

Missing Update Packages Error

Error Message

Initialize Update for Version xx.x.x.x
Missing the following update package(s)

Download Packages from NetWitness Link

Problem Missing the following update package(s) is displayed in the Initialize Update Package for NetWitness Platform dialog when you are updating a host from the Hosts view offline and there are packages missing in the staging folder.
Solution
  1. Click Download Packages from NetWitness Community in the Initialize Update Package for NetWitness Platform dialog.
    The NetWitness Community page that contains the update files for the selected version is displayed.

  2. Select the missing packages from the staging folder.
    The Initialize Update Package for NetWitness Platform dialog is displayed telling you that it is ready to initialize the update packages.

OpenSSL 1.1.x

Error Message

The following example illustrates an ssh error that can occur when the ssh client is run from a host with OpenSSL 1.1.x installed:
$ ssh root@10.1.2.3
ssh_dispatch_run_fatal: Connection to 10.1.2.3 port 22: message authentication code incorrect

Problem

Advanced users who want to ssh to a NetWitness Platform host from a client that is using OpenSSL 1.1.x encounter this error because of an incompatibility between CentOS 7.x and OpenSSL 1.1.x. For example:

$ rpm -q openssl
openssl-1.1.1-8.el8.x86_64

Solution

Specify the compatible cipher list on the command line. For example:

$ ssh -oCiphers=aes128-ctr,aes192-ctr,aes256-ctr root@10.1.2.3

I've read & consent to terms in IS user agreement.

root@10.1.2.3's password:

Last login: Mon Oct 21 19:03:23 2019

Patch Update to Non-NW Server Error

Error Message

The /var/log/netwitness/orchestration-server/orchestration-server.log has an error similar to the following error:
API|Failure /rsa/orchestration/task/update-config-management [counter=10 reason=IllegalArgumentException::Version '11.x.x.n' is not supported

Problem After you update the NW Server host to a version, you must update all non-NW Server hosts to the same version. For example, if you update the NW Server from 11.4.0.0 to 11.6.0.0 or later, the only update path for the non-NW Server hosts is the same version (that is, 11.6.0.0). If you try to update any non-NW Server host to a different version (for example, from 11.4.0.0 to 11.4.x.x), you will get this error.
Solution

You have two options:

  • Update the non-NW Server host to 11.6.0.0 or later, or
  • Do not update the non-NW Server host (keep it at its current version)

Reboot Host After Update from Command Line Error

Error Message

You receive a message in the User Interface to reboot the host after you update and reboot the host offline.
netwitness_asoc-50839.png

Cause You cannot use the CLI to reboot the host. You must use the User Interface.
Solution

Reboot the host in the Hosts view in the User Interface.

Reporting Engine Restarts After Upgrade

Problem

In some cases, after you upgrade to 11.6 or later from versions of 11.x, such as 11.4, the Reporting Engine service attempts to restart continuously without success.

Cause

The database files for live charts, alert status, or report status may not be loaded successfully as the files may be corrupted.

Solution

To resolve the issue, do the following:

  1. Check which database files are corrupted:

    Open the log file located at /var/netwitness/reserver/rsa/soc/reporting-engine/logs/reporting-engine.log and check for the following blocks:

    • If the live charts db file is corrupted, the following logs are displayed:

      netwitness_livecharts_650x195.png

    • If the alert status db file is corrupted, the following logs are displayed:

      netwitness_alertstatus_636x238.png

    • If the report status db file is corrupted, the following logs are displayed:

      org.h2.jdbc.JdbcSQLException: File corrupted while reading record: null. Possible solution: use the recovery tool [90030-196]

  2. To resolve the live charts database file corruption, perform the following steps:

    1. Stop the Reporting Engine service.

    2. Move the livechart.mv.db file from /var/netwitness/reserver/rsa/soc/reporting-engine/livecharts folder to a temporary location.

    3. Restart the Reporting Engine service.

      Note: Some live charts data may be lost on performing the above steps.

  3. To resolve the alert status or report status database file corruption, perform the following steps:

    1. Stop the Reporting Engine service.
    2. Replace the corrupted db file with the latest alertstatusmanager.mv.db or reportstatusmanager.mv.db file from /var/netwitness/reserver/rsa/soc/reporting-engine/archives folder.
    3. Restart the Reporting Engine service.

    For more information, see the Knowledge Base article Reporting Engine restarts After upgrade to NetWitness Platform 11.4.
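
Two parts of this procedure can be scripted: searching the log and moving a database file aside. The following is a minimal sketch. The function names re_log_shows_corruption and move_aside are hypothetical, and the search string covers only the report status message quoted above, because the other two messages are shown only as screenshots. Stopping and restarting the Reporting Engine service is not shown.

```shell
#!/bin/bash
# Hypothetical helper (not a NetWitness command): succeeds if the Reporting
# Engine log contains the H2 "File corrupted" message quoted above.
re_log_shows_corruption() {
    local log="${1:-/var/netwitness/reserver/rsa/soc/reporting-engine/logs/reporting-engine.log}"
    grep -q 'File corrupted while reading record' "$log"
}

# Hypothetical helper: move a file into a holding directory, creating the
# directory first if it does not exist.
move_aside() {
    local src="$1" dest_dir="$2"
    mkdir -p "$dest_dir" && mv -- "$src" "$dest_dir"/
}

# Example for the live charts file, after stopping the service.
# The holding directory /root/re-db-hold is an arbitrary choice:
#   move_aside /var/netwitness/reserver/rsa/soc/reporting-engine/livecharts/livechart.mv.db /root/re-db-hold
```

Moving the file rather than deleting it keeps the original available if Customer Support asks for it.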

Problem After you upgrade to version 11.6 or later, the Reporting Engine service does not restart.
Cause The Reporting Engine service may not start due to any of the following reasons.
- workspace.xml not updated.
- Time is not converted properly in livechart h2 database.
- JCR (Jackrabbit repository) is corrupted with primary key violation.
Solution

To resolve the issue, run the Reporting Engine Migration Recovery tool (rsa-nw-re-migration-recovery.sh) on the Admin Server where the Reporting Engine service is installed.

Note: You can find the Reporting Engine Migration Recovery tool in the following location.
/opt/rsa/soc/reporting-engine-<version number>-<Tag>/nwtools
For example:
/opt/rsa/soc/reporting-engine-11.6.0.0-<Tag>/nwtools

1. SSH to Admin Server.

2. To untar the RE (Reporting Engine) tool, run the following command.
tar -xvf rsa-nw-re-recovery-tool-bundle.tar

3. (Optional) If you want to untar the RE tool file in some other directory, you can create a directory and untar the RE tool. Run the following commands.

mkdir <NAME OF THE DIRECTORY>
tar -xvf rsa-nw-re-recovery-tool-bundle.tar --directory <PATH OF THE DIRECTORY>

4. To run the script, run the following command.
./<PATH OF THE DIRECTORY>/rsa-nw-re-recovery-tool.sh

For more information, see the Knowledge Base article Reporting Engine Migration Recovery Tool.

Log Collector Service (nwlogcollector)

Log Collector installation logs are posted to /var/log/install/nwlogcollector_install.log on the host running the nwlogcollector service.

Error Message <timestamp>.NwLogCollector_PostInstall: Lockbox Status : Failed to open lockbox: The lockbox stable value threshold was not met because the system fingerprint has changed. To reset the system fingerprint, open the lockbox using the passphrase.
Cause The Log Collector Lockbox failed to open after the update.
Solution Log in to NetWitness and reset the system fingerprint by resetting the stable system value password for the Lockbox as described in the "Reset the Stable System Value" topic under "Configure Lockbox Security Settings" topic in the Log Collection Configuration Guide.

Error Message <timestamp> NwLogCollector_PostInstall: Lockbox Status : Not Found
Cause The Log Collector Lockbox is not configured after the update.
Solution If you use a Log Collector Lockbox, log in to NetWitness and configure the Lockbox as described in the "Configure Lockbox Security Settings" topic in the Log Collection Configuration Guide.

Error Message <timestamp>: NwLogCollector_PostInstall: Lockbox Status : Lockbox maintenance required: The lockbox stable value threshold requires resetting. To reset the system fingerprint, select Reset Stable System Value on the settings page of the Log Collector.
Cause You need to reset the stable value threshold field for the Log Collector Lockbox.
Solution Log in to NetWitness and reset the stable system value password for the Lockbox as described in the "Reset the Stable System Value" topic under the "Configure Lockbox Security Settings" topic in the Log Collection Configuration Guide.

Error Message

The Decoder tries to start capturing events but fails.
netwitness_image_3.png

Solution

To resolve the issue, do the following:

  1. SSH to the Decoder host.
  2. Run the following commands.
    yum reinstall pfring*
    systemctl restart nwdecoder

NW Server

These logs are posted to /var/netwitness/uax/logs/sa.log on the NW Server Host.

Problem

After the upgrade, you notice that audit logs are not being forwarded to the configured Global Audit Setup;

or,

The following message is seen in the sa.log.
Syslog Configuration migration failed. Restart jetty service to fix this issue

Cause NW Server Global Audit setup migration failed to migrate from 11.4.x.x or 11.5.x.x to 11.6.0.0 or later.
Solution
  1. SSH to the NW Server.
  2. Submit the following command.
    orchestration-cli-client --update-admin-node

Orchestration

The orchestration server logs are posted to /var/log/netwitness/orchestration-server/orchestration-server.log on the NW Server Host.

Problem
  1. Tried to upgrade a non-NW Server host and it failed.
  2. Retried the upgrade for this host and it failed again.

You will see the following message in the orchestration-server.log.
"'file' __virtual__ returned False: cannot import name HASHES"

Cause The salt minion may have been upgraded but never restarted on the failed non-NW Server host.
Solution
  1. SSH to the non-NW Server host that failed to upgrade.
  2. Submit the following commands.
    systemctl unmask salt-minion
    systemctl restart salt-minion
  3. Retry the upgrade of the non-NW Server host.

Reporting Engine Service

Reporting Engine Update logs are posted to the /var/log/re_install.log file on the host running the Reporting Engine service.

Error Message <timestamp> : Available free space in /var/netwitness/re-server/rsa/soc/reporting-engine [ <existing-GB> ] is less than the required space [ <required-GB> ]
Cause Update of the Reporting Engine failed because you do not have enough disk space.
Solution Free up the disk space to accommodate the required space shown in the log message. See the "Add Additional Space for Large Reports" topic in the Reporting Engine Configuration Guide for instructions on how to free up disk space.
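
After freeing space, you can compare the available space with the required value from the log message before retrying the update. The following is a minimal sketch that uses the POSIX df output format. The function names available_gb and has_free_gb are hypothetical, and the sketch assumes that 1 GB means 1024 x 1024 kilobytes, which may differ slightly from how the installer measures space.

```shell
#!/bin/bash
# Hypothetical helper (not a NetWitness command): print the space available
# on the file system that holds <dir>, in whole gigabytes, rounded down.
# Assumes 1 GB = 1048576 KB and a file system name without spaces.
available_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# Hypothetical helper: succeeds if at least <required-GB> is available.
has_free_gb() {
    local dir="$1" required="$2"
    [ "$(available_gb "$dir")" -ge "$required" ]
}

# Example, with the directory from the log message and the required value
# taken from the same message:
#   has_free_gb /var/netwitness/re-server/rsa/soc/reporting-engine <required-GB>
```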

Event Stream Analysis

Problem After upgrading to version 11.6 or later, the ESA correlation server does not aggregate events from the configured data sources.
Error Message Invalid username or password at com.rsa.netwitness.streams.base.RecordSourceSubscription.run(RecordSourceSubscription.java:173)
Solution

To resolve the issue, do the following in the NetWitness user interface:

  1. Go to (Configure) > ESA Rules.
    The ESA Rules panel is displayed with the Rules tab open.
  2. In the Rules tab options panel, under Deployments, select a deployment.
  3. In the Data Sources section, select the data source and click the edit icon in the toolbar.
  4. In the Edit Service dialog, type the password for that data source.
  5. Click the Test Connection button to make sure that it can communicate with the ESA service and then click OK.

Note: Repeat the above procedure for all the configured data sources.

  6. After you finish making changes to the deployment, click Deploy Now to redeploy the ESA rule deployment.

Legacy Windows Log Collector

Problem
  • Legacy Windows Log Collector appears as inactive after SA is upgraded to version 12.0.0.0 and Legacy Windows Log Collector is upgraded to version 11.6.x or 11.7.x.

  • Legacy Windows Log Collector appears as inactive when the stack is upgraded to 12.0.0.0.

Cause Certificate update in the SA node.
Solution

Refer to the Legacy Windows Log Collector section in the Post Update Tasks.

ESA Troubleshooting Information

ESA Rules are Not Creating Alerts

If you are not seeing any alerts, check the status of the ESA rule deployments.

  1. Go to (Configure) > ESA Rules > Services tab.
    The Services view is displayed, which shows the status of your ESA services and deployments.
  2. In the options panel on the left, select an ESA service.
  3. For each service listed, look at the deployment tabs in the panel on the right. Each tab represents a separate ESA rule deployment.
  4. For each ESA rule deployment:
    1. In the Engine Stats section, look at the Events Offered and the Offered Rate. They confirm that the data is being aggregated and analyzed properly. If you see 0 for Events Offered, nothing is coming in for the deployment.
    2. In the Rule Stats section, look at the Rules Enabled and Rules Disabled. If there are any disabled rules, look in the Deployed Rule Stats section below to view the details of the disabled rules. Disabled rules show a white circle. Enabled rules show a green circle.

    netwitness_esa_verifydeplstatus.png

  5. If you notice any disabled rules that should be enabled:
    1. Go to (Configure) > ESA Rules > Rules tab and redeploy the ESA rule deployments that contain disabled rules.
    2. Go back to the Services tab and check to see if the rules are still disabled. If the rules are still disabled, check the ESA Correlation service log files, which are located at /var/log/netwitness/correlation-server/correlation-server.log.

Note: To avoid unnecessary processing overhead, the Ignore Case option has been removed from the ESA Rule Builder - Build a Statement dialog for meta keys that do not contain text data values. During the upgrade to 11.4 or later, NetWitness Platform does not modify existing rules for the Ignore Case option. If an existing Rule Builder rule has the Ignore Case option selected for a meta key that no longer has the option available, an error occurs if you try to edit the statement and try to save it again without clearing the checkbox.

Endpoint, UEBA, and Live Content Rules are Not Working

To support Endpoint and UEBA content as well as changes to ESA rules from Live, a data change from single-value (string) to multi-value (string array) is required for several meta keys within the ESA Correlation service. In NetWitness Platform 11.4 or later, ESA automatically adjusts the operator in the rule statement when there is a change from string to string array, but you may still need to make manual adjustments for the string array changes.

To change the string type meta keys to string array type meta keys manually in 11.4 or later, see “Configure Meta Keys as Arrays in ESA Correlation Rule Values” in the ESA Configuration Guide.

To use the latest Endpoint, UEBA, and Live content rules, the following default multi-valued meta keys are required on the ESA Correlation service in NetWitness Platform version 11.4 or later:

action, alert, alert.id, alias.host, alias.ip, alias.ipv6, analysis.file, analysis.service, analysis.session, boc, browserprint, cert.thumbprint, checksum, checksum.all, checksum.dst, checksum.src, client.all, content, context, context.all, context.dst, context.src, dir.path, dir.path.dst, dir.path.src, directory, directory.all, directory.dst, directory.src, email, email.dst, email.src, eoc, feed.category, feed.desc, feed.name, file.cat, file.cat.dst, file.cat.src, filename.dst, filename.src, filter, function, host.all, host.dst, host.orig, host.src, host.state, inv.category, inv.context, ioc, ip.orig, ipv6.orig, netname, OS, param, param.dst, param.src, registry.key, registry.value, risk, risk.info, risk.suspicious, risk.warning, threat.category, threat.desc, threat.source, user.agent, username

The following default single-valued meta keys are also required on the ESA Correlation service in NetWitness Platform 11.4 or later:

accesses, context.target, file.attributes, logon.type.desc, packets

To update your meta keys, see "Update the Multi-Valued and Single-Valued Parameter Meta Keys for the latest Endpoint, UEBA, and RSA Live Content Rules" in the ESA Configuration Guide.

If you used any meta keys in the ESA rule notification templates from the Required String Array or String Meta Keys list, update the templates with the meta key changes. See "Configure Global Notification Templates" in the System Configuration Guide. Go to the NetWitness All Versions Documents page and find NetWitness Platform guides to troubleshoot issues.

Note: Advanced EPL rules may get disabled and are not automatically updated so they must be fixed manually.

For additional troubleshooting information, see “Troubleshoot ESA” in the Alerting with ESA Correlation Rules User Guide for NetWitness Platform. Go to the NetWitness All Versions Documents page and find NetWitness Platform guides to troubleshoot issues.

Example ESA Correlation Server Warning Message for Missing Meta Keys

A warning message like the following examples in the ESA Correlation server error logs means that there is a difference between the default-multi-valued parameter and multi-valued parameter meta key values. Until the difference is resolved, the new Endpoint, UEBA, and Live content rules will not work. Completing the "Update the Multi-Valued and Single-Valued Parameter Meta Keys for the latest Endpoint, UEBA, and RSA Live Content Rules" procedure in the ESA Configuration Guide should fix the issue.

Multi-Valued Warning Message Example

2019-08-23 08:55:07,602 [ deployment-0] WARN Stream|[alert, alert_id, browserprint, cert_thumbprint, checksum, checksum_all, checksum_dst, checksum_src, client_all, content, context, context_all, context_dst, context_src, dir_path, dir_path_dst, dir_path_src, directory, directory_all, directory_dst, directory_src, email_dst, email_src, feed_category, feed_desc, feed_name, file_cat, file_cat_dst, file_cat_src, filename_dst, filename_src, filter, function, host_all, host_dst, host_orig, host_src, host_state, ip_orig, ipv6_orig, OS, param, param_dst, param_src, registry_key, registry_value, risk, risk_info, risk_suspicious, risk_warning, threat_category, threat_desc, threat_source, user_agent] are still MISSING from multi-valued

Single Value Warning Message Example

2019-08-23 08:55:07,602 [ deployment-0] WARN Stream|[accesses, context_target, file_attributes, logon_type_desc, packets] are still MISSING from single-valued
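
To turn such a warning into a list that can be compared with the required meta keys, the key names can be extracted from the log line. The following is a minimal sketch. The function name missing_meta_keys is hypothetical, and the conversion of every underscore to a dot is an assumption based on the two examples above (for example, context_target corresponds to context.target). It would be wrong for a meta key whose real name contains an underscore.

```shell
#!/bin/bash
# Hypothetical helper (not a NetWitness command): read ESA Correlation log
# lines on standard input and print, one per line, the meta keys listed in
# "are still MISSING from" warnings, with underscores converted to dots.
missing_meta_keys() {
    # Keep only the bracketed key list that precedes "are still MISSING from",
    # split it on commas, trim surrounding spaces, then convert _ to .
    sed -n 's/.*\[\([^]]*\)\] are still MISSING from.*/\1/p' |
        tr ',' '\n' |
        sed -e 's/^ *//' -e 's/ *$//' |
        tr '_' '.'
}

# Example, using the log location given earlier in this section:
#   missing_meta_keys < /var/log/netwitness/correlation-server/correlation-server.log
```

Lines that do not contain the warning produce no output, so the whole log can be passed in.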