Configure the Data Sources

You must configure NWDB, Warehouse, and Respond to generate Reports, Charts, and Alerts. Optionally, you can also configure Archiver, Collection, and Workbench data sources.

IMPORTANT: If you change the admin password on a NetWitness service that is used as a Reporting Engine data source, you must remove and then re-add the service as a data source.

Note: To execute Reports and Charts on an Analyst UI, make sure the admin adds the data sources to each Reporting Engine instance from the admin node using the relevant procedure described in this topic.

Configure a NWDB Data Source

To add a NWDB data source:

  1. Go to netwitness_adminicon_25x22.png (Admin) > Services.
  2. In the Services list, select the Reporting Engine service.
  3. Click netwitness_ic-actns.png > View > Config.

    The Services Config View of Reporting Engine is displayed.

  4. On the Sources tab, click netwitness_add.png netwitness_arrow.png > Available Services.

    The Available Services dialog is displayed.

    netwitness_106_available_sources.png

  5. Select a NWDB service you want to add and click OK.
  6. In this example, we are adding a Broker service. In the Service Information for Broker dialog, enter the service information and click OK.

    netwitness_106_serv_info_303x208.png

  7. When the service is successfully added, it is displayed in the Sources tab.

Note: The services with the Trust Model enabled must be added individually. You are prompted to provide a username and password for the selected service.

Configure a Warehouse Data Source

You can add a Warehouse data source to Reporting Engine so that you can extract data from the required services, store it in MapR or Hortonworks, and generate Reports and Alerts. The procedure to configure a Warehouse data source differs from that for other data source types. To extract data from a Warehouse data source, you must configure it using the following procedure.

Note: Warehouse Connector is still supported for reporting against a warehouse in NetWitness 11.x.

Prerequisite

Make sure you:

  • Add a Warehouse data source to Reporting Engine.
  • Set the Warehouse data source as the default source.
  • Verify that the HIVE server is running on all the Warehouse nodes. Use the following command to check the status of the HIVE server:

    status hive2 (MapR deployments)
    service hive-server2 status (Hortonworks deployments)

  • Verify that the Warehouse Connector is configured to write data to the warehouse deployments.
  • If Kerberos authentication is enabled for HiveServer2, make sure that the keytab file is copied to the /var/netwitness/re-server/rsa/soc/reporting-engine/conf/ directory in the Reporting Engine Host.

    Note: The rsasoc user should have read permissions for the keytab file. For more information, see Configure Data Source Permissions.

    Also, make sure that you update the keytab file location in the Kerberos Keytab File parameter in the Reporting Engine Service Config View. For more information, see Reporting Engine General Tab.
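The keytab prerequisite above can be spot-checked from the shell. This is a minimal sketch, assuming a keytab file named hive.keytab (the actual file name depends on your Kerberos setup); run it as the rsasoc user, for example via sudo -u rsasoc sh, so the result reflects the permissions Reporting Engine actually has.

```shell
# Sketch: verify that the keytab under the Reporting Engine conf
# directory is readable. The file name hive.keytab is an assumption;
# substitute the keytab name used in your deployment.
check_keytab() {
    if [ -r "$1" ]; then
        echo "readable: $1"
    else
        echo "not readable: $1"
    fi
}

check_keytab /var/netwitness/re-server/rsa/soc/reporting-engine/conf/hive.keytab
```

If the file reports as not readable, adjust its ownership or mode before updating the Kerberos Keytab File parameter.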

To add a Warehouse data source for MapR:

  1. Go to netwitness_adminicon_25x22.png (Admin) > Services.
  2. In the Services list, select the Reporting Engine service.
  3. Click netwitness_settings.png > View > Config.

    The Service Config view is displayed with the General tab open.

  4. Click the Sources tab.
  5. In the Sources tab, click netwitness_ic-adddrop.png and select New Service.

    The New Service dialog is displayed.

    netwitness_wc_device_re_382x414.png

  6. In the Source Type drop-down menu, select WAREHOUSE.
  7. In the Warehouse Source drop-down menu, select the warehouse data source.
  8. In the Name field, enter the host name of the Warehouse data source.
  9. In the HDFS Path field, enter the HDFS root path to which the Warehouse Connector writes the data.

    For example:

    Suppose /saw is the local mount point for HDFS that you configured while mounting NFS on the device, and you have installed the Warehouse Connector service to write to SAW. For more information, see the "Mount the Warehouse on the Warehouse Connector" topic in the Warehouse (MapR) Configuration Guide.

    If you have created a directory named Ionsaw01 under /saw and provided the corresponding Local Mount Path as /saw/Ionsaw01, then the corresponding HDFS root path would be /Ionsaw01.

    The /saw mount point implies / as the root path for HDFS. The Warehouse Connector writes the data to /Ionsaw01 in HDFS. If there is no data available in this path, the following error is displayed:

    "No data available. Check HDFS path"

    Make sure that /Ionsaw01/rsasoc/v1/sessions/meta contains Avro files of the metadata before performing a test connection.

  10. Select the Advanced checkbox to use the advanced settings, and fill in the Database URL field with the complete JDBC URL to connect to HiveServer2.

    For example:

    If Kerberos is enabled in HIVE, the JDBC URL is:

    jdbc:hive2://<host>:<port>/<db>;principal=<Kerberos server principal>

    If SSL is enabled in HIVE, the JDBC URL is:

    jdbc:hive2://<host>:<port>/<db>;ssl=true;sslTrustStore=<trust_store_path>;trustStorePassword=<trust_store_password>

    For more information on HIVE server clients, see https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients.

  11. If you are not using the advanced settings, enter the values for the Host and Port.

    • In the Host field, enter the IP address of the host on which HiveServer2 is hosted.

      Note: You can use the virtual IP address of MapR only if HiveServer2 is running on all the nodes in the cluster.

    • In the Port field, enter the HiveServer2 port of the Warehouse data source. By default, the port number is 10000.
  12. In the Username and Password fields, enter the JDBC credentials used to access HiveServer2.

    Note: You can also use LDAP mode of authentication using Active Directory. For instructions to enable LDAP authentication mode, see Enable LDAP Authentication.

  13. To enable Kerberos authentication, see Enable Kerberos Authentication.
  14. If you want to set the added Warehouse data source as the default source for the Reporting Engine, select the added Warehouse data source and click netwitness_setdefault.png.
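Before clicking Test Connection in step 9's path, you can confirm from the shell that the metadata path actually contains Avro files. This is a sketch that checks through the local /saw NFS mount from the example above; the count_avro helper and the exact directory layout are illustrative assumptions.

```shell
# Sketch: count Avro metadata files under the local NFS mount of the
# HDFS path. The directory below matches the /saw/Ionsaw01 example;
# substitute your own mount point and HDFS root path.
count_avro() {
    find "$1" -name '*.avro' 2>/dev/null | wc -l
}

META_DIR=/saw/Ionsaw01/rsasoc/v1/sessions/meta
if [ "$(count_avro "$META_DIR")" -gt 0 ]; then
    echo "Avro metadata found under $META_DIR"
else
    echo "No data available. Check HDFS path"
fi
```

If the check prints the error message, verify the Warehouse Connector stream before retrying the test connection.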
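As a worked example of the Kerberos URL template in step 10, the sketch below assembles a complete JDBC URL. The host, port, database, and principal values are all placeholders; substitute your own.

```shell
# Sketch: assemble the HiveServer2 JDBC URL from the Kerberos template
# in step 10. Every value below is a placeholder assumption.
HOST=hive.example.com
PORT=10000
DB=default
PRINCIPAL="hive/hive.example.com@EXAMPLE.COM"

URL="jdbc:hive2://${HOST}:${PORT}/${DB};principal=${PRINCIPAL}"
echo "$URL"
```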

To add a Warehouse data source for Hortonworks (HDP):

Note: Make sure you download the hive-jdbc-1.2.1-with-full-dependencies.jar file. This jar contains the Hive 1.2.1 driver that Reporting Engine uses to connect to a Hive 1.2.1 HiveServer2.

  1. SSH to the NetWitness server.
  2. In the /opt/rsa/soc/reporting-engine/plugins/ folder, take a backup of the following jar:

    hive-jdbc-0.12.0-with-full-dependencies.jar or hive-jdbc-1.0.0-mapr-1508-standalone.jar

  3. Remove the following jar:

    hive-jdbc-0.12.0-with-full-dependencies.jar or hive-jdbc-1.0.0-mapr-1508-standalone.jar

  4. Using WinSCP, copy the following jar to the /opt/rsa/soc/reporting-engine/plugins folder:

    hive-jdbc-1.2.1-with-full-dependencies.jar

  5. Restart the Reporting Engine service.
  6. Log in to NetWitness UI.
  7. Select the Reporting Engine service and select netwitness_ic-actns.png > View > Explore.
  8. In the hiveConfig, set EnableSmallSplitBasedSchemaLiteralCreation parameter to true.
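Steps 2 through 5 above can be sketched as shell commands. The plugin folder and jar names come from the steps; the .bak suffix, the /tmp staging location for the downloaded jar, and deferring the service restart to a printed reminder are assumptions for illustration.

```shell
# Sketch of steps 2-5: back up the old driver, remove it, install the
# new one, then restart Reporting Engine. Paths and jar names are from
# the steps above; the .bak suffix and /tmp staging dir are assumptions.
PLUGIN_DIR="${PLUGIN_DIR:-/opt/rsa/soc/reporting-engine/plugins}"
OLD_JAR=hive-jdbc-0.12.0-with-full-dependencies.jar
NEW_JAR=hive-jdbc-1.2.1-with-full-dependencies.jar

swap_hive_driver() {
    cp "$PLUGIN_DIR/$OLD_JAR" "$PLUGIN_DIR/$OLD_JAR.bak" &&
    rm "$PLUGIN_DIR/$OLD_JAR" &&
    cp "/tmp/$NEW_JAR" "$PLUGIN_DIR/" &&
    echo "installed $NEW_JAR; now restart the Reporting Engine service"
}
```

Run swap_hive_driver on the NetWitness server after staging the new jar in /tmp, then restart the Reporting Engine service as in step 5.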

Enable Jobs

Note: Warehouse Analytics is not supported in NetWitness 11.0 or later releases.

  1. Select the Enable Jobs checkbox.

    netwitness_pivotal_warehouse.png

    Note: Do not select Pivotal in the HDFS field as it is not supported.

  2. Enter the following details:

    1. Select the type of HDFS from the HDFS Type drop-down menu.

      • If you select the Hortonworks HDFS type, enter the following information:

        MapReduce Framework: For HDFS type Hortonworks, select Yarn as the MapReduce Framework.

        HDFS Username: Enter the username that Reporting Engine should claim when connecting to Hortonworks. For standard Hortonworks DCA clusters, this is 'gpadmin'.

        HDFS Name: Enter the URL to access HDFS. For example, hdfs://hdm1.gphd.local:8020.

        HBase Zookeeper Quorum: Enter the comma-separated list of host names on which the ZooKeeper servers are running.

        HBase Zookeeper Port: Enter the port number for the ZooKeeper servers. The default port is 2181.

        Input Path Prefix: Enter the output path of the Warehouse Connector (/sftp/rsasoc/v1/sessions/data/<year>/<month>/<date>/<hour>) up to the year directory. For example, /sftp/rsasoc/v1/sessions/data/.

        Output Path Prefix: Enter the location in HDFS where the data science job results are stored.

        ETL-Output Directory: Enter the location of the ETL output directory.

        Yarn Host Name: Enter the Hadoop YARN resource manager host name in the DCA cluster. For example, hdm3.gphd.local.

        Job History Server: Enter the Hadoop job history server address in the DCA cluster. For example, hdm3.gphd.local:10020.

        Yarn Staging Directory: Enter the staging directory for YARN in the DCA cluster. For example, /user.

        Socks Proxy: In a standard DCA cluster, most Hadoop services run in a local private network and are not reachable from Reporting Engine. You must therefore run a SOCKS proxy in the DCA cluster and allow access from outside the cluster. For example, mdw.netwitness.local:1080.

      • If you select the MapR HDFS type, enter the following information:

        MapReduce Framework: For HDFS type MapR, select Classic as the MapReduce Framework.

        Client Host Name: Enter the public IP address of any one of the MapR warehouse hosts.

        Client Host User: Enter a UNIX username on the given host that has access to execute MapReduce jobs on the cluster. The default value is 'mapr'.

        Client Host Password: To set up password-less authentication, copy the public key of the "rsasoc" user from /home/rsasoc/.ssh/id_rsa.pub to the "authorized_keys" file of the warehouse host, located at /home/mapr/.ssh/authorized_keys, with the assumption that "mapr" is the remote UNIX user.

        Client Host Work Dir: Enter a path that the given UNIX user (for example, "mapr") has write access to.

        Note: Reporting Engine uses the work directory to remotely copy the Warehouse Analytics jar files and start the jobs from the given host name. Do not use "/tmp", to avoid filling up the system temporary space. Reporting Engine manages the given work directory remotely.

        HDFS Name: Enter the URL to access HDFS. For example, to access a specific cluster, maprfs:/mapr/<cluster-name>.

        HBase Zookeeper Port: Enter the port number for the ZooKeeper servers. The default port is 5181.

        Input Path Prefix: Enter the output path (/rsasoc/v1/sessions/data/<year>/<month>/<date>/<hour>) up to the year directory. For example, /rsasoc/v1/sessions/data/.

        Input Filename: Enter the file name filter for Avro files. For example, sessions-warehouseconnector.

        Output Path Prefix: Enter the location in HDFS where the data science job results are stored.

        ETL-Output Directory: Enter the location of the ETL output directory.
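The password-less authentication setup described under Client Host Password can be sketched as follows. The warehouse host name is a placeholder; the sketch only prints the ssh-copy-id command so you can review it before running it on the Reporting Engine host.

```shell
# Sketch: build the ssh-copy-id command that appends the rsasoc public
# key to the mapr user's authorized_keys on the warehouse host. The
# host name is a placeholder assumption.
WAREHOUSE_HOST="${WAREHOUSE_HOST:-mapr-node.example.com}"
CMD="ssh-copy-id -i /home/rsasoc/.ssh/id_rsa.pub mapr@${WAREHOUSE_HOST}"
echo "$CMD"
```

Running the printed command once (and entering the mapr password when prompted) lets Reporting Engine start jobs on the warehouse host without a stored password.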

Enable Kerberos Authentication

  1. Select the Kerberos Authentication checkbox if the Warehouse is a Kerberos-enabled HIVE server.

    netwitness_wc_device_re_kerberos_option.png

  2. Fill in the fields as follows:

    Server Principal: Enter the principal used by the HIVE server to authenticate with the Kerberos Key Distribution Center (KDC) server.

    User Principal: Enter the principal that the HIVE JDBC client uses to authenticate with the KDC server for connecting to the HIVE server. For example, gpadmin@EXAMPLE.COM.

    Kerberos Keytab File: View the Kerberos keytab file location configured in the HIVE Configuration panel on the Reporting Engine General tab.

    Note: Reporting Engine supports only data sources configured with the same Kerberos credentials, such as the User Principal and keytab file.

  3. Click Test Connection to test the connection with the values entered.
  4. Click Save.

    The added Warehouse data source is displayed in the Reporting Engine Sources tab.

  5. Click netwitness_add.png netwitness_arrow.png > Available Services.

    The Available Services dialog box is displayed.

    netwitness_106_available_sources.png

  6. In the Available Services dialog box, select the service that you want to add as data source to the Reporting Engine and click OK.

    NetWitness adds this as a data source available to reports and alerts against this Reporting Engine.

    122_available_dev_RE_1122.png

    Note: This step is relevant only for an Untrusted model.

Set a Data Source as the Default Source

To set a data source to be the default source when you create reports and alerts:

  1. Go to netwitness_adminicon_25x22.png (Admin) > Services.
  2. In the Services list, select a Reporting Engine service.
  3. Select netwitness_ic-actns.png > View > Config.

    The Services Config View of Reporting Engine is displayed.

  4. Select the Sources tab.

    The Services Config View is displayed with the Reporting Engine Sources tab open.

  5. Select the source that you want to be the default source (for example, Broker).
  6. Click the netwitness_setdefault.png checkbox.

    NetWitness defaults to this data source when you create reports and alerts against this Reporting Engine.