2018-06-08 03:43 AM
Hello, we have been using NetWitness for several years and the web GUI seems to be getting slower over time.
During slow periods the disk I/O on the VolGroup01-uax partition increases greatly.
There was a previous issue where the notifications table in the database did not get cleaned up, which resulted in a similar problem. However, I have confirmed that this is not the case here.
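For anyone checking the same thing, ruling that out is just a quick row count from the H2 SQL shell; a minimal sketch (the NOTIFICATIONS table name is taken from the listing below):
-- The notifications table only holds 281 rows here (see the listing below),
-- so the old "notifications not cleaned up" problem does not apply.
SELECT COUNT(*) FROM NOTIFICATIONS;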
Our NetWitness database (platform.h2.db) is 2.7 GB.
Here is a row count for each table:
sql> select table_name, count_rows(table_name) AS rc from INFORMATION_SCHEMA.TABLES where table_schema = 'PUBLIC' order by rc desc;
TABLE_NAME | RC
METAGROUPLANGUAGES | 52743
DEVICEMETALANGUAGES | 7271
USERPREDICATE | 4470
PREDICATE | 4334
QRTZ_JOB_DETAILS | 3072
USER_PREFERENCES | 2992
METAGROUPS_METAGROUPLANGUAGES | 2975
DASHLET_OPTIONS | 2323
DASHLET | 559
METALANGUAGES | 524
CUSTOMCOLUMNGROUPFIELD | 442
NOTIFICATIONS | 281
DASHBOARD | 109
METAGROUPS | 99
ODBCDSNTEMPLATEENTRY | 86
USERS | 66
USERCONNECTIONATTRIBUTE | 64
SHARED_DASHBOARD | 55
DEVICEINFO_PROPERTIES | 55
INVESTIGATIONPROFILE | 46
CUSTOMCOLUMNGROUP | 37
QRTZ_SIMPROP_TRIGGERS | 34
QRTZ_TRIGGERS | 34
DEVICEINFO | 32
GROUPDESCRIPTOR_IDENTIFIERS | 30
ROLECONNECTIONATTRIBUTE | 21
PUPPETPROVISION | 19
ODBCDSNTEMPLATE | 18
ROLE | 18
GROUPDESCRIPTOR | 14
ENDPOINTDESCRIPTOR | 12
RULESNAPSHOT | 11
SCHEMA_VERSION | 10
LICENSEDDEVICEINFO | 9
APPLIANCEDESCRIPTOR | 6
USERS_ROLE | 6
FAVORITE_DASHBOARD | 3
CEP_STATEMENT_TYPE | 2
AUDITLOGCONFIGURATION | 2
QRTZ_FIRED_TRIGGERS | 1
EVENTSOURCEMONITOR | 1
YUMSTATS | 1
QRTZ_LOCKS | 1
EVENTSOURCEMONITORDECOMMISSION | 1
DEVICEEVENTLISTVIEWS | 0
QRTZ_PAUSED_TRIGGER_GRPS | 0
CONFIGSETTINGSHISTORY | 0
X509CRLDBENTRY | 0
TEMPLATE | 0
MALWARESCAN | 0
DEVICEEVENTLISTVIEWS_EVENTVIEWFIELDS | 0
ARCHIVERMONITOR | 0
ALERT_CONSUMER | 0
PREFERENCE | 0
EVENTVIEWFIELDS | 0
CUSTOMFEEDHISTORY | 0
CEP_MODULE | 0
CEP_STATEMENT | 0
QRTZ_BLOB_TRIGGERS | 0
QRTZ_CALENDARS | 0
QRTZ_CRON_TRIGGERS | 0
EVENTSOURCEMONITORENTRY | 0
QRTZ_SCHEDULER_STATE | 0
ROLEMAPPING | 0
WAREHOUSECONNECTORMONITOR | 0
PERSISTENT_LOGINS | 0
ALERT_CONSUMER_TYPE | 0
QRTZ_SIMPLE_TRIGGERS | 0
CEP_MODULE_STATEMENT | 0
CEP_STATEMENT_ALERTER | 0
(70 rows, 41 ms)
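Note that COUNT_ROWS is not a built-in H2 function; judging by the PUBLIC.COUNT_ROWS(...) column header in the size listing further down, it is a user-defined alias in the PUBLIC schema. If you need to recreate it yourself, a minimal sketch of such an alias (H2 compiles the inline Java source at runtime, so a JDK compiler must be available to the process) would be:
CREATE ALIAS IF NOT EXISTS COUNT_ROWS AS $$
long countRows(java.sql.Connection conn, String tableName) throws java.sql.SQLException {
    // H2 passes in the current connection when the first parameter is a java.sql.Connection.
    java.sql.ResultSet rs = conn.createStatement()
        .executeQuery("SELECT COUNT(*) FROM PUBLIC.\"" + tableName + "\"");
    rs.next();
    return rs.getLong(1);
}
$$;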
Here is the same set of tables ordered by size on disk (DSU, presumably in bytes; a sketch of the query appears after the listing):
TABLE_NAME | PUBLIC.COUNT_ROWS(TABLE_NAME) | DSU
USER_PREFERENCES | 2992 | 2269184
METAGROUPLANGUAGES | 52743 | 1728512
QRTZ_JOB_DETAILS | 3072 | 1468416
PREDICATE | 4334 | 1095680
DEVICEMETALANGUAGES | 7271 | 448512
USERPREDICATE | 4470 | 374784
DASHLET_OPTIONS | 2323 | 202752
METAGROUPS_METAGROUPLANGUAGES | 2975 | 102400
DASHLET | 559 | 90112
USERS | 66 | 71680
DEVICEINFO_PROPERTIES | 55 | 55296
METALANGUAGES | 524 | 32768
QRTZ_TRIGGERS | 34 | 32768
NOTIFICATIONS | 281 | 30720
CUSTOMCOLUMNGROUPFIELD | 442 | 26624
QRTZ_SIMPROP_TRIGGERS | 34 | 24576
METAGROUPS | 99 | 12288
INVESTIGATIONPROFILE | 46 | 10240
DASHBOARD | 109 | 10240
USERCONNECTIONATTRIBUTE | 64 | 10240
PUPPETPROVISION | 19 | 6144
ODBCDSNTEMPLATEENTRY | 86 | 4096
ENDPOINTDESCRIPTOR | 12 | 4096
ROLECONNECTIONATTRIBUTE | 21 | 2048
DEVICEINFO | 32 | 2048
DEVICEEVENTLISTVIEWS | 0 | 2048
SCHEMA_VERSION | 10 | 2048
QRTZ_PAUSED_TRIGGER_GRPS | 0 | 2048
CONFIGSETTINGSHISTORY | 0 | 2048
X509CRLDBENTRY | 0 | 2048
CUSTOMCOLUMNGROUP | 37 | 2048
GROUPDESCRIPTOR | 14 | 2048
TEMPLATE | 0 | 2048
QRTZ_FIRED_TRIGGERS | 1 | 2048
EVENTSOURCEMONITOR | 1 | 2048
ODBCDSNTEMPLATE | 18 | 2048
CEP_STATEMENT_TYPE | 2 | 2048
MALWARESCAN | 0 | 2048
DEVICEEVENTLISTVIEWS_EVENTVIEWFIELDS | 0 | 2048
YUMSTATS | 1 | 2048
ARCHIVERMONITOR | 0 | 2048
ALERT_CONSUMER | 0 | 2048
PREFERENCE | 0 | 2048
EVENTVIEWFIELDS | 0 | 2048
LICENSEDDEVICEINFO | 9 | 2048
APPLIANCEDESCRIPTOR | 6 | 2048
CUSTOMFEEDHISTORY | 0 | 2048
CEP_MODULE | 0 | 2048
CEP_STATEMENT | 0 | 2048
QRTZ_BLOB_TRIGGERS | 0 | 2048
AUDITLOGCONFIGURATION | 2 | 2048
ROLE | 18 | 2048
GROUPDESCRIPTOR_IDENTIFIERS | 30 | 2048
QRTZ_CALENDARS | 0 | 2048
QRTZ_CRON_TRIGGERS | 0 | 2048
SHARED_DASHBOARD | 55 | 2048
EVENTSOURCEMONITORENTRY | 0 | 2048
QRTZ_SCHEDULER_STATE | 0 | 2048
ROLEMAPPING | 0 | 2048
QRTZ_LOCKS | 1 | 2048
WAREHOUSECONNECTORMONITOR | 0 | 2048
FAVORITE_DASHBOARD | 3 | 2048
PERSISTENT_LOGINS | 0 | 2048
USERS_ROLE | 6 | 2048
ALERT_CONSUMER_TYPE | 0 | 2048
QRTZ_SIMPLE_TRIGGERS | 0 | 2048
CEP_MODULE_STATEMENT | 0 | 2048
RULESNAPSHOT | 11 | 2048
EVENTSOURCEMONITORDECOMMISSION | 1 | 2048
CEP_STATEMENT_ALERTER | 0 | 2048
(70 rows, 35 ms)
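The query used for the listing above was not captured. A sketch of what would produce that output, assuming the DSU column comes from H2's DISK_SPACE_USED function (or an equivalent user-defined alias if that function is not present in this H2 build), is:
sql> select table_name, count_rows(table_name), disk_space_used(table_name) AS dsu from INFORMATION_SCHEMA.TABLES where table_schema = 'PUBLIC' order by dsu desc;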
These are the files that are being accessed on the partition:
[root@SASERVER pcaps]# lsof |grep 29036 |grep uax
java 29036 root 300w REG 253,2 520068 134217878 /var/lib/netwitness/uax/logs/sa.log
java 29036 root 301uW REG 253,2 0 402654258 /var/lib/netwitness/uax/logs/sa.log_index/write.lock
java 29036 root 303r REG 253,2 323351 134264516 /var/lib/netwitness/uax/logs/sa.log.1_index/_0.cfs
java 29036 root 304r REG 253,2 141684 134264512 /var/lib/netwitness/uax/logs/sa.log.1_index/_0.cfx
java 29036 root 305r REG 253,2 5167 28811077 /var/lib/netwitness/uax/logs/sa.log.2_index/_0.cfs
java 29036 root 306r REG 253,2 1089 28811062 /var/lib/netwitness/uax/logs/sa.log.2_index/_0.cfx
java 29036 root 307r REG 253,2 1329357 305735111 /var/lib/netwitness/uax/logs/sa.log.3_index/_0.cfs
java 29036 root 308r REG 253,2 554552 305735105 /var/lib/netwitness/uax/logs/sa.log.3_index/_0.cfx
java 29036 root 309r REG 253,2 1223573 134737678 /var/lib/netwitness/uax/logs/sa.log.4_index/_0.cfs
java 29036 root 310r REG 253,2 511521 134706594 /var/lib/netwitness/uax/logs/sa.log.4_index/_0.cfx
java 29036 root 311r REG 253,2 971321 134706596 /var/lib/netwitness/uax/logs/sa.log.5_index/_0.cfs
java 29036 root 312r REG 253,2 448468 134257275 /var/lib/netwitness/uax/logs/sa.log.5_index/_0.cfx
java 29036 root 313r REG 253,2 294490 403048505 /var/lib/netwitness/uax/logs/sa.log.6_index/_2.cfs
java 29036 root 314r REG 253,2 720110 28807424 /var/lib/netwitness/uax/logs/sa.log.7_index/_0.cfs
java 29036 root 315r REG 253,2 303222 28807417 /var/lib/netwitness/uax/logs/sa.log.7_index/_0.cfx
java 29036 root 316r REG 253,2 1140972 28807419 /var/lib/netwitness/uax/logs/sa.log.8_index/_0.cfs
java 29036 root 317r REG 253,2 473187 28807413 /var/lib/netwitness/uax/logs/sa.log.8_index/_0.cfx
java 29036 root 318r REG 253,2 1134817 402654094 /var/lib/netwitness/uax/logs/sa.log.9_index/_0.cfs
java 29036 root 319r REG 253,2 468888 402653374 /var/lib/netwitness/uax/logs/sa.log.9_index/_0.cfx
java 29036 root 320u REG 253,2 77940 402701508 /var/lib/netwitness/uax/logs/sa.log_index/_0.cfs
java 29036 root 322u REG 253,2 0 1060 /var/lib/netwitness/uax/cache/server/com.netwitness.platform.server.Sessions.data
java 29036 root 325w REG 253,2 1663547 269825560 /var/lib/netwitness/uax/db/platform.trace.db
java 29036 root 326u REG 253,2 2855802880 268649541 /var/lib/netwitness/uax/db/platform.h2.db
java 29036 root 328r REG 253,2 56572035 268649509 /var/lib/netwitness/uax/db/GeoCity.dat
java 29036 root 329w REG 253,2 10485877 49903 /var/lib/netwitness/uax/logs/audit/audit.log.2 (deleted)
java 29036 root 334r REG 253,2 20890471 268649510 /var/lib/netwitness/uax/db/GeoOrg.dat
java 29036 root 335r REG 253,2 4996004 268649511 /var/lib/netwitness/uax/db/GeoDomain.dat
java 29036 root 339w REG 253,2 4092206 44812 /var/lib/netwitness/uax/logs/audit/audit.log
java 29036 root 938w REG 253,2 0 402654264 /var/lib/netwitness/uax/logs/sa.log_index/_1.fdt
java 29036 root 939u REG 253,2 0 402654265 /var/lib/netwitness/uax/logs/sa.log_index/_1.fdx
The owning process (PID 29036) is the Jetty Java process; its full command line from ps is:
root 29036 46.6 0.9 3904692 659900 ? Ssl 01:40 339:59 /usr/bin/java -Djava.awt.headless=true -Dcom.rsa.netwitness.carlos.LOG_ENABLE_SYSOUT=true -Dcom.netwitness.platform.DB_DEFRAG_ALWAYS=true -Xms6G -Xmx8G -XX:MaxMetaspaceSize=256m -Djdk.tls.ephemeralDHKeySize=2048 -Djavax.net.ssl.keyStore=/opt/rsa/carlos/keystore -XX:+OptimizeStringConcat -XX:+UseLargePages -XX:+UseG1GC -Djetty.state=/opt/rsa/jetty9/jetty.state -Djetty.home=/opt/rsa/jetty9 -Djava.io.tmpdir=/tmp -jar /opt/rsa/jetty9/start.jar etc/jetty-logging.xml etc/jetty-started.xml
The Jetty log does not contain any obvious clues as to what is happening; typical entries look like this:
2018-06-07 13:48:29,767 [qtp1157740463-27653] ERROR com.rsa.smc.sa.admin.service.entitlement.OOTBEntitlementService - Error while reading from DB : No data found
2018-06-07 13:48:30,573 [Context Availability Service Executor 388934942] WARN com.rsa.netwitness.carlos.transport.spi.AbstractMessageChannel - New message channel for 'class com.rsa.asoc.context.ContextServiceProtocol$ServiceMessage' is a non-CARLOS protocol. Caching is unavailable
2018-06-07 13:48:32,459 [Context Availability Service Executor 388934942] WARN com.rsa.netwitness.carlos.transport.spi.AbstractMessageChannel - New message channel for 'class com.rsa.asoc.context.ContextServiceProtocol$ServiceMessage' is a non-CARLOS protocol. Caching is unavailable
2018-06-07 13:48:33,443 [Context Availability Service Executor 388934942] WARN com.rsa.netwitness.carlos.transport.spi.AbstractMessageChannel - New message channel for 'class com.rsa.asoc.context.ContextServiceProtocol$ServiceMessage' is a non-CARLOS protocol. Caching is unavailable
2018-06-07 13:50:15,781 [qtp1157740463-27832] ERROR com.rsa.smc.sa.admin.service.entitlement.OOTBEntitlementService - Error while reading from DB : No data found
I have set DB_DEFRAG_ALWAYS to true in the Jetty defaults:
more /etc/default/jetty
# file: '/etc/default/jetty' must be present when jettyuax is installed
export LD_LIBRARY_PATH=/usr/bin/lic
JETTY_HOME=/opt/rsa/jetty9
DB_DEFRAG_ALWAYS=true
JAVA_OPTIONS="-Djava.awt.headless=true -Dcom.rsa.netwitness.carlos.LOG_ENABLE_SYSOUT=t
rue -Dcom.netwitness.platform.DB_DEFRAG_ALWAYS=${DB_DEFRAG_ALWAYS} -Xms6G -Xmx8G -XX:M
axMetaspaceSize=256m -Djdk.tls.ephemeralDHKeySize=2048 -Djavax.net.ssl.keyStore=/opt/r
sa/carlos/keystore"
JAVA_OPTIONS="${JAVA_OPTIONS} -XX:+OptimizeStringConcat -XX:+UseLargePages -XX:+UseG1G
C"
2018-06-08 03:58 AM
iostat shows lots of writing to the uax disk:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
VolGroup01-uax 0.00 0.00 0.00 372.00 0.00 2.66 14.66 1.06 2.86 0.80 29.90
2018-06-08 04:31 AM
Output of iotop -a
Total DISK READ: 0.00 B/s | Total DISK WRITE: 2.43 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
22607 be/4 root 0.00 B 233.10 M 0.00 % 2.63 % java -Djava.awt.he~c/jetty-started.xml
22606 be/4 root 0.00 B 123.79 M 0.00 % 2.07 % java -Djava.awt.he~c/jetty-started.xml
22608 be/4 root 0.00 B 170.14 M 0.00 % 1.91 % java -Djava.awt.he~c/jetty-started.xml
2018-06-08 05:39 AM
Further output of iotop -a shows that collectd is also generating a lot of disk I/O:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 2.46 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
7444 be/4 root 0.00 B 0.00 B 0.00 % 2.81 % [flush-253:2]
22606 be/4 root 0.00 B 775.97 M 0.00 % 1.47 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
22608 be/4 root 0.00 B 775.46 M 0.00 % 1.35 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
22603 be/4 root 0.00 B 649.62 M 0.00 % 1.08 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
22609 be/4 root 0.00 B 437.57 M 0.00 % 1.00 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
22604 be/4 root 0.00 B 785.63 M 0.00 % 0.89 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
22610 be/4 root 0.00 B 413.71 M 0.00 % 0.88 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
22605 be/4 root 0.00 B 423.88 M 0.00 % 0.66 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
22607 be/4 root 0.00 B 418.98 M 0.00 % 0.55 % java -Djava.awt.headless=true -Dcom.rsa.ne~etc/jetty-logging.xml etc/jetty-started.xml
1256 be/3 root 0.00 B 1228.00 K 0.00 % 0.53 % [jbd2/dm-10-8]
8506 be/4 tokumx 4.00 K 43.93 M 0.00 % 0.21 % mongod --quiet -f /etc/tokumx.conf run
1296 be/4 root 0.00 B 0.00 B 0.00 % 0.20 % [xfsbufd/dm-2]
1250 be/3 root 0.00 B 1604.00 K 0.00 % 0.13 % [jbd2/dm-7-8]
1252 be/3 root 0.00 B 2.14 M 0.00 % 0.11 % [jbd2/dm-8-8]
15370 be/4 root 2.23 M 1375.83 M 0.00 % 0.10 % collectd -C /etc/collectd.conf
Collectd also writes to this partition.
2018-06-08 05:56 AM
I ran through these instructions to limit the number of messages that collectd generates.
2018-06-08 07:21 AM
Running the strings command against the database file, it appears to be full of entries like this:
{"started":1519387570633,"finished":1519387570759,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519387870662,"finished":1519387870737,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519387870662,"finished":1519387870737,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519388170638,"finished":1519388170711,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519388170638,"finished":1519388170711,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519388470640,"finished":1519388470882,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519388470640,"finished":1519388470882,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519388770627,"finished":1519388770701,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1519388770627,"finished":1519388770701,"status":"COMPLETE","message"
2018-06-08 10:19 AM
It looks like every time a feed runs, its job history is appended to the JOB_DATA column in the qrtz_job_details database table.
I ran the following query and it produced:
select UTF8TOSTRING(job_data) from qrtz_job_details where job_name='e7ce309f-a123-43a6-a337-375f0cede469'
<lots of text>
....
finished":1528455853488,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528456152866,"finished":1528456153523,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528456152866,"finished":1528456153523,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528456452867,"finished":1528456453508,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528456452867,"finished":1528456453508,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528456752865,"finished":1528456755150,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528456752865,"finished":1528456755150,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457052864,"finished":1528457053496,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457052864,"finished":1528457053496,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457352862,"finished":1528457353501,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457352862,"finished":1528457353501,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457652865,"finished":1528457653499,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457652865,"finished":1528457653499,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457952871,"finished":1528457953543,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528457952871,"finished":1528457953543,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528458252868,"finished":1528458253521,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528458252868,"finished":1528458253521,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528458552868,"finished":1528458553555,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528458552868,"finished":1528458553555,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528458852905,"finished":1528458853643,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528458852905,"finished":1528458853643,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528459152868,"finished":1528459153517,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528459152868,"finished":1528459153517,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528459452873,"finished":1528459453535,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528459452873,"finished":1528459453535,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528459752871,"finished":1528459753490,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528459752871,"finished":1528459753490,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528460052865,"finished":1528460053508,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528460052865,"finished":1528460053508,"status":"COMPLETE","message":"Error uploading file to service"},{"started":1528460352868,"finished":1528460354894,"status":"COMPLETE","message":"Error uploading file to service"}_subComponenttfeedst_statusCOMPLETEt:152separatort,turlthttp://localhost/feeds/RDNS.csvtuploading file to service"}]t
2018-06-11 07:48 AM
Apparently this is due to be fixed in 10.6.6.
Is anyone able to confirm that, for existing recurring feeds, the job history will be reset and I won't have to recreate all of my recurring feeds?
2018-06-12 07:34 AM
After recreating my recurring feeds and running SHUTDOWN COMPACT on the database, the size of the database has decreased from 2.7 GB to 150 MB.
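For anyone following along, SHUTDOWN COMPACT is a standard H2 statement; it can be issued from the H2 SQL shell shown in the next post (with jettysrv stopped so nothing else has the database open):
-- Closes the database and fully compacts the file on shutdown.
SHUTDOWN COMPACT;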
2018-06-18 04:05 AM
Another tip to shrink the database:
1) service puppet stop
2) stop jettysrv
3) cd /var/lib/netwitness/uax/db
4) java -cp ./h2-1.3.174.jar org.h2.tools.Shell -url jdbc:h2:file:platform
This will then put you into the H2 SQL shell. Run:
SCRIPT TO 'mydb.sql';
5) This will dump the contents of the database into a file called mydb.sql
6) mv platform.h2.db platform.h2.db.backup
7) At this point there is no platform.h2.db file in the /var/lib/netwitness/uax/db directory
8) start jettysrv - a fresh platform.h2.db file will be recreated
9) stop jettysrv
10) java -cp ./h2-1.3.174.jar org.h2.tools.Shell -url jdbc:h2:file:platform
This will then put you into the H2 SQL shell again
11) Run
DROP ALL OBJECTS; RUNSCRIPT FROM 'mydb.sql';
exit;
12) service puppet start
The contents of the original database will be restored and the file should be much smaller. On our test system it went from 2.7 GB to 3 MB!