Article Number
000039758
Applies To
RSA Product Set: RSA NetWitness Platform
RSA Product/Service Type: Archiver
RSA Version/Condition: 11.x
Platform: CentOS
O/S Version: 7
Issue
The Archiver service fails to start aggregation and crashes frequently.
'Archiver Aggregation Stopped' and 'Archiver Service in Bad State' alarms are triggered in HEALTH & WELLNESS.
/var/log/messages shows the following errors.
- Service fails to load due to at least one file being on the wrong tier.
Jun 17 03:30:50 Archiver NwArchiver[15808]: [Database] [failure] There are 1 files that were found to be on the wrong tier. Please shutdown the service and correct this problem before continuing.
Jun 17 03:30:50 Archiver NwArchiver[15808]: [Engine] [failure] Module archiver failed to load: There are 1 files that were found to be on the wrong tier. Please shutdown the service and correct this problem before continuing.
Jun 17 03:30:50 Archiver NwArchiver[15808]: [Engine] [failure] Module archiver failed to load: Diagnostic information: Throw in function void nw::ObjectStoreDatabase<FactoryT>::validateAllTablesNl() [with FactoryT = nw::PacketDatabaseFactory]Dynamic exception type: boost::exception_detail::clone_impl<nw::ObjectStoreError>std::exception::what: There are 1 files that were found to be on the wrong tier. Please shutdown the service and correct this problem before continuing.[boost::errinfo_at_line_*] = 553
- Log entries written shortly before the failure may name the database files that are on the wrong tier.
Jun 17 03:30:49 Archiver NwArchiver[15808]: [packet] [warning] Database packet is missing objects from 25200749537 to 25304073505. The gap exists between object store "/var/netwitness/archiver/database0/alldata/packetdb/packet-000000219.nwpdb" and "/var/netwitness/archiver/database0/alldata/packetdb/packet-000001520.nwpdb".
Jun 17 03:30:49 Archiver NwArchiver[15808]: [packet] [failure] Database packet has misaligned tiers. Files /var/netwitness/archiver/database0/alldata/packetdb/packet-000001520.nwpdb and /var/netwitness/archiver/nas/alldata/packetdb/packet-000001521.nwpdb are out of order.
Jun 17 03:30:49 Archiver NwArchiver[15808]: [packet] [warning] Database packet has overlapping object numbers. /var/netwitness/archiver/database0/alldata/packetdb/packet-000001520.nwpdb has objects from 25304073506 to 25415013936. /var/netwitness/archiver/nas/alldata/packetdb/packet-000001521.nwpdb has objects from 21538519862 to 21644114852.
Jun 17 03:30:49 Archiver NwArchiver[15808]: [packet] [warning] Ignoring file /var/netwitness/archiver/database0/alldata/packetdb/packet-000001520.nwpdb
- Service crashes and restarts.
Jun 17 03:36:28 Archiver systemd: nwarchiver.service: main process exited, code=killed, status=11/SEGV
Jun 17 03:36:28 Archiver systemd: Unit nwarchiver.service entered failed state.
Jun 17 03:36:28 Archiver systemd: nwarchiver.service failed.
Jun 17 03:36:28 Archiver NwAppliance[31258]: [ServiceConnectionNode::messageHandler] [failure] localhost:56008: short read
Jun 17 03:36:28 Archiver systemd: nwarchiver.service holdoff time over, scheduling restart.
Jun 17 03:36:28 Archiver systemd: Stopped Netwitness Archiver.
Jun 17 03:36:28 Archiver systemd: Started Netwitness Archiver.
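The tier-related messages quoted above can be pulled out of the log with a single grep. The following is a minimal sketch, not a vendor-supplied tool: the function name find_tier_errors is hypothetical, and the match strings are taken only from the sample log lines in this article, so other versions may word the messages differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print NwArchiver log lines that mention tier problems.
# Defaults to /var/log/messages (the file named in this article); pass
# another file as the first argument to search it instead.
find_tier_errors() {
    local log_file="${1:-/var/log/messages}"
    # Match strings are copied from the sample messages in this article.
    grep -E 'NwArchiver\[[0-9]+\]: .*(wrong tier|misaligned tiers|overlapping object numbers|Ignoring file)' "$log_file"
}

# Example (run on the Archiver host):
#   find_tier_errors
#   find_tier_errors /var/log/messages-20240617   # hypothetical rotated log name
```

The "misaligned tiers" and "Ignoring file" lines are the ones that carry the full path of the affected file.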
Cause
As the logs indicate, the issue occurs when database files are on the wrong tier.
For example, if the Archiver service is configured with the following two storage tiers, the oldest files on the hot tier roll over to the warm tier according to the configured policy, producing a layout like the one below.
- Hot tier: /var/netwitness/archiver/database0
/var/netwitness/archiver/database0/alldata/packetdb
packet-000002000.nwpdb
packet-000002001.nwpdb
packet-000002002.nwpdb
- Warm tier: /var/netwitness/archiver/nas
/var/netwitness/archiver/nas/alldata/packetdb
...
packet-000001519.nwpdb
packet-000001520.nwpdb
packet-000001521.nwpdb
packet-000001522.nwpdb
...
packet-000002000.nwpdb
However, if a file that belongs on the warm tier is still on the hot tier, as with packet-000001520.nwpdb in the example below, the service fails to start and logs the errors shown above. This can happen when files were moved manually between tiers, or when a rollover ran while the files were inaccessible because of database misconfiguration or unavailability.
- Hot tier: /var/netwitness/archiver/database0
/var/netwitness/archiver/database0/alldata/packetdb
packet-000001520.nwpdb
packet-000002000.nwpdb
packet-000002001.nwpdb
packet-000002002.nwpdb
- Warm tier: /var/netwitness/archiver/nas
/var/netwitness/archiver/nas/alldata/packetdb
...
packet-000001519.nwpdb
packet-000001521.nwpdb
packet-000001522.nwpdb
...
packet-000002000.nwpdb
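The layouts above suggest a simple check: any hot-tier file whose sequence number is lower than the newest sequence number on the warm tier is a candidate for being on the wrong tier. The sketch below assumes the packet-<number>.nwpdb naming seen in the logs; the function names seq_of and list_misplaced_files are hypothetical, and the script only lists files without moving anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the numeric sequence from a packet-<n>.nwpdb path.
seq_of() {
    local name="${1##*/packet-}"
    name="${name%.nwpdb}"
    [[ $name =~ ^[0-9]+$ ]] || return 1
    # Force base 10 so the zero-padded number is not read as octal.
    echo $((10#$name))
}

# Hypothetical helper: print hot-tier packet files that are older than the
# newest packet file on the warm tier.
list_misplaced_files() {
    local hot_dir="$1" warm_dir="$2" warm_max=-1 f n

    # Find the highest sequence number on the warm tier.
    for f in "$warm_dir"/packet-*.nwpdb; do
        n=$(seq_of "$f") || continue
        (( n > warm_max )) && warm_max=$n
    done

    # Report hot-tier files that sort before it.
    for f in "$hot_dir"/packet-*.nwpdb; do
        n=$(seq_of "$f") || continue
        (( n < warm_max )) && printf '%s\n' "$f"
    done
    return 0
}

# Example with the paths used in this article:
#   list_misplaced_files /var/netwitness/archiver/database0/alldata/packetdb \
#                        /var/netwitness/archiver/nas/alldata/packetdb
```

Treat the output as a list of candidates to compare against the file names reported in /var/log/messages, not as a definitive verdict.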
Resolution
To resolve the issue, perform the following steps.
- Stop the archiver service.
systemctl stop nwarchiver
- Move the problematic files to the correct tier.
cd /var/netwitness/archiver/database0/alldata/packetdb
mv packet-000001520.* /var/netwitness/archiver/nas/alldata/packetdb
- Start the archiver service.
systemctl start nwarchiver
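The move in the steps above can be wrapped in a small function that refuses to overwrite a file already present on the destination tier. This is a sketch under stated assumptions rather than a supported procedure: move_to_tier is a hypothetical name, and the example paths and file prefix are the ones from this article and must be replaced with the values from your own logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move every file named "<prefix>.*" from one tier
# directory to another, skipping any file that already exists at the
# destination so nothing is overwritten.
move_to_tier() {
    local src_dir="$1" dest_dir="$2" prefix="$3" f moved=0

    [ -d "$src_dir" ] && [ -d "$dest_dir" ] || { echo "missing directory" >&2; return 1; }

    for f in "$src_dir/$prefix".*; do
        [ -e "$f" ] || continue
        if [ -e "$dest_dir/${f##*/}" ]; then
            echo "skipped (already exists on destination): ${f##*/}" >&2
            continue
        fi
        mv -- "$f" "$dest_dir/" && moved=$((moved + 1))
    done
    echo "moved $moved file(s)"
}

# Example, run as root with the service stopped:
#   systemctl stop nwarchiver
#   move_to_tier /var/netwitness/archiver/database0/alldata/packetdb \
#                /var/netwitness/archiver/nas/alldata/packetdb packet-000001520
#   systemctl start nwarchiver
#   systemctl is-active nwarchiver   # prints "active" once the unit is running
```

After the restart, re-check /var/log/messages to confirm that the "wrong tier" failure no longer appears.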
Check every hot-tier path (e.g. database1 and database2), if present, as well as the other database types (meta, session, and index), to ensure that all out-of-order files are moved to the correct tier, or deleted if they are no longer required.
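To review every tier path and database type in one pass, a short summary of each directory can help. The sketch below assumes the <root>/<tier>/<collection>/<database> layout seen in this article (for example /var/netwitness/archiver/database0/alldata/packetdb); summarize_tier_dirs is a hypothetical name, and the directory names for the meta, session, and index databases are not assumed, since the function simply walks whatever directories exist.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for every <root>/<tier>/<collection>/<database>
# directory, print the path, the number of files, and the first and last
# file names in sorted order. Zero-padded names sort in sequence order.
summarize_tier_dirs() {
    local root="$1" dir f
    local -a files

    for dir in "$root"/*/*/*/; do
        [ -d "$dir" ] || continue
        files=()
        for f in "$dir"*; do
            [ -f "$f" ] && files+=("${f##*/}")
        done
        (( ${#files[@]} )) || continue
        printf '%s\t%d files\t%s .. %s\n' \
            "${dir%/}" "${#files[@]}" "${files[0]}" "${files[${#files[@]}-1]}"
    done
    return 0
}

# Example with the root path used in this article:
#   summarize_tier_dirs /var/netwitness/archiver
```

A hot-tier directory whose first file name sorts before the last file name of the matching warm-tier directory is worth inspecting more closely.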