Article Number
000036554
Applies To
RSA Product Set: NetWitness Logs & Network
RSA Product/Service Type: Core Appliance
RSA Version/Condition: 10.6.x, 11.x
Issue
The RabbitMQ service is no longer starting. When you try to start the service, it shows the error:
[root@appliance ~]# service rabbitmq-server start
Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_{log, _err}
The /var/log/rabbitmq/startup_log file shows the following error:
[root@appliance ~]# cat /var/log/rabbitmq/startup_log
ERROR: epmd error for host localhost: nxdomain (non-existing domain)
or
[root@appliance ~]# cat /var/log/rabbitmq/startup_log
ERROR: epmd error for host 529e5432-5c74-4521-8dad-1cc6a0735902: nxdomain (non-existing domain)
Cause
When the RabbitMQ service starts, one of the first things it does is try to resolve the hostname specified in the /etc/rabbitmq/rabbitmq-env.conf file. The hostname is the part of the NODENAME value after the @ sign.
In NetWitness, the node name is by default:
sa@localhost for 10.6.x versions
rabbit@<nodeid> for 11.x versions
For example, for 10.6:
[root@appliance ~]# cat /etc/rabbitmq/rabbitmq-env.conf
NODENAME=sa@localhost <-----
ENABLED_PLUGINS_FILE=/etc/rabbitmq/rsa_enabled_plugins
To resolve the hostname, RabbitMQ first tries to use the /etc/hosts file and, if that fails, then tries to use /etc/resolv.conf. In this case, "localhost" (or the <nodeid> for 11.x versions) should always be located in the /etc/hosts file.
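The lookup described above can be checked by hand with a small shell sketch. The function names node_host and host_in_hosts_file, and their optional path arguments, are illustrative assumptions and not part of NetWitness or RabbitMQ. The sketch only inspects the files; it does not reproduce RabbitMQ's own resolver.

```shell
#!/bin/sh
# Sketch only: helper names and optional path arguments are hypothetical.

# Print the host part of the NODENAME entry (the text after "@").
node_host() {
    conf="${1:-/etc/rabbitmq/rabbitmq-env.conf}"
    sed -n 's/^NODENAME=[^@]*@//p' "$conf" | head -n 1
}

# Succeed if the given name is listed as a hostname (field 2 onward)
# on any line of the hosts file, ignoring comments.
host_in_hosts_file() {
    name="$1"
    hosts="${2:-/etc/hosts}"
    awk -v n="$name" '
        { sub(/#.*/, "") }   # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$hosts"
}
```

For example, running host_in_hosts_file "$(node_host)" and then checking the exit status with echo $? should print 0 when the name is present in /etc/hosts and 1 when it is missing or misspelled.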
If you are getting the nxdomain (non-existing domain) error, it means that "localhost" (or the nodeid for 11.x versions) cannot be resolved into an IP address using the /etc/hosts file. That could be caused by a typo in the /etc/hosts file, or by incorrect permissions on the file that prevent the rabbitmq service from reading its contents. The correct permissions should be:
[root@appliance ~]# ls -lh /etc/hosts
-rw-r--r--. 1 root root 331 Feb 23 2016 /etc/hosts
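The permission check can also be scripted instead of read off the ls output. The following is a minimal sketch, assuming GNU stat (the -c option) is available on the appliance; the function name hosts_mode_ok is hypothetical. It compares only the mode bits, not the root:root ownership shown above.

```shell
#!/bin/sh
# Sketch only: hosts_mode_ok is a hypothetical helper; requires GNU stat.

# Succeed if the file's octal mode is exactly 644 (-rw-r--r--).
hosts_mode_ok() {
    file="${1:-/etc/hosts}"
    [ "$(stat -c '%a' "$file")" = "644" ]
}
```

Running hosts_mode_ok || echo "unexpected mode on /etc/hosts" prints the message only when the mode differs from 644.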
Resolution
Open the /etc/hosts file with the vi editor and make sure that there are no strange characters, that all the IP addresses and hostnames are correct, and especially that the line starting with 127.0.0.1 includes "localhost" or the <nodeid>:
10.6.x: 127.0.0.1 LDecoder localhost localhost.localdomain localhost4 localhost4.localdomain4
11.x.x: 127.0.0.1 LDecoder localhost localhost.localdomain localhost4 localhost4.localdomain4 529e5432-5c74-4521-8dad-1cc6a0735902
(In 11.x, you can get the node ID by running the command: cat /etc/salt/minion | grep id)
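Because a plain grep for "id" can match unrelated lines, a stricter extraction may be easier to read. This is a sketch that assumes the minion file holds the node ID on a line of the form "id: <nodeid>" starting in the first column; the function name minion_id is hypothetical.

```shell
#!/bin/sh
# Sketch only: minion_id is a hypothetical helper. It assumes the node ID
# is stored as "id: <nodeid>" at the start of a line.

# Print the value of the first "id:" key in the minion configuration.
minion_id() {
    conf="${1:-/etc/salt/minion}"
    sed -n 's/^id:[[:space:]]*//p' "$conf" | head -n 1
}
```

The printed value is the string that should appear on the 127.0.0.1 line of /etc/hosts in 11.x.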
If the permissions are not correct, change the permissions with the command: chmod 644 /etc/hosts
After fixing the typo or changing the permissions, check whether any rabbitmq process is still running with the command:
ps aux | grep rabbit
You should only see a line with the "ps aux | grep rabbit" command that you just ran. If you can see any other processes related to rabbitmq, kill them with the command: kill <PID>
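The same check can be narrowed to just the PIDs of interest. The sketch below filters ps aux output read from standard input and drops the line belonging to the search itself; the function name leftover_rabbit_pids is hypothetical, and it assumes the PID is the second column, as in ps aux output.

```shell
#!/bin/sh
# Sketch only: leftover_rabbit_pids is a hypothetical helper.

# Read `ps aux` output on stdin and print the PID (column 2) of every
# line that mentions "rabbit", skipping any grep process.
leftover_rabbit_pids() {
    awk 'tolower($0) ~ /rabbit/ && $0 !~ /grep/ { print $2 }'
}
```

Running ps aux | leftover_rabbit_pids should print nothing when no rabbitmq process is left. Review any PIDs it prints before passing them to kill.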
Then run the puppet agent to automatically fix any other discrepancies and to restart the rabbitmq service:
puppet agent -t