NetWitness Community

JosephWest1 · ‎2016-04-02

The network connection is established, and the firewall is turned off for testing. When I enable the NetWitness agent, the Security Analytics Master / Puppet Master logs the following in /var/log/messages:

puppet-master: Compiled catalog for <key> in environment production in 0.27 seconds

python: Adding {'node': '<certificate name>', 'classes': {'base': ''}} to ENC database

python: Adding <certificate name> user to /rsa/system

python: signing puppet Cert

python: Pinging host <certificate name> with a 40 second timeout

python: Error with mco ping. Please check configuration.

When I run $mco ping <IP Address> I get a return.

When I run $mco ping <cert name> it returns the other certificate names in the inventory.txt, but not the one I'm trying to add.

I have followed the instructions from this site for establishing a new Puppet certificate. I am also having an issue with RabbitMQ connecting to the Puppet master.

Thank you for your time!

AdamRasnick1 · ‎2016-04-05

Can you run 'puppet agent -t' on your head unit and either take a screenshot or paste the output in here?

DavidPoirier · ‎2016-04-07

Hi Joseph,

Sounds like you can MCO ping other devices.

1. Make sure that the /etc/puppet/csr_attributes.yaml file has the correct IP and hostname of the device you are trying to add.

2. Make sure that the time is correct. The device's time should be the same as, or close to, the SA Server's time, but not ahead of it.

3. The device knows how to reach SA head because the device knows the Puppet master in the /etc/hosts, but does the SA head know the end device?

Thank you

David
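A minimal sketch of the time check in point 2, assuming you can get epoch seconds from both machines (for example with `date +%s`). The helper name `skew_status` and the 60-second default tolerance are illustrative assumptions, not values taken from the product documentation.

```shell
# Hypothetical helper: classify a node's clock relative to the SA server.
# $1 = node time in epoch seconds
# $2 = SA server time in epoch seconds
# $3 = tolerance in seconds when the node is behind (assumed default: 60)
skew_status() {
    node=$1
    server=$2
    tol=${3:-60}
    if [ "$node" -gt "$server" ]; then
        # Node clock is ahead of the SA server, the case to avoid.
        echo "ahead $((node - server))"
    elif [ $((server - node)) -le "$tol" ]; then
        # Node clock is equal to, or slightly behind, the SA server.
        echo "ok $((server - node))"
    else
        echo "behind $((server - node))"
    fi
}

# Example use (sa-head is a placeholder host name):
#   skew_status "$(date +%s)" "$(ssh sa-head date +%s)"
```

Any result starting with `ahead` is the situation described above as not acceptable; how much lag is tolerable when the node is behind is a judgment call.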

KEVINDIENST · ‎2016-04-07

I concur with David. When I see these issues, they are usually related to NTP problems where the time is not synced. In addition, as long as you can mco ping other systems, mcollective isn't dead (which I've run into before), so RabbitMQ isn't overloaded or anything strange like that.

The csr_attributes.yaml file is critical as well. The name and IP must match its contents, or the cert that Puppet generates will conflict with what the Puppet Master sees.

JosephWest1 · ‎2016-04-07

Adam,

I have run "puppet agent -t" quite a bit. I first thought it was a cert issue, so I cleared the old cert and attempted to get a new one. That's where this problem led, and that's how I got the original messages in /var/log/messages on the puppet-master.

JosephWest1 · ‎2016-04-07

NTP is working on both the Puppet master and the node I am trying to add.

Update: mcollective was not installed on the node I was trying to add. I installed it, and now mcollective is working. The only thing that isn't working is RabbitMQ. I receive the error:

$service rabbitmq-server start

Status of node sa@localhost ...

Error.... unable to connect to node sa@localhost: nodedown

DIAGNOSTICS

==========

attempted to contact [sa@localhost]

sa@localhost:

* unable to connect to epmd (port 4369) on localhost: nxdomain (non-existing domain)

Current node details:

- node name: 'rabbitmqctl-22331@shkm'

- home dir: /var/lib/rabbitmq

- cookie hash: 2*************** (An actual hash)

I have verified that the port can work, and tested it with netcat and netstat.

The hostname has been configured in /etc/hosts, and the configuration in /etc/rabbitmq and /var/lib/rabbitmq looks exactly like it does in the clusters that are working. I'm convinced that this may be coming down to improper erlang configuration, even though the erlang looks identical to the working cluster nodes.

Thank you all again for your time and effort!
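The `nxdomain` in the diagnostics suggests that the name `localhost` itself did not resolve for the Erlang tools. One hedged way to double-check that a hosts-format file really lists a given name on an active (uncommented) line is sketched below. `hosts_file_has_name` is a hypothetical helper, and it only inspects the file; it does not prove how the resolver or Erlang will actually look the name up.

```shell
# Hypothetical helper: succeed if a hosts-format file maps some address
# to the given name on a non-comment line.
# $1 = path to a hosts-format file, $2 = host name to look for
hosts_file_has_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }               # strip comments before matching
        NF >= 2 {
            # field 1 is the address, fields 2..NF are names and aliases
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example use:
#   hosts_file_has_name /etc/hosts localhost && echo listed || echo missing
```

If the file does list the name, `getent hosts localhost` is a reasonable next step to see what the system resolver returns.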

KEVINDIENST · ‎2016-04-07

And you've verified that time is accurate within a few seconds between the appliance and puppet-master?

Run the date command and check that /etc/ntp.conf shows the appropriate server for each system:

# Use public servers from the pool.ntp.org project.

# Please consider joining the pool (http://www.pool.ntp.org/join.html).

#server 0.centos.pool.ntp.org

#server 1.centos.pool.ntp.org

#server 2.centos.pool.ntp.org

server.mydomain.com
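To compare the active NTP servers on two systems quickly, a small sketch like the following may help. It assumes the usual ntp.conf form of `server <host> [options]` on uncommented lines; `active_ntp_servers` is a hypothetical helper name.

```shell
# Hypothetical helper: print the host of every uncommented "server" line.
# $1 = path to an ntp.conf-style file
active_ntp_servers() {
    # Lines such as "#server 0.centos.pool.ntp.org" are skipped because
    # their first field is "#server", not "server".
    awk '$1 == "server" && NF >= 2 { print $2 }' "$1"
}

# Example use:
#   active_ntp_servers /etc/ntp.conf
```

Running this on both the appliance and the puppet-master and comparing the output shows whether they point at the same time source. An entry only appears if the line starts with the `server` keyword followed by whitespace.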

KEVINDIENST · ‎2016-04-07

Hmmm,

I'd do a ps -ef | grep erlang on the SA head unit, kill any erlang processes (kill -9 <PID>), and then do service rabbitmq-server restart

Check the status of the node afterwards, and maybe tail the RabbitMQ log files

tail -f /var/log/rabbitmq/sa@localhost.log

JosephWest1 · ‎2016-04-07

Kevin,

Thank you for your input!

This is on a WAN, so the SA server and the clustered nodes use two different NTP servers. (I will call the nodes working-node and broken-node to tell them apart.)

SA is on one NTP server

Broken-node and working-node are on the same NTP server. The time difference from the SA server is about 40 seconds, but working-node still has no issues maintaining its certificates, mco, or rabbit messaging.

I spent about 20 hours last week going through all the .conf files and making sure broken-node and working-node matched. Today, I found one difference:

One thing to note: /var/log/rabbitmq/startup_{log,err} and the sa@localhost log are empty.

$ps -ef |grep erlang

Working Node:

/usr/lib64/erlang/erts-5.10.4/bin/epmd -daemon

/usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W -w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbitmq_server-3.3.4/sbin../ebin -noshell -noinput -s rabbit boot -sname sa@localhost -boot start_sasl -config /etc/rabbitmq/rabbitmq -kernel inet_default_connect_options [{nodelay,true}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit error_logger {file, "/var/log/rabbitmq/sa@localhost.log"} -rabbit enabled_plugins_file "/etc/rabbitmq/rsa_enabled_plugins" -rabbit plugins_dir "/usr/lib/rabbitmq/lib/rabbitmq_server-3.3.4/sbin../plugins" -rabbit plugins_expand_dir "/var/lib/rabbitmq/mnesia/sa@localhost-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/lib/rabbitmq/mnesia/sa@localhost" -kernel inet_dist_listen_min 25672 -kernel inet_dist_listen_max 25672

Broken Node

/usr/lib64/erlang/erts-5.10.4/bin/epmd -daemon

/usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W -w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbitmq_server-3.3.4/sbin../ebin -noshell -noinput -hidden -boot start_clean -sasl errlog_type error -mnesia dir "/var/lib/rabbitmq/mnesia/sa@localhost" -s rabbit_control_main -nodename sa@localhost -extra wait /var/run/rabbitmq/pid
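Comparing the two beam.smp lines, the working node runs `-s rabbit boot`, while the broken node only shows `-s rabbit_control_main ... -extra wait /var/run/rabbitmq/pid`. That looks like a rabbitmqctl helper waiting for the broker rather than the broker itself, though this is an interpretation and is not confirmed in the thread. A rough sketch for labelling such lines, where `beam_role` is a hypothetical helper and the patterns are taken only from the two command lines above:

```shell
# Hypothetical helper: label a beam.smp command line by its start module.
# $1 = full command line, as printed by ps
beam_role() {
    case "$1" in
        *"-s rabbit boot"*)         echo broker ;;       # as on the working node
        *"-s rabbit_control_main"*) echo rabbitmqctl ;;  # as on the broken node
        *)                          echo unknown ;;
    esac
}

# Example use:
#   ps -eo args | grep '[b]eam.smp' | while IFS= read -r line; do
#       beam_role "$line"
#   done
```

If a node reports only `rabbitmqctl` and never `broker`, the broker process probably never came up, which would also fit the empty log files.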

DavidPoirier · ‎2016-04-07

Is the node's time ahead of or behind the SA server's? Ahead is not correct.