Classification: Application
Category: FireMon Data Collector
Context: GlusterFS
Severity: Error
Summary
One of the nodes in the GlusterFS trusted storage pool is not connected. It may be offline or otherwise unreachable from this machine.
Description
This error occurs when one of the nodes in the GlusterFS trusted storage pool is not able to communicate with the other nodes. The node is not participating in filesystem replication.
Impact
While the node is disconnected, it will not receive replication updates from other servers, nor will it be able to send replication updates to other servers in the cluster.
If this error is reported on the primary database server about a standby server, it indicates that the standby server is not up to date with respect to the shared filesystem data. If the machine is disconnected for an extended period of time, it may take a long time to replicate the changes it has missed. If a failure event were to occur on the primary database server while the standby is disconnected, promoting that standby would result in permanent loss of the data that was not replicated.
If this error is reported on a standby server about the primary server, it may indicate that the primary server has failed. If this is the case, a switchover may need to be initiated. It could also indicate that this standby machine has lost network communication with the primary server. In either case, promoting this standby while it is disconnected would result in permanent loss of any data that had not yet been replicated to it.
Cause
Several different issues can cause a member of the GlusterFS trusted storage pool to become disconnected. To identify the exact cause, additional information needs to be gathered from the disconnected machine.
To investigate the cause of the problem, connect to the FMOS command line on the machine in question. Use the FMOS Health Evaluation system to inspect the current state of the machine:
fmos health -dv
This may provide better insight into the issue. Use the links provided with the health results to find additional information about specific problems.
Log files may also provide valuable information for troubleshooting the issue. Use the fmos logview command to view the system log and search for messages from glusterd.
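To see which pool member is reported as disconnected, the standard GlusterFS command gluster peer status can be filtered. The following is a minimal sketch, assuming the gluster CLI is available from the FMOS command line and is run with sufficient privileges. The function name disconnected_peers is hypothetical and is not part of FMOS or GlusterFS.

```shell
# Hypothetical helper: reads `gluster peer status` output on stdin and
# prints the hostname of every peer whose state is (Disconnected).
disconnected_peers() {
    awk '
        /^Hostname:/ { host = $2 }                      # remember the current peer
        /^State:/ && /\(Disconnected\)/ { print host }  # report it if disconnected
    '
}

# Example usage (assumes the gluster CLI is installed on this machine):
#   gluster peer status | disconnected_peers
```

Note that gluster peer status lists only the other members of the pool, not the machine the command is run on, so running it from more than one node can help show which side has lost connectivity.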
Resolution
To resolve this error, the disconnected node will need to be reconnected to the GlusterFS trusted storage pool. Follow the instructions in the Cause section above to identify the specific issue to resolve.
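Once the underlying issue has been addressed, it may be useful to confirm that every peer is connected again. The sketch below assumes the gluster CLI is available on the machine. The function name peers_all_connected is hypothetical. The FMOS Health Evaluation system (fmos health -dv) remains the authoritative check.

```shell
# Hypothetical helper: reads `gluster peer status` output on stdin and
# succeeds only if at least one peer is listed and every listed peer
# reports a (Connected) state.
peers_all_connected() {
    awk '
        /^State:/ {
            total++
            if ($0 !~ /\(Connected\)/) bad++
        }
        END { exit ((total > 0 && bad == 0) ? 0 : 1) }
    '
}

# Example usage (assumes the gluster CLI is installed on this machine):
#   gluster peer status | peers_all_connected && echo "all peers connected"
```

The check deliberately fails when no peers are listed at all, since an empty peer list on a node that should be a pool member is itself worth investigating.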