Some commands for managing Solaris Cluster 3.3u1 global configuration/status.
cluster list-cmds (list the available cluster subcommands):

    # cluster list-cmds
    claccess
    cldevice
    ...
cluster list (list the cluster name):

    # cluster list
    suncluster
cluster show (display the cluster configuration; output abridged to the section headings):

    # cluster show

    === Cluster ===

    Cluster Name:                 suncluster

    === Host Access Control ===
    === Cluster Nodes ===
    === Transport Cables ===
    === Transport Switches ===
    === Quorum Devices ===
    === Device Groups ===
    === Registered Resource Types ===
    === Resource Groups and Resources ===
    === DID Device Instances ===
cluster status:

    # cluster status

    === Cluster Nodes ===

    Node Name                  Status
    ---------                  ------
    unixlab-3                  Online
    unixlab-2                  Online

    === Cluster Transport Paths ===

    Endpoint1            Endpoint2            Status
    ---------            ---------            ------
    unixlab-3:e1000g2    unixlab-2:e1000g2    Path online
    unixlab-3:e1000g3    unixlab-2:e1000g3    Path online

    === Cluster Quorum ===

    --- Quorum Votes Summary from (latest node reconfiguration) ---

    Needed   Present   Possible
    ------   -------   --------
    2        3         3

    --- Quorum Votes by Node (current status) ---

    Node Name    Present   Possible   Status
    ---------    -------   --------   ------
    unixlab-3    1         1          Online
    unixlab-2    1         1          Online

    --- Quorum Votes by Device (current status) ---

    Device Name   Present   Possible   Status
    -----------   -------   --------   ------
    d4            1         1          Online

    === Cluster DID Devices ===

    Device Instance      Node        Status
    ---------------      ----        ------
    /dev/did/rdsk/d1     unixlab-3   Ok
    /dev/did/rdsk/d2     unixlab-3   Ok
    /dev/did/rdsk/d3     unixlab-3   Ok
    /dev/did/rdsk/d4     unixlab-2   Ok
                         unixlab-3   Ok
    /dev/did/rdsk/d6     unixlab-2   Ok
    /dev/did/rdsk/d7     unixlab-2   Ok
    /dev/did/rdsk/d8     unixlab-2   Ok
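The fixed-column output above lends itself to quick scripted health checks. The following is a minimal sketch (the awk filter and the sample data are mine, not part of the cluster software) that prints any node in the `=== Cluster Nodes ===` section whose status is not Online; in live use you would pipe `cluster status` into the filter instead of the heredoc.

```shell
# Print nodes that are not Online from `cluster status`-style output.
# Live use: cluster status | awk '...'
# Here sample output (with unixlab-2 offline) is fed in via a heredoc.
offline=$(awk '
  /^=== Cluster Nodes ===/ { in_nodes = 1; next }
  /^===/                   { in_nodes = 0 }
  # Data rows have exactly two fields and do not start with a dash
  in_nodes && NF == 2 && $1 !~ /^-/ && $2 != "Online" {
      print $1 " is " $2
  }
' <<'EOF'
=== Cluster Nodes ===

Node Name    Status
---------    ------
unixlab-3    Online
unixlab-2    Offline

=== Cluster Transport Paths ===
EOF
)
echo "$offline"
```

The same pattern (section heading sets a flag, next `===` clears it) works for the transport-path and quorum sections.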
cluster check (check and report whether the cluster is configured correctly):

    # cluster check -v
    initializing...
    initializing xml output...
    loading auxiliary data...
    starting check run...
    ...
    finished check run
    finishing xml output...
    Maximum severity of all violations: Moderate
    Reports in: /var/cluster/logs/cluster_check/2013-01-03.15:39:56/
    cleaning up...
Some commands for managing Solaris Cluster nodes.
clnode list:

    # clnode list
    unixlab-3
    unixlab-2
clnode show (one node shown):

    # clnode show

    === Cluster Nodes ===

    Node Name:                    unixlab-3
      Node ID:                    1
      Enabled:                    yes
      privatehostname:            clusternode1-priv
      reboot_on_path_failure:     disabled
      globalzoneshares:           1
      defaultpsetmin:             1
      quorum_vote:                1
      quorum_defaultvote:         1
      quorum_resv_key:            0x50E49D1600000001
      Transport Adapter List:     e1000g3, e1000g2
clnode status:

    # clnode status -v

    === Cluster Nodes ===

    --- Node Status ---

    Node Name    Status
    ---------    ------
    unixlab-3    Online
    unixlab-2    Online

    --- Node IPMP Group Status ---

    Node Name    Group Name   Status   Adapter   Status
    ---------    ----------   ------   -------   ------
    unixlab-3    sc_ipmp0     Online   e1000g0   Online
    unixlab-2    sc_ipmp0     Online   e1000g0   Online
clnode show-rev (show the Oracle Solaris Cluster software release and packages):

    # clnode show-rev -v
    Oracle Solaris Cluster 3.3u1 for Solaris 10 sparc
    SUNWscu:       3.3.0,REV=2010.07.26.13.19, 145333-08 800222-01
    SUNWsccomu:    3.3.0,REV=2010.07.26.13.19, 145333-08 800222-01
    SUNWsczr:      3.3.0,REV=2010.07.26.13.19, 800222-01
    ...
Some commands for managing Solaris Cluster devices.
cldevice list:

    # cldevice list -v
    DID Device    Full Device Path
    ----------    ----------------
    d1            unixlab-3:/dev/rdsk/c0t0d0
    d2            unixlab-3:/dev/rdsk/c0t1d0
    d3            unixlab-3:/dev/rdsk/c0t2d0
    d4            unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
    d4            unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
cldevice show (two device instances shown):

    # cldevice show

    === DID Device Instances ===

    DID Device Name:          /dev/did/rdsk/d1
      Full Device Path:       unixlab-3:/dev/rdsk/c0t0d0
      Replication:            none
      default_fencing:        global

    DID Device Name:          /dev/did/rdsk/d4
      Full Device Path:       unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
      Full Device Path:       unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
      Replication:            none
      default_fencing:        global
After removing an internal disk, check device consistency:

    # cldevice status
    ...
    /dev/did/rdsk/d7    unixlab-2    Ok
    /dev/did/rdsk/d8    unixlab-2    Fail

    # cldevice check
    cldevice:  (C594300) Could not stat "/dev/rdsk/c0t2d0s2" - No such file or directory.
    Warning: Path not loaded - "/dev/rdsk/c0t2d0s2".
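Failed paths can also be picked out of `cldevice status` output mechanically. A hedged sketch (the awk one-liner is illustrative, not a cluster command) that prints the DID device of every row whose last field is Fail:

```shell
# List DID devices reporting Fail.
# Live use: cldevice status | awk '$NF == "Fail" { print $1 }'
failed=$(awk '$NF == "Fail" { print $1 }' <<'EOF'
/dev/did/rdsk/d7    unixlab-2    Ok
/dev/did/rdsk/d8    unixlab-2    Fail
EOF
)
echo "$failed"
```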
To clear all DID references to stale devices, refresh the device configuration, and populate the global-devices namespace:

    # cldevice clear
    Updating shared devices on node 1
    Updating shared devices on node 2
    # cldevice refresh
    # cldevice populate

    # cldevice status
    ...
    /dev/did/rdsk/d7    unixlab-2    Ok
    (there is no d8)
Some commands for managing Solaris Cluster quorum.
clquorum list:

    # clquorum list -v
    Quorum        Type
    ------        ----
    d4            shared_disk
    unixlab-3     node
    unixlab-2     node
clquorum show (output abridged):

    # clquorum show

    Node Name:                unixlab-2
      Node ID:                2
      Quorum Vote Count:      1
      Reservation Key:        0x50E49D1600000002

    Quorum Device Name:       d4
      Enabled:                yes
      Votes:                  1
      Global Name:            /dev/did/rdsk/d4s2
      Type:                   shared_disk
      Access Mode:            scsi2
      Hosts (enabled):        unixlab-3, unixlab-2
clquorum status:

    # clquorum status

    --- Quorum Votes Summary from (latest node reconfiguration) ---

    Needed   Present   Possible
    ------   -------   --------
    2        3         3

    --- Quorum Votes by Node (current status) ---

    Node Name    Present   Possible   Status
    ---------    -------   --------   ------
    unixlab-3    1         1          Online
    unixlab-2    1         1          Online

    --- Quorum Votes by Device (current status) ---

    Device Name   Present   Possible   Status
    -----------   -------   --------   ------
    d4            1         1          Online
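The cluster stays up as long as the present vote count is at least the needed count (a majority of possible votes). A sketch of a scripted check against the summary block shown above; the awk program is mine, and in live use you would pipe `clquorum status` into it:

```shell
# Decide whether quorum is met from the "Quorum Votes Summary" block.
# Live use: clquorum status | awk '...'
status=$(awk '
  /Quorum Votes Summary/ { summary = 1; next }
  # The first row after the heading whose first field is numeric carries the counts
  summary && $1 ~ /^[0-9]+$/ { needed = $1; present = $2; summary = 0 }
  END { print (present >= needed ? "quorum met" : "quorum lost") }
' <<'EOF'
--- Quorum Votes Summary from (latest node reconfiguration) ---

Needed   Present   Possible
------   -------   --------
2        3         3
EOF
)
echo "$status"
```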
Some commands for managing Solaris Cluster interconnect.
clinterconnect show:

    # clinterconnect show

    === Transport Cables ===

    Transport Cable:            unixlab-3:e1000g3,switch1@1
      Endpoint1:                unixlab-3:e1000g3
      Endpoint2:                switch1@1
      State:                    Enabled

    Transport Cable:            unixlab-3:e1000g2,switch2@1
      Endpoint1:                unixlab-3:e1000g2
      Endpoint2:                switch2@1
      State:                    Enabled

    === Transport Switches ===

    Transport Switch:           switch1
      State:                    Enabled
      Type:                     switch
      Port Names:               1 2
      Port State(1):            Enabled
      Port State(2):            Enabled

    Transport Switch:           switch2
      State:                    Enabled
      Type:                     switch
      Port Names:               1 2
      Port State(1):            Enabled
      Port State(2):            Enabled

    --- Transport Adapters for unixlab-3 ---

    Transport Adapter:          e1000g3
      State:                    Enabled
      Transport Type:           dlpi
      device_name:              e1000g
      device_instance:          3
      lazy_free:                1
      dlpi_heartbeat_timeout:   10000
      dlpi_heartbeat_quantum:   1000
      nw_bandwidth:             80
      bandwidth:                70
      ip_address:               172.16.0.129
      netmask:                  255.255.255.128
      Port Names:               0
      Port State(0):            Enabled
clinterconnect status:

    # clinterconnect status

    === Cluster Transport Paths ===

    Endpoint1            Endpoint2            Status
    ---------            ---------            ------
    unixlab-3:e1000g2    unixlab-2:e1000g2    Path online
    unixlab-3:e1000g3    unixlab-2:e1000g3    Path online
Shutting down and booting a Global Cluster
Shutting down a global cluster.

Run the cluster shutdown command on one node (from the serial console); all nodes will shut down. Don't power off any node until all nodes are at the "ok" prompt (SPARC).

    {unixlab-3}/# cluster shutdown -g10 -y
    unixlab-3 cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
    unixlab-3 cl_runtime: NOTICE: CMM: node reconfiguration #11 completed.
    RGEVACUATE: Calling clzc halt -n unixlab-3
    + RGEVACUATE: Calling clnode evacuate
    RGEVACUATE: disabling failfasts
    svc.startd: The system is coming down.  Please wait.
    svc.startd: 131 system services are now being stopped.
    unixlab-3 nrpe[651]: Cannot remove pidfile '/var/run/nrpe.pid' - check your privileges.
    unixlab-3 Cluster.Transport.Privipd: fatal: received signal 15
    unixlab-3 Cluster.Transport.Cznetd: fatal: received signal 15
    unixlab-3 cl_eventlogd[1264]: Going down on signal 15.
    unixlab-3 syslogd: going down on signal 15
    svc.startd: The system is down.
    syncing file systems... done
    WARNING: CMM: Node being shut down.
    Program terminated
    {11} ok

The serial console messages from the second node:
Booting a global cluster.

Run the 'boot' command from the 'ok' prompt on each node in the cluster.

The serial console messages from one node:
Shutting down and booting a Single Node in a Cluster
Shutting down a single node.

Use clnode evacuate to switch all resource groups, resources, and device groups from the node being shut down to the other active nodes in the cluster, then shut the node down:

    {unixlab-2}/# clnode evacuate unixlab-2
    {unixlab-2}/# shutdown -g0 -y -i0
    Shutdown started.    09:56:16 AM PST

    Changing to init state 0 - please wait
    Broadcast Message from root (console) on unixlab-2
    THE SYSTEM unixlab-2 IS BEING SHUT DOWN NOW ! ! !
    Log off now or risk your files being damaged

    {unixlab-2}/# RGEVACUATE: Calling clzc halt -n unixlab-2
    + RGEVACUATE: Calling clnode evacuate
    RGEVACUATE: disabling failfasts
    svc.startd: The system is coming down.  Please wait.
    svc.startd: 129 system services are now being stopped.
    Cluster.Transport.Privipd: fatal: received signal 15
    Cluster.Transport.Cznetd: fatal: received signal 15
    cl_eventlogd[988]: Going down on signal 15.
    syslogd: going down on signal 15
    umount: /global/.devices/node@2 busy
    umount: /global/.devices/node@1 busy
    svc.startd: The system is down.
    syncing file systems... done
    WARNING: CMM: Node being shut down.
    Program terminated

Check cluster status on the remaining running node:

    {unixlab-3}/# cluster status -t node,quorum

    === Cluster Nodes ===

    --- Node Status ---

    Node Name    Status
    ---------    ------
    unixlab-3    Online
    unixlab-2    Offline

    === Cluster Quorum ===

    --- Quorum Votes Summary from (latest node reconfiguration) ---

    Needed   Present   Possible
    ------   -------   --------
    2        2         3

    --- Quorum Votes by Node (current status) ---

    Node Name    Present   Possible   Status
    ---------    -------   --------   ------
    unixlab-3    1         1          Online
    unixlab-2    0         1          Offline

    --- Quorum Votes by Device (current status) ---

    Device Name   Present   Possible   Status
    -----------   -------   --------   ------
    d4            1         1          Online
Booting a single node:

    {5} ok boot
    SC Alert: Host System has Reset
    ...
    WARNING: /scsi_vhci/ssd@g60003baccc75000050e2094800021c74 (ssd0): reservation conflict
    Booting in cluster mode
    NOTICE: CMM: Node unixlab-3 (nodeid = 1) with votecount = 1 added.
    NOTICE: CMM: Node unixlab-2 (nodeid = 2) with votecount = 1 added.
    WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
    NOTICE: clcomm: Adapter e1000g2 constructed
    cl_runtime: NOTICE: clcomm: Adapter e1000g3 constructed
    cl_runtime: NOTICE: CMM: Node unixlab-2: attempting to join cluster.
    cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g3 - unixlab-3:e1000g3 online
    cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid: 1, incarnation #: 1357320249) has become reachable.
    cl_runtime: WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
    cl_runtime: NOTICE: CMM: Cluster has reached quorum.
    cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid = 1) is up; new incarnation number = 1357320249.
    cl_runtime: NOTICE: CMM: Node unixlab-2 (nodeid = 2) is up; new incarnation number = 1357322854.
    cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
    cl_runtime: NOTICE: CMM: node reconfiguration #6 completed.
    cl_runtime: NOTICE: CMM: Node unixlab-2: joined cluster.
    cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g2 - unixlab-3:e1000g2 online
    obtaining access to all attached disks
Files location
Cluster commands are logged to /var/cluster/logs/commandlog. Commands that only display the cluster configuration or state are not logged there.

    {unixlab-3}/var/cluster/logs# tail commandlog
    01/03/2013 16:20:59 unixlab-3 16802 root END 0
    01/03/2013 16:21:03 unixlab-3 16796 root END 0
    01/03/2013 17:11:07 unixlab-3 17014 root START - clquorum "disable"
    01/03/2013 17:11:08 unixlab-3 17014 root END 3
    01/03/2013 17:11:16 unixlab-3 17015 root START - clquorum "disable" "unixlab-2"
    01/03/2013 17:11:16 unixlab-3 17015 root END 38
    01/03/2013 17:11:23 unixlab-3 17016 root START - clquorum "disable" "d4"
    01/03/2013 17:11:23 unixlab-3 17016 root END 9
    01/03/2013 17:11:44 unixlab-3 17018 root START - clquorum "disable"
    01/03/2013 17:11:44 unixlab-3 17018 root END 3
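Each commandlog entry pairs a START line with an END line (same PID in the fourth field; the END line carries the exit status), so failed commands can be extracted with a short script. A sketch assuming the log format shown above; the awk program is mine, not part of the product, and the sample data is fed in via a heredoc rather than the real log:

```shell
# Report logged commands whose END status is non-zero, pairing START/END by PID ($4).
# Live use: awk '...' /var/cluster/logs/commandlog
failures=$(awk '
  $6 == "START" { cmd[$4] = substr($0, index($0, " - ") + 3) }
  $6 == "END" && $7 != 0 && ($4 in cmd) { print cmd[$4] " -> exit " $7 }
' <<'EOF'
01/03/2013 17:11:07 unixlab-3 17014 root START - clquorum "disable"
01/03/2013 17:11:08 unixlab-3 17014 root END 3
01/03/2013 17:11:16 unixlab-3 17015 root START - clquorum "disable" "unixlab-2"
01/03/2013 17:11:16 unixlab-3 17015 root END 0
EOF
)
echo "$failures"
```

Only the first command is reported, since the second sample entry ends with status 0.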