Working with Solaris Cluster 3.3u1

Some commands for managing Solaris Cluster 3.3u1 global configuration/status.

Command description | Examples
Command: cluster
Subcommand: list-cmds (list Cluster commands)
# cluster list-cmds
claccess
cldevice
...
cluster list
# cluster list
suncluster
cluster show
# cluster show
... (output abbreviated; the section headers are shown below) ...
  === Cluster ===
  Cluster Name:  suncluster
  === Host Access Control ===
  === Cluster Nodes ===
  === Transport Cables ===
  === Transport Switches ===
  === Quorum Devices ===
  === Device Groups ===
  === Registered Resource Types ===
  === Resource Groups and Resources ===
  === DID Device Instances ===
cluster status
# cluster status
=== Cluster Nodes ===
Node Name                                       Status
---------                                       ------
unixlab-3                                       Online
unixlab-2                                       Online
=== Cluster Transport Paths ===
Endpoint1               Endpoint2               Status
---------               ---------               ------
unixlab-3:e1000g2       unixlab-2:e1000g2       Path online
unixlab-3:e1000g3       unixlab-2:e1000g3       Path online
=== Cluster Quorum ===
--- Quorum Votes Summary from (latest node reconfiguration) ---
            Needed   Present   Possible
            ------   -------   --------
            2        3         3
--- Quorum Votes by Node (current status) ---
Node Name       Present       Possible       Status
---------       -------       --------       ------
unixlab-3       1             1              Online
unixlab-2       1             1              Online
--- Quorum Votes by Device (current status) ---
Device Name       Present      Possible      Status
-----------       -------      --------      ------
d4                1            1             Online
=== Cluster DID Devices ===
Device Instance             Node                Status
---------------             ----                ------
/dev/did/rdsk/d1            unixlab-3           Ok
/dev/did/rdsk/d2            unixlab-3           Ok
/dev/did/rdsk/d3            unixlab-3           Ok
/dev/did/rdsk/d4            unixlab-2           Ok
                            unixlab-3           Ok
/dev/did/rdsk/d6            unixlab-2           Ok
/dev/did/rdsk/d7            unixlab-2           Ok
/dev/did/rdsk/d8            unixlab-2           Ok
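The status report can be limited to particular component types with -t (used later on this page as "cluster status -t node,quorum"); a minimal sketch:

# cluster status -t quorum
(prints only the === Cluster Quorum === section shown above)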
cluster check
(Check and report whether the cluster is configured correctly)
# cluster check -v 
initializing...
  initializing xml output...
  loading auxiliary data...
  starting check run...
...
  finished check run
  finishing xml output...
  Maximum severity of all violations: Moderate
  Reports in: /var/cluster/logs/cluster_check/2013-01-03.15:39:56/
  cleaning up...
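
The detailed reports end up in the directory printed above; they are ordinary files, so listing that directory is enough to find them. A sketch using the path from this run:

# ls /var/cluster/logs/cluster_check/2013-01-03.15:39:56/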

Some commands for managing Solaris Cluster nodes.

Command description | Examples
clnode list
# clnode list
unixlab-3
unixlab-2
clnode show
# clnode show
=== Cluster Nodes ===
Node Name:                unixlab-3
  Node ID:                1
  Enabled:                yes
  privatehostname:        clusternode1-priv
  reboot_on_path_failure: disabled
  globalzoneshares:       1
  defaultpsetmin:         1
  quorum_vote:            1
  quorum_defaultvote:     1
  quorum_resv_key:        0x50E49D1600000001
  Transport Adapter List: e1000g3, e1000g2
clnode status
# clnode status -v
=== Cluster Nodes ===
--- Node Status ---
Node Name                                             Status
---------                                             ------
unixlab-3                                             Online
unixlab-2                                             Online
--- Node IPMP Group Status ---
Node Name     Group Name     Status     Adapter        Status
---------     ----------     ------     -------        ------
unixlab-3     sc_ipmp0       Online     e1000g0        Online
unixlab-2     sc_ipmp0       Online     e1000g0        Online
clnode show-rev
Show Oracle Solaris Cluster software release and packages
# clnode show-rev -v 
Oracle Solaris Cluster 3.3u1 for Solaris 10 sparc
SUNWscu:       3.3.0,REV=2010.07.26.13.19, 145333-08 800222-01
SUNWsccomu:    3.3.0,REV=2010.07.26.13.19, 145333-08 800222-01
SUNWsczr:      3.3.0,REV=2010.07.26.13.19, 800222-01
...
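
Node properties visible in "clnode show" (privatehostname, reboot_on_path_failure, and so on) can be changed with "clnode set". A sketch, assuming you want unixlab-3 to reboot when all its monitored shared-disk paths fail (verify the property name and values in clnode(1CL) first):

# clnode set -p reboot_on_path_failure=enabled unixlab-3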

Some commands for managing Solaris Cluster devices.

Command description | Examples
cldevice list -v
#  cldevice list -v
DID Device  Full Device Path
----------  ----------------
d1          unixlab-3:/dev/rdsk/c0t0d0
d2          unixlab-3:/dev/rdsk/c0t1d0
d3          unixlab-3:/dev/rdsk/c0t2d0
d4          unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
d4          unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
cldevice show
# cldevice show
=== DID Device Instances ===
DID Device Name:    /dev/did/rdsk/d1
  Full Device Path: unixlab-3:/dev/rdsk/c0t0d0
  Replication:      none
  default_fencing:  global

DID Device Name:    /dev/did/rdsk/d4
  Full Device Path: unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
  Full Device Path: unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
  Replication:      none
  default_fencing:  global
After removing an internal disk, check device consistency.
# cldevice status
...
/dev/did/rdsk/d7  unixlab-2   Ok 
/dev/did/rdsk/d8  unixlab-2   Fail
# cldevice check
cldevice:  (C594300) Could not stat "/dev/rdsk/c0t2d0s2" - No such file or directory.
Warning: Path not loaded - "/dev/rdsk/c0t2d0s2".
Clear all DID references to stale devices
Refresh the device configuration
Populate the global-devices namespace
# cldevice clear
Updating shared devices on node 1
Updating shared devices on node 2
# cldevice refresh
# cldevice populate
# cldevice status
...
/dev/did/rdsk/d7  unixlab-2   Ok
(there is no d8)
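
The recovery above is the standard clear/refresh/populate sequence; a commented recap of what each step did:

# cldevice clear      (drop DID references to devices that no longer exist)
# cldevice refresh    (rescan the device trees on all cluster nodes)
# cldevice populate   (recreate entries in the global-devices namespace)
# cldevice status     (confirm the stale instance, d8 here, is gone)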

Some commands for managing Solaris Cluster quorum.

Command description | Examples
clquorum list
#  clquorum list -v
Quorum              Type
------              ----
d4                  shared_disk
unixlab-3           node
unixlab-2           node
clquorum show
# clquorum show
Node Name:            unixlab-2
  Node ID:            2
  Quorum Vote Count:  1
  Reservation Key:    0x50E49D1600000002

Quorum Device Name:   d4
  Enabled:            yes
  Votes:              1
  Global Name:        /dev/did/rdsk/d4s2
  Type:               shared_disk
  Access Mode:        scsi2
  Hosts (enabled):    unixlab-3, unixlab-2
clquorum status
# clquorum status
--- Quorum Votes Summary from (latest node reconfiguration) ---
            Needed   Present   Possible
            ------   -------   --------
            2        3         3

--- Quorum Votes by Node (current status) ---
Node Name       Present       Possible       Status
---------       -------       --------       ------
unixlab-3       1             1              Online
unixlab-2       1             1              Online

--- Quorum Votes by Device (current status) ---
Device Name       Present      Possible      Status
-----------       -------      --------      ------
d4                1            1             Online
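
A quorum device can be put into maintenance and brought back without removing it; the commandlog excerpt at the bottom of this page records exactly such calls. A sketch:

# clquorum disable d4   (maintenance state; the device's vote count drops to 0)
# clquorum enable d4    (restore the vote)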

Some commands for managing Solaris Cluster interconnect.

Command description | Examples
clinterconnect show
#  clinterconnect show
=== Transport Cables ===
Transport Cable:                                unixlab-3:e1000g3,switch1@1
  Endpoint1:                                       unixlab-3:e1000g3
  Endpoint2:                                       switch1@1
  State:                                           Enabled
Transport Cable:                                unixlab-3:e1000g2,switch2@1
  Endpoint1:                                       unixlab-3:e1000g2
  Endpoint2:                                       switch2@1
  State:                                           Enabled
=== Transport Switches ===
Transport Switch:                               switch1
  State:                                           Enabled
  Type:                                            switch
  Port Names:                                      1 2
  Port State(1):                                   Enabled
  Port State(2):                                   Enabled
Transport Switch:                               switch2
  State:                                           Enabled
  Type:                                            switch
  Port Names:                                      1 2
  Port State(1):                                   Enabled
  Port State(2):                                   Enabled
--- Transport Adapters for unixlab-3 ---
Transport Adapter:                              e1000g3
  State:                                           Enabled
  Transport Type:                                  dlpi
  device_name:                                     e1000g
  device_instance:                                 3
  lazy_free:                                       1
  dlpi_heartbeat_timeout:                          10000
  dlpi_heartbeat_quantum:                          1000
  nw_bandwidth:                                    80
  bandwidth:                                       70
  ip_address:                                      172.16.0.129
  netmask:                                         255.255.255.128
  Port Names:                                      0
  Port State(0):                                   Enabled
clinterconnect status
# clinterconnect status
=== Cluster Transport Paths ===
Endpoint1               Endpoint2               Status
---------               ---------               ------
unixlab-3:e1000g2       unixlab-2:e1000g2       Path online
unixlab-3:e1000g3       unixlab-2:e1000g3       Path online
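
A transport cable can be disabled and re-enabled for maintenance, identified by its endpoints from "clinterconnect show"; a sketch (keep at least one path online, and check clinterconnect(1CL) for the exact endpoint syntax):

# clinterconnect disable unixlab-3:e1000g2,switch2@1
# clinterconnect enable unixlab-3:e1000g2,switch2@1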

Shutting down and booting a Global Cluster

Command description | Examples
Shutting down a Global Cluster.
Run the cluster shutdown command on one node (from the serial console).
Both nodes will shut down.
Do not power off any node until all nodes are at the "ok" prompt (SPARC).
{unixlab-3}/# cluster shutdown -g10 -y
unixlab-3 cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
unixlab-3 cl_runtime: NOTICE: CMM: node reconfiguration #11 completed.
RGEVACUATE: Calling clzc halt -n unixlab-3 +
RGEVACUATE: Calling clnode evacuate
RGEVACUATE: disabling failfasts
svc.startd: The system is coming down.  Please wait.
svc.startd: 131 system services are now being stopped.
unixlab-3 nrpe[651]: Cannot remove pidfile '/var/run/nrpe.pid' - check your privileges.
unixlab-3 Cluster.Transport.Privipd: fatal: received signal 15
unixlab-3 Cluster.Transport.Cznetd: fatal: received signal 15
unixlab-3 cl_eventlogd[1264]: Going down on signal 15.
unixlab-3 syslogd: going down on signal 15
svc.startd: The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
Program terminated
{11} ok

The serial console messages from the second node:

{unixlab-2}/# cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
cl_runtime: NOTICE: CMM: node reconfiguration #11 completed.
RGEVACUATE: Calling clzc halt -n unixlab-2 +
RGEVACUATE: Calling clnode evacuate
RGEVACUATE: disabling failfasts
svc.startd: The system is coming down.  Please wait.
svc.startd: 130 system services are now being stopped.
cl_eventlogd[1248]: Going down on signal 15.
Cluster.Transport.Privipd: fatal: received signal 15
Cluster.Transport.Cznetd: fatal: received signal 15
syslogd: going down on signal 15
svc.startd: The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
NOTICE: CMM: Node unixlab-3 (nodeid = 1) is dead.
Program terminated
{13} ok
Booting a Global Cluster.
Run the 'boot' command from the 'ok' prompt on each node in the cluster.
The serial console messages from one node:

{13} ok boot
SC Alert: Host System has Reset
Sun Fire T200, No Keyboard
Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.30.0, 16256 MB memory available, Serial #74104598.
Ethernet address 0:14:4f:6a:bf:16, Host ID: 846abf16.
Boot device: disk:a  File and args:
SunOS Release 5.10 Version Generic_147440-01 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: unixlab-2
Booting in cluster mode
NOTICE: CMM: Node unixlab-3 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node unixlab-2 (nodeid = 2) with votecount = 1 added.
cl_runtime: WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
cl_runtime: NOTICE: clcomm: Adapter e1000g2 constructed
cl_runtime: NOTICE: clcomm: Adapter e1000g3 constructed
cl_runtime: NOTICE: CMM: Node unixlab-2: attempting to join cluster.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g2 - unixlab-3:e1000g2 online
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid: 1, incarnation #: 1357320249) has become reachable.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g3 - unixlab-3:e1000g3 online
cl_runtime: NOTICE: CMM: Cluster has reached quorum.
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid = 1) is up; new incarnation number = 1357320249.
cl_runtime: NOTICE: CMM: Node unixlab-2 (nodeid = 2) is up; new incarnation number = 1357320329.
cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
cl_runtime: NOTICE: CMM: node reconfiguration #3 completed.
cl_runtime: NOTICE: CMM: Node unixlab-2: joined cluster.
obtaining access to all attached disks

Verify the status of nodes in the cluster:

{unixlab-2}/# cluster status -t node
=== Cluster Nodes ===
--- Node Status ---
Node Name                 Status
---------                 ------
unixlab-3                 Online
unixlab-2                 Online

Shutting down and booting a Single Node in a Cluster

Command description | Examples
Switch all resource groups, resources,
and device groups from the node being shut down
to the other active nodes in the cluster.
Command: clnode evacuate

Then shut down the node.
{unixlab-2}/# clnode evacuate unixlab-2

{unixlab-2}/# shutdown -g0 -y -i0
Shutdown started.    09:56:16 AM PST
Changing to init state 0 - please wait
Broadcast Message from root (console) on unixlab-2 F
THE SYSTEM unixlab-2 IS BEING SHUT DOWN NOW ! ! !
Log off now or risk your files being damaged

{unixlab-2}/# RGEVACUATE: Calling clzc halt -n unixlab-2 +
RGEVACUATE: Calling clnode evacuate
RGEVACUATE: disabling failfasts
svc.startd: The system is coming down.  Please wait.
svc.startd: 129 system services are now being stopped.
Cluster.Transport.Privipd: fatal: received signal 15
Cluster.Transport.Cznetd: fatal: received signal 15
cl_eventlogd[988]: Going down on signal 15.
syslogd: going down on signal 15
umount: /global/.devices/node@2 busy
umount: /global/.devices/node@1 busy
svc.startd: The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
Program terminated

Check cluster status on the remaining running node:

{unixlab-3}/# cluster status -t node,quorum
=== Cluster Nodes ===
--- Node Status ---
Node Name                                       Status
---------                                       ------
unixlab-3                                       Online
unixlab-2                                       Offline
=== Cluster Quorum ===
--- Quorum Votes Summary from (latest node reconfiguration) ---
            Needed   Present   Possible
            ------   -------   --------
            2        2         3
--- Quorum Votes by Node (current status) ---
Node Name       Present       Possible       Status
---------       -------       --------       ------
unixlab-3       1             1              Online
unixlab-2       0             1              Offline
--- Quorum Votes by Device (current status) ---
Device Name       Present      Possible      Status
-----------       -------      --------      ------
d4                1            1             Online
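
Note the quorum arithmetic: with unixlab-2 down, the remaining node's vote (1) plus the quorum device's vote (1) gives 2 present votes, which still meets the 2 needed, so the surviving node keeps the cluster up.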
Booting a Single Node
{5} ok boot
SC Alert: Host System has Reset
...
WARNING: /scsi_vhci/ssd@g60003baccc75000050e2094800021c74 (ssd0):
        reservation conflict
Booting in cluster mode
NOTICE: CMM: Node unixlab-3 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node unixlab-2 (nodeid = 2) with votecount = 1 added.
WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
NOTICE: clcomm: Adapter e1000g2 constructed
cl_runtime: NOTICE: clcomm: Adapter e1000g3 constructed
cl_runtime: NOTICE: CMM: Node unixlab-2: attempting to join cluster.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g3 - unixlab-3:e1000g3 online
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid: 1, incarnation #: 1357320249) has become reachable.
cl_runtime: WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
cl_runtime: NOTICE: CMM: Cluster has reached quorum.
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid = 1) is up; new incarnation number = 1357320249.
cl_runtime: NOTICE: CMM: Node unixlab-2 (nodeid = 2) is up; new incarnation number = 1357322854.
cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
cl_runtime: NOTICE: CMM: node reconfiguration #6 completed.
cl_runtime: NOTICE: CMM: Node unixlab-2: joined cluster.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g2 - unixlab-3:e1000g2 online
obtaining access to all attached disks
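
Not shown above: a node can also be brought up in non-cluster mode for maintenance by passing -x to boot at the same prompt (SPARC); the node then boots standalone without joining the cluster:

{5} ok boot -x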

Files location

Files | Examples
Cluster commands are logged in the file /var/cluster/logs/commandlog.
Commands that only display cluster configuration/state are not logged there.
{unixlab-3}/var/cluster/logs# tail commandlog
01/03/2013 16:20:59 unixlab-3 16802 root END 0
01/03/2013 16:21:03 unixlab-3 16796 root END 0
01/03/2013 17:11:07 unixlab-3 17014 root START - clquorum "disable"
01/03/2013 17:11:08 unixlab-3 17014 root END 3
01/03/2013 17:11:16 unixlab-3 17015 root START - clquorum "disable" "unixlab-2"
01/03/2013 17:11:16 unixlab-3 17015 root END 38
01/03/2013 17:11:23 unixlab-3 17016 root START - clquorum "disable" "d4"
01/03/2013 17:11:23 unixlab-3 17016 root END 9
01/03/2013 17:11:44 unixlab-3 17018 root START - clquorum "disable"
01/03/2013 17:11:44 unixlab-3 17018 root END 3
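
Since commandlog is plain text (a START line and an END line with the command's exit status per invocation), ordinary tools are enough for auditing; a sketch:

{unixlab-3}/# grep clquorum /var/cluster/logs/commandlog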
