Working with Solaris Cluster 3.3u1

Some commands for managing Solaris Cluster 3.3u1 global configuration/status.

Command description | Examples
Command: cluster
Subcommand: list-cmds (list Cluster commands)
# cluster list-cmds
claccess
cldevice
...
cluster list
# cluster list
suncluster
cluster show
# cluster show
... (output abbreviated; the section headers are shown below) ...
  === Cluster ===
  Cluster Name:  suncluster
  === Host Access Control ===
  === Cluster Nodes ===
  === Transport Cables ===
  === Transport Switches ===
  === Quorum Devices ===
  === Device Groups ===
  === Registered Resource Types ===
  === Resource Groups and Resources ===
  === DID Device Instances ===
cluster status
# cluster status
=== Cluster Nodes ===
Node Name                                       Status
---------                                       ------
unixlab-3                                       Online
unixlab-2                                       Online
=== Cluster Transport Paths ===
Endpoint1               Endpoint2               Status
---------               ---------               ------
unixlab-3:e1000g2       unixlab-2:e1000g2       Path online
unixlab-3:e1000g3       unixlab-2:e1000g3       Path online
=== Cluster Quorum ===
--- Quorum Votes Summary from (latest node reconfiguration) ---
            Needed   Present   Possible
            ------   -------   --------
            2        3         3
--- Quorum Votes by Node (current status) ---
Node Name       Present       Possible       Status
---------       -------       --------       ------
unixlab-3       1             1              Online
unixlab-2       1             1              Online
--- Quorum Votes by Device (current status) ---
Device Name       Present      Possible      Status
-----------       -------      --------      ------
d4                1            1             Online
=== Cluster DID Devices ===
Device Instance             Node                Status
---------------             ----                ------
/dev/did/rdsk/d1            unixlab-3           Ok
/dev/did/rdsk/d2            unixlab-3           Ok
/dev/did/rdsk/d3            unixlab-3           Ok
/dev/did/rdsk/d4            unixlab-2           Ok
                            unixlab-3           Ok
/dev/did/rdsk/d6            unixlab-2           Ok
/dev/did/rdsk/d7            unixlab-2           Ok
/dev/did/rdsk/d8            unixlab-2           Ok
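The status report can be limited to particular component types with -t (used later on this page as "cluster status -t node,quorum"); a minimal sketch:

# cluster status -t quorum
(prints only the === Cluster Quorum === section shown above)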
cluster check
(Check and report whether the cluster is configured correctly)
# cluster check -v 
initializing...
  initializing xml output...
  loading auxiliary data...
  starting check run...
...
  finished check run
  finishing xml output...
  Maximum severity of all violations: Moderate
  Reports in: /var/cluster/logs/cluster_check/2013-01-03.15:39:56/
  cleaning up...
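
The detailed reports end up in the directory printed above; they are ordinary files, so listing that directory is enough to find them. A sketch using the path from this run:

# ls /var/cluster/logs/cluster_check/2013-01-03.15:39:56/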

Some commands for managing Solaris Cluster nodes.

Command description | Examples
clnode list
# clnode list
unixlab-3
unixlab-2
clnode show
# clnode show
=== Cluster Nodes ===
Node Name:                unixlab-3
  Node ID:                1
  Enabled:                yes
  privatehostname:        clusternode1-priv
  reboot_on_path_failure: disabled
  globalzoneshares:       1
  defaultpsetmin:         1
  quorum_vote:            1
  quorum_defaultvote:     1
  quorum_resv_key:        0x50E49D1600000001
  Transport Adapter List: e1000g3, e1000g2
clnode status
# clnode status -v
=== Cluster Nodes ===
--- Node Status ---
Node Name                                             Status
---------                                             ------
unixlab-3                                             Online
unixlab-2                                             Online
--- Node IPMP Group Status ---
Node Name     Group Name     Status     Adapter        Status
---------     ----------     ------     -------        ------
unixlab-3     sc_ipmp0       Online     e1000g0        Online
unixlab-2     sc_ipmp0       Online     e1000g0        Online
clnode show-rev
Show Oracle Solaris Cluster software release and packages
# clnode show-rev -v 
Oracle Solaris Cluster 3.3u1 for Solaris 10 sparc
SUNWscu:       3.3.0,REV=2010.07.26.13.19, 145333-08 800222-01
SUNWsccomu:    3.3.0,REV=2010.07.26.13.19, 145333-08 800222-01
SUNWsczr:      3.3.0,REV=2010.07.26.13.19, 800222-01
...
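
Node properties visible in "clnode show" (privatehostname, reboot_on_path_failure, and so on) can be changed with "clnode set". A sketch, assuming you want unixlab-3 to reboot when all its monitored shared-disk paths fail (verify the property name and values in clnode(1CL) first):

# clnode set -p reboot_on_path_failure=enabled unixlab-3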

Some commands for managing Solaris Cluster devices.

Command description | Examples
cldevice list -v
#  cldevice list -v
DID Device  Full Device Path
----------  ----------------
d1          unixlab-3:/dev/rdsk/c0t0d0
d2          unixlab-3:/dev/rdsk/c0t1d0
d3          unixlab-3:/dev/rdsk/c0t2d0
d4          unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
d4          unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
cldevice show
# cldevice show
=== DID Device Instances ===
DID Device Name:    /dev/did/rdsk/d1
  Full Device Path: unixlab-3:/dev/rdsk/c0t0d0
  Replication:      none
  default_fencing:  global

DID Device Name:    /dev/did/rdsk/d4
  Full Device Path: unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
  Full Device Path: unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
  Replication:      none
  default_fencing:  global
After removing an internal disk, check device consistency.
# cldevice status
...
/dev/did/rdsk/d7  unixlab-2   Ok 
/dev/did/rdsk/d8  unixlab-2   Fail
# cldevice check
cldevice:  (C594300) Could not stat "/dev/rdsk/c0t2d0s2" - No such file or directory.
Warning: Path not loaded - "/dev/rdsk/c0t2d0s2".
Clear all DID references to stale devices
Refresh the device configuration
Populate the global-devices namespace
# cldevice clear
Updating shared devices on node 1
Updating shared devices on node 2
# cldevice refresh
# cldevice populate
# cldevice status
...
/dev/did/rdsk/d7  unixlab-2   Ok
(there is no d8)
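
The recovery above is the standard clear/refresh/populate sequence; a commented recap of what each step did:

# cldevice clear      (drop DID references to devices that no longer exist)
# cldevice refresh    (rescan the device trees on all cluster nodes)
# cldevice populate   (recreate entries in the global-devices namespace)
# cldevice status     (confirm the stale instance, d8 here, is gone)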

Some commands for managing Solaris Cluster quorum.

Command description | Examples
clquorum list
#  clquorum list -v
Quorum              Type
------              ----
d4                  shared_disk
unixlab-3           node
unixlab-2           node
clquorum show
# clquorum show
Node Name:            unixlab-2
  Node ID:            2
  Quorum Vote Count:  1
  Reservation Key:    0x50E49D1600000002

Quorum Device Name:   d4
  Enabled:            yes
  Votes:              1
  Global Name:        /dev/did/rdsk/d4s2
  Type:               shared_disk
  Access Mode:        scsi2
  Hosts (enabled):    unixlab-3, unixlab-2
clquorum status
# clquorum status
--- Quorum Votes Summary from (latest node reconfiguration) ---
            Needed   Present   Possible
            ------   -------   --------
            2        3         3

--- Quorum Votes by Node (current status) ---
Node Name       Present       Possible       Status
---------       -------       --------       ------
unixlab-3       1             1              Online
unixlab-2       1             1              Online

--- Quorum Votes by Device (current status) ---
Device Name       Present      Possible      Status
-----------       -------      --------      ------
d4                1            1             Online
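
A quorum device can be put into maintenance and brought back without removing it; the commandlog excerpt at the bottom of this page records exactly such calls. A sketch:

# clquorum disable d4   (maintenance state; the device's vote count drops to 0)
# clquorum enable d4    (restore the vote)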

Some commands for managing Solaris Cluster interconnect.

Command description | Examples
clinterconnect show
#  clinterconnect show
=== Transport Cables ===
Transport Cable:                                unixlab-3:e1000g3,switch1@1
  Endpoint1:                                       unixlab-3:e1000g3
  Endpoint2:                                       switch1@1
  State:                                           Enabled
Transport Cable:                                unixlab-3:e1000g2,switch2@1
  Endpoint1:                                       unixlab-3:e1000g2
  Endpoint2:                                       switch2@1
  State:                                           Enabled
=== Transport Switches ===
Transport Switch:                               switch1
  State:                                           Enabled
  Type:                                            switch
  Port Names:                                      1 2
  Port State(1):                                   Enabled
  Port State(2):                                   Enabled
Transport Switch:                               switch2
  State:                                           Enabled
  Type:                                            switch
  Port Names:                                      1 2
  Port State(1):                                   Enabled
  Port State(2):                                   Enabled
--- Transport Adapters for unixlab-3 ---
Transport Adapter:                              e1000g3
  State:                                           Enabled
  Transport Type:                                  dlpi
  device_name:                                     e1000g
  device_instance:                                 3
  lazy_free:                                       1
  dlpi_heartbeat_timeout:                          10000
  dlpi_heartbeat_quantum:                          1000
  nw_bandwidth:                                    80
  bandwidth:                                       70
  ip_address:                                      172.16.0.129
  netmask:                                         255.255.255.128
  Port Names:                                      0
  Port State(0):                                   Enabled
clinterconnect status
# clinterconnect status
=== Cluster Transport Paths ===
Endpoint1               Endpoint2               Status
---------               ---------               ------
unixlab-3:e1000g2       unixlab-2:e1000g2       Path online
unixlab-3:e1000g3       unixlab-2:e1000g3       Path online
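
A transport cable can be disabled and re-enabled for maintenance, identified by its endpoints from "clinterconnect show"; a sketch (keep at least one path online, and check clinterconnect(1CL) for the exact endpoint syntax):

# clinterconnect disable unixlab-3:e1000g2,switch2@1
# clinterconnect enable unixlab-3:e1000g2,switch2@1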

Shutting down and booting a Global Cluster

Command description | Examples
Shutting down a Global Cluster.
Run the cluster shutdown command on one node (from the serial console).
Both nodes will shut down.
Do not power off any node until all nodes are at the "ok" prompt (SPARC).
{unixlab-3}/# cluster shutdown -g10 -y
unixlab-3 cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
unixlab-3 cl_runtime: NOTICE: CMM: node reconfiguration #11 completed.
RGEVACUATE: Calling clzc halt -n unixlab-3 +
RGEVACUATE: Calling clnode evacuate
RGEVACUATE: disabling failfasts
svc.startd: The system is coming down.  Please wait.
svc.startd: 131 system services are now being stopped.
unixlab-3 nrpe[651]: Cannot remove pidfile '/var/run/nrpe.pid' - check your privileges.
unixlab-3 Cluster.Transport.Privipd: fatal: received signal 15
unixlab-3 Cluster.Transport.Cznetd: fatal: received signal 15
unixlab-3 cl_eventlogd[1264]: Going down on signal 15.
unixlab-3 syslogd: going down on signal 15
svc.startd: The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
Program terminated
{11} ok

The serial console messages from the second node:

{unixlab-2}/# cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
cl_runtime: NOTICE: CMM: node reconfiguration #11 completed.
RGEVACUATE: Calling clzc halt -n unixlab-2 +
RGEVACUATE: Calling clnode evacuate
RGEVACUATE: disabling failfasts
svc.startd: The system is coming down.  Please wait.
svc.startd: 130 system services are now being stopped.
cl_eventlogd[1248]: Going down on signal 15.
Cluster.Transport.Privipd: fatal: received signal 15
Cluster.Transport.Cznetd: fatal: received signal 15
syslogd: going down on signal 15
svc.startd: The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
NOTICE: CMM: Node unixlab-3 (nodeid = 1) is dead.
Program terminated
{13} ok
Booting a Global Cluster.
Run the 'boot' command from the 'ok' prompt on each node in the cluster.
The serial console messages from one node:

{13} ok boot
SC Alert: Host System has Reset
Sun Fire T200, No Keyboard
Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.30.0, 16256 MB memory available, Serial #74104598.
Ethernet address 0:14:4f:6a:bf:16, Host ID: 846abf16.
Boot device: disk:a  File and args:
SunOS Release 5.10 Version Generic_147440-01 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: unixlab-2
Booting in cluster mode
NOTICE: CMM: Node unixlab-3 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node unixlab-2 (nodeid = 2) with votecount = 1 added.
cl_runtime: WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
cl_runtime: NOTICE: clcomm: Adapter e1000g2 constructed
cl_runtime: NOTICE: clcomm: Adapter e1000g3 constructed
cl_runtime: NOTICE: CMM: Node unixlab-2: attempting to join cluster.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g2 - unixlab-3:e1000g2 online
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid: 1, incarnation #: 1357320249) has become reachable.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g3 - unixlab-3:e1000g3 online
cl_runtime: NOTICE: CMM: Cluster has reached quorum.
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid = 1) is up; new incarnation number = 1357320249.
cl_runtime: NOTICE: CMM: Node unixlab-2 (nodeid = 2) is up; new incarnation number = 1357320329.
cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
cl_runtime: NOTICE: CMM: node reconfiguration #3 completed.
cl_runtime: NOTICE: CMM: Node unixlab-2: joined cluster.
obtaining access to all attached disks

Verify the status of nodes in the cluster:

{unixlab-2}/# cluster status -t node
=== Cluster Nodes ===
--- Node Status ---
Node Name                 Status
---------                 ------
unixlab-3                 Online
unixlab-2                 Online

Shutting down and booting a Single Node in a Cluster

Command description | Examples
Switch all resource groups, resources,
and device groups from the node being shut down
to the other active nodes in the cluster.
Command: clnode evacuate

Then shut down the node.
{unixlab-2}/# clnode evacuate unixlab-2

{unixlab-2}/# shutdown -g0 -y -i0
Shutdown started.    09:56:16 AM PST
Changing to init state 0 - please wait
Broadcast Message from root (console) on unixlab-2 F
THE SYSTEM unixlab-2 IS BEING SHUT DOWN NOW ! ! !
Log off now or risk your files being damaged

{unixlab-2}/# RGEVACUATE: Calling clzc halt -n unixlab-2 +
RGEVACUATE: Calling clnode evacuate
RGEVACUATE: disabling failfasts
svc.startd: The system is coming down.  Please wait.
svc.startd: 129 system services are now being stopped.
Cluster.Transport.Privipd: fatal: received signal 15
Cluster.Transport.Cznetd: fatal: received signal 15
cl_eventlogd[988]: Going down on signal 15.
syslogd: going down on signal 15
umount: /global/.devices/node@2 busy
umount: /global/.devices/node@1 busy
svc.startd: The system is down.
syncing file systems... done
WARNING: CMM: Node being shut down.
Program terminated

Check cluster status on the remaining running node:

{unixlab-3}/# cluster status -t node,quorum
=== Cluster Nodes ===
--- Node Status ---
Node Name                                       Status
---------                                       ------
unixlab-3                                       Online
unixlab-2                                       Offline
=== Cluster Quorum ===
--- Quorum Votes Summary from (latest node reconfiguration) ---
            Needed   Present   Possible
            ------   -------   --------
            2        2         3
--- Quorum Votes by Node (current status) ---
Node Name       Present       Possible       Status
---------       -------       --------       ------
unixlab-3       1             1              Online
unixlab-2       0             1              Offline
--- Quorum Votes by Device (current status) ---
Device Name       Present      Possible      Status
-----------       -------      --------      ------
d4                1            1             Online
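
Note the quorum arithmetic: with unixlab-2 down, the remaining node's vote (1) plus the quorum device's vote (1) gives 2 present votes, which still meets the 2 needed, so the surviving node keeps the cluster up.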
Booting a Single Node
{5} ok boot
SC Alert: Host System has Reset
...
WARNING: /scsi_vhci/ssd@g60003baccc75000050e2094800021c74 (ssd0):
        reservation conflict
Booting in cluster mode
NOTICE: CMM: Node unixlab-3 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node unixlab-2 (nodeid = 2) with votecount = 1 added.
WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
NOTICE: clcomm: Adapter e1000g2 constructed
cl_runtime: NOTICE: clcomm: Adapter e1000g3 constructed
cl_runtime: NOTICE: CMM: Node unixlab-2: attempting to join cluster.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g3 - unixlab-3:e1000g3 online
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid: 1, incarnation #: 1357320249) has become reachable.
cl_runtime: WARNING: CMM: Open failed for quorum device /dev/did/rdsk/d4s2 with error 1.
cl_runtime: NOTICE: CMM: Cluster has reached quorum.
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid = 1) is up; new incarnation number = 1357320249.
cl_runtime: NOTICE: CMM: Node unixlab-2 (nodeid = 2) is up; new incarnation number = 1357322854.
cl_runtime: NOTICE: CMM: Cluster members: unixlab-3 unixlab-2.
cl_runtime: NOTICE: CMM: node reconfiguration #6 completed.
cl_runtime: NOTICE: CMM: Node unixlab-2: joined cluster.
cl_runtime: NOTICE: clcomm: Path unixlab-2:e1000g2 - unixlab-3:e1000g2 online
obtaining access to all attached disks
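
Not shown above: a node can also be brought up in non-cluster mode for maintenance by passing -x to boot at the same prompt (SPARC); the node then boots standalone without joining the cluster:

{5} ok boot -x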

Files location

Files | Examples
Cluster commands are logged in the file /var/cluster/logs/commandlog.
Commands that only display cluster configuration/state are not logged there.
{unixlab-3}/var/cluster/logs# tail commandlog
01/03/2013 16:20:59 unixlab-3 16802 root END 0
01/03/2013 16:21:03 unixlab-3 16796 root END 0
01/03/2013 17:11:07 unixlab-3 17014 root START - clquorum "disable"
01/03/2013 17:11:08 unixlab-3 17014 root END 3
01/03/2013 17:11:16 unixlab-3 17015 root START - clquorum "disable" "unixlab-2"
01/03/2013 17:11:16 unixlab-3 17015 root END 38
01/03/2013 17:11:23 unixlab-3 17016 root START - clquorum "disable" "d4"
01/03/2013 17:11:23 unixlab-3 17016 root END 9
01/03/2013 17:11:44 unixlab-3 17018 root START - clquorum "disable"
01/03/2013 17:11:44 unixlab-3 17018 root END 3
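
Since commandlog is plain text (a START line and an END line with the command's exit status per invocation), ordinary tools are enough for auditing; a sketch:

{unixlab-3}/# grep clquorum /var/cluster/logs/commandlog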
