A cluster is two or more nodes that work together to provide high availability for applications. Compare the two pictures: the first is the general example from the Oracle/Sun documentation and the second is my actual setup for this exercise.
Let's now do a real installation of Solaris Cluster 3.3u1. Both SunFire T2000 servers run Solaris 10 Update 10.
The StorEdge 6120 has a 50 GB slice/LUN configured and allows access from both servers.
The FC switch has no zoning configured for this exercise.
If you need information on how to configure FC switch zoning, see SAN zoning.
Both T2000s have slice 4 of the root disk mounted as the UFS /globaldevices file system.
The scinstall command later renames /globaldevices to, for example, /global/.devices/node@1, where 1 is the node number the host receives when it becomes a global-cluster member.
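For reference, here is a minimal sketch of how such a /globaldevices slice can be prepared on each node, assuming slice 4 of the root disk is still unformatted (device names as in this setup):

# Create a UFS file system on slice 4 of the root disk (newfs asks y/n before writing)
newfs /dev/rdsk/c0t0d0s4

# Create the mount point and mount it once by hand
mkdir -p /globaldevices
mount -F ufs /dev/dsk/c0t0d0s4 /globaldevices

# /etc/vfstab entry (one line) so it is mounted at boot:
# /dev/dsk/c0t0d0s4  /dev/rdsk/c0t0d0s4  /globaldevices  ufs  2  yes  -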
Let's compare the two hosts in the table below and see the corresponding actions before/during/after installation:
Node | unixlab-2 | unixlab-3 |
/etc/hosts file | add unixlab-3 here | add unixlab-2 here |
from /etc/vfstab file | /dev/dsk/c0t0d0s1 - - swap - no - /dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0 / ufs 1 no - /dev/dsk/c0t0d0s3 /dev/rdsk/c0t0d0s3 /var ufs 1 no - /dev/dsk/c0t0d0s5 /dev/rdsk/c0t0d0s5 /.0 ufs 2 yes - /dev/dsk/c0t0d0s6 /dev/rdsk/c0t0d0s6 /backup ufs 2 yes - /dev/dsk/c0t0d0s4 /dev/rdsk/c0t0d0s4 /globaldevices ufs 2 yes - |
same here |
from format command | 3. c4t60003BACCC75000050E2094800021C74d0 <SUN-T4-0302-50.00GB> /scsi_vhci/ssd@g60003baccc75000050e2094800021c74 |
same here |
Each node has to SSH to the other without a password | Create an ssh key with no passphrase: ssh-keygen -t dsa -b 1024. Copy the public key root_id_dsa.pub to unixlab-3 and place it in the file /.ssh/authorized_keys. Add the line "IdentityFile /.ssh/root_id_dsa" to the file unixlab-2:/etc/ssh/ssh_config (see the command sketch after the table). |
repeat the same here and copy the public key to unixlab-2 |
Get the Solaris Cluster software | Download and unzip solaris-cluster-3_3u1-ga-sparc.zip, then go to the directory Solaris_sparc | repeat the same here |
Installation | {unixlab-2}/tmp/Solaris_sparc# ./installer Unable to access a usable display on the remote system. Continue in command-line mode?(Y/N) Y Java Accessibility Bridge for GNOME loaded. Installation Type ----------------- Do you want to install the full set of Oracle Solaris Cluster Products and Services? (Yes/No) [Yes] {"<" goes back, "!" exits} No Choose Software Components - Main Menu ------------------------------- Note: "* *" indicates that the selection is disabled [ ] 1. Oracle Solaris Cluster Geographic Edition 3.3u1 [ ] 2. Quorum Server [ ] 3. High Availability Session Store 4.4.3 [ ] 4. Oracle Solaris Cluster 3.3u1 [ ] 5. Java DB 10.2.2.1 [ ] 6. Oracle Solaris Cluster Agents 3.3u1 Enter a comma separated list of products to install, or press R to refresh the list [] {"<" goes back, "!" exits}: 4,6 Choose Software Components - Confirm Choices -------------------------------------------- Based on product dependencies for your selections, the installer will install: [X] 4. Oracle Solaris Cluster 3.3u1 * * Java DB 10.2.2.1 [X] 6. Oracle Solaris Cluster Agents 3.3u1 Component Selection - Selected Product "Oracle Solaris Cluster 3.3u1" --------------------------------------------------------------------- ** * Oracle Solaris Cluster Core *[X] 2. Oracle Solaris Cluster Manager Component Selection - Selected Product "Java DB 10.2.2.1" --------------------------------------------------------- ** * Java DB Client ** * Java DB Server Component Selection - Selected Product "Oracle Solaris Cluster Agents 3.3u1" ---------------------------------------------------------------------------- *[X] 1. Oracle Solaris Cluster HA for Java System Application Server *[X] 2. Oracle Solaris Cluster HA for Java System Message Queue *[X] 3. Oracle Solaris Cluster HA for Java System Directory Server *[X] 4. Oracle Solaris Cluster HA for Java System Messaging Server *[X] 5. Oracle Solaris Cluster HA for Application Server EE (HADB) *[X] 6. Oracle Solaris Cluster HA/Scalable for Java System Web Server *[X] 7. Oracle Solaris Cluster HA for Instant Messaging *[X] 8. Oracle Solaris Cluster HA for Java System Calendar Server *[X] 9. Oracle Solaris Cluster HA for Apache Tomcat *[X] 10. Oracle Solaris Cluster HA for Apache *[X] 11. Oracle Solaris Cluster HA for DHCP *[X] 12. Oracle Solaris Cluster HA for DNS *[X] 13. Oracle Solaris Cluster HA for MySQL *[X] 14. Oracle Solaris Cluster HA for Sun N1 Service Provisioning System *[X] 15. Oracle Solaris Cluster HA for NFS *[X] 16. Oracle Solaris Cluster HA for Oracle *[X] 17. Oracle Solaris Cluster HA for Agfa IMPAX *[X] 18. Oracle Solaris Cluster HA for Samba Enter a comma separated list of components to install (or A to install all ) [A] {"<" goes back, "!" exits} 10,15,18 *[X] 10. Oracle Solaris Cluster HA for Apache *[X] 15. Oracle Solaris Cluster HA for NFS *[X] 18. Oracle Solaris Cluster HA for Samba Checking System Status Available disk space... : Checking .... OK Memory installed... : Checking .... OK Swap space installed... : Checking .... OK Operating system patches... : Checking .... OK Operating system resources... : Checking .... OK System ready for installation Screen for selecting Type of Configuration 1. Configure Now - Selectively override defaults or express through 2. Configure Later - Manually configure following installation Select Type of Configuration [1] {"<" goes back, "!" exits} 2 Ready to Install ---------------- The following components will be installed. 
Product: Oracle Solaris Cluster Uninstall Location: /var/sadm/prod/SUNWentsyssc33u1 Space Required: 513.57 MB --------------------------------------------------- Java DB Java DB Server Java DB Client Oracle Solaris Cluster 3.3u1 Oracle Solaris Cluster Core Oracle Solaris Cluster Manager Oracle Solaris Cluster Agents 3.3u1 Oracle Solaris Cluster HA for Apache Oracle Solaris Cluster HA for NFS Oracle Solaris Cluster HA for Samba 1. Install 2. Start Over 3. Exit Installation What would you like to do [1] Oracle Solaris Cluster |-1%--------------25%-----------------50%-----------------75%--------------100%| Installation Complete |
same installation here |
Adjust ${PATH}/${MANPATH} |
The tcsh shell: setenv PATH /usr/cluster/bin:${PATH} setenv MANPATH /usr/cluster/man:${MANPATH} (a Bourne/ksh equivalent is sketched after the table) |
same here |
Configure cluster on all nodes with scinstall (as we selected to do it later). Do from SERIAL CONSOLE, reboot needed. | Configuration performed only on unixlab-2. unixlab-3 reboots first, once unixlab-3 is up, unixlab-2 reboots. {unixlab-2}/# scinstall *** Main Menu *** Please select from one of the following (*) options: * 1) Create a new cluster or add a cluster node 2) Configure a cluster to be JumpStarted from this install server 3) Manage a dual-partition upgrade 4) Upgrade this cluster node * 5) Print release information for this cluster node Option: 1 *** New Cluster and Cluster Node Menu *** Please select from any one of the following options: 1) Create a new cluster 2) Create just the first node of a new cluster on this machine 3) Add this machine as a node in an existing cluster Option: 1 *** Create a New Cluster *** This option creates and configures a new cluster. If the "remote configuration" option is unselected from the Oracle Solaris Cluster installer when you install the Oracle Solaris Cluster framework on any of the new nodes, then you must configure either the remote shell (see rsh(1)) or the secure shell (see ssh(1)) before you select this option. If rsh or ssh is used, you must enable root access to all of the new member nodes from this node. Press Control-d at any time to return to the Main Menu. Do you want to continue (yes/no) [yes]? yes ### Typical or Custom Mode This tool supports two modes of operation, Typical mode and Custom. For most clusters, you can use Typical mode. However, you might need to select the Custom mode option if not all of the Typical defaults can be applied to your cluster. Please select from one of the following options: 1) Typical 2) Custom Option [1]: 1 ### Cluster Name Each cluster has a name assigned to it. The name can be made up of any characters other than whitespace. Each cluster name should be unique within the namespace of your enterprise. What is the name of the cluster you want to establish? suncluster ### Cluster Nodes This Oracle Solaris Cluster release supports a total of up to 16 nodes. Please list the names of the other nodes planned for the initial cluster configuration. List one node name per line. When finished, type Control-D: Node name (Control-D to finish): unixlab-2 Node name (Control-D to finish): unixlab-3 This is the complete list of nodes: unixlab-2 unixlab-3 Is it correct (yes/no) [yes]? Attempting to contact "unixlab-3" ... done Searching for a remote configuration method ... done ### Cluster Transport Adapters and Cables You must identify the cluster transport adapters which attach this node to the private cluster interconnect. Select the first cluster transport adapter: 1) e1000g1 2) e1000g2 3) e1000g3 4) Other Option: 3 Will this be a dedicated cluster transport adapter (yes/no) [yes]? Searching for any unexpected network traffic on "e1000g3" ... done Verification completed. No traffic was detected over a 10 second sample period. Select the second cluster transport adapter: 1) e1000g1 2) e1000g2 3) e1000g3 4) Other Option: 2 Will this be a dedicated cluster transport adapter (yes/no) [yes]? Searching for any unexpected network traffic on "e1000g2" ... done Verification completed. No traffic was detected over a 10 second sample period. Plumbing network address 172.16.0.0 on adapter e1000g3 >> NOT DUPLICATE ... done Plumbing network address 172.16.0.0 on adapter e1000g2 >> NOT DUPLICATE ... done ### Quorum Configuration Every two-node cluster requires at least one quorum device. 
By default, scinstall selects and configures a shared disk quorum device for you. This screen allows you to disable the automatic selection and configuration of a quorum device. You have chosen to turn on the global fencing. If your shared storage devices do not support SCSI, such as Serial Advanced Technology Attachment (SATA) disks, or if your shared disks do not support SCSI-2, you must disable this feature. If you disable automatic quorum device selection now, or if you intend to use a quorum device that is not a shared disk, you must instead use clsetup(1M) to manually configure quorum once both nodes have joined the cluster for the first time. Do you want to disable automatic quorum device selection (yes/no) [no]? Is it okay to create the new cluster (yes/no) [yes]? During the cluster creation process, cluster check is run on each of the new cluster nodes. If cluster check detects problems, you can either interrupt the process or check the log files after the cluster has been established. Interrupt cluster creation for cluster check errors (yes/no) [no]? Cluster Creation Testing for "/globaldevices" on "unixlab-2" ... done Testing for "/globaldevices" on "unixlab-3" ... done Starting discovery of the cluster transport configuration. The following connections were discovered: unixlab-2:e1000g3 switch1 unixlab-3:e1000g3 unixlab-2:e1000g2 switch2 unixlab-3:e1000g2 Completed discovery of the cluster transport configuration. Started cluster check on "unixlab-2". Started cluster check on "unixlab-3". cluster check failed for "unixlab-2". cluster check failed for "unixlab-3". The cluster check command failed on both of the nodes. Refer to the log file for details. The name of the log file is /var/cluster/logs/install/scinstall.log.15266. Configuring "unixlab-3" ... done Rebooting "unixlab-3" ... done Configuring "unixlab-2" ... done Rebooting "unixlab-2" ... SERIAL CONSOLE MESSAGES: Booting in cluster mode CMM: Node unixlab-3 (nodeid = 1) with votecount = 1 added. CMM: Node unixlab-2 (nodeid = 2) with votecount = 0 added. clcomm: Adapter e1000g2 constructed clcomm: Adapter e1000g3 constructed CMM: Node unixlab-2: attempting to join cluster. clcomm: Path unixlab-2:e1000g2 - unixlab-3:e1000g2 online CMM: Node unixlab-3 (nodeid: 1, incarnation #: 1357159875) has become reachable. CMM: Cluster has reached quorum. CMM: Node unixlab-3 (nodeid = 1) is up; new incarnation number = 1357159875. CMM: Node unixlab-2 (nodeid = 2) is up; new incarnation number = 1357160079. CMM: Cluster members: unixlab-3 unixlab-2. CMM: node reconfiguration #3 completed. CMM: Node unixlab-2: joined cluster. clcomm: Path unixlab-2:e1000g3 - unixlab-3:e1000g3 online DID subpath "/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0s2" created for instance "4". did instance 6 created. did subpath unixlab-2:/dev/rdsk/c0t0d0 created for instance 6. did instance 7 created. did subpath unixlab-2:/dev/rdsk/c0t1d0 created for instance 7. did instance 8 created. did subpath unixlab-2:/dev/rdsk/c0t2d0 created for instance 8. Configuring DID devices obtaining access to all attached disks Configuring the /dev/global directory (global devices) SCPOSTCONFIG: Configuring Oracle Solaris Cluster quorum... SCPOSTCONFIG: clquorum: (C192716) I/O error. SCPOSTCONFIG: Will add the following quorum devices: SCPOSTCONFIG: /dev/did/rdsk/d4s2 SCPOSTCONFIG: scquorumconfig: Quorum autoconfig failed SCPOSTCONFIG: The quorum configuration task encountered a problem on node unixlab-2, manual configuration by using clsetup(1CL) might be necessary |
No need to run the configuration here; it is done from unixlab-2. unixlab-3 reboots at this point; watch the serial console messages (a quick post-reboot check is sketched after the table). |
/etc/vfstab | #/dev/dsk/c0t0d0s4 /dev/rdsk/c0t0d0s4 /globaldevices ufs 2 yes - /dev/did/dsk/d6s4 /dev/did/rdsk/d6s4 /global/.devices/node@2 ufs 2 no global NOTE: node@2 because this host became cluster node number 2 |
#/dev/dsk/c0t0d0s4 /dev/rdsk/c0t0d0s4 /globaldevices ufs 2 yes - /dev/did/dsk/d1s4 /dev/did/rdsk/d1s4 /global/.devices/node@1 ufs 2 no global NOTE: node@1 because this host became cluster node number 1 |
Global devices, local filesystems | {unixlab-2}# df -h | grep global /dev/did/dsk/d1s4 3.9G 6.6M 3.9G 1% /global/.devices/node@1 /dev/did/dsk/d6s4 3.9G 6.6M 3.9G 1% /global/.devices/node@2 |
Same here: {unixlab-3}# df -h | grep global /dev/did/dsk/d1s4 3.9G 6.6M 3.9G 1% /global/.devices/node@1 /dev/did/dsk/d6s4 3.9G 6.6M 3.9G 1% /global/.devices/node@2 |
Network interfaces | e1000g0: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER> mtu 1500 index 2 inet 192.168.28.215 netmask ffffff00 broadcast 192.168.28.255 groupname sc_ipmp0 ether 0:14:4f:6a:bf:16 e1000g2: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 3 inet 172.16.1.2 netmask ffffff80 broadcast 172.16.1.127 ether 0:14:4f:6a:bf:18 e1000g3: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 4 inet 172.16.0.130 netmask ffffff80 broadcast 172.16.0.255 ether 0:14:4f:6a:bf:19 clprivnet0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 5 inet 172.16.4.2 netmask fffffe00 broadcast 172.16.5.255 ether 0:0:0:0:0:2 |
e1000g0: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER> mtu 1500 index 2 inet 192.168.28.216 netmask ffffff00 broadcast 192.168.28.255 groupname sc_ipmp0 ether 0:14:4f:82:5:46 e1000g2: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 6 inet 172.16.1.1 netmask ffffff80 broadcast 172.16.1.127 ether 0:14:4f:82:5:48 e1000g3: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 4 inet 172.16.0.129 netmask ffffff80 broadcast 172.16.0.255 ether 0:14:4f:82:5:49 clprivnet0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 5 inet 172.16.4.1 netmask fffffe00 broadcast 172.16.5.255 ether 0:0:0:0:0:1 |
Manually add LUN as quorum device (command clsetup, since "Quorum autoconfig failed" | {unixlab-2}# clsetup ### Initial Cluster Setup This program has detected that the cluster "installmode" attribute is still enabled. As such, certain initial cluster setup steps will be performed at this time. This includes adding any necessary quorum devices, then resetting both the quorum vote counts and the "installmode" property. Please do not proceed if any additional nodes have yet to join the cluster. Is it okay to continue (yes/no) [yes]? Do you want to add any quorum devices (yes/no) [yes]? Following are supported Quorum Devices types in Oracle Solaris Cluster. Please refer to Oracle Solaris Cluster documentation for detailed information on these supported quorum device topologies. What is the type of device you want to use? 1) Directly attached shared disk 2) Network Attached Storage (NAS) from Network Appliance 3) Quorum Server Option: 1 ### Add a Shared Disk Quorum Device If you are using a dual-ported disk, by default, Oracle Solaris Cluster uses SCSI-2. If you are using disks that are connected to more than two nodes, or if you manually override the protocol from SCSI-2 to SCSI-3, by default, Oracle Solaris Cluster uses SCSI-3. If you turn off SCSI fencing for disks, Oracle Solaris Cluster uses software quorum, which is Oracle Solaris Cluster software that emulates a form of SCSI Persistent Group Reservations (PGR). Warning: If you are using disks that do not support SCSI, such as Serial Advanced Technology Attachment (SATA) disks, turn off SCSI fencing. Is it okay to continue (yes/no) [yes]? Which global device do you want to use (dN)? d4 Is it okay to proceed with the update (yes/no) [yes]? /usr/cluster/bin/clquorum add d4 Command completed successfully. Do you want to add another quorum device (yes/no) [yes]? no Once the "installmode" property has been reset, this program will skip "Initial Cluster Setup" each time it is run again in the future. However, quorum devices can always be added to the cluster using the regular menu options. Resetting this property fully activates quorum settings and is necessary for the normal and safe operation of the cluster. Is it okay to reset "installmode" (yes/no) [yes]? /usr/cluster/bin/clquorum reset /usr/cluster/bin/claccess deny-all Cluster initialization is complete. Serial console messages: CMM: Cluster members: unixlab-3 unixlab-2. CMM: node reconfiguration #4 completed. CMM: Votecount changed from 0 to 1 for node unixlab-2. CMM: Cluster members: unixlab-3 unixlab-2. CMM: Quorum device 1 (/dev/did/rdsk/d4s2) added; votecount = 1 |
Nothing to be done here; the quorum device is added from unixlab-2 (the equivalent clquorum commands are sketched after the table). |
Quick check of quorum devices | {unixlab-2}/# cldevice list -v DID Device Full Device Path ---------- ---------------- d1 unixlab-3:/dev/rdsk/c0t0d0 d2 unixlab-3:/dev/rdsk/c0t1d0 d3 unixlab-3:/dev/rdsk/c0t2d0 d4 unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0 d4 unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0 d6 unixlab-2:/dev/rdsk/c0t0d0 d7 unixlab-2:/dev/rdsk/c0t1d0 d8 unixlab-2:/dev/rdsk/c0t2d0 {unixlab-2}/# clquorum list -v Quorum Type ------ ---- d4 shared_disk unixlab-3 node unixlab-2 node {unixlab-2}/# clquorum show === Cluster Nodes === Node Name: unixlab-3 Node ID: 1 Quorum Vote Count: 1 Reservation Key: 0x50E49D1600000001 Node Name: unixlab-2 Node ID: 2 Quorum Vote Count: 1 Reservation Key: 0x50E49D1600000002 === Quorum Devices === Quorum Device Name: d4 Enabled: yes Votes: 1 Global Name: /dev/did/rdsk/d4s2 Type: shared_disk Access Mode: scsi2 Hosts (enabled): unixlab-3, unixlab-2 |
same here |
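For the passwordless root SSH row above, a minimal command sketch as run on unixlab-2 (the -f and -N options and the scp staging path are my assumptions; the table only specifies ssh-keygen -t dsa -b 1024 and the file names):

# Generate a DSA key with no passphrase, stored as /.ssh/root_id_dsa (root's home is /)
ssh-keygen -t dsa -b 1024 -f /.ssh/root_id_dsa -N ""

# Copy the public key to unixlab-3 and append it to root's authorized_keys there
scp /.ssh/root_id_dsa.pub unixlab-3:/tmp/
ssh unixlab-3 'mkdir -p /.ssh && cat /tmp/root_id_dsa.pub >> /.ssh/authorized_keys'

# Point the ssh client at this identity
echo "IdentityFile /.ssh/root_id_dsa" >> /etc/ssh/ssh_config

# Verify: should return the remote hostname without asking for a password
ssh unixlab-3 hostname

Repeat in the other direction on unixlab-3.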
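The PATH/MANPATH row above gives the tcsh syntax; for a Bourne/ksh/bash login shell the equivalent (my addition, not from the table) is:

PATH=/usr/cluster/bin:$PATH; export PATH
MANPATH=/usr/cluster/man:$MANPATH; export MANPATH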
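Once both nodes have rebooted into cluster mode after the scinstall step, a quick sanity check from either node (standard cluster commands, nothing specific to this setup):

# Both nodes should be listed as Online
clnode status

# Overall view: nodes, transport paths, quorum, device groups
cluster status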
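The clsetup dialog in the quorum row above ends up running the clquorum commands printed in its output; the same result can be obtained directly from the command line (d4 being the shared DID device in this setup):

# Add the shared LUN as a quorum device
/usr/cluster/bin/clquorum add d4

# Reset the installmode property so the quorum votes take effect (clsetup does this for you)
/usr/cluster/bin/clquorum reset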
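In addition to the listing in the last row, the quorum vote counts and DID device state can be checked at a glance:

{unixlab-2}# clquorum status
{unixlab-2}# cldevice status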