
Solaris Cluster HA for NFS

A separate document covers the Solaris Cluster installation; here I'll continue with setting up Cluster HA for NFS.

The Solaris Cluster agent for NFS was already installed on both nodes during the cluster installation.
{unixlab-2}/# pkginfo -l SUNWscnfs
   PKGINST:  SUNWscnfs
      NAME:  Oracle Solaris Cluster NFS Server Component
  CATEGORY:  application
      ARCH:  sparc
   VERSION:  3.3.0,REV=2010.07.26.12.56
   BASEDIR:  /opt
    VENDOR:  Oracle Corporation
      DESC:  Oracle Solaris Cluster nfs server data service
    PSTAMP:  04/05/2011.14:42:22
  INSTDATE:  Jan 02 2013 10:54
   HOTLINE:  Please contact your local service provider
    STATUS:  completely installed
     FILES:       38 installed pathnames
                   6 shared pathnames
                  14 directories
                  16 executables
                1644 blocks used (approx)
In the cluster, the NFS server is managed by the cluster software, so the nfs/server service is disabled on the nodes.
Also, the Cluster NFS data service sets two properties on each of the following three services:
{unixlab-2}/# foreach i (nfs/server nfs/status nlockmgr)
foreach? echo -------- $i -------
foreach? svccfg -s $i listprop | egrep "application/auto_enable|startd/duration"
foreach? end
-------- nfs/server -------
application/auto_enable   boolean  false
startd/duration           astring  transient
-------- nfs/status -------
application/auto_enable   boolean  false
startd/duration           astring  transient
-------- nlockmgr -------
application/auto_enable   boolean  false
startd/duration           astring  transient
The hostname/IP pair (unixlab-1/192.168.28.202) will be used by NFS clients to access the NFS service, so add the line "192.168.28.202 unixlab-1" to the /etc/hosts file on both nodes, and register it in DNS as well.
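A quick sanity check that both nodes resolve the logical hostname (the output shown is simply what a correct entry returns, not a capture from the lab):

{unixlab-2}/# getent hosts unixlab-1
192.168.28.202  unixlab-1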

The NFS data will be on the device d4 (on StorEdge 6120).
{unixlab-3}/# cldevice list -v
DID Device          Full Device Path
----------          ----------------
d1                  unixlab-3:/dev/rdsk/c0t0d0
d2                  unixlab-3:/dev/rdsk/c0t1d0
d3                  unixlab-3:/dev/rdsk/c0t2d0
d4                  unixlab-2:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
d4                  unixlab-3:/dev/rdsk/c4t60003BACCC75000050E2094800021C74d0
d6                  unixlab-2:/dev/rdsk/c0t0d0
d7                  unixlab-2:/dev/rdsk/c0t1d0
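Before creating the filesystem, it can be useful to check which node currently acts as primary for the raw-disk device group that backs d4 (output not captured in this lab; the command simply lists each device group with its primary and secondary node):

{unixlab-2}/# cldevicegroup status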
Let's put the actions in a table so we can compare both nodes.
Action | unixlab-2 | unixlab-3
Create slice 0 as 50 GB
on the shared disk (StorEdge 6120); a sketch of the format session follows this step.
partition# p
Part      Tag Flag Cylinders Size            Blocks
0 unassigned  wm   0 - 51197 50.00GB    (51198/0/0) 104853504
1 unassigned  wu   0         0         (0/0/0)             0
2     backup  wu   0 - 51197 50.00GB    (51198/0/0) 104853504
3 unassigned  wm   0         0         (0/0/0)             0
4 unassigned  wm   0         0         (0/0/0)             0
5 unassigned  wm   0         0         (0/0/0)             0
6 unassigned  wm   0         0         (0/0/0)             0
7 unassigned  wm   0         0         (0/0/0)             0

partition# l
[0] SMI Label
[1] EFI Label
Specify Label type[0]:
Ready to label disk, continue? y
do nothing here
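For reference, the slice above was created interactively with the format utility on the shared LUN; a rough sketch of such a session, with the menu dialogue abbreviated (the device name is taken from the cldevice listing above):

{unixlab-2}/# format c4t60003BACCC75000050E2094800021C74d0
format> partition
partition> 0          (set slice 0 to 50.00gb, tag/flags as shown above)
partition> label
partition> quit
format> quit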
Create new filesystem
{unixlab-2}/# newfs -v /dev/did/rdsk/d4s0
newfs: /dev/did/rdsk/d4s0 last mounted as /ha-nfs
newfs: construct a new file system /dev/did/rdsk/d4s0: (y/n)? y
pfexec mkfs -F ufs /dev/did/rdsk/d4s0 104853504 128 -1 8192 ... 
/dev/did/rdsk/d4s0:     104853504 sectors in 17066 cylinders of 48 tracks, 128 sectors
        51198.0MB in 1067 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
.....................
super-block backups for last 10 cylinder groups at:
 103911584, 104010016, 104108448, 104206880, 104305312, 104403744, 104502176,
do nothing here
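Optionally, a quick consistency check of the freshly created filesystem can be run before mounting it (not captured in this lab, just a common sanity step):

{unixlab-2}/# fsck -F ufs /dev/did/rdsk/d4s0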
Create directory /ha-nfs

update /etc/vfstab ...

and mount filesystem
...with line
/dev/did/dsk/d4s0 /dev/did/rdsk/d4s0 /ha-nfs ufs 2 no global,logging
... and mount /ha-nfs
add the same line to /etc/vfstab; the filesystem should already be mounted there thanks to the "global" option (a sketch of the full sequence follows)
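A minimal sketch of that sequence on unixlab-2 (vi stands in for whatever editor you prefer):

{unixlab-2}/# mkdir /ha-nfs
{unixlab-2}/# vi /etc/vfstab     (add the vfstab line shown above)
{unixlab-2}/# mount /ha-nfs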
In /ha-nfs, create two sub-directories: admin and data (a command sketch follows the df output).
The admin directory will be used by the Cluster NFS resource to maintain administrative info.
The data directory will store the NFS data for clients.
{unixlab-2}/# df -h /ha-nfs
Filesystem             size   used  avail capacity  Mounted on
/dev/did/dsk/d4s0       49G    50M    49G     1%    /ha-nfs
{unixlab-3}/# df -h /ha-nfs
Filesystem        size used avail cap Mounted on
/dev/did/dsk/d4s0 49G  50M  49G   1%  /ha-nfs
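Since /ha-nfs is a global filesystem, the two directories only need to be created once, from either node; a minimal sketch:

{unixlab-2}/# mkdir /ha-nfs/admin /ha-nfs/data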
Do the following on ONE node only:
Create a resource group to contain the
NFS resources.
Reminder: a Pathprefix directory on the global filesystem is needed for the NFS resource to maintain its admin info.

{unixlab-2}/# clresourcegroup create -n unixlab-2,unixlab-3 -p Pathprefix=/ha-nfs/admin resgroup-nfs

{unixlab-2}/# clresourcegroup list -v
Resource Group      Mode                Overall status
--------------      ----                --------------
resgroup-nfs        Failover            not_online

{unixlab-2}/# clresourcegroup show
=== Resource Groups and Resources ===
Resource Group:                                 resgroup-nfs
  RG_description:                                  NULL
  RG_mode:                                         Failover
  RG_state:                                        Unmanaged
  Failback:                                        False
  Nodelist:                                        unixlab-2 unixlab-3

{unixlab-2}/# clresourcegroup status
=== Cluster Resource Groups ===
Group Name        Node Name      Suspended      Status
----------        ---------      ---------      ------
resgroup-nfs      unixlab-2      No             Unmanaged
                  unixlab-3      No             Unmanaged
do nothing here
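To confirm that the Pathprefix property was set as intended, the property can be queried directly (output omitted, as it was not captured in this lab):

{unixlab-2}/# clresourcegroup show -p Pathprefix resgroup-nfs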
Create Logical Hostname resource
{unixlab-2}/# clreslogicalhostname create -v -g resgroup-nfs -h unixlab-1 res-logicalhostname
Resource "res-logicalhostname" created.
resource res-logicalhostname marked as enabled

{unixlab-2}/# clreslogicalhostname list -v
Resource Name         Resource Type            Resource Group
-------------         -------------            --------------
res-logicalhostname   SUNW.LogicalHostname:4   resgroup-nfs

{unixlab-2}/# clreslogicalhostname show
=== Resources ===
Resource:                                       res-logicalhostname
  Type:                                            SUNW.LogicalHostname:4
  Type_version:                                    4
  Group:                                           resgroup-nfs
  R_description:
  Resource_project_name:                           default
  Enabled{unixlab-2}:                              True
  Enabled{unixlab-3}:                              True
  Monitored{unixlab-2}:                            True
  Monitored{unixlab-3}:                            True

{unixlab-2}/# clreslogicalhostname status
=== Cluster Resources ===
Resource Name           Node Name    State      Status Message
-------------           ---------    -----      --------------
res-logicalhostname     unixlab-2    Offline    Offline
                        unixlab-3    Offline    Offline
do nothing here
Create the directory /ha-nfs/admin/SUNW.nfs
and the file dfstab.res-nfs (a command sketch follows this step).
The file /ha-nfs/admin/SUNW.nfs/dfstab.res-nfs contains this line:

share -F nfs -o rw -d "NFS Cluster on unixlab-1" /ha-nfs/data

Also run the share command:

{unixlab-2}/# share -F nfs -o rw -d "NFS Cluster on unixlab-1" /ha-nfs/data

{unixlab-2}/# share
-               /ha-nfs/data   rw   "NFS Cluster on unixlab-1"

As a reminder, local SMF shows that the nfs/server service is disabled:

{unixlab-2}/# svcs nfs/server
STATE          STIME    FMRI
disabled       Jan_04   svc:/network/nfs/server:default

Check the exported filesystem:
{unixlab-2}/# showmount -e unixlab-2
export list for unixlab-2:
/ha-nfs/data (everyone)
do nothing here
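A minimal sketch of creating the administrative directory and the dfstab file (echo is just one convenient way to write the single share line):

{unixlab-2}/# mkdir -p /ha-nfs/admin/SUNW.nfs
{unixlab-2}/# echo 'share -F nfs -o rw -d "NFS Cluster on unixlab-1" /ha-nfs/data' > /ha-nfs/admin/SUNW.nfs/dfstab.res-nfs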
Verify the registered resource types:
{unixlab-2}/# clresourcetype list -v
Resource Type            Node List
-------------            ---------
SUNW.LogicalHostname:4   All
SUNW.SharedAddress:2     All
SUNW.HAStoragePlus:9     All
SUNW.nfs:3.3             All

If SUNW.nfs is missing, register it with this command:
{unixlab-2}/# clresourcetype register SUNW.nfs
{unixlab-2}/# clresourcetype show SUNW.nfs
=== Registered Resource Types ===
Resource Type:         SUNW.nfs:3.3
  RT_description:      HA-NFS for Oracle Solaris Cluster
  RT_version:          3.3
  API_version:         2
  RT_basedir:          /opt/SUNWscnfs/bin
  Single_instance:     False
  Proxy:               False
  Init_nodes:          All potential masters
  Installed_nodes:     All
  Failover:            True
  Pkglist:             SUNWscnfs
  RT_system:           False
  Global_zone:         False
do nothing here
Create the NFS resource:
{unixlab-2}/# clresource create -g resgroup-nfs -t SUNW.nfs res-nfs

{unixlab-2}/# clresource list -v
Resource Name         Resource Type            Resource Group
-------------         -------------            --------------
res-nfs               SUNW.nfs:3.3             resgroup-nfs
res-logicalhostname   SUNW.LogicalHostname:4   resgroup-nfs

{unixlab-2}/# clresource status
=== Cluster Resources ===
Resource Name           Node Name    State      Status Message
-------------           ---------    -----      --------------
res-nfs                 unixlab-2    Offline    Offline
                        unixlab-3    Offline    Offline

res-logicalhostname     unixlab-2    Offline    Offline
                        unixlab-3    Offline    Offline

Since the resource group is not online yet, bring it online now.

{unixlab-2}/# clresourcegroup list -v
Resource Group      Mode                Overall status
--------------      ----                --------------
resgroup-nfs        Failover            not_online

{unixlab-2}/# clresourcegroup online -v -M resgroup-nfs
resource group resgroup-nfs state changed from unmanaged state to managed offline state.
Set all the specified resource groups to managed state.
resource group resgroup-nfs rebalanced successfully

{unixlab-2}/# clresourcegroup list -v
Resource Group      Mode                Overall status
--------------      ----                --------------
resgroup-nfs        Failover            online

{unixlab-2}/# clresourcegroup status
=== Cluster Resource Groups ===
Group Name        Node Name      Suspended      Status
----------        ---------      ---------      ------
resgroup-nfs      unixlab-2      No             Online
                  unixlab-3      No             Offline
Check from this node:
{unixlab-3}/# clresourcegroup status
=== Cluster Resource Groups ===
Group Name        Node Name      Suspended      Status
----------        ---------      ---------      ------
resgroup-nfs      unixlab-2      No             Online
                  unixlab-3      No             Offline

{unixlab-3}/# clresource status
=== Cluster Resources ===
Resource Name       Node Name State   Status Message
-------------       --------- -----   --------------
res-nfs             unixlab-2 Online  Online - Service is online.
                    unixlab-3 Offline Offline

res-logicalhostname unixlab-2 Online  Online - LogicalHostname online.
                    unixlab-3 Offline Offline

The NFS client can ping the logical hostname and see the exported filesystem.
{client}/# showmount -e unixlab-1
export list for unixlab-1:
/ha-nfs/data (everyone)

{client}/# mount -F nfs unixlab-1:/ha-nfs/data /mnt
{client}/# cd /mnt
{client}/mnt# ls -la
total 13
drwxrwxrwx   5 root     root         512 Jan  7 11:05 .
drwxr-xr-x  29 root     root          37 Dec  9 16:12 ..
drwxrwxrwx   2 root     root         512 Jan  7 11:05 iso
drwxrwxrwx   2 root     root         512 Jan  7 11:05 projects
drwxrwxrwx   2 root     root         512 Jan  7 11:05 software
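If the client should mount the share persistently across reboots, a vfstab entry like the following would do it (this assumes a Solaris client; /mnt is just the mount point used in this test):

unixlab-1:/ha-nfs/data  -  /mnt  nfs  -  yes  rw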
For example, open the file /mnt/projects/myproj.txt and write some text:
~
this is project with testing solaris cluster nfs ...
~

Action | unixlab-2 | unixlab-3
Keep the file open, and let's now evacuate the resource groups
and device groups from unixlab-2 to the other node.
{unixlab-2}/# clnode evacuate -v unixlab-2
clnode:  resource group resgroup-nfs stopped successfully
They are now online on the other node.
{unixlab-3}/# clresource status
=== Cluster Resources ===
Resource Name       Node Name State   Status Message
-------------       --------- -----   --------------
res-nfs             unixlab-2 Offline Offline - Completed successfully.
                    unixlab-3 Online  Online - Successfully started NFS service.

res-logicalhostname unixlab-2 Offline Offline - LogicalHostname offline.
                    unixlab-3 Online  Online - LogicalHostname online.

The NFS client continues editing the file, no problem!

Action | unixlab-2 | unixlab-3
And now let's unplug the power cables on unixlab-3.
Message from unixlab-2 serial console:
cl_runtime: NOTICE: CMM: Node unixlab-3 (nodeid = 1) is dead

{unixlab-2}/# clresource status
=== Cluster Resources ===
Resource Name           Node Name    State      Status Message
-------------           ---------    -----      --------------
res-nfs                 unixlab-2    Online     Online - Successfully started NFS service.
                        unixlab-3    Offline    Offline

res-logicalhostname     unixlab-2    Online     Online - LogicalHostname online.
                        unixlab-3    Offline    Offline

{unixlab-2}/# clresourcegroup status
=== Cluster Resource Groups ===
Group Name        Node Name      Suspended      Status
----------        ---------      ---------      ------
resgroup-nfs      unixlab-2      No             Online
                  unixlab-3      No             Offline

{unixlab-2}/# clquorum status
=== Cluster Quorum ===
--- Quorum Votes Summary from (latest node reconfiguration) ---
            Needed   Present   Possible
            ------   -------   --------
            2        2         3
--- Quorum Votes by Node (current status) ---
Node Name       Present       Possible       Status
---------       -------       --------       ------
unixlab-3       0             1              Offline
unixlab-2       1             1              Online

--- Quorum Votes by Device (current status) ---
Device Name       Present      Possible      Status
-----------       -------      --------      ------
d4                1            1             Online
Unplug the power cables!


No problem: the NFS client still works with the NFS mount and the open file.
