I assume many of you use ZFS as the file system for users' home directories. Since ZFS file systems are cheap to create, you probably have a separate one for each user.
You probably also use ZFS snapshots to keep some kind of backup history. (A snapshot is not a replacement for a backup; I just use that terminology.)
Say that for each user you keep 24 daily snapshots, 7 weekly, one monthly, or something like that. When a snapshot is taken it uses no extra space, but as files in the live file system are changed and deleted, the snapshot grows because it keeps holding the old blocks.
The common situation is that users have a quota on their home directories, and quite often they complain that they are reaching it. Sometimes you find that snapshots take up 50% or more of the home directory quota. So you destroy the snapshots and the user has his/her space back (yes, you have the users' home directories backed up somewhere else).
I use this script to find the users with the biggest home directories and to see their biggest snapshots, so that I can destroy snapshots manually if needed. I run it once a day from a cron job and email the results to myself.
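The cron-and-email part could look roughly like the sketch below. It is only an illustration: the function send_report, the MAILER override, the paths and the address are all hypothetical, and mailx is assumed to be the mail command.

```shell
#!/bin/sh
# MAILER is a hypothetical override hook; mailx is assumed by default.
MAILER=${MAILER:-mailx}

# send_report REPORT_CMD RECIPIENT   (hypothetical helper)
# Runs REPORT_CMD and mails its output to RECIPIENT.
send_report() {
    report_cmd=$1
    recipient=$2
    output=$("$report_cmd" 2>&1) || {
        echo "report failed: $output" >&2
        return 1
    }
    # Nothing to report: skip the mail.
    [ -n "$output" ] || return 0
    printf '%s\n' "$output" | "$MAILER" -s "ZFS hogs on $(hostname)" "$recipient"
}

# Example crontab entry (hypothetical path), running daily at 06:00:
#   0 6 * * * /usr/local/bin/zfs-hogs-mail.sh
# where that wrapper script ends with a call such as:
#   send_report /usr/local/bin/zfs-hogs.sh admin@example.com
```

Keeping the mail step in a small function makes it easy to swap the mailer or to test the wrapper without sending anything.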
Note: all accounts are in NIS.
Feel free to edit the script to suit your needs.
#!/bin/sh
#set -x

# List of hogs and their biggest snapshots
# Idea:
# find a couple of each user's biggest snapshots and (if needed) remove them manually
# ----------------------------------------------------------------------------

# Variables
# -----------
AWK=/usr/bin/nawk
# zfs dataset that contains the users' zfs file systems
ZFSYSTEM=pool.1/home
# how many hogs I want to see
NUMOFHOGS=20
# how many biggest snapshots I want to see
NUMOFSNAPS=5

# I ignore some accounts:
# ----------------------------
# (the filter runs after head, so fewer than NUMOFHOGS file systems may be listed)
USERFS=`zfs list -rH -S used -o name -t filesystem ${ZFSYSTEM} | head -n ${NUMOFHOGS} | egrep -v 'home$' \
    | egrep -v oracle \
    | egrep -v db2 \
    | egrep -v smp-users6 \
    | egrep -v sybase` \
    || exit 1

echo ; echo Biggest hogs and their biggest snapshots
echo ========================================

for i in ${USERFS}
do
    # find used space of user's filesystem
    SPACEUSED=`zfs list -H -o used ${i}`
    # find quota of user's filesystem
    FSQUOTA=`zfs list -H -o quota ${i}`
    # get GECOS field from NIS; the user name is the third path component
    USERNAME=`echo ${i} | ${AWK} -F/ '{print $3}'`
    GECOS=`/usr/bin/ypmatch ${USERNAME} passwd | ${AWK} -F: '{print $5}'` || exit 2
    #echo ; echo ${i} ${USERNAME} ${GECOS}
    echo ; echo ${GECOS}
    echo Used space of filesystem: ${SPACEUSED}
    echo Quota of filesystem: ${FSQUOTA}
    echo -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
    # get ${NUMOFSNAPS} biggest snapshots for specific user filesystem
    SNAPS=`zfs list -rH -S used -o name -t snapshot ${i} | head -n ${NUMOFSNAPS}` || exit 3
    # print biggest snapshots and used space
    for j in ${SNAPS}
    do
        SNAPUSED=`zfs list -H -o used ${j}`
        echo ${j} ${SNAPUSED}
    done
done

exit 0
The email that I receive looks like this:

Milan Dudic
Used space of filesystem: 5.55G
Quota of filesystem: 8G
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
pool.1/home/mdudic@znap.weekly.2010-08W31 6.27M
pool.1/home/mdudic@znap.weekly.2010-07W30 1.01M
pool.1/home/mdudic@znap.daily.2010-08D13 388K
pool.1/home/mdudic@znap.daily.2010-08D10 170K
pool.1/home/mdudic@znap.daily.2010-08D11 169K

Ugljesa Dudic
Used space of filesystem: 5.49G
Quota of filesystem: 10G
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
pool.1/home/ududic@znap.weekly.2010-08W31 1.04M
pool.1/home/ududic@znap.hourly.2010-08-15H04 123K
pool.1/home/ududic@znap.hourly.2010-08-15H05 16.5K
pool.1/home/ududic@znap.weekly.2010-07W30 0
pool.1/home/ududic@znap.monthly.2010M08 0
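
Once a report like this points at a user, the manual cleanup can be prepared with a dry run. The sketch below is not part of my script: destroy_candidates is a hypothetical helper that only prints "zfs destroy" commands for review, and the example pipeline assumes a zfs release whose "zfs list" accepts -p for exact byte values (on Solaris, use nawk in place of awk).

```shell
#!/bin/sh
# destroy_candidates MIN_BYTES   (hypothetical helper)
# Reads "name<TAB>used_bytes" lines on stdin and prints (does not run)
# a "zfs destroy" command for every snapshot using at least MIN_BYTES.
destroy_candidates() {
    awk -F'\t' -v min="$1" '
        # only lines naming a snapshot (contain "@") and at or above the threshold
        $1 ~ /@/ && $2 + 0 >= min + 0 { print "zfs destroy " $1 }
    '
}

# Example (assumes "zfs list -p" is supported), snapshots of 1M and more:
#   zfs list -Hp -r -t snapshot -o name,used pool.1/home/mdudic | destroy_candidates 1048576
```

Review the printed commands before running any of them; nothing is destroyed by the helper itself.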