Friday, December 21, 2007
Replacing a failed drive in a Software RAID mirror set
As I wrote last time (see "Failing hard drive in a Software RAID"), I have a failing drive in the triple-active RAID mirror set on my firewall box. I'm still trying to decide whether the disk has actually failed or is just having transient problems, so I'll start with a read-only scan of the suspect slice.

# /sbin/badblocks -sv /dev/sdc2

Since this RAID slice is out of the array and not in use, I'm going to test it with a DESTRUCTIVE write/read verification (which is also a good way to wipe the disk).

# /sbin/badblocks -sv -w -t random /dev/sdc2

Well, after a few runs with that, the disk is no longer making "retry" noises. So I'm going to re-add the slice to the RAID array and see what happens.

# /sbin/mdadm /dev/md1 -a /dev/sdc2

Then, once the rebuild has finished, force md to verify the sync:

# echo check > /sys/block/md1/md/sync_action
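
To see whether the check is still running and what it found, the same sysfs directory can be read back. This is a minimal sketch, assuming the kernel exposes mismatch_cnt next to sync_action; the function name and the optional second argument are my own, added so it can be tried against a scratch directory:

```shell
# Sketch: report the state of a check on one md array.
# The second argument (sysfs root) is hypothetical convenience only;
# on a real system it defaults to /sys/block.
md_check_status() {
    md="$1"
    sysfs_root="${2:-/sys/block}"
    dir="$sysfs_root/$md/md"
    # sync_action reads "idle" when no check, resync or recovery is running
    action=$(cat "$dir/sync_action") || return 1
    # mismatch_cnt is the count reported by the last check (0 = copies agreed)
    mismatches=$(cat "$dir/mismatch_cnt") || return 1
    echo "$md: sync_action=$action mismatch_cnt=$mismatches"
}

# Usage on the firewall box (as root):
#   md_check_status md1
```

While the check is running, /proc/mdstat shows the progress as well.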

It seems to be working. I'm guessing that I finally convinced the drive to re-map the bad sector that was causing problems.


Wednesday, December 05, 2007
Failed drive slice in a Software RAID after resync
One of the things that I do periodically on my servers is run a resync check on the md arrays. Because this can put a heavy strain on the disk system, I strongly suggest that you have good backups in place. My home systems run a check about once a month; servers at work run one early on Tuesday mornings.

The script is very simple: you fire off the check by writing "check" to the sync_action file of each md device under /sys/block.

#!/bin/sh
# Tells mdadm to verify that the arrays are synchronized.
# This deals with the issue where a seldom-read disk block has gone bad
# by doing a daily/weekly verification of the array.

echo check > /sys/block/md0/md/sync_action
echo check > /sys/block/md1/md/sync_action
echo check > /sys/block/md2/md/sync_action
echo check > /sys/block/md3/md/sync_action
echo check > /sys/block/md4/md/sync_action
echo check > /sys/block/md5/md/sync_action
echo check > /sys/block/md6/md/sync_action
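
If the number of arrays changes from box to box, a loop saves editing the script. This is a sketch under the assumption that every md array on the machine should be checked; the function name and its optional argument are illustrative, not part of the original script:

```shell
# Sketch: request a check on every md array that exposes sync_action.
# The optional argument (sysfs root) exists only so the loop can be tried
# against a scratch directory; it defaults to /sys/block.
request_md_checks() {
    sysfs_root="${1:-/sys/block}"
    for f in "$sysfs_root"/md*/md/sync_action; do
        # If the glob matched nothing, $f is the literal pattern; skip it
        [ -e "$f" ] || continue
        echo check > "$f"
    done
}

# Usage (as root, e.g. from cron):
#   request_md_checks
```

Judging by the log further down, the kernel works through arrays that share disks one at a time (md1 started syncing the moment md6 finished), so queuing them all at once should be fine.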


All of my RAID slices verified correctly except for one. In this setup I'm running a triple-active RAID1 array. (Instead of keeping a hot-spare disk, I'm putting live data onto all three disks and using all three actively.)

See also Failing hard drive in a Software RAID

$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[2] sdb1[1] sda1[0]
256896 blocks [3/3] [UUU]

md2 : active raid1 sdc3[2] sdb3[1] sda3[0]
12289600 blocks [3/3] [UUU]

md4 : active raid1 sdc5[2] sdb5[1] sda5[0]
33551616 blocks [3/3] [UUU]

md3 : active raid1 sdc6[2] sdb6[1] sda6[0]
1052160 blocks [3/3] [UUU]

md5 : active raid1 sdc7[2] sdb7[1] sda7[0]
64010880 blocks [3/3] [UUU]

md6 : active raid1 sdc8[2] sdb8[1] sda8[0]
267257216 blocks [3/3] [UUU]

md7 : active raid1 sdf1[2] sde1[1] sdd1[0]
488383936 blocks [3/3] [UUU]

md1 : active raid1 sdc2[3](F) sdb2[1] sda2[0]
12289600 blocks [3/2] [UU_]

unused devices: <none>


The md1 array is my / (root) partition. Since the rest of the disk slices appear to be fine, I'm going to proceed with the assumption that it was a minor glitch.

Step 0: Analyze the failure

The first sign of error was the (F) showing up in /proc/mdstat. Apparently I don't have mdadm configured yet in monitor mode so that it e-mails me when it finds an error.

# grep "sdc2" messages
Dec 4 09:11:58 fw1-shimo kernel: raid1: Disk failure on sdc2, disabling device.
Dec 4 09:12:06 fw1-shimo kernel: disk 2, wo:1, o:0, dev:sdc2
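
Until mdadm's monitor mode is set up, a small check of /proc/mdstat could at least flag a degraded array from cron. This is a rough sketch that assumes the two-line layout shown above (array name on one line, the [UU_] status on the next); the function name is made up for illustration:

```shell
# Sketch: print the name of every md array whose status field contains
# an underscore (a missing or failed member), e.g. [UU_].
degraded_arrays() {
    awk '
        /^md[0-9]+ :/        { name = $1 }
        /\[[U_]*_[U_]*\]/    { if (name != "") print name }
    ' "${1:-/proc/mdstat}"
}

# Usage: prints nothing when all arrays are healthy.
#   degraded_arrays
```

Run against the /proc/mdstat listing above, this would print md1 and nothing else.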


The full detail from the mdadm resync:

# grep "Dec 4 09" messages | grep "md:"
Dec 4 09:08:33 fw1-shimo kernel: md: md6: sync done.
Dec 4 09:08:33 fw1-shimo kernel: md: syncing RAID array md1
Dec 4 09:08:33 fw1-shimo kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Dec 4 09:08:33 fw1-shimo kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Dec 4 09:08:33 fw1-shimo kernel: md: using 128k window, over a total of 12289600 blocks.
Dec 4 09:11:31 fw1-shimo kernel: md: md1: sync done.
#


And finally, evidence from the logs that shows that sdc was having issues:

Dec 4 09:11:34 fw1-shimo kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 4 09:11:34 fw1-shimo kernel: ata3.00: (BMDMA stat 0x60)
Dec 4 09:11:34 fw1-shimo kernel: ata3.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
Dec 4 09:11:34 fw1-shimo kernel: ata3: EH complete
Dec 4 09:11:35 fw1-shimo kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 4 09:11:35 fw1-shimo kernel: ata2.00: (BMDMA stat 0x0)
Dec 4 09:11:35 fw1-shimo kernel: ata2.00: tag 0 cmd 0xc8 Emask 0x9 stat 0x51 err 0x40 (media error)
Dec 4 09:11:35 fw1-shimo kernel: ata2: EH complete
Dec 4 09:11:37 fw1-shimo kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 4 09:11:37 fw1-shimo kernel: ata3.00: (BMDMA stat 0x60)
Dec 4 09:11:37 fw1-shimo kernel: ata3.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
Dec 4 09:11:37 fw1-shimo kernel: ata3: EH complete
Dec 4 09:11:50 fw1-shimo kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 4 09:11:51 fw1-shimo kernel: ata3.00: (BMDMA stat 0x60)
Dec 4 09:11:51 fw1-shimo kernel: ata3.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
Dec 4 09:11:51 fw1-shimo kernel: ata3: EH complete
Dec 4 09:11:51 fw1-shimo kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 4 09:11:52 fw1-shimo kernel: ata3.00: (BMDMA stat 0x60)
Dec 4 09:11:52 fw1-shimo kernel: ata3.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
Dec 4 09:11:52 fw1-shimo kernel: ata3: EH complete
Dec 4 09:11:52 fw1-shimo setroubleshoot: SELinux is preventing /usr/sbin/sendmail.postfix (system_mail_t) "read" to /dev/md1 (proc_mdstat_t). For complete SELinux messages. run sealert -l d5c655f4-6fc3-445b-ab9d-3b21336cb2d0
Dec 4 09:11:52 fw1-shimo kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 4 09:11:53 fw1-shimo kernel: ata3.00: (BMDMA stat 0x60)
Dec 4 09:11:53 fw1-shimo kernel: ata3.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
Dec 4 09:11:53 fw1-shimo kernel: ata3: EH complete
Dec 4 09:11:53 fw1-shimo kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 4 09:11:53 fw1-shimo kernel: ata3.00: (BMDMA stat 0x60)
Dec 4 09:11:54 fw1-shimo kernel: ata3.00: tag 0 cmd 0x25 Emask 0x9 stat 0x51 err 0x40 (media error)
Dec 4 09:11:54 fw1-shimo kernel: sd 2:0:0:0: SCSI error: return code = 0x08000002
Dec 4 09:11:54 fw1-shimo kernel: sdc: Current: sense key: Medium Error
Dec 4 09:11:54 fw1-shimo kernel: Additional sense: Unrecovered read error - auto reallocate failed
Dec 4 09:11:55 fw1-shimo kernel: end_request: I/O error, dev sdc, sector 25091744
Dec 4 09:11:55 fw1-shimo kernel: ata3: EH complete
Dec 4 09:11:55 fw1-shimo kernel: SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
Dec 4 09:11:55 fw1-shimo kernel: sdc: Write Protect is off
Dec 4 09:11:56 fw1-shimo kernel: SCSI device sdc: drive cache: write back
Dec 4 09:11:56 fw1-shimo kernel: SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
Dec 4 09:11:56 fw1-shimo kernel: sdb: Write Protect is off
Dec 4 09:11:57 fw1-shimo kernel: SCSI device sdb: drive cache: write back
Dec 4 09:11:57 fw1-shimo kernel: SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
Dec 4 09:11:57 fw1-shimo kernel: Incorrect number of segments after building list
Dec 4 09:11:57 fw1-shimo kernel: counted 127, received 15
Dec 4 09:11:58 fw1-shimo kernel: req nr_sec 0, cur_nr_sec 8
Dec 4 09:11:58 fw1-shimo kernel: raid1: Disk failure on sdc2, disabling device.
Dec 4 09:11:58 fw1-shimo kernel: Operation continuing on 2 devices
Dec 4 09:11:58 fw1-shimo kernel: blk: request botched
Dec 4 09:11:58 fw1-shimo kernel: Incorrect number of segments after building list
Dec 4 09:11:59 fw1-shimo kernel: counted 112, received 16
Dec 4 09:11:59 fw1-shimo kernel: req nr_sec 0, cur_nr_sec 8
Dec 4 09:11:59 fw1-shimo kernel: blk: request botched
Dec 4 09:11:59 fw1-shimo kernel: sdc: Write Protect is off
Dec 4 09:12:00 fw1-shimo kernel: Incorrect number of segments after building list
Dec 4 09:12:00 fw1-shimo kernel: counted 96, received 16
Dec 4 09:12:00 fw1-shimo kernel: req nr_sec 0, cur_nr_sec 8
Dec 4 09:12:01 fw1-shimo kernel: blk: request botched
Dec 4 09:12:01 fw1-shimo kernel: Incorrect number of segments after building list
Dec 4 09:12:01 fw1-shimo kernel: counted 80, received 16
Dec 4 09:12:01 fw1-shimo kernel: req nr_sec 0, cur_nr_sec 8
Dec 4 09:12:02 fw1-shimo kernel: blk: request botched
Dec 4 09:12:02 fw1-shimo kernel: Incorrect number of segments after building list
Dec 4 09:12:02 fw1-shimo kernel: counted 64, received 16
Dec 4 09:12:02 fw1-shimo kernel: req nr_sec 0, cur_nr_sec 8
Dec 4 09:12:03 fw1-shimo kernel: blk: request botched
Dec 4 09:12:03 fw1-shimo kernel: SCSI device sdc: drive cache: write back
Dec 4 09:12:03 fw1-shimo kernel: Incorrect number of segments after building list
Dec 4 09:12:03 fw1-shimo kernel: counted 48, received 16
Dec 4 09:12:04 fw1-shimo kernel: req nr_sec 0, cur_nr_sec 8
Dec 4 09:12:04 fw1-shimo kernel: blk: request botched
Dec 4 09:12:04 fw1-shimo kernel: Incorrect number of segments after building list
Dec 4 09:12:04 fw1-shimo kernel: counted 32, received 16
Dec 4 09:12:05 fw1-shimo kernel: req nr_sec 0, cur_nr_sec 8
Dec 4 09:12:05 fw1-shimo kernel: blk: request botched
Dec 4 09:12:05 fw1-shimo kernel: ata3.00: WARNING: zero len r/w req
Dec 4 09:12:06 fw1-shimo last message repeated 5 times


Step 1: Drop the failed slice

# /sbin/mdadm /dev/md1 --fail /dev/sdc2
mdadm: set /dev/sdc2 faulty in /dev/md1
# /sbin/mdadm /dev/md1 --remove /dev/sdc2
mdadm: hot removed /dev/sdc2


Step 2: Zero out the failed slice

My thinking here is that by zeroing out the failed slice, I can force the SATA disk to remap any sectors that have gone bad.

# dd if=/dev/zero of=/dev/sdc2
dd: writing to `/dev/sdc2': Input/output error
24577993+0 records in
24577992+0 records out
12583931904 bytes (13 GB) copied, 1916.7 seconds, 6.6 MB/s
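
As a sanity check on where dd stopped: with no bs= argument, dd uses 512-byte blocks, so the records-out count converts directly to a byte offset into the slice. A small sketch (the function name is made up for illustration):

```shell
# Sketch: bytes written = complete records out * block size.
# The block size defaults to 512, which is dd's default when bs= is omitted.
dd_bytes_written() {
    records_out="$1"
    block_size="${2:-512}"
    echo $(( records_out * block_size ))
}

# For the run above:
#   dd_bytes_written 24577992      # prints 12583931904
```

That matches the byte count dd printed. The small default block size is probably also part of why the wipe only managed 6.6 MB/s; a larger one such as bs=1M is usually much faster.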


Well, that's not a good sign (and the disk was clicking a bit). So I'll run smartctl and check the disk's SMART info (see Monitoring Hard Disks with SMART).

# /usr/sbin/smartctl -i -d ata /dev/sdc
smartctl version 5.36 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HD400LJ
Serial Number: S0H2J1KLA07831
Firmware Version: ZZ100-15
User Capacity: 400,088,457,216 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 4a
Local Time is: Wed Dec 5 09:43:36 2007 EST

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.

SMART support is: Available - device has SMART capability.
SMART support is: Enabled


However, the "-Hc" output of smartctl says that the disk health is still "PASSED" and not "FAILING". So it's possible that the disk doesn't need to be retired yet.

# /usr/sbin/smartctl -Hc -d ata /dev/sdc
smartctl version 5.36 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x05) Offline data collection activity
was aborted by an interrupting command from host.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 121) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: (7640) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 130) minutes.
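
The overall PASSED verdict says little about individual bad sectors, so the attribute table ("smartctl -A") is worth a look too. This is a sketch that pulls out the sector-related raw values; it assumes the attribute names smartctl commonly prints, which may differ for this Samsung drive (see the -F warning above), and the function name is my own:

```shell
# Sketch: read "smartctl -A" output on stdin and print NAME=RAW_VALUE for
# the sector-related attributes. In the attribute table the name is the
# 2nd column and the raw value the 10th.
smart_sector_counts() {
    awk '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" { print $2 "=" $10 }
    '
}

# Usage (as root):
#   /usr/sbin/smartctl -A -d ata /dev/sdc | smart_sector_counts
```

Non-zero values that keep growing between runs would be a stronger hint than the health summary that the disk is on its way out.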


Personally, since I know the drive makes clicking noises and throws an error during the dd wipe, I'm going to swap it out.


Wednesday, August 15, 2007
Installing Angband on CentOS 5
Installation of Angband 3.0.9 on RedHat or CentOS5.

1) Grab the latest source release from http://rephial.org/release

# cd /root
# wget http://rephial.org/downloads/3.0/angband-3.0.9-src.tar.gz
# tar xzf angband-3.0.9-src.tar.gz

2) Compile the source code (the following is for running angband from the location where you unpacked the source, see Compiling for other options)

# cd angband-3.0.9
# ./configure
# make
# make install

3a) Errors: Make can't find "ncurses.h" (see also Compiling)

# make
CC main-gcu.c
main-gcu.c:63:22: error: ncurses.h: No such file or directory

This indicates that you need to install the ncurses development headers. You can fix that by installing the "ncurses-devel" package and re-running "./configure".

# yum install ncurses-devel
# ./configure

4) If you do a system install (making Angband available for all users on the system), make sure you add the users to the "games" group. Otherwise, when your users attempt to run Angband, they will get error messages about not being able to write to various files in the /usr/local/games/lib/angband folders.

# ./configure --with-setgid=games --with-libpath=/usr/local/games/lib/angband --bindir=/usr/local/games
# make
# make install
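
Adding a user to the "games" group isn't shown above, so here is one way it might be done. This sketch only looks at the member list in the group file, not at primary groups; the function name and the account name "thomas" are just examples:

```shell
# Sketch: succeed if USER is listed as a member of GROUP in a group file
# (4th colon-separated field). The file defaults to /etc/group.
group_has_member() {
    group="$1"; user="$2"; file="${3:-/etc/group}"
    awk -F: -v g="$group" -v u="$user" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Usage (as root; "thomas" is only an example account):
#   group_has_member games thomas || /usr/bin/gpasswd -a thomas games
```

The user has to log out and back in before the new group membership takes effect.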


Thursday, July 05, 2007
SELinux is preventing named from write access
It seems like the SELinux profile in CentOS5 may not be correct by default. In my /var/log/messages file, I have thousands of entries per month consisting of:

Jul 4 05:01:04 fw1-hosho setroubleshoot: SELinux is preventing /usr/sbin/named (named_t) "write" access to named (named_conf_t). For complete SELinux messages. run sealert -l 663ea169-d194-4c49-a5bb-a6a4bb707990
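
To put a number on "thousands of entries", the matching lines can simply be counted. A trivial sketch; the function name is made up, and it only looks at the current log file rather than the rotated ones:

```shell
# Sketch: count setroubleshoot lines about named in a syslog file.
# The file defaults to /var/log/messages.
count_named_denials() {
    grep -c 'SELinux is preventing /usr/sbin/named' "${1:-/var/log/messages}"
}

# Usage (as root):
#   count_named_denials
#   count_named_denials /var/log/messages.1
```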

Here's the output of the sealert command:

# sealert -l 663ea169-d194-4c49-a5bb-a6a4bb707990
Summary
SELinux is preventing /usr/sbin/named (named_t) "write" access to named
(named_conf_t).

Detailed Description
SELinux denied access requested by /usr/sbin/named. It is not expected that
this access is required by /usr/sbin/named and this access may signal an
intrusion attempt. It is also possible that the specific version or
configuration of the application is causing it to require additional access.
Please file a http://bugzilla.redhat.com/bugzilla/enter_bug.cgi against this
package.

Allowing Access
Sometimes labeling problems can cause SELinux denials. You could try to
restore the default system file context for named, restorecon -v named.
There is currently no automatic way to allow this access. Instead, you can
generate a local policy module to allow this access - see
http://fedora.redhat.com/docs/selinux-faq-fc5/#id2961385 - or you can
disable SELinux protection entirely for the application. Disabling SELinux
protection is not recommended. Please file a
http://bugzilla.redhat.com/bugzilla/enter_bug.cgi against this package.
Changing the "named_disable_trans" boolean to true will disable SELinux
protection this application: "setsebool -P named_disable_trans=1."

The following command will allow this access:
setsebool -P named_disable_trans=1

Additional Information

Source Context system_u:system_r:named_t
Target Context root:object_r:named_conf_t
Target Objects named [ dir ]
Affected RPM Packages bind-9.3.3-8.el5 [application]
Policy RPM selinux-policy-2.4.6-30.el5
Selinux Enabled True
Policy Type targeted
MLS Enabled True
Enforcing Mode Enforcing
Plugin Name plugins.disable_trans
Host Name fw1-shimo.hq.example.org.
Platform Linux fw1-shimo.hq.example.org.
2.6.18-8.1.6.el5 #1 SMP Thu Jun 14 17:29:04 EDT
2007 x86_64 x86_64
Alert Count 70481
Line Numbers

Raw Audit Messages

avc: denied { write } for comm="named" dev=md1 egid=25 euid=25
exe="/usr/sbin/named" exit=-13 fsgid=25 fsuid=25 gid=25 items=0 name="named"
pid=2628 scontext=system_u:system_r:named_t:s0 sgid=25
subj=system_u:system_r:named_t:s0 suid=25 tclass=dir
tcontext=root:object_r:named_conf_t:s0 tty=(none) uid=25


The most helpful web page that I've found so far is the thread "Permissions Issue starting Bind 9.3.1". The gist seems to be that RedHat (and CentOS) are using a chroot bind installation in conjunction with an SELinux policy that expects the bind configuration files to be in a non-chroot setup. But there aren't very clear instructions there on fixing it.


Wednesday, July 04, 2007
Setting up svn+ssh on an alternate port for TortoiseSVN
This builds off a post to the TortoiseSVN user list: Specifying custom port for svn+ssh: a workaround

  1. Right-click on the Pageant icon in the system tray (I'm assuming that you're loading the SSH public key that you use for SVN into Pageant).
  2. Choose "New Session"
  3. Enter the hostname / IP address and SSH port that you'll be connecting to. If you're going to connect as "svn+ssh://thomas@svn.tgharold.com:2222", then this would be "svn.tgharold.com" and "2222".
  4. Go back to the "Session" tab and name the session as "svn.tgharold.com:2222".

Now you will be able to use both TortoiseSVN and the command-line version of SVN to talk to your repository over the alternate SSH port.

Why do this?

This is useful for cases where you want to put an SVN server on a publicly accessible IP address. What you will find is that if you leave SSH running on the default port, you will be inviting attacks on your SSH server. On the other hand, if you put the SSH server on an alternate port, you'll find that it gets attacked a lot less often (1-2 orders of magnitude difference would be likely).

Since mid-October of last year (around 8.5 months), we've logged 90,300 attack attempts against our SSH server. They usually come in batches, either guessing at accounts that normally exist or working through a list of common usernames. We don't allow root login or password authentication, we only allow public key authentication, and our SSH keys are limited to running "svnserve -t", so we have yet to see a break-in attempt succeed.
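
Limiting a key to "svnserve -t" is done with options at the front of its line in authorized_keys. This is a sketch of how such a line might be built; the exact option list is my assumption of a reasonable lock-down, not a copy of our server's configuration, and the function name is made up:

```shell
# Sketch: print an authorized_keys line that forces "svnserve -t" for the
# given OpenSSH public key and switches off forwarding and pty allocation.
restricted_svn_key_line() {
    pubkey="$1"
    printf 'command="svnserve -t",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty %s\n' "$pubkey"
}

# Usage (on the SVN server, as the account that owns the repository):
#   restricted_svn_key_line "$(cat thomas.pub)" >> ~/.ssh/authorized_keys
```

With a line like this, whatever command the client asks for, sshd runs the forced command instead.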


Monday, July 02, 2007
LVM and SELinux
I was a bit perplexed... I had created an LV called /dev/vg/svn, had it mounted, and was reading/writing data to it with no issues. But after I rebooted the CentOS5 server, I was unable to mount the LV.

[root@localhost /]# /usr/sbin/pvscan
PV /dev/md6 VG vg lvm2 [144.78 GB / 59.78 GB free]
Total: 1 [144.78 GB] / in use: 1 [144.78 GB] / in no VG: 0 [0 ]
[root@localhost /]# /usr/sbin/vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg" using metadata type lvm2
[root@localhost /]# /usr/sbin/lvscan
No volume groups found
[root@localhost /]# /usr/sbin/lvdisplay
No volume groups found
[root@localhost /]# /usr/sbin/lvdisplay vg
--- Logical volume ---
LV Name /dev/vg/svn
VG Name vg
LV UUID taYjia-BWWs-IWG3-313k-VoC2-ghik-01mFCg
LV Write Access read/write
LV Status NOT available
LV Size 85.00 GB
Current LE 21760
Segments 1
Allocation inherit
Read ahead sectors 0

[root@localhost /]#


So lvdisplay knows that the LV is there, but only if I tell it to look at the VG named "vg".

...

Turns out that it's an SELinux issue. Because SELinux was blocking access to the /etc/lvm/.cache file, it was causing problems. Fixing it was as simple as:

# cd /etc/lvm
# /sbin/restorecon -v .cache
# /usr/sbin/lvscan
inactive '/dev/vg/svn' [85.00 GB] inherit
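
The LV still shows as inactive at this point, so it presumably has to be activated before it can be mounted again. This is a sketch that lists inactive LVs from lvscan output; the function name is my own, and the vgchange step is the usual LVM2 way to activate a volume group rather than something captured from this session:

```shell
# Sketch: read lvscan output on stdin and print the device path of every
# LV reported as inactive (the path is quoted in lvscan's output).
inactive_lvs() {
    awk '$1 == "inactive" { print $2 }' | tr -d "'"
}

# Usage (as root):
#   /usr/sbin/lvscan | inactive_lvs
#   /usr/sbin/vgchange -a y vg      # activate all LVs in volume group "vg"
```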


Sunday, July 01, 2007
CentOS5: Moving /var/log to a separate volume
One thing I like to do is put /var/log on its own volume. That keeps the root volume from overflowing and also gets the log files out of the way. However, in CentOS5 (and probably RHEL5), SELinux is probably going to complain unless we tell it to "fixup" the new filesystem.

  1. Create the filesystem (I use ext3, so # /sbin/mke2fs -j /dev/mdX)
  2. Mount it at a temporary location: # mkdir /mnt/log ; mount /dev/mdX /mnt/log
  3. Copy the contents: # cp -a /var/log/* /mnt/log/
  4. It may be necessary to "fixup" the new volume: # cd /mnt/log ; /sbin/restorecon -R *
  5. Edit the /etc/fstab file to mount the new volume at /var/log
  6. Reboot
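
For step 5, the fstab line might look like what this sketch prints. The "defaults" options and the "1 2" dump/fsck fields are my assumptions of typical values for a non-root ext3 volume, and the function name is made up:

```shell
# Sketch: print a tab-separated /etc/fstab entry.
# The filesystem type defaults to ext3; options and dump/pass fields are
# fixed at common values for a non-root filesystem.
fstab_entry() {
    device="$1"; mount_point="$2"; fs_type="${3:-ext3}"
    printf '%s\t%s\t%s\tdefaults\t1 2\n' "$device" "$mount_point" "$fs_type"
}

# Usage (as root; replace /dev/mdX with the real device):
#   fstab_entry /dev/mdX /var/log >> /etc/fstab
```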

AFAIK, that's the extent of what's needed. Looking at the directory listings using "ls -lZ" seems to show the correct SELinux flags on the files between the two different directories.


Friday, June 22, 2007
FSVS ignore patterns (v1.1.5)
Here's a list of the current ignore patterns that I use on my CentOS5 box.

# fsvs ignore dump
./backup
./dev
./home
./lost+found
./media
./mnt
./proc/**
./root/.mozilla/firefox/**/Cache/**
./root/.thumbnails/**
./selinux
./sys
./tmp
./var/cache/**
./var/lock/**
./var/log/**
./var/named/chroot/proc
./var/run/**
./var/spool/**
./var/tmp/**

There are a few commands that I use to keep my sanity:

# fsvs ignore dump | sort > /root/fsvs-ignore.txt
# sort /root/fsvs-ignore.txt | fsvs ignore load

I find that keeping my ignore files in a .txt file under /root makes it easier to work with them. I'm able to edit the text file, load the ignore patterns into FSVS and see whether it does what it should. If it's wrong, I re-edit the text file and load them back into FSVS.

...

After mucking with a new box for a week, here's the set of ignore filters that I'm using on another CentOS5 box. On this particular box, I'm only versioning configuration data (/etc, /var/named).

[root@fw1-shimo /]# fsvs ignore dump | sort
./backup/
./bin/
./dev/
./home/
./lib/
./lib64/
./lost+found
./media/
./mnt/
./proc/
./root/
./sbin/
./selinux/
./srv/
./sys/
./tmp/
./usr/bin/
./usr/include/
./usr/kerberos/
./usr/lib/
./usr/lib64
./usr/libexec/
./usr/local/bin/
./usr/local/include/
./usr/local/lib/
./usr/local/libexec/
./usr/local/share/
./usr/local/src/
./usr/sbin/
./usr/share/
./usr/share/applications/
./usr/share/backgrounds/
./usr/share/dict/
./usr/share/doc/
./usr/share/i18n/
./usr/share/info/
./usr/share/locale/
./usr/share/man/
./usr/share/pixmaps/
./usr/share/X11/
./usr/share/zoneinfo/
./usr/src/
./usr/tmp/
./usr/X11R6/
./var/cache/
./var/lib/
./var/lock/
./var/log/
./var/named/chroot/dev/
./var/named/chroot/proc/
./var/named/chroot/var/run/
./var/run/
./var/spool/
./var/svn/
./var/tmp/
./var/www/
[root@fw1-shimo /]#


Thursday, June 21, 2007
Identifying bandwidth abusers in Linux
# /sbin/ip link

This Linux command will display information about your interfaces. When doing network analysis, the primary information that we're interested in is whether the interface is running in promiscuous mode. An adapter that is running in promiscuous mode can capture any packets that pass by on the wire, not just the ones destined for its MAC address. Here's an example of an Ethernet adapter that is in promiscuous mode:

3: eth0: <BROADCAST,MULTICAST,PROMISC,UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:16:ff:ff:ff:25 brd ff:ff:ff:ff:ff:ff

If you're in a situation where there are multiple hosts on the WAN side and you want to monitor traffic for them, you'll need to use an interface in promiscuous mode. You'll also need to be connected to the same hub as those units, or connected to the same switch where your port is configured in monitoring mode.
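
Picking the promiscuous interfaces out of a long listing by eye gets tedious. This is a small sketch that filters the "ip link" output shown above; the function name is made up for illustration:

```shell
# Sketch: read "ip link" output on stdin and print the name of every
# interface whose flag list (<...>) contains PROMISC.
promisc_interfaces() {
    awk -F': ' '/<[^>]*PROMISC[^>]*>/ { print $2 }'
}

# Usage:
#   /sbin/ip link | promisc_interfaces
# To switch the mode on for an interface (as root):
#   /sbin/ip link set eth0 promisc on
```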

# /sbin/ifconfig

Use this command to find out which interface is your WAN link and which interface is your LAN link (in the case of multi-homed systems).

# /usr/bin/nload -t 10000 -u H -U H -i 1750 -o 1750 eth0

The nload utility is a console application that will graph the inbound and outbound activity on your network interface. You'll have to download and install this software yourself as it is not included in most distributions. If you have "rpmforge" configured on your CentOS5 installation, this is as simple as "yum install nload". Some key arguments are shown above. "-t 10000" sets the update time to every 10 seconds, "-i" and "-o" set the graph maximum height in kilobits per second (1750 works well for a 1.5Mbit T1).

# /usr/sbin/nettop -d 5 -i eth0

This is another console utility that you can add to a Linux firewall. Nettop displays a tree-like listing of all activity on a particular interface, with the packets grouped by protocol and then port/service. This gives you a quick idea of what services are abusing your bandwidth.


Wednesday, June 20, 2007
Remote GUI administration of CentOS5 using Windows
Over the years, I've become very spoiled by Windows Terminal Services that we use to administer our Windows 2000 and Windows 2003 servers. It's fast, it's slick, it allows copy-paste and with a bit of command line fu you can connect to the physical display (instead of one of the two virtual sessions). It also uses built-in Windows authentication and offers encryption.

So, now that I'm rolling out CentOS 5 servers - I need something similar that allows me to look at the graphical UI on the box from elsewhere. From what I can tell, my options are:

KVM that supports TCP/IP

Probably one of the holy grails of remote administration. It allows you to see everything from the BIOS setup screen onward without needing to be physically at the machine. The downside is cost. So while I will eventually be hooking one of these up, it's not in the budget for this quarter.

VNC over SSH

I have a love/hate relationship with VNC. On the Windows clients, we use UltraVNC with built-in Windows authentication and the AES encryption plug-in.

But if you want to wrap VNC with SSH, you have to configure port forwarding in PuTTY every time, which turns connecting to a remote server into a multi-step process. With Windows' RDP, I just say "connect to IP address X" and I'm done (and I can connect in as anyone that I want). For PuTTY+VNC, I have to jump through a lot more hoops.

There's also the (possible) issue that VNC is nowhere near as efficient over the network as RDP. Once you use Terminal Services' RDP, you'll be spoiled and never want to use older technologies. It (almost) never glitches, it's lightning fast and responsive, and it's just pure remote GUI goodness (except for being a MS-only protocol).

X11 over SSH

This is where I'm heading at the moment. It uses SSH for authentication, so we can lock things down that way (forcing the use of public keys).

Now, a word of caution. A misconfigured SSH or X11 server is a security breach waiting to happen. Pay close attention to chapter 9 in SSH, The Secure Shell, The Definitive Guide by Barrett, Silverman & Byrnes (published by O'Reilly).

Installing Xming on Windows

In order to do X11 on Microsoft Windows, you need to install "X Server" software on the Windows box. While there are pay options out there, I'd suggest starting with Xming which is free (GPLv2). You'll want to download and install both Xming and Xming-fonts.

Configuration of sshd and X11

In order for the local X Server (Xming - running on your Windows system) to talk to the remote Linux server, you'll need to verify some settings on the Linux server. First up is configuration of the sshd daemon (typically /etc/ssh/sshd_config for OpenSSH). Look for the following 2 lines and make sure they are configured correctly:

X11Forwarding yes
#X11UseLocalhost yes

By default, OpenSSH ships with X11Forwarding set to "no" but the default for X11UseLocalhost is "yes". So you should only have to add the "X11Forwarding yes" line.
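
A quick way to confirm what the server will actually use is to read the setting back out of the file. This is a simplified sketch: it takes the first uncommented X11Forwarding line (sshd uses the first value it finds), falls back to the "no" default, and ignores Match blocks and the "keyword=value" spelling. The function name is my own:

```shell
# Sketch: print the effective X11Forwarding value from an sshd_config file.
# The file defaults to /etc/ssh/sshd_config.
sshd_x11_forwarding() {
    awk '
        # skip comment lines; sshd_config keywords are case-insensitive
        /^[ \t]*#/ { next }
        tolower($1) == "x11forwarding" { print tolower($2); found = 1; exit }
        END { if (!found) print "no" }
    ' "${1:-/etc/ssh/sshd_config}"
}

# Usage (as root on the Linux server):
#   sshd_x11_forwarding
```

Remember that sshd has to be restarted or reloaded before a changed setting takes effect.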

Create a PuTTY session

I'll make the assumption that you're going to use a PuTTY public-key pair. If you need to install a generated PuTTY key (maybe you want to use a separate PuTTY key for X11 forwarding), then here are the directions for OpenSSH.

(login as yourself or as root and then "su" to your username)
# cd ~/.ssh
# cat > machinename@svn.pub
(paste in PuTTY key)
(Ctrl-D to end the input)
# ssh-keygen -i -f machinename@svn.pub >> authorized_keys

  1. Right-click on the Pageant icon in the system tray and choose "New Session".
  2. Enter the hostname (e.g. 192.168.1.1)
  3. Go to the Connection -> SSH -> X11 tab
  4. Turn ON "X11 forwarding"
  5. Display location should be: localhost:0
  6. Go back to the Session tab
  7. Enter a name in the Saved Sessions text box (e.g. "MyHost-X11") and click on "Save"
  8. Click the "Open" button to connect to the server

If all goes well, you should see a line like:

/usr/bin/xauth: creating new authority file /home/thomas/.Xauthority

Which tells us that SSH is ready to do some X forwarding.

Fire up Xming

If you haven't already run Xming, you should run XLaunch and just roll through the defaults. Now, in the PuTTY window that is sitting at a command prompt, try:

# xeyes

And you should see the xeyes application open up on your Windows system. If you want to continue to start up other X applications, put an ampersand (&) at the end of the line.

More advanced stuff

  1. Fire up XLaunch
  2. Select "One window" and click "Next"
  3. Select "Start a program" and click "Next"
  4. The start program should be either "gnome-session" or "startkde"
  5. Select Run Remote using PuTTY (plink.exe) and turn on the compression option.
  6. Enter the IP address or hostname in "Connect to computer" of the Linux box that you are connecting to
  7. Enter your username in the "Login as user"
  8. Click the "Next" button
  9. In the "Additional parameters", enter "-screen 0 1024 768" which will set screen zero to be 1024x768
  10. If you run your SSH server on a non-standard port, enter "-P port" in the PuTTY extra options field (run "plink" at a Windows command prompt to see the possible options)
  11. Save your configuration file and click "Finish"

If all goes well, you should see the Gnome desktop!

Final thoughts (for the moment)

Now, it's still not as slick as Terminal Services. But it seems to work just fine and gives me a GUI desktop. I still plan on doing most of my administration from the command line, but this provides a nice GUI for those who follow in my footsteps.


UltraVNC (Server) Install on Windows XP
Installing UltraVNC (see also "UltraVNC Installation")

  1. Download UltraVNC for MS Windows
  2. Run the setup program (currently this is: UltraVNC-100-RC203-Setup.exe)
  3. Accept the license agreement and read the Information screen
  4. Use the default install destination location
  5. Choose "Complete Install"
  6. Use the default Start Menu Folder
  7. Turn ON "Register Ultr@VNC Server as system service"
  8. Turn ON "Start or restart Ultr@VNC service"
  9. Turn OFF the (3) options that create desktop icons
  10. Turn ON "Associated .vnc files with Ultr@VNC Viewer"
  11. Click "Install" to start the installation.

WinVNC: Default Local System Properties (see "configuration" for details)

  1. Turn OFF "Enable JavaViewer"
  2. Turn ON "Display Query Window", Set the timeout to 60 seconds, with "Accept" as the default action.
  3. Under "Multi viewer connections", CHANGE to "Keep existing connections"
  4. Under "Authentication", set a secure default password
  5. Under "Authentication", turn ON "Require MS Logon", turn ON "New MS Logon"
  6. Click on "Configure MS Logon Groups", Add, enter "Administrators" (note the plural) and click "OK". Grant that group full control and click "OK" to close the UltraVNC Security Editor window.
  7. Most other options can be left "as is"

AES Encryption plugin (a.k.a. DSM)

  1. Download the AESV2 Plugin (currently: AESV2Plugin100.zip)
  2. Extract the .DSM file to the program folder where you installed UltraVNC (usually: C:\Program Files\UltraVNC), see "DSM quick start" for more information.
  3. Re-open the "Default Local System Properties" window for the UltraVNC server (Start -> UltraVNC -> UltraVNC Server -> Show Default Settings). Alternately, start up the service helper systray app (Run Service Helper) and go to "Admin Properties".
  4. Under "DSM Plugin", turn ON the "Use:" checkbox and select "AESV2Plugin.dsm" from the list.
  5. Click "OK"


Thursday, May 31, 2007
FSVS for sysadmins
Notes: This entry was based on v1.1.4. The 1.1.5 and later versions of FSVS also place a few files in /etc/fsvs.

Original post follows

Okay, I'm heading back to trying FSVS again for doing system configuration management. The FSVS website has some documentation, but for the full documentation you'll want to download the source tarball and look in the "doc/" folder.

Resource links (and other useful information):

FSVS (fsvs.tigris.org) - the home page for FSVS. See also the Purpose of FSVS and Backup pages.

Subversion (software) - The SubVersion explanatory page over at Wikipedia.

SSH tricks - One of the most important documents to read if you want to set up secure SSH access to your SVN server. It specifies how to lock things down so that "/usr/bin/svnserve" is the only thing that can be run with a particular public key.

Setting up Subversion in Linux

Creating Subversion Repositories

Now, here's what I do know about FSVS:

- It uses SubVersion for the backend storage. Which means that if you already have a SVN server up and running, you can use it for the FSVS storage. This also means that you could pull configuration files down onto your laptop with SVN to take a gander at the revision history of a particular file.

- FSVS doesn't pollute your directories with ".svn" folders. Instead, it keeps a central storage database elsewhere (by default this goes in /var/spool/fsvs, but you can move it). This WAA (Working copy Administrative Area) directory only contains file lists and hashes. It does not contain "pristine copies" of any files, so it will use up a lot less space than ".svn" folders.

- FSVS will keep track of file metadata (such as timestamps, chmod flags, etc. - see the FSVS website for particulars). I'm not sure whether this includes information needed by SELinux.

- You can use FSVS to push changes to machines. Not something that I'm interested in (yet), but I might use it down the road.

- And most (all?) tricks you can use in SVN repositories apply to FSVS. Such as cloning a machine from another's configuration using "svn cp" or comparing files between two machines. Or creating a branch for a new configuration.

- FSVS allows you to make "empty" commits to the repository. If nothing has changed in the system and you do a fsvs ci -m "commit message" then FSVS will create a new revision in the repository, but with no actual changes. That may come in handy in certain circumstances.

The method to my madness

My preference for system administration is to have a separate user account and SSH key for each machine that I manage. This allows me to use a no-password SSH key on the machine so that I can do svn/fsvs commands easily (or script svn/fsvs commands). Because the SSH server is locked down, and the keys are locked down with the "command=" syntax of SSH - I'm not terribly worried about keys that don't have passwords. Since SVN doesn't allow you to permanently delete files from a repository, there's a limited amount of damage that an attacker could do if they swipe the private key.

(Lastly, because the SSH private key is stored inside of /root/, it means they've cracked the server security already. We're just trying to limit the damage and keep them from being able to erase things on the central SVN server.)

Naturally, after talking about SSH keys, I'm only using the "svn+ssh" method of accessing the central repositories.

SSH Security considerations

On the SVN server, you should edit /etc/ssh/sshd_config and verify that the following are enforced in the SSH daemon configuration:

AllowTcpForwarding no
X11Forwarding no
PermitTunnel no

That eliminates most abuses that are possible, even if someone edits their ~/.ssh/authorized_keys file on the SVN server.
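A quick way to double-check those three settings is to scan the config file with a few lines of shell. This is only a sketch: the function name check_sshd_lockdown is my own invention, it only looks at the first uncommented occurrence of each keyword in the file you hand it, and it knows nothing about "Match" blocks. It also flags a directive that is simply missing, since the goal here is to have all three spelled out explicitly.

```shell
#!/bin/sh
# check_sshd_lockdown FILE
# Hypothetical helper (not part of OpenSSH). Prints one line per directive
# and returns non-zero if any directive is missing or not set to "no".
check_sshd_lockdown() {
    conf="$1"
    rc=0
    for key in AllowTcpForwarding X11Forwarding PermitTunnel; do
        # Take the first uncommented occurrence; keywords are case-insensitive.
        val=$(awk -v k="$key" '
            tolower($1) == tolower(k) { print tolower($2); exit }
        ' "$conf")
        if [ "$val" = "no" ]; then
            echo "ok: $key no"
        else
            echo "CHECK: $key is '${val:-unset}'"
            rc=1
        fi
    done
    return $rc
}
```

Usage would be something like "check_sshd_lockdown /etc/ssh/sshd_config" on the SVN server. Remember to restart sshd after changing the file.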

Creating a user account on the SSH server

Make sure you have a naming scheme in place. For us, regular developers and administrators get normal looking usernames (e.g. "thomas" or "tgh" or "haroldt"). For machine accounts, we prefix the server name with "sys-" to create the username (e.g. "sys-fw1", "sys-mail1", "sys-gracie"), which should make it easy to see if a machine has been added to groups that it shouldn't be in. Any groups that are used to control access to SVN directories are named "svn-" plus the repository name, such as "svn-sys-fw" (which owns the "sys-fw" repository).

A) On the client machine:

Login as root (or "su" to root).

# cd /root/
(skip the next 2 commands if the .ssh subfolder already exists)
# mkdir .ssh
# chmod 700 .ssh
# cd .ssh
# /usr/bin/ssh-keygen -N '' -C 'svn key for root@hostname' -t rsa -b 2048 -f root@hostname
# cp /root/.ssh/root@hostname /root/.ssh/id_rsa
# cat root@hostname.pub
(copy this into the clipboard or send it to the SVN server or the SVN server administrator)

B) On the SVN server

Note: You should use some sort of random password creator (or the output of /dev/random or /dev/urandom) to create a long password that can be copied and pasted into the password prompt. Since we're using SSH keys, the account doesn't need a password that anyone knows.

(I know there's a better way to make an account with an unguessable password, yet still allow SSH access via pub keys, but I can't find it at the moment.)
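For the throwaway password itself, here is a minimal sketch of the "few bytes from /dev/urandom shoved through md5sum" idea. The function name random_password is made up for this example, and I'm assuming md5sum is installed (it is on CentOS).

```shell
#!/bin/sh
# random_password: prints 32 hex characters derived from 64 random bytes.
# (Hypothetical helper name; nothing here is specific to SVN or FSVS.)
random_password() {
    dd if=/dev/urandom bs=64 count=1 2>/dev/null | md5sum | cut -d' ' -f1
}
```

Run "random_password" and paste the result into the passwd prompt. Nobody needs to remember it, since the machine account only ever logs in with its SSH key.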

# useradd -m username
(i.e. "useradd -m sys-fw1-pri")
# passwd username
(paste in a super-long randomized password, such as a few bytes from /dev/urandom shoved through md5sum)
# cd /home/username
# su username
$ mkdir .ssh
$ chmod 700 .ssh
$ cd .ssh
$ cat > root@hostname.pub
(paste in the public key file from the client system)
$ cat root@hostname.pub >> authorized_keys
$ chmod 600 *

Now to lock the key down, edit the ~/.ssh/authorized_keys file and put the following on the front of the key line that will be used by the client machine:

command="/usr/bin/svnserve -t -r /var/svn",no-agent-forwarding,no-pty

This forces the connection to run the "svnserve" command in tunnel mode. So this SSH key cannot be used to login or run any other commands on the server. It also changes the SVN root path to /var/svn. You will want to also add "no-port-forwarding" and "no-X11-forwarding" if you have not disabled those in your /etc/ssh/sshd_config file.
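If you would rather not hand-edit authorized_keys, the prefix can be bolted on with a small filter. This is a sketch under a few assumptions: lock_key_line and the SVN_ROOT variable are names I made up, it adds all four "no-" options regardless of what sshd_config says, and it only recognizes an already locked key if the line starts with "command=".

```shell
#!/bin/sh
# lock_key_line: reads public key lines on stdin and writes them to stdout
# with the forced svnserve command and the "no-*" options in front.
# SVN_ROOT is a hypothetical variable for this example; default is /var/svn.
SVN_ROOT="${SVN_ROOT:-/var/svn}"

lock_key_line() {
    opts="command=\"/usr/bin/svnserve -t -r ${SVN_ROOT}\",no-agent-forwarding,no-pty,no-port-forwarding,no-X11-forwarding"
    while IFS= read -r line; do
        case "$line" in
            ''|'#'*)   printf '%s\n' "$line" ;;             # blank or comment: leave alone
            command=*) printf '%s\n' "$line" ;;             # already has a forced command
            *)         printf '%s %s\n' "$opts" "$line" ;;  # plain key: add the prefix
        esac
    done
}
```

For example, "lock_key_line < root@hostname.pub >> authorized_keys" would replace the plain "cat ... >> authorized_keys" step. Look at the result before trusting it.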

We now have a user account that can be used with SVN. Go ahead and [Ctrl-D] to escape the "su username" session and get back into the root account.

Setting up the repository (on the SVN server)

Reasons to use a common repository for all similar machines:
- Ability to do a "svn diff" between two different machines
- Ability to clone machines ("svn cp")
- Backup scripts are less complex (fewer repositories)

Reasons to use individual repositories
- Easy to secure using chmod/chown
- Easy to report how much space a machine is using on the repository server
- Easy to dump/load an individual machine's repository
- Easy to take a machine offline and remove the repository to save space
- A machine's SSH key can only be used to look at its own repository (unless you configure per-directory authentication in SVN)

Note: I'm using "username", "hostname" and "machinename" fairly interchangeably on this page. (Someday I'll go back and clean it up.)

On the SVN server:

# cd /var/svn
(your repositories may be stored elsewhere)
# svnadmin create /var/svn/sys-machinename
# chmod -R 770 sys-machinename
# chmod -R g+s sys-machinename/db
# chown -R sys-machinename:sys-machinename sys-machinename

Notes:
- A chmod of "770" gives full access to the owner and to everyone in the same group.
- A chmod of "700" gives access to the owner account only.
- The third digit should probably always be "0" to prevent the repository from being world-readable.

Verifying the connection (on the client)

You will need to customize the following URL to point at your SVN repository location.

# svn info svn+ssh://sys-machinename@svn.intra.example.com/sys-machinename/

That should prompt you to accept the server's public key, then display a response from the SVN server. If things don't work, then you've got a connectivity (firewall), account (wrong name? wrong key?) or permissions (chmod or chown goofs?) problem.

Installing FSVS (on the client)

Notes:

In order for the install to succeed, you must have installed the "subversion", "subversion-devel", "apr", "apr-devel", "gcc" and "ctags" packages. Two others that you need are "gdbm" and "pcre" (and the associated developer packages). There may be other dependencies that will also be installed that are required by those packages. The following command worked for me on CentOS5:

# yum install subversion subversion-devel ctags apr apr-devel gcc gdbm gdbm-devel pcre pcre-devel

Head on over to the official project page at freshmeat.net and download the tarball. This is currently "fsvs-1.1.4.tar.gz". Extract the tarball to a folder somewhere (e.g. /root/fsvs-1.1.4) and use a terminal session to go to that folder.

# cat README
(look for the section that talks about the install)
# cd src
# make
(you will receive a message that the Makefile has now been updated and that you need to run make again)
# make
(you should see a large stream of gcc output with the following at the end)
-rwxr-xr-x 1 root root 201510 May 31 09:54 fsvs
(you must see the above line to know that you got a good compile)
# cp fsvs /usr/local/bin

At this point, FSVS *should* be installed correctly.

Getting started with FSVS

By default, FSVS will want to use /var/spool/fsvs unless you define the "WAA" variable and point it somewhere else. So according to the README you will need to create that folder and then initialize it (which lets FSVS create the administrative files it needs).

Note #1: I'm not sure how large /var/spool/fsvs will get. You may eventually want to break it off into a separate LVM volume of its own. Since it is mostly file lists and hashes, it shouldn't get too large, but you may want to put it on an ext3 volume with more inodes than normal.

Note #2: You should read both the README and the output of "fsvs help urls".

# mkdir -p /var/spool/fsvs
# chmod 700 /var/spool/fsvs
# cd /
# fsvs urls svn+ssh://username@machine/path/to/repos
(The "fsvs urls" won't display any confirmation text.)

The above will connect the root folder to the repository path. You could also do multiple URLs and only link sub-folders (such as /etc, /usr, /home) up against the SVN repository.

Setting up ignore filters

After telling FSVS that we want to use "/" as our working copy, we'll want to also tell it to ignore various directories and files. While this tends to be somewhat similar across Linux distributions, you should also plan on modifying this list to match your distribution.

Make sure you read "fsvs-*/doc/IGNORING" and the output of "# fsvs help ignore".

# fsvs ignore ./backup
# fsvs ignore ./dev
# fsvs ignore ./mnt
# fsvs ignore './proc/*'
# fsvs ignore ./sys
# fsvs ignore ./tmp
# fsvs ignore ./var/tmp
# fsvs ignore ./var/spool

In addition you may wish to initially ignore all of the binary file directories (such as ./lib, ./lib64, ./sbin, ./usr, ./var) and focus solely on /etc, /home, /boot and /root. That will give you a much slimmer listing when you "fsvs status" from the root directory.

You can use the "fsvs ignore dump" and "fsvs ignore load" commands to backup your listing, edit it, then load it back into FSVS. Note that you must be in the base directory of your working copy, otherwise "fsvs ignore dump" will return an empty listing.

# cd /
# fsvs ignore dump > ~/fsvs-ignore.txt
# vi ~/fsvs-ignore.txt
# sort ~/fsvs-ignore.txt | fsvs ignore load

My initial listing on CentOS5 is:

./backup
./dev
./lost+found
./media
./mnt
./proc/*
./selinux
./sys
./tmp
./var/named/chroot/proc
./var/spool
./var/tmp


Putting /etc under version control

Assuming that you set up FSVS where "/" (root) is the base of the working copy, we can now add the contents of /etc (and /boot) to SVN.

# cd /
# fsvs commit -m "Base check-in of /etc" /etc
# fsvs commit -m "Base check-in of /boot" /boot

If you want to deep-commit a single folder (such as /usr/local/sbin) without doing the intervening folders:

# cd /
# fsvs commit -m "Base check-in" /usr/local/sbin

Working with FSVS

Create a test file in /etc

# cat > /etc/testfile
foo
# fsvs commit -m "Checking in a test file" /etc/testfile

Now delete the test file

# rm /etc/testfile
# fsvs status .
D... 4 ./etc/testfile
.mC. dir ./etc
# fsvs commit -m "Removed test file" .
Committing to svn+ssh://sys-fw1-pri@svn.example.com/sys-fw1-pri
.mC. dir ./etc
D... 4 ./etc/testfile
committed revision 7 on 2007-06-04T00:17:13.808327Z as sys-fw1-pri


That just scratches the surface, but covers the majority of day-to-day use. Unlike SVN, FSVS knows (assumes) that when a file is missing it should implicitly do a "delete" operation in the repository to make the repository match the file system.

Other useful commands to know are "fsvs unversion" and "fsvs diff".


Tuesday, May 29, 2007
SVK for system management
I'm a big fan of using a version control system in conjunction with system administration. There's a great feeling to know that even if I screw up a configuration file, I have an easily accessible way to revert or track changes. To accomplish this, I was using SubVersion (SVN) as an administration tool.

However, SVN comes with some downsides, primarily the issue that it creates ".svn" folders in the directory tree. Which can cause issues and maybe even lead to security holes (yes/no? unsure about this).

So, maybe SVK is better suited.

Installing SVK on CentOS5

(See SVK - Distributed Version Control - Part I (Ron Bieber, 2004) for a good tutorial.)

1. Open up the package manager and make sure you have installed the following packages:

"subversion" (possibly not required)
"subversion-perl" (Perl bindings)

Or, using the command-line (for x86_64):

# yum install subversion.x86_64 subversion-perl.x86_64

At least... I think the above command works. I used the GUI package manager for this step.

2. Use Perl and CPAN to install the SVK system.

# perl -MCPAN -e 'install SVK'

You'll be presented with about a dozen questions, and you'll need to install all sorts of modules if this is your first time running that command. It's all pretty self-explanatory (and I was on the phone while doing that, so I wasn't able to jot everything down).

...

Well, after mucking with this for a few hours and getting self-test errors, I'm going to shelve this for now and go look at FSVS instead.


Remote install of CentOS5 using VNC
I haven't tried this yet, but there are times when it could come in handy.

Centos 5 vnc remote installation

Post details: Upgrading to CentOS4, over a remote vnc connection

I figured it was possible, I just hadn't looked.


Sunday, May 27, 2007
Squid, SELinux and using a separate volume for the cache_dir
This was a slightly tricky one. I'm running CentOS5 with SELinux and I was trying to set up Squid to put its cache_dir on an LVM volume (to keep it from using up space on the root partition).

# /etc/init.d/squid stop
# cd /var/spool
# lvcreate -L64G -nvar-spool-squid vg
# mke2fs -j /dev/vg/var-spool-squid
# mkdir /mnt/squid ; mount /dev/vg/var-spool-squid /mnt/squid
# cp -a /var/spool/squid/* /mnt/squid/
# cd /var/spool/squid
# rm -rf *
# cd /var/spool
# mount /dev/vg/var-spool-squid squid
# /etc/init.d/squid start

Starting squid: /etc/init.d/squid: line 53: 9440 Aborted $SQUID $SQUID_OPTS >>/var/log/squid/squid.out 2>&1
[FAILED]

# tail /var/log/messages

May 27 21:50:48 fw1-hosho setroubleshoot: SELinux is preventing /usr/sbin/named (named_t) "write" access to named (named_conf_t). For complete SELinux messages. run sealert -l 663ea169-d194-4c49-a5bb-a6a4bb707990
May 27 22:39:26 fw1-hosho squid: cache_dir /var/spool/squid: (13) Permission denied

# /usr/bin/sealert -l 626e75b4-32aa-4a61-88f7-f36a68fecd35
Summary
SELinux is preventing access to files with the label, file_t.

Detailed Description
SELinux permission checks on files labeled file_t are being denied. file_t
is the context the SELinux kernel gives to files that do not have a label.
This indicates a serious labeling problem. No files on an SELinux box should
ever be labeled file_t. If you have just added a new disk drive to the
system you can relabel it using the restorecon command. Otherwise you
should relabel the entire files system.

Allowing Access
You can execute the following command as root to relabel your computer
system: "touch /.autorelabel; reboot"

Additional Information

Source Context user_u:system_r:squid_t
Target Context user_u:object_r:file_t
Target Objects /var/spool/squid/00 [ dir ]
Affected RPM Packages squid-2.6.STABLE6-4.el5 [application]
Policy RPM selinux-policy-2.4.6-30.el5
Selinux Enabled True
Policy Type targeted
MLS Enabled True
Enforcing Mode Enforcing
Plugin Name plugins.file
Host Name fw1-hosho.intra.example.com.
Platform Linux fw1-hosho.intra.example.com. 2.6.18-8.1.4.el5
#1 SMP Thu May 17 03:16:52 EDT 2007 x86_64 x86_64
Alert Count 10
Line Numbers

Raw Audit Messages

avc: denied { getattr } for comm="squid" dev=dm-0 egid=23 euid=23
exe="/usr/sbin/squid" exit=-13 fsgid=23 fsuid=23 gid=23 items=0 name="00"
path="/var/spool/squid/00" pid=9584 scontext=user_u:system_r:squid_t:s0 sgid=23
subj=user_u:system_r:squid_t:s0 suid=0 tclass=dir
tcontext=user_u:object_r:file_t:s0 tty=(none) uid=23


...

So, the problem is that the files on the newly created volume (an LVM volume mounted on /var/spool/squid) had no SELinux labels yet, which is what the "file_t" in the alert means. Fixing this is rather simple once you know about the restorecon command.

# cd /var/spool/squid
# /usr/sbin/squid -z
# /sbin/restorecon -R *
# /etc/init.d/squid start
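When staring at raw audit messages like the one above, the field that matters is "tcontext": a type of file_t means the target has no label. The following is a rough sketch for pulling that out of the text. The function names avc_field and has_unlabeled_target are my own, and the splitting is naive (it would trip over values that contain spaces).

```shell
#!/bin/sh
# avc_field NAME: reads audit text on stdin and prints the value of every
# NAME=value pair found, one per line, with double quotes stripped.
# (Hypothetical helper; splits on spaces, so values must not contain any.)
avc_field() {
    tr ' ' '\n' | sed -n "s/^$1=//p" | tr -d '"'
}

# has_unlabeled_target: succeeds if any tcontext on stdin has type file_t.
has_unlabeled_target() {
    avc_field tcontext | grep -Eq ':file_t(:|$)'
}
```

Pasting the raw audit message into "has_unlabeled_target" gives a quick yes/no on whether a relabel is the likely fix. Also note that "restorecon -R *" only touches what is inside the directory; running "/sbin/restorecon -R /var/spool/squid" instead should cover the mount point itself as well.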


Friday, May 25, 2007
iSCSITarget on CentOS5
Setting up our test iSCSI SAN box this week. The original plans were to run this on top of Gentoo (which is very powerful and flexible) but after 3 years, I'm not very pleased with Gentoo as a server OS. Which is a whole different topic. So we've migrated over to using CentOS5, which is derived from Red Hat Enterprise Linux 5, a distro that is more suited for corporate use.

There's not much to talk about in terms of the base system. It's a pretty vanilla 64bit CentOS5 install (from DVD) running on top of a dual-CPU dual-core pair of Socket F Opterons. The primary packages that I've installed so far are "Yum Extender" (from stock repositories) and "rdiff-backup" (downloaded as an RPM). The OS runs on top of a 3-disk RAID1 (mirror, all drives active) Software RAID for safety.

I use a semi-customized partition layout on the (3) operating system disks. I have:

a) /boot
b) / (root, the primary OS install area)
c) swap
d) a backup root partition (which is basically a clone of the primary, except for a small change in /etc/fstab) designed for quick recovery from a situation that would hose the primary root partition
e) /var/log (broken out to its own area)
f) /backup/system (a place to store system backups)
g) LVM area (no allocated areas yet)

I mention all that because the first step before installing iscsitarget is to make sure I can recover if things go awry. Since installing iscsitarget involves mucking with the running kernel, I want a good backup of /boot along with making sure GRUB offers me options to boot an older kernel. I'll also freshen my root backup partition.

Step 1 - Backing up /, /boot, and the existing kernel

Simplicity is often best when dealing with the base OS. My methods are crude, but designed to get me back up and running without needing much in the way of software. The primary requirement is a bootable USB pen drive or bootable LiveCD (such as RIPLinuX) with the necessary tools. You could also use the CentOS5 boot DVD.

I'll run with the CentOS install DVD since that's what I have sitting in the optical drive at the moment. When CentOS boots up, enter "linux rescue" at the boot prompt. Note, if you have multiple NICs installed, it's probably better to not start networking (because the CentOS rescue mode takes forever to initialize unconnected NICs).

Select "Skip" when asked about mounting the existing install at /mnt/sysimage. We'll be doing things our own way instead.

Start up Software RAID on the key partitions (/boot, /, the backup /, and the backup partition). The following commands will (usually) startup your existing RAID devices automatically.

# mdadm --examine --scan >> /etc/mdadm.conf
# mdadm --assemble --scan

In my case "md0" is /boot, "md2" is my base CentOS install, "md3" is the backup root partition, and "md5" is where I can store image files. So let's double-check that.

# mkdir /mnt/root ; mount /dev/md2 /mnt/root
# mkdir /mnt/backuproot ; mount /dev/md3 /mnt/backuproot

If we then look at the output of "df -h", or run "ls" on the mounted volumes, we can verify that we know which is which. Let's mount our backup area and create image files. I prefer to kick off the "dd" commands in the background so that I can monitor progress and keep multiple CPUs busy.

# mkdir /mnt/backup ; mount /dev/md5 /mnt/backup
# cd /mnt/backup ; mkdir images ; cd images
# dd if=/dev/md0 | gzip > dd-md0-boot-20070525.img.gz &
# dd if=/dev/md2 | gzip > dd-md2-root-20070525.img.gz &
# dd if=/dev/md3 | gzip > dd-md3-bkproot-20070525.img.gz &

We should also backup the master boot records on each of the hard drives in the unit.

# for i in a b c; do dd if=/dev/sd$i count=1 bs=512 of=dd-sd$i-mbr-20070525.img; done
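Since an MBR backup is only useful if it really is a full sector, here is a small sketch that wraps the dd call and checks the result. The function name backup_mbr is made up for this example.

```shell
#!/bin/sh
# backup_mbr SOURCE DEST: copies the first 512-byte sector of SOURCE into
# DEST and fails unless exactly 512 bytes ended up in the file.
# (Hypothetical helper name; SOURCE can be a disk or any regular file.)
backup_mbr() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null || return 1
    size=$(($(wc -c < "$2")))
    [ "$size" -eq 512 ]
}
```

The loop above would then become something like: for i in a b c; do backup_mbr /dev/sd$i dd-sd$i-mbr-20070525.img || echo "sd$i FAILED"; done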

Unfortunately, the CentOS5 DVD doesn't include tools like "G4L" (Ghost for Linux) or I'd make a second set of backup files using that. I may boot my RIPLinuX CD and see what tools are there. (Because you can never have too many backups.)

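Now I can dump the contents of "md2" (the original root) to "md3" (our backup root). Unmount both first, since writing over a mounted filesystem is asking for corruption.

# umount /mnt/root ; umount /mnt/backuproot
# dd if=/dev/md2 of=/dev/md3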

Now for some cleanup stuff...

# mount /dev/md3 /mnt/backuproot
# vi /mnt/backuproot/etc/fstab

We'll need to change any references of "md2" to "md3". Basically flip them around so that "md3" is the official root when /etc/fstab gets processed. I also like to change the prompt and system name to remind myself that I'm using the emergency system. Again, our primary goal is to be able to get a box back up and operational in the case where the primary root partition is hosed. Get it up quickly, then schedule some downtime to deal with it properly.
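The "flip them around" edit can also be done with sed instead of vi. This is a sketch that assumes your fstab refers to the arrays as /dev/md2 and /dev/md3 (mine does; an fstab that uses LABEL= entries would need a different approach). The function name swap_md_roots and the @TMP@ placeholder are made up.

```shell
#!/bin/sh
# swap_md_roots: stdin -> stdout, exchanges /dev/md2 and /dev/md3.
# A temporary placeholder keeps the two substitutions from undoing each
# other, and the [^0-9] guard leaves names like /dev/md20 alone.
swap_md_roots() {
    sed -e 's#/dev/md2\([^0-9]\)#/dev/@TMP@\1#g' \
        -e 's#/dev/md3\([^0-9]\)#/dev/md2\1#g' \
        -e 's#/dev/@TMP@#/dev/md3#g'
}
```

I'd write the output to a scratch file first (swap_md_roots < /mnt/backuproot/etc/fstab > /tmp/fstab.new), read it over, and only then copy it into place.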

Now would also be a good time to tune the ext3 file system on your partitions.

The last thing we need to do is edit GRUB's configuration so that we can select our backup root OS from the selection menu.

# mkdir /mnt/boot
# mount /dev/md0 /mnt/boot
# vi /mnt/boot/grub/grub.conf

Things that we'll want to do here (you could also accomplish this by booting the server in normal mode and editing grub.conf there using a more comfortable text editor):

a) Change the timeout=5 value to timeout=15 (or 30 or 60). By default, CentOS doesn't give you very long to pick an alternate boot. I find 5 seconds to be too short of a window, especially on a unit where the storage controller takes a minute or two to scan and setup the drives.

b) Copy the latest "title" section and change "root=/dev/md2" to "root=/dev/md3". I always make the "EMERGENCY" boot option the 2nd one in the list.
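Step (a) is simple enough to script if you have several boxes to do. A minimal sketch, assuming grub.conf has a plain "timeout=N" line at the start of a line (the CentOS default); set_grub_timeout is a made-up name. Step (b) I would still do by hand.

```shell
#!/bin/sh
# set_grub_timeout SECONDS: stdin -> stdout, rewrites the "timeout=N" line
# and leaves everything else untouched.
set_grub_timeout() {
    sed "s/^timeout=[0-9][0-9]*[[:space:]]*\$/timeout=$1/"
}
```

For example: set_grub_timeout 15 < /mnt/boot/grub/grub.conf > /tmp/grub.conf.new, then compare the two files before copying the new one over.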

# mkdir /mnt/backuproot
# mount /dev/md3 /mnt/backuproot
# vi /mnt/backuproot/etc/sysconfig/network

I like to change the hostname to have "-emergency" tacked onto the end. Which should make it fairly obvious that we are booting up in emergency mode using the backup root partition. I also edit root's .bash_profile to set PS1.

Okay, that was a lot of setup work just to prepare for implementing iSCSITarget (or any other kernel rebuild), but it's always worth it.

Final notes:

- When I test booted the emergency root partition, things didn't work as planned. So while my concept is sound, I may have screwed something up. I think it's an error with /etc/fstab in the emergency partition, so I'll troubleshoot that later.

- It's also possible that you'll need to do a GRUB install on all (3) of the primary mirror disks.

Step 2 - Downloading and compiling the iSCSITarget software

So far, I've found (2) links to be useful here. One is Moving on.... and the other is iSCSI Enterprise Target を CentOS5 にインストールする (Japanese). While the 2nd link is in Japanese, it shows the commands in English.

Head over to the iSCSI Enterprise Target page and download the latest tarball containing the source code. The current version is 0.4.15. If you're using Firefox in CentOS's Gnome shell, it will probably prompt you to open the file with the archive manager. I created a subfolder at /root/iscsitarget-0.4.15 and extracted the contents there.

You will also need to go into Applications -> Add/Remove Software and add the development tools and libraries to your system. (Mostly you just need gcc.)

As noted on jackshck's page, you will also need to install the following packages:

openssl-devel (I installed the x86_64 version)
kernel-devel (again, I'm using the x86_64 version)

Open up a terminal window and go to where you extracted the iscsitarget tarball (I put mine in /root/iscsitarget-0.4.15).

# ls -l /usr/src/kernels
(make note of the kernel folder)
# make KSRC=/usr/src/kernels/2.6.18-8.1.4.el5-x86_64/
# make KSRC=/usr/src/kernels/2.6.18-8.1.4.el5-x86_64/ install
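Rather than typing the kernel folder by hand, the path can be built from uname. This sketch assumes the kernel-devel package you installed matches the running kernel and uses the same "release-arch" folder naming seen above; ksrc_path is a made-up name.

```shell
#!/bin/sh
# ksrc_path [RELEASE] [ARCH]: prints /usr/src/kernels/<release>-<arch>.
# Defaults to the running kernel (uname -r) and machine type (uname -m).
ksrc_path() {
    rel="${1:-$(uname -r)}"
    arch="${2:-$(uname -m)}"
    printf '/usr/src/kernels/%s-%s\n' "$rel" "$arch"
}
```

Then "make KSRC=$(ksrc_path)" should be equivalent to the commands above, provided "ls -d $(ksrc_path)" actually finds the folder.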

Now we can start up the ietd daemon:

# /etc/init.d/iscsi-target start

And add it to our default runlevel (this is similar to the rc-update command in Gentoo Linux):

# chkconfig iscsi-target on

Step 3 - Creating a target

This is where we get into the nitty-gritty and where I need to take a break and do some research. The /etc/ietd.conf file already exists at this point, but only contains a commented out sample configuration.

Notes:

Dec 21 2007 - The comment about iSCSITarget software for Microsoft Windows really isn't on-topic. But I'll go ahead and list the link to it, but not as an HTML link. Pricing for the real version is currently $395 (Server) or $995 (Professional). And personally, there's no way that I'd recommend running a SAN on top of Microsoft Windows (even Server 2003, which is a nice product).


Thursday, May 10, 2007
Dealing with a failed Software RAID device
As part of my server setup, I like to make sure that plans are working as expected... which means intentionally breaking things like RAID sets.

In this particular case I have a triple-active RAID1 mirror set on the first 3 disks in the system (/dev/sda, /dev/sdb, /dev/sdc). In this RAID1 set, all 3 disks are active, with no hot-spare. I prefer this over a (2) active (1) hot-spare setup because it allows for up to 2 disks to fail before you lose data. And if I'm already dedicating a hot-spare spindle solely for the use of the RAID1 set, I may as well get to use it. The output of /proc/mdstat looks similar to this (note that none of the slices are tagged with a "(S)"):

md2 : active raid1 sdc2[2] sdb2[1] sda2[0]
7911936 blocks [3/3] [UUU]


Each disk has quite a few md devices associated with it. In this particular case I have /dev/md0 up through /dev/md5 created. Probably one of the few downsides to SoftwareRAID is that you end up with quite a few md devices to keep track of. But such is the price for just about the ultimate flexibility.

GRUB Note: In a mirrored setup, you must make sure to install GRUB to the MBR (master boot record) on all of the mirror disks. Some Linux distros don't do this on their own and you'll have to do it yourself. Otherwise, when the first disk in the mirror set fails, you'll find you're left with an unbootable system. This is also why I like to make copies of the MBR for each disk in the system (# dd if=/dev/sda of=dd-sda-mbr-date.img bs=512 count=1).

So, after making very good backups, I decided it was time to test whether I could pull a disk and survive. To make sure that I had taken care of the GRUB issue, I shutdown the server and pulled the primary drive in the RAID set.

# cat /proc/mdstat
md2 : active raid1 sdc2[2] sdb2[1]
7911936 blocks [3/2] [_UU]


Ah good, mdadm is *not* happy here (as expected). It knows that one of the disks has failed in the array. So let's shutdown and replace the failed drive with a blank one. In this case, I used a spare drive that I had laying around that had been previously wiped. (Or, with care, you could zero out the drive that you pulled.)
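With half a dozen md devices per box, eyeballing /proc/mdstat gets old. Here is a rough sketch that lists only the arrays with a missing member, going by the underscore in the status field ("[_UU]"). The function name degraded_arrays is made up, and it assumes the usual layout where the status line follows the "mdN :" line.

```shell
#!/bin/sh
# degraded_arrays [FILE]: prints the name of every md array whose status
# field (such as [_UU]) contains an underscore, i.e. a missing member.
# FILE defaults to /proc/mdstat; pass a saved copy for testing.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        name != "" && match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

No output means every array has all of its members. This would be easy to drop into a cron job that mails root when the list is not empty.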

I recommend using "sfdisk" in dump mode to configure the new drive. So if your failed drive is "sda" and one of the good ones is "sdb", you could use:

# sfdisk -d /dev/sdb | sfdisk /dev/sda

After which, you can use the "mdadm" command to add the new slices to the existing RAID arrays.

# mdadm --add /dev/mdX /dev/sdYZ

Last, don't forget to install GRUB to the MBR on the new disk.


Monday, May 07, 2007
Brute force disaster recovery for CentOS5
Today's trick is moving a CentOS5 system from an old set of disks over to a new set of disks. Along the way, I'll create an image of the system to allow me to restore it later on.

The CentOS5 system is a fresh install running RAID-1 across (3) disks using Linux Software RAID (mdadm). There are (4) primary partitions (boot, root, swap, LVM) with no data on the LVM partition.

(Why 3 active disks? The normal setup for this server was RAID-1 across 2 disks with a hot-spare. Rather than have a risky window of time where one disk has failed and the hot-spare is synchronizing with the remaining good disk, I prefer to have all 3 disks running. That way, when a disk dies, we still have 2 disks in action. The mdadm / Software RAID doesn't seem to care and it doesn't seem to affect performance at all.)

Because this is RAID-1, capturing the disk layout and migrating over to the new disks will be very easy. It's also a very fresh install, so I'm just going to grab the disk contents using "dd" (most of the partition's sectors are still zeroed out from the original install). Once I've backed up the (3) partitions on the first drive, I'm going to pull the (3) drives and replace them with the new ones.

I'll get the machine up and running with the first replacement drive, then configure the blank 2nd and 3rd drives and add them to the RAID set. That is, if mdadm doesn't beat me to the punch and start the sync on the 2nd/3rd disks automatically.

If things go bad, I can always drop the original disks back in the unit and power it back up. I plan on keeping them around for a few days, just in case. I'll have to recreate the LVM volumes, but there aren't any yet (just a PV and a VG).

One advantage of pulling the old drives out completely and rebuilding using fresh drives - I'll end up with a tested disaster recovery process.

Now for the nitty gritty. I'm using a USB pocket drive formatted with ext3 for the rescue work. Make sure that you plug this in before booting the rescue CD.

  1. Login to the system and power it down.
  2. Boot the CentOS5 install DVD
  3. At the "boot:" prompt, enter "linux rescue"
  4. Work your way through the startup dialogs
  5. When prompted whether to mount your linux install, choose "Skip"

This should give you a command shell with useful tools. So let's poke around and check on our system.

  1. Looking at "cat /proc/mdstat" shows that while the mdadm software is running, it has not assembled any RAID arrays.
  2. The "fdisk -l" command shows us that the (3) existing disks are named sda, sdb, sdc. Each has (4) partitions (boot, root, swap, LVM).
  3. My USB drive showed up as "/dev/sdd" so I'll create a "/backup" folder and mount it using "mkdir /backup ; mount /dev/sdd1 /backup ; df -h"

Naturally, we should create a sub-folder under /backup for each machine and possibly create another folder underneath it using today's date. We should grab information about the current disk layout and store it in a text file (fdisk.txt).

  1. # cd /backup ; mkdir machinename ; cd machinename
  2. # mkdir todaysdate ; cd todaysdate
  3. # fdisk -l > fdisk.txt
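The folder shuffle above can be rolled into one helper. A small sketch: make_backup_dir is a made-up name, and the YYYYMMDD date format is just the one I use in my image file names.

```shell
#!/bin/sh
# make_backup_dir BASE MACHINE: creates BASE/MACHINE/YYYYMMDD (using today's
# date) and prints the resulting path.
make_backup_dir() {
    dir="$1/$2/$(date +%Y%m%d)"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}
```

For example: cd "$(make_backup_dir /backup machinename)" ; fdisk -l > fdisk.txt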

Now to grab the boot loader and image the two critical partitions (boot and root). We'll grab the boot loader off of all (3) drives because it's so small (and it may not be properly synchronized).

  1. dd if=/dev/sda bs=512 count=1 of=machinename-date-sda.mbr
  2. dd if=/dev/sdb bs=512 count=1 of=machinename-date-sdb.mbr
  3. dd if=/dev/sdc bs=512 count=1 of=machinename-date-sdc.mbr
  4. dd if=/dev/sda1 | gzip > machinename-date-ddcopy-sda1.img.gz
  5. dd if=/dev/sda2 | gzip > machinename-date-ddcopy-sda2.img.gz
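Typing five nearly identical dd lines by hand is an easy place to slip, so here is a sketch that generates them in a loop, which keeps the device name and the output filename in step. It only prints the commands (a dry run); the function name print_image_cmds and the "machinename-date" tag argument are placeholders of mine.

```shell
# Print (not run) the imaging commands for one machine.
# "tag" is a hypothetical "machinename-date" prefix; review the output, then
# paste the lines you want or pipe them to "sh".
print_image_cmds() {
    tag=$1
    for d in sda sdb sdc; do
        # First 512-byte sector of each disk: boot loader plus partition table.
        echo "dd if=/dev/$d bs=512 count=1 of=$tag-$d.mbr"
    done
    for p in sda1 sda2; do
        # Compressed images of the boot and root partitions.
        echo "dd if=/dev/$p | gzip > $tag-ddcopy-$p.img.gz"
    done
}

print_image_cmds machinename-date
```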

Total disk space for my system was around 1.75GB worth of compressed files (8GB root, 250MB boot). You could also use bzip2 if you need more compression. Unfortunately, the CentOS5 DVD does not include the "split" command, which could cause issues if you're trying to write to a filesystem that can't handle files over 2GB in size.
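If you do need to stay under a 2GB file limit without "split", dd can do the chunking itself by reading a fixed number of blocks at increasing offsets. The following is a rough sketch, not something I ran during this rebuild; the function name image_in_chunks and the ".partNNN.gz" naming are assumptions of mine.

```shell
# Hypothetical helper: image SRC into gzip'd pieces of CHUNK_MB megabytes each.
# Usage: image_in_chunks /dev/sda2 machinename-date-sda2 1024
image_in_chunks() {
    src=$1
    prefix=$2
    chunk_mb=$3
    i=0
    while :; do
        part=$(printf '%s.part%03d.gz' "$prefix" "$i")
        # Read one chunk, starting i * chunk_mb megabytes into the source.
        dd if="$src" bs=1048576 skip=$((i * chunk_mb)) count="$chunk_mb" 2>/dev/null | gzip > "$part"
        # An empty chunk means we have run past the end of the source.
        if [ "$(gzip -dc "$part" | wc -c)" -eq 0 ]; then
            rm -f "$part"
            break
        fi
        i=$((i + 1))
    done
}
```

To restore, concatenate the pieces in order and decompress them as one stream, for example "cat machinename-date-sda2.part*.gz | gzip -dc | dd of=/dev/sda2". Checking each piece by decompressing it doubles the read work, so this trades speed for simplicity.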

Now you should shut the box back down, burn those files to DVD-R, install the new (blank) disks, and boot from the install DVD again. Again, mount the drive that holds the rescue image files to a suitable path.

  1. dd of=/dev/sda bs=512 count=1 if=machinename-date-sda.mbr
  2. fdisk /dev/sda (fix the last partition)
  3. dd if=/dev/sda bs=512 count=1 of=/dev/sdb
  4. dd if=/dev/sda bs=512 count=1 of=/dev/sdc

That will restore the MBR and partition table from the old drive to the new one. If your new drive has a different size, then the last partition will be incorrectly sized for the disk. Fire up "fdisk" and delete / recreate the last partition on the disk.
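Before writing a saved .mbr file onto a new disk, it may be worth a quick sanity check that the file really is a single boot sector. This sketch checks the length and the standard 55 aa boot signature in the last two bytes; the function name mbr_looks_sane is made up for this example.

```shell
# Succeeds only if FILE is exactly 512 bytes and ends with the 55 aa signature.
mbr_looks_sane() {
    file=$1
    size=$(wc -c < "$file")
    [ "$size" -eq 512 ] || return 1
    # Bytes 510 and 511 of a valid MBR are 0x55 0xaa.
    sig=$(od -An -tx1 -j510 -N2 "$file" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example (commented out; the filename is a placeholder):
# mbr_looks_sane machinename-date-sda.mbr && echo "looks like an MBR"
```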

Restore the two partition images:

  1. # gzip -dc machinename-date-ddcopy-sda1.img.gz | dd of=/dev/sda1
  2. # gzip -dc machinename-date-ddcopy-sda2.img.gz | dd of=/dev/sda2

At this point, we should be able to boot the system on the primary drive and have Software RAID come up in degraded mode for the arrays. Things that will need to be done once the unit boots:

  1. Tell mdadm about the 2nd (and 3rd) disks and tell it to bring those partitions into the arrays and synchronize them.
  2. Create a new swap area
  3. Recreate the LVM physical volume (PV) and volume group (VG)
  4. Restore any data from the LVM area (we had none in this example)

Getting the Software RAID back up and happy is the trickiest of the steps.

  1. Login as root, open up a terminal window
  2. # cat /proc/mdstat
  3. # fdisk -l
  4. Notice that the swap area on our system is sda3, sdb3, sdc3 and will need to be loaded as /dev/md1.
  5. # mdadm --create /dev/md1 -v --level=raid1 --raid-devices=3 /dev/sda3 /dev/sdb3 /dev/sdc3
  6. # mkswap /dev/md1 ; swapon /dev/md1
  7. Now we're ready to recreate the LVM area
  8. # mknod /dev/md3 b 9 3
  9. # mdadm --create /dev/md3 -v --level=raid1 --raid-devices=3 /dev/sda4 /dev/sdb4 /dev/sdc4
  10. # pvcreate /dev/md3 ; vgcreate vg /dev/md3
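  11. Finally, we should add the 2nd and 3rd drives to md0 (boot, sdX1) and md2 (root, sdX2).
  12. # mdadm --add /dev/md0 /dev/sdc1
  13. # mdadm --add /dev/md0 /dev/sdb1
  14. # mdadm --add /dev/md2 /dev/sdc2
  15. # mdadm --add /dev/md2 /dev/sdb2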

Note: If your triple mirror RAID array puts the additional disks in as spares, make sure that (a) you have grown the number of raid devices to 3 for the RAID1 set and (b) there are no other arrays synchronizing at the same time. It's also best to add the elements one at a time, rather than adding both at the same time. I'm not sure if it's a bug in mdadm or just the way it works, but it took me two tries to get my triple mirror back up with all disks marked as "active" instead of (2) active and (1) hot-spare.
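Here is a sketch of that "grow first, then add one at a time and wait" routine. It defaults to a dry run that only prints the mdadm commands. The function names, the RUN and MDSTAT variables, and the 60-second polling interval are all my own assumptions; the only things taken from mdadm itself are the --grow/--raid-devices and --add options and the progress lines in /proc/mdstat.

```shell
# RUN=echo prints the mdadm commands instead of running them (dry run).
# Set RUN= (empty) to execute for real. MDSTAT is overridable for testing.
RUN=${RUN-echo}
MDSTAT=${MDSTAT:-/proc/mdstat}

# True while any array is resyncing, recovering, or being checked.
md_busy() {
    grep -Eq 'resync|recovery|check' "$MDSTAT"
}

# Make sure the mirror expects three active members, not two plus a spare.
grow_mirror() {
    $RUN mdadm --grow "$1" --raid-devices=3
}

# Usage: add_members /dev/md0 /dev/sdb1 /dev/sdc1
add_members() {
    md=$1
    shift
    for part in "$@"; do
        # Wait for any running sync to finish before adding the next member.
        while md_busy; do sleep 60; done
        $RUN mdadm --add "$md" "$part"
    done
}
```

In dry-run mode nothing starts syncing, so both --add lines print immediately; when run for real, the second add waits until the first member has finished recovering.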


Saturday, April 28, 2007
Starter kit for an iSCSI SAN
Now that it's spring, it's time for us to start building out our preliminary iSCSI SAN unit. Here's the hardware shopping list:

$0600 Super Micro 4U/TOWER RM EATX BLACK ( CSE-942I-R760B )
- triple-module redundant PSU w/ 760W
- 4U case for either rack or tower use
- (9) 5.25" bays

$0020 20-pin front panel connector to breakout cable
- Converts the 20-pin connector to something that can be attached to normal ATX motherboards
- CBL-0067 30cm
- CBL-0085 15cm

$0050 Rackmount Rail Kit: CSE-PT26

$0320 (2) Spare PSU modules - PWS-0050(M)
- Spare PSU modules for the redundant PSU
- Useful to have a spare or two on-hand

$0600 (4) CSE-M35T1 (black) - SuperMicro SATA 5:3 backplanes
- These allow you to fit a total of (15) SATA drives into the (9) 5.25" bays
- There are other SATA 5:3 backplanes that you can use
- While we're only going to install (3) of these backplanes, I recommend buying a 4th for spare parts

$0167 3848163 (1) INTEL PRO/1000 PT DUAL PORT EXPI9402PT gigabit PCIe x4
- Used for SAN traffic
- Eventually, we'll upgrade to a quad-port PCIe or a 10GigE

$0167 1494573 (1) INTEL PRO/1000 PCI-X
- The PCI card is used to talk to the LAN and internet, no SAN traffic will flow over it
- You could use an inexpensive 10/100 PCI card, but with a dual-port NIC you can bond for high-availability

$0600 12-port Promise SATA-II PCIe x8 card EX12350
- CentOS5 automatically sees any drives attached to this card (when they are configured in JBOD mode)
- We're going the SoftwareRAID route

$0305 TYAN S2927G2NR dual-Opteron Socket F Thunder n3600B (S2927)
$0600 Opteron 2214 dual-core Socket F
$0200 (2) 1GB memory modules
$0100 (2) Socket F cooling fans (Cooljag CJC689C)
- (4) 1.8GHz cores should be plenty of horsepower to use Software RAID instead of the Promise RAID software
- 2GB is probably minimal for RAM, 4GB would be better

$1800 (15) 500GB SATA-II drives
- 500GB is a good balance between price and capacity

Totals:

$3410 base system
$1800 drives

...

The drive plan for this unit is:

(3) 500GB drives in 3-way RAID1 (mirrored) for the operating system, log files, and other support software

Either:

(10) RAID10 + (2) hot-spares
(2) 5-disk RAID6 + (2) hot-spares

The pair of RAID6 arrays would give us about 20% more capacity (net of 6 disks vs 5 disks). So the RAID10 setup results in around 2.27TB while the RAID6 setup would give 2.72TB.

With an overall cost of around $5500 for the entire unit, the price per gigabyte ends up as:

$2.36/GB for (1) RAID10 array
$1.97/GB for (2) RAID6 arrays

Which is not terribly bad for a starter unit.
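For anyone who wants to redo the arithmetic with different drive counts or prices, here is a small sketch. It assumes the capacities are expressed in binary units (1 GB = 2^30 bytes) even though the drives are sold as 500 decimal gigabytes, since that is what reproduces the figures above. The function name raid_cost is mine.

```shell
# Usable capacity and cost per GB, using binary units (1 GB = 2^30 bytes).
# $1 = number of data disks (excluding mirrors/parity), $2 = total cost in dollars
raid_cost() {
    awk -v n="$1" -v cost="$2" 'BEGIN {
        bytes = n * 500 * 1000 * 1000 * 1000    # 500GB drives, decimal as sold
        gib   = bytes / (1024 * 1024 * 1024)
        printf "%.2f TB, $%.2f/GB\n", gib / 1024, cost / gib
    }'
}

raid_cost 5 5500   # RAID10: 10 disks, half of them mirrors
raid_cost 6 5500   # two 5-disk RAID6 arrays: 3 data disks each
```

This prints "2.27 TB, $2.36/GB" and "2.73 TB, $1.97/GB"; the second capacity differs from the 2.72TB quoted above only because it is rounded rather than truncated.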


Tuesday, March 06, 2007
My new preferred disk layout for servers
With age comes wisdom? After working with SoftwareRAID and Linux servers for a while, I've changed my preferred disk system design and layout.

RAID

Under the old system, I was running a (2) disk RAID1 (mirror) with a hot-spare disk set up and ready for action. But if you're going to have a hot-spare dedicated to the RAID1 array, why not use it as an active array member? That way, if a disk fails, you still have two good disks. That matters because when a RAID element fails, the load from the rebuild process can often kill one of the remaining disks in the array.

Is it a likely scenario? Probably not. But Linux's Software RAID handles a triple-active RAID1 mirror without any slowdown, so there's not much reason *not* to implement it that way. Plus it's a useful trick to know for situations where you really *do* need to be that paranoid.

(I'm not sure whether any hardware RAID cards provide for a triple-active mirroring RAID1 configuration.)

Partitions

I've also simplified how many partitions I like to have on the disk. My current disk layouts typically look like:

/dev/sdX1 - /dev/md0 - 250MB - /boot
/dev/sdX2 - /dev/md1 - 12GB - / (primary root)
/dev/sdX3 - /dev/md2 - 12GB - / (backup root)
/dev/sdX5 - /dev/md3 - 32GB - /var/log
/dev/sdX6 - /dev/md4 - 2GB - swap
/dev/sdX7 - /dev/md5 - 64GB - /backup/system
/dev/sdX8 - /dev/md6 - (remainder) - LVM area

During normal operations, we boot and run /dev/md1 as our / (root) partition. The /dev/md2 partition is kept offline and is never mounted. Periodically, after validating that the server is in good health, we will copy the contents of /dev/md1 to /dev/md2, make adjustments to /etc/fstab and the hostname. This requires some server downtime (long enough to set up the 2nd root partition).

In the case where the primary OS is hosed, we can boot from the backup OS partition and get back up and running quickly. That gives us the luxury to continue operations until we can schedule downtime to fix the primary OS partition.

Notice that I've broken /var/log out to its own partition. I do this so that an overflowing set of logs won't take the server box down. Plus, by putting the log files in their own physical partition, it's easy to use a boot CD or USB key to gain access to the logs in case of severe issues.

The other physical partition that I consider necessary is /backup/system. This partition is used to hold images of the boot and root partitions, along with information about the partition layout and images of the MBRs. Basically, it's used to store disaster recovery backups. You should not have this partition mounted during normal operations. Taking the contents of this partition offsite is also a good idea. A basic text file of how the backups were created along with information for how to restore these backups is recommended.

Summary

This setup tries to walk the fine line between keeping things simple and having enough flexibility to deal with a large set of potential failures: anything from a two-disk failure, to the primary OS being hosed, to both OS partitions having problems, all the way up to boot records or the /boot partition being killed.


Tuesday, February 20, 2007
ext3 tuning
The ext3 system is a great workhorse filesystem. Lots of tools, lots of distros that know how to read it, and it's pretty much the "safe" choice for almost all workloads. Still, there are things that the default ext3 doesn't do as well as it should, so most installations need a little TLC.

For the most part, you should plan on shutting down a system before tuning it (after making backups!). Tuning doesn't take too long and is a lot simpler to do if the system is offline.

First off, you should check out your existing filesystem settings with:

# tune2fs -l /dev/hdXY

1) Directory indexing - This helps ext3 deal with directories that contain lots of files. (See the Gentoo Forum link for explanations of why.)

# tune2fs -O dir_index /dev/hdXY
# e2fsck -D /dev/hdXY

The first command tells the ext3 filesystem to use directory indexing for all new directories; the second command updates all existing directories.
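To see whether a filesystem already has a feature turned on before changing anything, you can look at the "Filesystem features:" line of the tune2fs listing. A minimal sketch, with has_feature being a name I made up:

```shell
# Reads "tune2fs -l" output on stdin; succeeds if the named feature is listed.
has_feature() {
    grep '^Filesystem features:' | tr -s ' ' '\n' | grep -qx "$1"
}

# Example (needs root and a real device, so it is left commented out):
# tune2fs -l /dev/hdXY | has_feature dir_index && echo "dir_index already enabled"
```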

2) Journal mode

# tune2fs -O has_journal -o journal_data /dev/hdXY

I prefer full journaled mode. The "-O has_journal" should be unnecessary (all ext3 file systems have journals after all) but probably ensures that things work if you accidentally run it on an ext2 filesystem.

3) Journal size

This requires poking around a bit to find out what your current journal size is. First, you need to find the inode of the journal.

# tune2fs -l /dev/hdXY | grep -i "journal inode"
Journal inode: 8
# /sbin/debugfs /dev/md2
debugfs 1.39 (29-May-2006)
debugfs: stat <8>
Inode: 8 Type: regular Mode: 0600 Flags: 0x0 Generation: 0
User: 0 Group: 0 Size: 134217728
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 262416
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x4658e77b -- Sat May 26 22:05:47 2007
atime: 0x00000000 -- Wed Dec 31 19:00:00 1969
mtime: 0x4658e77b -- Sat May 26 22:05:47 2007


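In this particular case, for a 12GB partition, the journal size is 128MB (look at the "Size:" field, which is in bytes: 134217728 / 1024 / 1024 = 128). The "Blockcount:" value of 262416 appears to count 512-byte units rather than 4096-byte filesystem blocks, which works out to slightly more than 128MB. On my 64GB partition, the journal size is also only 128MB.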
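If you have several filesystems to check, pulling the number out by eye gets old. This sketch reads the "stat" listing on stdin and prints the journal size in megabytes from the "Size:" field on the "User:" line. The function name journal_mb is hypothetical, and the parsing assumes the layout shown in the listing above.

```shell
# Read "debugfs stat" output on stdin and print the journal size in MB.
journal_mb() {
    awk '/^User:/ {
        for (i = 1; i < NF; i++)
            if ($i == "Size:") print $(i + 1) / 1024 / 1024
    }'
}

# Example (needs root and a real device, so it is left commented out):
# debugfs -R "stat <8>" /dev/hdXY 2>/dev/null | journal_mb
```

Only the "User:" line is examined because the "Fragment:" line also carries a "Size:" field that we don't want.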

So, do we want to muck with the journal size? Well, maybe... Doubling the size is probably okay, maybe even making it 4x larger. But beyond that, I think you'd want to tread carefully.

# tune2fs -J size=$SIZE /dev/hdXY

$SIZE is defined in megabytes, so for me to double the 128MB journal, I'd use a value of "size=256".
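One caveat, as far as I know: tune2fs won't resize a journal in place, so on a filesystem that already has one you would remove the old journal first and then create the new one. The sketch below only prints the commands (a dry run) so you can review them; the function name print_journal_resize is mine, and the sequence assumes the filesystem is unmounted and you have a backup.

```shell
# Print the commands to replace the journal on DEV with one of NEW_MB megabytes.
# Dry run only: nothing is executed. The filesystem must be unmounted.
print_journal_resize() {
    dev=$1
    new_mb=$2
    echo "e2fsck -f $dev"                  # start from a clean filesystem
    echo "tune2fs -O ^has_journal $dev"    # drop the existing journal
    echo "tune2fs -J size=$new_mb $dev"    # create the new, larger journal
}

print_journal_resize /dev/hdXY 256
```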

Source links:
Whitepaper: Red Hat's New Journaling File System: ext3 (RedHat, 2001)
EXT3 Filesystem tuning (Christoph C. Cemper, 2005)
Tuning ext3 for large disk arrays (LKML, Peter Chubb, 2005)
Some ext3 Filesystem Tips (Gentoo Forums, Peter Gordon, 2005)
Performance Tuning Guidelines for Large Deployments (Zimbra, 2007)
Linux Magazine: Tuning Journaling File Systems (2007, registration required)


Friday, January 05, 2007
Forcing users to use public SSH keys to authenticate
Here are the steps I use when I create a new user account on a secure SSH server (where only public keys are allowed).

# useradd -m username
# passwd username
(paste in a super-long randomized password)
# cd /home/username
# su username
$ mkdir .ssh
$ chmod 700 .ssh
$ cd .ssh
$ cat > username@linux.pub
(paste in the public key file from SecureCRT)
$ ssh-keygen -i -f username@linux.pub >> authorized_keys
$ chmod 600 *

At this point, the user should be able to log in via SecureCRT using their private/public key pair. There's no need for them to know the password that you assigned to them on the server (so use something random and at least 30 characters long).
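If you create accounts like this often, the directory and permission steps can be wrapped up in a helper. This is a sketch under a couple of assumptions: the key file is already in OpenSSH format (run it through "ssh-keygen -i" first if it came from SecureCRT), and the function is run as the target user so the files end up with the right owner. The name install_pubkey is made up.

```shell
# Hypothetical helper: append an OpenSSH-format public key to
# HOME_DIR/.ssh/authorized_keys with the permissions sshd expects.
install_pubkey() {
    home_dir=$1
    keyfile=$2
    mkdir -p "$home_dir/.ssh" || return 1
    chmod 700 "$home_dir/.ssh"
    cat "$keyfile" >> "$home_dir/.ssh/authorized_keys"
    chmod 600 "$home_dir/.ssh/authorized_keys"
}

# Example (commented out; paths are placeholders):
# install_pubkey /home/username /tmp/username-openssh.pub
```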
