Sunday, December 06, 2009
SNMP and MRTG: Interesting OIDs in net-snmp
These can all be found via the "snmpwalk" command in CentOS 5.4 (or RHEL 5.4).

# snmpwalk -v 1 -c public localhost | less

The above assumes that you have configured the SNMP agent on the server to allow read-only access to SNMP v1 clients via the "public" community string.

Approximate number of users logged in

HOST-RESOURCES-MIB::hrSystemNumUsers.0 = Gauge32: 3

Number of logged in users. As you can see, this is a gauge value, which in SNMP terms means that it can increase or decrease over time. By default, MRTG treats every value as a monotonically increasing counter, so a gauge like this one has to be flagged with the "gauge" option.

Note: Since MRTG only samples once every 5 minutes, this value is very approximate.
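To see why the "gauge" option matters, here is a minimal sketch of the arithmetic. It assumes the usual counter treatment, where the graphed value is the difference between two readings divided by the elapsed seconds. The function name counter_rate and the sample readings are made up for illustration and are not part of MRTG.

```shell
# Hypothetical helper: what a counter-style calculation would report.
# $1 = previous reading, $2 = current reading, $3 = seconds between samples
counter_rate() {
  awk -v a="$1" -v b="$2" -v t="$3" 'BEGIN { printf "%.2f\n", (b - a) / t }'
}

# Three users logged in at both samples, taken 300 seconds apart.
# Treated as a counter, the result is 0.00 even though 3 users are logged in.
counter_rate 3 3 300

# Going from 3 to 5 users would show up as a tiny "rate", not as 5.
counter_rate 3 5 300
```

With the "gauge" option set, the reading itself (3, then 5) is what gets graphed.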

Approximate number of system processes

HOST-RESOURCES-MIB::hrSystemProcesses.0 = Gauge32: 171

Current number of processes running. This often makes a good second number to pair up with the number of users. Or you could choose to display them on separate graphs.

Note: This has the same issue as logging the number of users. MRTG only samples every 5 minutes, which makes this an estimate at best.

MRTG: Reporting on processes and users

Here's a fragment from my MRTG configuration file that shows how I reported on the number of users and processes. I could not get MRTG to resolve the plain names to OIDs automatically, so I had to put in the full numeric OIDs.

### PROCESSES & USERS
Options[_]: gauge, integer, noborder, noinfo, nolegend, noo, nopercent, pngdate, printrouter, transparent
WithPeak[_]: ymw
Legend2[_]:
Legend3[_]:
Legend4[_]:

#Target[localhost.system.users]: hrSystemNumUsers.0&hrSystemNumUsers.0:public@localhost
Target[localhost.system.users]: .1.3.6.1.2.1.25.1.5.0&.1.3.6.1.2.1.25.1.5.0:public@localhost
MaxBytes[localhost.system.users]: 50
YLegend[localhost.system.users]: Users
LegendI[localhost.system.users]: Users
Legend1[localhost.system.users]: Approximate number of users logged in
ShortLegend[localhost.system.users]: ~
Title[localhost.system.users]: firewall:Users - Approximate System Users
PageTop[localhost.system.users]: <h1>firewall: Approximate System Users</h1>
    <div id="sysdetails">
    </div>

#Target[localhost.system.processes]: hrSystemProcesses.0&hrSystemProcesses.0:public@localhost
Target[localhost.system.processes]: .1.3.6.1.2.1.25.1.6.0&.1.3.6.1.2.1.25.1.6.0:public@localhost
MaxBytes[localhost.system.processes]: 5000
YLegend[localhost.system.processes]: Processes
LegendI[localhost.system.processes]: Processes
Legend1[localhost.system.processes]: Approximate number of processes
ShortLegend[localhost.system.processes]: ~
Title[localhost.system.processes]: firewall:Procs - Approximate System Processes
PageTop[localhost.system.processes]: <h1>firewall: Approximate System Processes</h1>
    <div id="sysdetails">
    </div>
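Since both Target lines above follow the same pattern (the same numeric OID on each side of the "&", then community@host), a small helper can print them. This is only a sketch; the function name mrtg_target is my own invention. If the MIBs are installed, "snmptranslate -On HOST-RESOURCES-MIB::hrSystemNumUsers.0" is one way to look up the numeric OID to feed into it.

```shell
# Hypothetical helper that prints an MRTG Target line.
# $1 = target name, $2 = numeric OID, $3 = community string, $4 = host
mrtg_target() {
  printf 'Target[%s]: %s&%s:%s@%s\n' "$1" "$2" "$2" "$3" "$4"
}

mrtg_target localhost.system.users .1.3.6.1.2.1.25.1.5.0 public localhost
mrtg_target localhost.system.processes .1.3.6.1.2.1.25.1.6.0 public localhost
```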


Real Memory in Use

HOST-RESOURCES-MIB::hrStorageType.2 = OID: HOST-RESOURCES-TYPES::hrStorageRam
HOST-RESOURCES-MIB::hrStorageDescr.2 = STRING: Real Memory
HOST-RESOURCES-MIB::hrStorageAllocationUnits.2 = INTEGER: 1024 Bytes
HOST-RESOURCES-MIB::hrStorageSize.2 = INTEGER: 8043628
HOST-RESOURCES-MIB::hrStorageUsed.2 = INTEGER: 7962536


Swap (Virtual) Memory in Use

HOST-RESOURCES-MIB::hrStorageType.3 = OID: HOST-RESOURCES-TYPES::hrStorageVirtualMemory
HOST-RESOURCES-MIB::hrStorageDescr.3 = STRING: Swap Space
HOST-RESOURCES-MIB::hrStorageAllocationUnits.3 = INTEGER: 1024 Bytes
HOST-RESOURCES-MIB::hrStorageSize.3 = INTEGER: 4021814
HOST-RESOURCES-MIB::hrStorageUsed.3 = INTEGER: 8292
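The hrStorageSize and hrStorageUsed values are both counted in allocation units (1024 bytes here), so the percentage in use is simply used divided by size. A quick sketch of that arithmetic, using the numbers from the output above (the function name storage_pct is just something I made up):

```shell
# Hypothetical helper: percentage of a storage entry that is in use.
# $1 = hrStorageUsed, $2 = hrStorageSize (both in allocation units)
storage_pct() {
  awk -v used="$1" -v size="$2" 'BEGIN { printf "%.2f\n", (used / size) * 100 }'
}

storage_pct 7962536 8043628   # Real Memory
storage_pct 8292 4021814      # Swap Space
```

Keep in mind that on Linux the "Real Memory" figure includes memory used for buffers and cache, so a value near 100% is not unusual.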


Processor Utilization

First, we need to find the OIDs of the CPUs.

# snmpwalk -v 1 -c public localhost | grep "HOST-RESOURCES" | egrep "hrDeviceProcessor"
HOST-RESOURCES-MIB::hrDeviceType.768 = OID: HOST-RESOURCES-TYPES::hrDeviceProcessor
HOST-RESOURCES-MIB::hrDeviceType.769 = OID: HOST-RESOURCES-TYPES::hrDeviceProcessor
HOST-RESOURCES-MIB::hrSWRunParameters.32755 = STRING: "hrDeviceProcessor"


That gives us 768 and 769 to look at.

# snmpwalk -v 1 -c public localhost | grep "HOST-RESOURCES" | egrep "(768|769)"
HOST-RESOURCES-MIB::hrDeviceIndex.768 = INTEGER: 768
HOST-RESOURCES-MIB::hrDeviceIndex.769 = INTEGER: 769
HOST-RESOURCES-MIB::hrDeviceType.768 = OID: HOST-RESOURCES-TYPES::hrDeviceProcessor
HOST-RESOURCES-MIB::hrDeviceType.769 = OID: HOST-RESOURCES-TYPES::hrDeviceProcessor
HOST-RESOURCES-MIB::hrDeviceDescr.768 = STRING: AuthenticAMD: AMD Athlon(tm) 64 X2 Dual Core Processor 4200+
HOST-RESOURCES-MIB::hrDeviceDescr.769 = STRING: AuthenticAMD: AMD Athlon(tm) 64 X2 Dual Core Processor 4200+
HOST-RESOURCES-MIB::hrDeviceID.768 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrDeviceID.769 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.768 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorFrwID.769 = OID: SNMPv2-SMI::zeroDotZero
HOST-RESOURCES-MIB::hrProcessorLoad.768 = INTEGER: 1
HOST-RESOURCES-MIB::hrProcessorLoad.769 = INTEGER: 1


So by looking at the hrProcessorLoad for nodes 768 & 769, we can track the CPU utilization on this PC. But unless you can get MRTG to load the MIBs, you'll need to use the numeric OID format.

# snmpwalk -v 1 -c public localhost -On | egrep "(768|769)" | grep "INTEGER"
.1.3.6.1.2.1.25.3.3.1.2.768 = INTEGER: 9
.1.3.6.1.2.1.25.3.3.1.2.769 = INTEGER: 17
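If you'd rather have one number for the whole box, the per-CPU values can be averaged. Here's a rough sketch that works on the numeric output shown above. The sample text is pasted in so that it runs without an SNMP agent, and the function name avg_cpu_load is my own.

```shell
# Sample lines copied from the numeric walk above. On a live system you
# would pipe the snmpwalk output into avg_cpu_load instead.
sample='.1.3.6.1.2.1.25.3.3.1.2.768 = INTEGER: 9
.1.3.6.1.2.1.25.3.3.1.2.769 = INTEGER: 17'

# Hypothetical helper: average every hrProcessorLoad row on stdin,
# truncated to a whole percent.
avg_cpu_load() {
  awk '/^\.1\.3\.6\.1\.2\.1\.25\.3\.3\.1\.2\./ { sum += $NF; n++ }
       END { if (n) printf "%d\n", sum / n }'
}

printf '%s\n' "$sample" | avg_cpu_load
```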


Friday, November 27, 2009
FSVS: Install on CentOS 5.4
(Also see my older post on this: FSVS - Install on CentOS 5. Or the original post where I explained the power of FSVS for sysadmins.)

I'm going to start with the assumption that this is a base CentOS 5.4 install without *any* package groups selected during the initial install. In my case, this is a DomU that I'm setting up under Xen to serve as a testing server for web development. The only things I've done so far are setting the root password and configuring it to use a static IP address.

The basic steps will be:

  1. Set up the RPMForge repository
  2. Install the packages needed for FSVS
  3. Download and compile FSVS
  4. Configure ignore patterns
  5. Do the base check-ins


Setting up RPMForge

In order to get the latest Subversion packages for your system, you'll have to add RPMForge as a source repository. The CentOS base repository only has Subversion 1.4.2 and the latest is currently 1.6.6. I recommend doing this in conjunction with the yum-priorities package.

# yum install yum-priorities

After installing the yum-priorities package, you should edit the CentOS-Base.repo file found under /etc/yum.repos.d/. For the base repositories, I recommend setting them to priority values of 1 through 3. For example:

[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-5
priority=1
exclude=subversion-*


I generally give the [base], [updates], [addons], [extras] repositories a priority of "1", with [centosplus] and [contrib] getting a priority of "3".

In addition, you'll need to add or edit the "exclude=" line in the [base] repository section to exclude Subversion from being sourced from that repository. This will allow the Yum package manager to look in other repositories to find Subversion.

Now we can install the RPMForge repository (see Using RPMForge).

# cd /root/
# wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.3.6-1.el5.rf.x86_64.rpm
# rpm -Uhv rpmforge-release-0.3.6-1.el5.rf.x86_64.rpm
# cd /etc/yum.repos.d/


Now you should edit the rpmforge.repo file and insert a priority= line. I recommend a value of 10 or 25.
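If you'd rather script that edit, something like the following sketch should do it. It assumes GNU sed (which is what CentOS ships) and assumes the section header in the file is "[rpmforge]", so check your copy first. The function name add_repo_priority is made up, and the demonstration runs against a scratch file rather than the real /etc/yum.repos.d/rpmforge.repo.

```shell
# Hypothetical helper: add a priority= line right after a [section] header.
# $1 = repo file, $2 = section name, $3 = priority value
# Note: running it twice adds the line twice, so check the file afterwards.
add_repo_priority() {
  sed -i "/^\[$2]/a priority=$3" "$1"
}

# Demonstration on a scratch file with a made-up stanza.
demo=$(mktemp)
printf '[rpmforge]\nname = sample stanza\nenabled = 1\n' > "$demo"
add_repo_priority "$demo" rpmforge 10
cat "$demo"
rm -f "$demo"
```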

You can now verify that you'll pull in the latest Subversion package by the following command:

# yum info subversion
Available Packages
Name : subversion
Arch : x86_64
Version : 1.6.6
Release : 0.1.el5.rf
Size : 6.8 M
Repo : rpmforge


Install the packages needed for FSVS

# yum install subversion subversion-devel ctags apr apr-devel gcc gdbm gdbm-devel pcre pcre-devel apr-util-devel

Download and compile FSVS

As always, you shouldn't compile code as the root user.

# su username
$ mkdir ~/fsvs
$ cd ~/fsvs
$ wget http://download.fsvs-software.org/fsvs-1.2.1.tar.bz2
$ tar xjf fsvs-1.2.1.tar.bz2
$ cd fsvs-1.2.1
$ ./configure
$ make
$ exit
# cp /home/username/fsvs/fsvs-1.2.1/src/fsvs /usr/local/bin/
# chmod 755 /usr/local/bin/fsvs


Creating the repository on the SVN server

This is how we set up users on our SVN server. Machine accounts are prefixed with "sys-" in front of the machine name. The SVN repository name matches the name of the machine account. In general, only the machine account should have write access to the repository, although you may wish to add other users to the group so that they can gain read-only access.

# useradd -m sys-www-test
# passwd sys-www-test
# svnadmin create /var/svn/sys-www-test
# cd /var/svn
# chmod -R 740 sys-www-test
# chmod -R g+s sys-www-test/db
# chown -R sys-www-test:sys-www-test sys-www-test


Back on the source machine (our test machine), we'll need to create an SSH key that can be used on our SVN server. You may wish to use a slightly larger RSA key (3200 bits or 4096 bits) if you're working on an extra sensitive server. But a key size of 2048 bits should be secure for another decade for this purpose.

# cd /root/
# mkdir .ssh
# chmod 700 .ssh
# cd .ssh
# /usr/bin/ssh-keygen -N '' -C 'svn key for root@hostname' -t rsa -b 2048 -f root@hostname
# cat root@hostname.pub


Copy this key into the clipboard or send it to the SVN server or the SVN server administrator. Then we'll need to create a ~/.ssh/config file to tell SSH what account name, port and key file to use when talking to the SVN server.

# vi /root/.ssh/config
Host svn.tgharold.com
Port 22
User sys-www-test
IdentityFile /root/.ssh/root@hostname


Back on the SVN server, you'll need to finish configuration of the user that will add files to the SVN repository.

# su sys-www-test
$ cd ~/
$ mkdir .ssh
$ chmod 700 .ssh
$ cd .ssh
$ cat >> authorized_keys
(paste in the SSH key from the other server)
$ chmod 600 authorized_keys


Now you'll want to prepend the following (followed by a space) to the key line in the authorized_keys file.

command="/usr/bin/svnserve -t -r /var/svn",no-agent-forwarding,no-pty,no-port-forwarding,no-X11-forwarding

That ensures (mostly) that the key can only be used to run the svnserve command and that it can't be used to access a command shell on the SVN server.
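To avoid hand-editing mistakes, the restricted line can also be built by a small helper before it is appended to authorized_keys. This is just a sketch: the function name restrict_key is made up, and the key text below is a placeholder rather than a real key.

```shell
# Hypothetical helper: put the svnserve restriction in front of a public key.
# $1 = the full public key line (contents of the .pub file)
restrict_key() {
  printf 'command="/usr/bin/svnserve -t -r /var/svn",no-agent-forwarding,no-pty,no-port-forwarding,no-X11-forwarding %s\n' "$1"
}

# Placeholder key material, for illustration only.
restrict_key 'ssh-rsa AAAAEXAMPLEKEYDATA svn key for root@hostname'
```

On the SVN server you would run this with the real key line and append the output to authorized_keys.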

Test the configuration back on the original server by issuing the "svn info" command. Alternately, you can try to ssh to the SVN repository server. Errors will usually either be logged in /var/log/secure on the source server or in the same log file on the SVN repository server.

Here's an example of a successful connection:

# ssh svn.tgharold.com
( success ( 2 2 ( ) ( edit-pipeline svndiff1 absent-entries commit-revprops depth log-revprops partial-replay ) ) )


This shows that the key is running the "svnserve" command automatically.

Connect the system to the SVN repository

The very first command that you'll need to issue for FSVS is the "urls" (or "initialize") command. This tells FSVS what repository will be used to store the files.

# cd /
# mkdir /var/spool/fsvs
# mkdir /etc/fsvs/
# fsvs urls svn+ssh://svn.tgharold.com/sys-www-test/


You may see the following error, which means you need to create the /var/spool/fsvs folder, then reissue the fsvs urls command.

stat() of waa-path "/var/spool/fsvs/" failed. Does your local WAA storage area exist?

The following error means that you forgot to create the /etc/fsvs/ folder.

Cannot write to the FSVS_CONF path "/etc/fsvs/".

Configuring ignore patterns and doing the base check-in

When constructing ignore patterns, generally work on adding a few directories at a time to the SVN repository. Everyone has different directories that they won't want to version, so you'll need to tailor the following to match your configuration. However, I generally recommend starting with the following:

# cd /
# fsvs ignore group:ignore,./dev
# fsvs ignore group:ignore,./etc/fsvs/
# fsvs ignore group:ignore,./etc/gconf/
# fsvs ignore group:ignore,./etc/gdm/
# fsvs ignore group:ignore,./home/
# fsvs ignore group:ignore,./lost+found
# fsvs ignore group:ignore,./media/
# fsvs ignore group:ignore,./mnt/
# fsvs ignore group:ignore,./proc
# fsvs ignore group:ignore,./root/.gconf
# fsvs ignore group:ignore,./root/.nautilus
# fsvs ignore group:ignore,./selinux/
# fsvs ignore group:ignore,./srv
# fsvs ignore group:ignore,./sys
# fsvs ignore group:ignore,./tmp/
# fsvs ignore group:ignore,./usr/tmp/
# fsvs ignore group:ignore,./var/gdm/
# fsvs ignore group:ignore,./var/lib/mlocate/
# fsvs ignore group:ignore,./var/lock/
# fsvs ignore group:ignore,./var/log/
# fsvs ignore group:ignore,./var/mail/
# fsvs ignore group:ignore,./var/run/
# fsvs ignore group:ignore,./var/spool/
# fsvs ignore group:ignore,./var/tmp/


Then you'll want to either ignore or encrypt the SSH key files.

# cd /
# fsvs ignore group:ignore,./root/.ssh
# fsvs ignore group:ignore,./etc/shadow*
# fsvs ignore group:ignore,./etc/ssh/ssh_host_key
# fsvs ignore group:ignore,./etc/ssh/ssh_host_dsa_key
# fsvs ignore group:ignore,./etc/ssh/ssh_host_rsa_key


You can check what FSVS is going to version by using the "fsvs status pathname" command (such as "fsvs status /etc"). Once you are happy with the selection in a particular path, you can do the following command:

# fsvs ci -m "base check-in" /etc

Repeat this for the various top level trees until you have checked everything in. Then you should do one last check-in at the root level that catches anything you might have missed.


Saturday, November 21, 2009
FSVS ignore patterns (1.2.0)
Here's an example of a more complex FSVS ignore/take pattern.

On our mail server, we store all mail in MailDir folders under the structure of:

/var/vmail/domainname/username/

We keep our user-specific Sieve scripts in a "Home" folder under that location.

/var/vmail/domainname/username/Home/

So obviously, we want to version the Home folder under each user. But we don't want to version the other MailDir folders at all. The trick to this is that because our folder structure is predictable, we can do it in a handful of FSVS ignore patterns.

# cd /
# fsvs ignore dump >> /root/fsvs-ignore-yyyymmdd.txt


That makes a backup of your current rules, just in case you decide that you don't like your changes (they can be reloaded with "fsvs ignore load < filename").

# cd /
# fsvs ignore group:ignore,./var/vmail/lost+found
# fsvs ignore group:take,./var/vmail/*
# fsvs ignore group:take,./var/vmail/*/*
# fsvs ignore group:take,./var/vmail/*/*/Home
# fsvs ignore group:take,./var/vmail/*/*/Home/**
# fsvs ignore group:ignore,./var/vmail/**


Line 1 "group:ignore,./var/vmail/lost+found": In our setup /var/vmail is a separate mount point, so we'll want to ignore the lost+found folder.

Line 2 "group:take,./var/vmail/*": This tells FSVS to version the entries directly below /var/vmail (the per-domain directories).

Line 3 "group:take,./var/vmail/*/*": Grabs the next directory level (the per-user directories) and any files at that level.

Line 4 "group:take,./var/vmail/*/*/Home": Now we grab just the "Home" folder inside of the user's MailDir directory. This lets us ignore the new|cur|tmp folders as well as the other hidden MailDir folders (such as .Junk).

Line 5 "group:take,./var/vmail/*/*/Home/**": Grab everything inside of Home and below that point. This will grab all of the Sieve scripts or other files that are located there. If you wanted to exclude certain files in Home, you would insert that ignore rule above this line.

Line 6 "group:ignore,./var/vmail/**": This is the clean-up rule. Anything not explicitly mentioned above here will now be ignored. This keeps us from versioning the messages inside the user's MailDir folders.
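The order of the six rules is what makes this work. Here is a rough simulation of the idea in plain shell, assuming that rules are tried top to bottom and the first match decides (which is what the advice about inserting rules "above this line" relies on). This is not FSVS's own matcher: I've hand-translated "*" to the regex [^/]* and "**" to .*, and the function name classify and the sample paths are made up.

```shell
# Ordered rules as "group regex" pairs, hand-translated from the patterns above.
rules='ignore ^\./var/vmail/lost\+found$
take ^\./var/vmail/[^/]*$
take ^\./var/vmail/[^/]*/[^/]*$
take ^\./var/vmail/[^/]*/[^/]*/Home$
take ^\./var/vmail/[^/]*/[^/]*/Home/.*$
ignore ^\./var/vmail/.*$'

# Hypothetical helper: print the group of the first rule that matches path $1.
classify() {
  printf '%s\n' "$rules" | while read -r group regex; do
    if printf '%s\n' "$1" | grep -Eq "$regex"; then
      echo "$group"
      break
    fi
  done
}

classify ./var/vmail/example.com/bob/Home/main.sieve
classify ./var/vmail/example.com/bob/cur/1234.msg
classify ./var/vmail/lost+found
```

The Sieve script falls through to rule 5 and is taken, while the message under "cur" matches nothing until the clean-up rule and is ignored.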


Friday, November 20, 2009
Getting started with GPG4Win
GNU Privacy Guard for Windows Home Page (GPG4Win) - The GPG4Win project recently released version 2.0.1 of their product, so I figured it was a good time to reexamine GPG4Win. There have been a few changes since version 1, most notable for me is that WinPT is no longer part of the GPG4Win distribution.

Installation

For getting started, I strongly recommend using the gpg4win-light package at first (you probably won't need Kleopatra or the German-only manuals). As for the optional modules, I'd recommend installing GPA and GPGEx at a minimum. Note that GPGOL is still only compatible with Outlook 2003 and Outlook 2007, so you may wish to not install that module if you use other versions of Microsoft Outlook. In addition, you probably won't need Claws Mail at first.

By default, GPG4Win puts your key files in a "gnupg" folder under your application data folder (wherever your APPDATA environment variable points to):

C:\Documents and Settings\USERNAME\Application Data\gnupg

Make sure you include this location in any backup programs that you are using. Your public and secret keyrings are stored in this folder and need to be backed up regularly.

Public Key Pairs

Now we get into the theoretical realm. GPG now supports RSA signing and encryption keys (in addition to the older DSA for signing and Elgamal for encryption). DSA signing keys are limited to 1024 bit lengths, while RSA signing keys can be much longer (512 to 4096 bits are commonly used). The only restriction that you should keep in mind for RSA keys is that you should never sign with the same key that you use for encryption (and vice versa). In GnuPG v2, the default is now to create two RSA keys for the account, one for encryption and one for signing.

Typically, you'll want signing keys to have a very long lifespan (at least 5 years, maybe as long as 20 or more). This allows you to build a much larger web of trust before your key can no longer be used to sign other keys. However, you should really expire your encryption key after a few years. Then, a bit before your encryption key expires, you should add a new encryption subkey to your key with a new expiration date.

Unfortunately, the default creation options in GnuPG will assign the same expiration to both the signing key and the encryption keys. But this can be fixed using the "gpg --edit-key" command.

Creating a GPG key

gpg --gen-key
gpg (GnuPG) 2.0.12; Copyright (C) 2009 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
(1) RSA and RSA (default)
(2) DSA and Elgamal
(3) DSA (sign only)
(4) RSA (sign only)
Your selection?


Unless you have a strong reason to use DSA/Elgamal, you may as well use the defaults in GPG v2 and pick "RSA and RSA".

RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)


If you're creating a key that will expire in the next 5 years, I recommend 2048 bits. For longer durations, you may wish to use 3172 or 4096 bits.

Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0)


For an initial key where you're not protecting anything super critical, I suggest starting with a 25 year (entered as "25y") expiration date. You will be asked to confirm the expiration date (enter "y" to continue).

GnuPG needs to construct a user ID to identify your key.

Real name:


For personal use, I suggest just entering your name (e.g. "Thomas Harold"). But if you're creating a key for corporate/business use, I suggest adding a bit more information in this field to make things easier for others if they have more than one key with similar names. I recommend against using parentheses in this field as it can be confusing later on. Square brackets "[]", curly braces "{}", or angle brackets "<>" are all good choices to set elements off from each other. Some examples:

Thomas Harold, Acme Inc.
Thomas Harold [Acme]
Thomas Harold
Thomas Harold {Example LTD}

Remember that this and the next two fields are all public information that will be visible to everyone who uses your public key to send you things, or who uses your signing key to verify a signature.

EMail address:

This is very simple: enter the primary email address that you want associated with this key (e.g. "tgh@tgharold.com"). If you need to add additional email addresses, you can do that later using the "gpg --edit-key" command.

Comment:

The comment field is a public field and will be seen by others. I recommend putting website information here, or the full company name, or a combination of the two. Keep in mind that the contents of this field are typically displayed enclosed in parentheses, so avoid using parentheses or brackets/braces here. Some examples:

www.tgharold.com
Acme Corporation - www.acme.corp
Example LTD, www.example.com

You selected this USER-ID:
"Thomas Harold [Acme] (Acme Corporate Sales - www.acme.corp) "

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit?


After entering those three values, you will be presented with how it might look to another user. As you can see, the comment gets wrapped in parentheses while the email address gets presented inside of angle brackets. Once you are satisfied with how it looks, enter "O" for "Okay" to continue.

GnuPG will then pop up a window that prompts you for a passphrase. This is extremely important. The passphrase that protects your key from unauthorized use is the weakest link of the entire GnuPG encryption chain. Pick something lengthy, yet easy to type, that is extremely difficult for someone to guess or attack. Write it down if you want, but keep that slip of paper secure in a safe or safety deposit box.

You will eventually be presented with something that looks like:

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 2 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 2u
gpg: next trustdb check due at 2009-12-16
pub 3200R/AAFA2876 2009-11-21 [expires: 2009-12-16]
Key fingerprint = 0324 917E C27D 2FB0 DDEF ABFA 4DEE 71F0 AAFA 2876
uid Thomas Harold [Acme] (Acme Corporate Sales - www.acme.corp)
sub 3200R/1972B360 2009-11-21 [expires: 2009-12-16]


This means that GnuPG has finished generating your key and has saved it to your keyring. This sample key (both the encryption key and the signing key) will expire Dec 16, 2009.

The key fingerprint is an important piece of information that should be given to your contacts over a secure channel. It will allow them to verify that they have the correct key and that they are not subject to a man-in-the-middle (MitM) attack when they use the key. You can find out the fingerprints of keys in your keyring using the "gpg --fingerprint" command. Typically, you would send them your public encryption key via email or some other digital method while telling them the key's fingerprint over an entirely different medium such as a telephone call or a physical piece of paper (letter / package).
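When comparing a fingerprint that was read over the phone against the one on screen, spacing and letter case shouldn't matter. Here's a small sketch of a comparison that ignores both. It also shows that the 8-digit key ID on the "pub" line is simply the last 8 hex digits of the fingerprint. The function names are made up for illustration.

```shell
# Hypothetical helpers for comparing fingerprints.
normalize_fpr() {
  # Strip spaces and upper-case the hex digits.
  printf '%s' "$1" | tr -d ' ' | tr 'a-f' 'A-F'
}

same_fpr() {
  [ "$(normalize_fpr "$1")" = "$(normalize_fpr "$2")" ]
}

short_id() {
  # The last 8 hex digits of the normalized fingerprint.
  normalize_fpr "$1" | tail -c 8
}

# Fingerprint of the sample key generated above.
fpr='0324 917E C27D 2FB0 DDEF ABFA 4DEE 71F0 AAFA 2876'

if same_fpr "$fpr" '0324917ec27d2fb0ddefabfa4dee71f0aafa2876'; then
  echo "fingerprints match"
fi
echo "short key ID: $(short_id "$fpr")"
```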

Editing your key

In order to edit your key using GnuPG, you must know the 8-digit key ID. In the above example it is listed on the line that starts with "pub". For example, the key that I just created has a key ID of "AAFA2876":

pub 3200R/AAFA2876 2009-11-21 [expires: 2009-12-16]

In order to edit the key, you will use the following command:

gpg --edit-key aaFa2876

As you can see, the key ID is not case sensitive as it is merely an 8-digit hexadecimal string.

gpg (GnuPG) 2.0.12; Copyright (C) 2009 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub 3200R/AAFA2876 created: 2009-11-21 expires: 2009-12-16 usage: SC
trust: ultimate validity: ultimate
sub 3200R/1972B360 created: 2009-11-21 expires: 2009-12-16 usage: E
[ultimate] (1). Thomas Harold [Acme] (Acme Corporate Sales - www.acme.corp)

Command>


This shows us a bunch of information. The line that starts with "pub" gives us the following information:

pub - indicates that this is the primary key (you will also see "sub" for subkeys)
3200R - this is a 3200 bit RSA key (R=RSA, D=DSA, g=Elgamal)
AAFA2876 - the key ID (or subkey ID)
created: / expire(d|s): - the creation and expiration dates
usage: - indicates how the key can be used (S=sign, E=encrypt)

Useful commands at this point are:

fpr - show key fingerprint
list - list key and user IDs
quit - exit without making changes

Changing the expiration date

By default, all operations will occur to the primary key (the "pub" line) in this keyset. So before you edit a subkey, you need to tell GnuPG to work with that key. These keys are simply numbered 1-N as they are shown in the list.

Command> key 1

pub 3200R/AAFA2876 created: 2009-11-21 expires: 2009-12-16 usage: SC
trust: ultimate validity: ultimate
sub* 3200R/1972B360 created: 2009-11-21 expires: 2009-12-16 usage: E


This puts an asterisk by the "sub*" line telling us that we're going to work on the subkey with ID "1972B360".

Command> expire
Changing expiration time for a subkey.
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 6m
Key expires at 05/19/10 20:28:31 Eastern Daylight Time
Is this correct? (y/N) y

You need a passphrase to unlock the secret key for
user: "Thomas Harold [Acme] (Acme Corporate Sales - www.acme.corp) "
3200-bit RSA key, ID AAFA2876, created 2009-11-21

pub 3200R/AAFA2876 created: 2009-11-21 expires: 2009-12-16 usage: SC
trust: ultimate validity: ultimate
sub* 3200R/1972B360 created: 2009-11-21 expires: 2010-05-20 usage: E


As you can see, the subkey's expiration date changed from "2009-12-16" to "2010-05-20". If we had wanted to change the primary key's expiration date, we would've entered "key 0" then "expire" at the "Command>" prompt.

Once you are happy with the new expiration dates, enter "save" to save and quit the key editor.

Adding another User ID to the key

Let's say that you want to add a second email address to your key pairs. As before, you're going to use the "gpg --edit-key" command to do this.

gpg --edit-key AaFa2876

Then you'll issue the "adduid" command.

Command> adduid
Real name: Thomas Harold [Example]
Email address: tgh@example.com
Comment: www.example.com
You selected this USER-ID:
"Thomas Harold [Example] (www.example.com) <tgh@example.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O


Your key will now look like:

pub 3200R/AAFA2876 created: 2009-11-21 expires: 2012-11-20 usage: SC
trust: ultimate validity: ultimate
sub 3200R/1972B360 created: 2009-11-21 expires: 2010-05-20 usage: E
[ultimate] (1) Thomas Harold [Acme] (Acme Corporate Sales - www.acme.corp)
[ unknown] (2). Thomas Harold [Example] (www.example.com)


Now that we have two User IDs associated with this key, we should flag one of them as the primary.

Command> uid 2
Command> primary
Command> uid 0

pub 3200R/AAFA2876 created: 2009-11-21 expires: 2012-11-20 usage: SC
trust: ultimate validity: ultimate
sub 3200R/1972B360 created: 2009-11-21 expires: 2010-05-20 usage: E
[ultimate] (1) Thomas Harold [Example] (www.example.com)
[ultimate] (2). Thomas Harold [Acme] (Acme Corporate Sales - www.acme.corp)


The asterisk by the number in parentheses is the currently selected user ID. If you see a dot/period after the number in parentheses, that indicates which user ID is the primary.

Backing up your key

The following command allows you to export your secret key to an ASCII armored text file.

gpg -a --export-secret-keys aafa2876 > my-secret-key.asc

You should also export your currently usable public encryption key.

gpg -a --export aafa2876 > my-public-key.asc

You should print these files out as well as keep an electronic copy in a secure location such as a safe or safe-deposit box. Don't leave the secret key ASCII file lying around. A sealed security envelope with a phrase and the current date written across the sealed flap and then covered with transparent tape is a good countermeasure to detect tampering.


Monday, November 09, 2009
CentOS 5, ClamAV 0.95 and /etc/sysconfig/clamav
I've been trying to configure the new ClamAV 0.95 as a milter for our Postfix install this week, so I've been doing some digging into the configuration files. Here's what I've found so far.

In order to get the newer ClamAV 0.95 for Red Hat Enterprise Linux 5 (RHEL5) and CentOS 5, I had to use the RPMForge repository.

The old clamav-milter package is outdated and should not be installed (use the newer clamav 0.95 or later package).

The /etc/rc.d/init.d/clamav script is still from 2008 and is very old. It references /etc/sysconfig/clamav which has an outdated setting called "CLAMAV_MILTER=yes". In ClamAV 0.95+, the milter was rewritten and now uses a configuration file (/etc/clamav-milter.conf) instead of command-line arguments. The init.d script that manages the clamd daemon is still for the older milter. It works fine for starting and stopping clamd, but you should not use the "CLAMAV_MILTER=yes" setting in the sysconfig file.

If you were using the /etc/sysconfig/clamav file to turn on the milter in RHEL5, then you will probably see the following error when you upgrade to ClamAV 0.95 or later:

Starting clamav-milter: clamav-milter: unrecognized option `--max-children=10'
ERROR: Unknown option passed
ERROR: Can't parse command line options


You'll need to convert your old command line options into configuration file options.


Thursday, October 22, 2009
Flash memory price check
Flash memory prices have finally dropped below $2.00/GB (around $1.90 at the moment).

SDHC cards:

$7 2GB
$9 4GB
$16 8GB
$30 16GB
$80 32GB

The sweet spot right now is $30 for 16GB.
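The per-GB figures are easy to reproduce. Here's a quick sketch of the arithmetic for the SDHC prices above (the function name price_per_gb is made up):

```shell
# Hypothetical helper: dollars per GB.
# $1 = price in dollars, $2 = capacity in GB
price_per_gb() {
  awk -v p="$1" -v g="$2" 'BEGIN { printf "%.3f\n", p / g }'
}

# SDHC prices from the list above, as "price capacity" pairs.
for item in "7 2" "9 4" "16 8" "30 16" "80 32"; do
  set -- $item
  printf '%sGB: $%s/GB\n' "$2" "$(price_per_gb "$1" "$2")"
done
```

The 16GB card works out to $1.875/GB, which is where the "around $1.90" figure comes from.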

What we see is that at the lower end of the scale, there's a minimum price point. Manufacturers don't like to sell units below $6-$7, or maybe retailers don't like to stock units below that price point. You'll see something similar in hard drive prices, where there are very few drives below $40-$50 on the market.

SSDs (MLC)

$95 32GB
$145 64GB
$280 128GB
$680 256GB

The sweet spot is either 64GB for $145 or 128GB for $280. SSDs are still a bit above the $2/GB price point, probably due to the extra circuitry and packaging of a 2.5" SSD.

Hopefully, by this time next year they'll break the $1/GB mark. Magnetic 2.5" drives are down around $0.17/GB for 500GB drives.


Monday, September 07, 2009
SNMP: Finding OIDs and MIBs
The two key tools in the toolbox for exploring MIBs and finding things in SNMP are the "snmpwalk" command and the actual MIB text definitions. On CentOS 5 (and RHEL 5), the net-snmp package installs a default set of MIBs to "/usr/share/snmp/mibs/".

# snmpwalk -v 2c -c public localhost diskIONReadX

That particular command uses version "2c" of the SNMP protocol to talk to the "public" community on the localhost and looks for "diskIONReadX" (which is a 64bit counter value column from the diskIOTable).

# snmptranslate -m +ALL -IR -Td diskIONReadX

Here, we use "snmptranslate" to report on full details (-Td) of the diskIONReadX property. When looking up SNMP attributes by labels, you'll want to use the above format, but you can change "-Td" to other "-T" options or a "-O" option. Some common choices are:

# snmptranslate -m +ALL -IR -Td diskIONReadX
UCD-DISKIO-MIB::diskIONReadX
diskIONReadX OBJECT-TYPE
-- FROM UCD-DISKIO-MIB
SYNTAX Counter64
MAX-ACCESS read-only
STATUS current
DESCRIPTION "The number of bytes read from this device since boot."
::= { iso(1) org(3) dod(6) internet(1) private(4) enterprises(1) ucdavis(2021) ucdExperimental(13) ucdDiskIOMIB(15) diskIOTable(1) diskIOEntry(1) 12 }

# snmptranslate -m +ALL -IR -On diskIONReadX
.1.3.6.1.4.1.2021.13.15.1.1.12

# snmptranslate -m +ALL -IR -Of diskIONReadX
.iso.org.dod.internet.private.enterprises.ucdavis.ucdExperimental.ucdDiskIOMIB.diskIOTable.diskIOEntry.diskIONReadX

# snmptranslate -m +ALL -IR -Ou diskIONReadX
enterprises.ucdavis.ucdExperimental.ucdDiskIOMIB.diskIOTable.
diskIOEntry.diskIONReadX


Monday, August 03, 2009
Second Copy 7 vs Samba v3
One of the tools that we use on our desktop machines is Second Copy 7, which is a very useful tool for doing file-level backups that are user friendly. It has a mode where it mirrors the source directory tree to the remote location, along with putting older copies of the files in a second remote location.

However, under certain conditions, Second Copy will end up making repeated copies of files in the "older copies" location every time the profile runs.

The primary cause is a Windows desktop clock that does not closely match the server's clock. You will see this problem frequently if you use "time.windows.com" as your clock source. (In Windows XP: Control Panel -> Date and Time -> Internet Time tab.) The "time.windows.com" clock source is generally horribly inaccurate compared to the source that Linux boxes running Samba typically sync to (usually the pool.ntp.org servers).

So the solution is either to sync your Windows boxes to a better clock source (such as "us.pool.ntp.org" or an internal NTP time server), or to adjust Second Copy to be much more tolerant of time differences. Second Copy's default allowance is a 2-second time difference. You may wish to increase this to as much as 30 or 60 seconds. This is a hidden option in the Second Copy profiles.dat file.

Setting up a Linux box to poll the pool.ntp.org servers and provide time to the internal network is a much preferred solution. You can also set up Samba to provide time to clients that belong to your domain.

References:

Q10169 - INFO: How does Second Copy handle file time stamps when copying the files between different file systems?

Addendum:

- After a bit of playing around with Samba options, I finally gave up and increased the "IgnoreTimeDifference=N" value under [Options] in the "profiles.dat" file from 2 seconds to 15 seconds. The Windows XP desktop machine, even though it was getting its time from the Linux Samba server, wasn't staying within a 2-second variation. But after loosening up the allowance to 15 seconds, things are working much better.

- If your Windows boxes are actually joined to the Samba Domain as client machines (only possible with Windows XP Pro, or the pro/business versions of Vista/Win7), then they might keep better synchronization with the Samba server's time.

- I'm pretty sure that the problem was not due to my referencing the backup location using UNC naming (i.e. \\servername\share\path).

- This issue mostly comes into play when you are backing up from one machine to another (such as a share location on another desktop or a server share). This is not something that you'll normally run into if you're backing up to a drive hooked directly to the machine.


Wednesday, July 22, 2009
3ware SATA RAID - Reboot and Select proper Boot device
Reboot and Select proper Boot device
or Insert Boot Media in selected Boot device and press a key


In the process of setting up my 15-disk array, I kept encountering the above error message while attempting to boot into CentOS 5 off of the 3ware 9650SE array. If I turned off the offending drive (or removed it), the error went away and CentOS would boot properly.

This particular error message seems to be generated by an MS-DOS / Windows boot loader on the hard drive. Some motherboards seem to prefer this particular MBR, rather than loading GRUB/LILO as desired. So, when scanning the drives, they tend to fixate on whatever boot loader they find first.

If you hook the drive to a regular SATA port (or use an external USB device), you can overwrite the first few gigabytes of the drive with zeroes. After that, adding the drive to the array will no longer cause the problem. Attempting to remedy the issue while the drive is connected to the 3ware controller will not work as the 3ware controller seems to hide the first few cylinders of the drive.

(It might work in JBOD mode, but you can't set up a JBOD disk on the fly using the tw_cli command on a 9650SE controller. So an external USB enclosure / tray / drive toaster is the best route.)

Updates:

- After playing with it a little more, it seems to be a BIOS issue where the BIOS gets confused if the 3ware controller presents more than 12 units to the BIOS.

- The 3ware controller does not "hide the first few cylinders of the drive". Instead, it seems to store its metadata for the drive at the end. Aside from the problem of losing data in the last few cylinders, you can take a drive configured as a "single-unit" and hook it up to a regular SATA controller with no issues.


3ware 9650SE SATA RAID and CentOS 5 Linux
A few months ago, I picked up a 3ware 9650SE 16-port controller for use in my primary office server which runs CentOS 5. So far, it's been an up and down ride.

Problem #1 - The boot process could not find the array disks.

I have a triple-mirror set that I use for my primary operating system drive. On the old system, they were hooked up directly to the SATA ports on the motherboard.

(Triple-mirror means that I created the elements in mdadm, a.k.a. Software RAID, where all 3 drives are active mirror copies of each other. This offers a slight speed-up for reads, allows you to survive 2 drive failures, and puts what would otherwise be an idle hot-swap disk to use.)

On the new system, I decided to attach them to the 3ware card and configure them as JBOD. However, the kernel initrd file (2.6.18-92.1.22.el5) that I was using for the old motherboard did not have drivers installed for the 3ware card.

So I had to create a custom initrd file by using gzip and cpio to unpack the contents into a directory. Then I copied the 3ware kernel driver binary from the 2.6.18-92.1.22.el5 folder and added it to the list of modules to be loaded. Finally, I repacked the initrd file, added a grub.conf entry to point at it, and booted cleanly.

Unfortunately, this was all done using the CentOS 5.3 install CD, so I was unable to log the session or keep careful track of what I did.

Problem #2 - JBOD is not really supported on the 9650SE

In the BIOS interface, you *can* set up disks as JBOD and tell the controller to export JBOD disks (see controller options). Setting a drive as JBOD does not currently overwrite/erase data on the drive, so it is a fairly safe operation and a good way to hook up a drive with existing data.

Note that if you have "export JBOD" set in the controller options, you can simply hook the drives up, rescan (using the BIOS or 3ware command-line utility called "tw_cli"), and the drives should show up as /dev/sd? in your list.

The major downside of JBOD mode is that write caching is disabled by default. This means that your drives are going to have a much higher utilization percentage (as seen by "atop") than if you had enabled write caching.

Now, you can enable write caching for JBOD drives, but the unit has to be told to do that after every reboot. The command (assuming that your controller is "c0" and the unit is "u12") is:

# tw_cli /c0/u12 set cache=on

A final note. If you're going to use write caching, you should spring for the BBU (Battery Backup Unit).
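
Since the cache setting has to be reapplied after every reboot, one option is a small script called from a boot-time hook such as /etc/rc.local. The following is only a sketch under assumptions: the function name, the TW_CLI override variable, and the unit numbers in the example are made up for illustration. The only tw_cli syntax it relies on is the "set cache=on" command shown above.

```shell
#!/bin/bash
# Sketch: re-enable write caching on a list of 3ware units after boot.
# TW_CLI is an illustrative override; set TW_CLI=echo for a dry run.
TW_CLI="${TW_CLI:-tw_cli}"

# enable_write_cache CONTROLLER UNIT...
enable_write_cache() {
    local controller="$1"
    shift
    local unit
    for unit in "$@"
    do
        # Same command as above, once per unit.
        "$TW_CLI" "/${controller}/${unit}" set cache=on
    done
}

# Example (hypothetical unit numbers):
#   enable_write_cache c0 u12 u13
```

Running it once with TW_CLI=echo is a cheap way to see exactly which commands would be issued before letting it touch the controller.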

Problem #3 - Controlling the RAID

Download and install the "tw_cli" tarball from 3ware. Since you have to have the 3ware driver installed to talk to the card, and the 9650SE prefers "Single Disk" over "JBOD", you're probably going to want to use 3ware RAID instead of Software RAID.

The problem with "Single Disk" mode is that it overwrites the first few sectors on the disk with 3ware control information. So all of the disks in "Single Disk" mode are going to be slightly smaller than a JBOD disk. Be aware that putting a disk into a 3ware array will cause the loss of anything at the start of the disk (such as the partition table) and the number of cylinders will be slightly smaller.

Of course, due to the strange geometry of a disk touched by the 3ware controller, you'll probably have to move the disk to another 3ware controller in order to read the data in the future. Well, maybe. If you're using Software RAID1 mirroring on top of 3ware Single Disks, then the data is highly likely to be in an easy to read format, other than the odd starting point for the partitions.

Anyway, some key commands when using the tw_cli application:

# tw_cli show
- Displays the list of controllers installed. Make note of the "c#" nomenclature as you will use those "c#" labels in later commands to refer to a specific controller.

# tw_cli /c0 show
- Displays units/ports for the *first* controller installed.

# tw_cli /c0 rescan
- Use this after inserting/removing a disk using a hot-swap enclosure.

# tw_cli /c0/ux show all
- Displays configuration information for whichever unit # you provided. Replace the "x" with the unit # that you want to look at (such as /c0/u3 or /c0/u12).

Problem #4 - Performance (a.k.a. I/O wait hell)

Unfortunately, the 3ware Linux kernel driver in Red Hat / CentOS 2.6.18-92.1.22.el5 is not very good. The symptoms are as follows:

1) Create multiple "single disk" units.

2) Make heavy writes to one or more of the units. Such as using "dd" to overwrite the unit with zeros.

3) Attempt to access data on the other units.

What you will find is that:

- Performance of the system starts to feel extremely sluggish for any operations that touch drives on the 3ware controller.

- Looking at "atop", you will see that the other drives are now reporting seek times of 100-200ms instead of 1-10ms. Their utilization numbers will be up around 90-100%, even though the number of reads/writes is only in the 2-3 digit range.

- Turning write caching on/off doesn't make a difference.

From my web searches, it seems like this may be a problem specific to kernel versions prior to 2.6.26. Unfortunately, the stock kernel in Red Hat / CentOS is based off of 2.6.18 and I haven't found out yet whether Red Hat / CentOS have backported the fix.

Updates:

- Even the 2.6.18-128 RHEL/CentOS kernel displays sluggishness any time that we access units (a RAID 6, 8 drive unit that is the only thing on the array). We have zero performance problems with drives attached to a different SATA controller running Software RAID.

- I can't recommend using the 9650SE controller with RHEL/CentOS currently. Performance is absolutely horrid under load.


Friday, June 12, 2009
SVN: Care and feeding of a sparse working copy
One of the wonderful (and long awaited for us) features in SVN 1.5 was the addition of "sparse" working copies. This allows you to create a working copy that checks out from the root of the repository, but without having to bring down the entire repository into your working copy. Which is a great boon for situations where a large monolithic repository is preferred over having lots of smaller repositories.

This was a feature that Visual SourceSafe (VSS) and SourceOffSite (SOS) had for many years prior to SVN 1.5's sparse working copy feature. We were an SOS shop for a long time prior to switching to SVN back in 2006 and our entire VSS repository was basically monolithic and we'd only bring down what we needed to our local working copies.

(The primary advantage for monolithic working copies is mostly ease of use and ease of administration. Rather than pester the SVN administrator to create a new repository for every new client or project, users can simply work within the existing project repository and create folders as needed. Plus, the end-user can more easily do a global update of their working copy overnight without having to write a lengthy batch file to update individual working copies.)

Support for sparse working copies got even better in 1.6 because now you can trim folders back out of your working copy once you no longer need them. In SVN 1.5, once you told SVN to bring a folder tree down, you were stuck with it and your working copy would slowly bloat.

Creating a sparse working copy:

  1. Create the folder on your hard drive (i.e. C:\TGH\Projects).
  2. Right-click in the folder and use TortoiseSVN's "SVN Checkout..." option.
  3. Enter the repository URL (i.e. svn+ssh://svn.tgharold.com/tgh-projects)
  4. Change the "Checkout Depth" to anything other than "Fully recursive". (I usually use "Only this item".)
  5. Click the "OK" button and TortoiseSVN will do a sparse checkout.

Populating your working copy:

  1. Right-click anywhere inside your working copy and bring up the TortoiseSVN Repository Browser (TSVN -> Repo-Browser).
  2. In the Repo-Browser, navigate to the project tree that you want to bring to your working copy.
  3. Right-click on the project folder and choose "Update item to revision". I usually set the "Update Depth" to "Working Copy" as this will bring down the folder and all of the children files and folders below it.

Note: Sometimes the Repo-Browser will lose track of what has / hasn't been brought down to the working copy. You won't see the "Update item to revision" choice in the right-click menu when this happens. The solution is to close the Repo-Browser window and open up a new one from a location in your working copy (i.e. go back to step 1).

Trimming your working copy:

  1. Right-click on the folder that you want to remove from your working copy and do an "SVN Update".
  2. Right-click on the folder that you want to remove from your working copy and do an "SVN Commit". This is to make sure that there's nothing in the working copy that you have forgotten to commit.
  3. Right-click on the folder that you want to remove from your working copy and use the TortoiseSVN -> Update to revision option.
  4. Change the "Update Depth" to "Exclude" and click the "OK" button.

Note: If TortoiseSVN does not remove the folder, then there was something in it that was not committed or that it felt you wanted to keep around. Which is why I recommend doing an Update and then using the Commit dialog to verify that the working copy folder is clean.
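
For anyone working from the command line instead of TortoiseSVN, the same three operations map onto the svn client's depth options. This is a rough sketch, not something taken from my own setup: the function names and the SVN override variable are illustrative, and "ProjectA" in the example is a hypothetical top-level folder. The "--depth" and "--set-depth" options are part of SVN 1.5, and "--set-depth exclude" needs 1.6 or later.

```shell
#!/bin/bash
# Sketch: command-line equivalents of the TortoiseSVN steps above.
# SVN is an illustrative override; set SVN=echo to print the commands.
SVN="${SVN:-svn}"

# sparse_checkout URL WC
# "Only this item" checkout of the repository root.
sparse_checkout() {
    "$SVN" checkout --depth empty "$1" "$2"
}

# sparse_populate PATH
# Bring a folder and everything below it into the working copy.
sparse_populate() {
    "$SVN" update --set-depth infinity "$1"
}

# sparse_trim PATH
# Remove a folder from the working copy again (SVN 1.6+).
sparse_trim() {
    "$SVN" update --set-depth exclude "$1"
}

# Example (ProjectA is hypothetical):
#   sparse_checkout svn+ssh://svn.tgharold.com/tgh-projects C:/TGH/Projects
#   sparse_populate C:/TGH/Projects/ProjectA
#   sparse_trim     C:/TGH/Projects/ProjectA
```

As with the GUI steps, it is worth running "svn status" on a folder before trimming it so that uncommitted work is not left behind.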


Wednesday, May 06, 2009
Basic SNMP reference links

Friday, April 10, 2009
Issues with FRAPS
I tend to do a lot of FRAPS capture of games that I play for a variety of reasons (the ability to look back, review gameplay, hobby). However, ever since I upgraded my NVIDIA display drivers a month or two ago, I've started running into the following issue in VirtualDub:

The decompression codec cannot decompress to an RGB format. This is very unusual. Check that any "Force YUY2" options are not enabled in the codec's properties.

Now for the fun symptoms:

- It does not always occur when attempting to do video encoding (usually in the job queue).

- When it does occur, it breaks the encoding job right at the start.

- A task that failed to encode the first time will often work the second time. So there's a random element of chance here.

- When things are screwy, sometimes the VirtualDub menu will disappear while using the "File -> Append AVI segment" menu option. This symptom may or may not actually be related to the issue. By vanish, I mean that the preview window will bleed through from the background and wipe out the dropped down File menu. But you can still select menu options by moving the mouse up/down.

All of this points to problems in the video codec rendering path. It's made mass conversion of FRAPS video a real PITA in VirtualDub, because I'm doing 2-pass XVid encoding so a failure in the 1st pass means that the 2nd pass also needs to be tossed.


Sunday, March 08, 2009
Linux: Checking the fstab prior to a reboot
One of the joys of working on a server from a remote location is dealing with the issue caused by a broken /etc/fstab file. Even the best admins make mistakes and mistakes in that file can lead to a server that won't boot.

Which is fine if you have an IP-based KVM where you can get console access without actually being at the facility. But not so great when a screwed up fstab file requires you to go physically visit the location.

So how do we verify that our fstab file makes sense prior to a reboot? The answer lies in the mount command. There are three useful options that can be passed to the mount command which will help us check the fstab file prior to a reboot.

-f - FAKE IT (Causes everything to be done except for the actual system call). This tells mount to do everything, but don't actually change anything.

-a - Mount all filesystems mentioned in fstab.

-v - Be verbose about it.

Here's an example where everything is fine and dandy.

# mount -fav
mount: /dev/md0 already mounted on /boot
mount: devpts already mounted on /dev/pts
mount: tmpfs already mounted on /dev/shm
mount: proc already mounted on /proc
mount: sysfs already mounted on /sys
mount: /dev/md4 already mounted on /var/log
nothing was mounted


Same example, except that I screwed up the name of /dev/md4 in the fstab file:

# mount -fav
mount: /dev/md0 already mounted on /boot
mount: devpts already mounted on /dev/pts
mount: tmpfs already mounted on /dev/shm
mount: proc already mounted on /proc
mount: sysfs already mounted on /sys
/dev/md4x on /var/log type ext3 (rw,noatime)
nothing was mounted


Now, there's probably a better way to do this, but this serves as at least a moderate check against shooting yourself in the foot.
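
One additional check that catches the "/dev/md4x" style of typo directly is to confirm that every /dev/... device named in fstab actually exists. The function below is a minimal sketch of that idea and is a supplement to "mount -fav", not a replacement. The function name is made up, and it deliberately ignores LABEL= and UUID= entries and pseudo-filesystems such as proc or tmpfs.

```shell
#!/bin/bash
# Sketch: list /dev/... entries in an fstab file whose device node is missing.
# check_fstab_devices [FSTAB]   (defaults to /etc/fstab)
# Prints one "missing: DEVICE (MOUNTPOINT)" line per problem and
# returns non-zero if any were found.
check_fstab_devices() {
    local fstab="${1:-/etc/fstab}"
    local status=0
    local dev mnt rest
    while read -r dev mnt rest || [ -n "$dev" ]
    do
        case "$dev" in
            ''|'#'*)
                continue                  # skip blank lines and comments
                ;;
            /dev/*)
                if [ ! -e "$dev" ]
                then
                    echo "missing: $dev ($mnt)"
                    status=1
                fi
                ;;
        esac
    done < "$fstab"
    return "$status"
}

# Example:
#   check_fstab_devices /etc/fstab || echo "fix fstab before rebooting"
```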


Thursday, February 05, 2009
SELinux and Nagios v3
Note: This post was never finished... so it probably contains lots of errors and incorrect information, with one or two grains of useful information.

Now that Nagios has upgraded to v3, I'm going to revisit my SELinux configuration for it. Back when I first started I was somewhat clueless about SELinux (and still greatly so) and I created a lot of really bad policy modules. They were a brute-force approach to fixing the issue using only audit2allow and ignoring labeling issues in the underlying filesystem.

(See my older piece "SELinux - troubleshooting file labeling issues".)

First off, let's use semodule to take a look at what modules are loaded:

# semodule -l | grep "nagios"
nagios 1.1.0
nagios20080426 1.0
nagios20080522 1.0
nagios20080725 1.0


What you see here is the base nagios module as provided by RedHat/CentOS (nagios 1.1.0) along with three modules that I created using audit2allow. The contents of those modules are pretty immaterial, so I'm going to remove them and recreate the exceptions from scratch.

# semodule -r nagios20080426
# semodule -r nagios20080522
# semodule -r nagios20080725


Now, if I were to start up Nagios right now, it would throw a lot of errors because I have SELinux set to Enforcing mode at the moment. So what we're going to do is temporarily put SELinux in "permissive" mode instead of "enforcing" mode. This will cause SELinux to log AVC denial messages to /var/log/audit/audit.log where we can look at them and use audit2allow to create a better exception policy.

# getenforce
Enforcing
# setenforce Permissive
# getenforce
Permissive


Now we can start up Nagios, taking careful note of the time.
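
The reason for noting the time is so that only the denials generated after Nagios started need to be reviewed. Each AVC record carries an epoch timestamp in its "msg=audit(...)" field, so a small filter can do the cut-off. This is a sketch only: the function name is made up, and the commands in the usage comment are an assumption about how you would wrap it.

```shell
#!/bin/bash
# Sketch: print AVC denials logged at or after a given epoch timestamp.
# avc_since EPOCH [LOGFILE]   (LOGFILE defaults to the audit log)
avc_since() {
    local since="$1"
    local log="${2:-/var/log/audit/audit.log}"
    awk -v since="$since" '
        /type=AVC/ {
            # Timestamps look like: msg=audit(1233836684.122:15494)
            if (match($0, /audit\([0-9]+/)) {
                # Skip the 6 characters of "audit(" and keep the seconds.
                ts = substr($0, RSTART + 6, RLENGTH - 6) + 0
                if (ts >= since) print
            }
        }
    ' "$log"
}

# Example: note the time, start Nagios, then review what was denied.
#   START=$(date +%s)
#   service nagios start
#   avc_since "$START"
```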


SELinux - troubleshooting file labeling issues
This is a follow-up to SELinux - dealing with exceptions.

First off, a few basics:

chcon should only be used for temporary changes. See SELinux Contexts - Labeling Files. Changes made with chcon will not survive a file system relabeling or use of the restorecon command.

/usr/sbin/semanage fcontext will permanently change the file context in a manner that will survive a relabel or restorecon. See 5.7.2. Persistent Changes: semanage fcontext in the Fedora 10 documentation.

How do I find out what file labels were defined already for a package?

This is a bit trickier, but the key lies in looking under the following directory tree:

/etc/selinux/targeted/contexts/

For file labels, look at the file_contexts* files under:

/etc/selinux/targeted/contexts/files/

For example, I want to see what file contexts are defined for Nagios:

# grep -h "nagios" /etc/selinux/targeted/contexts/files/file_contexts*
/usr/lib(64)?/nagios/cgi(/.*)? system_u:object_r:httpd_nagios_script_exec_t:s0
/usr/lib(64)?/nagios/plugins(/.*)? system_u:object_r:bin_t:s0
/usr/lib(64)?/nagios/cgi-bin(/.*)? system_u:object_r:httpd_nagios_script_exec_t:s0
/usr/lib(64)?/cgi-bin/nagios(/.+)? system_u:object_r:httpd_nagios_script_exec_t:s0
/usr/lib(64)?/cgi-bin/netsaint(/.*)? system_u:object_r:httpd_nagios_script_exec_t:s0
/etc/nagios(/.*)? system_u:object_r:nagios_etc_t:s0
/var/log/nagios(/.*)? system_u:object_r:nagios_log_t:s0
/var/log/netsaint(/.*)? system_u:object_r:nagios_log_t:s0
/var/spool/nagios(/.*)? system_u:object_r:nagios_spool_t:s0
/usr/bin/nagios -- system_u:object_r:nagios_exec_t:s0
/etc/nagios/nrpe\.cfg -- system_u:object_r:nrpe_etc_t:s0


You can also use the seinfo tool:

# seinfo -t | grep "nagios"
Rule loading disabled
nagios_spool_t
httpd_nagios_script_ra_t
httpd_nagios_script_ro_t
httpd_nagios_script_rw_t
nagios_t
httpd_nagios_script_t
nagios_tmp_t
httpd_nagios_htaccess_t
nagios_var_run_t
httpd_nagios_content_t
nagios_exec_t
httpd_nagios_script_exec_t
nagios_etc_t
nagios_log_t


Another tool is sesearch, i.e.:

# sesearch -a | grep "nagios" | sort | uniq

Troubleshooting and fixing things

Step #1 is generally to figure out whether the AVC denial was caused by a mislabeled file. If so, we need to change the file label.

Here's an example of what setroubleshoot log messages look like in the /var/log/messages file.

# grep "setroubleshoot" /var/log/messages
setroubleshoot: SELinux is preventing the status.cgi from using potentially mislabeled files ./objects.cache (var_t). For complete SELinux messages. run sealert -l ce49f540-0b35-412c-862c-b901a274a421

setroubleshoot: SELinux is preventing ping (ping_t) "read write" to /var/nagios/spool/checkresults/checkZKmcmr (var_t). For complete SELinux messages. run sealert -l cf227199-1595-4775-9970-3935fc761b38

setroubleshoot: SELinux is preventing ping (ping_t) "read write" to /var/nagios/spool/checkresults/checke4tQgY (var_t). For complete SELinux messages. run sealert -l dbdc707e-193a-4f64-9bf2-0bb0d0a807e9


And here's what they look like in /var/log/audit:

# grep "AVC" /var/log/audit/audit.log | tail

type=AVC msg=audit(1233836684.122:15494): avc: denied { read } for pid=12081 comm="status.cgi" name="objects.cache" dev=md1 ino=1306897 scontext=system_u:system_r:httpd_nagios_script_t:s0 tcontext=user_u:object_r:var_t:s0 tclass=file

type=AVC msg=audit(1233836426.120:15476): avc: denied { read write } for pid=7518 comm="ping" path="/var/nagios/spool/checkresults/checkZKmcmr" dev=md1 ino=1306899 scontext=user_u:system_r:ping_t:s0 tcontext=user_u:object_r:var_t:s0 tclass=file

type=AVC msg=audit(1233836366.097:15454): avc: denied { read write } for pid=20671 comm="ping" path="/var/nagios/spool/checkresults/checke4tQgY" dev=md1 ino=1306899 scontext=user_u:system_r:ping_t:s0 tcontext=user_u:object_r:var_t:s0 tclass=file


In this particular case, the fact that the target context is "var_t" generally indicates a labeling issue. The "var_t" file context is pretty generic and we don't want to give the source context (httpd_nagios_script_t) for status.cgi permissions to all files labeled with var_t (which would be most of /var).

This means that using audit2allow is the wrong fix for this particular issue.
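
To find every file that is tripping over a generic label, it helps to pull just the command and the target file out of the AVC records whose target context is var_t. The following is a minimal sketch using grep and sed. The function name is made up, and it assumes the record layout shown in the log excerpts above (a comm="..." field followed by either name="..." or path="...").

```shell
#!/bin/bash
# Sketch: list "COMMAND TARGET" pairs for AVC denials against a generic type.
# avc_generic_targets [LOGFILE] [TYPE]   (TYPE defaults to var_t)
avc_generic_targets() {
    local log="${1:-/var/log/audit/audit.log}"
    local type="${2:-var_t}"
    grep 'type=AVC' "$log" \
        | grep "tcontext=[^ ]*:${type}:" \
        | sed -n \
            -e 's/.*comm="\([^"]*\)".* path="\([^"]*\)".*/\1 \2/p' \
            -e 's/.*comm="\([^"]*\)".* name="\([^"]*\)".*/\1 \2/p' \
        | LC_ALL=C sort -u
}

# Example:
#   avc_generic_targets /var/log/audit/audit.log var_t
```

Each line of output is a candidate for relabeling rather than for a new audit2allow rule.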

The correct solution is to either find out what file context should be used, or create a context and grant nagios access to those files.

References:

Fedora 10 Security-Enhanced Linux User Guide

Top three things to understand in fixing SELinux problems. Reposted

Fedora SELinux Project Pages (wiki)

Red Hat Enterprise Linux 4: Red Hat SELinux Guide

How to: Install and Setup XEN Virtualization Software on CentOS Linux 5 - Covers how to use semanage to grant the Xen process access to a directory where it will store the DomU storage as files.


Wednesday, January 28, 2009
Removing a failed, non-existent drive from Software RAID
So, you have a drive that has failed, you've replaced the drive on the fly (using hot-swap SATA) and now you need to remove the old RAID slice.

For example:

md0 : active raid1 sdi1[0] sdc1[2] sdb1[3](F) sda1[1]
264960 blocks [3/3] [UUU]


In this case, sdb1 is marked as failed, and sdi1 was the slice from the newly added drive (via SATA hot-plug). So we want to remove sdb1 with mdadm's remove command:

# mdadm /dev/md0 --remove /dev/sdb1
mdadm: cannot find /dev/sdb1: No such file or directory


Oops, we can't do that because we already swapped out the failed drive (sdb).

The answer is found in the mdadm man page for the remove feature:

-r, --remove remove listed devices. They must not be active. i.e. they should be failed or spare devices. As well as the name of a device file (e.g. /dev/sda1) the words failed and detached can be given to --remove. The first causes all failed device to be removed. The second causes any device which is no longer connected to the system (i.e an open returns ENXIO) to be removed. This will only succeed for devices that are spares or have already been marked as failed.

So instead of specifying the name of the failed RAID slice, we should use the following command:

# mdadm /dev/md0 -r detached  
mdadm: hot removed 8:17


And there you have it, the failed raid slice that is no longer connected to the system has been removed. It will not show up in "/proc/mdstat" any more.
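
If you have several arrays, it can be handy to list every member that is still flagged "(F)" before and after the removal. Here is a small sketch that parses the mdstat format shown above. The function name is made up, and it only looks at the "mdX : active ..." lines.

```shell
#!/bin/bash
# Sketch: list array members that are flagged as failed in mdstat.
# failed_members [MDSTAT]   (defaults to /proc/mdstat)
# Prints "ARRAY MEMBER" for each member marked with (F).
failed_members() {
    local mdstat="${1:-/proc/mdstat}"
    awk '
        $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)$/) {
                    member = $i
                    # Strip the "[slot](F)" suffix, e.g. sdb1[3](F) -> sdb1
                    sub(/\[[0-9]+\]\(F\)$/, "", member)
                    print $1, member
                }
            }
        }
    ' "$mdstat"
}

# Example:
#   failed_members /proc/mdstat
```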


Wednesday, January 14, 2009
Yum Error - Metadata file does not match checksum
Ran into this issue today when the pgsql folks updated their repository. All of our CentOS 5 machines are behind a transparent HTTP proxy cache server (squid).

filelists.sqlite.bz2      100% |=========================| 157 kB    00:00     
http://yum.pgsqlrpms.org/8.3/redhat/rhel-5-x86_64/repodata/filelists.sqlite.bz2: [Errno -1] Metadata file does not match checksum
Trying other mirror.
Error: failure: repodata/filelists.sqlite.bz2 from pgdg83: [Errno 256] No more mirrors to try.


It doesn't really matter what the package name is, the primary issue is the "[Errno -1] Metadata file does not match checksum" error message.

Solution:

1) Edit /etc/yum.conf and add the following line

http_caching=packages

2) Run "yum clean metadata"

3) Retry the yum install
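
With more than a couple of machines behind the proxy, step 1 is easier to script than to edit by hand. This is a sketch under assumptions: the function name is made up, it relies on GNU sed (which is what CentOS ships), and it expects yum.conf to contain a [main] section.

```shell
#!/bin/bash
# Sketch: make sure yum.conf contains "http_caching=packages".
# set_http_caching [YUM_CONF]   (defaults to /etc/yum.conf)
set_http_caching() {
    local conf="${1:-/etc/yum.conf}"
    if grep -q '^http_caching=' "$conf"
    then
        # Replace an existing setting in place.
        sed -i 's/^http_caching=.*/http_caching=packages/' "$conf"
    else
        # Otherwise add it directly after the [main] section header.
        sed -i '/^\[main\]/a http_caching=packages' "$conf"
    fi
}

# Example:
#   set_http_caching /etc/yum.conf && yum clean metadata
```

Running it twice leaves a single http_caching line in the file, so it is safe to include in a setup script.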

References:

FedoraForum.org > Fedora Support > Installation Help > yum "Metadata file does not match checksum" problem


Monday, January 12, 2009
PostgreSQL - Errors during insertions
ERROR: could not access status of transaction 84344832
DETAIL: Could not open file "pg_clog/0050": No such file or directory.


Running into this issue while doing heavy inserts on a table. The error also shows up when doing a "vacuum analyze" on the table. This is with PostgreSQL 8.3.5.

Troubleshooting steps:

1) # grep "was terminated" /var/lib/pgsql/data/pg_log/*.log

postgresql-2008-12-23_125649.log:LOG: server process (PID 15996) was terminated by signal 6: Aborted

postgresql-2008-12-28_131324.log:LOG: server process (PID 25745) was terminated by signal 6: Aborted

postgresql-2009-01-01_212245.log:LOG: startup process (PID 27003) was terminated by signal 6: Aborted

postgresql-2009-01-01_212334.log:LOG: startup process (PID 27097) was terminated by signal 6: Aborted


All of my termination messages are due to "signal 6: Aborted", so I don't think there's anything to be seen there.

2) Going to look for "Could not open file" in the log files. The command is:

# grep "Could not open file" /var/lib/pgsql/data/pg_log/*.log

postgresql-2009-01-11_000000.log:DETAIL:  Could not open file "pg_clog/0050": No such file or directory.

postgresql-2009-01-11_000000.log:DETAIL: Could not open file "pg_clog/0050": No such file or directory.

postgresql-2009-01-11_170257.log:DETAIL: Could not open file "pg_clog/0050": No such file or directory.

postgresql-2009-01-11_172436.log:DETAIL: Could not open file "pg_clog/0050": No such file or directory.

postgresql-2009-01-12_000000.log:DETAIL: Could not open file "pg_clog/0050": No such file or directory.


These are the errors that I'm seeing. They started on Jan 11th.

...

Fix attempt #1

A) Shutdown the database, make sure you have good backups.

B) Find out the size of the pg_clog files. The following shows that ours are 256KB in size.

# ls -lk /var/lib/pgsql/data/pg_clog/ | head

-rw------- 1 postgres postgres 256 Dec 29 17:56 007E
-rw------- 1 postgres postgres 256 Dec 29 19:05 007F
-rw------- 1 postgres postgres 256 Dec 29 20:15 0080
-rw------- 1 postgres postgres 256 Dec 29 21:23 0081
-rw------- 1 postgres postgres 256 Dec 29 22:29 0082
-rw------- 1 postgres postgres 256 Dec 29 23:23 0083
-rw------- 1 postgres postgres 256 Dec 30 00:15 0084
-rw------- 1 postgres postgres 256 Dec 30 01:08 0085

C) Create the missing clog file:

# dd if=/dev/zero of=/var/lib/pgsql/data/pg_clog/0050 bs=1k count=256
# chmod 600 /var/lib/pgsql/data/pg_clog/0050
# chown postgres:postgres /var/lib/pgsql/data/pg_clog/0050

D) Restart the pgsql service.
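
As a sanity check on which file to create, the clog file name can be worked out from the transaction ID in the error message. The arithmetic below is a sketch that assumes the default 8 kB block size: commit status takes 2 bits per transaction (4 per byte), and each 256KB clog file holds 32 pages, so one file covers 1,048,576 transactions. The function name is made up.

```shell
#!/bin/bash
# Sketch: work out which pg_clog file holds the status of a transaction ID.
# Assumes the default 8192-byte block size.
# clog_segment XID
clog_segment() {
    local xid="$1"
    # 4 transactions per byte * 8192 bytes per page * 32 pages per file
    local xacts_per_segment=$(( 4 * 8192 * 32 ))
    # File names are the segment number as 4 hex digits.
    printf '%04X\n' $(( xid / xacts_per_segment ))
}

# Example:
#   clog_segment 84344832    # prints 0050
```

For transaction 84344832 this gives 0050, which matches the file named in the error, and 8192 * 32 bytes matches the 256KB file size seen in the directory listing.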

...

Final notes. This is probably absolutely NOT the proper way to fix this error. Proceed at your own risk. The chance of lost data is VERY HIGH.

For us, it was a table that was append only, that we were filling out with test data. So I'm not all that concerned.


Saturday, January 10, 2009
Subversion Backups - Finding SVN repositories
The first trick when backing up SVN repositories is finding them so that you can run the svnadmin hotcopy command. Well, you *could* just set up a list of export DIRS in your backup script - but as you add new SVN repositories, you have to constantly edit that script.

Caveat #1: This setup probably only works for FSFS repositories. I don't use Berkeley DB repositories (a.k.a. BDB) so I can't guarantee that it correctly locates them. I've chosen to look for folders that contain the "db/uuid" file as our "marker" file, which should result in zero false-positives or mis-identified repositories.

Caveat #2: I'm making the assumption that all of your repositories are stored in a central location (/var/svn), but they do *not* all have to be at the same depth.

Step #1 - Find the uuid files

This is pretty simple, we're just going to use find and grep.

# find /var/svn -name uuid | grep "db/uuid"

/var/svn/tgh-photo/db/uuid
/var/svn/tgh-dev/db/uuid
/var/svn/tgh-web/db/uuid

Step #2 - Clean up the pathnames

Even better, we can tack a sed command onto the end to trim off the "/db/uuid" portion, which gets us exactly what we need for passing to the "svnadmin hotcopy" command.

# find /var/svn -name uuid | grep "db/uuid" | sed 's/\/db\/uuid//'

/var/svn/tgh-photo
/var/svn/tgh-dev
/var/svn/tgh-web

(Make sure that you get all of the "\" and "/" in the right places.)

Step #3 - Strip off the base directory

Since I'm going to create a script variable called "BASE" that equals "/var/svn/", I'm also going to strip that off of the front of the paths.

However, we'll need to convert the BASE variable into something that sed can properly deal with. Otherwise the slashes won't be escaped properly for the sed replacement.

# echo "/var/svn/" | sed 's/\//\\\//g'
\/var\/svn\/

# find /var/svn -name uuid | grep "db/uuid" | sed 's/\/db\/uuid//' | sed 's/^\/var\/svn\///'
tgh-photo
tgh-dev
tgh-web

Or, even better, we can use a different delimiter for sed. That gives us a search line of:

DIRS=`find ${BASE} -name uuid | grep 'db/uuid$' | sed 's:/db/uuid$::' | sed 's:^/var/svn/::'`

Which puts a list of directory names into the DIRS variable in our bash script.
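
The same search can also be written without the grep step by letting find match on the path, and without hard-coding "/var/svn/" in the sed expression. This is an alternative sketch rather than what my script uses. The function name is made up, and it assumes a find that supports "-path" (GNU find does) and a base path that contains no ":" characters.

```shell
#!/bin/bash
# Sketch: list repositories under a base directory, relative to that base.
# find_repos BASE
find_repos() {
    local base="${1%/}/"      # normalise to exactly one trailing slash
    # Match only regular files named uuid that sit inside a db directory,
    # then strip the marker suffix and the base prefix.
    find "${base%/}" -type f -path '*/db/uuid' \
        | sed -e 's:/db/uuid$::' -e "s:^${base}::"
}

# Example:
#   DIRS=`find_repos /var/svn`
```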

Step #4 - Putting it all together

Here's our basic script.

#!/bin/bash
BASE="/var/svn/"
HOTCOPY="/var/svn-hotcopy/"
DIRS=`find ${BASE} -name uuid | grep 'db/uuid$' | sed 's:/db/uuid$::' | sed 's:^/var/svn/::'`

for DIR in ${DIRS}
do
echo "svnadmin hotcopy ${BASE}${DIR} to ${HOTCOPY}${DIR}"
rm -r ${HOTCOPY}${DIR}
svnadmin hotcopy ${BASE}${DIR} ${HOTCOPY}${DIR}
done


However, we're not quite done yet, because the "svnadmin hotcopy" command doesn't like it when the destination folder does not exist. So let's add the following scrap of code into the loop.

if ! test -d ${HOTCOPY}${DIR}
then
mkdir -p ${HOTCOPY}${DIR}
fi


The final script

#!/bin/bash

BASE="/var/svn/"
HOTCOPY="/var/svn-hotcopy/"

FIND=/usr/bin/find
GREP=/bin/grep
RM=/bin/rm
SED=/bin/sed
SVNADMIN=/usr/bin/svnadmin

DIRS=`$FIND ${BASE} -name uuid | $GREP 'db/uuid$' | $SED 's:/db/uuid$::' | $SED 's:^/var/svn/::'`

for DIR in ${DIRS}
do
echo "svnadmin hotcopy ${BASE}${DIR} to ${HOTCOPY}${DIR}"

if ! test -d ${HOTCOPY}${DIR}
then
mkdir -p ${HOTCOPY}${DIR}
fi

$RM -r ${HOTCOPY}${DIR}
$SVNADMIN hotcopy ${BASE}${DIR} ${HOTCOPY}${DIR}
done

# insert rdiff-backup line here


Hopefully that works out. Note the use of "rm -r", which could cause data loss if there are errors in the script. You will want to be very careful while working on the script.
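
One way to make the "rm -r" step a little less dangerous is to wrap it in a guard. This is only a sketch: the function name safe_remove and its checks are my own assumptions, not something svnadmin provides.

```shell
#!/bin/bash
# Hypothetical guard around the "rm -r" step.
# $1 = hotcopy base directory, $2 = repository name relative to it.
safe_remove() {
    local base="${1%/}" name="$2"
    # Refuse empty values so this never expands to "rm -r /" or "rm -r <base>".
    if [ -z "$base" ] || [ -z "$name" ]; then
        echo "refusing to remove: empty base or name" >&2
        return 1
    fi
    # Refuse names that could climb out of the base directory.
    case "$name" in
        /*|..|../*|*/..|*/../*)
            echo "refusing to remove: suspicious name '$name'" >&2
            return 1
            ;;
    esac
    # Nothing to do if the old copy does not exist.
    [ -d "$base/$name" ] || return 0
    rm -r -- "$base/$name"
}

# Usage inside the loop (same variables as the script above):
# safe_remove "${HOTCOPY}" "${DIR}" || exit 1
```

The point is simply that an empty or unexpected variable stops the script instead of deleting the wrong folder.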


PostgreSQL - Backup Speed Tests
Our backup script for pgsql dumps the databases out in plain text SQL format (my preferred method for a variety of reasons). The question was, do we leave it as plain text and/or which compressor do we use?

...

Here are sample times for a plain-text SQL backup.

real 3m30.523s
user 0m14.053s
sys 0m5.132s

The raw database cluster is 22GB (22150820 KB), but that includes indexes. The resulting size of the backups is 794MB (812436 KB). The specific command line used is:

pg_dump -a -b -O -t $s.$t -x $d -f $DIR/$d/$s.$t.sql

($s, $t, $d and $DIR are variables denoting the schema, table, database, and base backup directory)

...

gzip (default compression level of "6")

pg_dump -a -b -O -t $s.$t -x $d | gzip -c > $DIR/$d/$s.$t.sql.gz

CPU usage is pegged at 100% on one of the four CPUs in the box during this operation (due to gzip compressing the streams). So we are bottlenecked by the somewhat slow CPUs in the server.

real 3m0.337s
user 2m17.289s
sys 0m6.740s

So we burned up a lot more CPU time (user 2m 17s) compared to the plain text dump (user 14s). But the overall operation still completed fairly quickly. So how much space did we save? The resulting backups are only 368MB (376820 KB), which is a good bit smaller.

(It would be better, but a large portion of our current database is comprised of the various large "specialized" tables, which are extremely random and difficult to compress. I can't talk about the contents of those tables, but the data in them is generated by a PRNG.)

...

gzip (compression level of "9")

pg_dump -a -b -O -t $s.$t -x $d | gzip -c9 > $DIR/$d/$s.$t.sql.gz

We're likely to be even more CPU-limited here due to telling gzip to "try harder". The resulting backups are 369MB (376944 KB), which is basically the same size.

real 9m39.513s
user 7m28.784s
sys 0m12.585s

So we burn up 3.2x more CPU time, but we don't really change the backup size. Probably not worth it.
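
For anyone who wants to check the ratios, here is a small sketch that converts the "user" times reported by "time" into seconds and divides them. The function names to_seconds and cpu_ratio are made up for this example.

```shell
#!/bin/bash
# Convert a time in the "NmSS.sss" + "s" form printed by "time" into seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, part, "m")          # part[1] = minutes, part[2] = seconds + "s"
        sub(/s$/, "", part[2])       # drop the trailing "s"
        printf "%.3f\n", part[1] * 60 + part[2]
    }'
}

# Ratio of two "user" times, rounded to two decimals.
cpu_ratio() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

cpu_ratio 7m28.784s 2m17.289s   # gzip -9 vs. gzip default
cpu_ratio 3m52.559s 2m17.289s   # bzip2 -9 vs. gzip default
```

With the numbers from this post, that works out to roughly 3.27 for gzip -9 and 1.69 for bzip2, both relative to gzip at its default level.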

...

bzip2

pg_dump -a -b -O -t $s.$t -x $d | bzip2 -c9 > $DIR/$d/$s.$t.sql.bz2

real 19m45.280s
user 3m52.559s
sys 0m11.709s

Interesting: in CPU time (user 3m 52s), bzip2 is a bit less than twice as slow as gzip at its default compression level, and it didn't perform as badly as gzip's maximum compression option. The resulting backup files are only 330MB (337296 KB), which is a decent improvement over gzip.

The other interesting thing is that bzip2 took a lot longer than gzip in wall-clock time (real 19m 45s), but the server is pretty busy at the moment.

...

Ultimately, we ended up going with bzip2 for a variety of reasons.

- Better compression
- The additional CPU usage was not an issue
- We could change to a smaller block size (-2, as in "bzip2 -c2") to be more friendly to rsync


Thursday, January 08, 2009
PostgreSQL - Basic backup scheme
Here's a basic backup scheme. We're using pg_dump in plain-text mode, compressing the output with bzip2, and writing the results out to files named after the database, schema and table name. It's not the most efficient method, but allows us to go back to:

- any of the past 7 days
- any Sunday within the past month
- the last week of each month in the past quarter
- the last week of each quarter within the past year
- the last week of each year

Which is about 24-25 copies of the data, stored on the hard drive. So you'll need to make sure that you have enough space on the drive to handle all of these copies.

Most of the grunt work is handled by the include script. The daily / weekly / monthly backup scripts simply set up a few variables and then call the main include script.

backup_daily.sh
#!/bin/bash
# DAILY BACKUPS (writes to a daily folder each day)
DAYNR=`date +%w`
echo $DAYNR
DIR=/backup/pgsql/daily/$DAYNR/
echo $DIR

source ~/bin/include_backup_compressed.sh


backup_weekly.sh
#!/bin/bash
# WEEKLY BACKUPS
# Backups go to five directories based on the day of the month,
# converted into 1-5 using integer division. The fifth week
# will sometimes be left over for a few months depending on how
# many weeks there are in the year.
# %-d (GNU date) drops the leading zero so that bash arithmetic
# does not try to read 08 or 09 as an octal number.
WEEKNR=`date +%-d`
echo $WEEKNR
let "WEEKNR = (WEEKNR+6) / 7"
echo $WEEKNR
DIR=/backup/pgsql/weekly/$WEEKNR/
echo $DIR

source ~/bin/include_backup_compressed.sh


backup_monthly.sh
#!/bin/bash
# MONTHLY BACKUPS
# Backups go to three directories based on the month of year
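# converted into 1-3 based on modulus arithmetic.
# %-m (GNU date) drops the leading zero so that bash arithmetic
# does not try to read 08 or 09 as an octal number.
MONTHNR=`date +%-m`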
echo $MONTHNR
let "MONTHNR = ((MONTHNR -1) % 3) + 1"
echo $MONTHNR
DIR=/backup/pgsql/monthly/$MONTHNR/
echo $DIR

source ~/bin/include_backup_compressed.sh


backup_quarterly.sh
#!/bin/bash
# QUARTERLY BACKUPS
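# Backups go to four directories based on the quarter of the year,
# converted into 1-4 using integer division.
# %-m (GNU date) drops the leading zero so that bash arithmetic
# does not try to read 08 or 09 as an octal number.
QTRNR=`date +%-m`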
echo $QTRNR
let "QTRNR = (QTRNR+2) / 3"
echo $QTRNR
DIR=/backup/pgsql/quarterly/$QTRNR/
echo $DIR

source ~/bin/include_backup_compressed.sh


backup_yearly.sh
#!/bin/bash
# ANNUAL BACKUPS
YEARNR=`date +%Y`
echo $YEARNR
DIR=/backup/pgsql/yearly/$YEARNR/
echo $DIR

source ~/bin/include_backup_compressed.sh
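
The bucket arithmetic in the weekly, monthly and quarterly scripts can be checked without waiting for the calendar. The sketch below uses made-up function names; it also strips a leading zero from the input, because "date +%d" and "date +%m" print values like 08 and 09, which shell arithmetic would otherwise try to read as (invalid) octal numbers.

```shell
#!/bin/bash
# Hypothetical helpers mirroring the "let" lines in the backup scripts.
# ${1#0} removes one leading zero, so "08" becomes "8".

# Day of month (1-31) -> week bucket (1-5)
week_bucket()    { echo $(( (${1#0} + 6) / 7 )); }

# Month (1-12) -> rotating monthly bucket (1-3)
month_bucket()   { echo $(( ((${1#0} - 1) % 3) + 1 )); }

# Month (1-12) -> quarter (1-4)
quarter_bucket() { echo $(( (${1#0} + 2) / 3 )); }

# Usage:
# week_bucket "$(date +%d)"
# quarter_bucket "$(date +%m)"
```

Days 1-7 land in bucket 1, days 8-14 in bucket 2, and so on, with days 29-31 going into the fifth bucket.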


include_backup_compressed.sh
#!/bin/bash
# Compressed backups to $DIR
echo $DIR
DBS=$(psql -l | grep '|' | awk '{ print $1}' | grep -vE '^-|^Name|template[01]')
for d in $DBS
do
echo $d
DBDIR=$DIR/$d
if ! test -d $DBDIR
then
mkdir -p $DBDIR
fi
SCHEMAS=$(psql -d $d -c '\dn' | grep '|' | awk '{ print $1}' \
| grep -vE '^-|^Name|^pg_|^information_schema')
for s in $SCHEMAS
do
echo $d.$s
TABLES=$(psql -d $d -c "SELECT schemaname, tablename FROM pg_catalog.pg_tables WHERE schemaname = '$s';" \
| grep '|' | awk '{ print $3}' | grep -vE '^-|^tablename')
for t in $TABLES
do
echo $d.$s.$t
if [ $s = 'public' ]
then
pg_dump -a -b -O -t $t -x $d | bzip2 -c2 > $DIR/$d/$s.$t.sql.bz2
else
pg_dump -a -b -O -t $s.$t -x $d | bzip2 -c2 > $DIR/$d/$s.$t.sql.bz2
fi
done
done
done
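
To see what the grep / awk / grep pipeline in the script is doing, here it is pulled out into a function and fed some sample text. The function name first_column is made up, the sample "psql -l" output is invented for illustration (your column layout may differ), and the template filter is written as the character class [01].

```shell
#!/bin/bash
# Hypothetical helper: pull the first column out of psql's table-style
# output, dropping the header row and the template databases.
first_column() {
    grep '|' | awk '{ print $1 }' | grep -vE '^-|^Name|template[01]'
}

# Made-up sample of "psql -l" output, for illustration only.
first_column <<'EOF'
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 postgres  | postgres | UTF8
 template0 | postgres | UTF8
 template1 | postgres | UTF8
 exampledb | postgres | UTF8
(4 rows)
EOF
```

The first grep keeps only the rows that contain a column separator, awk takes the first field, and the last grep throws away the header and the template databases.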


We tried using gzip instead of bzip2, but found that bzip2 worked a little better even though it uses up more CPU. We use a block size of only 200k for bzip2 in order to be more friendly to an rsync push to an external server.


Wednesday, January 07, 2009
Very basic rsync / cp backup rotation with hardlinks
Here's a very basic script that I use with rsync that makes use of hard links to reduce the overall size of the backup folder. Some notes and limitations:

- Every morning, a server copies the current version of all files across SSH (using scp) into a "current" folder. There are two folders on the source server that get backed up daily (/home and /local).

- Later on that day, we run the following script to rsync any new files into a daily folder (daily.0 through daily.6).

- In order to bootstrap those daily.# folders, you have to use "cp -al current/* daily.2/" on each, which fills out the seven daily backup folders with hardlinks. Change the number in "daily.2" to 0-6 and run the command once for each of the seven days. Do this after the "current" folder has been populated with data pushed by the source server.

- Ideally, the source server should be pushing changes to the "current" folder using rsync. But in our case, the current server is an old Solaris 9 server without rsync, which means that our backups are likely to be about 2x to 3x larger than they should be.

- RDiff-Backup may have been a better solution for this particular problem (and we may switch).

- This shows a good example of how to calculate the current day of week number (0-6) as well as calculating what the previous day number was (using modulus arithmetic).

- I make no guarantees that permissions or ownership will be preserved. But since the source server strips all of that information in the process of sending the files over the wire with scp, it's a moot point for our current situation. (rdiff-backup is probably a better choice for that.)
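
The previous-day calculation mentioned in the list above can be tried on its own. A minimal sketch, with a made-up function name:

```shell
#!/bin/bash
# Hypothetical helper: given a day-of-week number (0-6, as printed by
# "date +%w"), print the number of the day before it.
# Adding 6 instead of subtracting 1 keeps the value from going negative,
# so Sunday (0) wraps around to Saturday (6).
prev_day() { echo $(( ($1 + 6) % 7 )); }

# Usage:
# prev_day "$(date +%w)"
```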

#!/bin/bash
# DAILY BACKUPS (writes to a daily folder each day)
DAYNR=`date +%w`
echo DAYNR=${DAYNR}
let "PREVDAYNR = ((DAYNR + 6) % 7)"
echo PREVDAYNR=${PREVDAYNR}
DIRS="home local"

for DIR in ${DIRS}
do
echo "----- ----- ----- -----"
echo "Backup:" ${DIR}
SRCDIR=/backup/cfmc1/$DIR/current/
DESTDIR=/backup/cfmc1/$DIR/daily.${DAYNR}/
PREVDIR=/backup/cfmc1/$DIR/daily.${PREVDAYNR}/
echo SRCDIR=${SRCDIR}
echo DESTDIR=${DESTDIR}
echo PREVDIR=${PREVDIR}

cp -al ${PREVDIR}* ${DESTDIR}
rsync -a --delete-after ${SRCDIR} ${DESTDIR}

echo "Done."
done


It's not pretty, but it will work better once the source server starts pushing the daily changes via rsync instead of completely overwriting the "current" directory every day.

The code should be pretty self explanatory but I'll explain the two key lines.

cp -al ${PREVDIR}* ${DESTDIR}

This overwrites all files in ${DESTDIR}, which is today, with the files from yesterday, but does it by creating hard links of all files. Old files which were deleted since last week will be left behind until the rsync step.

rsync -a --delete-after ${SRCDIR} ${DESTDIR}

This then brings today's folder up to date with any changes as compared to the source directory (a.k.a. "current"). It also deletes any files in today's folder that don't exist in the source directory.
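
The hard-link behaviour that the "cp -al" step relies on can be seen in a scratch directory. This is only a demonstration: the paths are made up, it assumes GNU cp, and the "write a new file, then rename it over the old one" step stands in for what rsync does by default when a file has changed.

```shell
#!/bin/bash
# Scratch-directory demonstration; all paths here are made up.
WORK=$(mktemp -d)
mkdir -p "$WORK/daily.0" "$WORK/daily.1"
echo "report" > "$WORK/daily.0/report.txt"

# Same form as the script: link yesterday's files into today's folder.
cp -al "$WORK/daily.0/"* "$WORK/daily.1/"

# -ef is true when both names refer to the same inode (a hard link).
if [ "$WORK/daily.0/report.txt" -ef "$WORK/daily.1/report.txt" ]; then
    echo "same inode: no extra space used"
fi

# Replacing a file with a new copy breaks the link for that file only.
echo "new report" > "$WORK/daily.1/report.txt.tmp"
mv "$WORK/daily.1/report.txt.tmp" "$WORK/daily.1/report.txt"

# Yesterday's copy still holds the old content.
cat "$WORK/daily.0/report.txt"
```

Unchanged files stay shared between the daily folders, and only files that actually changed take up new space.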

References:

Easy Automated Snapshot-Style Backups with Linux and Rsync

Local incremental snap shots with rsync


Tuesday, January 06, 2009
Setting up FreeNX/NX on CentOS 5
Quick guide to setting up FreeNX/NX. These are the approximate minimum steps on a fresh CentOS 5.1 box. We're limiting things to using public-key authentication from the outside and we already have a second ssh daemon running (listening on localhost, allowing password authentication).

Note: If you have ATRPMs configured as a repository, make sure that you exclude nx* and freenx*. (Add/edit the exclude= line in the ATRPMs .repo file.)

# yum install nx freenx
# cp /etc/nxserver/node.conf.sample /etc/nxserver/node.conf
# vi /etc/nxserver/node.conf

Change the following lines in the node.conf file:

ENABLE_SSH_AUTHENTICATION="1"
-- remove the '#' at the start of the line

ENABLE_SU_AUTHENTICATION="1"
-- remove the '#' at the start of the line
-- change the zero to a one

ENABLE_FORCE_ENCRYPTION="1"
-- remove the '#' at the start of the line
-- change the zero to a one
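
If you would rather script those edits than make them by hand in vi, something like the following might do. It is a sketch under assumptions: the helper name enable_option is made up, and it assumes the sample file has lines of the form #NAME="0" or #NAME="1", as the instructions above imply.

```shell
#!/bin/bash
# Hypothetical helper: uncomment NAME="..." in a config file and set it to "1".
# $1 = config file, $2 = option name
enable_option() {
    local file="$1" name="$2" tmp
    tmp=$(mktemp) || return 1
    # Match the option with or without a leading '#', then rewrite the line.
    sed -E "s/^#?[[:space:]]*${name}=.*/${name}=\"1\"/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"     # write back in place, keeping owner and mode
    rm -f "$tmp"
}

# Usage (run as root, after copying node.conf.sample to node.conf):
# for opt in ENABLE_SSH_AUTHENTICATION ENABLE_SU_AUTHENTICATION ENABLE_FORCE_ENCRYPTION
# do
#     enable_option /etc/nxserver/node.conf "$opt"
# done
```

Check the file afterwards; if your node.conf uses a different layout, the pattern will need adjusting.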

Change the server's public/private key pair:

# mv /etc/nxserver/client.id_dsa.key /etc/nxserver/client.id_dsa.key.OLD
# mv /etc/nxserver/server.id_dsa.pub.key /etc/nxserver/server.id_dsa.pub.key.OLD
# ssh-keygen -t dsa -N '' -f /etc/nxserver/client.id_dsa.key
# mv /etc/nxserver/client.id_dsa.key.pub /etc/nxserver/server.id_dsa.pub.key
# cat /etc/nxserver/client.id_dsa.key

You'll need to give the DSA Private Key information to people who should be allowed to use FreeNX/NX to access the server.

You'll also need to put the new public key into the authorized_keys2 file:

# cat /etc/nxserver/server.id_dsa.pub.key >> /var/lib/nxserver/home/.ssh/authorized_keys2

# vi /var/lib/nxserver/home/.ssh/authorized_keys2

Comment out the old key, then put the following at the start of the good key line.

no-port-forwarding,no-X11-forwarding,no-agent-forwarding,command="/usr/bin/nxserver"

Restart the FreeNX/NX service:

# service freenx-server restart

You should now be able to connect (assuming that you specify the proper SSH port and paste the private key into the configuration).


Setup sshd to run a second instance
In order to lock down the servers like I prefer to, yet still allow FreeNX/NX to work, I have to set up a second copy of the sshd daemon. The FreeNX/NX client requires that you have sshd running with password access (not just public key), but we prefer to only allow public-key access to our servers.

I did the following on CentOS 5. It should also work for Fedora or Red Hat Enterprise Linux (RHEL), but proceed at your own risk.

1) Create a hard link to the sshd program. This allows us to distinguish it in the process list. It also makes sure that our cloned copy stays up to date as the sshd program is patched.

# ln /usr/sbin/sshd /usr/sbin/sshd_nx

2) Copy /etc/init.d/sshd to a new name

This is the startup / shutdown script for the base sshd daemon. Make a copy of this script:

# cp -p /etc/init.d/sshd /etc/init.d/sshd_nx
# vi /etc/init.d/sshd_nx

Change the following lines:

# processname: sshd_nx
# config: /etc/ssh/sshd_nx_config
# pidfile: /var/run/sshd_nx.pid
prog="sshd_nx"
SSHD=/usr/sbin/sshd_nx
PID_FILE=/var/run/sshd_nx.pid
OPTIONS="-f /etc/ssh/sshd_nx_config -o PidFile=${PID_FILE} ${OPTIONS}"
[ "$RETVAL" = 0 ] && touch /var/lock/subsys/sshd_nx
[ "$RETVAL" = 0 ] && rm -f /var/lock/subsys/sshd_nx
if [ -f /var/lock/subsys/sshd_nx ] ; then

Note: The OPTIONS= line is probably new and will have to be added right after the PID_FILE= line in the file. There are also multiple lines that reference /var/lock/subsys/sshd, you will need to change all of them.

3) Copy the old sshd configuration file.

# cp -p /etc/ssh/sshd_config /etc/ssh/sshd_nx_config

4) Edit the new sshd configuration file and make sure that it uses a different port number.

Port 28822

5) Clone the PAM configuration file.

# cp -p /etc/pam.d/sshd /etc/pam.d/sshd_nx

6) Set the new service to start up automatically.

# chkconfig --add sshd_nx

...

Test it out

# service sshd_nx start
# ssh -p 28822 username@localhost

Check for errors in the log file:

# tail -n 25 /var/log/secure
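
If the test connection is refused, it is worth confirming that the port you are connecting to is the one in the new configuration file. A small sketch, with a made-up function name, that reads the port back out of the file:

```shell
#!/bin/bash
# Hypothetical helper: print the first active "Port" setting in an
# sshd-style configuration file. Commented-out lines ("#Port 22") are
# skipped because their first field is "#Port", not "Port".
config_port() {
    awk 'tolower($1) == "port" { print $2; exit }' "$1"
}

# Usage:
# ssh -p "$(config_port /etc/ssh/sshd_nx_config)" username@localhost
```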

...

At this point, I would go back and change the secondary configuration to only listen on the localhost addresses:

ListenAddress 127.0.0.1
ListenAddress ::1

...

References:

How to Add a New "sshd_adm" Service on Red Hat Advanced Server 2.1

How to install NX server and client under Ubuntu/Kubuntu Linux (revised)


Friday, January 02, 2009
Samba3: Upgrading to v3.2 on CentOS 5
CentOS 5 currently only has Samba 3.0.28 in their BASE repository. The DAG/RPMForge projects don't have updated Samba3 RPMs either (although I do see an OpenPkg RPM). So the question that I've been dealing with for the past few weeks is "where do I get newer Samba RPMs"?

Ideally, I would get these RPMs from a repository, so that I could be notified via "yum check-update" for when there are security / feature updates. While I don't mind the occasional source package in .tar.gz or .tar.bz2 format, they rapidly become a maintenance nightmare. Especially for security-sensitive packages like Samba which tend to be attack targets.

What I've found that looks promising is:

http://ftp.sernet.de/pub/samba/recent/centos/5/

Which has a .repo file and looks like it might be usable as a repository for yum. (See "Get the latest Samba from Sernet" for confirmation of this.)

# cd /etc/yum.repos.d/
# wget http://ftp.sernet.de/pub/samba/recent/centos/5/sernet-samba.repo

Now, the major change is that the RedHat/CentOS packages are named "samba.x86_64" while the sernet.de packages are named "samba3.x86_64". Also, the sernet.de folks don't sign their packages, so you will need to add "gpgcheck=0" to the end of the .repo file.

(At least, I don't think they do...)

Note: As always, before doing a major upgrade like this, make backups. At a minimum, make sure you have good backups of your Samba configuration files. We use FSVS with a SVN backend for all of our configuration files, which makes an excellent change tracking tool for Linux servers.

# yum remove samba.x86_64
# yum install samba3.x86_64
# service smb start

With luck, you should now be up and running with v3.2 of Samba. You can verify this by looking at the latest log file in the /var/log/samba/ directory.


Thursday, January 01, 2009
LVM Maximum number of Physical Extents
Working on a 15-disk system (750GB SATA drives) so this issue came up again:

What is the maximum number of physical extents in LVM?

The answer is that there's no limit on the number of PEs within a Physical Volume (PV) or Volume Group (VG). The limitation is, instead, the maximum number of PEs that can be formed into a Logical Volume (LV), which is about 65k. So a 32MB PE size allows for LVs up to 2TB in size.

Note: Older LVM implementations may only allow a maximum of 65k PEs across the entire PV. The VG can be composed of multiple PVs however.

So if you want to be safe, make your PE size large enough that you end up with fewer than 65k PEs. Just remember that all PVs within a VG need to use the same PE size. So if you're planning for lots of expansion down the road, with very large PVs to be added later, you may wish to bump up your PE size by a factor of 4 or 8.

A good example of this is a Linux box using Software RAID across multiple 250GB disks. The plan down the road is to replace those disks with larger models, create new Software RAID arrays across the extended areas on the larger disks, then extend VGs across new PVs. At the start, you might only have a net space (let's say RAID6, 2 hot-spares, 15 total disks) of around 2.4TB. That's small enough that a safe PE size might be 64MB (about 39,000 PEs).

Except that down the road, disks are getting much larger (1.5TB drives are now easily obtainable). So if we had stuck with a 64MB PE size, our individual LVs could be no larger than 4TB. If we were to put in 2TB disks (net space of about 20TB), the number of PEs would end up growing by about 8x (312,000). We might even see 4TB drives in a 3.5" size, which would be closer to 40TB of net space.

A PE size of 256MB might have served us better when we set up that original PV area. It would allow individual LVs sized up to 16TB. The only downside is that you won't want to create LVs smaller than 256MB and you'll want to make sure all LVs are multiples of 256MB.

Bottom line, when setting up your PE sizes, plan for a 4x or 8x growth.
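
The arithmetic behind those numbers is easy to script if you want to try other PE sizes. This is a back-of-the-envelope sketch only: the function names are made up, and it takes the "65k" limit to be 65,536 to keep the numbers round.

```shell
#!/bin/bash
# Assumption: the "65k" PE limit is treated as 2^16 for round numbers.
PE_LIMIT=65536

# Largest LV (in TB) for a given PE size (in MB).
max_lv_tb() { echo $(( PE_LIMIT * $1 / 1024 / 1024 )); }

# Number of PEs needed to cover a given size (in TB) at a PE size (in MB).
pe_count() { echo $(( $1 * 1024 * 1024 / $2 )); }

max_lv_tb 32     # 32MB PEs
max_lv_tb 64     # 64MB PEs
max_lv_tb 256    # 256MB PEs
pe_count 20 256  # 20TB of net space with 256MB PEs
```

The first three calls give 2, 4 and 16, which match the 2TB, 4TB and 16TB limits discussed above.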

References:

LVM Manpage - Talks about the limit in IA32.
Maximum Size Of A Logical Volume In LVM - Walker News (blog entry)
Manage Linux Storage with LVM and Smartctl - Enterprise Networking Planet
