Sunday, December 05, 2004
INACCESSIBLE_BOOT_DEVICE
Came home from my weekend trip to find that one of my Windows 2000 servers had crashed with the INACCESSIBLE_BOOT_DEVICE 0x0000007B error message. Apparently, the power went out sometime this weekend because the 2nd server had turned itself off, and when the primary server booted back up it was BSOD'd with the 7B error.

The boot disks are attached to a Promise FastTrak100 TX2 RAID card (which showed both disks as working), so I wasn't too worried. I also have fairly fresh backups of everything on the RAID array.

How to troubleshoot "Stop 0x0000007B" error messages in Windows 2000 wasn't all that useful.

Troubleshooting Stop 0x0000007B or "0x4,0,0,0" Error was a bit more useful.

The first thing I tried was to remove any CDs/DVDs from the drives and reboot (no luck). Then I tried rebooting into safe mode with the command prompt (no luck). Then I tried rebooting to "last known good configuration" (again, no luck).

Next, I booted up the Win2000 install CDs, loaded the RAID driver from floppy and went into (R)epair, (C)onsole mode (puts you at a command prompt in C:\WINNT). A directory listing looked clean, so I did a "CHKDSK" from the C:\WINNT folder. That found some errors, so I redid the command with the /R option ("CHKDSK /R").

Once it had finished checking the boot drive (C:), I rebooted the box and it came up fine. It ran CHKDSK on all of the other drives during boot (finding some minor errors in the 2nd partition that is part of the Promise RAID).

My guess at this point is that when the UPS ran out of juice and the box crashed, it caused some issues with the NTFS file system. One of these days I'll put each of my personal servers on its own UPS and hook up the "shutdown on low battery" cable.

Monday, November 08, 2004
Win2003 Scheduled Tasks 0x80070005 Error
Trying to set up a backup job (kicked off by a .cmd file) on Windows 2003. I have created a special, limited-rights user account for it (rather than running the backup job under the administrator account). Everything seems fine until I go to test my scheduled task.

Looking at the log (Scheduled Tasks - Advanced - View Log) will show:

Unable to start task.
The specific error is:
0x80070005: Access is denied.


"Access is denied" error message when you run a batch job on a Windows Server 2003-based computer

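As expected, this is a permissions error. On Windows Server 2003, cmd.exe is locked down by default, so the account that the batch job runs under needs Read and Execute permission on "cmd.exe" before the scheduled task can start a .cmd file.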
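As an aside, the error number itself already says "Win32 access denied" once it is split into its bit fields. A minimal sketch of that decoding follows; the function name is made up for illustration and is not part of any Windows tool.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility, and code fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility; 7 is FACILITY_WIN32
        "code": value & 0xFFFF,             # low 16 bits; 5 is ERROR_ACCESS_DENIED
    }

info = decode_hresult(0x80070005)
print(info)
```

The same split works for other 0x8007xxxx values that show up in the Scheduled Tasks log: the low 16 bits are an ordinary Win32 error code.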

Tuesday, October 26, 2004
Installing ssh and rsync on a Windows machine: minimalist approach
Found this on Slashdot recently... haven't had time to go read it.

Sunday, September 19, 2004
Tyan Tiger K8W (S2875)
2CPU.com Tyan Tiger K8W (S2875) - Thread with users of the Tyan Tiger K8W.

I have one of these boards with a pair of Opteron 246s. Great for video work, but I never managed to get Aquamark3 running on it to determine performance.

Tuesday, August 17, 2004
FLAC audio
Re-ripping my CDs to a lossless format for permanent archival (I had ripped them all at 128kbps a few years ago, then re-ripped at 160kbps, and now I'm going lossless). Here are a few links that I found useful.

Comparison of lossless audio codecs - Compares most of the popular lossless codecs such as FLAC

FLAC home page - Free Lossless Audio Codec home page over at SourceForge. Be sure to download the FLAC codec installer for Windows if you want to use FLAC in players like WinAmp.

Right now, I'm probably going to rip at compression level = 5 using Easy CD-DA Extractor's Audio CD Ripper. My M.I.B. CD weighed in at 436MB with FLAC(5) compared to about 485MB with compression level zero. That means I'll be able to archive about (8) albums on a DVD-R (compared to 20-30 using 160kbps MP3). If I had ripped it to 320kbps CBR MP3, it would've ended up as 151MB.
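For anyone who wants to check the arithmetic behind those sizes, here is a rough sketch. It assumes "MB" means 10^6 bytes and that the MP3 figure is exactly 320kbps CBR; neither assumption is stated above, so treat the derived numbers as ballpark only.

```python
FLAC_LEVEL_5_MB = 436
FLAC_LEVEL_0_MB = 485
MP3_320_MB = 151

# How much smaller level 5 is than level 0.
saving = 1 - FLAC_LEVEL_5_MB / FLAC_LEVEL_0_MB

# Assumption: MB = 10**6 bytes and a constant 320,000 bits per second.
playing_seconds = MP3_320_MB * 1_000_000 * 8 / 320_000

# CD audio: 44,100 samples/s * 2 channels * 2 bytes = 176,400 bytes per second.
wav_mb = playing_seconds * 176_400 / 1_000_000

print(f"level 5 vs level 0: {saving:.1%} smaller")
print(f"implied playing time: {playing_seconds / 60:.0f} minutes")
print(f"FLAC(5) vs uncompressed: {FLAC_LEVEL_5_MB / wav_mb:.0%}")
```

The gap between level 0 and level 5 is small compared with the gap between lossless and MP3, which is why the compression level matters less than the choice of format for how many albums fit on a disc.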

AliveAudio - Getting to know FLAC - Lists information about players and ripping utilities.

Monday, July 26, 2004
OpenSSH for Windows
I've pretty much given up on trying to extract the key bits from Cygwin in order to set up an SSHD server. The OpenSSH for Windows project at SourceForge seems to have what I'm looking for; it just doesn't include the rsync application.

For an excellent introduction to SSH, check out OpenSSH for the impatient.

For setting up OpenSSH on a server, go ahead and grab the packages from the OpenSSH for Windows SourceForge project. The version that I'm using at the moment is "setupssh381-20040709". Inside that file you'll find a "setupssh.exe" which will install the packages as well as creating the Windows Service. I like to install my copy to "c:\bin\openssh".

Now open up the "c:\bin\openssh\docs\readme.txt" (or quickstart.txt) and follow the directions in order to create the "group" and "passwd" files. Then start up the OpenSSHD service (either from the command line as shown in quickstart.txt or using the Services control panel).

You should now be set up so that you can SSH in to the server from another workstation and get a command prompt on the server. The default install is pretty good on security, so you should not need to change anything in the sshd_config file. However, some things you may wish to change are:

1) The default server key-length is 1024 bits (which is okay, but not outstanding anymore). The man page says key lengths over 1024 bits don't matter, but another book says you should use 2048 bit keys.

2) Some key variables in the sshd_config file are:

a) PermitRootLogin - should be set to "no" which prevents you from logging in as root from another machine.

b) RSAAuthentication - setting this to no will disable the ability to login with a SSH1 client (I think...). The default sshd_config file has this explicitly set to "no".

c) PasswordAuthentication - you may want to change this to "no" and force users to setup a public/private key pair in order to login to the server.
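A small sketch of how one might check those settings without reading the whole file by eye. The sample text below is a hypothetical excerpt, not the sshd_config that ships with OpenSSH for Windows, and the "wanted" values are only the suggestions from the list above. It relies on two documented behaviours of sshd_config: keywords are case-insensitive, and the first value found for a keyword is the one used.

```python
def parse_sshd_config(text: str) -> dict:
    """Return {lowercased keyword: value}; the first occurrence of a keyword wins."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out defaults
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings.setdefault(parts[0].lower(), parts[1].strip())
    return settings

# Hypothetical excerpt for illustration only.
SAMPLE = """\
PermitRootLogin no
RSAAuthentication no
#PasswordAuthentication yes
ServerKeyBits 1024
"""

wanted = {"permitrootlogin": "no", "passwordauthentication": "no"}
found = parse_sshd_config(SAMPLE)
for key, value in wanted.items():
    state = "ok" if found.get(key) == value else f"is {found.get(key)!r}, wanted {value!r}"
    print(key, state)
```

A keyword that is only present as a comment shows up as missing, which is a useful reminder that the compiled-in default is in effect rather than the commented value.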

(note: this post was never completed... so use with a grain of salt)

Wednesday, July 21, 2004
rsyncd.conf file for Cygwin environments
You should definitely refer to the official rsync website for the real documentation on configuring the rsyncd.conf file.

Locate your /etc folder under where you installed Cygwin. Since I installed Cygwin to C:\bin\cygwin, my /etc folder is C:\bin\cygwin\etc. For a fresh install, you'll need to create the "rsyncd.conf" file in that folder (C:\bin\cygwin\etc\rsyncd.conf).

(minimal rsyncd.conf file)
use chroot = false

strict modes = false
log file = rsyncd.log

[test]
path = /cygdrive/d/rsync/test
read only = false
transfer logging = yes

Minimal Cygwin install for RSync and SSH
Source links:

How to setup the secure shell daemon on a Windows 2000 machine?
Windows Rsync Server Setup
CygwinInstallationGuide (a wiki topic about the cygwin installation)

Note: The following probably doesn't work (probably missing a package, or the fact that I have GNU's unix tools for Win32 installed is problematic), but I might come back and make it work later so I'm leaving it here for now. I ran into trouble when trying to configure SSH. Right now, I've gone back to my original plan of either hacking apart the Cygwin files and manually copying only the DLLs and EXEs that I need or using the OpenSSH for Windows project at SourceForge.

1. Run the Cygwin setup.exe file and start the installation. I chose to install to "c:\bin\cygwin", but left the rest of the options "as-is". Pick your mirror (use the Cygwin public mirrors page to find one close to you).

2. On the "Select Packages" screen, select the "Curr" option and make sure it says "Category" next to the "View" button at the top. The installation dialog is (finally) re-sizeable, so stretch it out or maximize it so you can see all of the columns.

3. Beside the "+All" category, it will say "Install", "Uninstall", ... click on the word until all of the categories say "Uninstall". (Note: These steps assume that you're doing a new Cygwin install and that you don't already have Cygwin installed.) Now we can start picking the minimum number of packages required to setup SSH and RSync.

4a. Under the "+Admin" category, you'll need to install the "cygrunsrv" package (click once on the "Skip" indicator under the "New" column). This will turn on a few other packages that this package depends on (mostly under the "+Base", "+Libs", and "+Shells" categories).

4b. Open up the "+Net" category and select the "rsync" and "openssh" packages. You'll also end up with "openssl" which is required in order to use "openssh".

5. Click the "Next" button to start downloading and installing the packages. If the download fails, choose another mirror, double-check your package selections (my copy remembered which packages I had already selected), and try again. The base install size required around 7MB of downloads and expanded out to 24MB (34MB actual due to a 4KB cluster size).

6. Fire up the Cygwin shell. You should see a command-line window open with a "$" prompt. Try out a few unix commands (pwd, ls, whoami) to see if things are working.

7. Further steps... (I'll cover these in future posts)

a) Set up your rsyncd.conf file (in the "etc" folder)
b) create a service account for use by the rsync service
c) create a Windows service using the "cygrunsrv" tool
d) set up OpenSSH and then re-configure rsync to use it
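On the install size in step 5: the gap between the expanded size and the "actual" size on disk comes from every file being rounded up to a whole number of clusters. A minimal sketch of that rounding, using made-up file sizes rather than the real Cygwin file list (and ignoring the fact that NTFS can store very small files without allocating a cluster at all):

```python
def size_on_disk(file_sizes, cluster=4096):
    """Total bytes allocated when every file occupies whole clusters."""
    return sum(((size + cluster - 1) // cluster) * cluster for size in file_sizes)

# Hypothetical file sizes in bytes, chosen only to illustrate the effect.
sizes = [120, 900, 4096, 5000, 70_000]
print(sum(sizes), "bytes of data")
print(size_on_disk(sizes), "bytes allocated")
```

An install made up of many small files loses proportionally more space to this rounding than one made up of a few large files.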

Hacking together a minimal rsync for windows installation
Based on what I've read elsewhere (links in my previous posting), I think I can pull the relevant pieces out of the Cygwin package. I'll try to keep good notes as to what worked and what didn't, but let me know if you find any errors. Rsync wrapper for Win32 seems to be a good starting point for which DLLs and files I'll need to pull out of the standard Cygwin release.

You can download the files off of any of the Cygwin public mirrors. Grab the following archives and extract them to a temporary directory on your machine.

release/cygwin/cygwin-1.5.10-3.tar.bz2
- contains the DLL file (usr/bin/cygwin1.dll) and a lot of base utilities

release/popt/libpopt0/libpopt0-1.6.4-4.tar.bz2
- contains the usr/bin/cygpopt-0.dll file

release/rsync/rsync-2.6.2-1.tar.bz2
- RSync (rsync executable)

Create a folder where you're going to store the rsync files (I use C:\bin\rsync).

Copy the following files to your rsync folder:
cygwin1.dll

cygpopt-0.dll
rsync.exe


Create your rsyncd.conf file and put it in your rsync folder.

Test out whether you've gotten rsync working (thanks to "Aaron Johnson's page about rsync" for showing me what command line options to use). To do this, type the following commands:
c:

cd \bin\rsync
rsync --config="c:\bin\rsync\rsyncd.conf" --daemon

If you have a log file, there should now be an entry indicating that rsync has started up and is listening on the default port (tcp/873). Looking at the processes in Windows Task Manager, you should see the "rsync.exe" process. You should also now test out some rsync transfers from another workstation to verify that your security settings and module settings are correct.
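Before trying a full transfer, a quick way to confirm that the daemon is really listening is to connect to the port and read the greeting line, which an rsync daemon sends as "@RSYNCD: " followed by a protocol version. This is only a sketch: the function name is made up, and the host and timeout are assumptions to adjust for your own setup.

```python
import socket

def rsync_greeting(host: str, port: int = 873, timeout: float = 5.0) -> str:
    """Connect to an rsync daemon and return the first line it sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        data = b""
        # Read until the end of the greeting line (or a small safety limit).
        while not data.endswith(b"\n") and len(data) < 256:
            chunk = conn.recv(64)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", errors="replace").strip()

if __name__ == "__main__":
    try:
        print(rsync_greeting("127.0.0.1"))
    except OSError as exc:
        print("no rsync daemon reachable:", exc)
```

Running it from another workstation (with the server's address instead of 127.0.0.1) also checks that no firewall is blocking tcp/873 along the way.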

To do:
- create the user account to use for the rsync service
- setup rsync to run as a service (need the SRVANY.EXE file, I think)
- figure out how to get rsync talking through an SSHD server

RSync and Windows
This is a follow-up to my previous post about Securing cwRSync. We were using the "cwRSync package", but when running in server mode it doesn't know how to talk to clients over an SSH-encrypted connection. Which isn't a big deal if you're only talking to other servers on the local network, but is problematic in cases where you have to be wary of eavesdropping (across WiFi links or untrusted networks like the internet). So I've been looking off-and-on over the past month at figuring out how to get an rsync service running using SSH on a Windows server.

One option is to install the full Cygwin package. Which is a bit much for a server (or rather, I'm not comfortable installing Cygwin on a server... yet).

Another option seems to be the OpenSSH for Windows project at SourceForge. That doesn't include rsync though, just scp. So I might look at "Installing ssh and rsync on a Windows machine: minimalist approach" which requires an absolute bare minimum of files to be installed. However, the files at that location are from Jan 2002, which is a bit old and the latest version as of July 2004 for the Cygwin DLL is cygwin-1.5.10-2.

Tuesday, July 20, 2004
Linux on Laptops
Today's useful link is: Linux on Laptops.

They have a short-n-sweet index where people submit links to their HOWTO pages regarding how to get a specific distro to work on a particular make/model of laptop.

Friday, July 09, 2004
Gentoo: Setting up PostgreSQL
Getting PostgreSQL installed really isn't that difficult on Gentoo Linux.
# emerge -s postgresql
# emerge postgresql
(install takes a while, didn't time it)
# ebuild /var/db/pkg/dev-db/postgresql-7.4.2-r1/postgresql-7.4.2-r1.ebuild config
(a few messages later)
* The current value of SHMMAX is too low for postgresql to run.
* Please edit /etc/sysctl.conf and set this value to at least 134217728.
*
* kernel.shmmax = 134217728
*

Fire up nano to edit sysctl.conf. The short version of why is that the 2.6 Linux kernel's default shared memory limit is too small for PostgreSQL.
# nano -w /etc/sysctl.conf

(add the following lines)
#Kernel parameters for PostgreSQL
#default is 32MB, PostGreSQL needs 128MB
kernel.shmmax = 134217728
kernel.shmall = 134217728
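For reference, the magic number is simply 128 MB written out in bytes. A short sketch of the arithmetic follows; the 4096-byte page size is an assumption (typical for x86, and `getconf PAGESIZE` will confirm it). Note that kernel.shmall is counted in pages rather than bytes, so using the same number there is a much looser limit than 128 MB, which is harmless here but worth knowing.

```python
MIB = 1024 * 1024
shmmax = 128 * MIB  # bytes, the value the ebuild asks for
print(shmmax)

# Assumption: 4096-byte pages, typical for x86.
page_size = 4096
print(shmmax // page_size, "pages cover 128 MB")
```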

Now manually update the current values and start the server.
# echo 134217728 >/proc/sys/kernel/shmall
# echo 134217728 >/proc/sys/kernel/shmmax
# rc-update add postgresql default
# /etc/init.d/postgresql start

Now I'm off to explore the PostgreSQL documentation.

The default Gentoo install seems to already include a "postgres" user in /etc/passwd. To get logged in as the postgres user account, you will (I think) first need to switch to root.
# su
# cd /usr/local
# su - postgres

Now you can continue with chapter 16 of the PostgreSQL documentation. Skip the page about creating the database cluster; it was already created in "/var/lib/postgresql/data" back when you ran the "ebuild config" command. You can verify this by looking at the config file ("cat /etc/conf.d/postgresql"), where the PGDATA= line indicates the location of the database.

In fact, skip straight to chapter 16.4 - Run-time Configuration, because the server is already running. To verify that the server is running, "cat /var/lib/postgresql/data/postmaster.pid". Make a note of the PID on the first line (second line is the database location), then "cat /proc/nnnn/status" (replacing "nnnn" with the PID).
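The same check can be scripted. The sketch below reads the PID from the first line of postmaster.pid and looks for the matching /proc entry. It assumes the Gentoo default data directory mentioned above and the two-line layout described (PID, then data directory); the function name is made up, and you will likely need to run it as root or as the postgres user to read the file.

```python
from pathlib import Path

def postmaster_status(pid_file="/var/lib/postgresql/data/postmaster.pid"):
    """Return (pid, data_dir, running) based on postmaster.pid and /proc."""
    lines = Path(pid_file).read_text().splitlines()
    pid = int(lines[0])                                    # first line: PID
    data_dir = lines[1].strip() if len(lines) > 1 else None  # second line: data directory
    running = Path(f"/proc/{pid}/status").exists()         # process entry present?
    return pid, data_dir, running

if __name__ == "__main__":
    try:
        print(postmaster_status())
    except (OSError, ValueError, IndexError) as exc:
        print("could not read postmaster.pid:", exc)
```

A stale postmaster.pid left over from a crash would show up here as a PID with no /proc entry.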

Thursday, July 08, 2004
Misc Mozilla Bits
Just a few misc Mozilla 1.7 settings that I've found useful. All of these need to be added/changed in your prefs.js file in your profile directory. Make sure that you've exited out of all Mozilla windows, including the QuickLaunch icon in the system tray before making your edits. Otherwise, when Mozilla exits again later, it will overwrite your changes.

It's also a good idea to make a backup file of your prefs.js file prior to making changes.

1) Changing the trash folder in Mozilla Mail (or Thunderbird) to match what is used on your IMAP server. The standard trash folder is called "Trash", but my IMAP service uses "Deleted Items" instead. To make things simple, I changed Mozilla to use "Deleted Items" as well for that particular account. Replace "serverx" with the appropriate server number (e.g. "server7") that matches your IMAP account.

user_pref("mail.server.serverx.trash_folder_name", "Deleted Items");

Changing the default folder for saved copies of sent items from "Sent" to "Sent Items" is a bit easier. Just right-click on the e-mail account and pick "Properties", then look in "Copies & Folders" and change where Mozilla/Thunderbird stores copies of e-mails that you have sent.

For the technically minded, the lines in "prefs.js" that are affected by this change are:

user_pref("mail.identity.id5.fcc_folder", "imap://username@imap.somedomain.com/Sent Items");
user_pref("mail.identity.id5.fcc_folder_picker_mode", "1");

2) Displaying an error page instead of just a blank page when a webpage times out. One of the most annoying features in base Mozilla/Firebird is the way that they handle timeout errors. Instead of getting an error message on the screen, you get a blank page and the location bar will have been cleared. Which, if you were trying to load the page in the background for later viewing, means you have to try and remember or figure out what link you were trying to look at. Adding the following line to your prefs.js file will at least give you an error display that lets you retry the URL:

user_pref("browser.xul.error_pages.enabled", true);
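If you make these edits often, the backup-then-edit routine can be scripted. This is a rough sketch with made-up function names, not anything Mozilla provides. It assumes each user_pref sits on a single line, and Mozilla still has to be fully closed before you run it.

```python
import re
import shutil
from pathlib import Path

def set_user_pref(text: str, name: str, value_js: str) -> str:
    """Replace an existing user_pref(name, ...) line or append a new one."""
    new_line = f'user_pref("{name}", {value_js});'
    pattern = re.compile(
        r'^user_pref\("' + re.escape(name) + r'",.*\);[ \t]*$', re.MULTILINE
    )
    if pattern.search(text):
        return pattern.sub(lambda m: new_line, text)
    if text and not text.endswith("\n"):
        text += "\n"
    return text + new_line + "\n"

def update_prefs(path: str, name: str, value_js: str) -> None:
    """Back up prefs.js next to the original, then write the changed preference."""
    prefs = Path(path)
    shutil.copy2(prefs, prefs.with_name(prefs.name + ".bak"))  # backup first
    prefs.write_text(set_user_pref(prefs.read_text(), name, value_js))
```

For example, `update_prefs("prefs.js", "browser.xul.error_pages.enabled", "true")` would cover item 2 above; string values need their own quotes, as in `'"Deleted Items"'`.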

Friday, July 02, 2004
Removable PATA/SATA Drive Bays
Shopping around for some removable drive bays (either PATA or SATA). I definitely don't want anything plastic, which cuts the field a bit. StarTech seem to make some nice drawers with multiple fans.

It's a pity that their multi-bay SATA systems aren't ready yet, because that looks interesting. (Holds 3 SATA drives in removable shells, takes up 2 5.25" drive bays, has a single large fan on the back.)

DRW115ATA
DRW115ATABK - Black, aluminum, PATA, ~$60
DRW115SAT
DRW115SATBK - SATA, aluminum, black, ~$60-70

Extra bays are $20-$30, extra caddies are $50-$60.

Since we'll be using these for backups, we'll probably go with PATA and 5400rpm drives (which run a lot cooler than 7200rpm SATA drives). I have to allow for ambient temperatures up to 85F at the office and I'm afraid that 7200rpm drives would cook themselves.

Update: I now own a set of the DRW115ATA drawers. The construction is quite sturdy, but the fit and finish is not always the best. They can be moderately difficult to slide in and out of the drive bays.

Friday, June 25, 2004
SPF Records
Published SPF records for my domain this week.

Sunday, June 20, 2004
Linux Bare-Metal Backup and Recovery Planning
The whole concept behind bare-metal restore is that it's the shortest point from pulling a replacement set of server hardware out of a box to having a working system to replace the one that's down. Ideally, without having to go through re-installing the operating system (which, in the case of Gentoo, can take a few hours). It also (generally) only works when you have exactly the same hardware in the replacement box as the box that's kaput. So if you're using any exotic hardware (e.g. a fancy RAID card), you'd best buy (3). That way when the first one dies, you can put in one of the (2) spare cards and still get at your data.

Some backup software makes the process as simple as dropping a boot CD in the drive and inserting the most recent bare-metal backup tape (sometimes called a "gold" tape?). It can also be a multi-step process, where the first step restores the operating system to a known state and the second step handles restoring the data and applications.

Since I'm not ready to tackle writing a guide, I'll settle for a few links:

Linux Complete Backup and Recovery HOWTO (by Charles Curley) - A very good starting point.

Linux Journal: Bare Metal Recovery, Revisited (by Charles Curley, Aug 2002) - A good sidebar discussion of the HOWTO document that he wrote.

About.com: Linux backup - A collection of links to backup software.

Unix SysAdm Resources: Backup & Archival Software for Unix (Stokely Consulting) - Listing of backup software for unix.

Linux Backups Mini-FAQ (Karsten M. Self) - Good article on simply using "tar".

Saturday, June 19, 2004
Gentoo: Segmentation fault in vgscan during boot
Now for the other error that I got during the initial bootup.
* Using /etc/modules.autoload.d/kernel-2.6 as config:
* Loading module dm-mod... [ ok ]
* Autoloaded 1 module(s)
* Setting up the Logical Volume Manager...
/sbin/rc: line 429: 4422 Segmentation Fault /sbin/vgscan >/dev/nul [ ok ]
* Starting up RAID devices: ...
* Checking all filesystems...
/dev/md0: clean, 39/18072 files, 5573/72192 blocks
fsck.ext: No such file or directory while trying to open /dev/vgmirror/opt
/dev/vgmirror/opt:
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193

("No such file or directory..." error repeats for all of the other
logical volumes in the volume group(s) on the system)

My initial guess is that the software RAID is not loading up prior to the LVM stuff trying to load. Possibly, I'll have to edit the ordering in "/etc/init.d/checkfs", however since RAID is compiled into the kernel as built-in, and the LVM stuff is a module, the RAID should've already started prior to the LVM stuff.

Looking closer at the boot screen, I can see the "md:" lines correctly autodetecting the RAID arrays. So RAID support seems to be working fine. In fact, if I login to maintenance mode, and "cat /proc/mdstat", all of the RAID arrays show up correctly.

Attempt #1:

Moved things around in /etc/init.d/checkfs. No change in the end-result, except that the messages are re-ordered ("Starting up RAID devices" now appears before "Setting up the Logical Volume Manager") and the error message changes to "4422 Segmentation Fault /sbin/vgscan >/dev/nul". Probably a dry-hole in terms of finding and fixing the real problem.

Attempt #2:

Flipped back to the Gentoo LVM2 documents to see if I missed anything in setting up the LVM set to auto-mount at startup. Booted my way into maintenance mode and used "vgscan -v" to let vgscan attempt to find all of the volume groups. "vgscan" will take a while to run, but at least with verbose (-v) mode you'll be able to see some status. On my setup, "vgscan" correctly located the "vgmirror" volume group.

Took a look at the "/etc/lvm" folder on the root volume using "ls -la /etc/lvm" and saw something surprising. There is a ".cache" file which is huge (mine was 10881785 bytes in size). Doing a "cat" of the contents, I see some entries like "/dev/discs/disc2/discs/disc2.../disc2/md/254" which looks like a recursive loop of some sort.

Hint #2, run "vgdisplay -vv" and I see the error message "Too many levels of symbolic links" after each of those long entries. I also see this problem if I run "vgscan -vv". I finally changed my "/etc/lvm/lvm.conf" file to look like the following, and vgscan and vgdisplay are very quick at finding the volume group on my raid array and no longer segfault while looking at other items:
devices {
    scan=["/dev/md"]
    filter=["a|^/dev/md/3$|","r/.*/"]
}

Note that this filter only allows vgscan to scan the "md3" device. This keeps vgscan from scanning other devices that don't need to be scanned on my system (and fixes the segfault issue where it goes into infinite recursion on certain devices). If you need to scan other RAID devices (/dev/md1, etc.) or other physical partitions, then you'll need to adjust the "accept" portion of the filter.
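To see why that filter leaves only one device, it helps to walk through how the entries are applied: each entry is "a" (accept) or "r" (reject), followed by a delimiter character, a regular expression, and the delimiter again; the first entry that matches a device decides, and a device matching nothing is accepted. The sketch below imitates that logic with Python's regex engine, which is only an approximation of LVM's own matcher, and the device names are examples rather than a listing from my system.

```python
import re

def parse_filter(entries):
    """Turn LVM-style filter strings into (accept, compiled_regex) pairs."""
    rules = []
    for entry in entries:
        if len(entry) < 3 or entry[0] not in "ar" or not entry.endswith(entry[1]):
            raise ValueError(f"malformed filter entry: {entry!r}")
        # entry[0] is the action, entry[1] the delimiter, the rest is the pattern.
        rules.append((entry[0] == "a", re.compile(entry[2:-1])))
    return rules

def accepted(device, rules):
    """First matching rule decides; a device matching no rule is accepted."""
    for accept, regex in rules:
        if regex.search(device):
            return accept
    return True

rules = parse_filter(["a|^/dev/md/3$|", "r/.*/"])
for dev in ["/dev/md/3", "/dev/md/0", "/dev/discs/disc2/disc"]:
    print(dev, "->", "scan" if accepted(dev, rules) else "skip")
```

The trailing "r/.*/" entry is what does the real work: without it, every device that failed the accept pattern would fall through and be scanned anyway.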

Save, shut down the RAID (raidstop -a /dev/md0 for each /dev/md* device), and reboot. My server now boots up correctly.

Links:

Re: mkfs.xfs on software raid5 (2.6.5 kernel) - MD array /dev/md2 not in clean state (alt.os.linux.gentoo) - Shows the exact error message that I'm seeing.

Re: [gentoo-user] LVM2 Date: 2004-03-31 15:40:13 PST (linux.gentoo.user) - Talks about the ordering in which software RAID and the LVM modules load.

Example of lvm.conf file - shows a more complex lvm.conf file, complete with multiple filters.

A more complete example lvm.conf file

Gentoo: Failed to load mii
More self-inflicted pain (guarantee that I'm doing this to myself, not due to the Gentoo install guide...)

During boot-up after installing the 2.6.6 kernel, the following (2) modules fail to load:
Loading module mii...
Failed to load mii

Loading module via-rhine...
Failed to load via-rhine

The second error is because I had configured the kernel to load "via-rhine" as built-in and not as a module. You can do one or the other, but not both.

Not sure about the "mii" error, I'll merely comment it out in the autoload config file for now and see what happens.

Friday, June 18, 2004
Securing cwRSync
At the office we're working on setting up cwRSync on the web server array to push the daily web/ftp/smtp log files back to a central point for archiving. Right now, since all of the web servers are on the same LAN segment at the hosting facility, we're just sending the plain text data across the wire to the rsync port (tcp/873). Since the previous solution was to use FTP to move the log files around, it's no worse than the old solution from a security standpoint. (It is, however, much faster and more efficient.) Security is handled solely through the rsyncd.conf "hosts allow" setting (only the internal IP addresses are allowed to be used to transfer the data) with no passwords or shared keys.

However, since the next step is to set up pulling those log files automatically back to the main office, we need to look into locking it down further and putting encryption in place (e.g. routing rsync traffic over an ssh tunnel).

After digging around a bit here's what I've found:

The cwRSync Service does not support SSH, so there's no way to connect securely to an rsync server that is using cwRSync as its daemon. Future releases are expected to add ssh support for cwRSync servers. Locking down through IP address and username/password is the limit of what you can do for security; all traffic is in the clear (unless you have IPSec between the two machines).

However, you can use cwRSync in a client-configuration and route the traffic over SSH to a SSH-capable rsync server.

That being said, I'm going to explore some other packages. All of which will either require that cygwin be installed, or at least that certain cygwin DLLs be installed.

Links:

Rsync wrapper for Win32 - Uses the cygwin DLLs, but doesn't require a full cygwin install, includes SSH.

Thursday, June 17, 2004
Troubleshooting software RAID boot problems
First problem is that the system boots straight into "grub". Probably due to a missing "grub.conf" file, which I'm pretty sure I had written to the proper location earlier in the install. So I'll load the kernel by hand and fix the config file once I get the box to boot properly.
grub>
grub> cat /boot/grub/grub.conf
(no file found)

grub> root (hd0,0)
grub> kernel /kernel-2.6.6-gentoo root=/dev/md2
grub> boot

This starts the kernel boot process, which then leads me to my second error. In the meantime, if you use [shift-PgUp] and [shift-PgDn] you can scroll back and forth through the boot messages. A minor problem is that since the RAID didn't shut down cleanly, there are numerous "md: mdx: raid array is not clean -- starting background reconstruction" messages. And I can't even begin to troubleshoot the "Kernel panic: No init found" error until the resync is done (that's a 2.5-3.0 hour process).

Looking back at my kernel configuration, I see that Software RAID with RAID1 support was compiled as BUILTIN, and ext2/ext3 were installed by default if I remember correctly. Those are two of the possible errors that could cause the kernel not to be able to read from the "/" (root) partition.

Also possible is that the "/" filesystem was not properly mounted during the install. The following is what I've tried to do in order to fix the issue (or at least diagnose the issue). I would strongly recommend that you do not use the following on a production system unless you understand what everything does. Since I don't have any data on the system (other than config files), I have a good amount of latitude with regards to what I can do. Hopefully, since /boot, /usr, /opt, /var and /home are intact, things will go quicker than the first install.

This repair process is basically a complete reinstall because nothing else under "/" (root) actually got written to the hard drive. Only the separately mounted folders such as /boot, /opt, /usr, /var, and /home were properly mounted.

Note: The following steps assume that you have nothing on your partitions worth keeping, or that you've already backed everything up. I tried doing it without formatting the "/" partition, but the bootstrap.sh file keeps dying on line 84. So I'm going to do a format of the "/" partition as well as /opt, /usr, /var, /tmp, and /var/tmp.

Boot the LiveCD, at the boot prompt be sure to pick the boot kernel and pass the arguments that you want, then load the raid and LVM modules.
boot: gentoo -nohotplug
(gentoo kernel now loads)

livecd root # modprobe md
livecd root # modprobe dm-mod
livecd root # modprobe via-rhine (if your network adapter failed to autoload)
livecd root # net-setup eth0 (if your network adapter failed to autoload)

If you have a copy of your /etc/raidtab file on floppy, copy it in now, otherwise you'll have to re-key your /etc/raidtab by hand. If you need to, you can temporarily mount partitions to copy files from.
livecd root # nano -w /etc/raidtab
livecd root # raidstart -a
livecd root # cat /proc/mdstat
(verify that all raid sets are up and running)

livecd root # swapon /dev/md1
livecd root # mke2fs -j /dev/md2 (OVERWRITES YOUR / PARTITION)
livecd root # mount /dev/md2 /mnt/gentoo
livecd root # mkdir /mnt/gentoo/boot
livecd root # mount /dev/md0 /mnt/gentoo/boot
livecd root # ls /mnt/gentoo/boot
(if you have files already in /boot, this verifies that you mounted in the proper order)

livecd root # mkdir /etc/lvm
livecd root # echo 'devices { filter=["r/cdrom/"] }' > /etc/lvm/lvm.conf
livecd root # vgscan
(verify that it found your existing volume group or groups)

livecd root # vgchange -ay vgmirror
livecd root # lvscan
(verify that your logical volumes now show up and are "ACTIVE")

livecd root # mke2fs -j /dev/vgmirror/opt (OVERWRITES YOUR /opt VOLUME)
livecd root # mke2fs -j /dev/vgmirror/usr (OVERWRITES YOUR /usr VOLUME)
livecd root # mke2fs -j /dev/vgmirror/var (OVERWRITES YOUR /var VOLUME)
livecd root # mke2fs -j /dev/vgmirror/home (OVERWRITES YOUR /home VOLUME)
livecd root # mke2fs /dev/vgmirror/tmp (OVERWRITES YOUR /tmp VOLUME)
livecd root # mke2fs /dev/vgmirror/vartmp (OVERWRITES YOUR /var/tmp VOLUME)

livecd root # mkdir /mnt/gentoo/opt
livecd root # mkdir /mnt/gentoo/usr
livecd root # mkdir /mnt/gentoo/var
livecd root # mkdir /mnt/gentoo/home
livecd root # mount /dev/vgmirror/opt /mnt/gentoo/opt
livecd root # mount /dev/vgmirror/usr /mnt/gentoo/usr
livecd root # mount /dev/vgmirror/var /mnt/gentoo/var
livecd root # mount /dev/vgmirror/home /mnt/gentoo/home

livecd root # mkdir /mnt/gentoo/tmp
livecd root # mount /dev/vgmirror/tmp /mnt/gentoo/tmp
livecd root # chmod 1777 /mnt/gentoo/tmp
livecd root # mkdir /mnt/gentoo/var/tmp
livecd root # mount /dev/vgmirror/vartmp /mnt/gentoo/var/tmp
livecd root # chmod 1777 /mnt/gentoo/var/tmp
livecd root # mkdir /mnt/gentoo/proc
livecd root # mount -t proc none /mnt/gentoo/proc

livecd root # date
livecd root # cd /mnt/gentoo
livecd gentoo # tar -xvjpf /mnt/cdrom/stages/stage1-x86-20040218.tar.bz2
livecd gentoo # tar -xvjf /mnt/cdrom/snapshots/portage-20040223.tar.bz2 -C /mnt/gentoo/usr
livecd gentoo # mkdir /mnt/gentoo/usr/portage/distfiles
livecd gentoo # cp /mnt/cdrom/distfiles/* /mnt/gentoo/usr/portage/distfiles/

livecd gentoo # nano -w /mnt/gentoo/etc/make.conf
(repeat your make.conf from the initial install)

livecd gentoo # mirrorselect -a -s4 -o | grep -ve '^Netselect' >> /mnt/gentoo/etc/make.conf
livecd gentoo # cp -L /mnt/gentoo/etc/make.conf /mnt/gentoo/boot/make.conf-backupcopy
livecd gentoo # cp -L /etc/resolv.conf /mnt/gentoo/etc/resolv.conf
livecd gentoo # cp -L /etc/raidtab /mnt/gentoo/etc/raidtab
livecd gentoo # cp -L /etc/raidtab /mnt/gentoo/boot/raidtab-backupcopy
livecd gentoo # mkdir /mnt/gentoo/etc/lvm
livecd gentoo # cp -L /etc/lvm/lvm.conf /mnt/gentoo/etc/lvm/lvm.conf
livecd gentoo # cp -L /etc/lvm/lvm.conf /mnt/gentoo/boot/lvm.conf-backupcopy
livecd gentoo # chroot /mnt/gentoo /bin/bash
livecd / # env-update
livecd / # source /etc/profile
livecd / # emerge sync
livecd / # cd /usr/portage
livecd / # scripts/bootstrap.sh

If bootstrap runs correctly (and it should now that I re-formatted the /opt, /usr, /var, /home, /tmp, and /var/tmp volumes), I can pick back up with the rest of my original install process.


Wednesday, June 16, 2004
Gentoo Install 6 (Grub, System Tools, Finalizing the Install)
(previous post about configuring the kernel and setting up the filesystem)

Picking up with 9.b. Default: Using GRUB in the handbook. (Also see my older post about configuring grub.) Also take a look at the end of the thread on the gentoo forums (look for user "havoc"), which discusses how to set up grub on both the primary and secondary drives (see the original article). Now is also a good time to pull up the official Software RAID HOWTO and review that as well (especially section 7.3).
# grub --no-floppy
grub> find /grub/stage1
(hd0,0)
(hd1,0)
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/hdc
grub> root (hd0,0)
grub> setup (hd0)
grub> quit

The above is a little tricky to follow. The first "root" command tells grub that its files live on the first partition of hd0 (which is /dev/hda in my config), and the first "setup" command installs grub to the MBR of that drive.

Then we pull a switch on grub with the "device" command, telling it to pretend that /dev/hdc (the secondary drive in the RAID array) is now hd0. The second set of "root"/"setup" commands then writes the MBR out to the 2nd drive in the array. If I understand everything correctly, this means that if the primary drive dies, the system will still be able to boot off of the secondary drive. (I don't believe that you would need to move it to the primary cable.)
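
The same root/setup sequence can be generated for any number of mirror members. This is only a minimal sketch, assuming GRUB legacy and that /boot is the first partition on every disk; the function name "grub_mirror_commands" is made up for illustration. It only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Print GRUB (legacy) shell commands that install the boot loader on
# every disk given as an argument. Each disk is temporarily mapped to
# (hd0), which is the "device" trick described above.
# Assumption: /boot is the first partition on each disk.
grub_mirror_commands() {
    for disk in "$@"; do
        printf 'device (hd0) %s\n' "$disk"
        printf 'root (hd0,0)\n'
        printf 'setup (hd0)\n'
    done
    printf 'quit\n'
}

# Only print the commands; nothing is written to disk here.
grub_mirror_commands /dev/hda /dev/hdc
```

Once the output looks right, it could be piped into "grub --batch --no-floppy" instead of typing the commands at the grub prompt.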

Now edit/create your config file for grub, "nano -w /boot/grub/grub.conf". You'll need to know the name of your kernel file that you compiled and copied to /boot earlier. Here's what mine looks like (booting from the first partition on the first drive):
default 0
timeout 30
title=Gentoo Linux 2.6.6 (June 16 2004)
root (hd0,0)
kernel /kernel-2.6.6-gentoo root=/dev/md2

Refer to the handbook, Installing Necessary System Tools, for the next few commands (I mostly used the defaults, but this is a cut-n-paste from my previous install).
# emerge syslog-ng
# rc-update add syslog-ng default
# emerge dcron
# rc-update add dcron default
# crontab /etc/crontab

Then refer to Finalizing your Gentoo Installation.
# passwd
# useradd john -m -G users,wheel,audio -s /bin/bash
# passwd john

Note: Now you need to unmount everything that you can (including the LVM volumes), and possibly shut down the RAID arrays as well, prior to reboot.
livecd gentoo # exit
livecd / # cd /
livecd / # cat /proc/mounts

(unmount all of your mounted partitions, including the LVM mounts)
livecd / # umount ... (insert list of mounted file systems)

livecd / # vgchange -an vgmirror
livecd / # raidstop -a /dev/md0
livecd / # raidstop -a /dev/md3
livecd / # reboot

If all goes well, the system should shut down and then restart from the software RAID.

My system locked up during shutdown on or after "Stopping USB and PCI hotplugging". That means I didn't properly specify the "nohotplug" option at the "boot:" prompt way back when I booted off the LiveCD.
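
For the "unmount everything" step above, the list of file systems can be pulled out of /proc/mounts rather than typed by hand. A minimal sketch, assuming everything for the new system is mounted under /mnt/gentoo; "mounts_deepest_first" is a hypothetical helper name, and the script only prints the umount commands for review.

```shell
#!/bin/sh
# List every mount point at or below a prefix, deepest path first, so
# that nested mounts (e.g. /mnt/gentoo/var/tmp) come before their
# parents (/mnt/gentoo/var, /mnt/gentoo).
# $1 = prefix, $2 = mounts file (defaults to /proc/mounts)
mounts_deepest_first() {
    prefix=$1
    file=${2:-/proc/mounts}
    # Field 2 of /proc/mounts is the mount point.
    awk -v p="$prefix" '$2 == p || index($2, p "/") == 1 { print $2 }' "$file" |
        LC_ALL=C sort -r
}

# Print the umount commands for review instead of running them.
if [ -r /proc/mounts ]; then
    mounts_deepest_first /mnt/gentoo | sed 's/^/umount /'
fi
```

The reverse sort works because a parent path is always a prefix of its children, so it sorts before them in ascending order and after them in descending order.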


Gentoo Install: "emerge grub" or "emerge lilo" fails to mount /boot
Sometimes "emerge grub" or "emerge lilo" fails with the following error when you are attempting to mount "/boot" on a software RAID1 partition:
*
* Cannot automatically mount your /boot partition.
* Your boot partition has to be mounted rw before the installation
* can continue. lilo needs to install important files there.
*

!!! ERROR: sys-boot/lilo-22.5.8-r1 failed.
!!! Function mount-boot_mount_boot_partition, Line 53, Exitcode 0
!!! Please mount your /boot partition manually!

!!! FAILED preinst: 1

(The error messages will be pretty much identical for both "lilo" and "grub".)

The problem was, for me at least, that prior to doing the chroot into the new environment, I had failed to mkdir and mount the /boot partition. (See the start of step 2 in my install notes.)

Here's the quick-n-easy way that I fixed the problem. I had to temporarily exit out of the chroot'd environment, back to the livecd bootup environment, mount the partition, and then chroot back.
livecd / # exit
livecd / # mkdir /mnt/gentoo/boot
livecd / # mount /dev/md0 /mnt/gentoo/boot
livecd / # cat /proc/mounts
livecd gentoo # chroot /mnt/gentoo /bin/bash
livecd / # env-update
livecd / # source /etc/profile

(now copy your kernel into /boot again from /usr/src/linux-2.6.6)

livecd / # emerge lilo
(or if you're using grub...)
livecd / # emerge grub


Update: While this set of instructions did properly fixup the /boot partition with "grub", it really didn't treat the root cause of the entire mess. (See Troubleshooting Software RAID.)

The root-cause was that back when I did the first part of the install, not only did I fail to mount the /mnt/gentoo and /mnt/gentoo/boot folders properly, but I then mounted them out-of-order when I did catch the error. That causes all sorts of problems down the road, yet the install process will look like it's going off without a hitch (until you reboot).
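
A quick check before the chroot would have caught this. The following is a rough sketch, assuming that /proc/mounts lists entries in the order they were mounted; "mounted_in_order" is a made-up helper name.

```shell
#!/bin/sh
# Succeed only if both mount points are present in the mounts file and
# the parent is listed before the child.
# Assumption: entries appear in the order they were mounted.
# $1 = mounts file, $2 = parent mount point, $3 = child mount point
mounted_in_order() {
    awk -v parent="$2" -v child="$3" '
        $2 == parent && !p { p = NR }
        $2 == child  && !c { c = NR }
        END { exit !(p && c && p < c) }
    ' "$1"
}

# Check the pair that caused trouble in this install before chrooting.
if [ -r /proc/mounts ]; then
    mounted_in_order /proc/mounts /mnt/gentoo /mnt/gentoo/boot ||
        echo "fix the /mnt/gentoo mounts before running chroot"
fi
```

If the check fails, unmount both and mount /mnt/gentoo first, then /mnt/gentoo/boot.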

Links:

www.gentoo.pl - This page might've had the answer, but unfortunately it was written in Polish.


Gentoo Install 5 (Manual Kernel Configuration)
(previous post - building the kernel)

Note: This is for a VIA EPIA ME6000 motherboard being used as a headless server. All of the multimedia and graphic options are disabled. (See my previous install.) If this is your first install, you should probably use the "genkernel" method rather than manual configuration. The Gentoo docs explain configuring the kernel. They recommend being familiar with the "cat /proc/pci" and "lsmod" commands, which is something I missed on my previous install.
# cd /usr/src/linux
# make menuconfig

Anywhere in the following list where I say "turn ON", use the "Y" key to turn the option on as built-in. I'll specifically say MODULE if I loaded the option as a module.

Linux Kernel v2.6.6 Configuration
(C)ode maturity level options
(G)eneral setup
--> (C)onfigure standard kernel features for small systems (turn ON)
--> --> (O)ptimize for size (turn ON)
(L)oadable module support
(P)rocessor type and features
--> (P)rocessor family (changed to "CyrixIII/VIA-C3")
--> (S)ymmetric multi-processing support (turned this one OFF)
--> M(a)chine Check Exception (turned this OFF)
(P)ower management options (ACPI, APM)
(B)us options (PCI, PCMCIA, EISA, MCA, ISA)
(E)xecutable file formats
(D)evice drivers
--> (P)arallel port support (turned OFF)
--> (A)TA/ATAPI/MFM/RLL support (turned ON the VIA82CXXX chipset support as BUILT-IN)
--> M(u)lti-device support (turn it ON)
--> --> (R)AID support (turn it ON as BUILT-IN)
--> --> --> (R)AID-1 mirroring mode (turn it ON as BUILT-IN)
--> --> (D)evice mapper support (set to MODULE, per section 13 of LVM2 guide)
--> N(e)tworking support
--> --> N(e)twork device support, (E)thernet 10/100Mbit
--> --> --> (R)ealTek RTL-8139 PCI (turn OFF)
--> --> --> (V)IA Rhine (turn ON as BUILT-IN)
--> --> --> --> (U)se MMIO instead of PIO (turn ON)
--> (C)haracter Devices
--> --> (I)ntel/AMD/VIA HW Random Number Generator (turn ON as BUILT-IN)
--> --> /(d)ev/agpgart AGP Support
--> --> --> (I)ntel 440LX/BX/GX I8xx E7x05 (turn OFF)
--> --> --> (V)IA chipset support (turn ON as BUILT-IN)
--> (I)2C support (turn ON, heavily reliant on building an MP3 server for these options)
--> --> (I)2C device interface (turned ON as BUILT-IN, epiawiki says "on", MP3 server article says "off")
--> --> (I)2C Algorithms
--> --> --> (I)2C bit-banging interface (turn ON as BUILT-IN)
--> --> (I)2C Hardware Bus support
--> --> --> (V)IA 82C586B support (turn on as BUILT-IN)
--> --> (I)2C Hardware Sensors Chip
--> --> --> (V)IA686A (turn on as BUILT-IN)
--> (S)ound
--> --> (S)ound card support (turn OFF)
(F)ile systems
--> (P)seudo filesystems
--> --> /(d)ev file system support OBSOLETE (turn ON)
--> --> --> (A)utomatically mount at boot (turn ON)
(P)rofiling support
(K)ernel hacking
(S)ecurity options
(C)ryptographic options
--> (C)ryptographic API (turn ON)
--> --> (H)MAC support (turn ON)
--> --> (turn ON the others as MODULE)
(L)ibrary routines

Exit and save your configuration. Then build the kernel (the following is for 2.6 kernels). Expect the compile to take about an hour.
# make && make modules_install

Now you need to install your kernel into the boot partition. Change the "2.6.6-gentoo" portion of the filenames to whatever you want.
# cp arch/i386/boot/bzImage /boot/kernel-2.6.6-gentoo
# cp System.map /boot/System.map-2.6.6-gentoo
# cp .config /boot/config-2.6.6-gentoo

Next is 7.e. Installing Separate Kernel Modules, which is where we specify which of the options configured above as "MODULE" (instead of "BUILT-IN") get loaded at bootup. Use the command "nano -w /etc/modules.autoload.d/kernel-2.6" to edit your config file. Here is what mine looked like (yours will probably be different).
# autoloads the following modules at boot time
#LVM2 (logical volume manager)
dm-mod

#ethernet
#mii (not needed?)
#via-rhine (compiled as built-in)

I also need to emerge in LVM2 support as well as the "raidtools" package (per Gentoo x86 Installation Tips & Tricks).
# modules-update
# emerge lvm2
# emerge raidtools

Time to edit the "/etc/fstab" table (see 8.a. Filesystem Information and also refer back to my mount commands from earlier). Here's my "/etc/fstab" file:
/dev/md0 /boot ext2 noauto,noatime 1 2
/dev/md2 / ext3 noatime 0 1
/dev/md1 none swap sw 0 0
/dev/cdroms/cdrom0 /mnt/cdrom auto noauto,user 0 0

/dev/vgmirror/opt /opt ext3 noatime 0 3
/dev/vgmirror/usr /usr ext3 noatime 0 3
/dev/vgmirror/var /var ext3 noatime 0 3
/dev/vgmirror/home /home ext3 noatime 0 0
/dev/vgmirror/tmp /tmp ext2 noatime 0 3
/dev/vgmirror/vartmp /var/tmp ext2 noatime 0 3

none /proc proc defaults 0 0
none /dev/shm tmpfs defaults 0 0
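
Since a broken fstab only shows up at the next reboot, a quick sanity check is cheap insurance. This is just a sketch that flags entries without exactly six fields; it will not catch a misspelled mount option, and "fstab_bad_lines" is a hypothetical helper name.

```shell
#!/bin/sh
# Print any fstab entry that does not have exactly six fields
# (device, mount point, type, options, dump, pass). Comments and
# blank lines are skipped.
# $1 = fstab file (defaults to /etc/fstab)
fstab_bad_lines() {
    awk '!/^[ \t]*#/ && NF > 0 && NF != 6 { print FILENAME ":" NR ": " $0 }' \
        "${1:-/etc/fstab}"
}

if [ -r /etc/fstab ]; then
    fstab_bad_lines
fi
```

No output means every entry has the expected number of columns.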

Change your hostname, domainname, and the default run level.
# echo yourhostname > /etc/hostname
# echo yourdnsname > /etc/dnsdomainname
# rc-update add domainname default
# nano -w /etc/conf.d/net
(either use iface_eth0="dhcp" or configure your IP and gateway)
# rc-update add net.eth0 default
# cat /etc/resolv.conf
(verify your DNS servers if you specified a static IP)
# nano -w /etc/rc.conf
(change CLOCK="UTC" to CLOCK="local")

Onward to chapter 9, configuring the bootloader. Here's where I ran into trouble; "emerge grub" or "emerge lilo" failed with "cannot automatically mount your /boot partition".
# emerge grub


(continued in my next post)


Gentoo Install 4 (Installing the Kernel Sources)
(previous post)

Picking up with 7. Configuring the Kernel. If your system crashes after this point, I do have a few notes jotted down on how to get back to here without rebuilding everything. (Since this is where I screwed up last time and put the machine into an unusable state.)

Timezone for me is EST5EDT, so here's how to set that up.
# ls /usr/share/zoneinfo
# ln -sf /usr/share/zoneinfo/EST5EDT /etc/localtime
# date 06161009
# zdump GMT
# zdump EST5EDT

Next, pick your kernel. For me, since gentoo-sources, gs-sources (gentoo stable) and vanilla-sources are all still on the 2.4 kernel, I'm going to go with development-sources which is at version 2.6.6 and regardless of the name is actually a rather stable tree.
# emerge -s sources | less
# emerge development-sources

This will take a while to run. Last time I think it took somewhere around 2 hours, this time it only took 30-40 minutes. So my previous estimate was probably a bit off (or it took longer to download last time).

(next step)


Tuesday, June 15, 2004
Gentoo Install 3 (Bootstrapping)
(previous post)

Time to bootstrap the system (see moving from stage 1 to stage 2). If you have multiple machines on the network, all with the same version of gcc, now is when you'll want to set up distcc.
# cd /usr/portage
# scripts/bootstrap.sh

This will take a while to run (update: took 8 hours to run). If the bootstrap script fails, and you're re-using a portage tree (or possibly other files under /opt, /usr, /var, /home, /tmp or /var/tmp), then you may need to clean out the old files. (Generally not an issue if you're doing a fresh install.)

Once that finishes, run the following command.
# emerge system

Which will also take a while to run (last time it took around 5.5 hours). Update: Took around 4.5 hours this time.

(next post)


Gentoo Install 2 (VIA EPIA ME6000)
(Previous post about fdisk and setting up the software RAID.)

At this point, we've partitioned the disk and set up the "/etc/raidtab" file. It's a good idea to jot down everything in that file and put it in a safe place. You should also "cat /proc/mdstat" and jot that information down too.

The following commands will format the boot and root partitions (/dev/md0 is /boot, /dev/md2 is /). I'll also be setting up the swap on /dev/md1. Since we're doing RAID1, there's no need to use the "-R stride=n" option of mke2fs (that's only useful for RAID0, RAID4 or RAID5). Note that you must mount the "/" (root) partition before creating and mounting the boot folder within that tree.
# mke2fs /dev/md0
# mke2fs -j /dev/md2
# mkswap /dev/md1
# swapon /dev/md1
# mount /dev/md2 /mnt/gentoo
# mkdir /mnt/gentoo/boot
# mount /dev/md0 /mnt/gentoo/boot

Next, initialize the 4th RAID set in preparation for LVM (pvcreate). Create the "/etc/lvm/lvm.conf" file and create the volume group for the 4th RAID set (vgcreate). Also see the Gentoo LVM documentation. If needed, use "modprobe dm-mod" to load the LVM module.
# pvcreate /dev/md3
# echo 'devices { filter=["r/cdrom/"] }' >/etc/lvm/lvm.conf
# vgcreate vgmirror /dev/md3
# vgscan

Now we need to create some logical volumes inside our "vgmirror" volume group. Here's a list of my initial logical volumes:

4GB /tmp (ext2)
4GB /var/tmp (ext2)
2GB /opt (ext3)
4GB /usr (ext3)
4GB /var (ext3)
8GB /home (ext3)

Create the logical volumes using "lvcreate", then verify by looking in the "/dev/vgmirror" folder as well as "lvscan":
# lvcreate -L4G -ntmp vgmirror
# lvcreate -L4G -nvartmp vgmirror
# lvcreate -L2G -nopt vgmirror
# lvcreate -L4G -nusr vgmirror
# lvcreate -L4G -nvar vgmirror
# lvcreate -L8G -nhome vgmirror
# ls /dev/vgmirror
# lvscan
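
Before creating more volumes later on, it helps to know how much of the volume group is already spoken for. A small sketch that just adds up the sizes from my list; "total_gb" is a made-up helper name.

```shell
#!/bin/sh
# Add up a list of logical volume sizes (in GB) so the total can be
# compared against the free space in the volume group.
total_gb() {
    sum=0
    for size in "$@"; do
        sum=$((sum + size))
    done
    echo "$sum"
}

# Sizes from the list above: tmp, vartmp, opt, usr, var, home.
echo "planned: $(total_gb 4 4 2 4 4 8) GB"
```

Compare the total against what "vgdisplay vgmirror" reports for the volume group.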

Now, format the logical volumes with the desired filesystems.
# mke2fs /dev/vgmirror/tmp
# mke2fs /dev/vgmirror/vartmp
# mke2fs -j /dev/vgmirror/opt
# mke2fs -j /dev/vgmirror/usr
# mke2fs -j /dev/vgmirror/var
# mke2fs -j /dev/vgmirror/home

Make the directories to hold your mounted volumes. Mount your volumes.
# mkdir /mnt/gentoo/opt
# mkdir /mnt/gentoo/usr
# mkdir /mnt/gentoo/var
# mkdir /mnt/gentoo/home
# mount /dev/vgmirror/opt /mnt/gentoo/opt
# mount /dev/vgmirror/usr /mnt/gentoo/usr
# mount /dev/vgmirror/var /mnt/gentoo/var
# mount /dev/vgmirror/home /mnt/gentoo/home

Make the special directories to hold your temp file volumes (these require special permissions). Then mount your temp file volumes. Also mount your proc folder.

# mkdir /mnt/gentoo/tmp
# mount /dev/vgmirror/tmp /mnt/gentoo/tmp
# chmod 1777 /mnt/gentoo/tmp
# mkdir /mnt/gentoo/var/tmp
# mount /dev/vgmirror/vartmp /mnt/gentoo/var/tmp
# chmod 1777 /mnt/gentoo/var/tmp
# mkdir /mnt/gentoo/proc
# mount -t proc none /mnt/gentoo/proc

We are now ready to start installing Gentoo (chapter 5 in the handbook). Also see my previous post about CFLAGS, which might explain why I've chosen some particular settings. First, we need to extract the stage 1 tarball.
# date
# ls /mnt/cdrom/stages
# cd /mnt/gentoo
# tar -xvjpf /mnt/cdrom/stages/stage1-x86-20040218.tar.bz2
# ls /mnt/cdrom/snapshots
# tar -xvjf /mnt/cdrom/snapshots/portage-20040223.tar.bz2 -C /mnt/gentoo/usr
# mkdir /mnt/gentoo/usr/portage/distfiles
# cp /mnt/cdrom/distfiles/* /mnt/gentoo/usr/portage/distfiles/
# nano -w /mnt/gentoo/etc/make.conf

Now we need to configure the base compile options. Here's the content of my make.conf file (use at your own risk). Be sure to go look at 5.e. Configuring the Compile Options in the Gentoo Handbook. Also look at Gentoo USE flags and Gentoo Linux USE Variable Descriptions. I've set some very aggressive USE flags in my make.conf file (anything to do with graphics or multimedia since this is a headless file server) and I don't know whether it's proper to remove all of those USE flags yet. Note that even though the USE= line shown here is spread across two lines, it should be all one line in the actual make.conf file (the line break here is for visual clarity only).
CFLAGS="-Os -march=i586 -m3dnow -fomit-frame-pointer"
CHOST="i586-pc-linux-gnu"
USE="apache2 kerberos ldap -apm -gif -gnome -gtk -jpeg -kde -mad -mikmod -mpeg
-oggvorbis -opengl -oss -pdflib -png -qt -quicktime -sdl -truetype -xmms -xv"
CXXFLAGS="${CFLAGS}"
MAKEOPTS="-j2"

Next we're ready to install the base system, see 6. Installing the Gentoo Base System in the Gentoo Handbook.
# mirrorselect -a -s4 -o | grep -ve '^Netselect' >> /mnt/gentoo/etc/make.conf
# cp -L /mnt/gentoo/etc/make.conf /mnt/gentoo/boot/make.conf-backupcopy
# cp -L /etc/resolv.conf /mnt/gentoo/etc/resolv.conf
# cp -L /etc/raidtab /mnt/gentoo/etc/raidtab
# cp -L /etc/raidtab /mnt/gentoo/boot/raidtab-backupcopy
# mkdir /mnt/gentoo/etc/lvm
# cp -L /etc/lvm/lvm.conf /mnt/gentoo/etc/lvm/lvm.conf
# cp -L /etc/lvm/lvm.conf /mnt/gentoo/boot/lvm.conf-backupcopy
# chroot /mnt/gentoo /bin/bash
# env-update
# source /etc/profile
# emerge sync

Synchronization of the portage tree will take a while (depending on the speed of your internet connection and how fast your system is). My system downloaded 60MB or so worth of updates and took 30-60 minutes (at a guess). Meanwhile, I'll continue this topic in my next post.


Gentoo Install 1 (VIA EPIA ME6000)
Going to rebuild my VIA EPIA Gentoo linux server. While the current setup was fine, I've decided that I want to switch to use a pair of matched 5400rpm drives and software RAID1. The configuration is identical to the old drive configuration, except that I'm now using a pair of 300GB 5400rpm Maxtor drives. The power draw seems to be well within the limits of the tiny power-supply in the Morex Venus 668 case.

I'm going to skip some of the initial information about my setup as all of that really hasn't changed. Shared video memory is still only 32MB instead of the default 128MB, and I've turned off all of the ports and devices that I'm not going to use (leaving only ethernet, firewire and USB ports active). I'm still using the Gentoo 2004.0 Universal CD as my bootstrap system.

Start by booting up the Gentoo Universal CD, as soon as you see the "boot:" prompt, enter the following command (which hopefully fixes the shutdown/lockup issue I had at the end of the last install):
boot: gentoo nohotplug

The LiveCD will then boot up. Now load the md and dm-mod modules. Prior to loading these two modules, "/proc/mdstat" will not exist:
# modprobe md
# modprobe dm-mod

It's possible that your ethernet card on the VIA EPIA ME6000 will not be detected. To fix this, you'll need to load the via-rhine module by hand, and then reconfigure your network adapters.
# modprobe via-rhine
# net-setup eth0
# ifconfig

Create partitions using fdisk. I want a 64MB /boot partition, a 2048MB swap partition, a 2048MB root partition, and the rest set aside for LVM. Also see the gentoo install documentation section on preparing the disks.
# ls /dev/hd*
# fdisk /dev/hda

Command: n
Command action: p
Partition number: 1
First cylinder: 1
Last cylinder: +64M
Command: a
Partition number: 1
Command: t
Hex code: fd

Command: n
Command action: p
Partition number: 2
First cylinder: [enter]
Last cylinder: +2048M
Command: t
Partition number: 2
Hex code: fd

Command: n
Command action: p
Partition number: 3
First cylinder: [enter]
Last cylinder: +2048M
Command: t
Partition number: 3
Hex code: fd

Command: n
Command action: p
First cylinder: [enter]
Last cylinder: [enter]
Command: t
Partition number: 4
Hex code: fd

Command: p

Verify your configuration. My system had (4) primary partitions, with the first partition marked as active ('*' under the "Boot" column). Now, write the partition table to disk (be sure everything is correct).
Command: w
#

Repeat the above for the 2nd disk in the array (/dev/hdc in my case).

Create your "/etc/raidtab" configuration file (I used "nano -w /etc/raidtab", but other text editors will work).
# this config is for mirroring /dev/hda with /dev/hdc
# /boot (RAID1)
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 32
persistent-superblock 1
device /dev/hda1
raid-disk 0
device /dev/hdc1
raid-disk 1

# *swap* (RAID1)
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 8
persistent-superblock 1
device /dev/hda2
raid-disk 0
device /dev/hdc2
raid-disk 1

# / (RAID1)
raiddev /dev/md2
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 32
persistent-superblock 1
device /dev/hda3
raid-disk 0
device /dev/hdc3
raid-disk 1

# LVM (RAID1)
raiddev /dev/md3
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size 16
persistent-superblock 1
device /dev/hda4
raid-disk 0
device /dev/hdc4
raid-disk 1

# end of /etc/raidtab
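
The four stanzas differ only in the md number, the partition number and the chunk size, so they can also be generated. A minimal sketch, assuming the same /dev/hda and /dev/hdc pair as above; "raidtab_stanza" is a hypothetical helper, and the output should be reviewed before it goes anywhere near /etc/raidtab.

```shell
#!/bin/sh
# Print one RAID1 stanza in raidtab format.
# $1 = md number, $2 = partition number, $3 = chunk size
# Assumption: the mirror halves are /dev/hda and /dev/hdc, as above.
raidtab_stanza() {
    cat <<EOF
raiddev /dev/md$1
raid-level 1
nr-raid-disks 2
nr-spare-disks 0
chunk-size $3
persistent-superblock 1
device /dev/hda$2
raid-disk 0
device /dev/hdc$2
raid-disk 1

EOF
}

# Same four arrays as in the file above (md0..md3 on partitions 1..4).
raidtab_stanza 0 1 32
raidtab_stanza 1 2 8
raidtab_stanza 2 3 32
raidtab_stanza 3 4 16
```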

Create the raid set(s).
# mkraid /dev/md0
# mkraid /dev/md1
# mkraid /dev/md2
# mkraid /dev/md3

If you get the error message "raid_disks + spare_disks != nr_disks" when attempting to create any of your RAID sets, go back and verify your "/etc/raidtab" file as well as your disk partitions. Another possibility is that you have set the "chunk-size" setting too small or too large (e.g. "chunk-size 4" did not work for me, but "chunk-size 8" worked fine). Once created, the RAID sets will build in the background and you should periodically monitor their progress using "cat /proc/mdstat".
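
Rather than eyeballing the whole of /proc/mdstat each time, the progress line can be boiled down to one line per rebuilding array. This is only a sketch: it assumes the usual "resync = 12.6% (...) finish=127.5min" layout, which may differ between kernel versions, and "mdstat_progress" is a made-up name.

```shell
#!/bin/sh
# Print "device percent time-left" for every array that is currently
# resyncing or recovering.
# $1 = mdstat file (defaults to /proc/mdstat)
mdstat_progress() {
    awk '
        /^md[0-9]+ :/ { dev = $1 }
        /(resync|recovery) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) { pct = $i; sub(/^=/, "", pct) }
                if ($i ~ /^finish=/) eta = substr($i, 8)
            }
            print dev, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}

if [ -r /proc/mdstat ]; then
    mdstat_progress
fi
```

Arrays that have finished building produce no output.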

Since it's going to take 150 minutes to prep that last RAID volume, I'm going to pick this up again later. Data rate according to "cat /proc/mdstat" is around 30MB/sec, which is about what I'd expect for a 5400rpm PATA drive.
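
The build time is easy to sanity-check with back-of-the-envelope math: size divided by data rate. A rough sketch with assumed round numbers (the whole 300GB drive taken as 300000 MB, at a flat 30MB/sec); "resync_minutes" is a hypothetical helper, and the real rate drifts as the heads move across the platters.

```shell
#!/bin/sh
# Rough resync estimate: size in MB divided by the rate in MB/sec,
# converted to whole minutes (integer math, rounds down).
resync_minutes() {
    size_mb=$1
    rate_mb_per_sec=$2
    echo $(( size_mb / rate_mb_per_sec / 60 ))
}

# Assumed figures: a 300GB drive taken as 300000 MB, at 30MB/sec.
resync_minutes 300000 30
```

That lands in the same ballpark as the estimate reported by /proc/mdstat.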


LVM and Software RAID
Links:

Linux Logical Volume Manager (LVM) on Software RAID - explains the benefits of using LVM, and also how to get LVM running on top of software RAID in RedHat 8.

Google search for how to do software RAID and LVM

The Software-RAID HOWTO - Updated June 2004, so it's not an old outdated document. Also see the official URL.

Gentoo Server Project Wiki

Notes on Building a Linux Storage Server

Gentoo Forums: How to do a gentoo install on a software RAID - Thread was started back in July 2002, but there are posts as recently as June 2004.

Summary:

The "grub" boot loader needs a real physical partition to boot from (or possibly a RAID1 setup, RAID0 definitely doesn't work). What some folks do is to create (2) identical physical partitions on the drives in the RAID array, and then periodically copy from the primary partition to the secondary partition. (See gentoo forum post or search for user "wrex". Also look for posts by "hover" or "blake121666".) The copy command is as simple as "dd if=/dev/hda1 of=/dev/hdb1 bs=8192b". You need to customize that to match your configuration (e.g. changing the source/dest partitions and the block size). If I can puzzle it all out, I think I'm going to go with /boot on RAID1. See section 7.3 of the Software RAID HOWTO.

The "swap" partition should be mirrored. Otherwise when the disk fails, existing applications with data in the swap partition of the failed drive will die a nasty death. See section 2.3 of the Software RAID HOWTO.

There's a special set of commands that needs to be run in "grub" if you want to be able to pull the primary disk and still have the system bootable (in case the primary disk fails). This is all mentioned in the gentoo forums thread about software RAID; look for the post by "hover". Alternatively, you can simply use a boot floppy and fix up the grub settings after the primary drive fails.

Replacing a failed disk in a software RAID1 array requires partitioning the drive by hand prior to rebuilding the array. This is also mentioned in the gentoo forums thread about software RAID, look for the post by "hover".

Searching around on my Gentoo 2004.0 Universal CD, I don't see the "mdadm" tools anywhere, but I do see "raidtools".
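
For the "periodically copy /boot with dd" approach mentioned in the summary, it's worth verifying the copy afterwards. A minimal sketch, assuming source and destination are the same size; "mirror_copy" is a made-up helper, it uses a plain 8192-byte block size rather than the "8192b" from the forum post, and the demonstration runs on scratch files instead of real partitions.

```shell
#!/bin/sh
# Copy one partition (or file) onto another with dd, then confirm the
# two are byte-for-byte identical with cmp.
# $1 = source, $2 = destination. THE DESTINATION IS OVERWRITTEN.
mirror_copy() {
    dd if="$1" of="$2" bs=8192 2>/dev/null && cmp -s "$1" "$2"
}

# Safe demonstration on scratch files rather than real partitions.
src=$(mktemp)
dst=$(mktemp)
echo "pretend this is /boot" > "$src"
if mirror_copy "$src" "$dst"; then
    echo "copy verified"
fi
rm -f "$src" "$dst"
```

On real hardware the arguments would be the two boot partitions, with /boot unmounted (or at least idle) while the copy runs.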


IMAP Service Providers
IMAP Service Providers: A Step in Dealing with Viruses, Spam, and Email Overload - A very in-depth listing of what companies provide IMAP mail service.


Monday, June 14, 2004
Xenu's Link Sleuth
This is one of those indispensable tools for any web developer. Xenu's Link Sleuth is a nice, simple, easy-to-use tool that will crawl any URL and report back on the status of every link. When it's finished you have the option to generate a report, or just look at the results grid.

(I used this on my site over the weekend... fixed a few dozen errors. So hopefully, nobody sees any broken internal links. External links, OTOH, I have not checked recently.)


Monitor calibration
The Monitor calibration and Gamma assessment page - While not the easiest to use, I did get the best results using it. (I set my gamma to 1.8.)

Monitor Calibration and Gamma - The explanation is extremely technical (which is good), but the explanation of how to do the adjustments is a bit vague.

Now, on my Toshiba Tecra 9100, the S3 display utility has a gamma setting tool. The values that I ended up with were Gamma 1.00, Brightness 1.07, Contrast 0.65.

(Of course, this means I'll need to adjust my color scheme again. It now looks too gray, when I wanted a hint of blue. See my color space.)


Saturday, June 12, 2004
More CSS 4-panel layout using DIVs (attempt #2)
1) Removed the "Main" DIV tag from the previous attempt. (Go look at BlueRobot.com's 2-panel layout.)

2) Decided that I liked the look of CSS Layout Techniques: for Fun and Profit where the side-bar menu is on the right, which allows text to fill the width of the window once you scroll down past the end. That looks more natural than a left side menu with a fixed left margin. Internet Explorer 6 also seems to like that layout a bit better. (BlueRobot.com's 2-panel right menu layout)

HTML and CSS (see what it looks like, short-body version):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>CSS 4-panel layout example #2 (June 2004)</title>
<style media="screen" type="text/css">
body {
background-color: White;
color: Black;
font-family: verdana, arial, helvetica, sans-serif;
}
#TopNav {
background-color: Fuchsia;
padding: 2px;
}
#SideBar {
background-color: Gray;
float: right;
width: 15%;
}
#BlogBody {
background-color: Orange;
padding: 2px;
}
#Footer {
background-color: Purple;
clear: right;
padding: 2px;
}
</style>
</head>
<body>
<div id="TopNav">foo foo foo foo foo foo</div>
<div id="SideBar">sidebar<br>sidebar<br><br>sidebar<br>sidebar<br>sidebar<br>sidebar</div>
<div id="BlogBody">
blog body blog body blog
</div>
<div id="Footer">footer-copyright</div>
</body>
</html>

Bugs:

  1. [IE5/Windows]: It's possible that this layout will not work properly on Internet Explorer 5 for MS-Windows. I suspect that IE5's quirks won't really matter in this particular layout, but I have yet to test it.


CSS 4-panel layout using DIVs
You'd think it would be simple, right? Oh, young grasshopper, you have much to learn! Actually, it's not all that bad, just tedious to get a layout up and running. Took me about 2 hours of trial-n-error, and looking at some of the sites over at the CSS Zen Garden. Finally, Michael Pick's blog entry about the CSS Zen Garden proved to be the most useful as I teased apart his DIVs and his CSS file. Mike's page was already close to the layout that I wanted and his CSS file was pretty simple.

My goal for this blog's design is a navigation bar across the top of the page, a footer at the bottom of the page, a side-bar containing links to the various archive pages, and a main body that is 80-85% of the page width. This is somewhat similar to the diagram under section 9.6.1 (Fixed positioning) of the W3C.org CSS2 Specification, except that I don't want to use fixed positioning.

First, start with the following HTML file (see what it looks like):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>CSS 4-panel layout example (June 2004)</title>
<style media="screen" type="text/css">
body {
background-color: White;
color: Black;
font-family: verdana, arial, helvetica, sans-serif;
margin: 0px;
padding: 0px;
}
#Main {
background-color: Blue;
}
#TopNav {
background-color: Fuchsia;
margin: 0;
padding: 2px;
}
#SideBar {
background-color: Gray;
clear: none;
float: left;
margin-top: 0px;
padding: 2px;
width: 14%;
}
#BlogBody {
background-color: Orange;
height: 200px; /* must be larger than height of SideBar to fix IE6 glitch */
margin-bottom: 10px;
margin-left: 15%;
margin-top: 10px;
padding: 2px;
}
#Footer {
background-color: Purple;
clear: both;
padding: 2px;
}
</style>
</head>
<body>
<div id="Main">
<div id="TopNav">foo foo foo foo foo foo</div>
<div id="SideBar">sidebar<br>sidebar<br><br>sidebar<br>sidebar<br>sidebar<br>sidebar</div>
<div id="BlogBody">
blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body blog body
</div>
<div id="Footer">footer-copyright</div>
</div>
</body>
</html>

Yes, those are very ugly colors. But it does make it very easy to see where the various panels have ended up on the page.

Bugs:

  1. [IE6] If the content of the "BlogBody" DIV does not contain enough text to make the height of the DIV more than that of the "SideBar" DIV, then the body panel will not display properly. The work-around is to specify a CSS height value that is larger than the height of the "SideBar" DIV. I'm going to play around with the design some more and see if I can get rid of that issue (and get the side-bar to be the full height of the page).


CSS quick links
W3C's Cascading Style Sheets, level 2 CSS2 Specification - This is the source for what CSS is supposed to look like. However, it generally avoids talking about what is supported in the real-world.

EchoEcho.com's CSS Tutorial - Covers the basics of CSS.

Dave Raggett's Introduction to CSS - A short guide to using CSS on your web pages.

TopStyle - a CSS editor.

CSS Zen Garden, Resources - Pointers to various articles about CSS layout techniques.

From Table Hacks to CSS Layout: A Web Designer’s Journey - By Jeffrey Zeldman, it talks about the problems with the box model in IE5/Windows and other issues regarding layout.

The Layout Reservoir - A small collection of CSS layout designs by BlueRobot.com.

CSS Boxes - Problem & Workaround Set for a series of CSS Boxes going from a simple single box, through 3 columns with a full width top box, all with variations.

CSS Layout Techniques: for Fun and Profit - Catalog of cross-browser CSS layout techniques.

Labels:

Auditing tools for Windows
fsum by SlavaSoft - (free) Creates md5 signature files compatible with the md5sum command line tool (found on most unix/linux distros), but has the additional feature of directory recursion. The tool also supports other checksum/hash functions.

DumpSec by SomarSoft - (free) This tool used to be called DumpACL back in the days of Windows NT 4.0. It has been re-released to report on the newer ACL information in an Active Directory Domain (Windows 2000), plus it has the option to dump out lists of users, groups, policies, shares, registry ACLs, and a few more goodies. Output is either an interactive report viewer, a custom save file format, or various report file styles.

rsync - (free) While not strictly an auditing tool, rsync is useful for pushing/pulling log files off of a server onto a better protected server for long-term storage. The primary advantage is that rsync will only send the portions of a file that have changed, reducing transfer traffic. It also supports compression of the transfer and you can route the information through ssh for security. The version I use is cwRSync, which is a streamlined version of the Microsoft Windows port that doesn't require the full Cygwin application to be installed.
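
For a sense of what a tool like fsum produces, here is a rough Python sketch of a recursive MD5 listing in the md5sum file format. It is not fsum's implementation; the function names (md5_of_file, md5sum_lines) are made up for illustration, and the output uses the "hash *relative/path" binary-mode form that md5sum accepts.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5sum_lines(root):
    """Walk root recursively and return md5sum-style lines, sorted by path."""
    entries = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            # Store paths relative to root, with forward slashes.
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            entries.append((rel, md5_of_file(full)))
    # md5sum marks binary mode with '*' in front of the file name.
    return [f"{digest} *{rel}" for rel, digest in sorted(entries)]
```

Writing the returned lines to a text file next to the data gives you a signature file that can be re-checked after a copy or a restore.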

Labels:

Thursday, June 10, 2004
Installing cwRSync on Windows 2000
The instructions over at cwRSync's install page are a bit vague, so I'm going to jot down the steps that I use. These steps are for installing rsync in a server configuration. Since the install process needs to (optionally) create a user account and create a new service, you'll need administrative access to the machine that you are using. (I'm not sure whether members of the Power Users group have enough privileges.)

  1. Download cwRSync, open up the ZIP file, then extract/run cwRsync_x.x.x_Installer.exe.
  2. Answer "Yes" when asked if you want to continue with the install.
  3. Answer "Yes" when asked if you want to install cwRSync as a Windows Service.
  4. Specify the installation folder where you want to install cwRSync. My personal preference is "c:\bin\cwrsync" instead of the default since our servers already have various command line tools installed under c:\bin.
  5. Enter the account name and password of the local user account that you are going to use for the cwRSync service. It's a good idea to use a separate account for the cwRSync service, but you may also specify an existing account name.
  6. The upload area can be set to anything. In fact, you'll probably be removing whatever you set here when you configure your rsyncd.conf file. For now, set it to a sub-folder of the folder where you installed the cwRSync executables.
  7. Click the "Install" button. The installer will then create the install folder, (optionally) create the user account for the cwRSync service, and set restrictive permissions on the install folder so that only the service's user account has rights.
  8. That takes care of the basics. If you want, view the installation details prior to exiting the install program and cleaning up. Read the instructions on the popup dialog.

Next, we need to finish setting up the RSync service in Windows.

  1. Right-click on My Computer, pick "Manage".
  2. In the left panel, scroll down and open up the "Services and Applications" tree, then select "Services".
  3. Locate the "RsyncServer" service and double-click to open up the properties dialog.
  4. "General" tab: Change the "Startup type" setting to "Automatic".
  5. "Log On" tab: Re-type the password for the user account that you're using. Click the "Apply" button to save your changes and Windows will popup a notification that the user account has been granted the rights to logon as a service.
  6. "Recovery" tab: Change these to match your preferences. My personal preference is to restart the service on the first two failures, do nothing on subsequent failures, reset the fail count after 1 day and restart the service after a delay of 30 minutes.
  7. Click "OK" to save and exit.
  8. Don't start the service yet, the rsyncd.conf file needs to be configured first.

You need to configure the rsyncd.conf file and set up your first "module" (a.k.a. a share path). Find your rsyncd.conf file (it's in the folder where you installed cwRSync to) and open it up in a text editor (NotePad works). Now, go read the official rsyncd.conf help page. Read it twice if it's your first time, because it's possible to put a very large gaping security hole into your setup if you're not careful. The default settings at the top of the file are fine, but you may wish to change the "hosts allow = *" to "hosts allow = (your client machine IPs)" as a preventative first step. Then, even if you screw up the other security mechanisms, you've at least limited which IP addresses an attacker can base an attack from. (You can test this by telnet'ing to port 873 and seeing whether the rsync service drops your connection.)

Next, we need to start setting up "modules" in the rsyncd.conf file. "Modules" are basically the same concept as a Windows share, except that you have to use rsync to access the files within the "module". Ignore what it says on the cwRSync install page about rsync modules having to be sub-directories under the cwrsync folder. If you grant correct directory permissions to the cwRSync service account, then the service daemon will be able to read or read/write to the target folders without problems.

The default module installed is called "test". Go ahead and comment it out with '#' symbols and save the file. From my (limited) testing, it does not appear to be necessary to restart the rsync service in order for it to see changes in the rsyncd.conf file.

[test]
path = /cygdrive/c/cwrsync/data
read only = false
transfer logging = yes
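
If you'd rather script the edit than do it by hand, commenting out a module could look roughly like the Python sketch below. The function name comment_out_module is made up for illustration, and it assumes a simple file where each "[name]" header line starts a section that runs until the next header.

```python
def comment_out_module(conf_text, module):
    """Prefix every non-blank line of one [module] section with '#'."""
    out = []
    inside = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header starts; check whether it is the target.
            inside = stripped[1:-1].strip() == module
        if inside and stripped and not stripped.startswith("#"):
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Calling comment_out_module(text, "test") on the contents of rsyncd.conf leaves the global settings and any other modules untouched.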


There are two basic ways to use rsync and this will affect how you grant permissions to the rsync service account.

The first is a read-only ("pull") setup, where the clients can only pull files from the rsync server. The rsync service account should only have Read & Execute / List Folder Contents / Read permissions for the folder tree that you are going to publish. In addition, when you setup your module in the configuration file, you should specify "read only = true" as a setting.

The second is a "push" setup where clients are writing changes to the rsync server. The rsync service account will require "modify" permissions for the shared directory tree. Under your module configuration section in the rsyncd.conf file, a "push" setup must have "read only = false".

Now, for every directory tree on the rsync server that you wish to share, create a new module section (e.g. "[logs]" or "[web]" or "[joes_backup]"). Verify that the cwRSync service account has proper permissions to the file system tree. Then add the following options (at a minimum) below the module section name:

[joes_backup]
path = /cygdrive/e/backup/joe
read only = false


That allows any client who manages to authenticate with the rsync service to write to E:\Backup\Joe on the rsync server. That is not exactly secure and you should take additional steps to lock it down through the use of "hosts allow", "auth users", "secrets file" and perhaps ssh. Securing your box is a bit beyond the scope of this post. It's also a bit beyond my experience level since I'm just getting started with rsync.
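
The easy thing to get wrong in a module section is the path, since cwRSync wants the Cygwin-style /cygdrive/<drive letter>/... form rather than a Windows path. Here is a small Python sketch that converts a drive-letter path and prints a module section. The helper names (to_cygdrive, render_module) and any user name or secrets file location you pass in are purely illustrative; only the option names ("path", "read only", "auth users", "secrets file") come from rsyncd.conf itself.

```python
import ntpath


def to_cygdrive(windows_path):
    """Convert a drive-letter path such as E:/Backup/Joe to /cygdrive/e/Backup/Joe."""
    drive, rest = ntpath.splitdrive(windows_path)
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a drive-letter path, got {windows_path!r}")
    # Lower-case the drive letter; keep the case of the rest as given.
    return "/cygdrive/" + drive[0].lower() + rest.replace("\\", "/")


def render_module(name, windows_path, read_only, auth_users=None, secrets_file=None):
    """Return the text of one rsyncd.conf module section."""
    lines = [
        f"[{name}]",
        f"path = {to_cygdrive(windows_path)}",
        f"read only = {'true' if read_only else 'false'}",
    ]
    if auth_users:
        lines.append("auth users = " + ", ".join(auth_users))
    if secrets_file:
        lines.append(f"secrets file = {to_cygdrive(secrets_file)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_module("joes_backup", r"e:\backup\joe", read_only=False))
```

Run as-is, it prints the same three-line [joes_backup] section shown above. A "pull" module would pass read_only=True instead.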

(Update: See Securing cwRSync.)

Labels:

Wednesday, May 19, 2004
Tyan Trinity KT400 S2495 Performance
Now that the new motherboard is finally bedded in, it's rather enjoyable compared to the old motherboard. On the old box, the fastest that the Promise FastTrak100 TX2 RAID1 array would ever transfer data was around 5-6 MB/s, if it was feeling perky. Sometimes it would degrade down to only 2-3 MB/s. (However, I blame a lot of that on the PCI Latency issue.)

Same drives, same RAID card on the new motherboard easily handles data rates upwards of 20 MB/s, copying from point to point on the drive usually averages 10-15 MB/s. And I've seen peak rates of 30 MB/s. As a comparison, my video cap box with a 5400rpm ATA/100 drive and a 7200rpm SATA/150 drive can hit 32-36 MB/s when copying video files from one drive to the other.

So, even with all of the nuisance of getting everything installed properly, it seems to be performing up to expectations and is turning out to have been worth it.

Labels:

Saturday, May 15, 2004
Tyan Trinity KT400 S2495 (part 2)
More fun with the Tyan Trinity KT400 S2495 board. While attempting to add the Promise FastTrak TX2 PCI RAID card, everything is happy until I go and connect drives to it and define an array. After that, if the Adaptec 2930CU PCI SCSI card is also installed, the system will not boot. I have the HighPoint HPT372N IDE RAID ports disabled in the BIOS and the Silicon Image Sil3112 SATA ports enabled.

Symptom of the boot issue is that the Sil3112 BIOS splash will not appear during the boot process. System will then hang before or at the ESCD/DMI update point (right before it boots from a device).

1. (neither TX2 nor 2930CU) = boots
2. TX2 only = boots
3. 2930CU only = boots
4. 2930CU + TX2 = won't boot

Once I remove either the TX2 or the 2930CU card, things work fine. I've stripped the 2930CU card, hooked the primary drives up to the TX2 RAID, left the scratch drive hooked to the Sil3112 SATA ports, and swapped the SCSI CD-ROM / tape drive / zip drive for an IDE CD-RW and an IDE DVD-ROM/CD-RW drive.

During the install of Windows 2000, I hit F6 during the boot and install both the TX2 and the Sil3112 drivers. This will avoid the issue where the boot order of the drives changes later when I add the Sil3112 driver post-install.

(I've lost count... this is something like my 6th attempt at getting Windows 2000 up and running on this motherboard.)

Update: Everything looks fine so far, my first test copy of 2GB worth of data checked out okay with the MD5 tool (copying from the network to the TX2 RAID array as well as from the network to the Sil3112 SATA scratch drive). Got everything patched and I'm now copying live data files onto the array.

Tyan Trinity KT400 (S2495)
3x512MB PC2100 RAM
AthlonXP 1800+ CPU
Promise FastTrak TX2 PCI IDE card
2x250GB 7200rpm drives, 8MB cache (o/s)
Silicon Image Sil3112 SATA ports (built-in)
200GB SATA 7200rpm drive (scratch)
IDE CD-RW
IDE DVD-ROM/CD-RW

Labels:

More trouble with the Tyan
So... this is definitely taxing my patience for installing hardware.

The latest problem is that if I copy a file from the network to the HighPoint RAID 1 array... it gets corrupted. (Using an MD5 tool to verify content.) However, if I verify the file up on the network server, it's correct. And when I copy it to the SATA scratch drive, it copies cleanly.

So I'm at a bit of a loss at the moment (and running MemTest86 while I ponder).

Off-hand, plan B is to make sure I have the latest and greatest BIOS installed (if I can get the BIOS to install, unlike my last attempt). I'm sure I have the latest drivers, but I'll double-check that again in the morning.

Plan C is to ditch the HighPoint RAID and try the Promise IDE RAID card again.

Plan D would be to buy a 3Ware 2-port PATA RAID card.

Update: Well, I went with plan E. In a few places on the web I read 2 things.

1) HighPoint BIOS, when included on the motherboard, is generally not user-updatable. Instead, it's part of the mainboard's BIOS and thus updated when you update the motherboard's BIOS.

2) The driver version that you use should match that of the BIOS. I have a HPT372N with 2.345 of the BIOS. However, I was attempting to use 2.351 of the Windows 2000 device driver. Updating the driver to 2.345 (yes, Windows will actually say the older version is a better match) seems to have fixed the issue.

So right now, it looks like the data corruption bug is fixed. (It only affected files copied to the drive from another server, not the service packs that I installed from CD or from the web site.) Needless to say, I'll be doing some more testing with wxChecksum (MD5 utility) to verify that stuff is copying down correctly.
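
The kind of check I mean can be sketched in a few lines of Python. This is not how wxChecksum works internally, just an illustration of the idea: hash every file under the source folder, hash the matching file under the copy, and list anything that is missing or differs. The function names (file_md5, find_corrupt_copies) are made up for this example.

```python
import hashlib
import os


def file_md5(path):
    """Return the hex MD5 digest of a file, read in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt_copies(source_root, copy_root):
    """Return relative paths whose copy is missing or has a different MD5."""
    bad = []
    for folder, _subdirs, files in os.walk(source_root):
        for name in files:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(copy_root, rel)
            if not os.path.isfile(dst) or file_md5(src) != file_md5(dst):
                bad.append(rel)
    return sorted(bad)
```

An empty list means every file under the source folder has a byte-identical copy. It says nothing about extra files that exist only on the copy side.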

Update #2: The system is still corrupting files as they are written to the HighPoint RAID array, especially when the system is under load and copying files to both the HighPoint and the SATA drives at the same time. Copying from the network to the SATA drive works properly, but copying from the network or the SATA to the IDE RAID causes data corruption.

I'm now going to remove the HPT from the BIOS, and put the drives back on the Promise FastTrak100 TX2 IDE RAID card.

Labels:

Thursday, May 13, 2004
Tyan KT400 Windows 2000 Boot Failure
So after installing the 2nd round of patches (first round of patches was installing SP4 using WindowsUpdate), the system fails to start:

Windows 2000 could not start because the following file is missing or corrupt:
(something)\System32\Ntoskrnl.exe
Please re-install a copy of the above file.

Possibly, this is error 319011 from Microsoft, which indicates a corrupted BOOT.INI file. Could be, since I just installed the driver for another disk in the system (hooked up a scratch drive to the SATA interface) in the previous WindowsUpdate. This may have knocked my IDs around so that the BOOT.INI file is no longer correct.

Steps that they say to use, but which did NOT work for me:
1. Boot the Windows 2000 install CD
2. [F6] to load the device drivers for the boot array (in my case, HighPoint HPT372N IDE RAID).
3. Get to the point where you can pick [R]epair.
4. Choose [C]onsole, which should dump you at a command prompt (after you enter the local Administrator password).
5. Rename the existing BOOT.INI file in the root of C:, then copy a good BOOT.INI file off of a floppy.
6. Verify that the correct NTBootDD.SYS file exists (troubleshooting), and is in the correct place on the boot drive. (Only if you have SCSI drives that you're attempting to boot from, the HighPoint RAID doesn't seem to use the NTBOOTDD.SYS file.)

Here's what my broken BOOT.INI file looks like (see 102873: BOOT.INI and ARC Path Naming Conventions and Usage):

[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINNT="Microsoft Windows 2000 Server" /fastdetect

Here's what my recovery BOOT.INI file looks like (note the change on the default= line, and the addition of a second multi(x) line under [operating systems], also note the long timeout value):

[boot loader]
timeout=120
default=multi(1)disk(0)rdisk(0)partition(1)\WINNT
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINNT="Microsoft Windows 2000 Server" /fastdetect
multi(1)disk(0)rdisk(0)partition(1)\WINNT="Microsoft Windows 2000 Server (HPT372N)" /fastdetect
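
When juggling several candidate boot lines, it helps to see exactly which field differs between two ARC paths. Below is a rough Python sketch that splits the multi()/scsi() form used in the files above into its numbered fields and reports what changed. The names (parse_arc_path, changed_fields) are made up for illustration, and the sketch deliberately ignores other ARC forms such as signature().

```python
import re

ARC_PATTERN = re.compile(
    r"^(?P<adapter_type>multi|scsi)\((?P<adapter>\d+)\)"
    r"disk\((?P<disk>\d+)\)rdisk\((?P<rdisk>\d+)\)"
    r"partition\((?P<partition>\d+)\)(?P<directory>\\[^=\s]*)?",
    re.IGNORECASE,
)
FIELD_ORDER = ("adapter_type", "adapter", "disk", "rdisk", "partition", "directory")


def parse_arc_path(text):
    """Split one BOOT.INI line (or its 'default=' value) into ARC fields."""
    text = text.strip()
    if text.lower().startswith("default="):
        text = text[len("default="):]
    match = ARC_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a multi()/scsi() ARC path: {text!r}")
    fields = match.groupdict()
    for key in ("adapter", "disk", "rdisk", "partition"):
        fields[key] = int(fields[key])
    fields["adapter_type"] = fields["adapter_type"].lower()
    fields["directory"] = fields["directory"] or ""
    return fields


def changed_fields(first, second):
    """Return the names of the ARC fields that differ between two parsed paths."""
    return [key for key in FIELD_ORDER if first[key] != second[key]]
```

Fed the default= lines from the broken and recovery files above, changed_fields reports only "adapter", i.e. the number inside multi(x).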

Unfortunately, no combination of scsi(x) or multi(x) that I tried at the start of the line, nor putting the driver file on the boot diskette (renamed as NTBootDD.SYS), would get me past the wonderful "Could not read from the selected boot disk" error. At one point, I had a boot diskette with (8) different combinations of boot lines that I had tried.

Hint: The boot diskette is great for testing out BOOT.INI changes, it boots up quickly compared to waiting for the system to boot.

I'm going to try plan B, which is to reinstall... but this time, when I hit F6 during Setup, I'm going to install both the HighPoint and the Silicon Image drivers. That way, setup will see all of the disks in the system during the initial install and will hopefully write out a correct BOOT.INI file.

Labels:

Tyan Trinity KT400
So I'm finally ditching the very troublesome Asus A7V266-E motherboard in my one file server. I went with the Tyan Trinity KT400 because it was relatively inexpensive and I was able to simply move my AthlonXP 1800+ CPU and my 512MB PC2100 memory modules over to the new motherboard.

The old A7V266-E is built on a VIA chipset that was notorious for problems with the PCI bus (search around for KT266 and PCI latency, or check the PCI Latency Patch page). (And that's on top of the issue that the A7V266-E Promise FastTrak100 Lite only supports 127GB and smaller drives.) For me, it manifested itself as an incompatibility with my add-in Adaptec USB card. Anytime I had activity on the USB bus, the entire machine would halt for a few seconds at a time. Extremely annoying, and the only way that I got around it was to not install the Adaptec 3100 USB PCI card.

So, getting a new motherboard should be easy right? Ha ha ha ha ha ha! (picks self back up off of the floor)

Well, before I start into the problems encountered... first I want to point you at what Tyan does correctly. They make a very pretty manual. When showing you the diagrams of various jumpers / pin-outs on the motherboard, they use very large text to draw your eye to the proper portion of the diagram. It's rather well done, and makes it easy to flip through the book looking for, say, the FAN3 pin-out location. Secondly, their motherboard includes the little 2-digit hexadecimal LED that shows you where the boot process is. Usually, you have to buy an inexpensive add-in card (and I'm not even sure you can get them for PCI?).

Unfortunately, one of the things that they do poorly is the presentation of device drivers for their products. Most motherboard manufacturers have a dedicated search engine, or a separate page for each product. And each page points to a local copy of all of the BIOS files, device drivers and manuals. Tyan, OTOH, has a single page for BIOS, a single page for all their motherboards, a single page for RAID adapters (for all motherboards). Worse, they rely on external websites to supply some of the drivers (e.g. their link to the VIA 4-in-1 driver set is useless since VIA has rearranged their site). That's a real problem if you're not a grizzled veteran of DIY PC building.

So let's see... first issue is that the battery that came with the motherboard appears to be dead. Removing power from the motherboard causes the BIOS clock to reset to the default of Jan 1 2003. Replacing that was easy, I just stole the CR2032 battery from the old motherboard.

Next up, I plugged the (2) 250GB 7200rpm WD drives into the motherboard IDE RAID (HighPoint HPT372N), booted up on the SCSI CD-ROM and attempted to install Win2000 server. Nada... tried 3x, re-formatting the disk each time in case of read errors, but even after loading the driver, Win2000 setup could not see the HPT RAID array that I had configured.

Okay, plan B, put a Promise FastTrak100 TX2 card in, hook the drives to that... spend another few hours setting up the drive array. Now, the system refused to boot. It gets to the ESCD/DMI portion (where it's setting up the PCI slots, figuring out what's where), but will not boot from the floppy or the CD-ROM. Pull the Promise card back out, system boots up on the floppy or CD-ROM without a problem... put card back in, nada. Updated the Tyan Trinity KT400 motherboard BIOS from 1.02 to 1.05, with no change in the situation. The motherboard will not boot with the Promise card installed, regardless of whether the onboard IDE RAID is enabled or disabled.

My next plan is now to try the HighPoint RAID again, possibly updating the HighPoint RAID BIOS (once I wait a few hours for the HighPoint array to finish duplicating itself again, my estimate is it takes 4-5 hours to create the 250GB array, roughly 45-60GB/hr). Note, when trying to figure out which of the BIOS files to load into memory (e.g. my disk has BIOS\3xxv235.p4e, BIOS\3xxv235.p5e, and BIOS\3xxv235.p6e), refer to the README.TXT file on the root of the floppy. Under section 2, there is a file listing that will tell you which BIOS file goes with which controller chip. (e.g. I have a HPT372N, so I use the .P5E file)

Next error (loading the HPT BIOS). I run the LOAD.EXE file, enter the BIOS file name (3XXV2351.P5E), it then errors out with:

A:\BIOS> LOAD
Please input BIOS image file name: 3XXV2351.P5E
Found adapter at bus 0, device 14
No loadable EPROM found
Try '-i' option

A:\BIOS> LOAD /I 3XXV2351.P5E
Found adapter at bus 0, device 14
No loadable EPROM found
Found adapter at bus 0, device 14
No loadable EPROM found

Hmm... oh, wait... looks like the P4E file is also for the 372N chip.

A:\BIOS> LOAD 3XXV2351.P4E
No supporting host adapter is found

Okay... (drums fingers on desk), eh, forget it for now. I have v2.345 already, which is reasonably up-to-date. And this time, the Win2k install seems to have found the partition correctly (only basic difference between now and when it didn't work last night is the motherboard BIOS revision update from 1.02 to 1.05).

Created my 16GB C: partition, and I'm off and installing. Later, I get to test my recovery strategy (going to try to restore the system state through a non-authoritative restore).

Labels: