Archive for the ‘Gentoo’ Category.

Speex causing Asterisk headaches

Many people already know that I dabble with Asterisk on a daily basis. Our Linux distribution of choice here at work also happens to be Gentoo.

Now, when updating Gentoo’s ‘world’ package base, you do get some problems occasionally. This is a downside to being ‘on the cutting edge’, and it’s no wonder that distributions such as Ubuntu, Red Hat and SuSE stick to well-tested release schedules.

Recently, after a well-overdue profile update (from 2006.0 to 2008.0) and the subsequent emerge -av --newuse --deep world command, Asterisk simply stopped working. There was no warning, and it took a while for me to notice.

Once I had noticed, it became apparent that something was really quite awry. Asterisk wouldn’t start via the init script (which has a seemingly immortal, and hideously annoying, process), nor by just calling the executable. I eventually realised, with the help of this bug report and the /var/log/asterisk/full logfile, that Asterisk was failing to find the speex modules it required.

Long story short, as per the bug report, you need to downgrade (and mask for good measure) speex to 1.1.12 to retain functionality on anything older than a January/February release of Asterisk 1.4.x. :(
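For anyone wanting to do the same, here’s a rough sketch of the mask-and-downgrade step. It assumes the package atom is media-libs/speex (check with emerge --search speex) and that /etc/portage/package.mask is a plain file rather than a directory on your system. The function names and the MASK_FILE variable are my own inventions for illustration.

```shell
# Mask every speex version newer than 1.1.12 so a later world update
# doesn't pull the broken one back in.
# MASK_FILE is a hypothetical override; the default is the usual Portage path.
mask_newer_speex() {
    printf '%s\n' ">media-libs/speex-1.1.12" >> "${MASK_FILE:-/etc/portage/package.mask}"
}

# Install exactly the known-good version (-a asks first, -v is verbose).
# Needs root on a Gentoo box, so it's only defined here, not called.
downgrade_speex() {
    emerge -av "=media-libs/speex-1.1.12"
}
```

Run mask_newer_speex followed by downgrade_speex as root, then restart Asterisk and check /var/log/asterisk/full again.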

Because Portage still only has Asterisk 1.2.x, this issue is going to affect you unless you switch to using the voip overlay.

Oh Gentoo, how I love and hate you!

m0n0wall under VMWare Server

Recently, the company I work for has been looking for a way to simulate poor or slow connections to the server component of their next software product. I’m sure you’d all agree that it’s important to know how well your server compensates for those with less than ideal Internet connectivity, so we decided to try out the traffic shaping facilities provided by m0n0wall.

Now, of course we have Cisco routers and the like, but setting them up for something like this could take ages. With m0n0wall, it’s all web interface, and therefore is far more intuitive. So far I’ve not got around to setting up the actual connectivity, but I thought I would make a post covering the nuances I experienced whilst installing it in a virtual environment.

Now, I find working with VMs fantastic in many respects, though it can sometimes be infuriating not to have physical hardware to work with when something isn’t right. Thankfully, I’ve recently learnt that diagnosing network faults and complexities surrounding VMs and their bridges is as simple as checking out the contents of /proc/vmnet/ (under Linux 2.6.x; don’t ask me about Windows!) and digesting the information contained therein.

At first, I thought my host’s bridge configuration was wrong. After many different tests, and lots of cat’ing files within /proc/vmnet, I realised that I wasn’t doing anything wrong (well, maybe not directly; see below) with my host configuration. The ‘default’ NIC chipset used by VMWare Server is covered by the AMD PCNET32 driver. This works great under Gentoo and Windows; however, it wouldn’t bring up the correct interfaces within m0n0wall. Cue my helpful colleague, who suggested adding an EthernetN.virtualDev = "e1000" line to the .vmx file, one for each NIC you have configured. In my case, this was simply a matter of replicating the line for both ‘Ethernet0’ and ‘Ethernet1’ (but you may very well require a third NIC).
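If you’d rather not hand-edit the file, something like the sketch below appends the line for each NIC. The set_e1000 function name is made up for this example, and I’ve written the key in lower case (ethernet0) as that is how I’d expect VMWare to write it; treat both as assumptions and check against your own .vmx. Power the VM off first.

```shell
# Append an e1000 virtualDev line to a .vmx file for each NIC index given.
# Usage: set_e1000 <vmx-file> <nic-index>...
set_e1000() {
    vmx="$1"
    shift
    for n in "$@"; do
        printf 'ethernet%s.virtualDev = "e1000"\n' "$n" >> "$vmx"
    done
}
```

For a two-NIC VM that would be set_e1000 /path/to/your.vmx 0 1, where the path is a placeholder for wherever your VM lives.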

Once m0n0wall was booted with the e1000 (A.K.A. Intel Gigabit) NICs, each one was described as ‘up’ immediately. Unfortunately, it was apparent that something else was also going wrong with the way I had chosen to set up the VM itself.

What really threw me was the inability of m0n0wall to intelligently use the hardware I provided it with. I configured a fairly generous machine (given the slim system requirements for m0n0wall), which included 96MB RAM and a 512MB IDE virtual drive. I mounted the CD ISO, and added a floppy disk drive with a blank disk.

However, even with a blank virtual floppy inserted, there was no provision for formatting it, and no complaint when m0n0wall attempted to save its configuration and couldn’t. On top of that, why was it preferring the FDD over the HDD? Granted, neither was formatted, but I didn’t think it would be too hard to provide some warning or message regarding your choice of non-volatile storage.

So, I needed a formatted virtual floppy image. I couldn’t help thinking how annoying the following options would be:

    Power off a (live) Windows guest to add an FDD
    Re-compile a Gentoo VM for FDD support, and add the FDD whilst it’s off
    Install VMWare on my workstation so I could install XP/Gentoo, just to format an FDD

How about no.

In the end I opted for using a Gentoo (minimal) live CD on the VM I had already prepared for m0n0wall. Using links (yep, the bridging worked just fine) to obtain the latest CF card .img file, I issued the command:

gunzip -c generic-pc-x.xxx.img | dd of=/dev/hda bs=16k

Which unpacks the image and copies it to the block device (no need to format, or partition.)

After that, I abandoned the idea of the CDROM/FDD combo, removed their devices from the VM configuration and started it up again. Of course this time it booted straight from the image on the HDD, and coupled with the ‘new’ NICs, everything began working as it should. I had a web interface in no time at all. :)

My words of wisdom? Forget using the CD ROM ISO image! Another little tip I could offer, would be to edit the scsi0.present = "TRUE" attribute in the .vmx file to read "FALSE". You don’t need it, but for some reason you can’t disable it from the VMWare console. It helped to speed up m0n0wall’s boot time, due to removing the need to ‘wait at least 15 seconds for SCSI devices to settle’. ;)
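A small sketch of that SCSI tweak, for those who prefer sed to a text editor. The disable_scsi name is just something I’ve made up here, and it assumes the line is written exactly as scsi0.present = "TRUE" (give or take spaces around the equals sign). A .bak copy of the original file is left alongside, and as before the VM should be powered off.

```shell
# Flip scsi0.present from "TRUE" to "FALSE" in the given .vmx file,
# keeping the untouched original as <file>.bak.
disable_scsi() {
    sed -i.bak 's/^scsi0\.present *= *"TRUE"/scsi0.present = "FALSE"/' "$1"
}
```

Call it as disable_scsi /path/to/your.vmx (a placeholder path), then boot the VM and see whether the SCSI wait has gone.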

Overall, I’m quite liking it so far. I may write again soon with some reflections. In the meantime, where did I put that WRAP box? ;)

The woes of incompatible RAID hardware

So in the last few weeks, we’ve been wondering why all these sets of new 500GB hard disks have been degrading in their RAID-5 array, merely hours after being created. Which was also around the time that I’d finished setting up Gentoo, frustratingly.

Eventually it was decided that instead of making frequent trips out to the co-location facility, we’d bring the server in to HQ for some diagnosis. After three completely different sets of brand-new 500GB disks, purchased from two separate manufacturers, had all exhibited the same behaviour, it was almost undoubtedly a sign that something else was causing the RAID arrays to degrade.

The RAID card used in this particular server is an 8-port 3Ware 9500S. It’s been reliable in the past, and never exhibited a single issue up until the point we began replacing disks with larger, newer models. I even took it upon myself to strip all SATA cables from the machine and replace them with un-used items. Of course, this was a long shot and it made no difference either way (but at least I’d ruled it out).

Now I don’t know who forgot to check, but this particular model of RAID card does not specifically support SATA-II disks. Of course the first idea that springs to mind is backwards-compatibility; if a SATA-II disk is plugged into a SATA-I port, one would expect it to automatically run at the slower rate (much like PATA of old). As it turns out, though, that’s just not something you can assume.

So after much head-scratching, Googling and more Googling, I thought it would be worth adding jumpers to the rear of the drives in order to forcibly limit each drive to 1.5Gbit/sec. I’d like to point out that nothing I found on-line, written by either of the two disk manufacturers (Seagate or WD in this instance), mentioned that the jumper limits any other feature of SATA-II; it’s just a speed lock. Kudos to whoever it was that wrote the 3Ware 9500S Wikipedia article (which has since been deleted), as it proved to be a rather good muse.

For the fourth time I recreated the arrays and began installing Gentoo. After two days of installing and configuring Gentoo, followed by roughly 18 hours of I/O stressing by bonnie++, the RAID card hasn’t skipped a beat. I may be tempting fate by writing about this so soon, but I’m fairly happy to say that I’ve cracked the problem of why our server’s root file system was degrading before it was even fully initialised.

So, how do you use SATA-II disks with a SATA-I 3Ware RAID controller? Forcibly limit the speed to 1.5Gbit/sec, and it all works like it should. As I didn’t find a specific answer to this question on-line, I’m hoping that this may be of some use to others out there who may be struggling with upgrading their 9500S arrays. By all means, please let me know if it has! :)

Update, 22-07-07: The array is still working a charm, so in the words of Borat – “Great Success!”. Watch it fail now! :P

You’ll never guess what..

Yep, a second lot of Seagate drives has failed in the server that I mentioned a few days back.

It’s got me wondering if the drives aren’t the problem here. It’s two different series of drives, albeit from the same manufacturer, though the second lot were the more expensive line.

But either way, we’re going to UKS to check out the machine and make sure it was built properly. We’ll swap wires around, generally just have a nose and a check, and then install the third (and hopefully final) set of drives, which are almost certain to be Western Digitals. We’ve already got some of their RE-series drives working in other machines, and they’ve been fine. It’s a good job this server is only used for DRBD back-ups of VMs!

So another trip to UKS. If these drives fail, we’ll be worrying I think. :(

A fortnight of ups and downs (literally)

It’s been two weeks (roughly) since I started my placement, and there has been quite a bit to learn.

My first week gave me time to get to grips with my new environment, and also gave me a steep introduction to the systems that I’m going to be responsible for in the foreseeable future. That is, a large number of co-located Gentoo servers (some running VMs, which in turn run the services, and some caring for VM back-ups via DRBD) and an Asterisk PBX, which itself is actually hosted on a Gentoo server.

Due to.. Something… Wonderful, there isn’t a network/server map in sight. So all of this was pretty confusing to start with, and just trying to envision the layout of the entire system from one person’s explanation was very difficult indeed. Thankfully, I’m now pretty well-versed, but there are still some aspects I’m yet to grasp. Though it shouldn’t take long now..

What is particularly interesting is the variety of the work. In the early days I was doing rudimentary work: changing passwords on the 3Ware RAID card web interfaces to something new. This was fine until I realised that the passwords had an 8-character limit, and thus the passwords I had picked were far too long. This wasn’t noticed until after I’d been through almost all of the RAID cards… Even the MD was laughing from his office when that hit the e-mails.

On the other hand, last Thursday Jon and I had to visit UK Solutions to replace some hard drives in a server, and remove the aging Red Hat install on that machine, along with another, in favour of shiny new Gentoo installs. Unfortunately the phrase ‘Shiny Gentoo install’ is somewhat of an oxymoron; it’s anything but shiny, glossy, or any other descriptive word for ‘pretty’. We’re talking about a purely CLI installation. Any harder and it’d be Linux From Scratch.

But if it wasn’t hard, it’d be boring! And it’s definitely a nice change to get out of the office and into the .. frying pan. Anyone who’s had to sit between the arse-end of two server racks will understand my metaphor. Cool, it most definitely is not. But I had a job to do, and that was to install Gentoo onto the new RAID-5 array of recently-purchased 500GB Seagate drives.

The basic installation of Gentoo is by far the most confusing part. There’s a reason the install manual has probably been re-written 5001 times in order to make it simpler for first-time users, and anyone who remembers their first attempt at a Gentoo install will be nodding violently right about now. Thankfully once that’s done (and we head back to HQ in search of a more comfortable, ssh-supported server configuration environment) things do get easier. You get used to simply typing ‘emerge application-name’, editing the necessary config file (or stealing it with scp from a working server ;) ) and using the necessary init.d script to restart the daemon.

Sounds groovy, but let me take you back to the RAID-5 configuration that I was working on in those two fun-packed days. RAID-5, for the uninitiated, provides a fall-back in server environments in the event that one of your drives fails. For instance, in a three-drive RAID-5 array of 500GB disks you will have a striped array of 1000GB in total, with one third of each drive taken up by parity data. The downside is that you don’t get the full 1500GB to use. The upside is that if any one of the drives in the array fails, you can connect a new drive, and the data on the lost drive can be ‘re-constructed’ from the data and parity held on the remaining two drives. Our nice 3Ware RAID cards also have support for ‘hot spare’ drives, so if one drive fails, the controller can bring in the fourth drive and re-construct the failed drive’s data without anyone having to physically visit the machine.
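The capacity sum is simple enough to sketch in shell arithmetic: one drive’s worth of space goes to parity, so you get (number of drives minus one) times the drive size. The raid5_usable_gb function name is just something I’ve made up for illustration, and it assumes all drives are the same size.

```shell
# Usable capacity of a RAID-5 array of equal-sized drives.
# $1 = number of drives in the array, $2 = size of each drive in GB.
raid5_usable_gb() {
    echo $(( ($1 - 1) * $2 ))
}

# The three-drive, 500GB example from above.
raid5_usable_gb 3 500
```

A hot spare doesn’t count towards the total, as it sits idle until a drive fails.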

There are two problems with RAID-5:

  1. It’s slow. Having to write parity data with each write is hugely taxing on write speed. Reads are generally similar to that of RAID-0 though.
  2. If two drives fail at once, you’re screwed. There’s not enough parity data contained on the remaining drive to reconstruct two failed drives. RAID-6 was implemented to cover this, but isn’t widely used. There are two lots of parity to write, and THEN you need more disks (4) for only the same volume size.
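To show why one failure is survivable and two aren’t, here’s a toy sketch of the parity idea using XOR on single bytes. Real controllers do this across whole stripes and rotate the parity between drives; the byte values and variable names below are made up purely for illustration.

```shell
# One stripe on a three-drive array: two data blocks plus their parity.
d1=202
d2=77
parity=$(( d1 ^ d2 ))        # what gets written to the third drive

# The drive holding d2 dies: XOR the survivors to get it back.
rebuilt=$(( parity ^ d1 ))

echo "parity=$parity rebuilt=$rebuilt original=$d2"
```

Lose both d1 and d2, though, and parity on its own can’t tell you what either of them was, which is problem number 2 in a nutshell.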

Anyway, what happens just as I’ve finished configuring my first Gentoo server? Yep, one of the drives drops out of the array! The RAID card attempted to rebuild the array onto the hot spare, but lo and behold … THAT FAILED TOO! :(

TYPICAL.. So those drives shall be going back under RMA, and we’re spending a bit more on some Seagate enterprise-class drives (which we should’ve had in the first place really, but then even desktop drives shouldn’t have failed that prematurely!) which shall then require another trip to UKSolutions in order to install them! Oh, and ANOTHER Gentoo install. Plus a few more. And we have to re-wire the internal LAN switch, which is going to be a massive job due to the lovely job of cable tidying that UKS has done.

I think I’m having fun! :) </geek>

More another time, I think. :)