Archive for June 2007

< CME 4.1

Well, I think I might have worked out why we’re having transfer problems from external numbers, to a CME SIP phone via a normal CME SCCP phone on full-consult…

Only CME 4.1 supports the command set ‘voice register dialplan‘ – CME 4.0(2) doesn’t have such a command.

And there’s only one IOS that comes with CME 4.1, available for a 2811. And it’s an XJ3 release.. Is that more unstable than a T train? Probably, but I think I’m going to have to test it anyway!

Thankfully our second 2811 isn’t being used for anything other than a few SCCP phones at the minute. :)

The woes of incompatible RAID hardware

So in the last few weeks, we’ve been wondering why all these sets of new 500GB hard disks have been degrading in their RAID-5 array, merely hours after being created. Which was also around the time that I’d finished setting up Gentoo, frustratingly.

Eventually it was decided that instead of making frequent trips out to the co-location facility, we’d bring the server in to HQ for some diagnosis. After three completely different sets of brand-new 500GB disks, purchased from two separate manufacturers, had all exhibited the same behaviour, it was almost undoubtedly a sign that something else was causing the RAID arrays to degrade.

The RAID card used in this particular server is an 8-port 3Ware 9500S. It’s been reliable in the past, and never exhibited a single issue up until the point we began replacing disks with larger, newer models. I even took it upon myself to strip all SATA cables from the machine and replace them with un-used items. Of course, this was a long shot and it made no difference either way (but at least I’d ruled it out).

Now I don’t know who forgot to check – but this particular model of RAID card does not specifically support SATA-II disks. Of course the first idea that springs to mind is backwards-compatibility; if a SATA-II disk is plugged into a SATA-I port, one would expect it to automatically run at the slower rate (much like PATA of old). Though as it turns out, that’s just not something you can assume.

So after much head-scratching, Googling and more Googling, I thought it would be worth adding jumpers to the rear of the drives in order to forcibly limit the drives to 1.5Gbit/sec. I’d like to point out that nothing I found on-line, written by either of the two disk manufacturers (Seagate or WD in this instance), mentioned that the jumper limits any other feature of SATA-II – it’s just a speed lock. Kudos to whoever it was that wrote the 3Ware 9500S Wikipedia article (which has since been deleted), as it proved to be a rather good muse.

For the fourth time I recreated the arrays and began installing Gentoo. After two days of installing and configuring Gentoo, followed by roughly 18 hours of I/O stressing by bonnie++, the RAID card hasn’t skipped a beat. I may be tempting fate by writing about this so soon, but I’m fairly happy to say that I’ve cracked the problem of why our server’s root file system was degrading before it was even fully initialised.

So, how do you use SATA-II disks with a SATA-I 3Ware RAID controller? Forcibly limit the speed to 1.5Gbit/sec, and it all works like it should. As I didn’t find a specific answer to this question on-line, I’m hoping that this may be of some use to others out there who may be struggling with upgrading their 9500S arrays. By all means, please let me know if it has! :)

Update, 22-07-07: The array is still working a charm, so in the words of Borat – “Great Success!”. Watch it fail now! :P

You’ll never guess what..

Yep, a second lot of Seagate drives has failed in the server that I mentioned a few days back.

It’s got me wondering if the drives aren’t the problem here. It’s two different series of drives, albeit from the same manufacturer, though the second lot were the more expensive line.

But either way we’re going to UKS to check out the machine, and make sure it was built properly. Swap wires around, generally just have a nose and a check, and then install the third (and hopefully final) set of drives. Which are almost certain to be Western Digitals. We’ve already got some of their RE-series drives working in other machines, and they’ve been fine. It’s a good job this server is only used for DRBD back-ups of VMs!

So another trip to UKS. If these drives fail, we’ll be worrying I think. :(

That Cisco way

So in the last few days I’ve been playing around with a new Cisco 2811 and CallManager Express 4.0. One thing that’s bugged me horrendously, ever since I began working with CME, is the generation of config files. When I first started my job, the phones were running from an Asterisk PBX which also hosted the TFTP server (naturally), and when we first began experimenting with CME 3.3 the TFTP server was kept in place upon Asterisk for simplicity and testing speed. However, we’re hoping to move all the phones to CME now, so the TFTP server had to be there too.

The problem is, according to any documentation that you read about creating individual configuration files for SCCP (or SIP) phones on CME, all you’ll be told is:

RouterX(config-telephony)# create cnf-files

And that’s it. But if you dig a little deeper, you’ll find that storing the cnf-files in the system: directory is the default (unbelievably stupid, as I’d imagine it just eats your router’s RAM). So to remedy this:

RouterX(config-telephony)# cnf-file location [ flash: | tftp: | etc. ]
RouterX(config-telephony)# cnf-file perphone
RouterX(config-telephony)# create cnf-files

And then you should have something that works. But does it? Well, upon running create cnf-files, the localised tone file for your network locale (if you’ve changed it) should have been generated and placed into flash, but there won’t be configuration files for each registered phone until you enter another command:

RouterX(config-ephone)# type [ 7960 | 7970 | etc. ]

And that must be applied for each and every ephone (note: CME allows you to use an ephone-template to achieve the same effect across multiple ephones) before an individual configuration file is generated. It makes sense now that I write this, as phones would inevitably require different configurations, but you try finding any reference to this in Cisco’s documentation. I wonder how many voice engineers blindly set this value as good practice, not realising that it’s required for perphone configuration?
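With more than a handful of phones, typing the same lines for every ephone gets old quickly. As a rough sketch only, the command order described above can be generated by a small script. The function name, the ephone tags and the phone types are all made up for illustration, and the command spellings (cnf-file location, cnf-file perphone, create cnf-files, type) are assumed to match your IOS release, so check them against the router before pasting anything in.

```python
def per_phone_cnf_commands(ephone_types, location="flash:"):
    """Return IOS config lines that enable per-phone cnf files.

    ephone_types maps an ephone tag to its phone type, e.g. {1: "7960"}.
    Both the tags and the types are illustrative, not taken from a real config.
    """
    # Tell CME where to keep the files and to build one per phone.
    lines = [
        "telephony-service",
        f" cnf-file location {location}",
        " cnf-file perphone",
    ]
    # Every ephone needs a type before its own file will be generated.
    for tag, phone_type in sorted(ephone_types.items()):
        lines += [f"ephone {tag}", f" type {phone_type}"]
    # Regenerate the files only once every phone has a type.
    lines += ["telephony-service", " create cnf-files"]
    return lines


if __name__ == "__main__":
    print("\n".join(per_phone_cnf_commands({1: "7960", 2: "7970"})))
```

The output is just text to review and paste in config mode; nothing here talks to the router.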

Meh, it keeps me occupied! :)

All said and done however, I have to say there is a lot more SIP functionality in this newer release, compared to the older 3.3 release I was originally using. Notably the support for SIP MWI is there (though I’m not sure it was actually missing in 3.3; we might’ve just been doing something wrong) and it works a treat. Come to think of it, there’s been a huge improvement in SIP support across the board. This is more than likely because Cisco is moving to phones which support SIP natively (the SIP firmware for 7960s is very lacklustre if you wish to have any advanced features from your phones) and that is the challenge for the next few days.

Again, however, Cisco felt it necessary to keep SCCP and SIP phone configuration completely separate in the IOS. Whilst those familiar with SCCP will feel quite at home in the ‘telephony-service’ sub-mode, you’ll notice when configuring CME to register SIP phones, all the commands are spread across a multitude of sub-modes that all begin with ‘voice’. No telephony-service in sight!

..Wouldn’t it have been easier to create a ‘sip-telephony-service’ sub-mode, Cisco? :/

On top of all this, I’m starting to think we might just need CME 4.1. 12.4 XJ releases are the only IOS I’ve heard of which carry this version, and it doesn’t appear to be on the feature selector – so finding this mythical beast might be a bit harder. Cisco do list a lot of SIP commands as being only supported in CME 4.1 though, so I guess we’ll have to wait and see where the limits are!

Finding the connection

It’s taken a little time, but this Saturday saw the installation of a BT land line in my flat. It’s not going to be used for calls though, no, this is purely for Broadband. And that means, the best broadband I can lay my hands on.

I had originally anticipated ordering a connection from BeBroadband, which would give me an ADSL2+ speed of up to 24Mbit/1.4Mbit on a 3-month contract, with no fixed download limits and all for £25/month. There was a £25 connection fee but that included their ‘BeBox’ ADSL modem, so not at all unreasonable.

But to my utter disappointment it appears that they haven’t enabled my exchange. The centre of Birmingham – B1 for crying out-loud – and they haven’t bothered! Here at work we have a fantastic Be connection, as does a colleague (or two) of mine. All located in other parts of Birmingham. Yet I’m not eligible, apparently? Considering where I live, and what sort of area it is, it makes little to no sense.

So I had a look at SamKnows, who state that only Easynet and Bulldog have actually installed their LLU equipment into my exchange. Bulldog wouldn’t be bad, though their ADSL2+ service is limited to 16Mbit and also requires that you transfer your landline over to them, which isn’t something that I can do.

So UKOnline, who resell via Easynet, were my choice. One of the companies my employers deal with often does actually have a UKOnline ADSL2+ connection and it’s been quite favourable. Granted though, the connection is definitely down on Be’s; only 22Mbit/768Kbit. Kilobits? IN MY UPSTREAM?! And to add insult to injury, any customers wishing to join on the ADSL2+ connection must purchase a Netgear DG834GT wireless router, or you don’t get the connection.

Now their reason for forcing a £59.99 router on you is purely because the current state of ADSL2+ modems is a little shoddy. Supposedly, at least – my only back-up story to support this was my colleague’s account of Cisco’s ADSL2 WIC, which for some unknown reason only ever sync’d at 7Mbit/sec – and even now with firmware upgrades, maxes out at 14.1Mbit/sec due to a physical limitation. If Cisco can’t get it right, what on Earth is going on?

After weighing it up, I didn’t have much choice. Thankfully UKOnline are currently waiving the £25 connection charge, and were very quick and friendly to help me through my order. I really did grill the poor bastard on the end of the phone, but it’s what he’s paid to put up with at the end of the day. Indeed it was nice of him to go the extra mile and arrange for my router and welcome pack to be sent to work instead of the flat, where I’ll actually be around to collect it.

And after some investigation, the router isn’t meant to be all that bad. More-over, even Ebuyer aren’t selling it for less than £60. :)

So in about 2 weeks I should have something to say about my new connection.

A fortnight of ups and downs (literally)

It’s been two weeks (roughly) since I started my placement, and there has been quite a bit to learn.

My first week has given me time to get to grips with my new environment, and also gave me a steep introduction into the systems that I’m going to be responsible for in the foreseeable future. That is, a large number of Gentoo co-located servers (some running VMs, which in turn run the services, and some caring for VM back-ups via DRBD) and an Asterisk PBX, which itself is actually hosted on a Gentoo server.

Due to.. Something… Wonderful, there isn’t a network/server map in sight. So all of this was pretty confusing to start with, and just trying to envision the layout of the entire system from one person’s explanation was very difficult indeed. Thankfully now, I’m pretty well-versed, but there are still some aspects I’m yet to grasp. Though it shouldn’t take long now..

What is particularly interesting is the variance of the work. In the early days I was doing rudimentary work; changing passwords on the 3Ware RAID card web interfaces to something new. This was fine until I realised that the passwords had an 8-character limit, and thus the passwords I had picked were far too long. This wasn’t noticed until after I’d been through almost all of the RAID cards… Even the MD was laughing from his office when that hit the e-mails.

On the other hand, last Thursday Jon and I had to visit UK Solutions to replace some hard drives in a server, and remove the aging RedHat install on that machine, along with another, in favour of shiny new Gentoo installs. Unfortunately the phrase ‘Shiny Gentoo install’ is somewhat of an oxymoron; it’s anything but shiny, glossy, or any other descriptive word for ‘pretty’. We’re talking about a purely CLI installation. Any harder and it’d be Linux From Scratch.

But if it wasn’t hard, it’d be boring! And it’s definitely a nice change to get out of the office and into the .. frying pan. Anyone who’s had to sit between the arse-end of two server racks will understand my metaphor. Cool, it most-definitely is not. But I had a job to do, and that was to install Gentoo onto the new RAID-5 array of recently-purchased 500GB Seagate drives.

The basic installation of Gentoo is by-far the most confusing – there’s a reason the install manual has probably been re-written 5001 times in order to make it simpler for first-time users, and anyone who remembers their first attempt at a Gentoo install will be nodding violently right about now. Thankfully once that’s done (and we head back to HQ in search of a more comfortable, ssh-supported server configuration environment) things do get easier. You get used to simply typing ‘emerge application-name‘, editing the necessary config file (or stealing it with scp from a working server ;) ) and using the necessary init.d script to restart the daemon.

Sounds groovy, but let me take you back to the RAID-5 configuration that I was working on in those two fun-packed days. RAID-5, for the uninitiated, provides a fall-back in server environments in the event that one of your drives fails. For instance, in a three-drive RAID-5 array of 500GB disks you will have a striped array of 1000GB in total, with one third of each drive taken up by parity data. The downside being that you don’t get the full 1500GB to use. The upside is if any one of the drives in the array fails, you can connect a new drive, and the data on the lost drive can be ‘re-constructed’ from the data and parity contained on the remaining two drives. Our nice 3Ware RAID cards also have support for ‘hot spare’ drives, so if one drive fails – the controller can bring in the fourth drive and re-construct the failed drive’s data without anyone having to physically visit the machine.
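To make the ‘re-constructed from parity’ bit a little less magical, here is a toy sketch of one stripe on a three-drive array. It assumes the usual XOR parity scheme; the block contents and names are invented, and a real controller works on much larger stripes and rotates the parity between drives rather than keeping it in one place.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: two data blocks, plus the parity block computed from them.
data_a = b"GENTOO!!"
data_b = b"3WARE95S"
parity = xor_blocks(data_a, data_b)

# The drive holding data_b drops out of the array. XORing what is left
# (the surviving data block and the parity block) gives the lost block back.
rebuilt_b = xor_blocks(data_a, parity)
assert rebuilt_b == data_b
```

The same trick works whichever of the three blocks goes missing, which is why a single failed drive is survivable.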

There are two problems with RAID-5:

  1. It’s slow. Having to write parity data with each write is hugely taxing on write speed. Reads are generally similar to that of RAID-0 though.
  2. If two drives fail at once, you’re screwed. There’s not enough parity data contained on the remaining drive to construct two failed drives. RAID-6 was implemented to cover this, but isn’t widely used. There’s two lots of parity to write, and THEN you need more disks (4) for only the same volume size.
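The capacity trade-off in the two points above is simple arithmetic, sketched here with a hypothetical helper. It assumes identical drives and the textbook layouts (one drive’s worth of parity for RAID-5, two for RAID-6), and ignores any overhead a particular controller might add.

```python
def usable_gb(level, drives, size_gb):
    """Usable capacity of a RAID-5 or RAID-6 array built from identical drives."""
    parity_drives = {5: 1, 6: 2}[level]   # drives' worth of space lost to parity
    minimum = parity_drives + 2           # 3 drives for RAID-5, 4 for RAID-6
    if drives < minimum:
        raise ValueError(f"RAID-{level} needs at least {minimum} drives")
    return (drives - parity_drives) * size_gb


# Three 500GB drives in RAID-5 and four in RAID-6 give the same volume size.
assert usable_gb(5, 3, 500) == 1000
assert usable_gb(6, 4, 500) == 1000
# The same four drives in RAID-5 would give 1500GB, but survive only one failure.
assert usable_gb(5, 4, 500) == 1500
```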

Anyway, what happens just as I’ve finished configuring my first Gentoo server? Yep, one of the drives drops out of the array! The RAID card attempted to replenish the array from the hot spare, but lo and behold … THAT FAILED TOO! :(

TYPICAL.. So those drives shall be going back under RMA, and we’re spending a bit more on some Seagate enterprise-class drives (which we should’ve had in the first place really, but then even desktop drives shouldn’t have failed that prematurely!) which shall then require another trip to UKSolutions in order to install them! Oh, and ANOTHER Gentoo install. Plus a few more. And we have to re-wire the internal LAN switch, which is going to be a massive job due to the lovely job of cable tidying that UKS has done.

I think I’m having fun! :) </geek>

More another time, I think. :)