Archive for the ‘Networking’ Category.

Source routing with OpenVZ & Linux

If, like me, you have to run lots of OpenVZ-based virtual server hosts, you will likely have encountered the fun that is reverse-path filtering, or ‘rp_filter’. This is the function of the kernel that rejects ‘martian’ packets: those arriving on an interface that the kernel would not itself use to reach their source address. This is usually a good thing, until you wish to connect your OpenVZ host to two separate networks and have it route IP addresses from both subnets to & from your guests via the VENET-style interfaces.

Essentially, despite differing source addresses, only one default gateway exists to send traffic to IPs not within the connected subnets and thus, traffic on any “secondary” subnet is rejected as a martian when leaving the host’s interface that is connected to its default gateway.
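To make the check itself concrete, here is a rough Python sketch of a strict reverse-path test run against the routing table shown further down. It is a simplified model rather than the kernel’s actual code; the function names and the outside address 192.0.2.10 are made up purely for illustration.

```python
import ipaddress

# Routes mirroring the host's routing table: (destination, outgoing interface).
ROUTES = [
    ("10.0.9.159/32", "venet0"),
    ("10.0.125.53/32", "venet0"),
    ("10.0.8.0/23", "br0"),
    ("10.0.124.0/23", "br1"),
    ("0.0.0.0/0", "br0"),  # default gateway 10.0.9.1
]

def route_interface(address, routes=ROUTES):
    """Return the interface of the most specific route covering address."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None

def passes_strict_rp_filter(source, arrival_iface, routes=ROUTES):
    """Strict mode: accept only if the route back to source uses arrival_iface."""
    return route_interface(source, routes) == arrival_iface

if __name__ == "__main__":
    for src, iface in [("10.0.9.159", "venet0"),
                       ("192.0.2.10", "br0"),
                       ("192.0.2.10", "br1")]:
        verdict = "accepted" if passes_strict_rp_filter(src, iface) else "martian"
        print(f"{src} arriving on {iface}: {verdict}")
```

In this model a packet from an outside address is accepted on br0 but not on br1, because the only route back to that source is the default gateway on br0.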

Some people would use bridged interfaces, although this is sadly not an option for me right now. Whilst the performance of VENET is supposedly better, we also have a large install-base of VENET guests that do not wish to be disturbed. So for now I still need a way to make this work with VENET interfaces (and also VETH if required later).

There are two ways around the reverse-path filtering, with the first being a terrible hack that should only be used temporarily, if at all… If you echo ‘1’ to /proc/sys/net/ipv4/conf/all/log_martians, the kernel log will show which interface is filtering martian packets. With that information you can then simply disable the rp_filter function by echoing ‘0’ to /proc/sys/net/ipv4/conf/INTERFACE/rp_filter and martians won’t be filtered.

However, this isn’t a sensible option. A better solution is to actually create a routing rule to alter the default gateway used, based on the source subnet. It took me a little bit of digging, but I eventually managed to get this working after combining a few sources (including, but not limited to, the iproute2 man page).

For reference, here’s my routing table showing two networks and two /32 IPs assigned to a guest’s VENET interface (note that the networks are /23s, not /24s!):

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.9.159      0.0.0.0         255.255.255.255 UH    0      0        0 venet0
10.0.125.53     0.0.0.0         255.255.255.255 UH    0      0        0 venet0
10.0.8.0        0.0.0.0         255.255.254.0   U     0      0        0 br0
10.0.124.0      0.0.0.0         255.255.254.0   U     0      0        0 br1
0.0.0.0         10.0.9.1        0.0.0.0         UG    0      0        0 br0
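Since the /23 detail is easy to miss, a quick sanity check with Python’s ipaddress module shows which addresses fall inside each network. This is only a sketch for checking the arithmetic; the helper name is hypothetical.

```python
import ipaddress

# The two connected networks from the routing table, given as address/netmask.
NET_BR0 = ipaddress.ip_network("10.0.8.0/255.255.254.0")
NET_BR1 = ipaddress.ip_network("10.0.124.0/255.255.254.0")

def contains(network, address):
    """True if address lies inside network."""
    return ipaddress.ip_address(address) in network

if __name__ == "__main__":
    # Both netmasks come out as /23 prefixes.
    print(NET_BR0, NET_BR1)
    # Guest IPs and gateways sit in the upper half of each /23.
    print(contains(NET_BR0, "10.0.9.159"), contains(NET_BR0, "10.0.9.1"))
    print(contains(NET_BR1, "10.0.125.53"), contains(NET_BR1, "10.0.125.1"))
    # With a /24 mask the br1 gateway would fall outside the network.
    print(contains(ipaddress.ip_network("10.0.124.0/24"), "10.0.125.1"))
```

The last check is the reason the prefix length matters: treated as a /24, 10.0.124.0 would not contain the 10.0.125.x addresses at all.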

Start by opening /etc/iproute2/rt_tables in your favourite editor. You’ll need to append a line to the bottom to create a new routing table:

# cat /etc/iproute2/rt_tables
#
# reserved values
#
255    local
254    main
253    default
0    unspec
#
# local
#
#1    inr.ruhep
100    vlan4

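As you can see, I’ve appended a new table named ‘vlan4’ (picking a sensible name helps; in my case this is the VLAN name for 10.0.124.0/23) and given it the ID 100. Note that this number is a table identifier, not a priority: it only has to be unique and must not clash with the reserved values above. The order in which rules are evaluated is set separately by each rule’s priority, which ‘ip rule show’ displays.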

Now you need to use ip to define the new rules & routing behaviour, taking advantage of the new table we’ve defined. First, create a rule matching traffic from your secondary subnet:

ip rule add from 10.0.124.0/23 iif venet0 table vlan4

For reference, the ‘iif’ attribute is not a mistake: it is “iif” (incoming interface), not “if”. This was also a key part of the setup, as it only classifies traffic originating from the VENET interfaces and nowhere else.

Now add a route to define the new default gateway for our new table of classified traffic and apply it:

ip route add default via 10.0.125.1 dev br1 table vlan4
ip route flush cache
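The effect of the rule and route above can be modelled in a few lines of Python. This is a deliberately simplified sketch: it only picks a default gateway, and it ignores rule priorities and the fall-through to the next rule that the real lookup performs. The data structures and function names are my own invention, not anything iproute2 provides.

```python
import ipaddress

# Rules are checked in order; None means "match anything".
RULES = [
    {"from": "10.0.124.0/23", "iif": "venet0", "table": "vlan4"},
    {"from": None, "iif": None, "table": "main"},
]

# Default gateway (address, interface) held by each table.
DEFAULT_GATEWAYS = {
    "vlan4": ("10.0.125.1", "br1"),
    "main": ("10.0.9.1", "br0"),
}

def select_table(source, iif):
    """Return the name of the first table whose rule matches the packet."""
    src = ipaddress.ip_address(source)
    for rule in RULES:
        if rule["from"] is not None and src not in ipaddress.ip_network(rule["from"]):
            continue
        if rule["iif"] is not None and rule["iif"] != iif:
            continue
        return rule["table"]
    return None

def default_gateway(source, iif):
    """Return the (gateway, interface) pair used for off-subnet traffic."""
    return DEFAULT_GATEWAYS[select_table(source, iif)]

if __name__ == "__main__":
    print(default_gateway("10.0.125.53", "venet0"))  # guest on the secondary subnet
    print(default_gateway("10.0.9.159", "venet0"))   # guest on the primary subnet
```

Only guest traffic that both comes in via venet0 and carries a 10.0.124.0/23 source address is steered to the second gateway; everything else still uses the main table.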

You should now find that guest traffic from either network is routed correctly without having to change any rp_filter settings. At any time you can use the following two commands to see your configuration:

ip rule show
ip route show table vlan4

Be sure to re-apply the ‘ip rule’ and ‘ip route’ statements on your next reboot; under Scientific Linux 6.0 I’ve used the /etc/rc.local file, but you can just as easily apply them on ifup in Debian’s network configuration.

Dell 6224 switch ‘Oversize Packets’ counter

It’s been a while since I’ve written anything on my blog, but following the lack of any hits on Google regarding this, I felt this might well be a useful snippet to those in the same boat as myself.

I’m currently testing & tweaking an iSCSI setup that utilises a Dell 6224 switch. These are very fast switches for the money (about £900 if you’ve got a good account manager!) and provide a lot of features, including stacking, if you have more than one. Their drawbacks, however, are mostly the lack of documentation and of the user interface ‘polish’ that you receive from other manufacturers. Most people will say ‘you get what you pay for’, but for the most part they are great switches for the money you pay.

One such “lack of documentation” has had me annoyed today. I’ve been using LACP to bond ports and, at the same time, have raised the MTU to the maximum of 9216 (which can be done per-port, without a reboot or a switchport up/down event, I might add) across all ports. All this, in an attempt to glean a little more performance (i.e. lower processing overhead) from my iSCSI sessions.

And it seemed to work just fine. However, upon inspecting the interface counters, I noticed a stunning number of packets being regarded as ‘Oversize Packets’:


switch#show interfaces counters port-channel 1
Alignment Errors: ............................. 0
FCS Errors: ................................... 0
Single Collision Frames: ...................... 0
Multiple Collision Frames: .................... 0
Late Collisions: .............................. 0
Excessive Collisions: ......................... 0
Oversize Packets: ............................. 15829678
Internal MAC Rx Errors: ....................... 0
Received Pause Frames: ........................ 0
Transmitted Pause Frames: ..................... 0

I wasn’t sure whether to take this as an error or just a simple ‘count’ of packets. “Oversize” would indicate that they’re bigger than the port was expecting, but I was still hitting around 120MB/sec (out of the theoretical 125MB/sec that Gigabit Ethernet can physically provide), which wouldn’t be consistent with a serious string of frame/packet errors.
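As a rough sanity check on those numbers, the sketch below estimates the best-case TCP payload rate over Gigabit Ethernet for a given MTU. It assumes IPv4 and TCP headers without options and ignores iSCSI’s own overhead; the 9000-byte host MTU is an assumption for illustration, as I have only quoted the switch’s 9216-byte limit.

```python
LINE_RATE_MB_S = 125.0                 # 1 Gbit/s expressed in MB/sec

# Bytes on the wire per frame that are not part of the IP packet:
# preamble + start delimiter (8), Ethernet header (14), FCS (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12

# IPv4 header plus TCP header, both without options.
IP_TCP_HEADERS = 20 + 20

def max_tcp_payload_rate(mtu):
    """Best-case TCP payload throughput in MB/sec for the given MTU."""
    payload = mtu - IP_TCP_HEADERS
    on_wire = mtu + ETHERNET_OVERHEAD
    return LINE_RATE_MB_S * payload / on_wire

if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {max_tcp_payload_rate(mtu):.1f} MB/sec")
```

That works out at roughly 118.7MB/sec for a 1500-byte MTU and 123.9MB/sec for 9000, so a measured figure of around 120MB/sec is at least plausible with jumbo frames in play.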

I couldn’t find anything online, so I contacted Dell ProSupport to raise a ticket. I had to go through the annoying rigmarole of explaining the problem three times over, but eventually a ‘switch expert’ explained that he wasn’t certain on the use of that counter and that its purpose depended on the firmware version currently in use (in my case, this was 3.2.0.9) and needed to check with his colleagues.

He eventually rang back to inform me that this was not a problem with the switch. The “Oversize Packets” counter merely serves to count frames larger than 1518 bytes, the maximum size of a standard untagged Ethernet frame (1500 bytes of payload plus header and FCS). A fixed amount. It doesn’t matter that the MTU was set to 9216, it just continues counting the packets. Utterly useless, then!

As some form of consolation, he also mentioned that it didn’t update in real time, leading me to believe that there was some form of port stats analysing process running over the real time output. When it’s this useless, could I please have an option to turn it off? Or better yet, don’t bother logging it by default!

IPv6 on m0n0wall

I finally got around to sending my first ping6 echoes! Who knew I’d get replies on my first go?!

My ADSL provider Andrews & Arnold have provided me with a /48 IPv6 subnet, which seems somewhat wasteful at 2^80 addresses (throw that in your calculator) but certainly useful for testing nevertheless. Whilst slowly getting my head around the task that is variable-length subnetting of IPv6 ranges – painful at best – I decided to just throw in a /64 subnet and set a static gateway address on m0n0wall’s LAN interface to see if it would ‘just work’.
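For anyone who would rather not reach for the calculator, the sums can be checked with Python’s ipaddress module. The prefix below is the 2001:db8::/48 documentation range standing in for a real allocation, so treat it as a placeholder.

```python
import ipaddress

# Placeholder /48 from the documentation range, not a real allocation.
allocation = ipaddress.ip_network("2001:db8::/48")

# A /48 leaves 128 - 48 = 80 bits of address space.
total_addresses = allocation.num_addresses

# Each LAN normally gets a /64, so 64 - 48 = 16 bits are left for subnetting.
lan_subnets = 2 ** (64 - allocation.prefixlen)

# The first /64 that could be assigned to a LAN interface.
first_lan = next(allocation.subnets(new_prefix=64))

if __name__ == "__main__":
    print(total_addresses == 2 ** 80)
    print(lan_subnets)
    print(first_lan)
```

So a single /48 is enough for 65,536 separate /64 LANs, which puts the “wasteful” part into perspective.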

The result is a working IPv6 LAN by simply enabling autoconfig from the m0n0wall box and telling Ubuntu’s Network Manager to use it. Et voilà:

teh@desktop:~$ ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:01:29:fc:37:1d
inet addr:81.187.xxx.xxx Bcast:81.187.xxx.xxx Mask:255.255.255.240
inet6 addr: 2001:8b0:ff87:1:201:29ff:fefc:371d/64 Scope:Global
inet6 addr: fe80::201:29ff:fefc:371d/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1616524 errors:0 dropped:0 overruns:0 frame:0
TX packets:2224946 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:277202062 (277.2 MB) TX bytes:519498762 (519.4 MB)
Interrupt:18

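You’ll notice that the last 64 bits of my IPv6 address on this host were assigned via autoconfig. They are derived from my MAC address using the modified EUI-64 scheme: ‘ff:fe’ is inserted between the manufacturer half and the device half of the MAC, and one bit in the first byte is flipped (which is why 00:01:29 shows up as 201:29). No randomly-generated bits are involved here.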
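The interface part of the autoconfigured addresses can be reproduced from the MAC address with the modified EUI-64 procedure. The sketch below uses the MAC and prefixes from the ifconfig output above; the function name is my own.

```python
import ipaddress

def eui64_address(prefix, mac):
    """Build the autoconfigured address for a /64 prefix and a MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the two halves of the MAC to get 64 bits.
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]
    interface_value = int.from_bytes(bytes(interface_id), "big")
    network = ipaddress.ip_network(prefix)
    return ipaddress.IPv6Address(int(network.network_address) | interface_value)

if __name__ == "__main__":
    mac = "00:01:29:fc:37:1d"
    print(eui64_address("2001:8b0:ff87:1::/64", mac))  # global address
    print(eui64_address("fe80::/64", mac))             # link-local address
```

Both results match the inet6 lines in the output above, which is a handy way to confirm that an address really did come from autoconfig.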

And to make my night, ping6 worked straight away, too:

teh@desktop:~$ ping6 2001:08B0:FF88:0001::1
PING 2001:08B0:FF88:0001::1(2001:8b0:ff88:1::1) 56 data bytes
64 bytes from 2001:8b0:ff88:1::1: icmp_seq=1 ttl=64 time=3.81 ms
64 bytes from 2001:8b0:ff88:1::1: icmp_seq=2 ttl=64 time=0.130 ms
64 bytes from 2001:8b0:ff88:1::1: icmp_seq=3 ttl=64 time=0.132 ms

--- 2001:08B0:FF88:0001::1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.130/1.358/3.813/1.735 ms

Now to plan how I’m going to roll this out at work…

m0n0wall and 3G USB modems

I’ve been running a m0n0wall router for some time now. The build and design of the machine was meant to be documented on the ‘RoutITX’ page of this blog, but I’d never gotten around to finishing it off. I may do this now that I have more time, but I’m not promising anything…

Even so, due to the impressive compatibility of the Sony Ericsson K800i and Linux, and the subsequent lack of the same DHCP/CDC USB Ethernet adapter functionality in the K850i, I thought it’d be quite cool to see if the K800i could be configured as a back-up WAN interface within m0n0wall.

So I fished my K800i (now retired, although I wish it wasn’t) out of its resting place, found a USB cable, and plugged it into the back of the m0n0wall machine. No new interface appeared on the ‘assign interfaces’ page, so I restarted it. Still no new interfaces. Upon checking the kernel messages in the log, I found these lines pertaining to the CDC USB Ethernet device:

Jan 4 20:28:06 kernel: device_attach: cdce0 attach returned 6
Jan 4 20:28:06 kernel: cdce0: could not find data bulk in
Jan 4 20:28:06 kernel: cdce0: Sony Ericsson Sony Ericsson K800, rev 2.00/0.00, addr 2

Which, as a Linux geek, confused me somewhat. Google turned up a number of results for ‘cdce0’ problems or ‘attach returned 6’ regarding various other drivers, but only one really addressed the issue in particular, albeit for a much older SE phone. You’ll notice that there haven’t been any replies, either.

A former colleague pointed me in the direction of a patch that was submitted around October 2008 which enables the proper handling of the CDC USB device within the Nokia N80. Hopefully it should help, but it may be some time before the patch filters down to m0n0wall.

This is just one of those times when I wish I’d followed the world of software development a little more.

Why Sony, why?

I absolutely adored my Sony Ericsson K800i. What a phone; everyone’s had one or used one at some point. Given that they’re quite long in the tooth now, you’d be hard-pushed to have not come across someone that had/has one.

So when the K850i came out, I was quite eager to get my upgrade. And so far there’s been only one real drawback to it, that I’ve found: using it as a modem.

When I first moved into my current abode, I didn’t have any ADSL for a few weeks. Predictably one can steal some wireless broadband, or one can attempt to use some form of mobile broadband. Before signing my life away for a few months, I decided to test my phone (which at the time, was the K800i) with Ubuntu. To my sheer delight, the phone presents itself as a USB Ethernet adapter, and Ubuntu’s network-manager simply sent a DHCP request and received an ACK. No messing about here: I had 3G broadband within 5 seconds of plugging the USB cable in!

So obviously when I attempted the same trick with my K850i, I was really quite dismayed to find that you can’t do this any longer. The USB Ethernet device is there (grep -i CDC /var/log/messages) but for the life of me, I cannot find a way to obtain a DHCP lease via the usb0 interface.

Yes, it works perfectly (and with HSDPA speeds, thanks to my city-centre location) if you use wvdial or one of its GUI front-ends (gnome-ppp worked well) and I’ve been able to connect like this.

But I can’t understand why the Sony Ericsson engineers would want to remove such a simple mechanism in favour of the greatest faff-about in history. I’d be interested to hear from anyone that’s managed to get this working, although I fear by the time I get an answer, I’ll be back on some ADSL goodness: HSDPA is alright in a pinch, but T-Mobile UK’s data network seems so heavily sensitive to peak times (I suspect insane levels of contention) and the latency is atrocious. Half a second? Ugh. That’ll be the Deep Packet Inspection they do…

Zyxel ADSL Modems and Bridging

First things first: AAAAAARGH!!! *waves arms in the air maniacally*

I’ve spent the evening getting my RoutITX project off the ground and into service. But to do this, I needed an ADSL2+ modem. So, rather than persist with using my Netgear DG834GT, I thought I’d try out a P660R-D1 from Zyxel. Simple little thing, only about £25, and claims to be able to do bridging to its (single) Ethernet port.

Can it hell. I’ve tried everything I can; it can sync the DSL to a lovely speed, but it can’t get any further than that.

What I’d like is a nice, small, cheap, ADSL2+ modem (preferably including Annex M) that does a perfect bridge, with good reliability and performance.

There’s got to be one out there? I’d love to know.

Crossing the Gigabit barrier

Recently, I’ve been charged with investigating faster-than-gigabit networking, in an effort to switch our VM hosts away from local storage to an NFS-based NAS system. There are a few reasons for doing this, the greatest of which is Sun’s ZFS file system.

ZFS, for those of you who aren’t familiar, has really shaken up the world of file systems recently, as it changes almost everything that we expect of a modern-day file system. On top of these fundamental changes (which I won’t go into detail about here) the ZFS developers have added some really neat features, such as zero-cost snapshots, replication between machines, RAID-Z, and quite a lot more.

It’s the promise of these features that has prompted our change over to a NAS-based storage system. Given that we can completely replace our current system of identical live/backup hosts, with slow backup scripts and drbd mirroring, it’s quite promising to think what we can achieve.

The problem is transport. And keeping fast transport. Given the extra overheads of IP/NFS that NAS brings (weighed against the benefits of ZFS over the more efficient use of raw disks in a SAN) it’s been deemed that a single gigabit link just won’t be up to the demanding task. The problem is that once you decide to cross the gigabit ‘barrier’, your costing simply spirals uncontrollably skyward. :(

There are a few options available to achieve a decent throughput:

  • Multiple, bonded (802.3ad) gigabit links – cheap-ish, but some multiport adapters really aren’t cheap.
  • 4Gbit FibreChannel – readily available Solaris support, but over-shadowed by 10GigE/Infiniband and requires costly HBAs with extremely expensive XFP/SFP+ modules.
  • Infiniband (SDR 4x, 10Gbit) – really, really cool, but there’s a huge lack of support in Solaris.
  • 10Gigabit Ethernet – very new, and switches are extremely expensive (laughably so, think $20,000 for a 24-port switch + Gbics!) mainly due to the lack of 10GBase-T support (meaning we need 10GBase-CX4 or some Fiber-based solution).

So what’s the answer? We’re not a Fortune 500 company, so most of this is still out of reach. On top of it all, we need to rely on Solaris for ZFS – an operating system which seems to have very little manufacturer support, despite its presence in the cluster and virtualisation markets. Sun’s Hardware-Compatibility List is almost devoid of recent Infiniband/10GBase-T adapters, particularly in PCI-E interconnect guises.

It wouldn’t be so bad if some manufacturer had thought to release a small-scale, 8-10 port 10GigE, 10GBase-T switch. They just don’t exist. At present, it’s quite likely that we’ll have to dump the idea of a switched fabric altogether, opting instead for multiple point-to-point links.

It seems we’re either just a few years ahead of ourselves, or really, really out of our depth.

Modem to Cisco 2811 Console port

I’m having a tough time getting this to work, so expect a few revisions to this.

I’m confused as to why this is so difficult. Below is my post to the Cisco NetPro forum. I’m hoping they’ll be able to help eventually, but I thought I’d include it here for the record. I know a few Cisco schmartypantsh people who might read this. ;)

Hi all,

I am wondering if someone may be able to point out where I’m going wrong.

I want to be able to see/use the boot cycle/ROMMON mode of a 2811 router, remotely.

As I understand it, the only way to achieve this is with an analogue modem connected to the Console port.

I’m doing the testing here with a Cisco 2651 as it’s the only router I have available to me locally, though as most of the configuration is modem-side, I feel that it shouldn’t make much of a difference.

I’m using a Hayes External Serial modem, connected to the Console port with a Cisco-provided 25pin D -> RJ-45 roll-over.

However, after using this guide, I’ve not had any success. The basic principles appear the same as with a router, but when I dial in, the modems handshake and connect but HyperTerminal displays nothing. There is no output at all UNLESS I enable the ‘post-dial window’, and then I can control the console. My issue is that this is not a native console – there’s no xmodem support and I wouldn’t have a clue how to duplicate this work-around in another client (minicom or screen, for example.)

I’ve noticed that the S37=9 (listed in the guide as the setting for 9600 bps) is not a value that is saved in the current profile. Perhaps the Hayes command set has been updated since the article was written. Does anyone know a better way to set this? I have found a way of forcing V.32, which is 9600 bps, but is it correct?

Here is the current profile stored on the modem:

ACTIVE PROFILE:
B1 E1 L1 M1 N0 Q0 T V1 W2 X4 Y0 &C1 &D0 &G0 &J0 &K0 &Q5 &R1 &S0 &T5 &X0 &Y0
S00:001 S01:000 S02:043 S03:013 S04:010 S05:008 S06:002 S07:050 S08:002 S09:006
S10:014 S11:085 S12:050 S18:000 S25:005 S26:001 S36:007 S38:020 S46:138 S48:007
S95:000

(Note that I set ATE0 and ATQ1 when testing, but these were flipped for the purpose of viewing the config ;) )

Here is the config from the router:

line con 0
exec-timeout 5 0
logging synchronous

I *think* this is a problem with the modem not detecting the speed of the console properly, but I’m not experienced enough with modems at this level. I’ve even consulted my CCNP2 book, only to find that it concentrated more upon the Aux port.

I can get modem -> Aux working, but that’s just not flexible enough (despite its advantages.)

If anyone can help, I’d be extremely grateful. If you require any more information, I’ll be happy to provide it! :)

Thanks,

Tom

If anyone has any insight into why HyperTerminal is behaving in such a fashion, I’d love to know. :)

TFTP Server via Netgear DG834GT

I’ve written a short guide on how to configure a Netgear DG834GT (and possibly other variants of the DG834) to point a device at a TFTP server, via the router’s built-in busybox/linux-based udhcpd server. I’ve recently been setting up a Cisco IP phone at home, and this functionality has proven to be extremely useful.

Disclaimer: I am NOT responsible if you screw up, over-write your config, brick your router, forget your password, kill your dog or set fire to the living room. There’s no reason why you should do any of these things, given the simple commands below, but if you do see fit to go a-wandering through the files stored within your router, then it’s not my fault if you break it!

Anyway, to start, you’ll need to enable the debug mode on your Netgear router. Simply construct a link like the one below, paste it into your web browser and log in with your username (most likely ‘admin’) and the configured password:

http://192.168.0.1/setup.cgi?todo=debug

Of course, you’ll need to replace ‘192.168.0.1’ with the LAN IP that you’ve chosen for your router.

Now for the fun part. We need to see what the DHCP server is configured to do at present, so issue the following command:

cat /etc/udhcpd.conf

You should see the contents of the udhcpd.conf file printed out. It will look something like this:

server 192.168.0.1
start 192.168.0.20
end 192.168.0.254
interface br0
option subnet 255.255.255.0
option router 192.168.0.1
option dns 208.67.222.222
option dns 208.67.220.220
option lease 259200

To set up the TFTP server re-direction, you first need to add a line to this config file. I had to look up a sample config file for udhcpd, but thankfully it lists a number of DHCP options that can be utilised (but aren’t directly supported by the web interface) to extend its usefulness.

Sadly there’s no text file editor included on the router, but one can achieve the same effect with simple shell operators:

echo "option tftp 192.168.0.20" >> /etc/udhcpd.conf

This simply prints a line and appends it to the file. You can check your addition by using the ‘cat’ command above.
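For the curious, this is roughly what that extra line ends up as inside the DHCP reply. As far as I know, udhcpd’s ‘tftp’ keyword maps to DHCP option 66 (TFTP server name), but treat that mapping as an assumption to check against your own firmware; the function name below is purely illustrative.

```python
# DHCP options are encoded as: one byte of option code, one byte of length, then the data.
TFTP_SERVER_NAME = 66  # assumed option code behind udhcpd's "tftp" keyword

def encode_dhcp_option(code, value):
    """Encode a string-valued DHCP option as code/length/value bytes."""
    data = value.encode("ascii")
    if len(data) > 255:
        raise ValueError("a DHCP option can carry at most 255 bytes of data")
    return bytes([code, len(data)]) + data

if __name__ == "__main__":
    encoded = encode_dhcp_option(TFTP_SERVER_NAME, "192.168.0.20")
    print(encoded.hex())
```

The phone simply reads the address out of that option when it receives its lease, which is why nothing else on the router needs to change.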

Now you need to stop the DHCP server and restart it. This is so that udhcpd will re-load its configuration file and take note of your changes. If you issue the command ‘ps’, you’ll see something like this:

# ps
PID Uid VmSize Stat Command
1 root 252 S init
2 root SWN [ksoftirqd/0]
3 root SW< [events/0]
4 root SW< [khelper]
5 root SW< [kblockd/0]
17 root SW [pdflush]
18 root SW [pdflush]
19 root SW [kswapd0]
20 root SW< [aio/0]
26 root SW [mtdblockd]
104 root 244 S /sbin/klogd
147 root 400 S /usr/sbin/hostapd -B /etc/hostapd.conf
163 root 260 S /usr/sbin/netgear_ntp -z GMT+0 -h 208.69.32.170
165 root 256 S /usr/sbin/mini_httpd -d /www -r NETGEAR DG834GT -c *
172 root 268 S /sbin/syslogd -f /etc/syslog.conf
173 root 244 S /usr/sbin/crond
175 root 168 S /usr/sbin/scfgmgr
179 root 184 S /usr/sbin/cmd_agent_ap
180 root 168 S /usr/sbin/pb_ap
193 root 252 S init
602 root 220 S /usr/sbin/utelnetd -d
1835 root 524 S /usr/sbin/pppd plugin pppoa 0.38 user lol@adsl.net
2056 root 300 S /usr/sbin/reaim
2153 root 244 S /usr/sbin/udhcpd /etc/udhcpd.conf
2155 root 396 S /bin/sh
2157 root 268 R ps

As you can see by this line:

2153 root 244 S /usr/sbin/udhcpd /etc/udhcpd.conf

..the udhcpd server is running with PID 2153. To stop it, just kill it:

kill 2153

Note: Although my example shows udhcpd using PID 2153, yours will almost certainly be different, so remember to substitute the kill command above with one that uses the correct PID value. Killing the wrong PID could mean that you break something vital, and you’ll need to start over again (i.e. restart the router, and go back to the beginning of this guide).

Issuing ‘ps’ once more should show that the udhcpd server has been successfully stopped (it won’t appear in the list of running processes). The router won’t restart udhcpd by itself thankfully. Note that whilst you have the DHCP server disabled, any clients expecting a lease or renew won’t get it. I can only recommend that the machine you do this from is assigned a static IP, to avoid any [possible, however unlikely] hassle.

Since we’ve pre-edited the udhcpd.conf file, all that is left to do is restart udhcpd:

/usr/sbin/udhcpd /etc/udhcpd.conf &

The ampersand at the end is important, as it puts udhcpd into the background and hands your shell back to you; don’t forget it!

Finally, issue another ‘ps’ command to check that udhcpd has been restarted successfully, and you’re all done.

Unfortunately these changes will not be permanent. If your router is restarted, or power cycled, you will lose these settings. At present, I do not know of a way to successfully add the TFTP option to the configuration saved within the router’s non-volatile memory. It would be nice if Netgear could build some of this functionality into their firmwares (and thus the web interface), and whilst third-party firmwares do exist, I’ve not had the beef to test one out as of yet.

Still, this helped me configure my SIP phone, so hopefully it’ll help someone else. :)

Windows’ Wonderful Network Routing

Update: I’ve since realised (after doing this on another machine) that if you enable the RRAS (Routing and Remote Access) service under Server 2003, it actually does behave in the correct manner.

Why can Linux just do this out of the box, eh?

Running multiple NICs, on different networks, is not something I’ve had to do before with Windows. I’ve always been using Cisco routers or *nix-based devices for my routing needs. However, I came across something really quite annoying when I was fiddling with the above.

Imagine this: A Windows 2003 Server VM, with two virtual NICs. One has a public IP configured, bridged to the Public Address LAN, and the other has a private IP configured (which was obviously bridged to our internal Gigabit LAN.)

Now, possibly for reasons of clarity, 2K3 issues a notification when you configure more than one default gateway, if they’re from differing networks. Something to do with it not working well in load-balancing situations. Fair enough. But I immediately think that if it’s going to complain about something like that, then it obviously doesn’t need a second default gateway, and indeed it shouldn’t (as the networks are in completely separate IP ranges – it should work out where best to send it.)

Unfortunately, someone forgot to mention to Mr. Microsoft, that the term ‘default gateway’ is otherwise known as a ‘gateway of last resort’ and not the ‘gateway of only resort’!

So for the last few hours or so I’d been racking my brains over why connections to the internal LAN weren’t being routed back. The last thing I thought to check was the damned Windows Server. Why on Earth would it ever decide to route packets for a 10.16.0.0 address over its default gateway on another network, when it’s already connected to 10.16.0.0 directly?!
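What I expected Windows to do is ordinary longest-prefix matching, where a directly connected network always beats the default route. Here is a small Python sketch of that expectation. The /16 mask on the internal LAN, the 203.0.113.0/24 public range and the interface labels are all assumptions for the sake of the example, as I have not listed the real ones here.

```python
import ipaddress

# Hypothetical routing table for the VM: (destination network, where it goes).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "public NIC (default gateway)"),
    (ipaddress.ip_network("203.0.113.0/24"), "public NIC (connected)"),
    (ipaddress.ip_network("10.16.0.0/16"), "internal NIC (connected)"),
]

def pick_route(destination):
    """Choose the matching route with the longest prefix."""
    addr = ipaddress.ip_address(destination)
    candidates = [(net, via) for net, via in ROUTES if addr in net]
    # The default route matches every IPv4 address, so candidates is never empty.
    return max(candidates, key=lambda item: item[0].prefixlen)[1]

if __name__ == "__main__":
    print(pick_route("10.16.3.4"))     # should stay on the internal LAN
    print(pick_route("198.51.100.9"))  # anything else falls back to the default
```

With a table like this, the default gateway really is the gateway of last resort: it is only chosen when nothing more specific matches.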

Setting a default gateway (ignoring any notifications) for the internal LAN fixed it immediately. Grr.