phaq

“a geeks daily life”


Stale IPv6 Neighbor unreachable – Resolve with IPv6 “ARP” equivalent

Monday, July 19th, 2010

Right before “mv’ing /dev/myself to /var/home” I came across another odd thing on my border router.

An IPv6 peer was unreachable, i.e. it did not respond to ICMP ping, and so the BGP session was down as well.
I gave it an in-depth look, as it happened to be one of our IPv6 upstream peers and was therefore of some importance.

After talking to the NOC guy of the carrier, who assured me that their interface was up, I was a bit confused.
He could not ping my end, nor could I ping his.


#ping ipv6 xxxxxxxxxx:80
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to xxxxxxxxxx:80, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

I had, however, various other IPv6 BGP sessions established within the same prefix, so everything seemed OK for the others.
For network guys it is not unusual to check not only the L3 addressing but also the L2 addressing.

IPv4 techs use ARP (Address Resolution Protocol), which maps L3 IPv4 addresses to L2 MAC (Media Access Control) addresses.
However, this doesn’t help here, as the ARP table only holds MAC-to-IPv4 mappings (addresses obfuscated):


#show arp
Protocol  Address      Age (min)  Hardware Addr   Type  Interface
Internet  xxxxxxxx.1         115  0000.0000.00fc  ARPA  GigabitEthernet0/3.110
Internet  xxxxxxxxx.2          0  0000.0000.6a9a  ARPA  GigabitEthernet0/3.110

In the IPv6 world this is a little bit different, as ARP does not exist there.
Instead, IPv6 uses something similar called the “Neighbor Discovery Protocol (NDP)”.
The NDP status can be inspected using the “show ipv6 neighbors” command, which gives you a list of all known IPv6 neighbor addresses and their MAC address counterparts:


#show ipv6 neighbors
IPv6 Address              Age  Link-layer Addr  State  Interface
FE80::xxxxxxxxxxxxx:4000    0  0009.b766.4000   REACH  Gi0/3.110
FE80::xxxxxxxxxxxxx:9C7D   87  000d.b918.9c7d   STALE  Gi0/3.111
FE80::xxxxxxxxxxxxx:2A80    0  001e.f7f6.2a80   REACH  Gi0/3.110
FE80::xxxxxxxxxxxxx:AB79    0  0013.1937.ab79   STALE  Gi0/3.111

You may also query a specific entry for a given IPv6 address:


#show ipv6 neighbors xxxxxxxxxxxxxxx:80
IPv6 Address        Age  Link-layer Addr  State  Interface
xxxxxxxxxxxxxxx:80    0  0000.0000.00fc   STALE  Gi0/3.110

As seen in this case, the address in question is in STALE state, i.e. the cached link-layer address has not been confirmed as reachable recently.
It is beyond the scope of this article to inspect all possible reasons why this happened.
A good starting point for further inspection is the “debug ipv6 nd” command, which I may cover in another article.
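
On a router with many neighbors, picking out the entries that are not in REACH state by eye gets tedious. The following is a minimal Python sketch of how that could be automated, assuming the output has exactly the five-column layout shown above; the helper names `parse_ipv6_neighbors` and `not_reachable` are made up for this illustration. Keep in mind that STALE on its own is a normal NDP state, so the result is only a list of candidates to look at.

```python
# Sample taken from the "show ipv6 neighbors" output above (addresses obfuscated).
SAMPLE_OUTPUT = """\
#show ipv6 neighbors
IPv6 Address Age Link-layer Addr State Interface
FE80::xxxxxxxxxxxxx:4000 0 0009.b766.4000 REACH Gi0/3.110
FE80::xxxxxxxxxxxxx:9C7D 87 000d.b918.9c7d STALE Gi0/3.111
FE80::xxxxxxxxxxxxx:2A80 0 001e.f7f6.2a80 REACH Gi0/3.110
FE80::xxxxxxxxxxxxx:AB79 0 0013.1937.ab79 STALE Gi0/3.111
"""


def parse_ipv6_neighbors(output):
    """Turn 'show ipv6 neighbors' text into a list of dicts (hypothetical helper)."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly five whitespace-separated columns; the
        # prompt line, the header (seven words) and blank lines do not.
        if len(fields) != 5:
            continue
        address, age, mac, state, interface = fields
        entries.append({
            "address": address,
            "age": age,  # kept as a string, no assumption about its format
            "mac": mac,
            "state": state,
            "interface": interface,
        })
    return entries


def not_reachable(entries):
    """Return the entries whose state is anything other than REACH."""
    return [entry for entry in entries if entry["state"] != "REACH"]


for entry in not_reachable(parse_ipv6_neighbors(SAMPLE_OUTPUT)):
    print(entry["address"], entry["mac"], entry["state"], entry["interface"])
```

The sketch deliberately splits on whitespace instead of fixed column positions, so it does not depend on how the router pads the columns.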

The most obvious cause is one that occasionally happens with IPv4 ARP as well: the cached L2 address no longer matches the L3 address (for example after a hardware replacement).

In the IPv4 world you would do something like “clear arp”, but this won’t help with IPv6 NDP mismatches.
The simple solution is to clear the IPv6 ND cache (note that this flushes all dynamically learned entries, not just the stale one):


#clear ipv6 neighbors

If this is really the root of the problem, you may notice a different L2 address afterwards:


#show ipv6 neighbors xxxxxxxxxxxxxxx:80
IPv6 Address        Age  Link-layer Addr  State  Interface
xxxxxxxxxxxxxxx:80    0  0000.0000.00ab   REACH  Gi0/3.110

In my case, the IPv6 peer was reachable by ICMP ping again and the BGP session recovered as well.

And now, finally, “mv /dev/myself /var/home/” ;-)

BGP configuration weirdness on Foundry/Brocade

Monday, July 19th, 2010

If you’re all too familiar with Cisco, then you will, just as I do, stumble across some oddities on the Foundry/Brocade routers every now and then.

Not too long ago I wrestled with a BGP issue on the XMR 4000.
My problem was that the XMR would announce just about any IPv6 prefix to all BGP peers, despite the fact that I had a configuration in place which should effectively only announce my own prefixes.

My Cisco configuration, which used to work properly, looks as shown below.
For the sake of simplicity I stripped away some advanced settings for communities, route-maps and prefix filters to keep it short.


ip as-path access-list 5 permit ^$
ip as-path access-list 5 deny .*
!
router bgp xxxxx
 neighbor PEERGROUPv6 peer-group
 neighbor PEERGROUPv6 description Some IPv6 Peers
 !
 neighbor SOMEIPV6PEER remote-as nnnnn
 neighbor SOMEIPV6PEER peer-group PEERGROUPv6
 !
 address-family ipv6
  neighbor PEERGROUPv6 soft-reconfiguration inbound
  neighbor PEERGROUPv6 filter-list 5 out
  !
  neighbor SOMEIPV6PEER activate

The basic idea of this setup is to have all peers share the same set of settings through the peer-group, in this case the filter-list, which should only permit routes originated by my own AS (those with an empty AS path, matched by ^$).
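
To make the effect of the filter-list more tangible, here is a small Python sketch that mimics the first-match evaluation of AS-path access list 5 from the config above. It is only an illustration: the function name `as_path_permitted` is made up, and Python’s regex dialect is not identical to the router’s, although the two patterns used here (^$ and .*) should behave the same in both.

```python
import re

# (action, regex) pairs in the order they appear in "ip as-path access-list 5".
AS_PATH_ACL_5 = [
    ("permit", r"^$"),  # empty AS path: routes originated by the local AS
    ("deny", r".*"),    # everything else
]


def as_path_permitted(as_path, acl):
    """Return True if the first matching entry permits the AS path."""
    for action, pattern in acl:
        if re.search(pattern, as_path):
            return action == "permit"
    # Nothing matched: access lists end with an implicit deny.
    return False


# AS paths taken from the advertised-routes output further down,
# plus the empty path of a locally originated prefix.
for path in ["", "6939", "6730 6695 12859", "13030 2500"]:
    verdict = "announce" if as_path_permitted(path, AS_PATH_ACL_5) else "filter"
    print(repr(path), "->", verdict)
```

Only the empty AS path passes, which is exactly what the outbound filter is supposed to achieve: transit routes learned from other peers are never re-announced.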

On the Cisco, this has exactly that effect, causing only my prefixes to be announced to the peers:


#show bgp ipv6 uni neighbors MASKEDIPV6PEER advertised-routes
BGP table version is 6144418, local router ID is MASKEDROUTER
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
*iMASKEDIPV6PREFIX/32 MASKEDROUTER
1 100 0 i

Applying the same configuration 1:1 on the Brocade XMR didn’t work, however. I always ended up having ALL IPv6 routes announced to my peers, regardless of their origin:


#show ipv6 bgp neigh MASKEDIPV6PEER advertised-routes
There are 8366 routes advertised to neighbor MASKEDIPV6PEER
Status A:AGGREGATE B:BEST b:NOT-INSTALLED-BEST E:EBGP I:IBGP L:LOCAL
Prefix Next Hop Metric LocPrf Weight Status
1 2001::/32 MASKEDIPV6PEER
990 400 0 BI
AS_PATH: 6730 6695 12859
2 2001::/32 MASKEDIPV6PEER 301 0 E
AS_PATH: 6939
3 2001::/32 MASKEDIPV6PEER
999 100 0 E
AS_PATH: 13030 12859
4 2001:200::/32 MASKEDIPV6PEER
990 400 0 BI
AS_PATH: 6730 6939 2500
5 2001:200::/32 MASKEDIPV6PEER 980 300 0 E
AS_PATH: 6939 2500
6 2001:200::/32 MASKEDIPV6PEER
999 100 0 E
AS_PATH: 13030 2500

Of course this is not what I intended, as only my local prefixes should be announced.
My initial Brocade config looked very similar to the Cisco config:


ip as-path access-list 5 seq 5 permit ^$
ip as-path access-list 5 seq 25 deny .*
!
router bgp
 neighbor PEERGROUPv6 peer-group
 neighbor PEERGROUPv6 description "Some IPv6 Peers"
 !
 neighbor SOMEIPV6PEER remote-as nnnnn
 neighbor SOMEIPV6PEER peer-group PEERGROUPv6
 !
 address-family ipv6 unicast
  neighbor PEERGROUPv6 activate
  neighbor PEERGROUPv6 soft-reconfiguration inbound
  neighbor PEERGROUPv6 filter-list 5 out
  !
  neighbor SOMEIPV6PEER activate

The main difference between Cisco and Brocade, apart from some syntactical differences, was the need to have the “neighbor PEERGROUPv6 activate” statement in place.
So if all routes are announced to all peers, a misconfiguration within the peer-group would be the most likely cause. I checked the peer-group configuration as follows:


#show ip bgp peer-group
1 BGP peer-group is PEERGROUPv6
Description: Some IPv6 Peers
NextHopSelf: no
SoftInboundReconfiguration: yes
Address family : IPV4 Unicast
Route Filter Policies:
Filter-list: (out) 5
Address family : IPV4 Multicast
Address family : IPV6 Unicast
Filter-list: (out) 5
Prefix-list: (in) ipv6-prefix-in (out) ipv6-prefix-out
Route-map: (in) IXin (out) IXout
Address family : IPV6 Multicast
Members:
IP Address: MASKEDIPV6PEER, AS: MASKEDAS
IP Address: MASKEDIPV6PEER, AS: MASKEDAS
IP Address: MASKEDIPV6PEER, AS: MASKEDAS

This looked OK to me, so the behaviour didn’t make any sense to me at all.

Well, I read the docs, but I didn’t find any reasonable clue on this. My friend Google didn’t help either.

After some time I had the idea to check what happens, if I add another “neighbor PEERGROUPv6 filter-list 5 out” command to the “address-family ipv4 unicast”, see below:


ip as-path access-list 5 seq 5 permit ^$
ip as-path access-list 5 seq 25 deny .*
!
router bgp
 neighbor PEERGROUPv6 peer-group
 neighbor PEERGROUPv6 description "Some IPv6 Peers"
 !
 neighbor SOMEIPV6PEER remote-as nnnnn
 neighbor SOMEIPV6PEER peer-group PEERGROUPv6
 !
 address-family ipv4 unicast
  neighbor PEERGROUPv6 filter-list 5 out
 !
 address-family ipv6 unicast
  neighbor PEERGROUPv6 activate
  neighbor PEERGROUPv6 soft-reconfiguration inbound
  neighbor PEERGROUPv6 filter-list 5 out
  !
  neighbor SOMEIPV6PEER activate

Next I needed to soft-reset the outbound updates towards the BGP peer to see whether the change had an effect:


#clear ipv6 bgp neighbor SOMEIPV6PEER soft-outbound

I couldn’t believe it when I saw the result:


#show ipv6 bgp neigh MASKEDIPV6PEER advertised-routes
There are 1 routes advertised to neighbor 2001:7f8:24::aa
Status A:AGGREGATE B:BEST b:NOT-INSTALLED-BEST E:EBGP I:IBGP L:LOCAL
Prefix Next Hop Metric LocPrf Weight Status
1 MASKEDIPV6PREFIX::/32 MASKEDROUTER 0 BE
AS_PATH: MASKEDAS

To see if I had done something wrong, I reverted all changes and tried again.
Result: Again all routes were announced.

Reapplying the “IPv4 filter-list” once more caused only the required prefixes to be announced, exactly the way I intended.

Now, this doesn’t make any sense at all and feels like a software bug in the implementation.
Even before I added this non-obvious configuration command, the output of “show ip bgp peer-group” clearly stated which filters and prefix lists the peer-group used.
The output didn’t change at all after applying the changed configuration.

At least my BGP announcements are correct now. The case still needs to be resolved with the Brocade tech guys.

Foundry/Brocade Devices require explicit reload of ACL upon modification – What a Man-Trap!

Wednesday, August 12th, 2009

Well, well, well …

I just stumbled across a minor difference between Cisco and Foundry, the latter being mostly Cisco-alike.

To update an ACL on Cisco devices (at least the ones I encountered so far) I usually do this:

conf t
!
no ip access-list extended MY_ACCESS_LIST
!
ip access-list extended MY_ACCESS_LIST
    my permit/deny list entries
!
end

This results in immediate application of the access list, so we’re just fine and happy.

Doing the same on a Foundry results in… nothing.
Well, not quite: the changes are applied in terms of “visibility” in the running config or with a “sh access-list name MY_ACCESS_LIST” statement, but they are not enforced.

Once more RTFM holds true, especially when talking about “familiar devices”, which we usually understand well enough to work with easily (which usually holds for most Cisco-alikes), but whose manual we omit reading in full for exactly THAT reason. Honestly, how many of you REALLY (I mean REALLY!) do this ….?

In this case I learned from the manual that Foundry/Brocade devices need an explicit reload (rebind) of the access lists after modifying them (D’oh!).

The command line should effectively read:

conf t
!
no ip access-list extended MY_ACCESS_LIST
!
ip access-list extended MY_ACCESS_LIST
    my permit/deny list entries
!
ip rebind-acl all
!
end
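
If you push ACL changes from a script, the vendor difference boils down to one extra line. The following Python sketch builds the command sequences shown above; the function name `build_acl_update`, the `rebind` flag and the sample entry are made up for this illustration, and how the lines are actually sent to the device is left out entirely.

```python
def build_acl_update(name, entries, rebind=False):
    """Return the CLI lines that replace an extended ACL (hypothetical helper).

    rebind=True appends the "ip rebind-acl all" step that the
    Foundry/Brocade devices described above require.
    """
    lines = [
        "conf t",
        "!",
        f"no ip access-list extended {name}",
        "!",
        f"ip access-list extended {name}",
    ]
    # The permit/deny entries are indented below the ACL header.
    lines += [f"    {entry}" for entry in entries]
    lines.append("!")
    if rebind:
        lines += ["ip rebind-acl all", "!"]
    lines.append("end")
    return lines


# Placeholder entry; substitute your real permit/deny list entries.
sample_entries = ["permit ip any any"]

print("\n".join(build_acl_update("MY_ACCESS_LIST", sample_entries)))
print()
print("\n".join(build_acl_update("MY_ACCESS_LIST", sample_entries, rebind=True)))
```

Making the rebind an explicit flag rather than a default keeps the Cisco variant identical to what I typed by hand before.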

So, I could have saved myself 15 minutes if I HAD actually read the manual section about ACLs before …

Chaining FreeBSD’s pxeboot with pxelinux

Thursday, September 20th, 2007

Recently I invested some development time in my company’s PXE-based network boot system. While pxelinux serves as a general-purpose network boot loader at our site, current demands required extending it beyond its capabilities. The main reason was that pxelinux cannot be used for certain bootstrap scenarios. FreeBSD is one example: while it can be booted from floppy images or an hd-converted ISO image via pxelinux’s memdisk loader, this has some serious limitations:

  1. You’re always limited to the size of the floppy image
  2. Converting ISOs to hd-like images tends to crash on some buggy BIOS versions

So I decided to get FreeBSD to boot directly via PXE. Because I had to retain compatibility with pxelinux as our primary PXE bootloader, FreeBSD’s own loader had to be “chained” to pxelinux. This is the easy part, as you only need to add an entry to your pxelinux default file like this:

label fbsdpxe
             KERNEL pxeboot

You see the error!? Well, I didn’t either at first, and kept running into the same error message while trying to boot.

  ----------------------------------------------------------------------
     NetInstall :: Main Menu
  ----------------------------------------------------------------------
     F1 :: F2 :: F3 :: F4 :: F5 :: F6 :: F9 :: exit
  ----------------------------------------------------------------------

     local boot is default after 10 seconds.

     Press F1 - F10 to cycle menu pages, enter "exit" or option name to
boot: fbsdpxe
Loading
Invalid or corrupt kernel image
boot:

At first I thought my pxeboot image had got corrupted, so I fetched another copy, still receiving the same error.
However, I saw the tftp download request in the logs, so the config itself could not be at fault.

Jul 19 20:21:39 setup tftpd[22265]: 192.168.2.239: read request for /pxelinux.cfg/default: success
Jul 19 20:21:39 setup tftpd[22267]: 192.168.2.239: read request for /pxelinux.cfg/screens/f1.txt: success
Jul 19 20:21:48 setup tftpd[22272]: 192.168.2.239: read request for /pxeboot: success

So what had happened?

When copying the FreeBSD pxeboot loader over to the tftpd boot directory I must have forgotten to type the filename correctly, so pxeboot.0 suddenly became pxeboot.

Without noticing this I added it like above to the pxelinux config file and ended up with the exact error message as previously shown.

After changing the filename to read pxeboot.0 it actually chain-booted via pxelinux.

On further examination I found that I had made the same mistake once before, when I initially configured pxelinux years ago, omitting the .0 at the end of the filename.
Interestingly enough, pxelinux was served without any problems and could be booted successfully despite its possibly wrong filename.
Even more amazing was the fact that the FreeBSD pxeboot loader would also work if served up as the primary loader, as specified in dhcpd.conf, using the “wrong” filename.

Now I’m not sure if there’s a real reason or a naming convention which actually requires the boot loader’s filename to end in .0.
The official specs mention REMOTE.0 to be used as NBP (Network Boot Program) and an optional REMOTE.1 to be fetched additionally in case the NBP exceeds 32k in size and needs to be split.

However there seems to be no real, fixed naming convention so to my understanding the file could actually be called anything.

So the mystical .0 at the end of the filename might have been chosen initially to comply with the original specs and to show that the file in question is the first part of the NBP (even if a second part does not exist).

Finally, the question remains whether pxelinux’s behaviour is by error or by design.

The true answer to this is: by design.

H. Peter Anvin explains this on the common problems page as follows:

[..]It is unfortunate that there isn’t a standard extension used for Linux kernels, and that none of the commonly loaded data formats (except perhaps COM32) have reliable magic numbers.[..]

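To conclude from this and the information on reserved filename extensions: pxelinux’s loader routines decide what to do with a given kernel based solely on its filename extension. A file named pxeboot has no recognised extension, so it is treated as a Linux kernel and rejected as “Invalid or corrupt kernel image”, whereas pxeboot.0 is handled as a PXE bootstrap program.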
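
The extension-based decision can be sketched in a few lines of Python. This is not pxelinux’s actual code, just an illustration of the idea; the function name `classify_kernel` is made up, and the extension table is written down as I recall it from the Syslinux documentation on reserved filename extensions, so treat it as an assumption and check it against the docs for your version.

```python
import os

# Assumed subset of the reserved filename extensions relevant to pxelinux.
# Anything not listed here falls through to "Linux kernel image".
PXELINUX_EXTENSIONS = {
    ".0": "PXE bootstrap program (NBP)",
    ".c32": "COM32 image",
    ".cbt": "COMBOOT image",
    ".com": "COMBOOT image",
}


def classify_kernel(filename):
    """Guess how the KERNEL file would be treated, judging by extension only."""
    extension = os.path.splitext(filename)[1].lower()
    return PXELINUX_EXTENSIONS.get(extension, "Linux kernel image")


for name in ["pxeboot", "pxeboot.0"]:
    print(name, "->", classify_kernel(name))
```

With this model the error above is easy to explain: without the .0 suffix the FreeBSD loader is handed to the Linux kernel loading routine, which rightly complains that it is not a valid kernel image.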