Why Bonjour Hates My Wireless Network


Many of you know about my recent struggle to integrate all of my recently acquired Apple devices into my existing network.  Many of you also know the frustration I’ve had with this process, having been innocent passers-by to my incessant Twitter updates, rants and spontaneous bursts of misplaced anger.  Here, then, is my brief explanation of what the problem is, and why I now believe, after too many rabbit-hole adventures to list, that I will not solve it without different equipment or a radical redesign of my network.

I should point out, by the way, that I’m not a masochist (really, I’m not); rather, in my studies I find it useful to have a lot of equipment lying around.  That equipment inevitably works its way into my home network, and after some time I have a large and sometimes convoluted structure in place.  In this case, however, the wireless is pretty far outside the scope of anything I’ll deal with on the CCIE Routing and Switching Lab Exam, and was brought in specifically to support some future upgrades to my home: wireless security, roaming VoIP phones, etc.  The irony, as my wife so perfectly pointed out the other evening, is that if we just had a “regular” little wireless router “like all the normal, non-computer geek people” our Apple devices would all work.

If you haven’t read my previous posting on Bonjour, that might provide some more background but isn’t, strictly speaking, necessary.  Some understanding of Bonjour might be helpful, however, so very quickly, here it is: Bonjour is Apple’s implementation of zero-configuration networking (zeroconf), built on multicast DNS and DNS-based service discovery.  It relies on link-local multicast to make things work, and it is the protocol behind Apple’s “everything just works” magic.  If you want more than that, Google can offer you much deeper explanations.

Two addresses matter here: 224.0.0.251 and 224.0.0.252.  The former is the multicast DNS address (UDP port 5353) where Bonjour does its discovery work; the latter is assigned to LLMNR, Microsoft’s comparable name resolution protocol.  The astute among you will notice that these are both link-local multicast addresses and so won’t be forwarded by layer 3 devices (even really, really broken ones) at all.  I had already been around the block with this once before, and so figured, smugly, that because my wireless network was one broadcast domain everything would be fine.  I was wrong.
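To make the "link-local" point concrete, here is a minimal Python sketch that checks whether an address falls inside 224.0.0.0/24, the block IANA reserves for local network control traffic that routers are not supposed to forward. The function name and the 239.1.1.1 comparison address are mine, purely for illustration.

```python
import ipaddress

# 224.0.0.0/24 is the IANA "Local Network Control Block": traffic sent to it
# is meant to stay on the local segment and is not forwarded by routers.
LOCAL_CONTROL_BLOCK = ipaddress.ip_network("224.0.0.0/24")


def is_link_local_multicast(address: str) -> bool:
    """Return True if the address is in the never-forwarded multicast block."""
    return ipaddress.ip_address(address) in LOCAL_CONTROL_BLOCK


# 239.1.1.1 is an arbitrary administratively scoped address for comparison.
for addr in ("224.0.0.251", "224.0.0.252", "239.1.1.1"):
    print(addr, is_link_local_multicast(addr))
```

Both of the addresses from the paragraph above land inside the block, while an address in the 239.x.x.x range does not.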

Now would be a good time to toss in a quick network diagram so that you can visualize what we’re talking about here.  The drawing below is just the wireless portion of my network as it applies to what we’re discussing in this article.  Rest assured, there is a lot more out there, but none of it is applicable to this situation.

As you can hopefully see, we have a 2811 ISR connected to a 2950 switch via an 802.1Q trunk, and two 1142 APs connected at layer 2 to the switch.  What might not be as obvious at first is that the Wireless LAN Controller you see at the upper right of the diagram is a module sitting in the 2811 router.  This is where the heart of evil apparently lies, but more on that in a minute.  The access points are on VLAN 16 and get their DHCP assignment from the 2811, along with option 43 and option 60, which are both necessary (despite what you may hear) to get the radios registered to the controller, at least in this configuration.  All VLANs are allowed everywhere (for testing), and no ACLs, VACLs, or any other security outside of standard wireless is applied.
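For anyone who has not built option 43 by hand: for Cisco lightweight APs it is a small type-length-value blob, with type 0xf1, a length of four bytes per controller, and then the controller management addresses as raw bytes. The sketch below builds that hex string in Python. The function name and the 192.168.10.5 controller address are made up for the example and are not taken from my network. Option 60 is the vendor class identifier string the AP sends, which depends on the AP model and is not shown here.

```python
import ipaddress


def cisco_option43_hex(controller_ips):
    """Build the DHCP option 43 payload for Cisco lightweight APs.

    Layout is type-length-value: type 0xf1, length = 4 bytes per controller,
    value = each controller management IP packed as 4 raw bytes.
    """
    value = b"".join(ipaddress.IPv4Address(ip).packed for ip in controller_ips)
    tlv = bytes([0xF1, len(value)]) + value
    return tlv.hex()


# Hypothetical controller address, for illustration only.
print(cisco_option43_hex(["192.168.10.5"]))  # f104c0a80a05
```

The resulting string is what would go into the hex form of the option in a DHCP pool.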

Before anyone points out the obvious, by the way, I did reconfigure this arrangement to put the APs on the same VLAN as the WLC management interface, make that the native VLAN all the way through, and bridge the switch and router at layer 2 with a BVI, just as a test to eliminate layer 3 boundaries.  While interesting to do, that didn’t solve the problem.  In fact, I didn’t even notice where the real problem was until I made this diagram (who would have thought?).

The WLC modules that plug into a router, while running the same software and otherwise operating almost identically, are different in at least one key respect from their stand-alone counterparts: they can’t communicate at layer 2 with the router.  A standard controller (say a 4400 series) can communicate at layer 2 with radios plugged in to access switches, thereby becoming the first layer 3 hop from the radios, even when the radios sit on VLANs other than the management VLAN.  The integrated module, however, communicates with the host router across the backplane at layer 3.  Looking back at the diagram, you can clearly see that drawn out.  So no matter what I do with bridging from the radios, switch, router, etc., inevitably I’ll have layer 3 separation between the radios and the controller.

This is all well and good for most protocols, but not for link-local multicast.

I think I found every rabbit-hole possible to get lost down, and proceeded to do just that.  When I finally ran out of said holes to explore, kind folks on Twitter that I respect and look up to sent me off in still more directions.  I tried, in no particular order:

(1)    Using Destination NAT to change the 224.0.0.251 and 224.0.0.252 addresses to multicast in the 239.x.x.x range

(2)    Using Destination NAT to change the 224.0.0.251 and 224.0.0.252 addresses to unicast

(3)    Using helper maps

(4)    Bridging everything under the sun to everything under the moon.  No love, because the backplane can’t be bridged.

I was even going to try GRE tunnels, DCI, or any other type of tunnel to move layer 2 over layer 3.  At the end of the day, however, besides getting tired of the project, I decided that nothing was likely to work.  Why?  Because one of the first things a layer 3 device does when it receives a packet is decrement the TTL.  So no matter what I do with NAT, or tunnels, or any other damned thing, the router will always decrement the TTL before it decides to pass the packet to some other service (like DNAT, GRE, whatever), and a packet that arrives with a TTL of 1 is discarded before it ever reaches those processes.
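As a rough model of that reasoning, here is a short Python sketch of the decision a layer 3 hop makes before anything else gets a look at the packet. It is a deliberate simplification under my own assumptions, not a description of how IOS actually orders its processing: the function name is invented, and it ignores whether multicast routing is even configured.

```python
import ipaddress

# Link-local multicast block that routers do not forward, whatever the TTL.
LOCAL_CONTROL_BLOCK = ipaddress.ip_network("224.0.0.0/24")


def router_forwards(dst: str, ttl: int) -> bool:
    """Simplified model: would a layer 3 hop pass this packet onward?"""
    if ipaddress.ip_address(dst) in LOCAL_CONTROL_BLOCK:
        return False  # link-local multicast stays on its own segment
    # The TTL is decremented on receipt; if nothing is left, the packet dies.
    return ttl - 1 > 0


print(router_forwards("224.0.0.251", 255))  # False
print(router_forwards("239.1.1.1", 1))      # False
print(router_forwards("239.1.1.1", 64))     # True
```

Either check on its own is enough to stop the packet, which is why rewriting only the destination address does not help if the TTL has already run out.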

As far as I can tell today, this is unsolvable.  Apple hates me, and others like me.  Using a TTL of 1 as your method of locking down communications is pretty rock-solid from a DRM viewpoint, but also very inflexible and heavy-handed.  I’m going to put a portable 3560 in my entertainment center to support my DirecTV box, Apple TV and other entertainment devices so that they can share the iTunes library on my main computer, but I’m not happy about it.  I lose my shiny N-connected coolness, and my iPad won’t be able to control those devices.  In addition, I’ve had to hard-set my wife’s printer, since her Mac can’t find it anymore.

The bottom line is that all of the auto-configuration magic that Apple devices can have has gone away in my current setup.  I could fix it by running a parallel wireless network using autonomous access points, or by buying a cheap-o wireless router, but then I have the other problem where I lose visibility and control, just to make a quirky system work.  The only viable option, really, is to change out my WLC module for a stand-alone controller, which I may do at some point, but right now I’m tired and may just move on, defeated.