Boring background info
In one of the recent threads here, I mentioned my disappointment that Verizon FIOS service isn't very friendly to people who use their own home routers. If you just use FIOS for Internet you're generally fine with your own router, but if you also have TV services, and want to use Video on Demand, remote DVR scheduling, etc., you run into problems unless you use the Verizon-provided router.
The crux of the issue is that Verizon uses TR-069 to perform some automated setup/configuration of the router, and to figure out what set-top boxes are on the network, so that port forwards can be maintained to provide connectivity to the set-top boxes. You can set these port forwards up yourself, but if the Verizon-provided modem doesn't have a valid external WAN address, it won't report things back to the Verizon mothership, and you're out of luck.
The Rube Goldberg solution that people have come up with requires using three (!) routers: the primary router on the network perimeter, the Verizon-provided Actiontec router to handle some of the TV stuff for the set-top boxes (guide data, VOD, etc.) and a third router whose only job is to serve the external internet address to the Actiontec's WAN port via DHCP, tricking it into thinking it's the primary router. This is what it ends up looking like on my network:
The goal is to remove "lenny", who's doing nothing except fooling "larry" into thinking he's "barney".
Digging into the weeds, I learned more about the technical details of this TR-069 communication between the Verizon router and the mothership, and I started to think it might be possible to use some software to send the correct data back to Verizon instead of using hardware to trick the router. Though the outbound connection is SSL-encrypted by default, I found out that one can set the URL for that connection to an HTTP URL on the local network, which opens up the possibility of a man-in-the-middle proxy where we can do any substitutions we need to do, then send it out over SSL to the "real" configuration server.
I've made some good progress on this, but one piece I still haven't solved is how to decode some of the configuration values that the router obscures using what seems to be a primitive cipher. I thought I had solved it, but there's something else going on with it that I can't quite figure out, so I thought I'd pose it here as a question, because we've got a bunch of smart people who love solving hard problems.
The puzzle
So, with all that out of the way, here's where I'm stuck.
There are a couple of passwords stored in the router's config that are used to authenticate to Verizon's servers, but they're stored in some sort of obfuscated format, and they need to be presented to the server in plain text. The good news is that it's trivial to store a value in the config that uses the encoding/encryption scheme, so I can generate an arbitrarily long list of raw/cooked text pairs, in an attempt to reverse-engineer the encoding scheme.
The first insight I had from just eyeballing the encoding scheme is that it uses an HTML-entity-like encoding for some characters, e.g.:
&a7;TU&9b;&97;&cb;&1c;&b4;&a1;&89;3&91;&bd;e&a7;&f7;
So an ampersand followed by a two-digit hex number and a semicolon represents a byte outside the printable ASCII range, with ordinary printable ASCII characters mixed in. The next observation was that, once you decode these HTML-style entities, the raw and cooked values have identical lengths. Seeing this, I started to think it was a simple 1-to-1 mapping of ASCII values, shifted in some way, à la ROT13 or a Caesar cipher.
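For anyone who wants to play along, here's a rough Python sketch of that entity-decoding step. The function name `decode_cooked` is mine, and it assumes every escape is exactly `&`, two hex digits, `;` and that anything else is a literal single-byte character, which is what the samples suggest but isn't confirmed by any documentation.

```python
import re

# Assumed escape format: '&', exactly two hex digits, ';'
_ESCAPE = re.compile(r"&([0-9a-fA-F]{2});")

def decode_cooked(cooked: str) -> bytes:
    """Turn an obfuscated string such as '&b7;UP&b1;' into raw bytes."""
    out = bytearray()
    pos = 0
    for m in _ESCAPE.finditer(cooked):
        # Literal characters between escapes are copied through unchanged.
        out += cooked[pos:m.start()].encode("latin-1")
        # The escape itself becomes a single byte.
        out.append(int(m.group(1), 16))
        pos = m.end()
    out += cooked[pos:].encode("latin-1")
    return bytes(out)

sample = "&a7;TU&9b;&97;&cb;&1c;&b4;&a1;&89;3&91;&bd;e&a7;&f7;"
print(len(decode_cooked(sample)))        # 16
print(decode_cooked("&b7;UP&b1;").hex()) # b75550b1
```

Once decoded, comparing `len()` of the raw and cooked values is a quick way to confirm the lengths match.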
I tried to measure the length of each of these shifts for each input character with some simple input, e.g.
input  output         shift
0000   &86;$&1f;&80;  56 f4 ef 50
So, we start with the first character of the input string (the number zero, ASCII 0x30) and add 0x56 to get 0x86 in the output. Then the second character is zero again, but this time we add 0xf4 (wrapping around using mod 256) to get 0x24, which is a dollar sign in ASCII.
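The shift calculation is just a per-byte subtraction mod 256. A minimal sketch, assuming the cooked value has already had its `&xx;` escapes turned into bytes (the helper name `shifts` is made up for illustration):

```python
from typing import List

def shifts(raw: str, cooked: bytes) -> List[int]:
    """Per-position shift: (cooked byte - raw byte) mod 256."""
    return [(c - ord(r)) % 256 for r, c in zip(raw, cooked)]

# "0000" encodes to &86;$&1f;&80;, i.e. bytes 86 24 1f 80
cooked = bytes([0x86, 0x24, 0x1F, 0x80])
print(" ".join(f"{s:02x}" for s in shifts("0000", cooked)))  # 56 f4 ef 50
```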
Doing a few more of these short sequences, I noticed a pattern:
input  output         shift
0000   &86;$&1f;&80;  56 f4 ef 50
aaaa   &b7;UP&b1;     56 f4 ef 50
AAAA   &97;50&91;     56 f4 ef 50
zzzz   &d0;ni&ca;     56 f4 ef 50
The next step was to see if the shifts were the same for every pair of input/output strings, and in my initial testing, it looked like they were. I came up with a sequence of shifts that seemed to work for most of the inputs I tried, but for longer / more complex strings, there would be slight differences in certain character positions, e.g.:
input   output              shift
abcdef  &b7;VR&b4;&99;&10;  56 f4 ef 50 34 aa
ABCDEF  &97;62&94;y&ef;     56 f4 ef 50 34 a9
This suggests that whatever scheme they're using is doing some minor perturbation of these shifts based on... I don't know what, really. To illustrate further, here are some more samples from some sequential tests I ran (ellipses indicate that the pattern repeats):
000000 &86;$&1f;&80;d&d9; 56f4ef5034a9
000001 &86;$&1f;&80;d&da; 56f4ef5034a9
000002 &86;$&1f;&80;d&db; 56f4ef5034a9
000003 &86;$&1f;&80;d&dc; 56f4ef5034a9
...
00000U &86;$&1f;&80;d&fe; 56f4ef5034a9
00000V &86;$&1f;&80;d&ff; 56f4ef5034a9
00000W &86;$&1f;&80;d&01; 56f4ef5034aa
aaaaaa &b7;UP&b1;&95;&0b; 56f4ef5034aa
aaaaab &b7;UP&b1;&95;&0c; 56f4ef5034aa
aaaaac &b7;UP&b1;&95;&0d; 56f4ef5034aa
...
aaaaa| &b7;UP&b1;&95;&26; 56f4ef5034aa
aaaaa} &b7;UP&b1;&95;&27; 56f4ef5034aa
aaaab! &b7;UP&b1;&96;&ca; 56f4ef5034a9
aaaab# &b7;UP&b1;&96;&cc; 56f4ef5034a9
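One way to spot exactly where the fixed-shift theory breaks is to predict the output from a candidate shift list and report any positions that disagree with what the router actually produced. This is only a sketch of that check, using the six shifts from the "000000" rows above as the candidate; `mismatches` and `KEY` are names I made up.

```python
from typing import List, Tuple

def mismatches(raw: str, cooked: bytes, key: bytes) -> List[Tuple[int, int, int]]:
    """Return (1-based position, predicted byte, observed byte) for each miss."""
    bad = []
    for i, (r, c, k) in enumerate(zip(raw, cooked, key)):
        predicted = (ord(r) + k) % 256
        if predicted != c:
            bad.append((i + 1, predicted, c))
    return bad

# Candidate shifts taken from the "000000" samples
KEY = bytes.fromhex("56f4ef5034a9")

# Observed outputs for 00000V and 00000W, with the &xx; escapes decoded
print(mismatches("00000V", bytes.fromhex("86241f8064ff"), KEY))  # []
print(mismatches("00000W", bytes.fromhex("86241f806401"), KEY))  # [(6, 0, 1)]
```

For "00000W", a shift of a9 would have predicted 0x00 in position 6, but the router emitted 0x01. I don't know whether that's significant.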
Things get even murkier when you start using longer strings. Here are some results from some random test runs (encoded output omitted for brevity):
input            shift
htaoJdCMoKhRlQkN 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4
rKNKPKEqSSTOBSEX 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4
WCUPMgniVItNITJj 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b4
yQUpwjiGqgZlPOoc 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b4
bjOCTlTNlDQkiAsY 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4
yVUDfCDYchhJiEKd 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9b 01 78 b4
vGknPsLbqIJomZQT 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4
nzkXHVZkIFDqMCvp 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4
nTHihKzuywlMvCZY 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9b 01 78 b4
pcrIFfIhuwrBJNsr 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b4
In these examples, the shifts of character positions 6 and 13 are off by one in some cases. So close, yet so far!
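To find the unstable positions without eyeballing the table, you can collect the set of shift values seen at each position and keep only the positions with more than one. A small sketch (the function name is hypothetical, and the three rows are just the first three from the table above):

```python
from collections import defaultdict
from typing import Dict, Iterable, List

def varying_positions(shift_rows: Iterable[bytes]) -> Dict[int, List[int]]:
    """Map 1-based position -> sorted distinct shifts, for positions that vary."""
    seen = defaultdict(set)
    for row in shift_rows:
        for i, s in enumerate(row):
            seen[i + 1].add(s)
    return {pos: sorted(vals) for pos, vals in seen.items() if len(vals) > 1}

rows = [
    bytes.fromhex("56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4"),  # htaoJdCMoKhRlQkN
    bytes.fromhex("56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4"),  # rKNKPKEqSSTOBSEX
    bytes.fromhex("56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b4"),  # WCUPMgniVItNITJj
]
for pos, vals in varying_positions(rows).items():
    print(pos, [f"{v:02x}" for v in vals])
# 6 ['a9', 'aa']
# 13 ['9a', '9b']
```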
Once you start throwing in punctuation, things get even weirder:
input            shift
c-3Lp4bUWp$>@/j] 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4
3`eIP4is]}`2*ZrV 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4
p=`9C$bp&]:Dxb@\ 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9b 01 78 b4
`8f*s&"n~f8e(~#d 56 f4 ef 50 34 a9 b5 b7 65 33 d5 69 5d 57 1d f5
|LbLIsBx@~8Ep5Jh 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9b 01 78 b4
zYrslorJ%>RT4u"x 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 3e 0a
I$ih@qt4cYW(P4vF 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b3
W;e;`\q/aE=9a`[4 56 f4 ef 50 34 aa ef 6b 55 4b 03 3c 9a 01 78 b3
=flO1:o;FB6h-Oed 56 f4 ef 50 34 a9 ef 6b 55 4b 03 3c 9a 01 78 b4
#"ZbZez|:5vX8aPy 56 ba 27 58 2c b5 04 6d 13 46 44 1e 7a 2a 67 dd
So, there's clearly a pattern, but I can't for the life of me figure out what the algorithm is doing to induce these slight changes in how each character is shifted.
Sadly, I never took any crypto classes, and I was never all that great at math, so I'm sort of stumped on what's going on. I was hoping someone here might have some insight on other things to try, or what the algorithm could be doing behind the scenes to introduce these slight changes.
I'm sure there are probably some better forums on the Internets to ask this kind of question, but I know folks here like a challenge, so I thought I'd post it here first. And, if not, I thought it was a fun story to tell about hacking hardware and trying to make it work better, which is something dear to the hearts of most folks here.