Firmware extraction and reconstruction

Recently I had to extract firmware from an I2C EEPROM.

Although I am quite used to SPI EEPROMs on embedded devices, seeing an I2C bus seemed rather unusual to me.

As you may have noticed from my previous posts, I make heavy use of my GoodFET. It is a very handy tool and although I also have a BusPirate v4, I prefer Travis’s tool. Unfortunately, I2C support is not compiled into the firmware by default, the tools are marked as “untested” on the website and the pinout is not documented there. That’s a lot of things to find out :-)

The pinout issue is quite easy to solve: all we have to do is look at the firmware source code! In the file “firmware/apps/i2c/i2c.c” we can read the following:

#define SDA TDI
#define SCL TDO

Our next step is to compile a new firmware that includes I2C support, by running the following commands:

$ board=goodfet41 CONFIG_i2c=y make
$ board=goodfet41 make reinstall

So far, so good. Finally, we have to run the client and get our EEPROM content, right? I2C requires an address on the bus. It is usually a 7-bit address that is shifted left by one bit so that an R/W bit can be added at the end.

According to the EEPROM datasheet, the base address is 0x50.
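To make the address arithmetic concrete, here is a minimal Python sketch of how the 7-bit address 0x50 becomes the first byte on the wire. The helper name `i2c_address_byte` is mine, for illustration only; it is not part of the GoodFET client.

```python
def i2c_address_byte(addr7, read):
    """Return the first byte sent on the bus for a 7-bit I2C address."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("7-bit address expected")
    # The address occupies bits 7..1; bit 0 is the R/W flag (1 = read, 0 = write).
    return (addr7 << 1) | (1 if read else 0)

print(hex(i2c_address_byte(0x50, read=False)))  # 0xa0
print(hex(i2c_address_byte(0x50, read=True)))   # 0xa1
```

This is why some tools and datasheets quote 0xA0/0xA1 for a chip that others list at 0x50.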

$ goodfet.i2ceeprom dump 0x50 output.bin
Dumping 256 bytes from device 0x50 starting at 0x00 to file: output.bin.
Traceback (most recent call last):
  File "/usr/local/bin/goodfet.i2ceeprom", line 50, in <module>
    data=client.I2Ctrans(count, [devadr, start])
  File "/home/jm/workspace/goodfet/trunk/client/GoodFETI2C.py", line 32, in I2Ctrans
    return self.writecmd(0x02,0x02,len(data)+1,[readcount]+data)
  File "/home/jm/workspace/goodfet/trunk/client/GoodFET.py", line 438, in writecmd
    data[i]=chr(data[i]);
ValueError: chr() arg not in range(256)

Doh! The client is not working at all :-(

After adding some debug lines, it turns out that the script sends 0x100 (the length to be read, i.e. the ‘readcount’ variable) and then tries to convert it into a single char, which cannot hold a value above 0xFF.

Obviously I didn’t want to read the EEPROM in chunks of 128 bytes saved into individual files and then concatenate everything back together. So I spent some time reading the datasheet.

To talk to the EEPROM, you have to write at least 3 bytes:

  • Address of the component on the bus
  • A 2-byte address from which you want to read or write
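As a rough sketch, the bytes that set the EEPROM's internal pointer could be assembled as follows. I assume here that the memory address is sent high byte first, which is common for EEPROMs with 16-bit addressing but should be checked against the datasheet; the function name is hypothetical.

```python
def eeprom_set_pointer_frame(addr7, mem_addr):
    """Bytes to write before a read: bus address (write mode) + 2-byte memory address."""
    if not 0 <= mem_addr <= 0xFFFF:
        raise ValueError("memory address must fit in 2 bytes")
    return bytes([
        (addr7 << 1) | 0,        # device address with the R/W bit cleared (write)
        (mem_addr >> 8) & 0xFF,  # assumed: high byte first
        mem_addr & 0xFF,         # low byte
    ])

print(eeprom_set_pointer_frame(0x50, 0x0123).hex())  # a00123
```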

Unfortunately, the client only handles single-byte addresses.

Therefore, I patched the client to mimic the behavior of the similar SPI client that dumps EEPROM:

  • Read/write addresses are now 2 bytes long
  • The client reads the EEPROM in chunks of 128 bytes and can thus accept an arbitrary length.
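The chunking idea can be sketched like this. It is not the actual patch: `read_chunk(addr, count)` is a hypothetical callback standing in for whatever performs one I2C transaction, and the function names are mine.

```python
def chunk_plan(start, length, chunk_size=128):
    """Yield (address, count) pairs covering `length` bytes from `start`."""
    end = start + length
    addr = start
    while addr < end:
        count = min(chunk_size, end - addr)
        yield addr, count
        addr += count

def dump(read_chunk, start, length, chunk_size=128):
    """Read `length` bytes using `read_chunk(addr, count)`, a hypothetical transport callback."""
    data = bytearray()
    for addr, count in chunk_plan(start, length, chunk_size):
        data += read_chunk(addr, count)
    return bytes(data)

print(list(chunk_plan(0, 300)))  # [(0, 128), (128, 128), (256, 44)]
```

Since every count is at most 128, it always fits in a single byte, whatever the total length requested.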

In addition to those changes, I noticed that it would be great to have an I2C bus scanning command, just like the macro provided by the BusPirate. Hence I am currently adding it to the client.
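The principle of a bus scan is simple enough to sketch in a few lines. This is only an illustration, not the code I am adding to the client: `probe(addr)` is a hypothetical function that addresses a device and returns True if it acknowledges. The default range skips the addresses that the I2C specification reserves (0x00 to 0x07 and 0x78 to 0x7F).

```python
def scan_bus(probe, first=0x08, last=0x77):
    """Return the 7-bit addresses for which `probe(addr)` reports an ACK."""
    return [addr for addr in range(first, last + 1) if probe(addr)]

# Stand-in for real hardware: pretend only an EEPROM at 0x50 answers.
def fake_probe(addr):
    return addr == 0x50

print([hex(a) for a in scan_bus(fake_probe)])  # ['0x50']
```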

As soon as the code has been tested enough, I will push it to the main repository.

Being careful when I dump a firmware directly from a soldered component, I always do it twice to check that the files are identical. And guess what? They were not!

To tackle that problem, I simply ran vbindiff and found the following:

It seems some chunks were not read correctly. I don’t know if it is due to the MCU trying to access the EEPROM simultaneously or to a firmware error on the GoodFET, but I didn’t want to spend a lot of time investigating that issue.
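If you prefer a quick scripted check over a visual diff, something along these lines lists which chunks differ between two dumps. It is a minimal sketch; the function name and the file names in the usage comment are placeholders.

```python
def differing_chunks(a, b, chunk_size=128):
    """Return the start offsets of chunks where two dumps differ."""
    if len(a) != len(b):
        raise ValueError("dumps have different sizes")
    return [off for off in range(0, len(a), chunk_size)
            if a[off:off + chunk_size] != b[off:off + chunk_size]]

# Usage with two dump files (file names are placeholders):
# with open("dump1.bin", "rb") as f1, open("dump2.bin", "rb") as f2:
#     print([hex(o) for o in differing_chunks(f1.read(), f2.read())])
```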

Therefore, I decided to dump the EEPROM multiple times, assuming that, statistically, the read errors won’t be at the same offset each time. To finally get the full firmware, all I had to do was run a small tool that takes all the dumped files as input and selects, for each byte, the most frequent value!

I was a bit surprised, as I expected such a tool to already exist. As a Google search wasn’t successful, I wrote the tool and it worked just as expected!

The tool is available in my Bitbucket account and is called “firmware-reconstruct.py”. It takes one parameter for the output file name and all other parameters are treated as input candidates. Files are read in chunks to prevent memory exhaustion, and then every byte is compared across the candidates to choose the right one for the output. The tool also prints a warning if two candidate values occur with the same frequency, letting you know that you may need another dump.
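For readers who just want the idea, here is a minimal sketch of byte-wise majority voting. It is not the code of “firmware-reconstruct.py”: the function names, the 4096-byte chunk size and the warning format are assumptions of mine.

```python
import sys
from collections import Counter

def vote(chunks, offset=0):
    """Majority-vote equal-length chunks byte by byte; warn on ties."""
    out = bytearray()
    for i, column in enumerate(zip(*chunks)):
        ranked = Counter(column).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            print("warning: tie at offset 0x%x" % (offset + i), file=sys.stderr)
        out.append(ranked[0][0])
    return bytes(out)

def reconstruct(output, inputs, chunk_size=4096):
    """Write to `output` the most frequent byte at each offset of `inputs`."""
    files = [open(name, "rb") for name in inputs]
    try:
        with open(output, "wb") as out:
            offset = 0
            while True:
                chunks = [f.read(chunk_size) for f in files]
                if not any(chunks):
                    break  # every input is exhausted
                if len(set(map(len, chunks))) != 1:
                    raise ValueError("input files differ in size")
                out.write(vote(chunks, offset))
                offset += len(chunks[0])
    finally:
        for f in files:
            f.close()

print(vote([b"\x01\x02", b"\x01\xff", b"\x01\x02"]))  # b'\x01\x02'
```

With only two dumps every disagreement is a tie, so an odd number of dumps (three or more) is the sensible minimum.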

Any feedback is welcome and I hope this tool will be useful to someone else :-)