I started working on this. I'm posting my results so far here as a "community wiki" answer for two reasons: first, if someone else wants to join in, there's a place to talk; second, if I get pulled away from this project, there'll be hints for someone else to start working.
The backup logic on the host is entirely contained within https://github.com/android/platform_system_core/blob/master/adb/commandline.cpp, in the function named backup. The function is very simple: it validates the command line options, sends the command mostly as-is to the adb daemon on the phone, and writes the phone's output to the file. There isn't even error-checking: if, for example, you refuse the backup on the phone, adb just writes out an empty file.
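Since adb does no checking of its own, a caller may want to verify the result before trusting it. Here is a minimal sketch of such a check. The helper name looks_like_backup is hypothetical, and the check assumes that a valid file starts with the ASCII line "ANDROID BACKUP" (a property of the file header, not something commandline.cpp enforces).

```python
import os

# Assumed first header line of a valid backup file.
MAGIC = b"ANDROID BACKUP\n"


def looks_like_backup(path):
    """Return True if the file at `path` is non-empty and starts with MAGIC."""
    if os.path.getsize(path) == 0:
        # This is what adb leaves behind when the backup is refused on the phone.
        return False
    with open(path, "rb") as f:
        return f.read(len(MAGIC)) == MAGIC
```

Running this right after adb backup returns is a cheap way to catch the refused-backup case instead of discovering an empty file later.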
On the phone, the backup logic starts in service_to_fd() in https://github.com/android/platform_system_core/blob/master/adb/services.cpp. The function identifies that the command from the host is "backup", and passes the unparsed command to /system/bin/bu, which is a trivial shell script to launch com.android.commands.bu.Backup as the main class of a new Android app process. That calls ServiceManager.getService("backup") to get the backup service as an IBackupManager, and calls IBackupManager.fullBackup(), passing it the still-unused file descriptor (very indirectly) connected to the backup.ab file on the host.
Control passes to fullBackup() in com.android.server.backup.BackupManagerService, which pops up the GUI asking the user to confirm or reject the backup. When the user does so, acknowledgeFullBackupOrRestore() (same file) is called. If the user approved the request, acknowledgeFullBackupOrRestore() figures out if the backup is encrypted, and passes a message to BackupHandler (same file).
BackupHandler then instantiates and kicks off a PerformAdbBackupTask (same file, line 4004 as of time of writing).
We finally start generating output there, in PerformAdbBackupTask.run(), between line 4151 and line 4330.
run() writes a header, which consists of either 4 or 9 ASCII lines:
- the magic string "ANDROID BACKUP"
- the backup format version, as a decimal number
- the compression flag: "0" if the backup is uncompressed or "1" if it is
- the encryption method: currently either "none" or "AES-256"
- (if encrypted), the "user password salt" encoded in hex, all caps
- (if encrypted), the "master key checksum salt" encoded in hex, all caps
- (if encrypted), the "number of PBKDF2 rounds used" as a decimal number: currently 10000
- (if encrypted), the "IV of the user key" encoded in hex, all caps
- (if encrypted), the "master IV + key blob, encrypted by the user key" encoded in hex, all caps
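The header layout above is simple enough to parse by hand. The sketch below is one way to do it, not the Android implementation. It assumes the four unconditional lines are, in order, a magic string, the format version, the compression flag and the encryption method, and that an unencrypted backup names its method "none". The function name read_header and the dictionary keys are hypothetical.

```python
import io


def read_header(stream):
    """Parse the ASCII header from a binary stream positioned at byte 0.

    Leaves the stream positioned at the first byte of the backup data.
    """
    def line():
        raw = stream.readline()
        if not raw.endswith(b"\n"):
            raise ValueError("truncated header")
        return raw[:-1].decode("ascii")

    # Assumed order of the four unconditional lines (see lead-in).
    header = {
        "magic": line(),
        "version": int(line()),
        "compressed": line() == "1",
        "encryption": line(),
    }
    if header["encryption"] != "none":
        # The five extra lines that only encrypted backups carry.
        header["user_salt"] = bytes.fromhex(line())
        header["checksum_salt"] = bytes.fromhex(line())
        header["pbkdf2_rounds"] = int(line())
        header["user_iv"] = bytes.fromhex(line())
        header["master_key_blob"] = bytes.fromhex(line())
    return header


# Usage with a made-up unencrypted header; the version number is illustrative.
sample = io.BytesIO(b"ANDROID BACKUP\n4\n1\nnone\n<payload bytes>")
header = read_header(sample)
```

Reading line by line from a binary stream matters here: everything after the header is binary data, so the file must not be opened in text mode.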
The actual backup data follows, as (depending on compression and encryption) tar, deflate(tar), encrypt(tar), or encrypt(deflate(tar)).
TODO: write up the code path that generates the tar output -- you can simply use tar as long as entries are in the proper order (see below).
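For the unencrypted case, getting back to a plain tar stream only needs the standard library. The following is a rough sketch under several assumptions: the header has four lines for unencrypted backups (magic, version, compression flag, encryption method), the encryption line reads "none" when no password was set, and the compressed payload is a zlib-wrapped deflate stream. The function name extract_tar_bytes is hypothetical.

```python
import io
import zlib


def extract_tar_bytes(ab_bytes):
    """Return the raw tar stream from an unencrypted backup held in memory."""
    stream = io.BytesIO(ab_bytes)
    # Assumed header layout: magic, version, compression flag, encryption method.
    magic, version, compressed, encryption = (
        stream.readline().rstrip(b"\n").decode("ascii") for _ in range(4)
    )
    if magic != "ANDROID BACKUP":
        raise ValueError("not an Android backup file")
    if encryption != "none":
        raise ValueError("encrypted backups must be decrypted first")
    payload = stream.read()
    # Assumption: a compressed payload is a zlib stream, so zlib can inflate it.
    return zlib.decompress(payload) if compressed == "1" else payload
```

The returned bytes can be written to a .tar file or handed to tarfile.open(fileobj=io.BytesIO(...)). This version loads the whole backup into memory, which is fine for a sketch but not for multi-gigabyte backups.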
Tar archive format
App data is stored under the apps/ directory, starting with a _manifest file, the APK (if requested) in a/, app files in f/, databases in db/ and shared preferences in sp/. If you requested external storage backup (using the -shared option), there will also be a shared/ directory in the archive containing external storage files.
    $ tar tvf mybackup.tar
    -rw------- 1000/1000      1019 2012-06-04 16:44 apps/org.myapp/_manifest
    -rw-r--r-- 1000/1000   1412208 2012-06-02 23:53 apps/org.myapp/a/org.myapp-1.apk
    -rw-rw---- 10091/10091     231 2012-06-02 23:41 apps/org.myapp/f/share_history.xml
    -rw-rw---- 10091/10091       0 2012-06-02 23:41 apps/org.myapp/db/myapp.db-journal
    -rw-rw---- 10091/10091    5120 2012-06-02 23:41 apps/org.myapp/db/myapp.db
    -rw-rw---- 10091/10091    1110 2012-06-03 01:29 apps/org.myapp/sp/org.myapp_preferences.xml
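The directory layout in this listing maps mechanically onto what each entry is. As an illustration, the sketch below splits an entry path into package, kind of data and the path relative to that kind. The function name classify and the labels ("apk", "files", "databases", "shared_prefs", "external_storage") are hypothetical names chosen for this example; only the a/, f/, db/, sp/, _manifest and shared/ tokens come from the layout described above.

```python
# Hypothetical labels for the directory tokens described above.
DOMAINS = {
    "a": "apk",
    "f": "files",
    "db": "databases",
    "sp": "shared_prefs",
}


def classify(entry_path):
    """Split a tar entry path into (package, kind, relative path)."""
    parts = entry_path.split("/")
    if parts[0] == "shared":
        # External storage files, present only with the -shared option.
        return (None, "external_storage", "/".join(parts[1:]))
    if parts[0] != "apps" or len(parts) < 3:
        raise ValueError("unexpected entry: " + entry_path)
    package, token = parts[1], parts[2]
    if token == "_manifest":
        return (package, "manifest", "")
    # Tokens not covered by the description fall back to "other".
    return (package, DOMAINS.get(token, "other"), "/".join(parts[3:]))
```

Feeding it the names from tarfile.TarFile.getnames() gives a quick inventory of what a backup contains per package.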
When the backup is encrypted, the keys are set up as follows:
1. An AES 256 key is derived from the backup encryption password using 10000 rounds of PBKDF2 with a randomly generated 512 bit salt.
2. An AES 256 master key is randomly generated.
3. A master key 'checksum' is generated by running the master key through 10000 rounds of PBKDF2 with a new randomly generated 512 bit salt.
4. A random backup encryption IV is generated.
5. The IV, master key, and checksum are concatenated and encrypted with the key derived in step 1. The resulting blob is saved in the header as a hex string.
6. The actual backup data is encrypted with the master key and appended to the end of the file.
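The key-generation part of these steps can be sketched with the Python standard library alone. Several things here are assumptions rather than facts taken from the description above: PBKDF2 is assumed to use HMAC-SHA1, the password is assumed to be encoded as UTF-8, the IV is assumed to be 16 bytes (one AES block), and the checksum is computed over the raw master key bytes, which may not match exactly how Android converts the key before hashing. The names derive_key and new_key_material are hypothetical. The two AES encryption steps are left out because the standard library has no AES implementation.

```python
import hashlib
import os

PBKDF2_ROUNDS = 10000
SALT_BYTES = 64  # 512-bit salts
KEY_BYTES = 32   # AES-256 key size
IV_BYTES = 16    # assumed: one AES block


def derive_key(secret, salt, rounds=PBKDF2_ROUNDS, length=KEY_BYTES):
    """Run PBKDF2 over `secret`; HMAC-SHA1 is an assumption (see lead-in)."""
    if isinstance(secret, str):
        secret = secret.encode("utf-8")
    return hashlib.pbkdf2_hmac("sha1", secret, salt, rounds, length)


def new_key_material(password):
    """Generate the salts, keys, checksum and IV for a new encrypted backup."""
    user_salt = os.urandom(SALT_BYTES)
    user_key = derive_key(password, user_salt)        # key derived from the password
    master_key = os.urandom(KEY_BYTES)                # random master key
    checksum_salt = os.urandom(SALT_BYTES)
    checksum = derive_key(master_key, checksum_salt)  # simplified checksum
    backup_iv = os.urandom(IV_BYTES)                  # random backup IV
    return {
        "user_salt": user_salt,
        "user_key": user_key,
        "master_key": master_key,
        "checksum_salt": checksum_salt,
        "checksum": checksum,
        "backup_iv": backup_iv,
    }
```

When unpacking, the same derive_key call with the salt and round count read from the header should reproduce the user key, which is then used to decrypt the master key blob.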
Sample pack/unpack implementation (produces and consumes tar archives): https://github.com/nelenkov/android-backup-extractor
Some more details here: http://nelenkov.blogspot.com/2012/06/unpacking-android-backups.html
Perl scripts for packing/unpacking and fixing broken archives: