Ice Cream Sandwich: why native code support sucks

The Android ICS native library loading code is broken, so we rewrote it.

Ask any mobile developer: the major disadvantage of developing on Android is the number of different devices you have to support. You have to handle different screen sizes, different hardware, and a large range of Android versions evolving so fast that ensuring backward compatibility is close to impossible. And as if that were not enough, you sometimes have to deal with bugs… in Android itself. The latest release, Ice Cream Sandwich (ICS), is no exception to this rule, and did not fail to surprise us with a quite tricky and unpleasant bug.

ICS

Before getting to the bug itself, let us give a little context on how it happened. We designed the Moodstocks SDK as a C library and ported it to iOS, then to Android thanks to the Android Native Development Kit (NDK). For performance reasons, we decided to support only ARMv7 devices (non-ARMv7 devices being mainly older devices), and cross-compiled our library so that the non-ARMv7 version would only return errors to inform users that their device was not compatible.

Bug

Everything went perfectly well until that terrible day in March 2012, when one of our developers informed us that his brand new Samsung Galaxy Nexus running ICS told him that it was not compatible with our SDK. Even worse: after a few tests, he determined that it worked like a charm if he did not bundle too many resources with his app (animation XMLs, graphical resources, etc.), and suddenly decided to become incompatible if he bundled more than 8 resources. Yes, you heard me right: apparently, one of the latest and most powerful smartphones available, running the latest version of Android on its dual-core 1.2GHz ARMv7 processor, chose the most absurd and unrelated reason to start thinking of itself as a 3-year-old ARMv6 device.

Let’s first give a little example of what theoretically happens when you cross-compile a native library for both these architectures and run it on an Android device. Using the NDK, let’s build a simple C library containing this function, cross-compiled for ARMv7 and non-ARMv7 architectures:

  • C
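#include <jni.h>

/* Returns 1 when built for ARMv7, 0 otherwise. */
jint
Java_com_my_namespace_MyClass_MyNativeFunction
(JNIEnv *env, jobject obj) {
/* __ARM_ARCH_7A__ is predefined by the compiler for armeabi-v7a builds */
#ifdef __ARM_ARCH_7A__
  return 1;
#else
  return 0;
#endif
}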

This will result in two files: /lib/armeabi-v7a/libfoo.so compiled for ARMv7 devices, and /lib/armeabi/libfoo.so for non-ARMv7 devices. Let’s write the corresponding Java class:

  • Java
package com.my.namespace;
public class MyClass {
  static {
    System.loadLibrary("foo");
  }
  public native int MyNativeFunction();
}

When you use this class from Java, the System.loadLibrary call checks your device and decides which of the two libraries to load and use at runtime. As expected, MyNativeFunction then returns 1 on any ARMv7 device, and 0 otherwise.
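To make the expected behaviour concrete, here is a minimal Python sketch of the selection described above. It is a simplified model and not the actual Android implementation: the function name pick_library and the idea of an ordered list of supported ABIs are assumptions made for illustration only.

```python
# Simplified model of ABI selection; NOT the real Android loader code.
# pick_library and supported_abis are hypothetical names used for illustration.
def pick_library(apk_entries, supported_abis, lib_name):
    """Return the path of lib<lib_name>.so for the most preferred ABI present."""
    for abi in supported_abis:  # ordered from most to least preferred
        candidate = "lib/{}/lib{}.so".format(abi, lib_name)
        if candidate in apk_entries:
            return candidate
    return None


entries = ["lib/armeabi/libfoo.so", "lib/armeabi-v7a/libfoo.so"]

# An ARMv7 device supports both ABIs and prefers armeabi-v7a.
print(pick_library(entries, ["armeabi-v7a", "armeabi"], "foo"))
# An older ARM device only supports armeabi.
print(pick_library(entries, ["armeabi"], "foo"))
```

The point of the model is that the result depends only on the device's ABI preference, never on the order in which the entries happen to be stored in the apk.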

So much for theory. In practice, as explained in this thread, ICS developers accidentally let this functionality go rogue on Android 4.0.1 to 4.0.3: when crawling the application’s apk file looking for the right version of the library to use, ICS “forgets” that it found an ARMv7 version and chooses the non-ARMv7 version instead! Luckily for us, they provide this quite ugly but useful tip:

“ensure that the armeabi-v7a [i.e. ARMv7] binaries are packaged after the armeabi [i.e. non-ARMv7] ones in the final .apk. This is not trivial, but one way to do it is remove the armeabi-v7a files from the package, then add them back, manually.”

OK, it looks quite annoying, but at least it is consistent with the fact that adding resources could mess with our SDK, as adding files to an archive does not necessarily preserve the order of the files. We thus started testing this workaround: after all, the Android SDK contains a small tool called aapt made especially to manipulate apk files. Let’s try what is suggested:

$ aapt list MyApp.apk # shows the content of the apk, shortened here to our libs only:
> lib/armeabi/libfoo.so
> lib/armeabi-v7a/libfoo.so

$ jar xf MyApp.apk # extract
$ aapt remove MyApp.apk lib/armeabi-v7a/libfoo.so # remove the ARMv7 lib
$ aapt add MyApp.apk lib/armeabi-v7a/libfoo.so # put it back

$ aapt list MyApp.apk # check result
> lib/armeabi/libfoo.so
> libfoo.so

See the problem here? The file was added back, but not within the right folder. Let’s have a look at the interesting lines in aapt’s documentation:

aapt a[dd] [-v] file.{zip,jar,apk} file1 [file2 ...]
  Add specified files to Zip-compatible archive.
  [...]
  -k junk path of file(s) added

As you can see, we did not use the -k option, yet the path was junked anyway. Believe it or not, the tool provided by Android is buggy too! Never mind… after all, an apk file is nothing more than a zip file, so why not use the usual zip tool? I’ll spare you the details, but after applying the same method using zip and checking the order of the files with the following small Python script, we realized that the suggested workaround simply did not work.

  • Python
from zipfile import ZipFile, ZIP_DEFLATED, ZIP_STORED
from sys import argv

# List the apk entries in archive order, with their compression type.
with ZipFile(argv[1], "r") as apk:
  for entry in apk.infolist():
    if entry.compress_type == ZIP_DEFLATED:
      print(entry.filename + " deflated")
    elif entry.compress_type == ZIP_STORED:
      print(entry.filename + " stored")
    else:
      print(entry.filename + " WTF")
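For reference, the reordering that the quoted tip asks for can also be sketched directly with Python’s zipfile module. This is only an illustration of the idea, not the exact procedure we ran: the helper names reorder_v7a_last and entry_order are hypothetical, and the sketch ignores anything an apk may need after being rewritten.

```python
from zipfile import ZipFile

V7A_PREFIX = "lib/armeabi-v7a/"


def reorder_v7a_last(src_path, dst_path):
    """Copy src_path to dst_path, writing armeabi-v7a entries after all others."""
    with ZipFile(src_path, "r") as src, ZipFile(dst_path, "w") as dst:
        # sorted() is stable: non-v7a entries (key False) come first, v7a
        # entries (key True) last, each group keeping its original order.
        ordered = sorted(src.infolist(),
                         key=lambda info: info.filename.startswith(V7A_PREFIX))
        for info in ordered:
            # Passing the ZipInfo keeps the entry name and compression type.
            dst.writestr(info, src.read(info))


def entry_order(path):
    """Return the entry names of the archive, in stored order."""
    with ZipFile(path, "r") as archive:
        return archive.namelist()
```

Calling reorder_v7a_last("MyApp.apk", "MyApp-reordered.apk") and then entry_order("MyApp-reordered.apk") is one way to confirm that the armeabi-v7a entries ended up last.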

Working around the workaround

At this point, we had spent hours trying to work around a bug in Android, using a buggy Android tool, with a method that did not work. But all hope was not lost. After hiding under my desk to cry for a good 10 minutes (make that: after a good coffee), we decided to try a more brutal method, and I don’t mean smashing the phone with a sledgehammer, however tempting it seemed at the moment.

So, as the problem comes from the fact that this System.loadLibrary function cannot be trusted to choose between the ARMv7 and non-ARMv7 libraries, we’ll simply do it ourselves. The problem divides into two main parts:

  • System.loadLibrary can’t be trusted to choose between two libraries with the same name, even if they are placed in explicitly named directories. So we renamed them to /lib/armeabi/libfoo-core.so and /lib/armeabi-v7a/libfoo-core-v7a.so; that way, we can simply call System.loadLibrary("foo-core-v7a") on an ARMv7 architecture, and System.loadLibrary("foo-core") otherwise.

  • There is no direct way in Java to know if the device you’re using is ARMv7 or not: we had to create another native library in charge of this choice, cross-compiled for ARMv7 and non-ARMv7 architectures, and simply named libfoo.so. A good piece of code being worth a thousand words, here is an example of source code for this libfoo.so:

  • c
#include <stdint.h>
#include <jni.h>
#include <cpu-features.h>

/* Returns JNI_TRUE only on an ARM CPU that reports both ARMv7 and NEON. */
jboolean
Java_com_my_namespace_MyClass_isARMv7
(JNIEnv *env, jclass clazz) {
  uint64_t features = android_getCpuFeatures();
  if ((android_getCpuFamily() != ANDROID_CPU_FAMILY_ARM) ||
      ((features & ANDROID_CPU_ARM_FEATURE_ARMv7) == 0) ||
      ((features & ANDROID_CPU_ARM_FEATURE_NEON) == 0)) {
    return JNI_FALSE;
  }
  else {
    return JNI_TRUE;
  }
}

And the corresponding Java class simply becomes:

  • Java
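package com.my.namespace;
public class MyClass {

  static {
    // Load the small detection library first: it provides isARMv7().
    System.loadLibrary("foo");
    if (isARMv7()) {
      System.loadLibrary("foo-core-v7a");
    }
    else {
      System.loadLibrary("foo-core");
    }
  }

  // Must be static: it is called from the static initializer above,
  // and the native implementation takes a jclass.
  public static native boolean isARMv7();
  public native int MyNativeFunction();
}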

This way, the first loaded library makes a native isARMv7() function available, and this function is used to decide which core library must be loaded immediately after.
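Since the whole trick relies on the core libraries never sharing a file name across ABI directories, a small packaging check can catch a regression before it ships. The following Python sketch is only a suggestion: the function name shared_library_names and its allowed parameter are hypothetical, and it assumes that the detection library libfoo.so may safely exist in both directories because either build answers the same runtime question.

```python
from collections import defaultdict


def shared_library_names(entries, allowed=("libfoo.so",)):
    """Return .so file names found under more than one lib/<abi>/ directory."""
    abis_by_name = defaultdict(set)
    for entry in entries:
        parts = entry.split("/")
        # Only consider entries shaped like lib/<abi>/<name>.so
        if len(parts) == 3 and parts[0] == "lib" and parts[2].endswith(".so"):
            abis_by_name[parts[2]].add(parts[1])
    return sorted(name for name, abis in abis_by_name.items()
                  if len(abis) > 1 and name not in allowed)


good = ["lib/armeabi/libfoo.so", "lib/armeabi-v7a/libfoo.so",
        "lib/armeabi/libfoo-core.so", "lib/armeabi-v7a/libfoo-core-v7a.so"]
bad = ["lib/armeabi/libfoo-core.so", "lib/armeabi-v7a/libfoo-core.so"]

print(shared_library_names(good))  # []
print(shared_library_names(bad))   # ['libfoo-core.so']
```

To run it against a real package, the entry list can be obtained with ZipFile("MyApp.apk").namelist() from the zipfile module.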

In the end, this workaround happens to be far more reliable and viable in the long term. Even though Google promises that the bug will be fixed in the next release of ICS, experience shows that many users don’t update their devices, or update them months after the release, which will lead to thousands and thousands of “corrupted” devices that, despite this bug, we want to support. This trick has the advantage of being far more practical and dependable than tinkering with our apk files hoping that it magically fixes a bug, and we’ll be able to keep it in place for a long time without concern.

  • Maxime Brénon