This is a multi-part blog series about reverse engineering, a fundamental building block in every hacker’s toolchain for compromising mobile applications. Throughout this series, I’ll explain the major types of reverse engineering (static analysis and dynamic analysis), why hackers do it, what information they learn from it, and how they use that information to attack your app, your users, and your business. I’ll also explain how attackers combine reversing methods to break a mobile app’s defenses and understand how your code and application work. I’ll cover some of the most widely used tools and methods for reversing apps and provide examples of real attack scenarios. I’ll also cover the options developers have to protect against reversing and explain the pros and cons of each.
And finally, I’ll explain how you can use automation to deliver more comprehensive protection against reverse engineering than you do today, but with a lot less work. First, let’s get a few definitions straight:
What is Reverse Engineering?
Reverse engineering is the process of breaking something down to understand how it works. In the context of mobile apps, reverse engineering (also known as reversing) involves deconstructing, analyzing, and observing compiled apps to understand the app’s code, logic, and underlying function. The two main types of reverse engineering are Static Analysis and Dynamic Analysis. With static analysis, attackers analyze the app’s code and control flows (app logic) to understand what the code does and how the app works, without actually running the app.
Dynamic analysis is the evaluation of a mobile application while it runs, which involves executing and interacting with the app to understand or change its behavior.
In this first post of the series, I will focus on Static Analysis. I’ll cover dynamic analysis in an upcoming blog in this series. You can skip to our resources on Dynamic Analysis if you want to start there.
Who Performs Static Analysis on Mobile Apps and Why?
You can learn a lot about an application just by analyzing its code, and this can be done with both good and bad intentions. For example, penetration testers use static analysis to test the app’s security model, uncover weaknesses in the app’s defenses, and even disable security protections (such as anti-tampering or jailbreak/root detection). Black-hat hackers use similar methods to understand the source code, except their intentions are always malicious. Their goal is to use reverse engineering to understand what, where, and how to attack your app. In some cases, they may be after sensitive data about the app or users, which developers often store in clear text in the app’s code (inside strings, preference files, resource folders, etc.). They also use reversing to reveal the application’s logic, find critical files/data, analyze and change workflows, or find and exploit vulnerable code or 3rd party libraries used in the application, among other things.
Reversing also helps hackers increase the scope or impact of their attacks and increase their likelihood of success. Hackers often accumulate a large amount of detailed information about the app and its users, which they use later in downstream attacks or to create specially crafted malware. Generally speaking, the more an attacker knows about your app, the more effective their attacks will be.
How Static Analysis Can Be Used Maliciously
Below are some of the ways cybercriminals use the information that they learn from static analysis. There are far more aspects of reverse engineering than can be covered here, so this is not intended to be an exhaustive list.
- To learn information about how the app connects to its backend, which can be used to perform attacks against the app’s servers (eg: credential stuffing, botnet attacks, account takeovers, DoS attacks, etc)
- To find weaknesses in the app’s encryption model, such as weak ciphers or encryption keys stored insecurely in the app. These can be used to crack the app’s encryption or to find valuable data stored inside the app insecurely.
- To harvest or steal data about users by searching for hard-coded information in the app, or data about users that is stored in plain-text in strings, app preferences, user defaults, or other resource or assets folders that the app needs in order to run.
- To understand the app’s permissions, intents, and other components which govern how the app shares data with other apps and interacts with the operating system. These can allow external applications to access data in your app, which can potentially leak user data to other applications. Attackers can use this knowledge to craft attacks that abuse legitimate functionality, or to create malware that masquerades as a legitimate app and tricks users into granting excessive permissions or revealing sensitive information (eg: screen overlay attacks, StrandHogg).
- To gain intelligence for subsequent code modification (such as bypassing or disabling security controls or modifying business logic or workflows).
- To steal intellectual property or create fake versions, clones, or mods of apps
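To make the first item in the list above more concrete, here is a minimal Python sketch of how backend endpoints can be pulled out of an unprotected app file simply by searching its raw bytes for URLs. The function name `find_urls`, the regular expression, and the sample bytes are all assumptions made for illustration; real tools are considerably more thorough.

```python
import re

# Byte-level pattern for http(s) URLs; the character class covers the
# printable ASCII characters that commonly appear in URLs.
URL_PATTERN = re.compile(rb"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+")


def find_urls(blob: bytes) -> list[str]:
    """Return the unique URLs found in raw bytes, in first-seen order."""
    unique = {}
    for match in URL_PATTERN.finditer(blob):
        unique.setdefault(match.group().decode("ascii"), None)
    return list(unique)


# Hypothetical stand-in for the bytes of an unprotected app file.
sample = (
    b"\x00\x12https://api.example.com/v1/login\x00\x07"
    b"http://cdn.example.com/config.json\x00"
    b"https://api.example.com/v1/login\x00"
)
urls = find_urls(sample)
print(urls)
```

The same kind of search works for any recognizable pattern, which is why anything left in plain text inside an app should be assumed readable.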
How Attackers Conduct Static Analysis
So now that we know some of the things that can be accomplished via static analysis, let’s explore how hackers go about doing it. I’ll cover some of the basics of static analysis using examples. In this first post, the examples focus on Android apps; I’ll cover iOS in a separate post in this series. To begin static analysis, an attacker usually starts by downloading Android or iOS apps from an app store, where the apps are packaged in binary format (.apk and .ipa files). An APK or IPA is really just a ZIP archive that contains the app, its code, and all the resources the app needs to run. To view the contents of an app binary, all you have to do is extract (unzip) the file. For Android apps, you can simply change the file extension to .zip, extract the archive, and view the contents. If the app is only superficially obfuscated or doesn’t make wide use of encryption, then the hacker’s job is going to be quite easy, because they’ll be able to simply read the code or use a disassembler or decompiler to recover and analyze it.
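Because an APK is a ZIP archive, its contents can be listed with nothing more than a standard library. The Python sketch below shows the idea. The helper name `list_apk_entries` is made up for this example, and the tiny in-memory archive is a hypothetical stand-in for a real APK so that the snippet runs on its own.

```python
import io
import zipfile


def list_apk_entries(apk):
    """Return (name, uncompressed size) for every file in the archive.

    `apk` can be a path to an .apk file or a file-like object.
    """
    with zipfile.ZipFile(apk) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Hypothetical in-memory stand-in for a real APK, with placeholder contents.
demo_apk = io.BytesIO()
with zipfile.ZipFile(demo_apk, "w") as archive:
    archive.writestr("AndroidManifest.xml", b"\x03\x00\x08\x00")
    archive.writestr("classes.dex", b"dex\n035\x00")

entries = list_apk_entries(demo_apk)
for name, size in entries:
    print(f"{size:>8}  {name}")
```

With a real app, you would pass the path to the downloaded .apk file instead of the in-memory archive.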
Because compiled binary code is not readable by humans, hackers will usually disassemble or decompile the mobile app to recreate a close approximation of the source code the developer wrote, and then analyze it.
Popular tools used for such purposes are baksmali, Apktool, JD-GUI, jadx, IDA Pro, Hopper and a plethora of others.
Neither exactness nor completeness is required for an attacker to reach their goal of understanding the source code. Depending on the hacker’s goal, they can do plenty of damage just by understanding specific blocks of code.
So how does a hacker find the source code in the app? There are many ways to do it, and none of them are very difficult to do. Why? Well, all mobile applications are built and packaged according to a defined and well-understood structure.
Where is Source Code Located in iOS and Android apps?
In native iOS apps, the code is part of the application executable (C/C++/Objective-C/Swift). In native Android apps, code can be found in DEX files (Java/Kotlin) and/or in native libraries (C or C++). With native iOS and Android apps, source code is converted into binary format when the app is compiled.
In .NET MAUI and Xamarin apps, which are written in C# (“C Sharp”), the code is located in DLL files (dynamic link libraries). In non-native apps (hybrid apps) such as Cordova and React Native apps, source code comes in the form of JavaScript, CSS, or HTML files, and is not compiled into binary format the same way as with native apps. Instead, these files are usually packaged as-is in a folder inside the app when it is built. So if the app is not obfuscated, reading the source code of a non-native app does not always require decompiling. One only needs to extract the contents (ie: change the file extension to .zip, extract the archive, and open the files).
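Since each framework puts its code in a predictable place, a quick look at the file names inside the archive is often enough to guess how an app was built. The Python sketch below is a rough heuristic under stated assumptions: the function name `guess_frameworks` is invented for this example, and the path conventions it checks (`assets/index.android.bundle`, `assets/www/`, `assemblies/*.dll`) are common defaults that individual apps and build versions may not follow.

```python
def guess_frameworks(names):
    """Guess where an APK keeps its code, based only on archive entry names."""
    hints = set()
    for name in names:
        if name.startswith("classes") and name.endswith(".dex"):
            hints.add("DEX (Java/Kotlin)")
        elif name.startswith("lib/") and name.endswith(".so"):
            hints.add("Native library (C/C++)")
        elif name == "assets/index.android.bundle":
            # Default bundle name used by React Native (assumed convention).
            hints.add("React Native")
        elif name.startswith("assets/www/"):
            # Default web content folder used by Cordova (assumed convention).
            hints.add("Cordova")
        elif name.startswith("assemblies/") and name.endswith(".dll"):
            hints.add("Xamarin/.NET MAUI")
    return sorted(hints)


# Hypothetical entry names, as they might appear in a React Native app.
demo_names = [
    "AndroidManifest.xml",
    "classes.dex",
    "classes2.dex",
    "assets/index.android.bundle",
    "lib/arm64-v8a/libhermes.so",
]
frameworks = guess_frameworks(demo_names)
print(frameworks)
```

The result tells the attacker which tools to reach for next: a DEX decompiler, a native disassembler, a .NET decompiler, or simply a text editor.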
Main Areas Hackers Target For Static Code Analysis In Android Apps
Below are some of the main areas that hackers focus on when performing static analysis of Android apps.
- Android Manifest: (eg: AndroidManifest.xml) contains the app’s metadata, configuration and components. When reversing an .apk, the manifest is a good starting point for the hacker to determine the best ‘entry point’ (ie: which classes are the most attractive targets to analyze first). You can also learn about the app’s activities, services, permissions, intents, broadcast receivers and more.
- Classes: The DEX files (eg: classes.dex) contain the app’s compiled Java/Kotlin classes (Dalvik bytecode), which are ultimately executed by the Android Runtime. Every APK that contains Java or Kotlin code has at least one classes.dex file, and larger apps may ship several (classes2.dex, classes3.dex, and so on). Together they define and reference the classes and methods used within the app.
- Strings: (eg: Java strings, or strings.xml) contains string resources of the Android application, which are often used to store and share text-based information such as usernames/passwords, authentication info, API keys, and more. In many apps, this information is stored in plain-text, and is a favorite target of hackers of all types due to the wealth of valuable information that can be found here.
- Preferences: (eg: SharedPreferences) – contains persistent information about the user and app, likes, and preferences. For example, some apps may store credit card numbers, personally identifiable information (PII), local currency, location information, etc. in preferences.
- Assets: contains images, videos, and other interactive content. With non-native apps and frameworks, the assets folder can include source code files and other data. For example, Cordova or React Native applications will store JavaScript code in the assets folder. Xamarin and .NET MAUI applications may store DLL files in the “assemblies” folder (which serves a similar purpose as the manifest).
- Resources: contains layouts, images, launcher icon directories, and other pre-compiled resources (strings, colors, styles, etc.)
- lib/: native libraries (C or C++) used in the application.
- META-INF: APK metadata and signature information. Attackers strip and replace this signature information when they re-sign, repackage and redistribute clones or pirated apps.
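As an illustration of how much a single file gives away, the Python sketch below reads a few counters from the fixed 112-byte header at the start of a DEX file: how many strings, types, methods, and classes it defines. The field offsets follow the published DEX format and assume the standard little-endian layout. The function name `read_dex_header` and the synthetic header values are assumptions made for this example.

```python
import struct

# Offsets of selected little-endian uint32 fields in the 112-byte DEX header.
HEADER_FIELDS = {
    "file_size": 32,
    "string_ids_size": 56,
    "type_ids_size": 64,
    "method_ids_size": 88,
    "class_defs_size": 96,
}


def read_dex_header(dex: bytes) -> dict:
    """Return a few size counters from the header of a DEX file."""
    if len(dex) < 112 or not dex.startswith(b"dex\n"):
        raise ValueError("not a DEX file")
    return {
        name: struct.unpack_from("<I", dex, offset)[0]
        for name, offset in HEADER_FIELDS.items()
    }


# Synthetic header with hypothetical values, so the sketch runs without an app.
fake_dex = bytearray(112)
fake_dex[0:8] = b"dex\n035\x00"               # magic and format version
struct.pack_into("<I", fake_dex, 32, 112)     # file_size
struct.pack_into("<I", fake_dex, 56, 1500)    # string_ids_size
struct.pack_into("<I", fake_dex, 64, 300)     # type_ids_size
struct.pack_into("<I", fake_dex, 88, 2200)    # method_ids_size
struct.pack_into("<I", fake_dex, 96, 180)     # class_defs_size

summary = read_dex_header(bytes(fake_dex))
print(summary)
```

On a real app, you would pass in the bytes of classes.dex read from the extracted archive. Decompilers such as jadx go much further and rebuild readable code from the same file.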
How To Protect Against Static Code Analysis
Code obfuscation is one of the primary methods of preventing static code analysis. Obfuscation is the practice of obscuring source code and application logic to prevent attackers from understanding the meaning, intent or function of your source code, including how it executes instructions or logic – all without changing the underlying function or behavior of the app. Obfuscation is one of the most important ‘first lines of defense’ against malicious reversing.
There are multiple techniques to obfuscate code, and a comprehensive solution requires implementing several techniques that complement and reinforce each other. The reason you should layer in multiple techniques is to prevent attackers from being able to circumvent any single protection easily by working their way around the obfuscation.
A robust obfuscation solution will make it easy to implement name obfuscation (renaming methods, objects, variables, classes, libraries), string and resource encryption, and encryption of app preferences and user defaults. Another important protection for Android apps is code packing (obfuscating and/or encrypting DEX files, ie: the compiled Java classes). It’s important to obfuscate not only the code and libraries but also the application’s control flows (app logic), using techniques such as dummy code insertion, replacing function call targets, inserting arbitrary paths into the flow, or making the function tree appear broken to attackers by hiding the original code path or targets in an encrypted location. And finally, it’s always recommended to strip debug information, so that hackers can’t sift through stack traces and easily understand the application’s behavior.
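To show why string encryption matters, here is a deliberately simplified Python illustration. It is a toy, not how Appdome or any real obfuscation product works: a repeating XOR with a key stored next to the data is trivial to undo and must not be used as real protection. The names `xor_bytes`, `SECRET`, and `KEY` and their values are hypothetical. The only point is that once a string is no longer stored in plain text, a naive search of the binary stops finding it, while the app can still recover the value at runtime.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key (toy example only)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical values for illustration; not a real credential or key.
SECRET = b"api_key=EXAMPLE-NOT-REAL"
KEY = b"\x5a\xc3\x19"

protected = xor_bytes(SECRET, KEY)     # what would be shipped in the binary
recovered = xor_bytes(protected, KEY)  # what the app computes at runtime

print(b"api_key" in SECRET)     # a plain-text search finds the original
print(b"api_key" in protected)  # the same search misses the protected form
print(recovered == SECRET)      # the app still gets the original value
```

This is also why layering matters: if the decoding routine and key are easy to find, an attacker simply runs them, so string protection needs to be combined with the other techniques described above.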
With Appdome you can implement one or more of the above techniques together in order to reinforce and strengthen the overall defense through a multi-layered approach. For example, many of our customers combine native code obfuscation with control flow relocation to protect the code and libraries as well as the application’s logical flows. If your app has non-native code or libraries, you can add Non-native code obfuscation. And it’s always recommended to Strip Debug Info to remove symbols and all descriptive information from the application’s binaries, such as identifiers (variable and function names) and source file names and line numbers.
And with Appdome, you can implement a comprehensive defense in minutes to protect against both static and dynamic analysis, all without any manual coding, without symbolicating your code, without using a specialized compiler, and without an SDK. This is how Appdome customers are able to obfuscate and protect a large majority of their code instantly, without any of the complexity, inflexibility or arduous line-by-line coding of traditional obfuscation solutions, which are bound to a specific coding language.
Further, Appdome covers all frameworks, all programming languages (Java, Kotlin, Swift, C++, Objective-C), all OS platforms, and all non-native or hybrid frameworks (such as React Native, Cordova, Flutter, Maui, Xamarin, Ionic, or any of the low-code development frameworks in the market today), all with a single solution.
The complexity of traditional obfuscation solutions is explained in this leading research paper, which studied millions of Android apps to understand why such a large majority of apps either contained no obfuscation or superficial obfuscation (class name obfuscation only). The simple reason: complexity.
The antidote: Simplicity. Try Appdome and see for yourself.