When we use popular apps with good ratings from official app stores we assume they are safe. This is partially true – usually these apps have been developed with security in mind and have been reviewed by the app store's security team. However, we found that because of third-party SDKs many popular apps are exposing user data to the internet, with advertising SDKs usually to blame. They collect user data so they can show relevant ads, but often fail to protect that data when sending it to their servers.
During our research into dating app security, we found that some analyzed apps were transmitting unencrypted user data through HTTP. It was unexpected behavior because these apps were using HTTPS to communicate with their servers. But among the HTTPS requests there were unencrypted HTTP requests to third-party servers. These apps are pretty popular, so we decided to take a closer look at these requests.
HTTP request with unencrypted user data
One of the apps was making POST requests to the api.quantumgraph[.]com server. By doing so it was sending an unencrypted JSON file to a server that is not related to the app developers. In this JSON file we found lots of user data, including device information, date of birth, user name and GPS coordinates. Furthermore, this JSON contained detailed information about app usage that included information about profiles liked by the user. All this data was sent unencrypted to the third-party server and the sheer volume makes it really scary. This is due to the use of a qgraph analytics module.
Unencrypted user data sent by app
Two other dating apps from our research were basically doing the same. They were using HTTPS to communicate with their servers, but at the same time there were HTTP requests with unencrypted user data being sent to a third-party server. This time it was another server belonging not to an analytics company but to an advertising network used by both dating apps. Another difference was GET HTTP requests with user data being used as parameters in a URL. But in general these apps were doing the same thing – transmitting unencrypted user data to third-party servers.
List of HTTP requests from advertising SDK
At this point it already looked bad, so I decided to check my own device, collecting network activity for one hour. It turned out to be enough to identify unencrypted requests with my own data. And again the cause of these requests was a third-party SDK used by a popular app. It was transmitting my location, device information and token for push messages.
HTTP request from my device with my own unencrypted data
So I decided to take a look at those dating apps with the leaking SDKs to find out why it was happening. It came as no surprise that they were used by more than one third party in these apps – in fact, every app contained at least 40 different modules. They make up a huge part of these apps – at least 75% of the Dalvik bytecode was in third-party modules; in one app the proportion of third-party code was as high as 90%.
List of modules from analyzed dating apps
Developers often use third-party code to save time and make use of existing functionality. This makes perfect sense and allows developers to focus on their own ideas instead of working on something that has already been developed many times before. However, this means developers are unlikely to know all the details of the third-party code used and it may contain security issues. That's what happened with the apps from our research.
Knowing that there are popular SDKs exposing user data and that almost every app uses several SDKs, we decided to search for more of these apps and SDKs. To do so we used network traffic dumps from our internal Android sandbox. Since 2014 we have collected network activities from more than 13 million APKs. The idea is simple – we install and launch an app and imitate user activity. During app execution we collect logs and network traffic. There is no real user data, but to the app it looks like a real device with a real user.
We searched for the two most popular HTTP requests – GET and POST. In GET requests user data is usually part of the URL parameters, while in POST requests user data is in the Content field of the request, not the URL. In our research, we looked for apps transmitting unencrypted user data using at least one of these requests, though many were exposing user data in both requests.
We were able to identify more than 4 million APKs exposing some data to the internet. Some of them were doing it because their developers had made a mistake, but most of the popular apps were exposing user data because of third-party SDKs. For each type of request (GET or POST) we extracted the domains where apps were transmitting user data. Then we sorted these domains by app popularity – how many users had these apps installed. That's how we identified the most popular SDKs leaking user data. Most of them were exposing device information, but some were transmitting more sensitive information like GPS coordinates or personal information.
This domain is part of a popular advertising network. It was used by the two dating apps mentioned at the beginning of this article. We found many more popular apps with this SDK – at least five of them have more than 100 million installations according to Google Play Store and many others with millions of installations.
It transmits the following data in unencrypted form:
HTTP request with user data in URL
Key words are the most interesting part of the transmitted data. They can vary depending on app parameter settings. In our data there was usually some personal information like name, date of birth and gender. Location needs to be set by an app too – and usually apps provide GPS coordinates to the advertising SDK.
We found several different versions of this SDK. The most common version was able to use HTTPS instead of HTTP. But it needs to be set by the app developers and according to our findings they mostly didn't bother, leaving the default value HTTP.
Advertising SDK using HTTP by default
This domain is also part of a popular advertising network. We found two apps with more than 500 million installations, seven apps with more than 100 million installations and many others with millions of installations.
It transmits the following data:
We should mention that while most of this data was transmitted in plain text as URL parameters, the coordinates, IMEI and MAC address were encoded with Base64. We can't say they were protected, but at least they weren't in plain text. We were unable to find any versions of this SDK where it's possible to use HTTPS – all versions had HTTP URLs hardcoded.
Advertising SDK collects device location
Another popular advertising SDK that collects the same data as the others:
We found seven apps with more than 10 million installations from Google Play Store and many other apps with fewer installations. We were unable to find any way for the developers to switch from HTTP to HTTPS in this SDK either.
The fourth advertising SDK is appsgeyser and it differs from the others in that it is actually a platform to build an app. It allows people who don't want to develop an app to simply create one. And that app will have an advertising SDK in it that uses user data in HTTP requests. So, these apps are actually developed by this service and not by developers.
They transmit the following data:
We found a huge amount of apps that have been created with this platform and are using this advertising SDK, but most of them are not very popular. The most popular have just tens of thousands of installations. However, there really are lots of these apps.
Screenshot of appsgeyser.com
According to the appsgeyser.com web page there are more than 6 million apps with almost 2 billion installations between them. And they showed almost 200 billion ads – probably all via HTTP.
All apps posting unencrypted data to this server were created by the same company, so it isn't because of third-party code. But these apps are very popular – one of them was installed more than 500 million times from Google Play Store. These apps collect a large amount of device information:
Device information collected by the app
This unencrypted data is then sent to the server. Furthermore, among the data they are uploading is a list of supported commands – one of them is to install an app. The list of commands is transmitted in plain text and the answer from the server is also unencrypted. This means it can be intercepted and modified. What is even worse about this functionality is that the app can silently install a downloaded app. The app just needs to be a system app or have root rights to do so.
Fragment of code related to the silent installation of apps upon command from the server
Here is another example of popular apps leaking user data not because of third-party code but because of a mistake by developers. We found several popular Lenovo apps collecting and transmitting unencrypted device information:
H TTP request with unencrypted device information
This information is not very sensitive. But we found several Lenovo apps that were sending more sensitive data in unencrypted form, such as GPS coordinates, phone number and email.
_App code for the collection of device coordinates and other data _
We reported these issues to Lenovo and they fixed everything.
This domain is used by a very popular advertising SDK. There are tons of apps using it. One of them even has more than 500 million installations and seven other apps have more than 100 million installations. Most of the apps with this SDK are games. There are two interesting things about this SDK – the transmitted data and the protocol used.
This SDK sends the following data to the server:
It also sends information about hardware availability:
Advertising SDK that collects information about device hardware features
It may also send some personal information, such as age, income, education, ethnicity, political view, etc. There's no magic involved – the SDK has no way of finding this information unless the apps that are using this SDK provide it. We have yet to see any app providing these details to the SDK, but we think users should be aware of the risks when entering such details to apps. The information could be passed on to the SDK and the SDK could expose it to the internet.
Advertising SDK could send personal information
The second interesting thing about this SDK is that it uses HTTPS to transmit data, but usually only for the initial communication. After that it may receive new configuration settings from the server that specify an HTTP URL. At least that's what happened on my device and several other times with different apps on our test devices.
_HTTPS URL in advertising SDK _
Another SDK that is leaking data uses the quantumgraph.com domain. This is an analytics SDK, not an advertising one. We found two apps with more than 10 million installations from Google Play Store and another seven apps with more than a million installations. More than 90% of detected users with this SDK were from India.
This SDK posts JSON files with data via HTTP. The data may vary from app to app because it's an analytics SDK and it sends information provided by the app. In most cases, the following items are among the sent data:
List of installed apps were sent in unencrypted form to the server
In the case of the dating app, there were likes, swipes and visited profiles – all user activity.
App usage was sent in unencrypted form to the server
This SDK was using a hardcoded HTTP URL, but after our report they created a version with an HTTPS URL. However, most apps are still using the old HTTP version.
Of course, there are other SDKs using HTTP to transmit user data, but they are less popular and almost identical to those described above. Many of them expose device locations, while some also expose emails and phone numbers.
Phone number and email collected by an app to be sent via HTTP
During our research, we found many apps that were transmitting unencrypted authentication details via HTTP. We were surprised to discover how many apps are still using HTTP to authenticate their services.
Unencrypted request with authentication token
They weren't always transmitting user credentials – sometimes they were credentials for their services (for example databases) too. It makes no sense having credentials for such services because they are exposed to the internet. Such apps usually transmit authentication tokens, but we saw unencrypted logins and passwords too.
Unencrypted request with credentials
Digging into an HTTP request with unencrypted data allowed us to discover a new malicious site. It turns out that many malicious apps use HTTP to transmit user data too. And in the case of malware it is even worse because it can steal more sensitive data like SMSs, call history, contacts, etc. Malicious apps not only steal user data but expose it to the internet making it available for others to exploit and sell.
In this research we analyzed the network activity of more than 13 million APK files in our sandbox. On average, approximately every fourth app with network communications was exposing some user data. The fact that there are some really popular apps transmitting unencrypted user data is significant. According to Kaspersky Lab statistics, on average every user has more than 100 installed apps, including system and preinstalled apps, so we presume most users are affected.
In most cases these apps were exposing:
Some apps expose personal information, mostly the user's name, age and gender, but it can even include the user's income. Their phone number and email address can also be leaked.
Because this data can be intercepted. Anyone can intercept it on an unprotected Wi-Fi connection, the network owner could intercept it, and your network operator could find out everything about you. Data will be transmitted through several network devices and can be read on any of them. Even your home router can be infected – there are many examples of malware infecting home routers.
Without encryption this data is being exposed as plain text and can be simply extracted from the requests. By knowing the IMSI and IMEI anyone can track your data from different sources – you need to change both the SIM card and device at the same time to change them. Armed with these numbers, anyone can collect the rest of your leaking data.
Furthermore, HTTP data can be modified. Someone could change the ads being displayed or, even worse, change the link to an app. Because some advertising networks promote apps and ask users to install them, it could result in malware being downloaded instead of the requested app.
Apps can intercept HTTP traffic and bypass the Android permission system. Android uses permissions to protect users from unexpected app activity. This involves apps declaring what access they will need. Starting from Android 6, all permissions have been divided into two groups – normal and dangerous. If an app needs dangerous permissions, it has to ask the user for permission in runtime, not just before installation. So, in order to get the location, the app will need to ask the user to grant access. And to read the IMEI or IMSI the app will also need to ask the user for access, because this is classified as a dangerous permission.
But an app can add a proxy to Wi-Fi settings and read all the data being transmitted from other apps. To do so it needs to be a system app or be provisioned as the Profile or Device Owner. Or an app can set a VPN service on the device transmitting user data to its server. After that the app can find out the device's location without accessing it just by reading the HTTP requests.
HTTP (blue) and HTTPS (orange) usage in apps since March 2014
Starting from the second half of 2016, more and more apps have been switching from HTTP to HTTPS. So, we are moving in the right direction, but too slowly. As of January 2018, 63% of apps are using HTTPS but most of them are still also using HTTP. Almost 90% of apps are using HTTP. And many of them are transmitting unencrypted sensitive data.