Request for help with reverse-engineering APK #327

Open
sigaloid opened this issue Nov 19, 2024 · 12 comments
Labels
enhancement New feature or request

Comments

@sigaloid
Member

sigaloid commented Nov 19, 2024

Hello all,

I'm working on keeping the requests made by Redlib in line with what the official app sends, in order to avoid detection. I was once able to MITM the app, but I no longer can. I've tried a lot of things, including apk-mitm and APKLab, and hit various problems. The main one: even once I work out the splitting of the APKs and patch out the TLS cert check with apk-mitm, I cannot get the app to launch. It seems they have added more serious anti-debugging that didn't exist a few months ago. I can still MITM on iOS easily (iOS doesn't treat user certs differently from system certs, whereas Android apps don't trust user certs by default unless you patch them to), but it's not the same, since there are likely differences in how the two apps send headers, connect, and so on (I think they already spell headers with different capitalizations!).

What I've tried:

  • apk-mitm
  • android-unpinner
  • manually patching out calls to isAppDebuggable, isDebuggerAttached, isEmulator, isRooted, etc
  • APKLab/apktool disassembly

So, if you can figure out how to take the app's APK, patch out the certificate pinning, then also bypass the anti-debugging check, thus providing an easy way to open the app with user-level TLS certificates trusted, and can detail your process, please reply or email me at re@[my site in profile].

EDIT: got it working (you just need root lol)
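
The header-capitalization point above can be checked mechanically once captures from both platforms are available. A minimal sketch, assuming each capture has been reduced to a list of header names as sent; the sample names below are made up for illustration and are not real captured traffic:

```python
def case_only_differences(headers_a, headers_b):
    """Return (name_a, name_b) pairs for headers present in both captures
    whose names differ only in capitalization."""
    # Index the second capture by lowercased name for case-insensitive lookup.
    by_lower_b = {name.lower(): name for name in headers_b}
    diffs = []
    for name in headers_a:
        other = by_lower_b.get(name.lower())
        if other is not None and other != name:
            diffs.append((name, other))
    return diffs


# Hypothetical header names, for illustration only.
ios_capture = ["User-Agent", "Authorization", "Accept-Language"]
android_capture = ["user-agent", "Authorization", "accept-language"]

for ios_name, android_name in case_only_differences(ios_capture, android_capture):
    print(f"{ios_name!r} (iOS) vs {android_name!r} (Android)")
```

Note that this only makes sense for HTTP/1.1 captures, since HTTP/2 transmits header names in lowercase.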

@sigaloid sigaloid added the enhancement New feature or request label Nov 19, 2024
@jimdrie

jimdrie commented Nov 19, 2024

There is one unofficial Reddit app that still works (without having to login), maybe that can be helpful? It's open source: https://github.com/QuantumBadger/RedReader

I believe that one was still allowed because of its accessibility features.

@r7l

r7l commented Nov 19, 2024

There is also Glance, which has a Reddit feature. I use it as a homepage and have the same list of subs in there as I have configured on Redlib. It works without any issues.

https://github.com/glanceapp/glance

@sigaloid
Member Author

sigaloid commented Nov 19, 2024

As far as I know, both of those rely on low rate limits and the .json routes, which are static and tied to the IP address. Redlib needs a method that is tied to the OAuth token we generate, so that when we inevitably go over the rate limit in 500 seconds, we can just throw the token away and refresh to a new one.
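
The throw-away-and-refresh behaviour described above could be sketched roughly as follows. This is an illustration under assumptions, not Redlib's actual code: `fetch_token` is a hypothetical callable standing in for whatever obtains a fresh OAuth token, and `initial_budget` is a placeholder for the allowance of a new token.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    remaining: float  # last known number of requests left on this token


class DisposableTokenClient:
    """Holds one token and replaces it as soon as its budget is used up."""

    def __init__(self, fetch_token: Callable[[], str], initial_budget: float = 100.0):
        self._fetch_token = fetch_token        # hypothetical token source
        self._initial_budget = initial_budget  # assumed allowance for a fresh token
        self._token: Optional[Token] = None

    def current(self) -> str:
        # Discard an exhausted token instead of waiting for its limit to reset.
        if self._token is None or self._token.remaining <= 0:
            self._token = Token(self._fetch_token(), self._initial_budget)
        return self._token.value

    def record_remaining(self, remaining: float) -> None:
        # Call after each response with the remaining count the server reported.
        if self._token is not None:
            self._token.remaining = remaining
```

Because the limit is tied to the token rather than the IP address, discarding the token is what resets the budget in this model.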

I looked into RedReader, and either I can't extract the OAuth key they distribute or I'm just not looking in the right place. I also don't want to piggyback off theirs: if a small app that relies on Reddit's goodwill suddenly showed the same serious load that Redlib causes now, I worry they'd get kicked off.

I'd prefer to reverse-engineer the actual official Reddit app and match its behavior for the anonymous browsing mode, since it's their own app.

@neckothy

From the OP it sounds like you've only tried MitM with user certs. System certs are "easy" to MitM on Android as well, but require a rooted device or emulator. The easiest way to do so is usually to use HTTP Toolkit. I have a gist on the topic if it's helpful, though it's pretty surface level and aimed mostly at those entirely unfamiliar with the concept.

Apologies if Reddit uses explicit certificate pinning or if this isn't useful for your workflow. I tried looking at the mobile app myself, and it did seem to intercept some requests to Reddit successfully, but it doesn't seem like I can try any important requests before signing up, which I'm not going to do. Best of luck.

@FireMasterK

@sigaloid I just tried and was able to MITM perfectly with moved system certificates using root and HTTP Toolkit on my computer.

What I noticed is the app uses the GQL API through the gql-fed.reddit.com hostname. I also noticed that the rate limit is significantly higher at x-ratelimit-remaining: 1999.0.

Another thought: instead of getting a new token each time we reach the rate limit, could we store a pool of tokens and keep rotating between them as the rate limit gets used? Once all of them are exhausted, we fetch a new token and expand the pool.
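
A minimal sketch of the pool idea, under stated assumptions: `fetch_token` is a hypothetical callable that returns a fresh token, and the remaining budget is taken from the x-ratelimit-remaining header value of each response. The sketch rotates by always picking the token with the most budget left; it does not model rate-limit resets over time, which a real implementation would need.

```python
class TokenPool:
    """Rotates between known tokens; grows only when all are exhausted."""

    def __init__(self, fetch_token, low_water=1.0):
        self._fetch_token = fetch_token  # hypothetical token source
        self._low_water = low_water      # below this, a token counts as exhausted
        self._remaining = {}             # token -> last reported remaining budget

    def __len__(self):
        return len(self._remaining)

    def acquire(self):
        # Prefer the token with the most budget left, which spreads the load.
        usable = {t: r for t, r in self._remaining.items() if r >= self._low_water}
        if usable:
            return max(usable, key=usable.get)
        # Every token is exhausted (or the pool is empty): expand the pool.
        token = self._fetch_token()
        self._remaining[token] = float("inf")  # unknown until the first response
        return token

    def report(self, token, remaining_header):
        # remaining_header is the raw header value, e.g. "1999.0".
        self._remaining[token] = float(remaining_header)
```

Exhausted tokens stay in the pool so they can be reused if a later report shows their budget has recovered.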

@sigaloid
Member Author

Awesome, I hadn't tried a rooted Android VM. What VM are you running? I'm struggling with getting Waydroid rooted - seem to get stuck on SELinux errors every time.

I do like the idea of using the GQL API if possible - it would be a lot of changes though, as many things are currently built off of the path-based way of constructing requests.

Currently the issue isn't the requesting of new tokens, because that's actually something the app will do by default regularly, but the headers being sent are uniquely identifiable. Granted, it's possible we continue down this path of back-and-forth breaking and fixing, and thus I would need to match behavior that precisely. So for sure something to keep on my radar.

@neckothy

> What VM are you running?

On Linux I find it easiest to use Android Studio's built-in device emulation for stuff like this.

@FireMasterK

> Awesome, I hadn't tried a rooted Android VM. What VM are you running? I'm struggling with getting Waydroid rooted - seem to get stuck on SELinux errors every time.

I used a real device running Android 15 and the https://github.com/ys1231/MoveCertificate module with KernelSU.

For a VM, I would recommend using the official AVD, taking the necessary boot/ramdisk image from the system img file, and updating the VM's disk files. Follow this for the patching instructions: https://topjohnwu.github.io/Magisk/install.html After this, you can use the module I linked above.

> I do like the idea of using the GQL API if possible - it would be a lot of changes though, as many things are currently built off of the path-based way of constructing requests.

I agree, but I see this as the best long-term way to emulate the app's requests, since both the website and the app seem to use it fully these days.

> Currently the issue isn't the requesting of new tokens, because that's actually something the app will do by default regularly, but the headers being sent are uniquely identifiable. Granted, it's possible we continue down this path of back-and-forth breaking and fixing, and thus I would need to match behavior that precisely. So for sure something to keep on my radar.

But aren't the tokens valid for a day? I don't think the app would fetch a new token regularly unless it's really needed? 🤔 The headers should be the first priority indeed like you said, I agree, but we still aren't emulating the app unless we use GQL. We have quite a bit of work to fully emulate the app. 😅

@neckothy

neckothy commented Nov 20, 2024

> For a VM, I would recommend using the official AVD, taking the necessary boot/ramdisk image from the system img file, and updating the VM's disk files. Follow this for the patching instructions: ... After this, you can use the module I linked above.

Note that this is all a bit overkill in cases where the default devices haven't been specifically blocked by whatever you're trying to RE. If you use a non-Google Play image (one labeled "Google APIs" instead), you will be rooted by default. After starting the emulator for the first time, you can launch HTTP Toolkit, click the ADB connector, and get system trust automatically. Then run adb install-multiple com.reddit.frontpage.apk config.en.apk config.mdpi.apk and the app runs without issue.
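
If the install step needs to be repeated across emulator resets, it can be scripted. A small sketch, assuming adb is on PATH; the APK file names are the ones from the comment above, and the `serial` parameter is a hypothetical convenience that maps to adb's `-s` device-selection flag:

```python
import shlex


def adb_install_multiple_cmd(apk_paths, serial=None):
    """Build the argv for installing a split APK set in one adb session."""
    if not apk_paths:
        raise ValueError("at least one APK path is required")
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]  # target a specific device or emulator
    cmd += ["install-multiple", *apk_paths]
    return cmd


split_apks = ["com.reddit.frontpage.apk", "config.en.apk", "config.mdpi.apk"]
install_cmd = adb_install_multiple_cmd(split_apks)
print(shlex.join(install_cmd))

# To actually run it against a connected emulator:
#   import subprocess
#   subprocess.run(install_cmd, check=True)
```

The base APK and its config splits have to go through a single install-multiple call; installing them one at a time with plain adb install does not produce a working split install.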

edit: and if you did want to use a Google Play build or custom kernel, something like rootAVD is probably an easier way to do so, but I have limited experience with this. In my experience, it's pretty uncommon that apps won't work on a standard x86_64 Google APIs image.

@JohnD-o

JohnD-o commented Nov 20, 2024

I’ve been working on a new version of Redlib over the past few weeks because I anticipated that Reddit would implement more sophisticated detection measures. I wanted to develop something that aligns with the project's long-term goals and current needs. So far, I’ve made significant improvements and have implemented several features to address the challenges we’ve been facing.

That said, it's a bit overkill at the moment, and I’m still working through some bugs, as I’m relatively new to coding. I didn’t build this to be easily replicated yet, and there are many parts that are tightly coupled to my local setup. But I’m hoping the ideas might be useful, and perhaps some of you can help improve it further.

This is what I've done to mine:

  • List of real Android app versions
  • Rotates between different device configurations
  • Proper OAuth client IDs for Android devices
  • Token daemon that refreshes tokens
  • Automatic retry logic with fallback
  • Proper Android API endpoints
  • Tracks rate limit remaining
  • Rolling over mechanism
  • Randomization to avoid patterns / Adds random jitter to requests to avoid patterns
  • Improve token rotation with proper rate limit tracking
  • Tracks rate limit remaining counts
  • Also added proxy support on my end
  • Rotate proxies on rate limits
  • Randomly rotate proxies (X% chance per request) to avoid patterns
  • Support for Tor (More frequent rotation when using Tor)
  • Location tagging for geographic distribution
  • Limited connection pooling
  • In-memory cache with configurable TTL (default 1 hour)
  • Cache key generation that ignores irrelevant query parameters
  • Configurable cache size limits
  • Automatic cache invalidation
  • Caches only successful GET responses
  • Additional Features: A number of small, complex features that I can’t fully explain here
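
The caching items in the list above can be illustrated with a short sketch. This is not the commenter's implementation, which has not been published in this thread; the ignored parameter names, the size limit, and the eviction policy are all assumptions made for illustration.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical set of query parameters that do not change the response.
IGNORED_PARAMS = {"utm_source", "utm_medium", "share_id"}


def cache_key(url):
    """Normalize a URL so irrelevant or reordered parameters share one key."""
    parts = urlsplit(url)
    kept = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    return f"{parts.path}?{urlencode(kept)}"


class TTLCache:
    """In-memory cache for successful GET responses with a time-to-live."""

    def __init__(self, ttl_seconds=3600.0, max_entries=1024, clock=time.monotonic):
        self._ttl = ttl_seconds  # default of 1 hour, as in the list above
        self._max_entries = max_entries
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, body)

    def get(self, url):
        key = cache_key(url)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # invalidate lazily on read
            return None
        return body

    def put(self, method, status, url, body):
        # Only successful GET responses are cached.
        if method != "GET" or not 200 <= status < 300:
            return False
        key = cache_key(url)
        if key not in self._entries and len(self._entries) >= self._max_entries:
            # Evict the oldest-inserted entry (dicts preserve insertion order).
            self._entries.pop(next(iter(self._entries)))
        self._entries[key] = (self._clock() + self._ttl, body)
        return True
```

Sorting the retained parameters means that two URLs differing only in parameter order, or only in ignored parameters, hit the same cache entry.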

I’ve built a fancy GUI with many debug features that help monitor the functions above. This makes it easier to track requests, rate limits, and proxies, and troubleshoot when things go wrong.

I also dug deep into other privacy-centered social media projects like this one; the one that has dealt with these issues the most is Nitter. Look at how they handle rate limiting and at the other measures they use to dodge these problems. I think their punishment system for scrapers is perfect, but if you adopt it, I recommend publishing the number of requests a client is allowed to make.

As mentioned, the project is still massively buggy, simply put it’s not ready. And since I’m relatively new to coding, a lot of the structure may only work on my computer, but I wanted to throw these ideas out there and see if there’s something here that can be of value to you.

I suggest maybe setting up a Discord to see whether other people want to hop on and help out in a more interactive way. Feel free to reach out if you're interested in taking a look at what I did. Thanks!

@drakeerv

I don't know if it helps, but I was able to do it with an AVD and Google Play by using an older version of Android that didn't treat user certs differently and that the app (UDB App Pro) still supported.

@sigaloid
Member Author

> But aren't the tokens valid for a day? I don't think the app would fetch a new token regularly unless it's really needed? 🤔 The headers should be the first priority indeed like you said, I agree, but we still aren't emulating the app unless we use GQL. We have quite a bit of work to fully emulate the app. 😅

The app does regularly request new tokens way before the expiry for seemingly no reason at all. Also many old versions of Reddit did still use the OAuth routes, and they still have to work for approved third party apps like RedReader (the routes not the tokens).

Btw: did indeed get root and installed the certs, MITM going well now!

I have moved closer to matching headers extremely closely here: 6be6f89

> I’ve been working on a new version of Redlib over the past few weeks because I anticipated that Reddit would implement more sophisticated detection measures

I'd love to see your changes, you can reach out to me via email (anything @ domain in profile)!


7 participants