fix Android support #3396
Misc:
Failing CI tests:
First, there's the quick and dirty blanket solution to discuss: just disable all of these on Android, at least for now. Given how broken things are now anyway, it may be acceptable to not worry about a test that could run on Android, but doesn't, until someone actually encounters a bug as a result. The rest of this discussion assumes that's not the tack we're taking; I've listed "don't test on Android" as an option for the failing tests where I think that could make sense on an individual basis.
Second, there's the matter of root on Android. Some of the tests below could pass if they were only run as the root user. We should decide what our policy is here: do we support rooted Android? Root shells are somewhat rare on Android, used almost exclusively by enthusiasts who want to tweak their ROMs in various ways. On the one hand, it seems excessive to cater to such a small minority use case, but on the other, such enthusiasts are perhaps the most likely to need a well-tested set of coreutils on Android. Were we to decide to run these tests as root, we would also need to decide whether that should be accommodated in the CI. I haven't yet looked at what it takes to get a root Termux shell in the emulator, but I suspect it would at least be possible.
Failing non-CI tests:
These are tests that pass in the CI environment, but fail on at least one emulator or device. They're included here mostly for completeness' sake. I don't think it should be necessary to resolve them before merging, and they can be broken out into their own issues. If you'd like me to change anything to fix them now though, feel free to say so.
wahou, terrific, bravo :)
I think you can ignore "root test" :)
@sylvestre Noted, thanks. :) Could you also confirm that the
Probably? :)
I've filed a PR to fix the xattr problem upstream: Stebalien/xattr#23
a700910 to 2fdf388
It's alive! \o/ I went ahead and picked what I thought were reasonable options for the remaining blockers. I also fixed one more problem this uncovered: on x86, strip can only read 32-bit ELF files by default, so I swapped out the relevant test file, since x86-64 can do both. Feel free to request changes to any of those decisions, or anything else.
I will have a look asap.
Unfortunately, I don't think code coverage is possible yet. As I mentioned before, rustup doesn't seem to work on Android right now, so we're using what's available in the package manager. It's kept up to date (e.g. it has Rust 1.60.0), but that means we can't use nightly features/flags. Rust 1.60.0 did add support for code coverage profiling, so I was hoping we could at least get something, but it didn't work:
It sounds like this is somewhat expected, based on the documentation on the profiler. So we'll need to wait for (or fix ourselves) either rustup support, or a rustc with profiling enabled in the package manager.
Do you want to land this now and iterate from it? Thanks
The code for creating a Passwd from the fields of the raw syscall result assumed that the syscall would return valid C strings in all non-error cases. This is not true, and at least one platform (Android) will populate the fields with null pointers where they are not supported. To fix this and prevent the error from happening again, this commit changes `cstr2string(ptr)` to check for a null pointer, and return an `Option<String>`, with `None` being the null pointer case. While arguably it should be the caller's job to check for a null pointer before calling (since the safety precondition is that the pointer is to a valid C string), relying on the type checker to force remembering this edge case is safer in the long run.
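As a rough sketch of the shape this change takes (not the exact code in the commit; the signature and the lossy UTF-8 conversion are assumptions for illustration):

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Convert a C string pointer into an owned `String`.
/// Returns `None` for a null pointer, which Android can hand back
/// for passwd fields it does not support.
///
/// # Safety
/// `ptr` must be either null or a pointer to a valid NUL-terminated C string.
unsafe fn cstr2string(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        None
    } else {
        // SAFETY: `ptr` is non-null, and the caller guarantees it points
        // to a valid NUL-terminated C string.
        Some(unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned())
    }
}
```

Because the return type is an `Option`, every caller has to decide explicitly what a missing field means (an empty string, an error, or skipping the field) instead of dereferencing a null pointer.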
If it looks good to you, then yeah, I think that's the way to go, so we don't have to keep looking out for new breakage. I think I almost have code coverage working; I'll just file a new PR for that in the near future.
This PR is to fix and improve Android support. There are some final points that still need discussion that I'll add in another comment, but it's basically done.
I've broken the PR into four commits. Each could, if need be, be broken out into its own PR, but they make the most sense together.
`getpw*` family of syscalls always returns valid C strings or errors, when null pointer strings are also a possibility. So far, Android is the only platform I know of that does this.
In case anyone reading isn't familiar: Android is built on the Linux kernel, but is set up very differently from most Unix environments. Perhaps most notably, each application is installed as and runs under its own Unix user, with very few permissions. Access control is supposed to be done through Android's permissions framework, not the classic file mode bits or capabilities. That said, there is a shell environment, with access to a Unix-style interface, but a lot of things one might expect to work, don't. E.g., hard links are disabled for everyone but root (and even then, won't work in shared directories), there is no privilege escalation mechanism like sudo/su by default, and parts of the standard Unix directory layout aren't accessible (e.g., /proc) or don't exist at all (e.g., /tmp).
The last commit is somewhat hacky. There isn't a lot of precedent for CI of terminal-based applications on an Android emulator. The biggest problem is the limitations of the default shell, in which it would be difficult to set up an environment to run the tests. Instead, we install the Termux app, which provides something more closely approaching a typical Linux environment (including the GNU coreutils). One of the most useful things Termux provides is a package manager, which allows us to install dependencies, as well as rust/cargo itself, so we can test building on the emulated device (rustup doesn't seem to work). However, there is no mechanism for using the Termux shell directly, so we use the debug bridge to enter keystrokes, then query for files to tell when the commands terminate, what their exit codes were, etc.
One adjustment to this flow we could make would be adding a step that sets up sshd in Termux, which would allow us to run commands in a more typical manner for a remote host. The reason I haven't done so is that we would still need the keystrokes + query style of interaction to set up sshd, and keeping to one method of interaction means less code to debug (preventing problems if future changes to Termux, the emulator, or the runner require reconfiguring ssh or something). However, if it is preferred, I can rewrite that commit to use sshd instead, and the keystrokes + query stuff would be contained to the cache setup step.
On that note, the CI step is set up to cache the emulator image after installing Termux, Rust, and dependencies. This means we don't have to wait for that every time, we reduce downloads from the package repos, and we reduce noise from changing installed package versions. Any time we want to update the cache, we just need to change the cache key string, e.g. by using a different Termux version, or just adding something like "-v2" to the end.
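To illustrate the keystrokes + query idea in isolation, here is a minimal sketch in Rust. It is not the CI script from the commit, and the helper names `with_exit_marker` and `wait_for_exit_code` are hypothetical. It assumes the command is typed into Termux with a suffix that writes its exit status to a marker file, and that some `probe` (in practice, presumably reading the file back through the debug bridge) returns the file's contents once it exists:

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: append a suffix so the shell writes the command's
/// exit status to `marker` once the command has finished.
fn with_exit_marker(cmd: &str, marker: &str) -> String {
    format!("{cmd}; echo $? > {marker}")
}

/// Hypothetical helper: call `probe` repeatedly until it returns the marker
/// file's contents, then parse them as an exit code.
/// Returns `None` if no parsable exit code shows up before `timeout`.
fn wait_for_exit_code<F>(mut probe: F, interval: Duration, timeout: Duration) -> Option<i32>
where
    F: FnMut() -> Option<String>,
{
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(contents) = probe() {
            if let Ok(code) = contents.trim().parse::<i32>() {
                return Some(code);
            }
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(interval);
    }
}
```

Taking the probe as a closure keeps the polling logic independent of how the file is actually read, so the same loop would work unchanged if the interaction were later switched to ssh.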
The action we use to run the Android emulator is popular, with a high install rate (69th most popular action of 12,708 as of this writing) and a good list of notable projects relying on it (including projects from Google and the Android team itself). I have, however, found it to be a bit temperamental to configure properly. Android versions that work on my local machine don't work with it, I couldn't get either ARM architecture working on it, and the shutdown step seems to hang on most runs when performing the caching step. However, the configuration in the commit seems to work consistently (with the hang addressed by an unfortunate but effective `pkill -9`). I suspect the configuration won't need changing often, and the project is still actively maintained, so hopefully some of these issues are worked out by the time it does.
I tried to make sure to leave code comments on anything potentially confusing, but feel free to request more anywhere, or to ask questions here about why something was necessary. Stay tuned for a comment on the minimum that still needs to be addressed before merging...