Roadmap for v1.6.0 #145
Comments
If you were to ask me, ideally, on supported platforms you would want to decode directly into memory-mapped files. It's both the easiest and the simplest approach. Technically the data may not hit disk immediately, but it will be available to use immediately, and the user sees faster completion (negligibly so, unless on an HDD or with huge blocks). That end marker would be useful here.
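To illustrate the idea, here is a minimal POSIX sketch of decoding straight into a memory-mapped output file. `decode_block` and `decode_to_mmap` are hypothetical placeholders, not bzip3's real API; this only shows the shape of the approach.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical decoder: fills `out` with exactly `out_size` decoded bytes.
 * This is a placeholder, not bzip3's real API. */
extern int decode_block(const uint8_t *in, size_t in_size,
                        uint8_t *out, size_t out_size);

int decode_to_mmap(const char *path, const uint8_t *in,
                   size_t in_size, size_t out_size)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Grow the output file to its final decoded size up front... */
    if (ftruncate(fd, (off_t)out_size) != 0) {
        close(fd);
        return -1;
    }

    /* ...then map it and decode straight into the mapping. */
    uint8_t *out = mmap(NULL, out_size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    if (out == MAP_FAILED) {
        close(fd);
        return -1;
    }

    int rc = decode_block(in, in_size, out, out_size);

    /* The kernel writes dirty pages back lazily; the data is usable
     * immediately even if it has not hit disk yet. */
    munmap(out, out_size);
    close(fd);
    return rc;
}
```

If forcing the data to disk ever mattered, an explicit msync() before the unmap would be the place for it.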
@Sewer56 TL;DR: All of what you mention is what xpar already does. It just lacks use/evaluation for me to tell whether it's disruptive to people or not. To give you some more context: I started working on bzip3 when I was in high school (2022). Being a schoolkid, I had plenty of time. Now I have applied to a PhD program :-). The codebase has some rough edges, and I have fixed most of the problems I have with the current shortcomings. So what is blocking us is not knowing whether the changes will break someone's workflow or introduce unnecessary complexity that will be difficult to work around for maintainers or users. Writing the code is the easy part.
I cannot be expected to know the details of your other projects 😅, so I was commenting on what came to mind without that knowledge. Edit: The above response was edited with more context.
Hey, you're doing pretty well :p Although I like being involved in making the cutting edge, I never felt like academia was for me, personally at least. I think I just don't like crunching through papers enough, haha. I personally started by tinkering with games from a young age: first as an end user, and then I learned to code via reversing to data-mine. I can also get a bit competitive, hence the emphasis on optimization.

In any case, I've not previously packaged software for distros (I generally do libraries more), so I'm not very experienced on the subject matter. Is the concern interop/user confusion? I.e. produce a file on one distro, read it from another with an outdated bzip3. Or, as another example, a bash script which pipes the result of … I believe the Unix philosophy is 'do one thing and do it well' (and the other two relevant points). So if that's the worry, I think the approach here would be a separate CLI tool/command for the new format, e.g. …
The problem is that every distro comes with its own set of hacks and limbs and hairs that make everything more difficult than it should be. An easy example is memory mapping: some environments (all the Windows GNUs like Cygwin or MinGW or whatever) provide …

If you are tempted not to use assembly and think that intrinsics are enough, you are deeply mistaken. I made xpar about 4 times faster by rewriting its hot loops in assembly. Compiler optimisations are brittle, distros will insist on using ancient compilers on low optimisation settings (e.g. -O2 -march=generic), etc., negating all of the work you put into making the damn thing fast. Etc. etc...

If I once again have to figure out why the thing segfaults on FreeBSD 13.0 mipsel64 specifically, without any feedback from the maintainers, the amount of grey hairs on my head will quadruple, and I don't want to inconvenience the users because of packaging problems. All of these platform-specific hacks add liability because I can't reasonably test them all. They introduce exponential blowup in the number of possible software configurations. On top of that, I don't want to shift the liability of testing to the maintainers, because they never do it, and if there's a breaking problem on their specific platform they might be tempted to inexpertly fix and patch it instead of asking for upstream help. Which is dangerous for a compression program.
If I were to release a new format version, then …
Ah, I see. So you're trying to support all of the targets directly from upstream: pure C, no higher-level abstraction library or anything, straight to the target. Yeah, that's a lot of portability caveats; but it makes sense, since you've gone all the way to standardise the format even. You have my condolences; I've been in similar situations. Sometimes having to fix something for macOS aarch64, for instance, can be massively painful without the hardware; so I rely on CI to run that environment, and it's usually 3-4 minutes to get feedback on any change.

I also get the appeal of writing code without overhead. I wrote a Rust library the other week for opening handles and mmap(s) just to save around 3-8KB of code, as std's implementation internally used a builder pattern and therefore included unnecessary logic. (It frustrated me; I don't like overhead.)

Intrinsics are hit and miss: they may introduce extra bloat, they may not; it just depends on your luck. If you're working with C, I guess you have it extra bad, since there are many compilers, and some distros will ship ancient ones; that is a yikes. Better off with ASM at that point, even if it looks fine with your Clang.

The call convs, I imagine, are awkward. I'm not sure how far your options for forcing a call conv in a compiler-agnostic way go in C. Even if you can set it, I imagine you'd need an ugly hack like making a function with forced no-inline and a different convention just to perform a 'switch' for a small amount of 'context'.

On that note, I've never really understood the whole 'shadow space' thing in the Microsoft x64 call conv. It's a pain to work with: not only must the stack be aligned to 16 (presumably for SSE), but now there must also be at least 32 bytes before the return address, damn. Thing is, while I understand it's supposed to make debugging easier by force-spilling the 4 register arguments onto the stack, at the same time, it's just unnecessary overhead for release builds. Now function calls require an additional stack allocation (…).

On a fun note, I wrote a small JIT that generates optimal stubs for converting between calling conventions as part of my 80%-done WIP xplat xarch hooking library (on a bit of a long hiatus though, in favour of other more urgent projects). I don't have any background or study in compiler construction whatsoever, but that's one part that came out quite well.
I actually ran across this just last week. At that point, thinking about how much of a hassle it would be for bindings users (in this case equivalent to maintainers, I suppose), I just chose to pre-compile every single variation myself and link against that. So I totally understand the pain.
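On the compiler-agnostic call-conv question: GCC and Clang expose the `ms_abi`/`sysv_abi` function attributes on x86-64, while MSVC only ever emits the Microsoft x64 convention there, so a thin macro layer is roughly as far as portable C gets. A hedged sketch (the macro and function names are made up for illustration):

```c
/* Hedged sketch: pinning a calling convention on x86-64 in a (mostly)
 * compiler-agnostic way. FORCE_MS_ABI and NO_INLINE are made-up names.
 * GCC/Clang accept the ms_abi attribute on x86-64 targets; MSVC already
 * uses the Microsoft x64 convention, so the macro expands to nothing. */
#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
#  define FORCE_MS_ABI __attribute__((ms_abi))
#  define NO_INLINE    __attribute__((noinline))
#elif defined(_MSC_VER)
#  define FORCE_MS_ABI /* default on Windows x64 */
#  define NO_INLINE    __declspec(noinline)
#else
#  define FORCE_MS_ABI
#  define NO_INLINE
#endif

/* A non-inlined shim with a pinned convention: the ugly 'switch'
 * between ABIs alluded to above. */
NO_INLINE FORCE_MS_ABI int add_shim(int a, int b)
{
    return a + b;
}
```

The no-inline part matters because the shim only acts as an ABI 'switch' if the compiler actually emits a real call to it.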
This ticket is meant to be a collective TODO list for the v1.6.0 release with all the major features that I am planning.
- `-j 0`: we want a C analogue of `std::thread::hardware_concurrency()`. Maybe determine the number of CPUs by task affinity (`sched_getaffinity` - Linux-specific), `sysconf` (GNU-only), `get_nprocs` (also GNU), or maybe read `/proc/cpuinfo`... Another possibility is `pthread_getaffinity_np`, or NetBSD 5+ GNU's `sched_getaffinity_np`; on Windows we would want `GetProcessAffinityMask`. In practice, `sysconf` appears to work on glibc, Mac OS X 10.5, FreeBSD, AIX, OSF/1, Solaris, Cygwin and Haiku. HP-UX would require `pstat_getdynamic`, IRIX uses `sysmp`; as a fallback on Windows platforms we could use `GetSystemInfo`. There is possible m4 code of interest when it comes to making heads or tails of this mess. (A rough detection sketch follows after this list.)
- `getopt_long` shim (DONE IN COMMIT 249b173).
- `yarg` within the CLI tool.
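To make the `-j 0` item concrete, here is a hedged sketch of what the #ifdef ladder could look like. The function name and the exact fallback order are my own guesses, and it only covers a subset of the platforms listed above:

```c
#if defined(_WIN32)
#  include <windows.h>
#elif defined(__linux__)
#  ifndef _GNU_SOURCE
#    define _GNU_SOURCE            /* for sched_getaffinity and CPU_COUNT */
#  endif
#  include <sched.h>
#  include <unistd.h>
#else
#  include <unistd.h>
#endif

/* Rough analogue of std::thread::hardware_concurrency(); name is illustrative. */
static int detect_cpu_count(void)
{
#if defined(_WIN32)
    SYSTEM_INFO si;
    GetSystemInfo(&si);            /* the Windows fallback from the list above */
    return (int)si.dwNumberOfProcessors;
#elif defined(__linux__)
    cpu_set_t set;
    /* Prefer the task affinity mask so taskset/cgroup restrictions are honoured. */
    if (sched_getaffinity(0, sizeof(set), &set) == 0)
        return CPU_COUNT(&set);
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (int)n : 1;
#elif defined(_SC_NPROCESSORS_ONLN)
    /* glibc, macOS, FreeBSD, Solaris, Cygwin, Haiku, ... */
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (int)n : 1;
#else
    return 1;                      /* HP-UX (pstat_getdynamic) and IRIX (sysmp)
                                      would need their own branches */
#endif
}
```

The `pthread_getaffinity_np`, `GetProcessAffinityMask`, `pstat_getdynamic` and `sysmp` variants from the list would slot into the same ladder as extra branches.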