-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
auto-detect song length (detect emulation state looping) #67
Comments
What do you consider a "loop"?
(edited title: removed |
I'm considering a loop in this context a repitition of notes in the same subsong. I see the problem. I thought there might be a way with which you could detect which sections have already played and maybe seek forward to look if a new section comes along. I think this would be a rather hacky approach, but still better than what's given right now. For example: in TRAX, it almost ends prematurely because there's a 2 second gap in the 7th track. I'm not quite sure why it doesn't end, the gap must be slightly smaller than 2 seconds, but still. It'd be nice to have some sort of seeking algorithm that tries to detect loops. |
For loop detection we are halfway into halting problem territory ;-) I think the main problems are that the sound player is a completely freeform program and that we only see IO operations, but don't know anything about notes or any other higher structure (see my comments in issue #66). Theoretically, we could detect loops:
The problem is that we'd have to record all the states that the emulator steps through: That means the complete RAM state, register state, CPU state, hardware state (like current scanline) etc. would have to be recorded for every single step the emulator calculates. (If for example we only record every 100th step and the repetition happens after 99 steps, we will be 1 step off and won't find the repetition). Let's say we skip the video RAM, registers and extra stuff and just record the 8kB internal S-RAM. The CPU runs at 4.19 MHz, that would sum up to 33.5 GB of data per second. Let's say I calculated this wrong and/or there are some easy optimizations available, divide it by 1000 and make that 32 MB per second: That is still a lot and we would not only have to record every step, we'd have to compare the current step to all other already recorded steps 4.194.304 times per second. (I immediately think of compression and hashtables here, but the former needs more computation power and the latter more memory.) So from my current perspective I'd say: We can't do that. But I'd love to hear other approaches to this problem or that' I'm off by a factor of 1.000.000 and it would indeed be possible. |
Wouldn't it be possible to store hashes of the individual states? That would reduce the RAM usage greatly while still detecting loops pretty accurately. Of course it still wouldn't run nearly at realtime speed, but for my purposes (just dumping the wavs) it wouldn't matter. Calculations, using XXH128 and just the S-RAM: I'm not 100% familiar with how the gameboy hardware works, but it can only loop with a jump instruction, if we go by assembly rules. So maybe the sRAM has to be recorded only when a jump instruction happens? I think this method might miss the first iteration, but please let me hear your thoughts. edit: another thought would be to only record after jump instructions that jump back. |
Another optimisation would be to only store the hashes of the beginning of the track (should be configurable, but the first second or so should suffice), and only check the hashes of the following music against these. This would also fail to catch a lot of tracks with intros, but there aren't many of those around & since it's user-configurable, it shouldn't be a problem. |
Execution flow of the emulated playback code and progression of the music aren't normally tied together; the playback code operates sort of like a VM that reads a data stream containing the music data (notes, durations, effects...) and performs the necessary operations on the audio HW to produce sounds. As such it will perform a huge number of jumps that don't have anything to do with the position in the tune itself. However the general idea of taking only specific snapshots sounds promising. In my opinion it should be sufficient to take a snapshot of the SRAM every time the playback code returns from execution which would commonly be once every 1/60th or 1/64th second; assuming the latter this results in 512kB/s of data to aggregate, or 1kB/s in hashes. Of course as the tune continues to play the amount of data to compare would grow substantially; after an hour of playback 3.5MB worth of hashes would need to be compared 64 times per second. [EDIT: numbers corrected for a hash size of 16, not 128 bytes] Also, there exists code that never returns and instead maintains its own control loop; detection would not be possible strictly in the way I laid out. But it isn't a common case as far as I can tell. |
I think that there is a loop finder in vgmplay tools, and I think it:
I think it should be simpler and also faster |
Well storing the audio alone would use up more RAM than the solution proposed by @mrehkopf (if everything is kept in RAM), and analyzing audio for loops ain't exactly easy either. Let alone the fact that you have to detect a loop of at least 4 iterations before you can be sure that there won't come anything new in the track. And there are plenty of long tracks that would be absolutely steamrolled by this solution if we just render the first 5 minutes and cut out any loops in post-processing. Besides the fact that it would be a nightmare to make it work with realtime playing, but that's …up to the user I guess. Also, thanks for catching my 128b!=128B mistake, that brings the numbers down a whole lot. |
There's also the issue that the state looping period may be longer than the song looping period (e.g. say the player code keeps a running time in memory which keeps increasing). |
coming back to this discussion after having fixed some stuff in nosefart and had to understand their internals a bit. They try to infer when a track loops by keeping track of where the last place in memory is that has been accessed. This works fairly well in my experience and is also quite easy to implement (I think, not knowing gbsplay's internals) imo having an imperfect implementation would at least be better than having nothing at all, especially since people can just… not use it… when they don't want to… you know? anyway, just my two cents, I don't really have time to implement this yet. |
Ohhh, Nosefart is still a thing? Cool 😎 |
Do you have any more details in the nosefart implementation? diff --git a/gbcpu.c b/gbcpu.c
index 6652863..11af41c 100644
--- a/gbcpu.c
+++ b/gbcpu.c
@@ -42,6 +42,26 @@ struct opinfo {
#endif
};
+static uint32_t highest_rom_load_address = 0;
+
+static void remember_rom_load(uint32_t addr) {
+ // FIXME: bank mapping for 0x4000 - 0x7FFF is ignored
+
+ if (addr > 0x7FFF)
+ return; // that's not ROM access
+
+ if (addr > highest_rom_load_address) {
+ highest_rom_load_address = addr;
+ printf("highest load address = 0x%04x\n", addr);
+ return;
+ }
+
+ if (addr == highest_rom_load_address) {
+ printf("RE-ACCESS @ highest = 0x%04x\n", addr);
+ return;
+ }
+}
+
static uint32_t none_get(void *priv, uint32_t addr)
{
UNUSED(priv);
@@ -67,6 +87,7 @@ static inline uint32_t mem_get(struct gbcpu* const gbcpu, uint
{
struct get_entry *e = &gbcpu->getlookup[(addr >> 8) & 0xff];
gbcpu->cycles += 4;
+ remember_rom_load(addr);
return e->get(e->priv, addr);
} Be aware: This makes the tests fail because of the extra output on stdout, but it compiles and runs. When I run this and watch this on different tunes, the best I saw until now was repeated access marking the start of a beat. While that is surely some kind of repetition, it is not what we want ;-) So while it does something, it does not do anything useful yet. How does nosefart distinguish "local" from "global" repetitions? |
iirc it just has some amount of minimum cycles without new address reads before it decides it found the loop point |
yeah well the title says it all.
The text was updated successfully, but these errors were encountered: