Replies: 6 comments 3 replies
-
Hey Tim,

So, the short answer to your question is yes. We've had numerous discussions on the topic and are very interested in efforts in this direction. So far, our principal focus has been on the Microsoft TTD extensions to WinDbg, available in the Preview editions. Right now, you can do a few things in this vein. You can load WinDbg-generated traces into the debugger and traverse them as if you were debugging a live target. You can display various aspects of the trace in the different providers, e.g. the Memview Provider, via loader scripts. And you can populate the Ghidra trace database with the TTD data. The last of these has proved very expensive, although we haven't tested it since several performance improvements were made to the main code.

Our intention was to investigate RR with the same goals. RR would be a natural choice, as the RR traces would be readily available through the normal gdb interfaces using RR as the wrapped client in the Linux agent. That said, I think Qira would also be an obvious choice. I haven't played with Qira as much as I have with QEMU. Are the traces typically accessed using the gdb machine interface or extensions to the gdb command set? That would be ideal, as you could avoid writing a parser for the Qira trace format in Java or wrapping existing code with JNI or the like.

Let us know what your use cases might be and how you envision the two trace sets (Ghidra/Qira) interacting, and we can give you specifics on the best approach and on some of the underlying infrastructure that might simplify development.
-
Thank you for your response.

Owww, I stumbled upon TTD while doing some research, and I skipped it because it was on Windows. I looked a bit at RR and, from what I understand, it basically records the syscall results and replays the program deterministically using these results to get the same execution. It's a clever approach, but it doesn't offer a "must-have" feature in my opinion: getting the list of memory accesses for an address.

That's why I focused more on Qira. The approach is crude (record all the reads/writes for each instruction!), the traces are slower to record and heavier on disk, but I think it better fits reverse-engineering purposes. As you said, I'd like to avoid implementing it in Java. The traces are made of really small entries, and once parsed the whole thing is kept in a

Oh, I didn't think about interfacing at the GDB level. Adding a few custom GDB commands to get the extent of the trace, jump to a specific "snap", and optionally get the list of accesses at some memory address... This would probably be a small addition to RR, and it should be possible to make an implementation based on the Qira traces. I'm probably missing a lot of gotchas, but is that what you had in mind?
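To make the "list of accesses at some memory address" query concrete, here is a minimal sketch in Python. It assumes the trace has already been parsed into one record per memory access; the `Access` fields and the helper names `build_access_index` and `accesses_at` are hypothetical and do not reflect the actual Qira trace format or any Ghidra API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One memory access taken from a parsed trace (hypothetical layout)."""
    snap: int     # position in the trace at which the access happened
    pc: int       # address of the instruction performing the access
    address: int  # memory address that was read or written
    kind: str     # "R" for a read, "W" for a write
    value: int    # value read or written


def build_access_index(trace):
    """Group accesses by target address, preserving trace order."""
    index = defaultdict(list)
    for access in trace:
        index[access.address].append(access)
    return index


def accesses_at(index, address, kind=None):
    """Return the accesses to `address`, optionally filtered by kind."""
    hits = index.get(address, [])
    return [a for a in hits if kind is None or a.kind == kind]


# Tiny made-up trace: two accesses to one address, one to another.
trace = [
    Access(snap=0, pc=0x401000, address=0x7FFC0010, kind="W", value=0x2A),
    Access(snap=1, pc=0x401004, address=0x7FFC0018, kind="W", value=0x01),
    Access(snap=2, pc=0x401008, address=0x7FFC0010, kind="R", value=0x2A),
]
index = build_access_index(trace)
```

Building the index costs one pass over the trace, after which each query is a dictionary lookup. A custom GDB command for "accesses at address" could be a thin wrapper around a lookup like this, assuming the backend keeps such an index.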
-
Thanks again for your response, this is very helpful! I took a quick look at the scripts you gave, and I confirm that this is probably what I was looking for. I will definitely start from here.

I tried to test the scripts on Windows to see the actual result but unfortunately had some trouble. I built Ghidra on Windows, made a TTD trace of notepad with WinDbg Preview (notepad01.run), and tried to feed it into the scripts. But each time, I got an error (OpenDumpFileWide returns E_INVALIDARG). Idk if I should report a bug, or if it's just me being dumb?

In Qira, the parser is written in C++, and there is a Cython interface to expose it to the main application. But the file format is kind of dumb, so I guess I can just reimplement it in Java. I'll give it a try and keep you updated.
-
Admittedly, this is somewhat buried in the help files, but did you copy the contents of the WinDbg Preview amd64 directory into the bin directory in the JDK? There's a path issue with hardcoded precedence in the JDK for its version of dbgeng, and possibly some of the other files, that's, well, truly annoying and for which we haven't figured out a workaround.
-
Oh, wow - that's very exciting. Very happy you made progress so quickly. End of my day, but will definitely take a look at your code when next I get a chance.

Re speed, good question. Right now, we're hashing through a bunch of performance issues in the debugger proper. Whether fixes to these will help you out depends, I guess, on where the bottlenecks are. The MemoryViewer could definitely have exponential issues, as it reparses the sets on adds. It sort of has to, because it's using relative position rather than absolute time or address to pack the display. That said, I am not an algorithm guru - am sure we can find some folks with ideas for increasing the speed.
-
Additional time traveling debuggers w/ QEMU, Bochs, Hypervisors FWIW:
- rr
- https://news.ycombinator.com/item?id=41285518
- https://github.com/gamozolabs/applepie
- https://github.com/gamozolabs/orange_slice
- https://github.com/MarginResearch/cannoli :
It consists of a small patch to QEMU to expose locations to inject
some code directly into the JIT, a shared library which is loaded into QEMU
to decide what and how to instrument, and a final library which consumes
the stream produced by QEMU in another process, where analysis can be done
on the trace.
Cannoli is designed to record this information with minimum
interference of QEMU's execution. In practice, this means that QEMU needs
to produce a stream of events, and hand them off (very quickly) to another
process to handle more complex analysis of them. Doing the analysis during
execution of the QEMU JIT itself would dramatically slow down execution.
Cannoli can handle billions of target instructions per second, can
handle multi-threaded qemu-user applications, and allows multiple threads
to consume the data from a single QEMU thread to parallelize processing of
traces.
- Qira: https://github.com/geohot/qira
-
Hi!
I am looking to implement a Qira-like functionality for Ghidra to allow for "timeless" debugging.
The idea would be to open a Qira trace (or another format) and go forward/backward in time to see the memory and registers (like the Ghidra Trace does for recorded sessions).
It should also be able to add annotations similar to XREFs in the listing view, showing for each address where and when it was read/written/executed.
I hesitate between several ways to implement this:
For now I have a PoC in Python that works, but it lacks the timeless-specific functionalities, which I guess I would need to implement in a new GADP interface (snap selection, XREF search).
I didn't look much into how Ghidra traces work, but I understand that they only take snapshots of memory/registers at breakpoints.
Also, I guess they don't record incremental changes, but store the content of whole pages.
This means that for each instruction, I would have to take a snapshot of all the memory/registers instead of the list of changes. This would lead to huge files, and would probably not allow an efficient XREF search.
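The alternative to a full snapshot per instruction is to store only the changes and look up the last write at or before the requested snap. The sketch below illustrates that idea under stated assumptions: `WriteLog`, `record`, and `value_at` are hypothetical names, values are tracked per address rather than per page, and none of this describes how the Ghidra trace database actually stores data.

```python
import bisect
from collections import defaultdict


class WriteLog:
    """Per-address history of writes, queried by snap (hypothetical design)."""

    def __init__(self):
        self._snaps = defaultdict(list)   # address -> snaps, ascending
        self._values = defaultdict(list)  # address -> value written at each snap

    def record(self, snap, address, value):
        """Append a write; calls must arrive in non-decreasing snap order."""
        self._snaps[address].append(snap)
        self._values[address].append(value)

    def value_at(self, address, snap, default=None):
        """Return the value held at `address` as of `snap`."""
        snaps = self._snaps.get(address)
        if not snaps:
            return default
        # Index of the first write strictly after `snap`; the one before it wins.
        i = bisect.bisect_right(snaps, snap)
        return self._values[address][i - 1] if i else default


log = WriteLog()
log.record(0, 0x1000, 0x11)
log.record(5, 0x1000, 0x22)
log.record(7, 0x2000, 0x33)
```

Storage grows with the number of writes rather than with the number of instructions times the memory size, and each lookup is a binary search over one address's history. The same per-address lists also answer "when was this address written", which is the XREF-style query.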
This would allow me to reuse interesting components like the snap timeline.
So my question:
Has the subject of timeless debugging been tackled while designing the Ghidra debugger/GADP?
What do you think would be the best approach for this?