Volshell: Add Dedicated Method to retrieve EPROCESS/Task object #1381

Merged
merged 2 commits into from
Jan 23, 2025

Conversation

the-rectifier
Contributor

This PR introduces a gp() (get_process()/get_task()) method to the Windows and Linux Volshell respectively.

I couldn't find a way for one to (quickly) grab an arbitrary EPROCESS/Task object from the shell and I thought it'd be quicker, when interactively exploring structures, to have a dedicated method rather than repeat a one-liner each time.

Also, update linux.pslist version requirements, since they changed in fa06064:

Unsatisfied requirement plugins.Volshell.pslist: Version 2.0.0 dependency on volatility3.plugins.linux.pslist.PsList unmet
Unable to validate the plugin requirements: ['plugins.Volshell.pslist']

@atcuno
Contributor

atcuno commented Dec 5, 2024

Hello @the-rectifier - nice timing on this as I was recently working on a similar feature as forcing people to do the list comprehension is pretty miserable as you noted.

With that said, accepting only a pid is not robust enough for this feature to be fully usable. There are a few reasons for this:

  1. Smear can result in multiple processes (one active and one or more terminated) having the same PID, which makes a PID lookup non-deterministic.

  2. When specifying the pid, the only way to find the process is to walk the list (as your code does). This makes it impossible to select processes hidden from the list by a rootkit and/or missing from the list due to smear.

To work around this, the caller needs to be able to specify the offset (virtual or physical), as this 1) is 100% precise about which process to analyze and 2) supports analysis of processes found by psscan / psxview. The cleanest way in my view would be optional parameters like pid=None,v_offset=None,p_offset=None, with the code branching based on which one is set.
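The two points above can be illustrated with a small sketch. It uses made-up records rather than real Volatility objects; `ProcRecord`, `find_by_pid`, `find_by_offset` and all the data are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProcRecord:
    """Hypothetical stand-in for an _EPROCESS/task entry; not a Volatility type."""
    pid: int
    offset: int
    name: str
    in_list: bool  # False models a process unlinked by a rootkit or lost to smear


# Made-up data: PID 4321 appears twice (one terminated, one active),
# and the entry at 0x3000 is absent from the active process list.
RECORDS = [
    ProcRecord(pid=4321, offset=0x1000, name="old.exe", in_list=True),
    ProcRecord(pid=4321, offset=0x2000, name="new.exe", in_list=True),
    ProcRecord(pid=666, offset=0x3000, name="hidden.exe", in_list=False),
]


def find_by_pid(pid: int) -> List[ProcRecord]:
    """Walk only the linked entries, as a pid-based lookup has to."""
    return [r for r in RECORDS if r.in_list and r.pid == pid]


def find_by_offset(offset: int) -> Optional[ProcRecord]:
    """An offset identifies at most one record, whether or not it is linked."""
    for r in RECORDS:
        if r.offset == offset:
            return r
    return None


ambiguous = find_by_pid(4321)      # two candidates for one PID
missing = find_by_pid(666)         # unlinked entry is never seen
precise = find_by_offset(0x3000)   # reachable only by offset
```

In this toy data the PID lookup returns two candidates for 4321 and nothing for 666, while the offset lookup resolves the unlinked entry directly.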

If you are willing to work these offset parameters into your code then I can drop my code as I would rather encourage more external developers to contribute to the framework.

Member

@ikelos ikelos left a comment

Generally looks fine, thanks! Just wondering about an alias to keep the naming more consistent...

@the-rectifier
Contributor Author

Hello @atcuno, thanks for the reply!

I wasn't aware of the issues with smearing. I just hacked up a quick way to interact with structures now that I started working on my thesis.

That being said, I am not that familiar with volatility internals yet, to promise a timeline. I might look into a proper version sometime in the future but I think it'd be better to continue with your codebase

@atcuno
Contributor

atcuno commented Dec 5, 2024

@the-rectifier don't give up yet. The basic idea would be to allow either the virtual address or physical offset to be specified and then create an _EPROCESS object at it.

For physical offsets, there is a helper function:

def virtual_process_from_physical(

Let me know if you have any questions.

@ikelos
Member

ikelos commented Dec 5, 2024

I'd also be wary of putting too much into the generic volshell plugin. If a plugin can do it better I think we should leave the plugin to do it (even if that includes working on a physical process offset). Linking volshell functionality against a specific plugin (even a stock one) is a bad idea, as would duplicating the functionality be. I'm sure there's a solution, but I don't think it's necessary for the immediate term...

@atcuno
Contributor

atcuno commented Dec 5, 2024

I would be fine with code duplication in this case, even if copy/paste from the plugin, as being limited to pid= is very limiting and makes it impossible to use volshell on processes hidden by rootkits.

@ikelos
Member

ikelos commented Dec 5, 2024

... and makes it impossible to use volshell on processes hidden by rootkits.

(layer_name) >>> import volatility3.plugins.linux.pslist
(layer_name) >>> dpo(volatility3.plugins.linux.pslist.PsList, kernel=self.kernel)

I think you mean inconvenient. I've no problem loading up specific kernel offsets physical or virtual, but making volshell depend on psscan to find them rather than having to put a couple extra lines isn't worth the complexity it would bring to the command line tool in my opinion.

It's really not difficult to make use of a plugin and if that's too many lines to type, a snippet would reduce that. We're currently in a state where the dependency on simple pslist was missed when the main pslist plugin got version bumped, meaning that stock volshell on the current commit doesn't load because of that dependency. I'm not keen to add more when the plugin output is so readily accessible...

@atcuno
Contributor

atcuno commented Dec 5, 2024

The psscan link was to the virtual_process_from_physical helper function.

I was saying that code can be copy/pasted into the volshell side to avoid the dependency. Then passing in a physical offset to get the _EPROCESS would work inside of volshell. That offset would come from the user running psscan or psxview first and noticing the hidden process.

@ikelos
Member

ikelos commented Dec 5, 2024

My bad, that function doesn't look overly complex so I think I'd be ok duplicating it too...

@the-rectifier
Contributor Author

Thanks for the replies, all. I played with the code a bit, and I added this to the windows side of things:

kvo = self.config['kernel.offset']
ntkrnlmp = self.context.module(self.current_symbol_table, layer_name='layer_name', offset=kvo)

eproc = ntkrnlmp.object(
    object_type='_EPROCESS',
    offset=v_offset,
    absolute=True
)

As @atcuno suggested, much like the psscan plugin, since we have the virtual address we can just grab an _EPROCESS object at that address.

However, I still can't wrap my head around what to do when a physical address is provided. Is there a way to get from the physical address to the corresponding virtual one?

@eve-mem
Contributor

eve-mem commented Dec 7, 2024

@the-rectifier if you look at the linux.psscan as an example you'll see that the object is made with offsets on the physical layer but its native layer is set to the virtual one.

It's a sort of magical 'it just works'™ part of vol. It allows you to make an object on one layer, but then follow any pointers etc. in another.

https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3%2Fframework%2Fplugins%2Flinux%2Fpsscan.py#L148-L154

@eve-mem
Contributor

eve-mem commented Dec 7, 2024

Also, this discussion has made me think it might be useful to share scripts that people could import via rs and that make some customisations for specific tasks.

Similar to how people make use of gdb configs. I need to think about sharing all the volshell snippets I use.

@ikelos
Member

ikelos commented Dec 16, 2024

kvo = self.config['kernel.offset']

Also, we need to be slightly careful when putting in code that accesses sub-config values using kernel.offset. The . is reasonable, but I believe it's technically CONFIG_SEPARATOR, which means it can change and shouldn't really be hard-coded as part of a string. I think using interfaces.configuration.path_join('kernel', 'offset') would be ok...
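A minimal sketch of the difference, written so it runs without Volatility installed. `CONFIG_SEPARATOR` and `path_join` below are local stand-ins that only mimic the idea of the names mentioned in the comment; their exact behaviour in the framework is an assumption here, and the config dictionary and offset value are made up.

```python
# Stand-ins for illustration only. The names mentioned in the comment are
# CONFIG_SEPARATOR and interfaces.configuration.path_join; these local
# versions are assumptions about their behaviour, not the framework code.
CONFIG_SEPARATOR = "."


def path_join(*parts: str) -> str:
    """Join configuration path components using the configured separator."""
    return CONFIG_SEPARATOR.join(parts)


# Made-up config contents keyed the way the shell's config would be.
config = {path_join("kernel", "offset"): 0xF80000000000}

# Hard-coded key: silently stops matching if the separator ever changes.
fragile_key = "kernel.offset"

# Built key: always follows whatever the separator currently is.
robust_key = path_join("kernel", "offset")

kvo = config[robust_key]
```

With the real framework, the equivalent line would presumably be `self.config[interfaces.configuration.path_join('kernel', 'offset')]`, as suggested above.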

@the-rectifier
Contributor Author

Hey all, happy new year!

I've been working on my thesis and I figured I'll work on this, now that I have a slightly better understanding of the framework and its inner workings.

I updated windows.py to accept a virtual offset and to create the _EPROCESS object at that offset. The only issue for me is that on the windows side, I didn't find any way to create the object at the physical offset. From my understanding, psscan uses poolscanner to search for the _POOL_HEADER (and its tag) and creates the _EPROCESS from there...

thank you @eve-mem for suggesting that linux virtual/physical offsets can be used interchangeably when creating the object

also implemented some extra sanity checks for when we know which PID we are expecting

@ikelos
Member

ikelos commented Jan 18, 2025

Happy New Year to you too! :D

So basically, if you've got physical_layer_name, physical_offset and virtual_layer_name, then you simply do create_object(layer_name=physical_layer_name, offset=physical_offset, native_layer_name=virtual_layer_name). This will read the data for the object out of the physical layer, but ensure that any pointers that are accessed within create their target objects on the virtual layer (at the offset mentioned in the pointer). This has the effect of providing all the right data from the object and its children, but allows it to have been created on the physical layer. Does that help explain the process? You still have to have found the physical offset somehow yourself, but once you've got it, making the actual object is no problem... :)
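The behaviour described above can be modelled with a toy example. This is not Volatility code: `LAYERS`, `ToyObject` and this `create_object` are hypothetical stand-ins that only imitate the idea of reading an object from one layer while resolving its pointers on a separate native layer.

```python
from dataclasses import dataclass
from typing import Dict

# Toy "layers": layer name -> {offset: stored fields}. All values are made up.
LAYERS: Dict[str, Dict[int, dict]] = {
    "physical": {
        0x5000: {"pid": 1234, "parent_ptr": 0xFFFF8000},
    },
    "virtual": {
        0xFFFF8000: {"pid": 4, "parent_ptr": 0},
    },
}


@dataclass
class ToyObject:
    layer_name: str         # where this object's own data is read from
    offset: int
    native_layer_name: str  # where pointers found inside it are resolved

    @property
    def fields(self) -> dict:
        return LAYERS[self.layer_name][self.offset]

    def follow(self, pointer_field: str) -> "ToyObject":
        """Dereference a pointer: the target is built on the native layer."""
        target = self.fields[pointer_field]
        return ToyObject(self.native_layer_name, target, self.native_layer_name)


def create_object(layer_name: str, offset: int, native_layer_name: str) -> ToyObject:
    """Hypothetical helper mirroring the keyword arguments named above."""
    return ToyObject(layer_name, offset, native_layer_name)


# Object data comes from the physical layer...
proc = create_object(layer_name="physical", offset=0x5000, native_layer_name="virtual")
# ...but the pointer inside it is followed on the virtual layer.
parent = proc.follow("parent_ptr")
```

Here `proc` is read at a physical offset, yet `parent` lands on the virtual layer because pointer targets always use the native layer name.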

@the-rectifier
Contributor Author

So I played with the code a bit, and before committing, I'd like to ask a few more questions:

  • Is it correct to say that the memory_layer (Lime, Elf) is the physical layer and the layer_name and the layers created using add_process_layer() are the virtual layers (Intel32) (and is this always the case)?

  • On the Linux side, at least from my testing, the offset can be either a virtual or a physical one, and the object is always created on the memory_layer with the layer_name as the native layer. On Windows this doesn't work so I would have to put different parameters for each type of offset (as currently implemented). So I wanted to ask if we should consider a type of address more important (concrete) or some precedence, ie physical > virtual > pid, or make it exclusively a single non-None parameter or leave it up to the user?

@ikelos
Member

ikelos commented Jan 19, 2025

No, that's not guaranteed, although it is convention. Add process layer will typically create a virtual layer, the layer below an intel layer is typically called memory_layerX, but it's better to ask the virtual layer for layer.config.get('memory_layer') to determine the actual name of the lower layer. Also note, that's only because intel's base layer is called memory_layer and for arm it may end up being different, but we'll cross that bridge later.

The native layer machinery is independent of the OS, but it may also be unnecessary. If you've got the virtual address for it, that's best. If you've got a physical address for it and you know the virtual layer it lives in, that comes a close second, but there isn't really a precedent. If you're intending to use a single offset parameter, don't; it'll just confuse things. Otherwise, I'd say if someone provides more than one of (pid, virtual, physical) then you throw an error and say they need to provide only one...
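One way to enforce the "only one of (pid, virtual, physical)" rule is sketched below. The function name `select_lookup` and the parameter names `virtaddr`/`physaddr` are hypothetical and chosen for illustration; the merged code may differ, for example by printing a message rather than raising.

```python
from typing import Optional, Tuple


def select_lookup(
    pid: Optional[int] = None,
    virtaddr: Optional[int] = None,
    physaddr: Optional[int] = None,
) -> Tuple[str, int]:
    """Return (kind, value) for the single argument supplied, else raise ValueError."""
    candidates = (("pid", pid), ("virtaddr", virtaddr), ("physaddr", physaddr))
    # Compare against None explicitly so that 0 still counts as "supplied".
    supplied = [(kind, value) for kind, value in candidates if value is not None]
    if len(supplied) != 1:
        raise ValueError(
            f"Provide exactly one of pid, virtaddr or physaddr (got {len(supplied)})"
        )
    return supplied[0]
```

Keeping the address kinds as separate keyword arguments means the caller states which kind of offset is meant, so the code never has to guess between a virtual and a physical interpretation.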

@the-rectifier
Contributor Author

Thanks for explaining the layers @ikelos, everything is much clearer now! I've made the changes you suggested, and also now I'm sourcing the layer names from the kernel module (like the linux psscan) as opposed to self.current_layer which can change. Since for linux the same code can be used for both offset types I have only one offset parameter, shall I change that?

@the-rectifier
Contributor Author

Also, the force-push removed the documentation commit, will do that when I finalize the code

@ikelos
Member

ikelos commented Jan 20, 2025

Thanks for explaining the layers @ikelos, everything is much clearer now! I've made the changes you suggested, and also now I'm sourcing the layer names from the kernel module (like the linux psscan) as opposed to self.current_layer which can change. Since for linux the same code can be used for both offset types I have only one offset parameter, shall I change that?

I probably would, how do you know whether they provided a virtual or a physical offset otherwise? If they both happen to be valid, you've no way of telling which one they meant...

Member

@ikelos ikelos left a comment

Looking much better, thanks! I think there's still a few improvements to be had, but to be honest I'd forgotten we used print rather than vollog within volshell, but that seems to be how the rest of it's coded...

)

try:
DescExitStateEnum(ptask.exit_state)
Member

Is this statement to test for offset validity? If so I think something like is_valid would work better, so ptask.exit_state.is_valid()?

Contributor Author

I took this from psscan, where it checks for a valid exit state (psscan.py). Also, it seems that is_valid() is not a method of the members of the task_struct but rather of the struct itself (pslist.py).

Member

Oh, yeah, fair enough. It will work, just sticks out a little to me...

Contributor Author

Shall I use is_valid() on all procs/tasks then? On Windows, since dt() breaks if the object is not valid, returning None makes more sense than just printing a warning.

Member

No, it's alright, it just stands out to me as a method of testing (and getting a second example in the code base will make it harder to prevent if more people add it that way). There's currently a bit of a shift in the linux codebase where they're implementing check validity first rather than throwing exceptions. If you'd like to and can get it to work, it might be better in keeping with the rest of the codebase, but if you don't want to or it's tricky to achieve then that's fine...

@@ -50,6 +107,7 @@ def construct_locals(self) -> List[Tuple[List[str], Any]]:
result += [
(["ct", "change_task", "cp"], self.change_task),
(["lt", "list_tasks", "ps"], self.list_tasks),
(["gp", "get_process"], self.get_process),
Member

Can you add "get_task" to the end of that list too please? gt won't work, but get_task then would... Can't tell if it's worth adding that alias to windows too, I'll leave that one up to you...

Contributor Author

I don't particularly like that we needed to mix process/task, so for the windows side I think I'd leave it with just the process suffix
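For context, the alias lists in the hunk above pair several names with one callable. A rough sketch of how such pairs could be expanded into the shell's namespace follows; `build_namespace` and the placeholder `get_process` are hypothetical, and how volshell actually consumes the result of construct_locals is an assumption here.

```python
from typing import Any, Dict, List, Tuple


def get_process(pid=None):
    """Hypothetical placeholder for the real method under review."""
    return f"process {pid}"


def build_namespace(entries: List[Tuple[List[str], Any]]) -> Dict[str, Any]:
    """Expand ([alias, ...], target) pairs into a flat name -> target mapping."""
    namespace: Dict[str, Any] = {}
    for aliases, target in entries:
        for alias in aliases:
            namespace[alias] = target
    return namespace


# The Linux entry with the extra alias requested in the review.
entries = [(["gp", "get_process", "get_task"], get_process)]
shell_locals = build_namespace(entries)
```

Every alias resolves to the same callable, so adding "get_task" to the list changes only what the user can type, not what runs.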

@@ -40,6 +51,52 @@ def change_task(self, pid=None):
return None
print(f"No task with task ID {pid} found")

def get_process(self, pid=None, offset=None):
Member

This implies offset is always a physical offset. Were you going to split that into physical_offset and virtual_offset or something?

Contributor Author

I did split the code, basically the same as the windows side, but wanted to get your input on the is_valid() thing before committing

@@ -44,11 +44,58 @@ def list_processes(self):
)
)

def get_process(self, pid=None, v_offset=None, p_offset=None):
"""Returns the EPROCESS object that matches the pid. If v_offset/p_offset is provided, construct the EPROCESS object at the provided address. Only one parameter is allowed."""
Member

Same here with regard to Args: and Returns:. Also, I kinda prefer variables to have more descriptive names (although I'll allow it if you really want, since it is likely to be typed by developers quite often).

Contributor Author

shall I use virtaddr and physaddr on both?

@the-rectifier the-rectifier changed the title from "Volshell: Add Dedicated Method to retrieve EPROCESS/Task object given the PID" to "Volshell: Add Dedicated Method to retrieve EPROCESS/Task object" on Jan 22, 2025
@ikelos ikelos merged commit db0a8a0 into volatilityfoundation:develop Jan 23, 2025
13 checks passed
@ikelos
Member

ikelos commented Jan 23, 2025

Ok, looks good, thanks!
