Merge pull request #315 from Washi1337/development
4.11.0
Washi1337 authored May 13, 2022
2 parents 388a36e + a098aa1 commit bcfba6b
Showing 51 changed files with 888 additions and 169 deletions.
2 changes: 1 addition & 1 deletion Directory.Build.props
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
<RepositoryUrl>https://github.com/Washi1337/AsmResolver</RepositoryUrl>
<RepositoryType>git</RepositoryType>
<LangVersion>10</LangVersion>
<Version>4.10.0</Version>
<Version>4.11.0</Version>
</PropertyGroup>

</Project>
4 changes: 2 additions & 2 deletions appveyor.yml
Expand Up @@ -4,7 +4,7 @@
- master

image: Visual Studio 2022
version: 4.10.0-master-build.{build}
version: 4.11.0-master-build.{build}
configuration: Release

skip_commits:
Expand Down Expand Up @@ -33,7 +33,7 @@
- development

image: Visual Studio 2022
version: 4.10.0-dev-build.{build}
version: 4.11.0-dev-build.{build}
configuration: Release

skip_commits:
Expand Down
108 changes: 88 additions & 20 deletions docs/dotnet/importing.rst
Expand Up @@ -5,16 +5,27 @@ Reference Importing

.NET modules use entries in the TypeRef or MemberRef tables to reference types or members from external assemblies. Importing references into the current module therefore plays a key role when creating new or modifying existing .NET modules. When a member is not imported into the current module, a ``MemberNotImportedException`` will be thrown when you try to create a PE image or write the module to disk.

AsmResolver provides the ``ReferenceImporter`` class that does most of the heavy lifting.
AsmResolver provides the ``ReferenceImporter`` class that does most of the heavy lifting. Obtaining an instance of ``ReferenceImporter`` can be done in two ways.

All samples in this document assume there is an instance of ``ReferenceImporter`` created using the following code:
Either instantiate one yourself:

.. code-block:: csharp
ModuleDefinition module = ...
var importer = new ReferenceImporter(module);
Or obtain the default instance that comes with every ``ModuleDefinition`` object. This avoids allocating new reference importers every time.

Importing metadata members
.. code-block:: csharp
ModuleDefinition module = ...
var importer = module.DefaultImporter;
The example snippets that follow in this article assume that a ``ReferenceImporter`` object has been obtained using either of these two methods and is stored in an ``importer`` variable.


Importing existing members
--------------------------

Metadata members from external modules can be imported using the ``ReferenceImporter`` class using one of the following members:
Expand Down Expand Up @@ -59,18 +70,28 @@ Below an example of how to import a type definition called ``SomeType``:
ITypeDefOrRef importedType = importer.ImportType(typeToImport);
Importing type signatures
-------------------------
These types also implement the ``IImportable`` interface. This means you can also use the ``member.ImportWith`` method instead:

.. code-block:: csharp
ModuleDefinition externalModule = ModuleDefinition.FromFile(...);
TypeDefinition typeToImport = externalModule.TopLevelTypes.First(t => t.Name == "SomeType");
ITypeDefOrRef importedType = typeToImport.ImportWith(importer);
Importing existing type signatures
----------------------------------

Type signatures can also be imported using the ``ReferenceImporter`` class, but these should be imported using the ``ImportTypeSignature`` method instead.

.. note::
.. note::

If a corlib type signature is imported, the appropriate type from the ``CorLibTypeFactory`` of the target module will be selected, regardless of whether CorLib versions are compatible with each other.


Importing using reflection
--------------------------
Importing using System.Reflection
---------------------------------

Types and members can also be imported by passing on an instance of various ``System.Reflection`` classes.

Expand All @@ -90,22 +111,72 @@ Types and members can also be imported by passing on an instance of various ``Sy
| ``FieldInfo`` | ``ImportScope`` | ``MemberReference`` |
+---------------------------+------------------------+----------------------+


There is limited support for importing compound types. Types that can be imported through reflection include:
There is limited support for importing complex types. Types that can be imported through reflection include:

- Pointer types.
- By-reference types.
- Array types:
- If an array contains only one dimension, a ``SzArrayTypeSignature`` is returned. Otherwise an ``ArrayTypeSignature`` is created.
- Array types (if an array contains only one dimension, a ``SzArrayTypeSignature`` is returned; otherwise an ``ArrayTypeSignature`` is created).
- Generic parameters.
- Generic type instantiations.

Instantiations of generic methods are supported.
Instantiations of generic methods are also supported.


Creating new references
-----------------------

Member references can also be created and imported without having direct access to their member definitions or ``System.Reflection`` instances. It is possible to create new instances of ``TypeReference`` and ``MemberReference`` using the constructors, but the preferred way is to use the factory methods, which allow for a more fluent syntax. Below is an example of how to create a fully imported reference to ``void System.Console.WriteLine(string)``:

.. code-block:: csharp
var factory = module.CorLibTypeFactory;
var importedMethod = factory.CorLibScope
.CreateTypeReference("System", "Console")
.CreateMemberReference("WriteLine", MethodSignature.CreateStatic(
factory.Void, factory.String))
.ImportWith(importer);
// importedMethod now references "void System.Console.WriteLine(string)"
Generic type instantiations can also be created using ``MakeGenericInstanceType``:

.. code-block:: csharp
ModuleDefinition module = ...
var factory = module.CorLibTypeFactory;
var importedMethod = factory.CorLibScope
.CreateTypeReference("System.Collections.Generic", "List`1")
.MakeGenericInstanceType(factory.Int32)
.ToTypeDefOrRef()
.CreateMemberReference("Add", MethodSignature.CreateInstance(
factory.Void,
new GenericParameterSignature(GenericParameterType.Type, 0)))
.ImportWith(importer);
// importedMethod now references "System.Collections.Generic.List`1<System.Int32>.Add(!0)"
Similarly, generic method instantiations can be constructed using ``MakeGenericInstanceMethod``:

.. code-block:: csharp
ModuleDefinition module = ...
var factory = module.CorLibTypeFactory;
var importedMethod = factory.CorLibScope
.CreateTypeReference("System", "Array")
.CreateMemberReference("Empty", MethodSignature.CreateStatic(
new GenericParameterSignature(GenericParameterType.Method, 0).MakeSzArrayType(), 1))
.MakeGenericInstanceMethod(factory.String)
.ImportWith(importer);
// importedMethod now references "!!0[] System.Array.Empty<System.String>()"
.. _dotnet-importer-common-caveats:

Common Caveats using the Importer
Common Caveats using the Importer
---------------------------------

Caching and reuse of instances
Expand All @@ -116,20 +187,17 @@ The default implementation of ``ReferenceImporter`` does not maintain a cache. E
Importing cross-framework versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``ReferenceImporter`` does not support importing across different versions of the target framework. Members are being imported as-is, and are not automatically adjusted to conform with other versions of a library.
The ``ReferenceImporter`` does not support importing across different versions of the target framework. Members are being imported as-is, and are not automatically adjusted to conform with other versions of a library.

As a result, importing from, for example, a library that is part of the .NET Framework into a module targeting .NET Core (or vice versa) has a high chance of producing an invalid .NET binary that cannot be executed by the runtime. For example, attempting to import a reference to ``[System.Runtime] System.DateTime`` into a module targeting .NET Framework will result in a new reference targeting a .NET Core library (``System.Runtime``) as opposed to the appropriate .NET Framework library (``mscorlib``).
As a result, importing from, for example, a library that is part of the .NET Framework into a module targeting .NET Core (or vice versa) has a high chance of producing an invalid .NET binary that cannot be executed by the runtime. For example, attempting to import a reference to ``[System.Runtime] System.DateTime`` into a module targeting .NET Framework will result in a new reference targeting a .NET Core library (``System.Runtime``) as opposed to the appropriate .NET Framework library (``mscorlib``).

This is a common mistake when trying to import using metadata provided by ``System.Reflection``. For example, if the host application that uses AsmResolver targets .NET Core but the input file targets .NET Framework, then you will run into the exact issue described above.

.. code-block:: csharp
var targetModule = ModuleDefinition.FromFile(...);
var importer = new ReferenceImporter(targetModule);
var reference = importer.ImportType(typeof(DateTime));
// `reference` will target `[mscorlib] System.DateTime` when running on .NET Framework, and `[System.Runtime] System.DateTime` when running on .NET Core.
Therefore, always make sure you are importing from a .NET module that is compatible with the target .NET module.
Therefore, always make sure you are importing from a .NET module that is compatible with the target .NET module.
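
Before importing through ``System.Reflection``, it can help to check which runtime the host process is actually using, since that determines which core library the reflection metadata points at. The following is a minimal sketch that uses only the .NET base class library; the ``HostRuntimeCheck`` class and its method name are hypothetical and are not part of AsmResolver.

```csharp
using System;
using System.Runtime.InteropServices;

public static class HostRuntimeCheck
{
    // Returns the simple name of the assembly that defines System.DateTime
    // in the process that is currently running this code.
    public static string GetHostCorLibName()
    {
        return typeof(DateTime).Assembly.GetName().Name ?? string.Empty;
    }

    public static void Main()
    {
        // The reported values differ between .NET Framework and .NET Core hosts.
        Console.WriteLine($"Host runtime: {RuntimeInformation.FrameworkDescription}");
        Console.WriteLine($"DateTime is defined in: {GetHostCorLibName()}");
    }
}
```

If the reported runtime family does not match the one the target module was compiled for, importing via ``typeof(...)`` is likely to produce the mismatch described above.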
34 changes: 0 additions & 34 deletions docs/dotnet/methods.rst

This file was deleted.

69 changes: 54 additions & 15 deletions docs/dotnet/unmanaged-method-bodies.rst
Expand Up @@ -20,7 +20,7 @@ Allowing native code in modules

To make the CLR treat the output file as a mixed mode application, the ``ILOnly`` flag needs to be unset:

.. code-block:: csharp
.. code-block:: csharp
ModuleDefinition module = ...
module.Attributes &= ~DotNetDirectoryFlags.ILOnly;
Expand Down Expand Up @@ -80,7 +80,7 @@ In the following sections, we will briefly go over each of them.
Writing native code
-------------------

The contents of a native method body can be set through the ``Code`` property. This is a ``byte[]`` that represents the raw code stream to be executed. Below is an example of a simple method body written in x86 64-bit assembly code, which returns the constant ``0x1337``:
The contents of a native method body can be set through the ``Code`` property. This is a ``byte[]`` that represents the raw code stream to be executed. Below is an example of a simple method body written in x86 64-bit assembly code, which returns the constant ``1337``:

.. code-block:: csharp
Expand All @@ -92,14 +92,20 @@ The contents of a native method body can be set through the ``Code`` property. T
.. note::

Since native method bodies are platform dependent, AsmResolver does not provide a standard way to encode these instructions. To construct the byte array that you need for a particular implementation of a method body, consider using a third-party assembler or assembler library.


References to external symbols
------------------------------
Symbols and Address Fixups
--------------------------

In a lot of cases, native method bodies that reference symbols (such as imported functions) require direct addresses to be encoded within their instructions. Since the addresses of these symbols are not yet known when a ``NativeMethodBody`` is created, it is not possible to encode such an operand directly in the ``Code`` byte array. To support these kinds of references regardless, AsmResolver can be instructed to apply address fixups just before writing the body to disk. These fixups are essentially small pieces of information that tell AsmResolver that, at a particular offset, the bytes should be replaced with a reference to a symbol in the final PE. This can be applied to any object that implements ``ISymbol``. In the following, two of the most commonly used symbols will be discussed.


Imported Symbols
~~~~~~~~~~~~~~~~

In a lot of cases, methods require making calls to functions defined in external libraries and native method bodies are no exception. In the PE file format, these kinds of symbols are often put into the imports directory. This is essentially a table of names that the Windows PE loader will go through, look up the actual address of each name, and put it in the import address table. Typically, when a piece of code is meant to make a call to an external function, the code will make an indirect call to an entry stored in this table. In x86 64-bit, using nasm syntax, a call to the ``puts`` function might look like the following snippet:
In the PE file format, symbols from external modules are often imported by placing an entry into the imports directory. This is essentially a table of names that the Windows PE loader will go through, look up the actual address of each name, and put it in the import address table. Typically, when a piece of code is meant to make a call to an external function, the code will make an indirect call to an entry stored in this table. In x86 64-bit, using nasm syntax, a call to the ``puts`` function might look like the following snippet:

.. code-block:: csharp
Expand All @@ -108,10 +114,6 @@ In a lot of cases, methods require making calls to functions defined in external
call qword [rel puts]
...
Since the import directory is not constructed yet when we are operating on the abstraction level of a ``ModuleDefinition``, the address of the import address entry is still unknown. Therefore, it is not possible to encode an operand like the one in the call instruction of the above example.

To support these kinds of references in native method bodies regardless, it is possible to instruct AsmResolver to apply address fixups just before writing the body to the disk. These are essentially small pieces of information that tell AsmResolver that at a particular offset the bytes should be replaced with a reference to a symbol in the final PE.

Consider the following example x86 64-bit code, that is printing the text ``Hello from the unmanaged world!`` to the standard output stream using the ``puts`` function.

.. code-block:: csharp
Expand All @@ -135,16 +137,16 @@ Consider the following example x86 64-bit code, that is printing the text ``Hell
0x67, 0x65, 0x64, 0x20, 0x77, 0x6f, 0x72, // "ged wor"
0x6c, 0x64, 0x21, 0x00 // "ld!"
};
Notice how the operand of the call instruction is left at zero (`0x00`) bytes. To let AsmResolver know that these 4 bytes are to be replaced by an address to an entry in the import address table, we first create a new instance of ``ImportedSymbol``, representing the ``puts`` symbol:
Notice how the operand of the ``call`` instruction is left at zero (``0x00``) bytes. To let AsmResolver know that these 4 bytes are to be replaced by an address to an entry in the import address table, we first create a new instance of ``ImportedSymbol``, representing the ``puts`` symbol:

.. code-block:: csharp
var ucrtbased = new ImportedModule("ucrtbased.dll");
var puts = new ImportedSymbol(0x4fc, "puts");
ucrtbased.Symbols.Add(puts);
We can then add it as a fixup to the method body:

Expand All @@ -155,12 +157,49 @@ We can then add it as a fixup to the method body:
));
The type of fixup that is required will depend on the architecture and instruction that is used. Below an overview:
Local Symbols
~~~~~~~~~~~~~

If a native body is supposed to process or return some data that is defined within the body itself, the ``NativeLocalSymbol`` class can be used.

Consider the following example x86 32-bit snippet, that returns the virtual address of a string.

.. code-block:: csharp
0xB8, 0x00, 0x00, 0x00, 0x00, // mov eax, message
0xc3, // ret
// message (unicode):
0x48, 0x00, 0x65, 0x00, 0x6c, 0x00, 0x6c, 0x00, 0x6f, 0x00, 0x2c, 0x00, 0x20, 0x00, // "Hello, "
0x77, 0x00, 0x6f, 0x00, 0x72, 0x00, 0x6c, 0x00, 0x64, 0x00, 0x21, 0x00, 0x00, 0x00 // "world!."
Notice how the operand of the ``mov`` instruction is left at zero (``0x00``) bytes. To let AsmResolver know that these 4 bytes are to be replaced by the actual virtual address of ``message``, we can define a local symbol and register an address fixup in the following manner:

.. code-block:: csharp
var message = new NativeLocalSymbol(body, offset: 0x6);
body.AddressFixups.Add(new AddressFixup(
0x1, AddressFixupType.Absolute32BitAddress, message
));
.. warning::

The ``NativeLocalSymbol`` can only be used within the code of the native method body itself. This is due to the fact that these types of symbols are not processed further after serializing a ``NativeMethodBody`` to a ``CodeSegment`` by the default method body serializer.
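
To make the effect of an ``Absolute32BitAddress`` fixup concrete, here is a conceptual sketch of the patch that ends up being applied to the ``mov eax, message`` snippet above. It does not use or reproduce AsmResolver's implementation: the ``AbsoluteFixupSketch`` class is hypothetical, and the body address ``0x00402000`` is an assumed value chosen purely for illustration.

```csharp
using System;
using System.Buffers.Binary;

public static class AbsoluteFixupSketch
{
    // Writes a 32-bit little-endian address into `code` at `fixupOffset`.
    public static void PatchAbsolute32(byte[] code, int fixupOffset, uint address)
    {
        BinaryPrimitives.WriteUInt32LittleEndian(code.AsSpan(fixupOffset, 4), address);
    }

    public static void Main()
    {
        // mov eax, imm32 ; ret  (operand left at zero, as in the snippet above)
        byte[] code = { 0xB8, 0x00, 0x00, 0x00, 0x00, 0xC3 };

        // Hypothetical values, for illustration only.
        uint bodyAddress = 0x00402000; // assumed virtual address of the method body
        uint symbolOffset = 0x6;       // offset of `message` within the body

        // The operand starts one byte after the 0xB8 opcode, hence offset 0x1.
        PatchAbsolute32(code, fixupOffset: 0x1, address: bodyAddress + symbolOffset);

        Console.WriteLine(BitConverter.ToString(code)); // B8-06-20-40-00-C3
    }
}
```

The real virtual address is only known once the final PE layout has been determined, which is why the fixup is recorded symbolically instead of being encoded by hand.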


Fixup Types
~~~~~~~~~~~

The type of fixup that is required will depend on the architecture and instruction that is used. Below is an overview of all fixups that AsmResolver is able to apply:

+--------------------------+-----------------------------------------------------------------------+---------------------------------+
| Fixup type | Description | Example instructions |
+==========================+=======================================================================+=================================+
| ``Absolute32BitAddress`` | The operand is an absolute virtual address | ``call dword [address]`` |
| ``Absolute32BitAddress`` | The operand is a 32-bit absolute virtual address | ``call dword [address]`` |
+--------------------------+-----------------------------------------------------------------------+---------------------------------+
| ``Absolute64BitAddress`` | The operand is a 64-bit absolute virtual address | ``mov rax, address`` |
+--------------------------+-----------------------------------------------------------------------+---------------------------------+
| ``Relative32BitAddress`` | The operand is an address relative to the current instruction pointer | ``call qword [rip+offset]`` |
+--------------------------+-----------------------------------------------------------------------+---------------------------------+
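
For the relative variant, the value written into the operand is a signed displacement rather than the address itself. The sketch below shows the usual x86-64 calculation, under the assumption that the 4-byte operand is the last field of the instruction (so the next instruction starts right after it). ``RelativeFixupSketch`` and the example addresses are hypothetical and do not mirror AsmResolver's internals.

```csharp
using System;

public static class RelativeFixupSketch
{
    // Computes the signed 32-bit displacement for a RIP-relative operand.
    // RIP points at the next instruction, i.e. 4 bytes past the operand start.
    public static int ComputeRel32(ulong operandAddress, ulong targetAddress)
    {
        ulong nextInstruction = operandAddress + 4;
        // `checked` throws if the distance does not fit in 32 bits.
        return checked((int)((long)targetAddress - (long)nextInstruction));
    }

    public static void Main()
    {
        // Hypothetical addresses, for illustration only.
        ulong operandAddress = 0x2012; // where the 4 zero bytes of the operand live
        ulong targetAddress = 0x3000;  // e.g. an import address table entry

        // 0x3000 - (0x2012 + 4) = 0xFEA
        Console.WriteLine(ComputeRel32(operandAddress, targetAddress)); // 4074
    }
}
```

Instructions that carry an extra immediate after the displacement would need a larger adjustment than 4, which is one reason the fixup type has to match the instruction being patched.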
2 changes: 1 addition & 1 deletion src/AsmResolver.DotNet/AsmResolver.DotNet.csproj
Expand Up @@ -27,7 +27,7 @@
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
</PackageReference>
<PackageReference Include="System.Text.Json" Version="6.0.2" />
<PackageReference Include="System.Text.Json" Version="6.0.4" />
</ItemGroup>

</Project>