
2023-01-25 11:00:00

Objects

This is part of a series on Native AOT.


So far, our examples have been very simple, using only integer types. Native AOT won't be very useful if we can never use objects.

And we can. We just need to express them in terms of the conventions of C.

Actually, the techniques for dealing with .NET objects in unmanaged code are not new. We've had good interop capabilities much longer than we've had Native AOT. The primary way to pass an object reference to unmanaged code is a GCHandle, and it first appeared in .NET Framework about 20 years ago.

If you are already familiar with GCHandle, please bear with me as I explain things from first principles in the context of Native AOT.

In many languages, memory is managed manually. If you need a bit of memory, you have to ask for it, and when you are done, you have to release it. In C, the standard library functions for this are called malloc and free. The need to carefully manage memory has been the source of countless bugs.
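To make the manual style concrete, here is a minimal C++ sketch using malloc and free. The helper name copy_string is made up for illustration and is not part of the sample code for this chapter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a heap-allocated copy of `text`.
// The caller owns the result and must pass it to std::free exactly once.
char* copy_string(const char* text)
{
    assert(text != nullptr);                     // precondition for this sketch

    std::size_t size = std::strlen(text) + 1;    // include the terminating '\0'
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy == nullptr)
        return nullptr;                          // allocation failed

    std::memcpy(copy, text, size);
    return copy;
}
```

Nothing in the language enforces the "exactly once" rule. Forgetting the call to free leaks the block, and calling it twice on the same pointer is undefined behavior.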

The memory for .NET objects is managed automatically by a Garbage Collector (GC). When you construct an object, memory is allocated, but you don't have to worry about explicitly releasing it. The .NET runtime keeps track of things for you, and when a block of memory is no longer being used, it is classified as "garbage" and freed.

This works because the .NET runtime can see every reference to every managed object. But if you want to store an object reference somewhere that .NET cannot see, such as unmanaged memory, then the GC doesn't know about that reference, so it might decide the object is garbage, and your reference would become invalid.

This is the problem a GCHandle is designed to solve. We can create a GCHandle for any object, and when we do so, we are telling the GC that "as long as this handle exists, the object is not garbage". The GCHandle can be passed into unmanaged code and stored in unmanaged memory.
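One way to picture the idea is a toy model in C++. This is only an analogy, and it is not how the .NET runtime implements GCHandle. The class ToyHandleTable and its methods are invented for this sketch: a table entry plays the role of the handle, and a shared_ptr plays the role of the object reference.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Toy model only: NOT the real GCHandle implementation.
// While a table entry exists, the table holds a strong reference,
// so the object it refers to cannot be destroyed.
class ToyHandleTable
{
public:
    // Register an object and return an opaque integer standing in for the handle.
    std::uintptr_t alloc(std::shared_ptr<std::string> obj)
    {
        assert(obj != nullptr);
        std::uintptr_t h = next_++;
        table_[h] = std::move(obj);
        return h;
    }

    // Look up the object behind a handle (loosely like reading GCHandle.Target).
    // Returns an empty pointer for an unknown handle.
    std::shared_ptr<std::string> target(std::uintptr_t h) const
    {
        auto it = table_.find(h);
        if (it == table_.end())
            return nullptr;
        return it->second;
    }

    // Drop the table's reference (loosely like GCHandle.Free).
    // Returns false if the handle was unknown or already released.
    bool release(std::uintptr_t h)
    {
        return table_.erase(h) == 1;
    }

private:
    std::uintptr_t next_ = 1;   // 0 is reserved to mean "no handle" in this toy
    std::unordered_map<std::uintptr_t, std::shared_ptr<std::string>> table_;
};
```

In this toy, code outside the table only ever holds the integer. The object stays alive until release is called, even after every other reference to it has gone away.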

The following Native AOT function returns an object (a string):

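// Requires: using System; using System.Runtime.InteropServices;
[UnmanagedCallersOnly(EntryPoint = "get_hello_string")]
public static IntPtr GetHelloString()
{
    string s = "Hello World";
    // Tell the GC to keep the string alive for as long as this handle exists
    GCHandle h = GCHandle.Alloc(s);
    // Convert the handle to a pointer-sized integer that can cross the boundary
    return GCHandle.ToIntPtr(h);
}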

An IntPtr is an integer that is the same size as a pointer. On most modern systems, that'll be 64 bits. On 32-bit systems, pointers are 32 bits wide, so IntPtr is as well.
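On the C++ side, the pointer-sized integer type used later in this chapter is uintptr_t. The following small sketch checks its width at compile time. The function name pointer_bits is made up for illustration, and the equal-size check is an assumption that holds on mainstream 32-bit and 64-bit platforms rather than a guarantee of the C++ standard.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// The standard only promises that uintptr_t can hold a converted void*.
// This sketch additionally assumes it is exactly pointer-sized.
static_assert(sizeof(std::uintptr_t) == sizeof(void*),
              "expected uintptr_t to be pointer-sized on this platform");

// Width of a pointer-sized integer in bits (typically 32 or 64).
constexpr std::size_t pointer_bits()
{
    return sizeof(std::uintptr_t) * CHAR_BIT;
}
```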

In any case, an IntPtr is an integer, so we can return it across the Native AOT boundary, where it can be used by unmanaged code in whatever way we like.

Well, actually, the unmanaged code can't do much with it at all. The IntPtr is "opaque". It's probably the numerical address of a block of memory, but it doesn't have to be, and even if it is, we're not supposed to modify that memory or even look at it.

The only thing we can do with our IntPtr is give it back to the .NET code and ask it to do something. But that opens lots of possibilities.

Here's a Native AOT function that retrieves the length of a string:

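[UnmanagedCallersOnly(EntryPoint = "get_string_length")]
public static int GetStringLength(IntPtr v)
{
    // Recover the GCHandle from the opaque integer
    GCHandle h = GCHandle.FromIntPtr(v);
    // Target is the object the handle refers to
    object ob = h.Target;
    // This cast throws if the handle refers to something other than a string
    string s = (string) ob;
    int len = s.Length;
    return len;
}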

This is the typical pattern when we have an object handle in unmanaged code and we pass it back to .NET and ask it to do something.

So far, we've seen one code snippet that converts an object to an IntPtr, and one code snippet that converts an IntPtr back to an object. But it's quite common to need both in the same function. Here's a Native AOT function that accepts a string and returns another string made by calling String.Replace():

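[UnmanagedCallersOnly(EntryPoint = "banish_letter_l")]
public static IntPtr BanishLetterL(IntPtr v)
{
    // Handle in: recover the string behind the caller's handle
    var s = (string) GCHandle.FromIntPtr(v).Target;
    // Strings are immutable, so Replace returns a new string object
    var s2 = s.Replace("l", "NOT");
    // Handle out: the new string gets its own handle; the caller's handle v is left untouched
    return GCHandle.ToIntPtr(GCHandle.Alloc(s2));
}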

It is important to remember that every GCHandle must be released. So, if we're going to return objects from Native AOT functions, we must also provide something like the following:

[UnmanagedCallersOnly(EntryPoint = "free_object_handle")]
public static void FreeObjectHandle(IntPtr v)
{
    GCHandle h = GCHandle.FromIntPtr(v);
    // After Free, the value v must not be used again
    h.Free();
}

The GCHandle concept is a way of bridging the gap between the automatic memory management of .NET and the world where memory is managed manually. Like almost any other form of manual memory management, GCHandle is very unforgiving. If we don't release a handle, the object will never be freed, and we get a memory leak. If we release a handle more than once, or if we release a handle that does not exist, we are likely to cause memory corruption.
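In C++ callers, one way to make the "exactly once" rule harder to break is a small RAII wrapper. The sketch below is not part of the original sample. The class name ObjectHandle is hypothetical, and it assumes that 0 is never a valid handle value, so 0 can mean "empty". The release function is passed in so that the class does not depend on any particular library.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical RAII wrapper: owns one handle value and calls the supplied
// release function exactly once, when the owner goes out of scope.
class ObjectHandle
{
public:
    using ReleaseFn = void (*)(std::uintptr_t);

    ObjectHandle(std::uintptr_t value, ReleaseFn release)
        : value_(value), release_(release)
    {
        assert(release != nullptr);
    }

    // Copying is disabled: two owners would release the same handle twice.
    ObjectHandle(const ObjectHandle&) = delete;
    ObjectHandle& operator=(const ObjectHandle&) = delete;

    // Moving transfers ownership and leaves the source empty (value 0).
    ObjectHandle(ObjectHandle&& other) noexcept
        : value_(std::exchange(other.value_, 0)), release_(other.release_)
    {
    }

    // Move assignment is left out to keep the sketch short.
    ObjectHandle& operator=(ObjectHandle&&) = delete;

    ~ObjectHandle()
    {
        if (value_ != 0)
            release_(value_);   // runs once per owned handle
    }

    // The raw value, for passing back into the library.
    std::uintptr_t get() const { return value_; }

private:
    std::uintptr_t value_;
    ReleaseFn release_;
};
```

With the functions from this chapter, usage might look like `ObjectHandle s1(get_hello_string(), free_object_handle);` followed by `get_string_length(s1.get())`. The handle is then released automatically when s1 goes out of scope.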

Finally, the C++ code below shows an example of how to call the functions shown above.

#include <cstdint>
#include <stdio.h>

extern "C" uintptr_t get_hello_string();
extern "C" int32_t get_string_length(uintptr_t);
extern "C" uintptr_t banish_letter_l(uintptr_t);
extern "C" void free_object_handle(uintptr_t);

int main()
{
    // the original string is "Hello World"
    uintptr_t s1 = get_hello_string();

    // the length of the original string is 11
    int32_t len1 = get_string_length(s1);
    printf("%d\n", len1);

    // the new string should be "HeNOTNOTo WorNOTd"
    uintptr_t s2 = banish_letter_l(s1);

    // the length of the new string is now 17
    int32_t len2 = get_string_length(s2);
    printf("%d\n", len2);

    // need to release both string objects
    free_object_handle(s1);
    free_object_handle(s2);

    return 0;
}

Two final thoughts about the code sample for this chapter:


The code for this blog entry is available at:

https://github.com/ericsink/native-aot-samples/tree/main/hello_string