COM for the “I did C++ once a thousand years ago, but only do .NET now” Developer –or– In the Defense of COM

 

To some .NET developers, COM is a dirty little turd that won’t flush no matter how hard they try.  Microsoft and other vendors keep pumping out new COM-based SDKs year after year.  Sometimes these APIs are directly consumable from .NET, and when they are not, you can visibly see developers’ anxiety rise at the thought of working directly with COM.  A lot of developers have a lot of beef with COM.  Where does it all come from?

“I did XYZ in COM and I couldn’t make it work, so COM sucks”, “I used ABC COM technology and it was way too overcomplicated”, and variations on these are widespread.  While the fact that so many developers have these grievances about COM is relevant, it is also not very fair.  Imagine looking at .NET for the first time, diving right into WCF or Workflow, and saying “I got burned by it, so .NET blows”.

Wikipedia has a fairly comprehensive article on COM, but unfortunately it just reinforces what many developers already believe…that it’s an over-engineered, over-complicated, bloated piece of shier…machinery.  (Being objective, I wanted to add that we’ve heard the same thing about the .NET BCLs.)  In defense of COM, I wanted to write an article explaining why, at its root, COM is simple, elegant, largely misunderstood, and how it wants to be your friend.

COM – What is it?  Why is it?

COM covers a lot of ground and has accumulated quite a few surrounding technologies over the years, many of which I would consider obsolete.  Because of this, I only want to cover what I feel are the core, timeless pieces of COM.

Get on with it already!

Back in the olden days, a developer would create a DLL using good old “C” to contain all of their wonderful logic.  To consume one of these DLLs, they would either link against the DLL’s import library (the .lib file) or load the DLL at runtime and look up specific exported functions (the same kind of functions you would call using P/Invoke).  Life was simple in these days, but then object oriented programming had to come in and screw it all up.  How does one share objects between DLLs?

Looking at the “Why” so we can explain the “What”

To really give an accurate picture of what COM is and why it exists, I need to give a little background.  When C++ was coming on the scene, there were no C++ compilers as we know them today.  One would write C++ code, then “precompile” that to “C” code for a “C” compiler to directly consume and compile.  This made practical sense, as C++ was really just C with some minor compiler magic.  Your CPU doesn’t know what an object is, so it was the C++ precompiler’s job to flatten out your class prototype to “C”.  Today’s C++ compilers do not need a C++ precompiler, but the end result is conceptually the same.

Since examples in assembly can scare most folks away from a blog post (and my asm prowess is weak), here is an example of what flattening a C++ class to “C” might look like (pseudo code, so not exact, but similar):

class CMyWidget
{
public:
    void Process()
    {
       ProcessMyPrivate(5);
    }

private:
    void ProcessMyPrivate(int num)
    {
      this->m_myLocal = num;
    }

    int m_myLocal;
};

Might get compiled down to this:

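/* The object itself is just a struct holding its member variables. */
struct CMyWidget
{
    int m_myLocal;
};

/* Each method becomes a plain function taking the object as a hidden first argument. */
void CMyWidget_ProcessMyPrivate(struct CMyWidget* pObj, int num)
{
    /* Calc mem address of m_myLocal and set value */
    pObj->m_myLocal = num;
}

void CMyWidget_Process(struct CMyWidget* pObj)
{
    /* A non-virtual call is a direct call; only virtual methods go through the v-table. */
    CMyWidget_ProcessMyPrivate(pObj, 5);
}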

When working with higher level, object oriented platforms like .NET, it’s easy to forget some important items.  So when I explain this example to .NET devs, or devs that have never used native code before, I need to point out a few things.  First, all methods, even if they belong to your class, are ALWAYS “static”.  That means every method you write in code exists in only one memory location by the time it gets executed, no matter how many instances there are.  Second, there is no such thing as an object.  The object gets compiled down to a pointer, which in the end is only a memory address marking the start of all your object’s member variables (it can also hold a pointer to the v-table, but I’ll talk about that in a bit).  Notice the class method calls have been expanded to take this context pointer.  So really, to your CPU, an object instance is simply a context structure, and the “this” keyword simply refers to that context.

Sharing the C++ Class across DLL Boundaries – What Seems to be the Problem?

I explained briefly how simple exporting regular “C” methods can be to share code between DLLs and the application.  I also explained how C++ classes get flattened and compiled down and also how they are seen to your CPU (more or less).  If C++ classes just get compiled to static “C” methods, which can be exported from our DLL, is that not sufficient to reuse and share classes across DLLs?  Yes…but not without some major caveats as this would come at a cost.

·         The first disadvantage of exporting a C++ class is that you must use the same C runtime in all DLLs/exes involved.  So if one DLL used MSVCRT7, the caller of the DLL must be using MSVCRT7.  The reason is that different versions of the C runtime manage the heap differently, so if DLL A tries to free resources allocated by DLL B, memory corruption can occur.

·         The second disadvantage is that the same compiler (and often the same compiler version) must be used.  This is because different compilers mangle names differently.  A DLL built with Microsoft’s compiler will have completely different symbol names than a DLL built with GCC.  So when your code is linked, the linker might be looking for “??4MyWidget…” when the export is really “MyWidget__Fi…”

·         Third, if a large change is made to the base classes in DLL A, even without breaking the API, then DLL B, which uses those base classes, must be recompiled.

·         Fourth, if you take the previous three disadvantages into consideration, we have a very tight coupling between our DLLs.  This inhibits reusability of the module.

Sometimes these issues don’t matter for a given project.  That is usually the case when the project is small, or an in-house utility, and that’s totally fine.  But let’s pretend Microsoft ignored these problems.  All developers would have to use the same compiler version.  Applications would have to be written for very specific versions of Windows.  This would be a complete nightmare for everyone.

Enter COM to Save the Day

COM attempts to solve these particular issues of sharing an object between DLLs, which may or may not have been built using different compilers or different C runtimes, in a simple and elegant way: by using interfaces and harnessing the v-table.  So hopefully I’ve confused you here and you’re asking “WTF is a v-table?” or “C++ doesn’t have a notion of interfaces, WTF are you talking about?”

Remember how I said a C++ object really just gets compiled down to a context-like pointer?  Well, it not only keeps reference to your class’s member variables, it also keeps reference to the memory locations of the virtual functions that make up the class.  This table of function addresses is known as a virtual table, aka v-table.

 

CMyWidget Object Pointer

    CMyWidget V-Table
        0 index    0xDEADBEEF    /* Points to CMyWidget::Process */
        1 index    0xBADBEEF     /* Points to CMyWidget::ProcessMyPrivate */

    m_myLocal                    /* Address to local variable */

So if we look at the value located at the CMyWidget object pointer, we will find another pointer.  This is the pointer to the v-table.  Think of the v-table as just an array of pointers.  In an x86 process, each entry in the v-table is 4 bytes that point to a method.  The next 4 bytes point to another method, and so on.
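To make the “array of pointers” idea concrete, here is a small sketch that builds a v-table by hand instead of letting the compiler do it.  Everything in it (Widget, WidgetMethod, g_widgetVTable, CallByIndex, RunVTableDemo) is made up for illustration, and a real compiler is free to lay things out differently, so treat it as a mental model rather than a description of any particular compiler.

```cpp
#include <cassert>

struct Widget;  // forward declaration: the "object" is just a struct

// Every v-table entry is a function taking the object pointer ("this").
typedef void (*WidgetMethod)(Widget* self);

struct Widget
{
    const WidgetMethod* vtable;  // first field: pointer to the v-table
    int m_myLocal;               // the object's data follows it
};

void Widget_Process(Widget* self) { self->m_myLocal = 5; }
void Widget_Reset(Widget* self)   { self->m_myLocal = 0; }

// One table, in a fixed order, shared by every Widget instance.
const WidgetMethod g_widgetVTable[] = { Widget_Process, Widget_Reset };

// Calls the method stored at the given index, the way a virtual call does.
void CallByIndex(Widget* obj, int index)
{
    obj->vtable[index](obj);
}

// Hypothetical demo: returns the value left in m_myLocal after calling slot 0.
int RunVTableDemo()
{
    Widget w = { g_widgetVTable, -1 };
    CallByIndex(&w, 0);          // slot 0 is Widget_Process
    assert(w.m_myLocal == 5);
    return w.m_myLocal;
}
```

The caller of CallByIndex only needs to know the order of the slots, not the names or addresses of the functions behind them.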

So here we can deduce that if we have an object’s pointer, we can read its v-table.  If we can read its v-table, we can call any virtual method on a C++ class!  Still, this might not be very helpful, as the compiler might place private methods in among the public methods.  COM solves this by using interfaces.  C++ has no “interface” keyword, but the concept exists in the form of abstract classes made up entirely of pure virtual functions.  One might look like this:

struct IMyWidget
{
   virtual void Process() = 0;
};

Then we can modify our class to implement this interface:

class CMyWidget : public IMyWidget
{
public:
    void Process()
    {
       ProcessMyPrivate(5);
    }

private:
    void ProcessMyPrivate(int num)
    {
      this->m_myLocal = num;
    }

    int m_myLocal;
};

So how does this solve the issue of passing a class across DLL boundaries? Consider this in DLL A.

// Exported with C linkage so the name is not mangled.
extern "C" __declspec(dllexport) IMyWidget* CreateWidget()
{
   return new CMyWidget();
}

The compiler will construct a CMyWidget object, and because the function returns a pointer to the IMyWidget interface, the caller sees a v-table with a single entry: the “Process” method defined in IMyWidget.  This means a calling DLL, say DLL B, can simply do something like this:

IMyWidget* widget = CreateWidget();
widget->Process();

This works because when DLL B is compiled, the compiler reads the IMyWidget definition and knows the first (0 index) method is the “Process” method.  This is how COM gets around sharing object instances across DLLs, across compilers and across C runtime versions.  There are a couple of implied things here though.  The first is that a factory method is required to instantiate the object.  The object needs to be created by the DLL that defines it, because that DLL uses a specific version of the C runtime.  The other implication, not shown here, is that you cannot simply call “delete widget” from DLL B, because the DLLs may be using different C runtimes.  Instead, a method that executes “delete this” must be added to the interface and implemented by CMyWidget.  That ensures the object is deleted by the same C runtime that allocated it.
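Here is a rough sketch of that factory-plus-“delete this” arrangement, with both “DLLs” squeezed into one file so it can be compiled anywhere.  The Destroy and GetValue methods and the UseWidget function are names I made up for the illustration; they are assumptions, not part of any real API.

```cpp
#include <cassert>

// ---- Shared header: the only thing both DLLs agree on ----
struct IMyWidget
{
    virtual void Process() = 0;
    virtual int  GetValue() = 0;   // hypothetical, added so the result is observable
    virtual void Destroy() = 0;    // hypothetical name for the "delete this" method
protected:
    ~IMyWidget() {}                // callers cannot "delete widget" directly
};

// ---- "DLL A": owns the concrete class and its allocation ----
class CMyWidget final : public IMyWidget
{
public:
    CMyWidget() : m_myLocal(0) {}
    void Process() override  { m_myLocal = 5; }
    int  GetValue() override { return m_myLocal; }
    void Destroy() override  { delete this; }  // freed by the side that allocated it
private:
    int m_myLocal;
};

extern "C" IMyWidget* CreateWidget()
{
    return new CMyWidget();
}

// ---- "DLL B": sees only the interface and the factory ----
int UseWidget()
{
    IMyWidget* widget = CreateWidget();
    widget->Process();
    int value = widget->GetValue();
    widget->Destroy();             // instead of "delete widget"
    return value;
}
```

Making the interface’s destructor protected is one way to have the compiler reject an accidental “delete widget” on the calling side.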

You Seem to be Digressing.  Get Back to COM!

So now we should be familiar with the general problem of sharing native objects between DLLs and how COM attempts to fix it.  That’s all fine and dandy, but there needs to be some standardization, as this simple pattern and a few compiler parlor tricks don’t make up a technology such as COM.  COM standardization all begins with the interface called IUnknown.  If a class implements IUnknown, it’s a COM object.  No ifs, ands or buts.  Here is what IUnknown is defined as:

interface IUnknown
{
   virtual HRESULT QueryInterface(REFIID riid, void** ppvObject) = 0;
   virtual ULONG AddRef(void) = 0;
   virtual ULONG Release(void) = 0;
};

 

Three functions.  That’s it.  As far as I’m concerned, this IS COM.  Though these methods are simple, they deserve explanation. 

AddRef/Release – Reference Counting

In native languages like C and C++ there is no garbage collector.  When you allocate on the heap, the allocation stays there until “delete” is called on it.  This creates difficulties, especially when you consider object ownership.  If Class A holds a reference to Object-X and then gives Object-X to Class B, who “owns” Object-X?  More specifically, who deletes Object-X?  If Class B deletes it when a Class B object gets destroyed, Class A is left with an invalid Object-X reference!  A common pattern for fixing this is reference counting.  If Object-X is a COM object that Class A holds a reference to, Object-X has a reference count of 1.  When Class B gets a reference to Object-X, the count becomes 2.  When Class B is destroyed, it does not delete Object-X; it decrements Object-X’s reference count to 1.  When Class A is destroyed, it decrements the count again, to 0.  Once the count reaches 0, Object-X deletes itself (delete this).  The Achilles’ heel of reference counting is circular references, but there are patterns to help with this.

COM uses what is known as intrusive reference counting.  This means that reference counting is built into the object itself, by way of IUnknown.  The AddRef method increments the reference count and the Release method decrements it; when the count reaches 0, the object deletes itself.
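The Class A / Class B walkthrough can be sketched in a few lines of portable C++.  RefCounted, the liveObjects counter and RunRefCountDemo are hypothetical names used only for this illustration, and no Windows headers are involved.  The sketch is also not thread safe; real implementations typically use an atomic increment such as InterlockedIncrement.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object.
class RefCounted
{
public:
    // liveObjects is an illustrative counter so the demo can observe deletion.
    explicit RefCounted(int* liveObjects) : m_refCount(1), m_liveObjects(liveObjects)
    {
        ++*m_liveObjects;
    }

    unsigned long AddRef() { return ++m_refCount; }

    unsigned long Release()
    {
        unsigned long remaining = --m_refCount;
        if (remaining == 0)
            delete this;           // last reference gone: the object frees itself
        return remaining;          // members must not be touched after the delete
    }

private:
    ~RefCounted() { --*m_liveObjects; }  // private: only Release may destroy it

    unsigned long m_refCount;
    int* m_liveObjects;
};

int RunRefCountDemo()
{
    int live = 0;
    RefCounted* x = new RefCounted(&live);   // Class A holds it: count is 1
    x->AddRef();                             // Class B gets a reference: count is 2
    unsigned long afterB = x->Release();     // Class B goes away: count is 1
    assert(afterB == 1 && live == 1);        // object is still alive
    unsigned long afterA = x->Release();     // Class A goes away: count is 0
    assert(afterA == 0 && live == 0);        // object deleted itself
    return live;
}
```

Keeping the destructor private means nobody outside the class can bypass the count with a stray “delete”.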

The basic protocol with AddRef/Release on a COM object is this:

·         Any factory method that creates a COM object for you returns it with a reference count greater than 0; the count has already been incremented on your behalf.

·         Any time a method returns a reference to a COM object, the callee has already called AddRef.

·         When done with a reference to a COM object, call Release.

As a side note, if you use smart pointers like CComPtr, they will handle all this housekeeping for you, and you will rarely need to call AddRef/Release by hand again.
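To give a feel for what such a smart pointer automates, here is a stripped-down sketch.  It is not CComPtr’s actual implementation; RefPtr, Counted and RunSmartPointerDemo are hypothetical names, and the constructor here simply takes over the reference that a factory already counted for the caller.

```cpp
#include <cassert>

// Hypothetical ref-counted object standing in for a COM object.
struct Counted
{
    int refs;
    bool* destroyed;
    explicit Counted(bool* flag) : refs(1), destroyed(flag) {}
    void AddRef()  { ++refs; }
    void Release() { if (--refs == 0) { *destroyed = true; delete this; } }
};

// Minimal sketch of the bookkeeping a COM smart pointer does for you.
template <typename T>
class RefPtr
{
public:
    // Adopts a reference the factory already counted for us (no AddRef).
    explicit RefPtr(T* adopted) : m_ptr(adopted) {}
    // Copying hands out a new reference, so it must AddRef.
    RefPtr(const RefPtr& other) : m_ptr(other.m_ptr) { if (m_ptr) m_ptr->AddRef(); }
    // Going out of scope gives the reference back.
    ~RefPtr() { if (m_ptr) m_ptr->Release(); }
    // Assignment is left out to keep the sketch short.
    RefPtr& operator=(const RefPtr&) = delete;
    T* operator->() const { return m_ptr; }
private:
    T* m_ptr;
};

bool RunSmartPointerDemo()
{
    bool destroyed = false;
    {
        RefPtr<Counted> a(new Counted(&destroyed));  // count is 1
        {
            RefPtr<Counted> b(a);                    // count is 2
            assert(b->refs == 2);
        }                                            // b released: count is 1
        assert(!destroyed && a->refs == 1);
    }                                                // a released: count is 0
    return destroyed;
}
```

Note that nothing in the demo calls AddRef or Release directly; scope does the work.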

All about IUnknown::QueryInterface

The QueryInterface method is at the very heart of COM.  It enables the “C” in the COM acronym.  As a component, a COM object can contain several different services (or interfaces).  QueryInterface provides access to these.

Consider this example:

IMyWidget* widget = (IMyWidget*)pComObj;

This works when pComObj really is an IMyWidget, because the compiler knows how to produce an IMyWidget pointer set up with the correct v-table.  The problem is that we may not know at runtime which interfaces pComObj supports.  We can’t just cast to various interfaces and hope for the best: a bad cast in C++ doesn’t throw a tidy exception, it crashes or corrupts memory when you call through it.  Another issue is that pComObj might not implement a specific interface itself, so it won’t be castable, but it might contain an object internally that does.

QueryInterface takes two parameters.  The first is a GUID that identifies the interface being requested.  Because C++ doesn’t have a rich type system or reflection like Java or .NET, QueryInterface needs to be told explicitly which interface you want.  The second is the address of a pointer that receives the requested interface on success.  If QueryInterface was a .NET technology, it might look like this:

comObj.QueryInterface<IMyWidget>(out widget);

A typical QueryInterface implementation looks like this:

HRESULT CMyWidget::QueryInterface(REFIID riid, LPVOID* ppvObj)
{
    // Always set out parameter to NULL, validating it first.
    if (!ppvObj)
        return E_INVALIDARG;
    *ppvObj = NULL;
    if (riid == IID_IUnknown)
    {
        // Increment the reference count and return the pointer.
        *ppvObj = static_cast<IUnknown*>(this);
        AddRef();
        return NOERROR;
    }
    if (riid == IID_IMyWidget)
    {
        // Increment the reference count and return the pointer.
        *ppvObj = static_cast<IMyWidget*>(this);
        AddRef();
        return NOERROR;
    }
    if (riid == IID_ISomeOtherInterface)
    {
        // Return a local reference to an internal service.
        // Caution: handing out an internal object like this needs extra work
        // to keep QueryInterface's identity and ref counting rules intact.
        *ppvObj = m_myLocalISomeOtherInterface;
        m_myLocalISomeOtherInterface->AddRef();
        return NOERROR;
    }

    return E_NOINTERFACE;
}

Here you can see any interface can be queried safely at runtime and even internal local instances can be returned, making the COM object truly componentized.  That’s it!  Hopefully by now you see that COM is quite simple, and in its own way, elegant.
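For the other side of the conversation, here is a sketch of what calling QueryInterface looks like from the client.  It is a portable mock-up: integer IDs stand in for GUIDs, plain long values stand in for HRESULTs, and every name in it (IUnknownLike, IWidgetLike, InterfaceId, kOk, RunQueryDemo and so on) is an assumption for illustration rather than anything from the Windows SDK.

```cpp
#include <cassert>

// Stand-ins for GUIDs and HRESULT codes.
enum InterfaceId { ID_Unknown, ID_Widget, ID_Printer };
const long kOk = 0;
const long kNoInterface = -1;
const long kInvalidArg = -2;

struct IUnknownLike
{
    virtual long QueryInterface(InterfaceId id, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

struct IWidgetLike : IUnknownLike
{
    virtual int Process() = 0;
};

class Widget final : public IWidgetLike
{
public:
    Widget() : m_refs(1) {}

    long QueryInterface(InterfaceId id, void** out) override
    {
        if (!out) return kInvalidArg;
        *out = nullptr;
        if (id == ID_Unknown) *out = static_cast<IUnknownLike*>(this);
        if (id == ID_Widget)  *out = static_cast<IWidgetLike*>(this);
        if (!*out) return kNoInterface;
        AddRef();                  // the caller now owns one more reference
        return kOk;
    }
    unsigned long AddRef() override { return ++m_refs; }
    unsigned long Release() override
    {
        unsigned long remaining = --m_refs;
        if (remaining == 0) delete this;
        return remaining;
    }
    int Process() override { return 5; }

private:
    ~Widget() {}
    unsigned long m_refs;
};

IUnknownLike* CreateWidget() { return new Widget(); }

int RunQueryDemo()
{
    IUnknownLike* unknown = CreateWidget();
    int result = -1;

    void* raw = nullptr;
    if (unknown->QueryInterface(ID_Widget, &raw) == kOk)
    {
        IWidgetLike* widget = static_cast<IWidgetLike*>(raw);
        result = widget->Process();
        widget->Release();         // done with the IWidgetLike reference
    }

    void* printer = nullptr;       // an interface this object does not support
    long hr = unknown->QueryInterface(ID_Printer, &printer);
    assert(hr == kNoInterface && printer == nullptr);

    unknown->Release();            // done with the original reference
    return result;
}
```

Asking for an unsupported interface is just a failed return code and a null pointer, which is exactly what a blind cast cannot give you.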

There’s Gotta be More to COM!

There is more, but that’s where things do get complicated: threading apartments, the registry (if used gratuitously), COM+, DCOM.  Bah!  You don’t have to use any of these technologies in order to use COM.  These are the technologies I feel are obsolete, and to an extent I think Microsoft feels the same way, as almost all of their new COM APIs do not use these things but instead use what they describe as “lightweight COM”, which is essentially what I have described here.

COMmon Misconceptions

“I need to register a COM object (regsvr32) to use it”

False.  Registering a COM object is analogous to adding a .NET assembly to the GAC.  It is not required for use, but if one wants their COM object globally accessible by just a ProgId or a CLSID (like a DirectShow filter or a WIC codec), registration is required.  Registering a COM object simply adds ONE GUID per COM object to the registry (along with some other minor metadata), plus the path to the DLL that the COM object resides in.

“COM is just over complicated and bloated”

False.  COM is just IUnknown, described above in this post.  The years and years of random technologies surrounding COM are what is complicated and bloated.  In the end, you choose how complicated your COM-based project will be.

“I cannot use any parameter types I want in my COM methods.  I don’t want to be confined to COM automation types!”

True and False.  This is more of a C++ issue.  For instance, if one parameter of your COM method was a std::string, you could have issues because the caller may be using a different C runtime than the compiled DLL that contains the COM class.  That’s not to say it’s not possible, but for safety and compatibility you want to pick bare-bones types.
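As one possible take on “bare-bones types”, the sketch below hands a string back through a buffer the caller owns, so no std::string and no heap allocation ever crosses the DLL boundary.  INamedWidget, CNamedWidget, GetName and RunNameDemo are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical interface that only uses plain C types in its signature.
struct INamedWidget
{
    // Returns the number of characters the full name needs, excluding the
    // terminator. Copies as much as fits and always null-terminates.
    virtual std::size_t GetName(char* buffer, std::size_t capacity) = 0;
};

class CNamedWidget final : public INamedWidget
{
public:
    std::size_t GetName(char* buffer, std::size_t capacity) override
    {
        const char* name = "widget";
        std::size_t needed = std::strlen(name);
        if (buffer && capacity > 0)
        {
            std::size_t count = needed < capacity - 1 ? needed : capacity - 1;
            std::memcpy(buffer, name, count);
            buffer[count] = '\0';
        }
        return needed;
    }
};

bool RunNameDemo()
{
    CNamedWidget widget;
    INamedWidget* named = &widget;

    char big[16];
    std::size_t needed = named->GetName(big, sizeof(big));
    assert(needed == 6 && std::strcmp(big, "widget") == 0);

    char tiny[4];                        // too small: the result is truncated
    named->GetName(tiny, sizeof(tiny));
    return std::strcmp(tiny, "wid") == 0;
}
```

Because the caller allocates and frees the buffer, the two sides never have to agree on a heap or on the layout of a library class.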

“COM Threading Apartments are bullshit”

True.  They are bullshit and you should not use them (IMO) in modern COM development.  As of this writing, it’s 2011.  We should not need things like the single-threaded apartment, as we know how to properly synchronize our code to be thread safe.  My advice is to always make your COM objects free threaded and, honestly, to bypass CoInitialize and CoCreateInstance where possible and just use factory methods to instantiate your COM objects.

 

 

 

7 comments

  1. Pingback: COM for the “I did C++ once a thousand years ago, but only do .NET … | Internet blog
  2. Eric Meyer

    Jeremiah … Nice post! You really got down to the real substance of COM without getting bogged down in the bloat. If Microsoft just emphasized this part, more developers would be comfortable using COM instead of being afraid of it. -Eric

  3. Robert Fraser

    My problem with COM isn’t the design of it, but rather the truly ungodly C interface (and the disturbing lack of documentation for it). I’m guessing MS just expects everyone to use C++ or .NET, but pure C89 is still the lingua franca of computing. Other cross-platform “object-oriented” systems have nice C interfaces, so why couldn’t MS be arsed to make one?

  4. Bradley Grainger

    I like how you’ve distilled COM to its essential core, but think the QueryInterface example might be poorly chosen; I don’t think it’s a good idea to suggest that QueryInterface can return an internal object because this is a very easy way to violate the rules of QueryInterface (http://msdn.microsoft.com/en-us/library/ms686590.aspx).

    For example, special code would be necessary in myLocalISomeOtherInterface to ensure that a QI for IUnknown returns the parent object. Also, calling AddRef on myLocalISomeOtherInterface instead of ‘this’ (in the IID_ISomeInterface if block) seems like a bad idea, because it could allow the parent’s refcount to drop to 0 while its internal services are still being used. (Of course, the parent could sum the refcounts on all its internal services when deciding to delete itself, but this is not immediately apparent.)

    • jeremiahmorrill

      Good eye! It was a poor example of COM aggregation as more work would need to be done to ensure the same IUnknown pointer is returned along with doing proper ref counting. I’ll update the article to reflect this! Thanks!

  5. David Schach

    Even if you use CComPtr, reference counting still has serious problems with cycles. A simple parent-child relationship in COM requires an extra object to break the cycle (which would otherwise prevent the objects from being freed). In complex systems these cycles can be easily missed and result in hard to track down memory leaks.

    The other problem with COM is error handling. Every line of code requires an if-test and the real code becomes lost inside all of the error checking. Even meticulous programmers get it wrong.

    • jeremiahmorrill

      You are correct about circular references and I covered that in my other article on memory management. A common solution is to use a weak_ptr that will break the circular reference. Developers need to be aware of such things, but even in managed languages, developers need to be aware of things that cause their memory not to be automatically freed.
