Releasing COM objects: Garbage Collector vs. Marshal.ReleaseComObject
We regularly remind developers about the need to release COM objects. Here are the relevant posts:
- How to properly release Excel COM objects: C# code examples – a collection of practices used to never leave a COM object non-released.
- Why Excel doesn’t quit – here I describe the constructs behind COM objects.
- When to release COM objects – common questions and scenarios.
Just recently we received Eduard’s comment declaring the first blog in this list “wrong and misleading”: the author says Marshal.ReleaseComObject() and related things made him spend “…lots of time on writing code that is totally unnecessary” and provides many links supporting the idea that you can use GC.Collect() to get rid of non-released COM objects. In summary, Eduard’s idea is as follows (my wording):
When processing non-referenced variables, GC clears the memory they occupy and releases the underlying COM objects. While you use multiple Marshal.ReleaseComObject() calls (each releases a single variable), I simply call GC.Collect() to release them all.
Here at Add-in Express, we do not agree with this approach. Everything we have told you about releasing COM objects is based on our practice of .NET programming for Office, which collectively amounts to almost 120 years. We understand that we may not know something, though.
The context
While that comment’s context is defined as a non-specified Excel version, I define the context as releasing COM objects in COM and VSTO add-ins loaded in any version of any Office application on Windows Desktop. Some subtopics below also apply to releasing COM objects in a standalone application automating an Office application. This post doesn’t apply to so-called Shared add-ins; that project template was available only in earlier Visual Studio versions.
What is Add-in Express from our point of view
We regard Add-in Express for Office as a set of components, utilities, settings, and practices providing a tried path to creating Office extensions. If you follow the path, you get your extension working right from the start, and it is deployable and updatable. What is more, it is ready to work in *all* versions of the target Office application(s), and it is free of simple Office-specific mistakes and of issues that are hard to identify and debug (we saw many of them).
Why release COM objects?
How Office reacts to a non-released COM object depends on the Office version, the Office application and the way you start it, as well as on the scenario. Say, our practice shows that non-released COM objects produce more issues when the Office application is started programmatically. To cut this short, releasing COM objects that you created in your code is what Office expects from you in all circumstances. There are areas where Microsoft stresses this requirement. There are areas and scenarios where this requirement isn’t that strict, but you may only know this from practice. Say, we find Word almost always tolerant of COM objects left non-released. But again, there’s no guarantee that your practice will apply to other areas of the Office application that you deal with or to the Office build installed on your customers’ machines. Whatever your practice is, Office was built on COM and so your code is subject to COM rules: release every COM object created.
To help you deal with this, Add-in Express code releases every COM object it passes to you through parameters of Add-in Express events. Your responsibility is to release every COM object created in your code.
Why is this so, mmm, non-ideal?
COM was .NET version 0. When COM appeared to be a global fail (it’s only supported on Windows systems), Microsoft developed .NET, which elaborates and develops the recipes of Java. The real difference between COM and .NET is the way unused objects are destroyed. In COM, the developer is (mostly indirectly) responsible for managing the reference counter that every COM object has: creating a COM object increments the counter, and releasing the COM object decrements it. When the reference counter reaches zero, the COM server (an Office application in our case) destroys the COM object and releases associated resources. In .NET, there is no such counter, and an object that is no longer referenced isn’t destroyed immediately. Instead, the Garbage Collector (GC) starts on a schedule of its own, finds non-referenced objects and destroys them. This description may be oversimplified. If you need to deal with GC – and I suggest that you never deal with it – start with Fundamentals of garbage collection. A suggestive title, isn’t it?
As described in Why Excel doesn’t quit, in the .NET+COM case, calling a property/method returning a COM object actually creates this triad:
- The COM object itself; it inhabits the native memory; your .NET code cannot access it directly.
- The RCW – a runtime callable wrapper. Being a .NET object, it communicates with the COM object; in particular, it deals with the reference counter scheme of COM.
- The .NET object that your code deals with. This object is a regular .NET object; it references the RCW, which references the COM object.
Now you see how ReleaseComObject() works: it pings the RCW which decrements the reference counter on the COM object and returns; this is an almost immediate operation. As to the COM object, it lives by the COM laws: if its reference counter is zero, the COM server will destroy it.
GC does the same when it finds an unneeded object referencing an RCW.
GC.Collect() vs Marshal.ReleaseComObject()
So, GC.Collect() looks simple. This is a pro. Let’s recount the cons of GC.Collect():
- GC starts unpredictably.
- To deal with this, you can start GC.Collect() yourself; this may cost you more processing time.
- Anyway, GC.Collect() produces unpredictable results due to the inner details of garbage collecting; say, check the last note in this article on support.microsoft.com.
- To deal with the unpredictability of GC, you can use magic formulas such as: call GC.Collect(), then GC.WaitForPendingFinalizers(), then GC.Collect() again and GC.WaitForPendingFinalizers() again. This doesn’t produce predictable results either. Moreover, it looks ugly and takes even more time.
- Some variables (such as class-level things) will need to be nullified before you run GC.Collect(). Such variables are the most obvious candidates for releasing via an explicit call to Marshal.ReleaseComObject().
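For reference, the “magic formula” from the list above usually looks like the sketch below. It is shown only to make the discussion concrete, not as a recommendation. The names GcMagic and CollectTwice are hypothetical; the only real APIs used are GC.Collect() and GC.WaitForPendingFinalizers(). Note that class-level COM references would still have to be set to null before the call.

```csharp
using System;

public static class GcMagic
{
    // Hypothetical helper illustrating the commonly quoted double-pass pattern.
    public static void CollectTwice()
    {
        // First pass: unreachable wrappers are found and queued for finalization.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Second pass: reclaims objects that became collectible
        // only after their finalizers ran.
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
```

Even with both passes, a COM object is released only if nothing in your code (a field, a captured variable, a local kept alive in a debug build) still references its wrapper.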
On the other hand, Marshal.ReleaseComObject() is predictable and fast, but it may be tedious, and sometimes it may be difficult to release every COM object (find a short story in the next section).
Also, due to the way COM objects are managed in .NET, you should follow the “two dots” rule; see How to properly release Excel COM objects. When Marshal.ReleaseComObject() releases the COM object associated with a .NET variable, the .NET variable looks intact. And this is dangerous! You may reuse the released variable (by mistake) and get an exception. To improve the chances of catching this, we suggest doing two things together: release the variable and nullify it. If you reuse a nullified variable, this produces a NullReferenceException. We find that this exception is easier for developers to understand.
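The “release and nullify” advice above can be wrapped in a small helper. This is a minimal sketch under our own assumptions: the names ComHelper and ReleaseAndNull are hypothetical, and only Marshal.IsComObject() and Marshal.ReleaseComObject() are real .NET APIs. The commented usage assumes a reference to the Excel interop assembly with the usual Excel alias.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComHelper
{
    // Hypothetical helper: releases the RCW behind 'obj' (if it is one)
    // and sets the caller's variable to null.
    public static void ReleaseAndNull<T>(ref T obj) where T : class
    {
        if (obj == null) return;
        try
        {
            // ReleaseComObject throws ArgumentException for a plain .NET object,
            // so check first.
            if (Marshal.IsComObject(obj))
                Marshal.ReleaseComObject(obj);
        }
        finally
        {
            // Any accidental reuse now fails with NullReferenceException.
            obj = null;
        }
    }
}

// Usage following the "two dots" rule (assumes the Excel interop assembly):
//   Excel.Workbooks books = app.Workbooks;   // not app.Workbooks.Open(path)
//   Excel.Workbook book = books.Open(path);
//   ... use book ...
//   ComHelper.ReleaseAndNull(ref book);
//   ComHelper.ReleaseAndNull(ref books);
```

Passing the variable by ref is what lets the helper nullify the caller’s variable rather than a copy of it.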
GC requires that you know far more things. To start with, you should choose your way to run it: see all GC.Collect() overloads here. Garbage collecting is a difficult topic; I assume this is because you aren’t supposed to run GC.Collect() at all! There’s an article by Rico Mariani (2004!) at docs.microsoft.com, where he declares “Never call GC.Collect” the first rule. Note the context: he talks about memory, not about COM objects. These sides cannot be separated in COM add-ins, though.
The general problem with using GC is that you cannot be sure that a given COM object is released. This is risky! If it works on your machine, it may not work on another machine! Did you ever think of publishing something on MANY machines? We think our add-ins are able to work on that many PCs because we do not use GC.Collect() to release COM objects!
Releasing COM objects in practice
Short story. Several generations of developers worked on the add-ins that you see at ablebits.com. We started with not releasing COM objects at all. Or with releasing them non-systematically. And it was okay until some Office build ruined it. After we finished cleaning the code, it turned out that we had found all but one (or two) unreleased COM objects, which confirms that releasing every COM object can be difficult. From time to time, that COM object caused an exception in a scenario that didn’t give us a clue. After several attempts, we called GC.Collect() at the end of a long-running process. Now, although this issue is declared solved, we know about a bomb ticking in our code. Is it waiting for a new Office build? Anyway, we won’t modify the working code. Summing up: we regard GC.Collect() as a bad practice because it makes your code unpredictable. End of story.
Now, we write code adhering to these rules:
- We release a COM object as soon as possible. We call it “short transaction”: get, use, release.
- We usually don’t cache COM objects; instead we cache the information that we use to create a corresponding COM object when required; say, the EntryID string lets you get an Outlook mail item, folder or store, and the full document name lets you get the document from the Documents (Workbooks, Presentations) collections. Similarly, a Shape has an ID, a Sheet has a Name, etc.
- We remember that Add(), Move(), Copy(), etc. methods may return a COM object.
- Add-in Express releases every COM object passed to every event handler our add-ins receive. Say, Add-in Express will release COM objects passed to the SheetChange (Excel), WindowSize (Word) or AttachmentAdd (Outlook) events as soon as the corresponding event handler finishes. In VSTO, the developer needs to release these COM objects explicitly.
- We use this approach: the caller is responsible for the COM objects it passes to the callee.
- We use for loops on COM collections as a foreach loop creates a hidden COM object that may produce issues e.g. in Outlook.
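The “short transaction” and “for loop instead of foreach” rules above can be combined as in the sketch below. It is an illustration under our own assumptions, not code from Add-in Express: ComLoop and ForEachItem are hypothetical names, and the commented usage assumes the Outlook interop assembly with the usual Outlook alias.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComLoop
{
    // Hypothetical helper: walks a 1-based COM collection with a for loop and
    // releases each item as soon as the callback returns (get, use, release).
    // Returns the number of items passed to the callback.
    public static int ForEachItem(int count, Func<int, object> getItem, Action<object> use)
    {
        int visited = 0;
        for (int i = 1; i <= count; i++) // Office collections are typically 1-based
        {
            object item = getItem(i);
            if (item == null) continue;
            try
            {
                use(item);
                visited++;
            }
            finally
            {
                // Release even if the callback throws.
                if (Marshal.IsComObject(item))
                    Marshal.ReleaseComObject(item);
            }
        }
        return visited;
    }
}

// Usage (assumes the Outlook interop assembly):
//   Outlook.Items items = folder.Items;
//   ComLoop.ForEachItem(items.Count, i => items[i], item => { /* read what you need */ });
//   Marshal.ReleaseComObject(items);
```

The callback must not store the item itself, since it is released right after the callback returns; store an identifier such as the EntryID string instead.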
What’s wrong with Eduard’s arguments
1) Let’s check what Govert says in this post on stackoverflow.com.
When Microsoft says it is correct (or incorrect) to release COM objects in this or that way, we should pay attention to the context. Say, Shared add-ins don’t have shims and thus they inhabit the default AppDomain, while every VSTO add-in and Add-in Express based COM add-in has a loader that creates a separate AppDomain for every add-in. Now you see why releasing something in a non-shimmed add-in may be disastrous for other non-shimmed add-ins. This is one of the reasons why Shared add-ins are rare nowadays; the real reason is: Microsoft removed that project template from Visual Studio. By the way, do you know that Visual Studio allows loading non-shimmed .NET add-ins in the IDE? I’ve read Microsoft suggesting to never use Marshal.ReleaseComObject() in such add-ins. No wonder. On the other hand, here’s Microsoft suggesting calling Marshal.ReleaseComObject() in a way that raises questions, but again, note the context: that page is about running an Office application from a standalone application, and this may seriously differ from COM add-ins.
On reportedly incorrect information from Microsoft. First off, it may be incorrect; I hope they are human. I would like to see exactly what information was incorrect, though. In our case, they emphasize the need to release COM objects when working with recurring appointments in Outlook. And our practice confirms that.
On using magic formulas. To be *sure* that your COM object(s) gets released by GC.Collect(), you will need to know many (and I mean MANY) more things. Until that, GC.Collect is unpredictable. We can’t rely on unpredictable things; so, we clean our variables ourselves.
Also, you should understand that we feel responsible for the success or failure of an average developer who does his or her first Office add-in project and who has a truly great number of things to do and to take care of under a pressing schedule. To explain even “fundamentals of GC.Collect()” while knowing that the developer won’t understand it – this is nonsense. We explain Marshal.ReleaseComObject(): it is logical and easy. Yes, your practice may differ: the “COM in .NET” topic is difficult! That is why we answer questions on releasing COM objects on our forum and in email; we do explain this, and we will be explaining this again if required. The win is: your code works in *all* Office versions.
On self-cleaning methods. Note. The DoTheWork method that Govert suggests contains this declaration: Dim app As New Microsoft.Office.Interop.Excel.Application. The “New” means that this method is used in a standalone application that automates Excel, not in a COM add-in; a COM add-in should use the Application object that is provided at startup; all other ways are somewhere between forbidden and non-recommended. Beware of suggestions that may not apply to COM add-ins! End of note. Anyway, we ask: what effect does such a method produce when it is given a million COM objects (e.g. cells)? Say, in an Outlook+Exchange configuration, you may run out of free RPC channels after reading some 250 Outlook items in a folder. How would you process a folder of 100-200 thousand emails with GC.Collect()?
The blog (Marshal.ReleaseComObject Considered Dangerous) that convinced Govert to join the GC.Collect() party, isn’t applicable to COM add-ins as it talks about a non-shimmed add-in loaded in Visual Studio. I’ve dismissed this exact article as non-relevant in 2011; see comments in When to release COM objects.
Conclusion: #1 isn’t convincing and it mostly doesn’t apply to COM add-ins.
Recommendations from Add-in Express support team:
- When debugging a COM add-in or an application automating Office, make sure all COM add-ins are turned off in the target Office application; this helps you isolate issues in your code.
- A COM add-in developer should always let the target Office application close normally when debugging an add-in. Not doing so may end with Office marking the add-in as faulty. COM add-in developers should also test their add-ins in scenarios where the host application is started programmatically.
2) The second page on stackoverflow explains how to deal with garbage collecting when debugging an application automating Office. I believe these explanations help someone. We can’t comment on this page as it explains something that doesn’t directly relate to the topic.
Conclusion: #2 doesn’t directly apply to COM add-ins.
3) The third post (stackoverflow again) is about its author, VVS, changing position: from releasing COM objects individually to using GC.Collect(). VVS found a post (it’s #1 above) convincing.
Conclusion: #3 isn’t convincing as it doesn’t provide new arguments.
Finally
In the Marshal.ReleaseComObject() vs GC.Collect() topic, I see a parallel to Option Strict On vs Option Strict Off in Visual Basic .NET. Both GC.Collect() and Option Strict Off are about saving your time while you type your code, while Marshal.ReleaseComObject() and Option Strict On are about saving your debugging time. You should decide: which time is more valuable?
I can only repeat my words written in Why Excel doesn’t quit (2011!): I vote for using Marshal.ReleaseComObject(). If this isn’t convincing, let it be so.
2 Comments
Great post! Thank you.
Agree 100%. In the last 10 years or so, I have been hired as an expert to help literally dozens of companies to ‘fix’ their plugins that misbehave. One of the things I always do is to and use Marshal.ReleaseComObject explicitly. Nine projects out of ten, that’s not the case with their code and it’s causing lots of issues.
one can only guess how much money and time is being wasted like that :(