Tuesday, October 09, 2012

Heap corruption in C

I want to copy a string (source) into another variable (destination) in C. I can use either strcpy_s or strcpy for that. When the code is compiled in debug mode (I use Visual Studio 2010) and the destination buffer is too small to hold the source string, strcpy_s raises an error and stops execution right at the call. When compiled in release mode, it still stops, but no message is displayed. strcpy, on the other hand, remains silent even in debug mode: it writes past the end of the buffer, and the code crashes at a later malloc / free call, which makes the problem much harder to find.
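
The difference comes down to whether the copy checks the destination size before writing. Here is a minimal, portable sketch of that idea. Note that checked_copy is a made-up helper for illustration, not the CRT's strcpy_s, and unlike strcpy_s it returns an error code instead of stopping the program:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function): copies src into dst only if
   the whole string, including the terminating '\0', fits into dst_size bytes.
   Returns 0 on success and -1 on failure. */
int checked_copy(char *dst, size_t dst_size, const char *src)
{
    size_t needed;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    needed = strlen(src) + 1;   /* characters plus the terminator */
    if (needed > dst_size) {
        dst[0] = '\0';          /* leave dst as a valid empty string */
        return -1;              /* refuse to write past the buffer */
    }

    memcpy(dst, src, needed);
    return 0;
}
```

With a 4-byte destination, checked_copy(dst, 4, "hello") returns -1 and leaves the heap untouched, whereas strcpy(dst, "hello") would write 6 bytes into it without complaint.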

Example code:


Result for useStrCpy_s = 0 (execution does not stop at strcpy, where the corruption actually happens; it continues, prints the destination string, which does not look corrupt at all, and then hangs at free without any debug information, leaving you scratching your head):


Result for useStrCpy_s = 1 (stops right at strcpy_s and provides information about the problem):


Another difficulty is that the visibility of heap corruption problems can depend on the operating system. For example, your code can work fine on Windows XP but crash on Windows 7 (or worse, it can work on XP Service Pack 1 but crash on XP Service Pack 3). It can be a long time before you become aware that there is a problem, because your tests might not reveal it. And if you are calling a C DLL from your Java program, you will only get the very helpful (!) message "Problematic frame: C [ntdll.dll+0x5235c]", which means you are in deep, deep trouble. Worse still, you might not see the crash when you compile your DLL in debug mode, only when you compile it in release mode (which is what you do after you finish debugging and think you have found everything). My advice is:
* Be careful (duh!), meaning review code, reduce WTF/min and write extensive unit tests.
* Pay attention to warnings like "C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead" and prefer to use strcpy_s.
* Test on different versions of operating systems (minimum: WinXP 32-bit, Win7 64-bit).
* Start your tests with DLLs compiled in debug mode. After you have eliminated the visible bugs, don't forget to test seriously with the DLL compiled in release mode. Be mentally ready for challenging problems (since your assertions won't work in release mode).
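
Since assert is normally compiled out of release builds (when NDEBUG is defined), one option is an explicit check that stays in the code in every build mode. The sketch below assumes a simple "guard bytes" scheme: a few extra bytes with a known pattern are allocated after the buffer and inspected later. guarded_alloc, guard_intact, GUARD_SIZE and GUARD_BYTE are made-up names for illustration, not part of any library:

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 8     /* assumed number of guard bytes after the buffer */
#define GUARD_BYTE 0xA5  /* assumed fill pattern for the guard bytes */

/* Allocates `size` usable bytes followed by GUARD_SIZE guard bytes.
   The result is released with plain free(). */
void *guarded_alloc(size_t size)
{
    unsigned char *p;

    if (size > (size_t)-1 - GUARD_SIZE)
        return NULL;                      /* size + GUARD_SIZE would overflow */

    p = malloc(size + GUARD_SIZE);
    if (p == NULL)
        return NULL;

    memset(p + size, GUARD_BYTE, GUARD_SIZE);
    return p;
}

/* Returns 1 if the guard bytes are untouched, 0 if something wrote past
   the first `size` bytes. The caller has to remember `size`. */
int guard_intact(const void *ptr, size_t size)
{
    const unsigned char *guard = (const unsigned char *)ptr + size;
    size_t i;

    for (i = 0; i < GUARD_SIZE; i++) {
        if (guard[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling guard_intact right after a suspicious copy points at the culprit much earlier than a crash in a later free. It is only a partial safety net: it catches overruns of at most GUARD_SIZE bytes, and only at the places where you remember to check.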

The "C Traps and Pitfalls" book (pdf) sums it up nicely [p.1]:
"The C language and its typical implementations are designed to be used easily by experts."

Further reading:
* Heap corruption in c: "...the integrity checks are only performed when the heap gets manipulated, which is in the malloc and free functions (and friends)"
* Heap corruption detected
* How to handle a java call to a dll that ocassionaly throws EXCEPTION_ACCESS_VIOLATION: "... if a dll crashes then the JVM will terminate... Run your DLL in a separate JVM process, which is purely responsible for providing this solver functionality, communicate with second JVM using sockets, tcp-ip, soap, rest - whatever you prefer. Than, use some like Java Service Wrapper to run this second JVM and configure it in the way that it restarts the JVM if it crashed."

music: Gotye - Somebody That I Used To Know (Feat. Kimbra) (4FRNT Remix)
