Tay Ray Chuan

potential dangers of building python extensions in mingw

Mon, 2 Jan 2012 16:52:32 +0800 | Filed under python, msvc, mingw

Recently, I ran into some difficulty getting lxml to compile and work. It was crashing mysteriously on the seemingly innocuous fread(). After many days of debugging, I stumbled upon the cause:

Each copy of the CRT library has a separate and distinct state. As such, CRT objects such as file handles, environment variables, and locales are only valid for the copy of the CRT where these objects are allocated or set. When a DLL and its users use different copies of the CRT library, you cannot pass these CRT objects across the DLL boundary and expect them to be picked up correctly on the other side.

Potential Errors Passing CRT Objects Across DLL Boundaries

It was a great opportunity to learn about developing on Windows (the MSVC toolchain, including cl and nmake, and debugging symbols), and it even led me to ask my first Stack Overflow question.

By default, mingw/gcc links against msvcrt.dll (the CRT used up to Visual C++ 6.0), but recent python builds (2.6, 2.7) are linked against msvcr90.dll (Visual C++ 2008). The extension and the interpreter then each have their own copy of the CRT, which is exactly the situation the quote above warns about. So, be careful if you're building native (C code) python extensions with mingw/gcc - be sure not to pass FILE* pointers across the DLL boundaries.
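A quick way to see which CRT a given interpreter expects is to ask it. This is only a rough sketch: describe_interpreter_crt is a made-up helper name, ctypes.util.find_msvcrt() exists only on Windows, and it may return None when the runtime name can't be determined.

```python
import ctypes.util
import platform
import sys


def describe_interpreter_crt():
    """Return (compiler string, CRT file name or None) for the running interpreter.

    Hypothetical helper for illustration, not part of any library.
    """
    # On a Visual C++ 2008 build this looks like 'MSC v.1500 32 bit (Intel)'.
    compiler = platform.python_compiler()
    crt = None
    if sys.platform == "win32":
        # Windows-only call; for python 2.6/2.7 this should name msvcr90.dll.
        crt = ctypes.util.find_msvcrt()
    return compiler, crt


if __name__ == "__main__":
    print(describe_interpreter_crt())
```

If the CRT reported here differs from the one your extension DLL imports (msvcrt.dll for a default mingw/gcc build), the two sides have separate CRT state.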

Ruby seems to be an exception: it continues to link against msvcrt.dll. I will have to investigate their build configuration further. (I don't believe they're building it in Visual Studio 6.0!!)

One way around this would be to route calls to standard C library functions through ctypes. That way, we get to use the libc functions from the CRT version that the python interpreter itself is linked against, allowing the DLL/extension to be built independently.
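Here is a minimal sketch of the ctypes idea, under a few assumptions: load_interpreter_crt and read_head are names invented for this example, find_msvcrt() is assumed to report the interpreter's CRT (it can return None), and on non-Windows platforms the sketch simply falls back to the libc already loaded into the process. I have not tried this against lxml itself.

```python
import ctypes
import ctypes.util
import sys


def load_interpreter_crt():
    """Best-effort handle to the C runtime the interpreter itself uses."""
    if sys.platform == "win32":
        name = ctypes.util.find_msvcrt()  # e.g. 'msvcr90.dll' on python 2.6/2.7
        if name is None:
            raise OSError("could not determine the interpreter's CRT")
        return ctypes.CDLL(name)
    # Elsewhere, CDLL(None) exposes symbols already loaded in the process.
    return ctypes.CDLL(None)


crt = load_interpreter_crt()

# FILE* is treated as an opaque pointer; declaring the types keeps it
# from being truncated to a C int on 64-bit builds.
crt.fopen.restype = ctypes.c_void_p
crt.fopen.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
crt.fread.restype = ctypes.c_size_t
crt.fread.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                      ctypes.c_size_t, ctypes.c_void_p]
crt.fclose.restype = ctypes.c_int
crt.fclose.argtypes = [ctypes.c_void_p]


def read_head(path, count=16):
    """Read up to `count` bytes using fopen/fread/fclose from one CRT."""
    if not isinstance(path, bytes):
        path = path.encode(sys.getfilesystemencoding())
    fp = crt.fopen(path, b"rb")
    if not fp:
        raise IOError("fopen failed for %r" % (path,))
    try:
        buf = ctypes.create_string_buffer(count)
        n = crt.fread(buf, 1, count, fp)
        return buf.raw[:n]
    finally:
        crt.fclose(fp)
```

The FILE* returned by fopen never leaves the CRT that created it, which is the whole point: the handle is opened, read, and closed by the same copy of the runtime.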

I also wonder if the extension could call GetProcAddress on the CRT library already loaded by the parent python interpreter, and resolve functions like fread from it at runtime.
