Ticket #39268

g++ -O1 produces crashing code when accessing dlls via LoadLibrary()/GetProcAddress()/reinterpret_cast

Open Date: 2019-05-31 02:26 Last Update: 2019-06-05 01:19

Reporter:
Owner:
(None)
Type:
Status:
Closed
Component:
(None)
MileStone:
(None)
Priority:
5 - Medium
Severity:
5 - Medium
Resolution:
Invalid
File:
1
Vote
Score: 0
No votes
0.0% (0/0)
0.0% (0/0)

Details

I'm investigating the following crash, which occurs when accessing a dll via LoadLibrary(), getProcAddress(), reinterpret_cast:

Problem signature:

Problem Event Name: BEX
Application Name: test.exe
...
Fault Module Name: StackHash_0a9e
...
Exception Offset: 0028fe9c
Exception Code: c0000005
Exception Data: badc0de1
...
Additional Information 1: 0a9e
Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
Additional Information 3: 0a9e
Additional Information 4: 0a9e372d3b4ad19135b953a78882e789



The code only crashes if it is compiled with -O1 (-O2, ...).
Moving around pieces of the code seems to fix the issue.
I was finally able to create some reduced example code with working / non working variants (attachment follows).

Ticket History (3/20 Histories)

2019-05-31 02:26 Updated by: comer352l
  • New Ticket "g++ -O1 produces crashing code when accessing dlls via LoadLibrary()/GetProcAddress()/reinterpret_cast" created
2019-05-31 02:40 Updated by: comer352l
Comment

Unfortunately, the dll I'm using in the example code is too large to attach it (1.7MB).
Should work with any other dll (adjusting dll name, function names and prototypes).

2019-05-31 02:43 Updated by: comer352l
Comment

Tested operating Systems: Windows 7 64bit, Windows XP 32bit
Tested MinGW versions:
MinGW/gcc 4.8.2 (i686-posix-dwarf-rev3, Built by MinGW-W64 project)
MinGW/gcc 7.3.0 (i686-posix-dwarf-rev0, Built by MinGW-W64 project)

2019-05-31 03:04 Updated by: keith
Comment

Reply To comer352l

Unfortunately, the dll I'm using in the example code is too large to attach it (1.7MB).

Hmm. I was just about to ask where that DLL originates.

Should work with any other dll (adjusting dll name, function names and prototypes).

Well, if you want us to investigate, you really need to provide a fully self-contained test case. Don't expect us to work around your omissions — you need to either:­–

  • adapt your test case, to work with one of the system DLLs, or
  • provide a reduced sample DLL — ideally in source form — within your test case, on which it operates.
2019-05-31 03:17 Updated by: keith
Comment

Reply To comer352l

Tested operating Systems: Windows 7 64bit, Windows XP 32bit
Tested MinGW versions:
MinGW/gcc 4.8.2 (i686-posix-dwarf-rev3, Built by MinGW-W64 project)
MinGW/gcc 7.3.0 (i686-posix-dwarf-rev0, Built by MinGW-W64 project)

Products from the MinGW-W64 project — a project which gratuitously (and illegally) infringes our registered trademark — do not originate, and are not supported here. Notwithstanding, if you can provide a self-contained test case, I'd be interested to see if the same problem is reproduced with our official MinGW compiler suite.

Do please note that your c0000005 exception code is a SEGFAULT, ("access violation" in Microsoft parlance). This means that your program is attempting to access memory to which it does not have access rights. Likely causes are references to an uninitialized pointer, or to memory which has already been deallocated; usually these are the result of errors in the user program code, but unless you do provide a fully self-contained test case, I am not willing to investigate further.

2019-05-31 04:05 Updated by: comer352l
Comment

Reply To keith

Reply To comer352l Well, if you want us to investigate, you really need to provide a fully self-contained test case. Don't expect us to work around your omissions — you need to either:­–

Sure, I'm working on it.

* adapt your test case, to work with one of the system DLLs, or

Any suggestions ? Windows is not my "native" environment...

What's the maximum attachment size ?

2019-05-31 04:10 Updated by: comer352l
Comment

Reply To keith

Reply To comer352l Products from the MinGW-W64 project — a project which gratuitously (and illegally) infringes our registered trademark — do not originate, and are not supported here.

I didn't know that, sorry.
I was just testing with the versions that came with Qt4/5.
I'll go and switch over to the original.

Do please note that your c0000005 exception code is a SEGFAULT, ("access violation" in Microsoft parlance). This means that your program is attempting to access memory to which it does not have access rights. Likely causes are references to an uninitialized pointer, or to memory which has already been deallocated; usually these are the result of errors in the user program code, but unless you do provide a fully self-contained test case, I am not willing to investigate further.

I know. But I'm pretty sure the code is correct.

2019-05-31 06:38 Updated by: keith
Comment

Reply To comer352l

Reply To keith

Reply To comer352l Well, if you want us to investigate, you really need to provide a fully self-contained test case. Don't expect us to work around your omissions — you need to either:­–

Sure, I'm working on it.

* adapt your test case, to work with one of the system DLLs, or

Any suggestions ? Windows is not my "native" environment...

Any of the system DLLs, which is not linked by default; I've used psapi.dll, in the past: see https://osdn.net/projects/mingw/scm/git/mingw-org-wsl/blobs/5.2-trunk/mingwrt/mingwex/dlfcn.c

What's the maximum attachment size ?

I don't know; it doesn't appear to be mentioned in the documentation, so I've submitted a technical query to ask.

2019-05-31 07:07 Updated by: keith
Comment

Reply To comer352l

Reply To keith

Do please note that your c0000005 exception code is a SEGFAULT, ("access violation" in Microsoft parlance). This means that your program is attempting to access memory to which it does not have access rights. Likely causes are references to an uninitialized pointer, or to memory which has already been deallocated; usually these are the result of errors in the user program code, but unless you do provide a fully self-contained test case, I am not willing to investigate further.

I know. But I'm pretty sure the code is correct.

Hmm. I'm not so sure. At a first quick glance, this looks odd:

  1. extern "C" {
  2. #include "windows.h"
  3. }

  • Firstly, (although GCC will tolerate it), the # introducing the include directive really should not be anywhere other than in column one.
  • Secondly, why use iquote semantics for "windows.h"? Is this the system <windows.h>, or your own project-local header? (If the latter, then I would suggest that the name is extremely badly chosen, because it conflicts with the system header name).
  • Thirdly, you should not embed the #include directive within the extern "C" block scope; that scope should be correctly specified within the included header itself, (and, in this case if it is the system's <windows.h> which is being included, then this is certainly taken care of, because it is a C++ compatible C header).
(Edited, 2019-05-31 07:09 Updated by: keith)
2019-05-31 16:28 Updated by: keith
Comment

Reply To comer352l

Reply To keith

Do please note that your c0000005 exception code is a SEGFAULT, ("access violation" in Microsoft parlance). This means that your program is attempting to access memory to which it does not have access rights. Likely causes are references to an uninitialized pointer, or to memory which has already been deallocated; usually these are the result of errors in the user program code, but unless you do provide a fully self-contained test case, I am not willing to investigate further.

I know. But I'm pretty sure the code is correct.

More serious than the oddity I noted yesterday, this is wrong:

  1. bool J2534_API::selectLibrary(std::string libPath)
  2. {
  3. FreeLibrary( _J2534LIB );
  4. _J2534LIB = LoadLibraryA( libPath.c_str() );
  5. ...
  6. }

Your J2534_API class constructor initializes its _J2534LIB private member variable to NULL; you then invoke its selectLibrary() method, which immediately calls FreeLibrary() on that NULL pointer. Microsoft's documentation for FreeLibrary() says that its argument must be a module handle, as returned by either LoadLibrary(), or GetModuleHandle(), (or one of their derivatives); it does not say that NULL is an acceptable value, so that's a potential SEGFAULT staring you right in the face — dereference of a NULL pointer is, perhaps, the most common cause of a SEGFAULT exception. (Even if FreeLibrary() does handle the NULL pointer gracefully, such that the call becomes a no-op, the behaviour is undocumented, and it is wrong for your code to depend on it.)

2019-05-31 16:51 Updated by: keith
Comment

Reply To keith

Your J2534_API class constructor initializes its _J2534LIB private member variable to NULL; you then invoke its selectLibrary() method ...

BTW, do you really need the selectLibrary() method anyway? I don't know your requirements specification, but the ability to arbitrarily change the DLL association of a J2534_API class instance, "on the fly", seems unusual. If you don't explicitly need such a capability, why not bind the DLL within the constructor itself, either using a name passed as a constructor argument, or using a fixed name specified within the constructor code?

2019-05-31 20:22 Updated by: comer352l
Comment

Thank you very much for your comments.

I think I've finally found the reason for the crashes:
The compiler needs to know the calling convention, which is WINAPI alias __stdcall in this case.
After adding it to the function prototypes, everything works fine again.

Sure, if the compiler uses the wrong calling convention, the stack doesn't get managed correctly.
I just wonder why it has been been working fine without __stdcall for 10+ years (and apparently still works fine without -O1).

(Edited, 2019-05-31 20:23 Updated by: comer352l)
2019-05-31 20:27 Updated by: keith
Comment

Reply To keith

Reply To comer352l

What's the maximum attachment size ?

I don't know; it doesn't appear to be mentioned in the documentation, so I've submitted a technical query to ask.

Prompted by my enquiry, OSDN.net have now raised the attachment size limit to 10MB; (they haven't said what it was originally). They also admit that they believe even 10MB is too small, so will work towards increasing it further, to some new (unspecified) limit.

2019-05-31 20:46 Updated by: comer352l
Comment

Reply To keith

BTW, do you really need the selectLibrary() method anyway? I don't know your requirements specification, but the ability to arbitrarily change the DLL association of a J2534_API class instance, "on the fly", seems unusual. If you don't explicitly need such a capability, why not bind the DLL within the constructor itself, either using a name passed as a constructor argument, or using a fixed name specified within the constructor code?

Background:
The code is about accessing SAE-J2534-compliant dlls.
SAE-J2534 specifies an unified interface for accessing different automotive hardware (interfaces for diagnostic purposes, firmware flashing etc.) via a proprietary dlls.
The spec defines function prototypes, arguments and return values and the behavior of the functions, the implementation is up to the equipment manufaturers.
It also defines a method to register/find J2534-compliant dlls using the Windows registry.
See http://www.drewtech.com/support/passthru.html

That's why the DLL association needs to be changed "on the fly".

2019-06-02 22:01 Updated by: comer352l
Comment

After reading a bit more about calling conventions and the GNU compiler specifics, I think I can't blame the compiler for doing anything wrong here.
Looks like __stdcall has always been needed and it was just *BIG* luck that it has been working fine without it for such a long time.
So I'm revoking this bug report, sorry for the noise.

2019-06-02 22:02 Updated by: comer352l
  • Resolution Update from None to Invalid
  • Status Update from Open to Closed
2019-06-03 20:48 Updated by: keith
Comment

Reply To comer352l

The code is about accessing SAE-J2534-compliant dlls.
SAE-J2534 specifies an unified interface for accessing different automotive hardware (interfaces for diagnostic purposes, firmware flashing etc.) via a proprietary dlls. The spec defines function prototypes, arguments and return values and the behavior of the functions, the implementation is up to the equipment manufaturers. It also defines a method to register/find J2534-compliant dlls using the Windows registry. See http://www.drewtech.com/support/passthru.html

Sure, but I don't see any specific requirements specification for C++ bindings ... in particular, no requirement for a selectLibrary() method.

That's why the DLL association needs to be changed "on the fly".

By "on the fly", I mean ad-hoc, possibly any number of times after the class instance has been created. I understand the need to support multiple arbitrarily named DLLs, but surely any one class instance would need to establish only one DLL association? Certainly, that's how I would implement it ... a distinct class instance for each distinct DLL association, with the DLL name, (or a registry selector for that name), passed as argument to the constructor, and no method tempting a subsequent (potentially harmful) change. But that's just my opinion; it's your application, so your choice.

2019-06-05 01:18 Updated by: comer352l
Comment

Reply To keith

Reply To comer352l

The code is about accessing SAE-J2534-compliant dlls.
SAE-J2534 specifies an unified interface for accessing different automotive hardware (interfaces for diagnostic purposes, firmware flashing etc.) via a proprietary dlls. The spec defines function prototypes, arguments and return values and the behavior of the functions, the implementation is up to the equipment manufaturers. It also defines a method to register/find J2534-compliant dlls using the Windows registry. See http://www.drewtech.com/support/passthru.html

Sure, but I don't see any specific requirements specification for C++ bindings ... in particular, no requirement for a selectLibrary() method.

That's right. These are no requirements, just the way I have choosen to go.

Implicit linking could also be used instead LoadLibrary/GetProcAddress/reinterpret_cast.
But that means binding to a specific DLL/hardware, which is exactly the opposite of what SAE-J2534 tries to reach.
And you usually don't get anything else than the DLL from a hardware manufacturer (no header file etc.).

2019-06-05 01:19 Updated by: comer352l
Comment

Reply To keith

Reply To comer352l

That's why the DLL association needs to be changed "on the fly".

By "on the fly", I mean ad-hoc, possibly any number of times after the class instance has been created. I understand the need to support multiple arbitrarily named DLLs, but surely any one class instance would need to establish only one DLL association? Certainly, that's how I would implement it ... a distinct class instance for each distinct DLL association, with the DLL name, (or a registry selector for that name), passed as argument to the constructor, and no method tempting a subsequent (potentially harmful) change. But that's just my opinion; it's your application, so your choice.

Right, it's one DLL association per class instance.
A library selection method allows switching the DLL (the used hardware) without re-creating an instance.
It's not that hard to implement. ;-)

Attachment File List

Edit

Please login to add comment to this ticket » Login