Is there a way to read a C/C++ PDB file contents?

  • BitFlipper

    I want to read the contents of a VS PDB file. Specifically, I want to get a list of all source files involved in the compilation of a particular DLL or EXE. I know there are .Net classes to read .Net PDB files, but I need to do this with C/C++ PDB files.

    Note I need to do this programmatically, so I need some sort of PDB parser I can incorporate into my own code.

    Any help appreciated.

  • BitFlipper

    Well OK I guess this would help.

  • felix9

    DIA is good if you want COM, but I prefer DbgHelp for its C-style API, easier for me. :)

    http://msdn.microsoft.com/en-us/library/windows/desktop/ms679309%28v=vs.85%29.aspx

  • BitFlipper

    OK so one thing I'm not clear about...

    Can System.Diagnostics.SymbolStore read both C/C++ and .Net PDBs, or is it strictly .Net only? The documentation only mentions .Net but if you look at this you can see many languages represented including C/C++. I think it is using DIA underneath so it makes sense to be able to read C/C++ PDBs too. But so far in my searches I have not been able to determine either way.

    BTW, does anyone know if PDBs contain the preprocessor definitions that were used to compile the executable with?

  • BitFlipper

    OK found a good breakdown over here. It appears System.Diagnostics.SymbolStore is for managed only.

    I agree with felix9 that DbgHelp would probably be the easiest as it only requires P/Invoking into a C API.

    It would be great to not re-invent the wheel here so if anyone is aware of a managed wrapper for any of these APIs, I'm all ears!

  • BitFlipper

    BitFlipper wrote:

    BTW, does anyone know if PDBs contain the preprocessor definitions that were used to compile the executable with?

    Quoting myself here... I imagine the PDB file might contain the full command that was used to perform the compilation with. If that is the case then I would be able to parse the command to get the preprocessor definitions.
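    For what it's worth, pulling the definitions out of such a command string could look roughly like the sketch below. It is only an illustration: the function names are made up, and it assumes cl.exe-style arguments where `-D`/`/D` introduces a definition and `-I`/`/I` an include folder, with double quotes around values that contain spaces. Response files and other option forms are not handled.

```python
import re

# A token is a run of non-space characters; a double-quoted part may contain spaces.
_TOKEN = re.compile(r'(?:[^\s"]+|"[^"]*")+')


def split_cl_args(command):
    """Split a cl.exe-style argument string into tokens and drop the double quotes."""
    return [tok.replace('"', '') for tok in _TOKEN.findall(command)]


def extract_defines_and_includes(command):
    """Return (defines, include_dirs), each in the order they appear."""
    defines, include_dirs = [], []
    tokens = split_cl_args(command)
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Both "-" and "/" are accepted as the option prefix.
        if tok[:1] in ('-', '/') and tok[1:2] in ('D', 'I'):
            value = tok[2:]
            # Detached form ("-D NAME" or "-I dir"): the value is the next token.
            if not value and i + 1 < len(tokens):
                i += 1
                value = tokens[i]
            if value:
                (defines if tok[1] == 'D' else include_dirs).append(value)
        i += 1
    return defines, include_dirs


if __name__ == "__main__":
    sample = r'-c -ZI -DWIN32 -D_DEBUG -I"C:\Program Files (x86)\Windows Kits\8.0\Include\um" -X'
    found_defines, found_includes = extract_defines_and_includes(sample)
    print(found_defines)
    print(found_includes)
```

    A regular expression is used for tokenizing instead of `shlex`, because POSIX-style splitting would treat the backslashes in Windows paths as escape characters.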

  • evildictaitor

    Are these your symbols (i.e. private symbols), or are they public symbols, such as the symbols for kernel32.dll and so on that you got with WinDbg or with a symbols package from Microsoft?

    Public symbols certainly don't have preprocessor definitions in them. :( I'm not sure about private symbols, but I'd guess probably not because of how the compiler works under the hood, with the preprocessor decoupled from the compiler itself.

  • BitFlipper

    OK let me explain in a bit more detail what I'm doing...

    We have huge source code trees that contain multiple products. Most products have dependencies all over their source code tree (some even in other source code trees). All the projects are built using either MAKE or SCons, for both Windows and Linux.

    In the past, debugging these projects in VS was a bit of a pain since while you can debug in VS, VS didn't know much about the source files and could not locate symbol definitions etc. It didn't know where header files are located, etc etc.

    I started an internal framework project that helps to integrate this into VS. Right now I'm parsing through MAKE logs to figure out what files were included in the build, and which preprocessor definitions, include folders etc were used. This info is then used to auto-configure the VS project's source files, preprocessor definitions etc. Doing this, the debugging experience becomes much better because now VS can actually find symbols, open header files by itself, etc etc. Now it knows the "Big Picture". This framework has a bunch of other features too but that is probably one of the main ones (the other is to enable remote Linux debugging inside VS).

    Anyway, I'm now looking for a faster, more reliable and less hacky way to extract source file, header file, include folders and preprocessor definition info from the DLLs and PDB files.

    So far I've had good luck playing around with DbgHelp.dll. I can now enumerate all source files given a DLL/EXE and its matching PDB file (which I will always have). Using a binary editor, I can clearly see that the PDB file contains the original argument used to initiate the build, which does contain all of the preprocessor definitions etc. However I'm now combing through the DbgHelp.dll help file to see which of the functions can give me that info.

    BTW I plan to do the same thing for unstripped Linux *.so files, and somehow I'm sure there will be tools on Linux to extract that info as well.

  • Ion Todirel

    felix9 wrote:

    DIA is good if you want COM, but I prefer DbgHelp for its C-style API, easier for me. :)

    http://msdn.microsoft.com/en-us/library/windows/desktop/ms679309%28v=vs.85%29.aspx

    I don't like DIA because of the extra dependencies; I can't be bothered to deploy more DLLs. DbgHelp ships with the OS, so there are no worries or dependencies there.

  • BitFlipper

    DbgHelp ships with the OS? I didn't know that. I installed the Windows Debug SDK to get it. Guess it wasn't necessary.

    BTW I could not figure out how to get the preprocessor definitions using DbgHelp but implemented a (somewhat hacky) PDB parser that can now retrieve all include folders, working folder and preprocessor definitions that were specified for each individual compiled file. Together with DbgHelp's SymEnumSourceFiles, I now have everything I need on the Windows side.

    Now to figure this out for Linux. "readelf" is probably the way to go but it seems to lump all preprocessor definitions and source code #defines together, making it hard to tell what the passed in command line preprocessor definitions were.

  • Ion Todirel

    BitFlipper wrote:

    OK so one thing I'm not clear about...

    Can System.Diagnostics.SymbolStore read both C/C++ and .Net PDBs, or is it strictly .Net only? The documentation only mentions .Net but if you look at this you can see many languages represented including C/C++. I think it is using DIA underneath so it makes sense to be able to read C/C++ PDBs too. But so far in my searches I have not been able to determine either way.

    BTW, does anyone know if PDBs contain the preprocessor definitions that were used to compile the executable with?

    I don't know about System.Diagnostics, but from what I know the pdb format is universal, you parse it the same way for managed or native, or at least that's how DbgHelp is designed, I might be wrong, but give it a try first. I'm not aware of anything lower level than DbgHelp.

  • BitFlipper

    @Ion Todirel:

    Well, I got it pretty much working perfectly for my needs after making a wrapper class for DbgHelp, so I'm not going to bother with System.Diagnostics.SymbolStore. I figured the PDB format is very stable so even though the parser is a bit hacky, it probably won't break any time soon.

    Basically, when you parse through the PDB file looking for "strings", you will see this pattern for every compiled file:

    Microsoft (R) Optimizing Compiler
    D:\Develop\Projects\Experiment\TestCpp\TestCpp
    cl
    C:\Program Files (x86)\Develop\Microsoft Visual Studio 11.0\VC\bin\CL.exe
    cmd
    -c -ID:\Develop\Projects\Experiment\TestCpp\DummyInclude -ZI -nologo -W3 -WX- -sdl -Od -Oy- -DDUMMYDEFINE -DWIN32 -D_DEBUG -D_WINDOWS -D_USRDLL -DTESTCPP_EXPORTS -D_WINDLL -D_UNICODE -DUNICODE -Gm -EHs -EHc -RTC1 -MDd -GS -fp:precise -Zc:wchar_t -Zc:forScope -Yustdafx.h -FpD:\Develop\Projects\Experiment\TestCpp\TestCpp\Debug\TestCpp.pch -FoD:\Develop\Projects\Experiment\TestCpp\TestCpp\Debug\ -FdD:\Develop\Projects\Experiment\TestCpp\TestCpp\Debug\vc110.pdb -Gd -TP -analyze- -errorreport:prompt -I"C:\Program Files (x86)\Develop\Microsoft Visual Studio 11.0\VC\include" -I"C:\Program Files (x86)\Develop\Microsoft Visual Studio 11.0\VC\atlmfc\include" -I"C:\Program Files (x86)\Windows Kits\8.0\Include\um" -I"C:\Program Files (x86)\Windows Kits\8.0\Include\shared" -I"C:\Program Files (x86)\Windows Kits\8.0\Include\winrt" -X
    src
    TestCpp.cpp
    pdb
    D:\Develop\Projects\Experiment\TestCpp\TestCpp\Debug\vc110.pdb
    

    So if you look for the "cl", "cmd", "src" and "pdb" sequence, then you know what the command arguments, source file, working directory and dependent PDB file were.
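    A rough sketch of that scan, for anyone who wants to try it. This is not the actual parser, just an illustration with made-up function names. It assumes the strings sit in the file as plain single-byte text, each one contiguous, and it takes the string right before "cl" as the working directory, going by the listing above.

```python
import re

# Marker strings that alternate with their values in the pattern above.
KEYS = ("cl", "cmd", "src", "pdb")


def printable_strings(data, min_len=2):
    """Return runs of printable ASCII found in data, similar to the Unix strings tool."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]


def find_build_records(strings):
    """Find every cl/cmd/src/pdb sequence and return one dict per compiled file."""
    records = []
    for i in range(len(strings) - 7):
        # Indices i, i+2, i+4 and i+6 must hold the four markers in order.
        if tuple(strings[i:i + 8:2]) == KEYS:
            records.append({
                "working_dir": strings[i - 1] if i > 0 else None,
                "compiler": strings[i + 1],
                "arguments": strings[i + 3],
                "source": strings[i + 5],
                "pdb": strings[i + 7],
            })
    return records


def read_build_records(pdb_path):
    """Read a PDB file from disk and return its build records."""
    with open(pdb_path, "rb") as f:
        return find_build_records(printable_strings(f.read()))
```

    The minimum string length is 2 so that the "cl" marker is not filtered out, which also means a lot of noise strings get collected. Binary bytes that happen to be printable and sit directly next to a string would be glued onto it and break the match, so treat the result as best effort.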
