Great talk and great undertaking! Good luck, man.
I use OpenMP heavily, and sadly it is said to be incompatible with PGO. I suspect this is a major adoption blocker for many others too (if you have OpenMP usage statistics, you might be able to verify this).
What's weirder is that a Connect issue I upvoted long ago about exactly this has just vanished off the net (here's the dead url). The only trace I can find of a semi-explanation given there for this incompatibility persists in my blog:
'Since PGO and OpenMP are both designed to increase performance, it is a shame that they can't be used together, but there were some major design issues preventing this. There is an ordering dependency in the designs of PGO and OpenMP. OpenMP changes some things about the program state that are important to PGO, and it does this after PGO has already processed the program state and made many decisions based on it.
We will be considering a modification of the designs of PGO and/or OpenMP to get this to work in a future product version.' [This was posted circa 2005 -Ofek]
So guys - PGO is a great feature and I'd love to use it. But if you're serious about enhancing adoption, puh-leeese grab this bull by the horns, and give us PGO with OpenMP.
Jim, Jim and Charles - thanks for a great talk! I'd love to hear more from the compiler makers.
Can argument decoration aid the vectorizer?
Maybe decorating an argument as __declspec(align(16)) can make the vectorizer load registers with aligned instructions?
Does decorating arguments with __restrict have *any* impact currently? Can you give an example of the effect it has?
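To make the __restrict question concrete, here is the kind of loop I have in mind. It is a minimal sketch: scale_add is a made-up function, not something from the talk. Without __restrict the compiler has to assume dst and src might overlap; with it, the programmer promises they don't. Whether the vectorizer actually uses that promise today is exactly what I'm asking.

```cpp
#include <cstddef>

// Hypothetical kernel for illustration only (name and signature are assumptions).
// __restrict promises that dst and src never alias, so a store to dst[i]
// cannot change any src[j] that is read later in the loop.
void scale_add(float* __restrict dst, const float* __restrict src,
               float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += k * src[i];
    }
}
```

Calling it with two distinct arrays, e.g. scale_add(a, b, 2.0f, 4), keeps the no-alias promise; passing overlapping pointers would break it.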
Both Charles & STL - thanks for a great interview!
If I understood correctly, a major factor that makes HAS_ITERATOR_DEBUGGING so slow is the need to traverse the entire iterator linked list to remove a single iterator when it is destroyed. Maybe making the list doubly-linked can accelerate that? (the added space cost seems well worth it)
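A rough sketch of the difference I mean. The node types and function names below are hypothetical and are not the actual STL internals; they only contrast removal from a singly-linked list (walk from the head to find the predecessor) with removal from a doubly-linked one (constant time, at the cost of one extra pointer per node).

```cpp
#include <cstddef>

// Hypothetical node types; NOT the real iterator-debugging data structures.
struct SingleNode {
    SingleNode* next = nullptr;
};

// Singly-linked removal: walk from the head until the victim is found.
// Returns the number of nodes visited, which grows with the list length.
inline std::size_t unlink_single(SingleNode*& head, SingleNode* victim) {
    std::size_t steps = 0;
    for (SingleNode** p = &head; *p != nullptr; p = &(*p)->next) {
        ++steps;
        if (*p == victim) {
            *p = victim->next;      // bypass the victim
            victim->next = nullptr;
            break;
        }
    }
    return steps;
}

struct DoubleNode {
    DoubleNode* prev = nullptr;
    DoubleNode* next = nullptr;
};

// Insert a node at the front of the doubly-linked list.
inline void push_front(DoubleNode*& head, DoubleNode* n) {
    n->prev = nullptr;
    n->next = head;
    if (head != nullptr) head->prev = n;
    head = n;
}

// Doubly-linked removal: the victim already knows both neighbours,
// so no traversal is needed regardless of list length.
inline void unlink_double(DoubleNode*& head, DoubleNode* victim) {
    if (victim->prev != nullptr) victim->prev->next = victim->next;
    else head = victim->next;
    if (victim->next != nullptr) victim->next->prev = victim->prev;
    victim->prev = nullptr;
    victim->next = nullptr;
}
```

The space cost is the extra prev pointer in each node; in exchange, destroying one iterator no longer depends on how many other iterators are registered.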