@JandeVaan: In existing x64 code, void __vectorcall foo(__m128); is faster than void foo(__m128); because VC++ passes the argument directly in a register. And yes, you're right: you can also use __m256 if your machine supports it.
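To make the comparison concrete, here is a minimal sketch of the two declarations side by side. The function names, the VECTORCALL macro, and the horizontal_sum helper are made up for illustration, and the macro is only there so the snippet also compiles with compilers that don't know __vectorcall. It shows how to opt in to the convention; it doesn't measure the speed difference.

```cpp
#include <xmmintrin.h>  // __m128 and the SSE intrinsics

// __vectorcall is a Microsoft extension. The VECTORCALL macro is a
// hypothetical helper so the sketch still builds with other compilers,
// where it expands to nothing and the default convention is used.
#if defined(_MSC_VER)
#define VECTORCALL __vectorcall
#else
#define VECTORCALL
#endif

// Hypothetical helper: adds the four float lanes of v.
static inline float horizontal_sum(__m128 v) {
    float lanes[4];
    _mm_storeu_ps(lanes, v);  // copy the four lanes out to memory
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

// Default x64 convention: VC++ hands the __m128 over through memory
// (the callee receives a pointer to a copy made by the caller).
float sum_default(__m128 v) {
    return horizontal_sum(v);
}

// With __vectorcall, VC++ can pass the __m128 in an XMM register.
float VECTORCALL sum_vectorcall(__m128 v) {
    return horizontal_sum(v);
}
```

Both functions are called the same way, for example sum_vectorcall(_mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f)); only the way the argument reaches the callee differs.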
The answer is no. Currently, VC++ doesn't have the feature you are asking for.