So instead, I need to have an additional void* in each thread whose address will be passed in to the SwitchContext call, correct?
Yes. That value holds the thread's current stack pointer. When you switch away from a thread, its current stack pointer gets saved and the stack pointer of the new thread is loaded.
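To make the bookkeeping concrete, here is a small C model of just that save/load step. It is only a sketch: `thread_model`, `model_esp` and `switch_context_model` are names I made up for illustration, and `model_esp` is an ordinary variable standing in for the real esp register, so nothing here actually switches stacks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread record: one slot for the saved stack pointer. */
typedef struct {
    void *saved_sp;
} thread_model;

/* Ordinary variable standing in for the CPU's esp register (assumption). */
static void *model_esp;

/* Models the bookkeeping only: store the outgoing thread's stack pointer
   through old_sp_slot, then load the incoming thread's stack pointer. */
static void switch_context_model(void **old_sp_slot, void *new_sp)
{
    assert(old_sp_slot != NULL);
    *old_sp_slot = model_esp; /* save where the old thread stopped */
    model_esp = new_sp;       /* continue on the new thread's stack */
}
```

This is why the first argument has to be the address of the thread's slot: the switch writes through it, so the old thread can be resumed later from exactly that value.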
The question is, if this is the very first context switch, what should be stored at the pNewStack value?
That's the role of InitializeContext: it sets up the initial stack so that it looks like the one SwitchContext expects. That's why the start address gets pushed on the stack: when SwitchContext finishes, it will return to whatever is stored on the stack, which is your start address. This might sound strange, since the start function is not called but returned to, but it works fine as long as the stack is set up correctly. That's also the reason for those 4 push eax instructions: they compensate for the fact that SwitchContext pops edi, esi, ebp, ebx.
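The layout described above can be simulated in portable C to check the arithmetic. Again, this is a sketch under assumptions: `init_context_sketch` and `simulate_switch_tail` are hypothetical helpers, the stack is a plain array of `uintptr_t`, and the placeholder register values are zeros rather than whatever eax happens to hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: builds the initial frame on a downward-growing stack
   and returns the value to store as the thread's saved stack pointer. */
static uintptr_t *init_context_sketch(uintptr_t *stack_top, uintptr_t start_address)
{
    assert(stack_top != NULL);
    uintptr_t *sp = stack_top;
    *--sp = start_address; /* consumed by the final ret */
    *--sp = 0;             /* placeholder, popped into ebx */
    *--sp = 0;             /* placeholder, popped into ebp */
    *--sp = 0;             /* placeholder, popped into esi */
    *--sp = 0;             /* placeholder, popped into edi */
    return sp;
}

/* Simulates the tail of SwitchContext: pop edi, esi, ebp, ebx, then ret.
   Returns the address that ret would jump to and updates the stack pointer. */
static uintptr_t simulate_switch_tail(uintptr_t **sp_inout)
{
    uintptr_t *sp = *sp_inout;
    sp += 4;                  /* the four register pops */
    uintptr_t target = *sp++; /* ret pops the return address */
    *sp_inout = sp;
    return target;
}
```

The value to save for a brand-new thread is therefore five slots below the top of its stack, and after the four pops the ret lands on the start address.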
I assume this should be at some negative offset into that thread's NativeStack
InitializeContext already computes the correct value, but it looks like it doesn't store it correctly. Let's fix things a bit: