That D2D calls D3D is obvious. What doesn't seem to be obvious is that before doing that, D2D must tessellate the geometry, because D3D has no clue how to draw an ellipse, for example.
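To make the ellipse point concrete, here is a rough sketch of the simplest possible tessellation: a fixed-count triangle fan written out as a triangle list. This is not how D2D does it internally (I'd assume it picks the segment count from a flattening tolerance and also deals with antialiasing), and `Vertex` and `TessellateEllipse` are names made up for this example, not part of any API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical vertex type for illustration; not a D2D or D3D structure.
struct Vertex { float x, y; };

// Approximates a filled ellipse as a triangle list: each triangle joins the
// centre to two neighbouring points on the outline. The fixed segment count
// is an assumption made to keep the sketch short.
std::vector<Vertex> TessellateEllipse(float cx, float cy,
                                      float rx, float ry, int segments)
{
    assert(segments >= 3);  // fewer than three segments cannot enclose an area
    const float kTwoPi = 6.28318530718f;

    std::vector<Vertex> triangles;
    triangles.reserve(static_cast<std::size_t>(segments) * 3);

    for (int i = 0; i < segments; ++i) {
        const float a0 = kTwoPi * i / segments;        // start angle of this slice
        const float a1 = kTwoPi * (i + 1) / segments;  // end angle of this slice
        triangles.push_back({cx, cy});
        triangles.push_back({cx + rx * std::cos(a0), cy + ry * std::sin(a0)});
        triangles.push_back({cx + rx * std::cos(a1), cy + ry * std::sin(a1)});
    }
    return triangles;
}
```

Only the resulting vertex list is something D3D can actually draw, and this is the easy case: a plain fill, with no stroke and no antialiasing.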

And then you have stroke styles, gradient brushes, and text rendering support. And geometry operations like combine and widen, which don't have anything to do with calling D3D.

So yeah, I guess you could use XNA and its SpriteBatch because implementing all of the above is trivial and implementing your own SpriteBatch is not. Oh wait...