I'm not sure where you're going with this. But I'm intrigued and so I thought I'd ramble:
As I'm sure you already know, Unity has motion vectors in the standard (built-in) pipeline and in HDRP. URP support is in the works now. You'll likely want to use their methodology since it is integrated into the engine, IIUC. They "cull" on the C++ side those objects that don't have motion vector settings activated, so at those locations you only get camera motion.
As for depth buffer writes, generally if you write the depth buffer, you write the motion vector. If you don't, you don't. Think of a transparent object (a window) and a car flying by on the road outside. You don't want the window's motion vector, you want the car's motion vector. But they already have an integrated option that lets you write motion vectors for transparent objects (at least they do in HDRP) if you wish to.
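To make the window/car case concrete, here is a toy sketch in Python (not Unity code; the buffers, the `draw` helper and the `writes_depth` flag are all made up for illustration). It assumes the motion buffer starts out holding camera motion and that only depth-writing draws overwrite it.

```python
FAR = float("inf")

def make_buffers(pixel_count, camera_motion):
    """Depth starts at 'far'; motion defaults to camera motion everywhere."""
    depth = [FAR] * pixel_count
    motion = [camera_motion] * pixel_count
    return depth, motion

def draw(depth, motion, pixel, z, object_motion, writes_depth=True):
    """Hypothetical per-pixel draw: the motion write is tied to the depth write."""
    if z >= depth[pixel]:
        return  # hidden behind something already drawn
    if writes_depth:
        depth[pixel] = z
        motion[pixel] = object_motion
    # A transparent draw would only blend color here; depth and motion stay as-is.

depth, motion = make_buffers(1, camera_motion=(0.0, 0.0))
draw(depth, motion, 0, z=10.0, object_motion=(0.3, 0.0))                      # car, opaque
draw(depth, motion, 0, z=2.0, object_motion=(0.0, 0.0), writes_depth=False)   # window
print(motion[0])  # the car's motion survives behind the window
```

Flipping `writes_depth` to True for the window would be the rough equivalent of opting in to transparent motion vectors: the window's vector would replace the car's at that pixel.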
You want to do it in the same pass? Using MRT? Trying to save draw calls? Is that what I'm reading? So you'd pass two transforms in (current and previous) and use MRT to write the per-object motion vectors? I think you're fighting the engine if you do that. You may end up having to roll your own motion vector system and completely bypass theirs. I think you could do it with two transforms, though I haven't tried it. You'd have to update the objects at the end of the pass to copy the current transforms into the previous transforms, then let the game have at them again to update the current ones. Then add a camera motion pass at the end to fill in the gaps and/or update the object motions to account for camera motion too.
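Roughly what I mean by two transforms, as a Python sketch rather than shader code. Everything here (`TrackedObject`, `end_of_frame`, the row-major matrix layout, "current minus previous in NDC" as the convention) is my own assumption for illustration, not Unity's API or its actual motion vector format.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def to_ndc(clip):
    """Perspective divide: clip space to normalized device x, y."""
    return (clip[0] / clip[3], clip[1] / clip[3])

def motion_vector(curr_mvp, prev_mvp, local_pos):
    """Project the same local-space point with both transforms and subtract."""
    p = [local_pos[0], local_pos[1], local_pos[2], 1.0]
    cx, cy = to_ndc(mat_vec(curr_mvp, p))
    px, py = to_ndc(mat_vec(prev_mvp, p))
    return (cx - px, cy - py)

def translation(tx, ty, tz):
    """Stand-in for a full model-view-projection matrix."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

class TrackedObject:
    """Hypothetical holder for the two transforms an object needs."""
    def __init__(self, mvp):
        self.curr_mvp = mvp
        self.prev_mvp = mvp

    def end_of_frame(self):
        # Current becomes previous; the game then writes a new current.
        self.prev_mvp = self.curr_mvp

obj = TrackedObject(translation(0.0, 0.0, 0.0))
obj.curr_mvp = translation(0.25, 0.0, 0.0)   # game moved the object this frame
print(motion_vector(obj.curr_mvp, obj.prev_mvp, (0.0, 0.0, 0.0)))
obj.end_of_frame()
print(motion_vector(obj.curr_mvp, obj.prev_mvp, (0.0, 0.0, 0.0)))
```

Since both matrices here would include the view-projection of their own frame, camera motion is already folded into the object's vector. The separate camera motion pass would then only be needed for pixels that no object wrote.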
Interesting. But I'd stick with their system if I could. Better engine integration. URP is in the works.