Functions often want to use integer args with pointers (as indices, or to calculate an end-pointer as a loop bound), or with other integer args in GP registers, or with other integers loaded from memory that they want to work with in GP registers.
You can't efficiently use an integer in an XMM reg as a loop counter or bound, because there's no packed-integer compare that sets integer flags for branch instructions. (pcmpgtd creates a mask of 0/-1 elements).
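As a rough illustration of that typical use, here is a minimal C sketch (the function name `sum_ints` is made up for this example). The count is combined with a pointer to form an end-pointer, which then serves as the loop bound. With the count already in a GP register, a compiler can typically do this with ordinary integer address arithmetic plus a compare-and-branch.

```c
#include <stddef.h>

/* Hypothetical example: n arrives in a GP register (RDX on Windows x64,
   RSI on x86-64 System V) and feeds straight into pointer arithmetic. */
long sum_ints(const int *p, size_t n)
{
    const int *end = p + n;   /* end-pointer: integer arg combined with a pointer arg */
    long total = 0;
    while (p != end) {        /* loop bound is a plain GP-register compare + branch */
        total += *p++;
    }
    return total;
}
```

If `n` arrived in an XMM register instead, the function would likely need an extra instruction to move it into a GP register before it could do any of this.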
See also "Why not store function parameters in XMM vector registers?" and the other answer here for more.
But even beyond that, this design idea is not even an option for Windows x64 fastcall / vectorcall.
Windows x64 chooses to waste space on purpose to simplify variadic functions. The register args can be dumped into the 32-byte "shadow space" / "home space" above the return address, to form an array of args.
This is why (for example) Windows x64 passes the 3rd arg in R8 or XMM2, regardless of the types of the earlier args. And why calls to variadic functions require FP args to also be copied to the corresponding integer register, so the function prologue can dump the arg regs without figuring out which variadic args were FP and which were integer.
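To make the variadic case concrete, here is a small portable C sketch. The function `sum_mixed` and its type-string convention (`'i'` for `int`, `'d'` for `double`) are invented for this example. The callee only learns at run time which variadic args are FP, which is the situation the Windows x64 design is built around.

```c
#include <stdarg.h>

/* Hypothetical helper: `types` is a made-up mini format string where
   'd' means the next variadic arg is a double and anything else means int. */
double sum_mixed(const char *types, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, types);
    for (const char *t = types; *t != '\0'; t++) {
        if (*t == 'd')
            total += va_arg(ap, double);  /* FP arg */
        else
            total += va_arg(ap, int);     /* integer arg */
    }
    va_end(ap);
    return total;
}
```

For a call like `sum_mixed("idi", 1, 2.5, 3)` on Windows x64, the `2.5` is the 3rd arg, so the caller would put it in both XMM2 and R8. The callee can then spill RCX, RDX, R8 and R9 to the shadow space and step through 8-byte slots without knowing in advance which ones were FP.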
To make the arg-array thing work, only 4 total args can be passed in registers, regardless of whether you have a mix of integer and FP args. There are enough GP integer regs to hold the max number of register args already, even if they're all integer.
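The slot rule can be sketched as a toy lookup in C. This is only a model for scalar integer/pointer and float/double args to a non-variadic function, not any real ABI API, and the name `win64_arg_reg` is made up. The point is that the register depends only on the argument's position and its own type, never on the types of earlier args.

```c
#include <stddef.h>

/* Toy model (assumption: scalar args only, non-variadic callee).
   Returns the register carrying the arg at 0-based position `pos`,
   or NULL if that arg is passed on the stack. */
const char *win64_arg_reg(size_t pos, int is_fp)
{
    static const char *const gp[4] = { "RCX",  "RDX",  "R8",   "R9"   };
    static const char *const fp[4] = { "XMM0", "XMM1", "XMM2", "XMM3" };

    if (pos >= 4)
        return NULL;                  /* 5th and later args go on the stack */
    return is_fp ? fp[pos] : gp[pos]; /* slot index == arg position */
}
```

So the 3rd arg (`pos == 2`) is R8 or XMM2 whatever came before it, and each register slot lines up with one 8-byte entry in the shadow space.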
(Unlike x86-64 System V, where the first up-to-8 FP args are passed in xmm0..7 regardless of how many integer/pointer arg-passing registers are used.)
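For contrast, here is a similar toy model of the x86-64 System V rule, again only for scalar integer/pointer and float/double args, with a made-up function name `sysv_arg_reg`. Integer and FP args are counted separately, so the register for an arg depends on how many earlier args were of the same class.

```c
#include <stddef.h>

/* Toy model (assumption: scalar args only). is_fp[i] is nonzero if arg i
   is float/double. Returns the register for arg `pos`, or NULL if it is
   passed on the stack. */
const char *sysv_arg_reg(const int *is_fp, size_t pos)
{
    static const char *const gp[6] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };
    static const char *const fp[8] = { "XMM0", "XMM1", "XMM2", "XMM3",
                                       "XMM4", "XMM5", "XMM6", "XMM7" };
    size_t n_gp = 0, n_fp = 0;

    /* Count earlier args of each class: the two register files are
       allocated independently of each other. */
    for (size_t i = 0; i < pos; i++) {
        if (is_fp[i])
            n_fp++;
        else
            n_gp++;
    }
    if (is_fp[pos])
        return n_fp < 8 ? fp[n_fp] : NULL;
    return n_gp < 6 ? gp[n_gp] : NULL;
}
```

For a signature like `(int, double, int)`, this model gives RDI, XMM0, RSI, where the Windows x64 rule would give RCX, XMM1, R8.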
See also "Why does Windows64 use a different calling convention from all other OSes on x86-64?"