The quirky instruction (v)pmovmskb, dating back to SSE, takes the most significant bit of each byte in an mm, xmm, or ymm register and packs those bits into a general-purpose register. This is very useful for classifying vector elements or performing SWAR operations on individual bits. Specifically, I have used this instruction in a previous answer to compute a positional population count.
Unfortunately, this instruction has not been extended to zmm registers and is surprisingly absent from the AVX-512 roster. How can I emulate its effect efficiently for zmm registers? What similar or alternative options do I have?
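
To pin down the semantics I am after, here is a minimal scalar sketch of what a 64-byte version of the operation would compute. The function name `movemask_epi8_64_ref` is hypothetical (it is not an existing intrinsic), and I am assuming the natural extension of the existing behaviour: bit i of the 64-bit result is the most significant bit of byte i of the vector. It is meant as a reference model to check candidates against, not as an efficient implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reference model (not a real intrinsic): what pmovmskb
 * would do if it accepted a 64-byte zmm register.
 * Assumption: bit i of the result is the MSB of byte i of the input.
 */
static uint64_t movemask_epi8_64_ref(const uint8_t bytes[64])
{
    uint64_t mask = 0;

    for (size_t i = 0; i < 64; i++) {
        /* bytes[i] >> 7 is 1 when the top bit of the byte is set, else 0 */
        mask |= (uint64_t)(bytes[i] >> 7) << i;
    }

    return mask;
}
```

Any proposed emulation should produce the same 64-bit value as this loop for every input.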