There may be programming quirks or bugs that turn anything into anything. I won't touch those, but rather address the fundamentals of your question.
Color format
Common rgb formats (such as sRGB) represent each pixel with either 24 bits (8 bits each for "red", "green" and "blue") or 32 bits (8 additional bits for an alpha channel, which is irrelevant here). They are related to the physics of light through electro-optical transfer functions and spectral sensitivities that may differ from format to format, a fact that is easily forgotten when working with transmission/storage-focused software.
NV12 stores YCbCr 4:2:0, that is, rgb after applying a 3x3 matrix that turns it into luma ("Y"), blue-difference chroma ("Cb" or "U") and red-difference chroma ("Cr" or "V"), with the two chroma channels subsampled by 2 in each direction.
Referring to "Digital Video and HDTV: Algorithms and Interfaces" (Poynton, 2003), p. 285:
"There are about 2.75 million valid codewords in 8-bit Y'CbCr, compared to 10.6 million in 8-bit studio R'G'B'. If R'G'B is transcoded to Y'CbCr, then transcoded back to R'G'B', the resulting R'G'B' cannot have any more than 2.75 million colors."
(The author uses a prime (') to denote gamma-encoded video, which is the most common kind, and "studio" to refer to limited-range numbers (16...235 rather than 0...255), which are the norm in television systems but uncommon in computers.)
Thus, even YCbCr 4:4:4 at 8 bits per component can represent only about a quarter (2.75/10.6 ≈ 26%) of the colors available to rgb at 8 bits per component.
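As a rough cross-check of that ratio, the share of the YCbCr cube that maps back to valid rgb can be estimated from the determinant of the rgb-to-YCbCr matrix: the rgb unit cube becomes a parallelepiped of that volume inside the YCbCr unit cube. The sketch below assumes BT.601 luma coefficients and full-range scaling (not studio range), so it is not expected to reproduce Poynton's codeword counts exactly. The name valid_fraction is only illustrative.

```matlab
% BT.601 luma coefficients (assumption: full-range, normalized signals)
kb = 0.114;
kr = 0.299;
kg = 1 - kb - kr;

% rgb -> YCbCr matrix, with Y in [0,1] and Cb, Cr in [-0.5,0.5]
fwd_mat = [kr kg kb; ...
    (0.5/(1-kb))*[-kr -kg (1-kb)]; ...
    (0.5/(1-kr))*[(1-kr) -kg -kb]];

% volume of the image of the rgb unit cube, relative to the
% unit-volume YCbCr cube
valid_fraction = abs(det(fwd_mat));
fprintf('valid fraction of the YCbCr cube: %.3f\n', valid_fraction);
```

With these coefficients the fraction comes out at roughly 0.24, which is in line with the "about a quarter" figure above.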
Spatial downsampling
You start out with 4:2:0, where the chroma channels are downsampled by 2x2. The resampling part of a conversion to rgb and back again is lossless if simple 2x2 boxcar filtering is employed in both the upsampling and the downsampling. The 4:2:0->4:4:4 conversion would simply repeat one Cb (or Cr) scalar in a 2x2 pattern:
Cb(i,j) -> [Cb(2i,2j), Cb(2i,2j+1);
Cb(2i+1,2j), Cb(2i+1,2j+1)]
And when converting back 4:4:4->4:2:0 we would have:
Cb(i,j) = (Cb(2i,2j) + Cb(2i,2j+1) + Cb(2i+1,2j) + Cb(2i+1,2j+1)) / 4;
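The two formulas above can be sketched in MATLAB as follows. The chroma values and the names cb_420, cb_444 and cb_back are made up for illustration; the point is only that repeating each sample and then averaging each 2x2 block returns the original plane.

```matlab
% hypothetical 2x2 subsampled chroma plane (4:2:0)
cb_420 = [131 140; ...
          120  90];

% 4:2:0 -> 4:4:4: repeat each sample in a 2x2 block
cb_444 = kron(cb_420, ones(2));

% 4:4:4 -> 4:2:0: average each 2x2 block
cb_back = (cb_444(1:2:end, 1:2:end) + cb_444(1:2:end, 2:2:end) + ...
           cb_444(2:2:end, 1:2:end) + cb_444(2:2:end, 2:2:end)) / 4;

% true when the round trip returned the original plane
lossless = isequal(cb_back, cb_420);
```

This only holds as long as nothing modifies the four repeated samples differently in between; once the 4:4:4 data has been through a lossy step, the average no longer needs to land on the original value.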
Video tends to be compute-heavy. Thus any given implementation might use fixed-point operations that sacrifice numerical accuracy for speed of execution.
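To illustrate the kind of trade-off meant here, the sketch below compares a floating-point BT.601 luma with a fixed-point approximation. The integer weights 77, 150 and 29 (the luma coefficients scaled by 256 and rounded so that they sum to 256) are an assumption for illustration, not taken from any particular implementation, and the test pixels are made up.

```matlab
% hypothetical 8-bit rgb pixels, one per row
rgb = [255   0   0; ...
         0 255   0; ...
         0   0 255; ...
        87 105 105; ...
       255 255 255];
r = rgb(:,1);
g = rgb(:,2);
b = rgb(:,3);

% floating-point BT.601 luma, rounded to the nearest integer
y_float = round(0.299*r + 0.587*g + 0.114*b);

% fixed-point luma: integer weights, add half an LSB, then
% floor(x/256), which mimics a right shift by 8 bits
y_fixed = floor((77*r + 150*g + 29*b + 128) / 256);

% largest disagreement between the two, in code values
max_diff = max(abs(y_float - y_fixed));
```

For these pixels the two versions disagree by at most one code value (the pure red pixel, for instance, gives 76 in floating point and 77 in fixed point), which is the sort of small error that can accumulate over repeated conversions.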
MATLAB example
This snippet codifies Eq. 25.12 and 25.13 of the Poynton book referred to above. Essentially, this is what you would do to decode JPEG/JFIF full-range BT.601 YCbCr to "computer" rgb and encode it back again. Note that there are a number of variations on this conversion, depending on whether you are using ITU BT.601, BT.709 or BT.2020, and whether you use limited-range (16-235/240) or full-range (0-255) numbers. I have written this code prioritizing clarity.
%% define a minimal YCbCr 4:2:0 "image" of 2x2 pixels
y_in = [100 101;...
102 103];
cb_in = 131;
cr_in = 119;
%% upscale chroma to 4:4:4
cb = kron(cb_in, ones(2));
cr = kron(cr_in, ones(2));
in = cat(3, y_in, cb, cr);
%% convert color format from ITU BT.601 YCbCr (full scale)
% to rgb (full scale) and back
kb = 0.114; % BT.601 (BT.709 would use 0.0722)
kr = 0.299; % BT.601 (BT.709 would use 0.2126)
kg = (1-kb-kr);
fwd_mat = [kr kg kb;...
(0.5/(1-kb))*[-kr -kg (1-kb)];...
(0.5/(1-kr))*[(1-kr) -kg -kb]];
scaling = [1 1 1];%[256 255 255]./256;
scaled_fwd_mat = scaling' .* fwd_mat;
inv_mat = [1 0 2*(1-kr);...
(-kr-kb+1)/kg 2*kb*(kb-1)/kg 2*kr*(kr-1)/kg;...
1 2*(1-kb) 0];
scaled_inv_mat = (1./scaling).*inv_mat;
out = zeros(size(in)); % preallocate the round-trip result
for row = 1:size(in,1)
for col = 1:size(in,2)
%% convert to rgb
norm_imdata = shiftdim(double(in(row,col,:)), 1) - [0 128 128];
rgb = scaled_inv_mat * norm_imdata';
%% quantize to 8 bits (uint8 rounds and saturates), then back to double
rgb = double(uint8(rgb));
%% convert back to YCbCr
ycbcrtmp = scaled_fwd_mat * rgb;
ycbcrtmp = ycbcrtmp + [0 128 128]';
out(row,col,:) = double(uint8(ycbcrtmp));
end
end
%% downscale chroma to 4:2:0
y_out = out(:,:,1);
cb_out = out(:,:,2);
cr_out = out(:,:,3);
cb_out = conv2(cb_out, ones(2)/4, 'same');
cb_out = cb_out(1:2:end-1, 1:2:end-1);
cr_out = conv2(cr_out, ones(2)/4, 'same');
cr_out = cr_out(1:2:end-1, 1:2:end-1);
y_in - y_out
cb_in - cb_out
cr_in - cr_out
I could not make sense of Poynton's 256/255 scaling and clipping of the chroma channels, so I left it out, which produces slightly different numbers.
We see that the 3x3 matrices are inverses of each other (to within reasonable numerical precision):
>> scaled_fwd_mat*scaled_inv_mat-eye(3)
ans =
1.0e-15 *
-0.1110 0 0
0 0 -0.0278
0.0833 0 0
MATLAB fun
If you dislike seeing explicit loops in MATLAB, the above code can be compacted using the relatively new tensorprod function. It may look obscure to some, but if you enjoy MATLAB-style vectorized code I hope you find it compact and possibly faster.
%% define 4:2:0 input as unsigned 8-bit ranged double-precision floats
y_in = [100 101;...
102 103];
cb_in = 131;
cr_in = 119;
%% upscale chroma to 4:4:4
cb = kron(cb_in, ones(2));
cr = kron(cr_in, ones(2));
in = cat(3, y_in, cb, cr);
%% convert color format from ITU BT.601 YCbCr (full scale)
% to rgb (full scale) and back
kb = 0.114; % BT.601 (BT.709 would use 0.0722)
kr = 0.299; % BT.601 (BT.709 would use 0.2126)
kg = (1-kb-kr);
fwd_mat = [kr kg kb;...
(0.5/(1-kb))*[-kr -kg (1-kb)];...
(0.5/(1-kr))*[(1-kr) -kg -kb]];
inv_mat = [1 0 2*(1-kr);...
(-kr-kb+1)/kg 2*kb*(kb-1)/kg 2*kr*(kr-1)/kg;...
1 2*(1-kb) 0];
%% convert to rgb
norm_imdata = double(in) - shiftdim([0 128 128], -1);
rgb = tensorprod(norm_imdata, inv_mat, 3, 2);
%% quantize to 8 bits (uint8 rounds and saturates), then back to double
rgb = double(uint8(rgb));
%% convert back to YCbCr
ycbcrtmp = tensorprod(rgb, fwd_mat, 3, 2);
ycbcrtmp = ycbcrtmp + shiftdim([0 128 128], -1);
out = double(uint8(ycbcrtmp));
%% downscale chroma to 4:2:0
y_out = out(:,:,1);
cb_out = conv2(out(:,:,2), ones(2)/4, 'same');
cb_out = cb_out(1:2:end-1, 1:2:end-1);
cr_out = conv2(out(:,:,3), ones(2)/4, 'same');
cr_out = cr_out(1:2:end-1, 1:2:end-1);
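If tensorprod is not available in your release, the same vectorized round trip can be sketched with reshape and an ordinary matrix product. This is only one possible formulation; the names pix, ycc and max_err are illustrative, and the subtraction relies on implicit expansion.

```matlab
% 4:4:4 input: luma plus chroma already repeated in a 2x2 pattern
y_in = [100 101; ...
        102 103];
in = cat(3, y_in, 131*ones(2), 119*ones(2));

% BT.601 full-range matrices, as in the snippets above
kb = 0.114;
kr = 0.299;
kg = 1 - kb - kr;
fwd_mat = [kr kg kb; ...
    (0.5/(1-kb))*[-kr -kg (1-kb)]; ...
    (0.5/(1-kr))*[(1-kr) -kg -kb]];
inv_mat = [1 0 2*(1-kr); ...
    1 2*kb*(kb-1)/kg 2*kr*(kr-1)/kg; ...
    1 2*(1-kb) 0];
offset = [0 128 128];

% one pixel per row, one component per column
pix = reshape(in, [], 3) - offset;

% YCbCr -> rgb, quantized to 8 bits
rgb = double(uint8(pix * inv_mat.'));

% rgb -> YCbCr, quantized to 8 bits
ycc = double(uint8(rgb * fwd_mat.' + offset));
out = reshape(ycc, size(in));

% largest round-trip error over all components
max_err = max(abs(out(:) - in(:)));
```

For this particular input max_err comes out as 0, i.e. the YCbCr -> rgb -> YCbCr round trip reproduces the original values; other inputs, in particular ones whose rgb equivalent falls outside 0...255 and gets clipped by uint8, will not necessarily survive unchanged.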