Well, this is embarrassing.
I'm trying to do something fairly straightforward: run a robustness check to see whether the correlation between x and y disappears when the values of x are "mismatched" with y. I'm trying to do this by creating a third variable z which "mixes up" the existing values of x at random. While this is similar to the question previously answered here, my data are in long form, so I need to randomize WITHIN an id variable.
For example, my dataset might be:
x y id
1 4 1
1 5 1
2 8 1
2 8 1
3 12 1
3 11 1
4 16 1
4 15 1
1 4 2
1 5 2
2 8 2
2 8 2
3 12 2
3 11 2
4 16 2
4 15 2
What I'd like to do is to create a new variable z which essentially "mixes up" the values of x (but is based on the actual values of x, NOT a random variable within a certain range):
x y id z
1 4 1 2
1 5 1 3
2 8 1 1
2 8 1 4
3 12 1 4
3 11 1 3
4 16 1 2
4 15 1 1
1 4 2 1
1 5 2 1
2 8 2 3
2 8 2 3
3 12 2 4
3 11 2 4
4 16 2 2
4 15 2 2
How on earth do I do this? I started out thinking it was a simple task, but then got very very confused.
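For illustration, here is a minimal sketch of the within-id shuffle in plain Python. The language is an assumption, since none is named above. The rows are assumed to be a list of dicts with keys x, y and id, and the function name shuffle_within_id is hypothetical. The idea is to group row positions by id, shuffle the x values of each group separately, and write them back as z, so z always contains exactly the x values of that id and nothing else.

```python
import random
from collections import defaultdict


def shuffle_within_id(rows, seed=None):
    """Return copies of rows with z = the x values permuted within each id.

    Hypothetical helper: rows is assumed to be a list of dicts
    that each have at least the keys "x" and "id".
    """
    rng = random.Random(seed)  # seed only makes the shuffle reproducible

    # Collect the row positions belonging to each id.
    positions = defaultdict(list)
    for i, row in enumerate(rows):
        positions[row["id"]].append(i)

    out = [dict(row) for row in rows]  # copy so the input is left untouched
    for idx in positions.values():
        xs = [rows[i]["x"] for i in idx]
        rng.shuffle(xs)  # permute this id's own x values in place
        for i, z in zip(idx, xs):
            out[i]["z"] = z
    return out


# The example data from the question: two ids, eight rows each.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [4, 5, 8, 8, 12, 11, 16, 15]
data = [{"x": x, "y": y, "id": i} for i in (1, 2) for x, y in zip(xs, ys)]

shuffled = shuffle_within_id(data, seed=42)
for row in shuffled:
    print(row["x"], row["y"], row["id"], row["z"])
```

Because the shuffle is a plain permutation, some rows can still end up with z equal to x by chance.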
SUPER-DUPER-BONUS-QUESTION:
Finally, as the careful reader will note, the data are in long form (each id has 8 rows) but they are also grouped by x (which takes 4 distinct values per id). In other words, each person has 8 observed outcomes of y, but only 4 distinct values of the predictor x. In a perfect world, I'd be able to create a function where z mixed up the values of x within id but never gave a row the same value for z as it has for x.
In other words, if x=1, then z=2,3, or 4 but NOT 1. It is a subtle difference, but a potentially meaningful one!
x y id z
1 4 1 2
1 5 1 3
2 8 1 1
2 8 1 4
3 12 1 4
3 11 1 2
4 16 1 3
4 15 1 1
1 4 2 3
1 5 2 3
2 8 2 1
2 8 2 1
3 12 2 4
3 11 2 4
4 16 2 2
4 15 2 2
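One possible way to sketch the "never the same value" variant, again in plain Python and under the same assumptions (list of dicts with keys x, y and id; the function name and the max_tries parameter are hypothetical), is rejection sampling: reshuffle the x values of an id until no row has z equal to x. This is not the only approach, and it cannot succeed if one x value fills more than half of an id's rows, so the sketch gives up with an error after a fixed number of attempts.

```python
import random
from collections import defaultdict


def shuffle_without_matches(rows, seed=None, max_tries=10_000):
    """Permute x within each id so that no row gets z == x.

    Hypothetical helper using rejection sampling: keep reshuffling a
    group until the result has no position where z equals x.
    """
    rng = random.Random(seed)

    # Collect the row positions belonging to each id.
    positions = defaultdict(list)
    for i, row in enumerate(rows):
        positions[row["id"]].append(i)

    out = [dict(row) for row in rows]  # copy so the input is left untouched
    for key, idx in positions.items():
        xs = [rows[i]["x"] for i in idx]
        for _ in range(max_tries):
            zs = xs[:]
            rng.shuffle(zs)
            if all(z != x for z, x in zip(zs, xs)):
                break  # accepted: no row kept its own x value
        else:
            # Reached only if every attempt contained at least one match.
            raise ValueError(f"no match-free shuffle found for id {key!r}")
        for i, z in zip(idx, zs):
            out[i]["z"] = z
    return out


# The example data from the question: two ids, eight rows each.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [4, 5, 8, 8, 12, 11, 16, 15]
data = [{"x": x, "y": y, "id": i} for i in (1, 2) for x, y in zip(xs, ys)]

result = shuffle_without_matches(data, seed=1)
for row in result:
    print(row["x"], row["y"], row["id"], row["z"])
```

Note that this works row by row, as in the id 1 rows of the table above: the two rows sharing an x value may receive different z values. If both rows of an x level should get the same z (as in the id 2 rows), the same loop could be applied to the distinct x values of each id instead of to its rows.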