I am new to R, and I am struggling with the following issue.

I want to convert the row values, which are countries, into columns and fill them with the corresponding values from the column Year_2000. I have attached a screenshot. Thank you for your help!

  Country_a <- c("Argentina","China","US","Brazil","France","Germany","Cananda")
  Country_b <- c("Brazil","Mexico","New Zealand","France","Mongolia","China","US")
  Year_2000 <- c(30,54,67,4,7,4,5)
  dataframe <- data.frame(Country_a,Country_b,Year_2000)

This is the screenshot

https://i.stack.imgur.com/wAoTM.jpg

Sondor
  • You can use the `tidyr` function `spread` to do this easily. There are myriad examples of how to do so. Best try it to figure it out faster. – hmhensen Mar 22 '19 at 22:08
  • Possible duplicate of [How to use the spread function properly in tidyr](https://stackoverflow.com/questions/34684925/how-to-use-the-spread-function-properly-in-tidyr) – hmhensen Mar 22 '19 at 22:10
  • Base R: `xtabs(Year_2000 ~ Country_a + Country_b, dataframe)`. – Rui Barradas Mar 22 '19 at 22:12
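
To illustrate the base R suggestion in the last comment, here is a minimal sketch using the data from the question. It assumes the line break inside "New Zealand" was unintended; the names `wide` and `wide_df` are only for illustration.

```r
# Data from the question (the line break inside "New Zealand" removed)
Country_a <- c("Argentina", "China", "US", "Brazil", "France", "Germany", "Cananda")
Country_b <- c("Brazil", "Mexico", "New Zealand", "France", "Mongolia", "China", "US")
Year_2000 <- c(30, 54, 67, 4, 7, 4, 5)
dataframe <- data.frame(Country_a, Country_b, Year_2000)

# Rows are Country_a, columns are Country_b, cells hold the summed Year_2000
# values (0 where a pair of countries does not occur in the data)
wide <- xtabs(Year_2000 ~ Country_a + Country_b, data = dataframe)

# Optional: turn the contingency table into an ordinary data frame
wide_df <- as.data.frame.matrix(wide)
```

Note that `xtabs` sums values when the same pair of countries appears more than once, which may or may not be what you want.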

1 Answer

You could consider storing your data as a sparse matrix instead of a data frame. This can be done, for instance, with the Matrix package.

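library(Matrix)
# factor() guarantees integer codes even when the columns are character vectors
# (data.frame() no longer converts strings to factors by default since R 4.0)
rows <- factor(dataframe$Country_a)
cols <- factor(dataframe$Country_b)
sparse_Matrix_output <- sparseMatrix(i = as.integer(rows), j = as.integer(cols), x = dataframe$Year_2000)
colnames(sparse_Matrix_output) <- levels(cols)
rownames(sparse_Matrix_output) <- levels(rows)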

And here's the output:

sparse_Matrix_output
7 x 7 sparse Matrix of class "dgCMatrix"
          Brazil China France Mexico Mongolia New Zealand US
Argentina     30     .      .      .        .           .  .
Brazil         .     .      4      .        .           .  .
Cananda        .     .      .      .        .           .  5
China          .     .      .     54        .           .  .
France         .     .      .      .        7           .  .
Germany        .     4      .      .        .           .  .
US             .     .      .      .        .          67  .
Alessio
  • Thank you for your help on this, @alessio! However, when I run this code on another dataframe, it sometimes gives me the error below, and I don't know why. Could you please help me with it? Error in dimnamesGets(x, value) : invalid dimnames given for “dgCMatrix” object. – Sondor Mar 29 '19 at 18:24
  • Hi @Sondor, any chance you could share a subset of the dataframe you mention, so I can try to reproduce the error? Also, have you had a look at https://stackoverflow.com/questions/32353191/error-when-making-a-sparse-matrix? – Alessio Mar 29 '19 at 19:16
  • Hi @alessio, I have uploaded it at the link above, please have a look there. For this dataset, I filtered two columns (reportername and partnername) to keep only the countries listed in another vector of country names, and then the code stopped working. The same code works on the original dataset, without the country filter. Thank you so much for your help! – Sondor Apr 01 '19 at 15:21
  • Can you report how you've defined the "wheat_try" dataframe? – Alessio Apr 01 '19 at 22:42
  • Hi @alessio, here is the output of `str(wheat_try)`: `'data.frame': 5901 obs. of 3 variables: $ ReporterName: Factor w/ 202 levels "Albania","Algeria",..: 192 192 192 192 192 192 192 192 192 192 ... $ PartnerName : Factor w/ 242 levels "Afghanistan",..: 12 20 40 46 79 105 108 111 114 113 ...  $ year2018 : num 0 0 0 0 0 0 0 0 0 0 ...` – Sondor Apr 03 '19 at 15:29
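
A possible explanation for the error, offered as a guess since the thread does not confirm it: after filtering, a factor keeps all of its original levels, so `levels()` can return more names than the sparse matrix has rows or columns. The sketch below reproduces that situation with a small hypothetical data frame (`df`, `reporter`, `partner` and `value` are made-up stand-ins for `wheat_try` and its columns) and shows `droplevels()` as one way around it.

```r
library(Matrix)

# Hypothetical stand-in for the filtered data: level "C" is declared but unused
df <- data.frame(
  reporter = factor(c("A", "B"), levels = c("A", "B", "C")),
  partner  = factor(c("X", "Y")),
  value    = c(1, 2)
)

# The matrix is sized from the largest index actually used: 2 x 2 here
m <- sparseMatrix(i = as.integer(df$reporter), j = as.integer(df$partner),
                  x = df$value)

# levels(df$reporter) has 3 entries, so assigning it as row names is expected to fail
err <- tryCatch({ rownames(m) <- levels(df$reporter); NULL },
                error = function(e) conditionMessage(e))

# Dropping unused levels first keeps the integer codes and the labels in sync
df <- droplevels(df)
m <- sparseMatrix(i = as.integer(df$reporter), j = as.integer(df$partner),
                  x = df$value,
                  dimnames = list(levels(df$reporter), levels(df$partner)))
```

If the empty rows and columns should be kept instead, an alternative is to pass `dims = c(nlevels(...), nlevels(...))` to `sparseMatrix()` so the matrix matches the full set of levels.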