data science tutorials and snippets prepared by greysweater42
reshape2 is an R package that lets you change the shape of any dataframe, i.e. to pivot and to “unpivot”.
Keep in mind that if your favourite R package for dataframes manipulation is data.table, functions dcast and melt are already in this package and work exactly the same as those in reshape2.
In fact there are only two functions worth mentioning: dcast, which is equivalent to MS Excel pivot table, and melt, which does the opposite or unpivots a table.
Here’s an example dataframe:
d <- data.frame(
account_no = paste(rep(7, 5), 1:5, sep=""),
Jan = rnorm(5, 10, 1),
Feb = rnorm(5, 10, 2),
Mar = rnorm(5, 10, 3)
)
print(d)
## account_no Jan Feb Mar
## 1 71 10.780595 11.831565 13.751443
## 2 72 9.838766 8.252641 8.160033
## 3 73 10.688876 10.686336 5.731527
## 4 74 8.287228 8.267534 6.156336
## 5 75 9.139528 11.398885 10.314572
Transormation into a normalized table (unpivot):
dn <- reshape2::melt(
data = d,
id.vars = "account_no",
variable.name = "month",
value.name = "revenue"
)
print(dn)
## account_no month revenue
## 1 71 Jan 10.780595
## 2 72 Jan 9.838766
## 3 73 Jan 10.688876
## 4 74 Jan 8.287228
## 5 75 Jan 9.139528
## 6 71 Feb 11.831565
## 7 72 Feb 8.252641
## 8 73 Feb 10.686336
## 9 74 Feb 8.267534
## 10 75 Feb 11.398885
## 11 71 Mar 13.751443
## 12 72 Mar 8.160033
## 13 73 Mar 5.731527
## 14 74 Mar 6.156336
## 15 75 Mar 10.314572
And back to the previous format using a pivot:
reshape2::dcast(
data = dn,
formula = account_no ~ month,
value.var = "revenue"
)
## account_no Jan Feb Mar
## 1 71 10.780595 11.831565 13.751443
## 2 72 9.838766 8.252641 8.160033
## 3 73 10.688876 10.686336 5.731527
## 4 74 8.287228 8.267534 6.156336
## 5 75 9.139528 11.398885 10.314572
A pretty nice and much longer tutorial is available here.