data science tutorials and snippets prepared by greysweater42
reshape2
is an R package that lets you change the shape of any dataframe, i.e. to pivot and to “unpivot”.
Keep in mind that if your favourite R package for dataframes manipulation is data.table, functions dcast and melt are already in this package and work exactly the same as those in reshape2
.
In fact there are only two functions worth mentioning: dcast, which is equivalent to MS Excel pivot table, and melt, which does the opposite or unpivots a table.
Here’s an example dataframe:
d <- data.frame(
account_no = paste(rep(7, 5), 1:5, sep=""),
Jan = rnorm(5, 10, 1),
Feb = rnorm(5, 10, 2),
Mar = rnorm(5, 10, 3)
)
print(d)
## account_no Jan Feb Mar
## 1 71 10.780595 11.831565 13.751443
## 2 72 9.838766 8.252641 8.160033
## 3 73 10.688876 10.686336 5.731527
## 4 74 8.287228 8.267534 6.156336
## 5 75 9.139528 11.398885 10.314572
Transormation into a normalized table (unpivot):
dn <- reshape2::melt(
data = d,
id.vars = "account_no",
variable.name = "month",
value.name = "revenue"
)
print(dn)
## account_no month revenue
## 1 71 Jan 10.780595
## 2 72 Jan 9.838766
## 3 73 Jan 10.688876
## 4 74 Jan 8.287228
## 5 75 Jan 9.139528
## 6 71 Feb 11.831565
## 7 72 Feb 8.252641
## 8 73 Feb 10.686336
## 9 74 Feb 8.267534
## 10 75 Feb 11.398885
## 11 71 Mar 13.751443
## 12 72 Mar 8.160033
## 13 73 Mar 5.731527
## 14 74 Mar 6.156336
## 15 75 Mar 10.314572
And back to the previous format using a pivot:
reshape2::dcast(
data = dn,
formula = account_no ~ month,
value.var = "revenue"
)
## account_no Jan Feb Mar
## 1 71 10.780595 11.831565 13.751443
## 2 72 9.838766 8.252641 8.160033
## 3 73 10.688876 10.686336 5.731527
## 4 74 8.287228 8.267534 6.156336
## 5 75 9.139528 11.398885 10.314572
A pretty nice and much longer tutorial is available here.