Tuesday, March 12, 2013

Melt and join

Following my last post on rearranging and managing arrays, I've started to make use of the reshape package in R. The key functions are melt() and cast(). The melt() function rearranges the data into a long format in which the ids are repeated for each covariate value. That is, each patient's row is replaced by several rows, and the information that was spread across multiple columns (one per variable) is put into only two: a variable column and a value column. Once the data are in this form you can go ahead and cast them into whatever 'shape' you want.

In my case, I wanted to flag patients according to a group membership determined by whether they had certain codes in any of their diagnosis fields. I used melt() to collapse the multiple diagnosis columns into a variable (which diagnosis field) and value (the code) column pairing. Here, the diagnosis number is not important. Then I matched the codes to groups using a look-up table (like a lookup in Excel) and the join() function from the plyr package. I gather join() and base R's merge() are similar, but join() is just the one I happened to use. Finally, I reconstructed the original array, now with the patient groupings.
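As a rough illustration of the steps described above, here is a minimal sketch. The patient data, the diagnosis codes, the look-up table and all of the column names are made up for this example, and counting with length in cast() is only one possible way of getting back to one row per patient, not necessarily what was done originally.

```r
library(reshape)  # melt(), cast()
library(plyr)     # join()

# Hypothetical data: one row per patient, one column per diagnosis field
patients <- data.frame(
  id    = c(1, 2, 3),
  diag1 = c("A01", "B02", "C03"),
  diag2 = c("B02", NA, "A01"),
  stringsAsFactors = FALSE
)

# Hypothetical look-up table from diagnosis code to group
lookup <- data.frame(
  value = c("A01", "B02"),
  group = c("infection", "cardiac"),
  stringsAsFactors = FALSE
)

# Collapse the diagnosis columns into a variable/value pair,
# dropping empty diagnosis fields
long <- melt(patients, id.vars = "id", na.rm = TRUE)
long$value <- as.character(long$value)  # make sure codes are plain strings

# Left join: every code keeps its row, unmatched codes get group = NA
joined <- join(long, lookup, by = "value")

# Keep only the codes that belong to a group, then count codes
# per patient and group to get back to one row per patient
matched <- joined[!is.na(joined$group), ]
flags <- cast(matched, id ~ group, fun.aggregate = length)

# Attach the group counts to the original wide data
result <- join(patients, as.data.frame(flags), by = "id")
print(result)
```

A count greater than zero in a group column means the patient had at least one code from that group. A patient with no matching codes at all would drop out of flags and so end up with NA in the group columns of result.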
