R Commander – data manipulation and summaries

June 14th, 2010

Previously we considered the R Commander interface as a simple GUI for the R statistical software system. Here we will look at how to manipulate data and create basic statistical summaries of data sets.


The R Commander GUI has two menus, “Data” and “Statistics”, that are used for manipulating data sets, calculating descriptive statistics and applying various commonly used statistical techniques. In the “Data” menu there is a sub-menu “Manage variables in active data set” that has some useful features. These include:

  • Compute new variables – used for transforming variables, e.g. converting to a logarithmic scale.
  • Standardise variables – centre each variable on its mean and scale it by its standard deviation, so that it has unit variance.
  • Convert numeric variables to factors – this is useful for categorical data recorded as numbers, where we want to work with the values as factor levels rather than as numeric quantities.
  • Bin numeric variable – in some situations converting a continuous measurement to groups can make exploratory analysis easier.
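
Each of these menu options corresponds to a short piece of R code. The sketch below shows base R equivalents of the four operations; it is not necessarily the exact code that R Commander generates, and the data frame `dat` and its column names are hypothetical, made up for illustration.

```r
# Hypothetical example data; the names and values are illustrative only.
dat <- data.frame(
  weight = c(12.1, 15.4, 9.8, 20.3, 17.6, 11.2),
  group  = c(1, 2, 1, 3, 2, 3)
)

# Compute new variable: natural logarithm of weight
dat$log.weight <- log(dat$weight)

# Standardise: subtract the mean and divide by the standard deviation
dat$z.weight <- as.numeric(scale(dat$weight))

# Convert numeric codes to a factor with named levels
dat$group.f <- factor(dat$group, levels = c(1, 2, 3), labels = c("A", "B", "C"))

# Bin a numeric variable into three equal-width intervals
dat$weight.bin <- cut(dat$weight, breaks = 3, labels = c("low", "mid", "high"))

str(dat)
```

After running this, `str(dat)` should list the original columns alongside the transformed, standardised, factor and binned versions, which is a quick way to check that each step did what was intended.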

The “Statistics” menu provides access to various descriptive and summary statistics via the “Summaries” sub-menu including:

  • Numerical summaries – mean, standard deviation or quantiles for a variable.
  • Frequency distributions – used to create tables that summarise the number of times each level of a factor occurs.
  • Table of statistics – a statistic such as the mean or standard deviation of a numeric variable, calculated for each of the groups within a categorical variable.
  • Correlation matrix – the correlation between a set of numeric variables in a data frame.
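
The summaries listed above can also be produced directly at the R prompt. The following is a minimal sketch using base R functions rather than the exact code R Commander writes to its script window; the data frame `dat` and its columns are hypothetical.

```r
# Hypothetical example data; the names and values are illustrative only.
dat <- data.frame(
  height = c(150, 162, 171, 158, 180, 167),
  weight = c(52, 61, 70, 55, 82, 66),
  sex    = factor(c("F", "F", "M", "F", "M", "M"))
)

# Numerical summaries: mean, standard deviation and quantiles
height.mean <- mean(dat$height)
height.sd   <- sd(dat$height)
height.q    <- quantile(dat$height)

# Frequency distribution: count of observations in each factor level
freq <- table(dat$sex)

# Table of statistics: mean and standard deviation of height within each sex
group.means <- tapply(dat$height, dat$sex, mean)
group.sds   <- tapply(dat$height, dat$sex, sd)

# Correlation matrix for the numeric variables
cor.mat <- cor(dat[, c("height", "weight")])

print(freq)
print(group.means)
print(cor.mat)
```

Using `tapply()` keeps the grouped summary to one line per statistic; `aggregate()` would be a reasonable alternative when a data frame is preferred as the result.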

There are other data manipulation options and summary functions available from these two menus.

Other useful resources are provided on the Supplementary Material page.
