Installing R on Ubuntu

February 8th, 2014

The R statistical software is distributed both as source code and as pre-compiled binaries. In most cases the binaries are sufficient, but sometimes it is necessary to compile the software from source. This post describes the steps required on an Ubuntu Linux system. Read the rest of this entry »

Google Maps and ggmap

December 22nd, 2013

The ggmap package can be used to access maps from the Google Maps API, and there are a number of examples on various statistics-related blogs. Read the rest of this entry »

Word Clouds using Text Mining

December 19th, 2013

An interesting blog post showed how straightforward it is to use the text mining tools (tm) in R along with the wordcloud package to create word clouds. Read the rest of this entry »

Seasonal Trend Decomposition in R

January 11th, 2013

Seasonal Trend Decomposition using Loess (STL) is an algorithm that divides a time series into three components: the trend, the seasonality and the remainder. The methodology was presented by Robert Cleveland, William Cleveland, Jean McRae and Irma Terpenning in the Journal of Official Statistics in 1990. STL is available within R via the stl function. Read the rest of this entry »
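
As a quick illustration of the stl function mentioned above, here is a minimal sketch. It assumes the nottem dataset that ships with R (monthly air temperatures at Nottingham) and a periodic seasonal window; both are choices made for this example rather than anything taken from the full post.

```r
# nottem is a monthly time series bundled with R (an assumption for this example)
fit <- stl(nottem, s.window = "periodic")

# The fitted object stores the three components as columns of a matrix
components <- fit$time.series
print(colnames(components))
print(head(components))

# The components are additive: summing them recovers the original series
recon <- rowSums(components)
print(all.equal(as.numeric(recon), as.numeric(nottem)))
```

Calling plot(fit) draws the original series and the three components in stacked panels.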

Split strings based on a character in the string

December 11th, 2012

R has various facilities for string manipulation, including the strsplit function, which divides a string into substrings wherever it matches a given pattern. Read the rest of this entry »
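
A minimal sketch of strsplit in use is given here; the example strings are made up for illustration. Note that the split argument is treated as a regular expression unless fixed = TRUE is supplied.

```r
# strsplit returns a list with one element per input string,
# so [[1]] extracts the pieces for a single string
x <- "alpha,beta,gamma"
parts <- strsplit(x, ",")[[1]]
print(parts)

# A full stop is a regular expression metacharacter, so match it literally
y <- "a.b.c"
literal <- strsplit(y, ".", fixed = TRUE)[[1]]
print(literal)
```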

Theme Elements in ggplot2

May 3rd, 2012

This website provides a simple summary of the theme elements that can be set within ggplot2. There should be sufficient information here to change the default settings for graphs within the ggplot2 package.

Melt

April 5th, 2012

There are many situations where data is presented in a format that is not ready for exploratory data analysis or for use with a desired statistical method. The reshape2 package for R provides useful functionality that avoids having to hack data around in a spreadsheet prior to import into R. Read the rest of this entry »

Useful functions for data frames in R

February 17th, 2012

This post will consider some useful functions for dealing with data frames during data processing and validation. Read the rest of this entry »

Surfaces in ternary plots

January 31st, 2012

In mixture experiments there is a constraint that the variables are the proportions of the components that are mixed together, with the consequence that these proportions sum to one. When fitting regression models to data from mixture experiments we may be interested in representing the fitted model with a surface plot. Read the rest of this entry »

Cricket All Round Performances

September 19th, 2011

In cricket a player who can perform well with both bat and ball is a great asset for any team, and across the history of international cricket there have been a number of cricketers who fall into this bracket. Read the rest of this entry »