Installing R on Ubuntu

February 8th, 2014

The R statistical software is distributed either as source code or as pre-compiled binaries. In most cases the binaries are sufficient, but sometimes it is necessary to compile the software from source, and this post describes the steps required on an Ubuntu Linux system.

Getting started with GAMLSS

January 19th, 2014

Generalized Additive Models for Location, Scale and Shape (GAMLSS) are a recent development that provides a framework with access to a large set of distributions and the ability to model all of the parameters of these distributions as functions of the explanatory variables in a data set.

Google Maps and ggmap

December 22nd, 2013

The ggmap package can be used to access maps from the Google Maps API, and a number of examples have appeared on various statistics-related blogs.

Word Clouds using Text Mining

December 19th, 2013

An interesting blog post showed how straightforward it is to use the text mining tools (tm) in R together with the wordcloud package to create word clouds.

Design of Experiments: General Block Design

October 12th, 2013

In some experiments, where the aim is to compare a set of treatments, there are one or two sources of variation that can be accounted for at the design stage of a study. The statistical technique used in these situations is blocking, and it can reduce the variance of pairwise treatment comparisons.
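As a rough illustration of why blocking helps, the sketch below simulates a small randomized complete block design and compares the residual mean square with and without the block term. The data, effect sizes and variable names (block, treatment, y) are made up for this example and are not taken from the post itself.

```r
# Simulated randomized complete block design: 6 blocks, 3 treatments.
# All effect sizes below are arbitrary assumptions for illustration.
set.seed(1)
d <- expand.grid(block = factor(1:6), treatment = factor(c("A", "B", "C")))

block_effect <- rnorm(6, sd = 3)[as.integer(d$block)]       # large block-to-block variation
treat_effect <- c(0, 1, 2)[as.integer(d$treatment)]         # treatment effects for A, B, C
d$y <- 10 + block_effect + treat_effect + rnorm(nrow(d), sd = 1)

# Analysis ignoring blocks versus analysis that accounts for them.
fit_ignore <- aov(y ~ treatment, data = d)
fit_block  <- aov(y ~ block + treatment, data = d)

# The residual mean square is the error variance estimate that feeds into
# the standard error of each pairwise treatment comparison.
ms_ignore <- deviance(fit_ignore) / df.residual(fit_ignore)
ms_block  <- deviance(fit_block) / df.residual(fit_block)
print(c(ignoring_blocks = ms_ignore, with_blocks = ms_block))
```

When the blocks really do differ, the block term absorbs that variation and the residual mean square drops, which is what tightens the treatment comparisons.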

Design of Experiments: Blocking, Confounding and Interactions

September 27th, 2013

In a previous post we considered some general points about experimental design. In this post we will look at some other common considerations when planning an experiment, specifically blocking, confounding and interactions.

Design of Experiments: General Background

June 2nd, 2013

The statistical methodology of design of experiments has a long history, dating back to the work of Fisher, Yates and other researchers. One of its main motivations is to make good use of available resources and to avoid making decisions that cannot be corrected during the analysis stage of an investigation.

Book on Time Series Forecasting

May 6th, 2013

The online book on time series forecasting methods by Rob Hyndman and George Athanasopoulos has been completed and was announced on the Hyndsight blog. It is a very accessible book and worth reading to understand time series methodology and useful strategies for making predictions using these models.

Link R and Hadoop

April 30th, 2013

The RHIPE website provides details on linking R and Hadoop to combine the strengths of both.

Seasonal Trend Decomposition in R

January 11th, 2013

Seasonal Trend Decomposition using Loess (STL) is an algorithm that divides a time series into three components: trend, seasonality and remainder. The methodology was presented by Robert Cleveland, William Cleveland, Jean McRae and Irma Terpenning in the Journal of Official Statistics in 1990. STL is available within R via the stl function.
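A minimal sketch of calling stl is shown below. The choice of the built-in nottem series (monthly air temperatures at Nottingham) and of s.window = "periodic" are assumptions made for this illustration; the full post may use different data and settings.

```r
# Decompose a monthly series with stl() from the base stats package.
# nottem ships with R in the datasets package.
fit <- stl(nottem, s.window = "periodic")

# The fitted components are stored as columns of a multivariate time series.
components <- fit$time.series
print(colnames(components))

# STL is additive, so the three components sum back to the original series.
recon <- rowSums(components)
print(all.equal(as.numeric(recon), as.numeric(nottem)))

# Plot the data, seasonal, trend and remainder panels.
plot(fit)
```

Setting s.window = "periodic" forces the seasonal pattern to be identical from year to year; supplying an odd integer instead lets the seasonal component change over time.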