The Generalized Additive Models for Location, Scale and Shape (GAMLSS) is a recent development which provides a framework with access to a large set of distributions and the ability to model all of the parameters of these distributions as functions of the explanatory variables within a data set. Read the rest of this entry »
The online book on time series forecasting methods by Rob Hyndman and George Athanasopoulos has been completed and was announced on the Hyndsight blog. It is a very accessible book and worth reading to understand time series methodology and useful strategies for making predictions using these models.
The Seasonal Trend Decomposition using Loess (STL) is an algorithm that was developed to help to divide up a time series into three components namely: the trend, seasonality and remainder. The methodology was presented by Robert Cleveland, William Cleveland, Jean McRae and Irma Terpenning in the Journal of Official Statistics in 1990. The STL is available within R via the stl function. Read the rest of this entry »
The pie chart is a frequently seen graph that uses area to compare percentages for a set of categories. Although this type of graph is based on comparing single metric for each category the display is two dimensional but sometimes even appears in three dimensions. Read the rest of this entry »
There are a set of basic principles that hold true for the design of many graphs and various authors have their own preferences. One author who is prominent due to his good work in the area of data visualisation and presentation of evidence to support decision making is Edward Tufte. Read the rest of this entry »
David Firth published a paper in 1993 on maximum likelihood estimation and the reduction of bias when using this approach. The research in this area appears to provide benefit for logistic regression in small data sets where there is complete of quasi separation. This approach has been implemented for Generalized Linear Models in the brglm package. Read the rest of this entry »
The Generalized Linear Model (GLM) allows us to model responses with distributions other than the Normal distribution, which is one of the assumptions underlying linear regression as used in many cases. When data is counts of events (or items) then a discrete distribution is more appropriate is usually more appropriate than approximating with a continuous distribution, especially as our counts should be bounded below at zero. Negative counts do not make sense. Read the rest of this entry »
As many people are aware Hans Rosling is an enthusiastic swedish academic with a passion for statistics who recently presented the program The Joy of Stats. One of the great things about Hans Rosling is his presentations and the interactive graphics that he uses to make his points. Read the rest of this entry »