Cricket is a sport that generates a large volume of performance data, and with it plenty of debate about how players compare over their careers and against their contemporaries. The Cricinfo website has an extensive database of statistics for professional cricketers that can be searched to access the information in various formats.
As an initial example we will consider the English legend Sir Ian Botham, who played 102 test matches for England between his debut in 1977 and his final game in 1992.
The first obvious breakdown is to consider how Botham performed against each of the six countries that he played against during his test career. A summary of his statistics is shown here:
Opposition   Matches  Bat Inns  Runs  NO  Bowl Inns  Wicket  Catch
Australia         36        49  1673   2         66     148     57
India             14        16  1201   0         23      59     14
New Zealand       15        22   846   2         28      64     14
Pakistan          14        20   647   1         18      40     14
Sri Lanka          3         3    41   0          6      11      2
West Indies       20        37   792   1         27      61     19
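One way to hold the summary table in R is sketched below. The values are transcribed from the table; the column names BatInns and BowlInns are assumptions chosen here because the original headers contain spaces, and the original post may have named them differently.

```r
# Botham's test record by opposition, transcribed from the summary table.
# BatInns and BowlInns are illustrative column names (assumed, not from the source).
itb.opp <- data.frame(
  Opposition = c("Australia", "India", "New Zealand",
                 "Pakistan", "Sri Lanka", "West Indies"),
  Matches    = c(36, 14, 15, 14, 3, 20),
  BatInns    = c(49, 16, 22, 20, 3, 37),
  Runs       = c(1673, 1201, 846, 647, 41, 792),
  NO         = c(2, 0, 2, 1, 0, 1),
  BowlInns   = c(66, 23, 28, 18, 6, 27),
  Wicket     = c(148, 59, 64, 40, 11, 61),
  Catch      = c(57, 14, 14, 14, 2, 19)
)

# Quick sanity check: the match column should add up to his 102 tests.
print(sum(itb.opp$Matches))
```

Keeping Opposition as a plain character column is enough for plotting by country, since it is treated as a discrete axis.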
Botham played only three matches against Sri Lanka, so it is difficult to assess his performance against them properly. If the above table is stored in a data frame itb.opp then we can create a bar chart of the total runs (or wickets) by opposition country:
library(ggplot2)
# stat = "identity" plots the Runs values directly rather than counting rows
ggplot(itb.opp, aes(Opposition, Runs)) + geom_bar(stat = "identity") + xlab("Country") +
  ylab("Total Runs")
This code produces the following graph:
The total wickets graph is produced by the next code:
ggplot(itb.opp, aes(Opposition, Wicket)) + geom_bar(stat = "identity") + xlab("Country") +
  ylab("Total Wickets")
We may now want to delve deeper into the performance against different nations to take into account the number of games or innings where Botham batted or bowled. The traditional way to assess performance is to calculate batting and bowling averages, and doing this by opposition gives the following data frame:
> itb.opp.sum
Opposition   Discipline   Average
Australia    Batting     29.35088
India        Batting     70.64706
New Zealand  Batting     42.30000
Pakistan     Batting     32.35000
Sri Lanka    Batting     13.66667
West Indies  Batting     21.40541
Australia    Bowling     27.65541
India        Bowling     26.40678
New Zealand  Bowling     23.43750
Pakistan     Bowling     31.77500
Sri Lanka    Bowling     28.18182
West Indies  Bowling     35.18033
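The averages above follow the usual cricket definitions: a batting average is runs scored divided by dismissals (innings minus not outs), and a bowling average is runs conceded divided by wickets taken. As a minimal sketch, the two helper functions below use illustrative names and are not part of the original analysis. The New Zealand batting figures come from the summary table; the 1500 runs conceded is not listed there and is back-calculated here from the 23.4375 average and 64 wickets.

```r
# Batting average: runs per dismissal (not-out innings do not count as dismissals).
batting_average <- function(runs, innings, not_outs) {
  runs / (innings - not_outs)
}

# Bowling average: runs conceded per wicket taken (lower is better).
bowling_average <- function(runs_conceded, wickets) {
  runs_conceded / wickets
}

# New Zealand row: 846 runs from 22 innings with 2 not outs; 64 wickets.
# runs_conceded = 1500 is an assumption implied by the published average.
nz_bat  <- batting_average(846, 22, 2)
nz_bowl <- bowling_average(1500, 64)

print(c(Batting = nz_bat, Bowling = nz_bowl))
```

Both functions are vectorised, so they could be applied to whole columns of a data frame in one call if the runs conceded were available for every opposition.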
The itb.opp.sum data frame can be converted into a dot plot so we can see whether Botham had a higher batting average than bowling average, which is often taken to be one of the signs of an all-rounder.
ggplot(itb.opp.sum, aes(Average, Opposition, colour = Discipline)) +
  geom_point() + xlab("Average") + ylab("")
The graph is shown here:
We can see the differences in performance based on the opposition. Botham’s performance against the West Indies, by far the strongest team during most of his international career, was weaker than against any other country he faced regularly, with a batting average of 21.4 and a bowling average of 35.2. However, his averages were far from embarrassing when compared to other players at the time. The graph also shows that Botham thrived against India, averaging 70.6 with the bat and 26.4 with the ball.
We can divide this data further according to whether the matches were played in England (home) or elsewhere (away), and this data is shown here:
> itb.opp.ha.sum
Opposition   Venue  Discipline   Average
Australia    Away   Batting     30.22581
India        Away   Batting     61.55556
New Zealand  Away   Batting     50.44444
Pakistan     Away   Batting     16.00000
Sri Lanka    Away   Batting     13.00000
West Indies  Away   Batting     14.17647
Australia    Home   Batting     28.30769
India        Home   Batting     80.87500
New Zealand  Home   Batting     35.63636
Pakistan     Home   Batting     34.16667
Sri Lanka    Home   Batting     14.00000
West Indies  Home   Batting     27.55000
Australia    Away   Bowling     28.44928
India        Away   Bowling     25.53333
New Zealand  Away   Bowling     27.44444
Pakistan     Away   Bowling     45.00000
Sri Lanka    Away   Bowling     21.66667
West Indies  Away   Bowling     39.50000
Australia    Home   Bowling     26.96203
India        Home   Bowling     27.31034
New Zealand  Home   Bowling     20.51351
Pakistan     Home   Bowling     31.07895
Sri Lanka    Home   Bowling     30.62500
West Indies  Home   Bowling     31.97143
A dot plot is created from this data with a separate panel for each of the six opposition countries and the averages divided into batting and bowling performances. The coloured dots in the graph indicate whether the average is for matches at home or away.
ggplot(itb.opp.ha.sum, aes(Average, Discipline, colour = Venue)) +
  geom_point() + facet_wrap( ~ Opposition) +
  xlab("Average") + ylab("")
This graph is shown below:
We can see that the difference between home and away performance is, in general, not very large for bowling averages, but in some cases there is a noticeable difference in batting averages. Against the West Indies, Botham’s statistics at home are much better than his away figures, suggesting that his main struggles against the strong West Indies team were in the Caribbean. This might be due to his swing bowling being better suited to English conditions than to pitches in the West Indies.
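The home and away gap can also be put into numbers. The sketch below is one possible approach rather than the original analysis: it takes only the four West Indies rows from the home and away table, pairs each away average with its home counterpart, and subtracts. The object names wi, away, home and gap are illustrative.

```r
# West Indies rows transcribed from the home/away averages table.
wi <- data.frame(
  Opposition = "West Indies",
  Venue      = c("Away", "Home", "Away", "Home"),
  Discipline = c("Batting", "Batting", "Bowling", "Bowling"),
  Average    = c(14.17647, 27.55000, 39.50000, 31.97143)
)

# Split by venue, then join the two halves on opposition and discipline.
away <- subset(wi, Venue == "Away", select = c(Opposition, Discipline, Average))
home <- subset(wi, Venue == "Home", select = c(Opposition, Discipline, Average))
gap  <- merge(away, home, by = c("Opposition", "Discipline"),
              suffixes = c(".away", ".home"))

# Away minus home: negative is worse for batting, positive is worse for bowling.
gap$Difference <- gap$Average.away - gap$Average.home
print(gap)
```

The same steps should work on the full itb.opp.ha.sum data frame, giving one difference per opposition and discipline.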
To round off this brief look at the career of IT Botham let us consider some other important statistics, in particular games where he performed with both bat and ball.
- Overall Botham scored 14 hundreds and 22 fifties out of 161 innings, so he reached fifty roughly once every four or five innings.
- He also took 27 five-wicket hauls and 17 four-wicket hauls, so he took four or more wickets roughly once every four bowling innings.
- He took 120 catches.
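The rates quoted in the list can be checked with a little arithmetic. In the sketch below the variable names are illustrative, and the bowling innings total is an assumption: it is taken as the sum of the Bowl Inns column in the summary table, since the text does not state the denominator directly.

```r
# Batting: scores of fifty or more, out of 161 innings.
fifty_plus  <- 14 + 22
bat_innings <- 161

# Bowling: hauls of four or more wickets.
# Assumption: bowling innings = sum of the Bowl Inns column in the summary table.
four_plus_hauls <- 27 + 17
bowl_innings    <- sum(c(66, 23, 28, 18, 6, 27))

innings_per_fifty <- bat_innings / fifty_plus
innings_per_haul  <- bowl_innings / four_plus_hauls

print(round(c(PerFifty = innings_per_fifty, PerHaul = innings_per_haul), 2))
```

On these figures he passed fifty about once every 4.5 innings and took four or more wickets about once every 3.8 bowling innings.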
Individual matches of excellence include five games with a century and at least five wickets:
Year  Opposition   Ground        Venue  Runs  Wicket
1978  New Zealand  Christchurch  Away    133       8
1978  Pakistan     Lord's        Home    108       8
1980  India        Mumbai        Away    114      13
1981  Australia    Leeds         Home    199       7
1984  New Zealand  Wellington    Away    138       6
These performances and others show why Botham was considered such a great player: he produced sustained periods of excellent all-round cricket rather than relying on one dominant discipline for long stretches.



