<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Software for Exploratory Data Analysis and Statistical Modelling &#187; Probability Distributions</title>
	<atom:link href="http://www.wekaleamstudios.co.uk/topics/statistical-analysis/probability-distributions/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.wekaleamstudios.co.uk</link>
	<description>Statistical Modelling with R</description>
	<lastBuildDate>Wed, 01 Feb 2012 19:44:22 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>R and Tolerance Intervals</title>
		<link>http://www.wekaleamstudios.co.uk/posts/r-and-tolerance-intervals/</link>
		<comments>http://www.wekaleamstudios.co.uk/posts/r-and-tolerance-intervals/#comments</comments>
		<pubDate>Mon, 19 Apr 2010 20:19:31 +0000</pubDate>
		<dc:creator>Ralph</dc:creator>
				<category><![CDATA[Data Summary]]></category>
		<category><![CDATA[Exploratory Data Analysis]]></category>
		<category><![CDATA[Probability Distributions]]></category>
		<category><![CDATA[normtol.int]]></category>
		<category><![CDATA[tolerance]]></category>
		<category><![CDATA[Tolerance Intervals]]></category>

		<guid isPermaLink="false">http://www.wekaleamstudios.co.uk/?p=905</guid>
		<description><![CDATA[Confidence intervals and prediction intervals are used by statisticians on a regular basis. Another useful interval is the tolerance interval, which gives a range expected to contain at least a specified proportion of a population with a stated level of confidence. The R package tolerance can be used to create a variety of tolerance intervals of [...]]]></description>
			<content:encoded><![CDATA[<p>Confidence intervals and prediction intervals are used by statisticians on a regular basis. Another useful interval is the tolerance interval, which gives a range expected to contain at least a specified proportion of a population with a stated level of confidence. The <strong>R</strong> package <strong>tolerance</strong> can be used to create a variety of tolerance intervals.<span id="more-905"></span></p>
<p>The tolerance limits are the end points of this interval: the values between which the stated proportion of the population is expected to lie. The function <strong>normtol.int</strong> from the <strong>tolerance</strong> package can be used to calculate a tolerance interval for data from a normal distribution.</p>
<p>The function takes the data as a vector, <strong>x</strong>. The confidence level associated with the tolerance interval is specified through <strong>alpha</strong>, which is one minus the confidence level, so <strong>alpha</strong> is 0.05 for 95% confidence. The argument <strong>P</strong> is the proportion of the population to be covered by the tolerance interval. The <strong>side</strong> argument determines whether a one-sided (<strong>side = 1</strong>) or two-sided (<strong>side = 2</strong>) interval is calculated.</p>
<p>Consider a simulated set of data from a manufacturing process loaded into R, stored as vector object <strong>obs</strong>, as follows:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">obs = c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)</pre></div></div>

<p>A tolerance interval that covers 90% of data of this type with 95% confidence, based on the 25 observations above, is created with this code:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">&gt; normtol.int(x = obs, alpha = 0.05, P = 0.90, side = 2)
  alpha   P    x.bar 2-sided.lower 2-sided.upper
1  0.05 0.9 100.5192      90.07606      110.9623</pre></div></div>

<p>The values of <strong>alpha</strong> and <strong>P</strong> are as noted above. The average of the data is reported along with the lower and upper tolerance limits, since we asked for a two-sided interval. The call is easily changed to cover 95% rather than 90% of the data:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">&gt; normtol.int(x = obs, alpha = 0.05, P = 0.95, side = 2)
  alpha    P    x.bar 2-sided.lower 2-sided.upper
1  0.05 0.95 100.5192      88.07543      112.9630</pre></div></div>
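
<p>To see roughly where such limits come from, here is a minimal base-R sketch of a two-sided normal tolerance interval using Howe's approximation for the tolerance factor. The helper name <strong>howe.tol</strong> is hypothetical and is not part of the <strong>tolerance</strong> package. The approximation used by <strong>normtol.int</strong> may differ between package versions, so the limits should be close to those printed above but need not match them exactly.</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;"># Hypothetical helper (not from the tolerance package): two-sided normal
# tolerance interval based on Howe's approximation to the tolerance factor k.
howe.tol = function(x, alpha = 0.05, P = 0.90) {
  n = length(x)
  z = qnorm((1 + P) / 2)           # normal quantile for a central proportion P
  chi = qchisq(alpha, df = n - 1)  # lower alpha quantile of the chi-square
  k = sqrt((n - 1) * (1 + 1 / n) * z^2 / chi)
  c(x.bar = mean(x), lower = mean(x) - k * sd(x), upper = mean(x) + k * sd(x))
}

# Same 25 observations as in the example above
obs = c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)

howe.tol(obs, alpha = 0.05, P = 0.90)
howe.tol(obs, alpha = 0.05, P = 0.95)</pre></div></div>

<p>The factor <strong>k</strong> grows with <strong>P</strong> and shrinks as the sample size increases, which is why covering 95% of the population gives a wider interval than covering 90%.</p>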

<p>The package <strong>tolerance</strong> can create intervals for other data distributions.</p>
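<p>Returning to normal data, the one-sided case selected with <strong>side = 1</strong> can also be sketched in base R. The tolerance factor for a one-sided normal limit can be written in terms of a noncentral t quantile. The helper name <strong>one.sided.tol</strong> is hypothetical, and the result is not guaranteed to agree digit for digit with the package output.</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;"># Hypothetical helper (not from the tolerance package): one-sided normal
# tolerance limits using a noncentral t quantile for the tolerance factor.
one.sided.tol = function(x, alpha = 0.05, P = 0.90) {
  n = length(x)
  # Noncentrality parameter is the normal quantile of P scaled by sqrt(n)
  k = qt(1 - alpha, df = n - 1, ncp = qnorm(P) * sqrt(n)) / sqrt(n)
  c(k = k, lower = mean(x) - k * sd(x), upper = mean(x) + k * sd(x))
}

# Same 25 observations as in the example above
obs = c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)

one.sided.tol(obs, alpha = 0.05, P = 0.90)</pre></div></div>

<p>Each limit is read on its own: with the stated confidence, at least a proportion <strong>P</strong> of the population lies above the lower limit, or, separately, below the upper limit.</p>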
]]></content:encoded>
			<wfw:commentRss>http://www.wekaleamstudios.co.uk/posts/r-and-tolerance-intervals/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Plotting Probability Distributions</title>
		<link>http://www.wekaleamstudios.co.uk/posts/plotting-probability-distributions/</link>
		<comments>http://www.wekaleamstudios.co.uk/posts/plotting-probability-distributions/#comments</comments>
		<pubDate>Tue, 02 Jun 2009 19:53:26 +0000</pubDate>
		<dc:creator>Ralph</dc:creator>
				<category><![CDATA[Base Graphics]]></category>
		<category><![CDATA[Probability Distributions]]></category>
		<category><![CDATA[abline]]></category>
		<category><![CDATA[expression]]></category>
		<category><![CDATA[main]]></category>
		<category><![CDATA[Mathematical Labels]]></category>
		<category><![CDATA[mean]]></category>
		<category><![CDATA[Normal]]></category>
		<category><![CDATA[plot]]></category>
		<category><![CDATA[Probability Distribution]]></category>
		<category><![CDATA[Standard Normal]]></category>
		<category><![CDATA[variance]]></category>
		<category><![CDATA[xlab]]></category>
		<category><![CDATA[ylab]]></category>

		<guid isPermaLink="false">http://www.wekaleamstudios.co.uk/?p=196</guid>
		<description><![CDATA[There are many distributions available within the base R Statistical System and it is possible to use these functions to visualise the density or cumulative distribution function of a distribution with a given set of parameters. To illustrate this we could use the standard normal distribution, which has zero mean and variance of one [...]]]></description>
			<content:encoded><![CDATA[<p>There are many distributions available within the base R Statistical System and it is possible to use these functions to visualise the density or cumulative distribution function of a distribution with a given set of parameters.<span id="more-196"></span></p>
<p>To illustrate this we could use the standard normal distribution, which has zero mean and variance of one and whose cumulative distribution function has the familiar S-shape. To plot the distribution we first create the values at which it will be evaluated, a sequence ranging from -4 to +4 in steps of 0.1, and save them to a variable <strong>tempX</strong> so that they can be used in the <strong>plot</strong> function:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">tempX = seq(-4, 4, 0.1)</pre></div></div>

<p>The next step is to call the <strong>plot</strong> function with the X and Y values that we want to plot against each other. We have already defined the X values, so we use the <strong>pnorm</strong> function to calculate the cumulative probability at each of them. We also set the text for the title and the two axes using the arguments <strong>main</strong>, <strong>xlab</strong> and <strong>ylab</strong>. We use the <strong>expression</strong> function to create a label containing mathematical symbols, so that <strong>mu</strong> and <strong>sigma</strong> are drawn as the corresponding Greek letters. Lastly, the option <strong>type = &#8220;l&#8221;</strong> makes the <strong>plot</strong> function draw a line rather than symbols. Our final function call is:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">plot(tempX, pnorm(tempX, mean=0, sd=1), xlab=&quot;X Values&quot;,
  ylab=&quot;Cumulative Probability&quot;, 
  main = expression(paste(&quot;Normal Distribution: &quot;, mu, &quot; = 0, &quot;,
    sigma, &quot; = 1&quot;)), type=&quot;l&quot;)</pre></div></div>

<p>We add a horizontal grey line at the bottom of the graph using the <strong>abline</strong> function:</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;">abline(h=0, col=&quot;gray&quot;)</pre></div></div>

<p>The graph that is produced looks like this:<br />
<div id="attachment_205" class="wp-caption aligncenter" style="width: 310px"><img src="http://www.wekaleamstudios.co.uk/wp-content/uploads/2009/05/normal-distribution-300x300.png" alt="Plot of the Cumulative Standard Normal Distribution" title="Cumulative Normal Distribution" width="300" height="300" class="size-medium wp-image-205" /><p class="wp-caption-text">Plot of the Cumulative Standard Normal Distribution</p></div></p>
<p>We can use this approach to visualise the density or cumulative distribution function of any distribution that is available in <strong>R</strong>.</p>
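<p>As a sketch of the same approach applied to a density rather than a cumulative probability, the code below swaps <strong>pnorm</strong> for <strong>dnorm</strong>. The object name <strong>tempY</strong> and the axis label are illustrative choices that do not appear in the example above.</p>

<div class="wp_syntax"><div class="code"><pre class="text" style="font-family:monospace;"># X values at which the density is evaluated
tempX = seq(-4, 4, 0.1)
# Density of the standard normal distribution at each X value
tempY = dnorm(tempX, mean=0, sd=1)
plot(tempX, tempY, xlab=&quot;X Values&quot;, ylab=&quot;Density&quot;,
  main = expression(paste(&quot;Normal Distribution: &quot;, mu, &quot; = 0, &quot;,
    sigma, &quot; = 1&quot;)), type=&quot;l&quot;)
# Horizontal grey reference line at zero
abline(h=0, col=&quot;gray&quot;)</pre></div></div>

<p>Other distributions follow the same naming pattern, with a <strong>d</strong> prefix for the density and a <strong>p</strong> prefix for the cumulative probability, for example <strong>dexp</strong> and <strong>pexp</strong> for the exponential distribution.</p>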
]]></content:encoded>
			<wfw:commentRss>http://www.wekaleamstudios.co.uk/posts/plotting-probability-distributions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

