<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.scott5.org/index.php?action=history&amp;feed=atom&amp;title=R_Statistics</id>
	<title>R Statistics - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.scott5.org/index.php?action=history&amp;feed=atom&amp;title=R_Statistics"/>
	<link rel="alternate" type="text/html" href="https://wiki.scott5.org/index.php?title=R_Statistics&amp;action=history"/>
	<updated>2026-04-13T00:31:40Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.1</generator>
	<entry>
		<id>https://wiki.scott5.org/index.php?title=R_Statistics&amp;diff=255&amp;oldid=prev</id>
		<title>Scott: Created page with &#039;==Basic stats==  &#039;&#039;&#039;&lt;code&gt;mean, min, max, range = c(min, max)&lt;/code&gt;&#039;&#039;&#039; &lt;pre&gt; mean(c(1,2,3,4,5,NA),na.rm=TRUE)   # 3, ignore NA&#039;s mean(c(-1,0:100,2000),trim=0.1)    # 50, ignore …&#039;</title>
		<link rel="alternate" type="text/html" href="https://wiki.scott5.org/index.php?title=R_Statistics&amp;diff=255&amp;oldid=prev"/>
		<updated>2011-02-01T23:31:13Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;#039;==Basic stats==  &amp;#039;&amp;#039;&amp;#039;&amp;lt;code&amp;gt;mean, min, max, range = c(min, max)&amp;lt;/code&amp;gt;&amp;#039;&amp;#039;&amp;#039; &amp;lt;pre&amp;gt; mean(c(1,2,3,4,5,NA),na.rm=TRUE)   # 3, ignore NA&amp;#039;s mean(c(-1,0:100,2000),trim=0.1)    # 50, ignore …&amp;#039;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Basic stats==&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;&amp;lt;code&amp;gt;mean, min, max, range = c(min, max)&amp;lt;/code&amp;gt;&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mean(c(1,2,3,4,5,NA),na.rm=TRUE)   # 3, ignore NA&amp;#039;s&lt;br /&gt;
mean(c(-1,0:100,2000),trim=0.1)    # 50, ignore the lowest 10% and highest 10% of values before averaging&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
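&amp;#039;&amp;#039;&amp;#039;&amp;lt;code&amp;gt;quantile, fivenum, IQR, summary&amp;lt;/code&amp;gt;&amp;#039;&amp;#039;&amp;#039; give quantile-based summaries: arbitrary quantiles, the five-number summary, the interquartile range, and a min/quartiles/mean/max overview&lt;br /&gt;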
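&lt;br /&gt;
As a rough sketch of how these functions behave on the outlier-laden vector used above (the variable names are only illustrative), the following compares the plain mean, the trimmed mean, and the quantile helpers:&lt;br /&gt;

```r
# Illustrative vector: 0..100 plus one low and one high outlier
x = c(-1, 0:100, 2000)

plain   = mean(x)               # pulled upward by the outlier 2000
trimmed = mean(x, trim = 0.1)   # drops the 10 smallest and 10 largest of the 103 values
med     = median(x)

print(c(plain = plain, trimmed = trimmed, median = med))
print(quantile(x, c(0.25, 0.5, 0.75)))  # arbitrary quantiles
print(fivenum(x))                       # min, lower hinge, median, upper hinge, max
print(IQR(x))                           # 75th percentile minus 25th percentile
print(summary(x))                       # min, quartiles, mean, max
```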
&lt;br /&gt;
==Correlation and covariance==&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Pearson&amp;#039;&amp;#039;&amp;#039; correlation measures linear association; significance tests based on it assume approximately normally distributed data&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Spearman&amp;#039;&amp;#039;&amp;#039; correlation is nonparametric: it is computed on ranks, measures monotonic association, and makes no assumptions about the underlying distribution:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cor(x, y, method=&amp;quot;spearman&amp;quot;)   # correlation&lt;br /&gt;
cov(x, y, method=&amp;quot;pearson&amp;quot;)    # covariance&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
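&lt;br /&gt;
A small sketch of the difference between the two methods, using made-up data in which y increases with x but not linearly (the data and variable names are assumptions chosen for illustration):&lt;br /&gt;

```r
# Made-up data: y is a strictly increasing but strongly nonlinear function of x
x = 1:20
y = exp(x / 2)

pearson  = cor(x, y)                        # default method is "pearson"
spearman = cor(x, y, method = "spearman")   # computed on the ranks of x and y
covxy    = cov(x, y)                        # covariance, in the units of x times y

print(c(pearson = pearson, spearman = spearman))

# Pearson correlation is the covariance rescaled by both standard deviations
print(covxy / (sd(x) * sd(y)))
```

Because the relationship is perfectly monotonic, the rank-based coefficient reaches 1 while the Pearson coefficient stays below 1.&lt;br /&gt;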
&lt;br /&gt;
==Principal Components Analysis==&lt;br /&gt;
&lt;br /&gt;
http://en.wikipedia.org/wiki/Principal_component_analysis&lt;br /&gt;
&lt;br /&gt;
http://www.youtube.com/watch?v=BfTMmoDFXyE&lt;br /&gt;
&lt;br /&gt;
You have a data set in N dimensions.  The first principal component is the linear combination of these dimensions that captures the largest share of the variance in the data.  The second principal component is orthogonal to the first and captures the largest share of the remaining variance, and so on.  PCA is useful for exploring a large multi-dimensional data set.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;&amp;lt;code&amp;gt;princomp&amp;lt;/code&amp;gt;&amp;#039;&amp;#039;&amp;#039; computes the components from an eigenvalue decomposition of the data covariance (or correlation) matrix.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;&amp;lt;code&amp;gt;prcomp&amp;lt;/code&amp;gt;&amp;#039;&amp;#039;&amp;#039; uses a singular value decomposition of the centered data matrix, which is generally preferred for numerical accuracy.&lt;br /&gt;
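&lt;br /&gt;
A minimal sketch of both functions on the same data. The USArrests data set is bundled with base R and is used here only as an assumed example; the variables are standardized because they are on very different scales:&lt;br /&gt;

```r
# USArrests ships with base R (datasets package); it is only an example choice
pc_svd = prcomp(USArrests, scale. = TRUE)   # SVD of the centered, scaled data
pc_eig = princomp(USArrests, cor = TRUE)    # eigen decomposition of the correlation matrix

# Share of the total variance captured by each component, largest first
var_share = pc_svd$sdev^2 / sum(pc_svd$sdev^2)
print(round(var_share, 3))

# The loading vectors are orthonormal: this is numerically the identity matrix
print(round(crossprod(pc_svd$rotation), 10))

# On standardized data both routes give the same component standard deviations
print(all.equal(pc_svd$sdev, as.numeric(pc_eig$sdev)))
```

Without standardization the two sets of standard deviations differ slightly, since princomp divides by n and prcomp by n - 1 when computing variances.&lt;br /&gt;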
&lt;br /&gt;
==Probability Distributions==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dnorm(x, mean = 0, sd = 1, log = FALSE)                         # density function, dnorm(0) = 0.3989423&lt;br /&gt;
pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)    # distribution function: pnorm(0) = 0.5&lt;br /&gt;
qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)    # quantile function: qnorm(0.5) = 0 (aka inverse dist fn)&lt;br /&gt;
rnorm(n, mean = 0, sd = 1)                                      # generate n random values from norm dist&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The same four functions (prefixes d, p, q, r) are available for the beta, binomial, Cauchy, and other distributions, e.g. dbinom, pbeta, qcauchy.&lt;br /&gt;
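&lt;br /&gt;
A short sketch of how the four prefixes fit together, using the normal, binomial, beta, and Cauchy families (the parameter values and the seed are arbitrary choices for illustration):&lt;br /&gt;

```r
p = 0.975
z = qnorm(p)       # quantile function: the z with P(Z at most z) = p
back = pnorm(z)    # distribution function undoes the quantile function
print(c(z = z, back = back))

print(dbinom(3, size = 10, prob = 0.5))     # P(X = 3) = 120/1024
print(pbinom(3, size = 10, prob = 0.5))     # P(X at most 3) = 176/1024
print(qbeta(0.5, shape1 = 2, shape2 = 2))   # median of the symmetric Beta(2, 2)

set.seed(42)         # arbitrary seed so the random draws are repeatable
print(rcauchy(3))    # three random values from the standard Cauchy
```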
&lt;br /&gt;
==Compare data set to a distribution function==&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
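shapiro.test(x)       # Shapiro-Wilk test for normality; a small p-value is evidence AGAINST normality&lt;br /&gt;
ks.test(x, &amp;quot;pnorm&amp;quot;)  # Kolmogorov-Smirnov test of whether x came from the named distribution function (here the standard normal); a small p-value again means a poor match&lt;br /&gt;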
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Scott</name></author>
	</entry>
</feed>