R I/O: Difference between revisions

From Wiki
Jump to navigation Jump to search
 
 
Line 30: Line 30:


Alternate commands:
Alternate commands:
'''<code>read.csv</code>''' assumes a header row, comma-separated, quoted
* '''<code>read.csv</code>''' assumes a header row, comma-separated, quoted
'''<code>read.delim</code>''' assumes a header row, tab-delimited, quoted
* '''<code>read.delim</code>''' assumes a header row, tab-delimited, quoted
'''<code>read.fwf</code>''' works for fixed-width files
* '''<code>read.fwf</code>''' works for fixed-width files
'''<code>readLines</code>''' gives you one string per line
* '''<code>readLines</code>''' gives you one string per line
'''<code>scan</code>''' lets you read data directly into a pre-defined data structure
* '''<code>scan</code>''' lets you read data directly into a pre-defined data structure
 
To read a file containing a single column of numbers into an array:
<pre>
cadence = scan('cadence.txt')
hist(cadence)
</pre>


== Write data to a file ==
== Write data to a file ==

Latest revision as of 23:40, 16 September 2011

Save and load native R-format files

save(top.5.salaries,file="~/top.5.salaries.RData")
load("~/top.5.salaries")

Read data from a file

Sample data file:

name.last,name.first,team position,salary
"Manning","Peyton","Colts","QB",18700000
"Brady","Tom","Patriots","QB",14626720
"Pepper","Julius","Panthers","DE",14137500
"Palmer","Carson","Bengals","QB",13980000
"Manning","Eli","Giants","QB",12916666

Read:

top.5.salaries = read.table("top.5.salaries.csv", header=TRUE, sep=",", quote="\"")

Options:

  • header: default is no header row
  • sep: whitespace is default separator
  • quote: none is default
  • row.names: A character vector containing row names for the returned data frame
  • nrows: number of rows to read (default is all)
  • strip.white: default is to not strip white space
  • blank.lines.skip: default is to skip blank lines
  • comment.char: ignores lines that begin with # by default

Alternate commands:

  • read.csv assumes a header row, comma-separated, quoted
  • read.delim assumes a header row, tab-delimited, quoted
  • read.fwf works for fixed-width files
  • readLines gives you one string per line
  • scan lets you read data directly into a pre-defined data structure

To read a file containing a single column of numbers into an array:

cadence = scan('cadence.txt')
hist(cadence)

Write data to a file

write.table(top.5.salaries, "top.5.salaries.txt")

Also write.csv and write.delim.

Read data from a web page

sp500 = read.csv(paste("http://ichart.finance.yahoo.com/table.csv?",
    "s=%5EGSPC&a=03&b=1&c=1999&d=03&e=1&f=2009&g=m&ignore=.csv"))

Connect to a database

Use the RMySQL package to connect to MySQL: http://rss.acs.unt.edu/Rdoc/library/RMySQL/html/RMySQL-package.html

Install:

install.packages("RMySQL")

Connect:

library(RMySQL)
connection = dbConnect(MySQL(), host="localhost", dbname="sp5", user="barney", password="XXXXX")

Import and export data.frames:

d = dbReadTable(con, "WL")
dbWriteTable(con, "WL2", a.data.frame)         ## table from a data.frame
dbWriteTable(con, "test2", "~/data/test2.csv") ## table from a file

Run an arbitrary SQL statement and extract all its output (returns a data.frame):

dbGetQuery(con, "select count(*) from a\_table")
dbGetQuery(con, "select * from a\_table")

Disconnect

dbDisconnect(connection)