# 20170509 rand db_lesugent

Post on 23-Jan-2018


Prof. dr. ir. Leander Van Neste, VP Scientific & Clinical Affairs, miDiagnostics; Visiting Professor, UGent

R Reference Card
by Tom Short, EPRI PEAC, tshort@epri-peac.com, 2004-11-07
Granted to the public domain. See www.Rpad.org for the source and latest version. Includes material from R for Beginners by Emmanuel Paradis (with permission).

## Getting help

Most R functions have online documentation.
help(topic)  documentation on topic
?topic  id.
help.search("topic")  search the help system
apropos("topic")  the names of all objects in the search list matching the regular expression topic
help.start()  start the HTML version of help
str(a)  display the internal *str*ucture of an R object
summary(a)  gives a summary of a, usually a statistical summary, but it is generic, meaning it has different operations for different classes of a
ls()  show objects in the search path; specify pat="pat" to search on a pattern
ls.str()  str() for each variable in the search path
dir()  show files in the current directory
methods(a)  shows S3 methods of a
methods(class=class(a))  lists all the methods to handle objects of class a
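
As a rough illustration of the inspection and lookup functions above, the following sketch applies them to a small object. The vector `x` and the search pattern are made up for the example.

```r
# Hypothetical named numeric vector, used only for illustration
x <- c(a = 1.5, b = 2, c = 3.25)

str(x)        # compact display of the internal structure
summary(x)    # statistical summary; what is shown depends on the class of x

# apropos() takes a regular expression and returns the matching object names
readers <- apropos("^read\\.")
print(readers)

# methods() lists the S3 methods available for a generic such as summary
summary_methods <- methods(summary)
print(summary_methods)
```

help(topic) and ?topic open the documentation viewer, so they are best tried interactively.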

## Input and output

load()  load the datasets written with save
data(x)  loads specified data sets
library(x)  load add-on packages
read.table(file)  reads a file in table format and creates a data frame from it; the default separator sep="" is any whitespace; use header=TRUE to read the first line as a header of column names; use as.is=TRUE to prevent character vectors from being converted to factors; use comment.char="" to prevent "#" from being interpreted as a comment; use skip=n to skip n lines before reading data; see the help for options on row naming, NA treatment, and others
read.csv("filename",header=TRUE)  id. but with defaults set for reading comma-delimited files
read.delim("filename",header=TRUE)  id. but with defaults set for reading tab-delimited files
read.fwf(file,widths,header=FALSE,sep="\t",as.is=FALSE)  read a table of fixed width formatted data into a data.frame; widths is an integer vector, giving the widths of the fixed-width fields
save(...,file=)  saves the specified objects (...) in the XDR platform-independent binary format
save.image(file)  saves all objects
cat(..., file="", sep=" ")  prints the arguments after coercing to character; sep is the character separator between arguments
print(a, ...)  prints its arguments; generic, meaning it can have different methods for different objects
format(x,...)  format an R object for pretty printing
write.table(x,file="",row.names=TRUE,col.names=TRUE,sep=" ")  prints x after converting to a data frame; if quote is TRUE, character or factor columns are surrounded by quotes ("); sep is the field separator; eol is the end-of-line separator; na is the string for missing values; use col.names=NA to add a blank column header to get the column headers aligned correctly for spreadsheet input

sink(file)  output to file, until sink()

Most of the I/O functions have a file argument. This can often be a character string naming a file or a connection. file="" means the standard input or output. Connections can include files, pipes, zipped files, and R variables.
On Windows, the file connection can also be used with description = "clipboard". To read a table copied from Excel, use x ...

x[x > 3]  all elements greater than 3
x[x > 3 & x < 5]  all elements between 3 and 5
x[x %in% c("a","and","the")]  elements in the given set
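
The file-based functions and the note on connections in the input and output section can be sketched as a small round trip. The data frame, the temporary file, and the inline text are all made up for the example; this is one possible usage, not the only one.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(id = 1:3, label = c("a", "b", "c"))

# Write to a temporary tab-delimited file, then read it back
path <- tempfile(fileext = ".txt")
write.table(df, file = path, sep = "\t", row.names = FALSE, quote = FALSE)
back <- read.delim(path, header = TRUE, as.is = TRUE)
print(back)

# A connection can stand in for a file: here, a text connection over a string
con <- textConnection("x y\n1 10\n2 20")
from_text <- read.table(con, header = TRUE)   # sep="" splits on any whitespace
close(con)
print(from_text)

unlink(path)   # remove the temporary file
```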

## Indexing lists

x[n]  list with elements n
x[[n]]  nth element of the list
x[["name"]]  element of the list named "name"
x$name  id.

## Indexing matrices

x[i,j]  element at row i, column j
x[i,]  row i
x[,j]  column j
x[,c(1,3)]  columns 1 and 3
x["name",]  row named "name"

## Indexing data frames (matrix indexing plus the following)

x[["name"]]  column named "name"
x$name  id.
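
A short sketch of the indexing forms listed above, applied to a vector, a list, a matrix, and a data frame. All object names and values are invented for the example.

```r
v <- c(2, 4, 6, 8)
v[v > 3]                 # elements greater than 3
v[v > 3 & v < 7]         # elements between 3 and 7

lst <- list(name = "ann", scores = c(90, 85))
lst[2]                   # still a list, holding the second element
lst[["scores"]]          # the element itself, a numeric vector
lst$name                 # same idea, by name

# matrix() fills column by column, so the columns are (1,2), (3,4), (5,6)
m <- matrix(1:6, nrow = 2, dimnames = list(c("r1", "r2"), c("a", "b", "c")))
m[2, 3]                  # element at row 2, column 3
m["r1", ]                # row named "r1"
m[, c(1, 3)]             # columns 1 and 3

df <- data.frame(g = c("x", "y"), val = c(1.5, 2.5))
df[["val"]]              # column named "val"
df$g                     # same form as for lists
```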

## Variable conversion

as.array(x), as.data.frame(x), as.numeric(x), as.logical(x), as.complex(x), as.character(x), ...  convert type; for a complete list, use methods(as)

## Variable information

is.na(x), is.null(x), is.array(x), is.data.frame(x), is.numeric(x), is.complex(x), is.character(x), ...  test for type; for a complete list, use methods(is)
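
The as.* and is.* families above can be tried on a few made-up values. The sketch below assumes nothing beyond base R; the input strings are hypothetical.

```r
s <- c("1.5", "2", "abc")

# "abc" cannot be parsed as a number, so it becomes NA (with a warning,
# suppressed here to keep the output clean)
n <- suppressWarnings(as.numeric(s))

is.character(s)            # TRUE
is.numeric(n)              # TRUE
is.na(n)                   # FALSE FALSE TRUE

as.character(42L)          # "42"
as.logical(c(0, 1, 2))     # zero is FALSE, any other number is TRUE

d <- as.data.frame(matrix(1:4, nrow = 2))
is.data.frame(d)           # TRUE
```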

length(x)  number of elements in x
dim(x)  retrieve or set the dimension of an object; dim(x) ...

unique(x)  if x is a vector or a data frame, returns a similar object but with the duplicate elements suppressed
table(x)  returns a table with the counts of the different values of x (typically for integers or factors)
subset(x, ...)  returns a selection of x with respect to criteria (..., typically comparisons: x$V1 < 10); if x is a data frame, the option select gives the variables to be kept or dropped using a minus sign
sample(x, size)  resample randomly and without replacement size elements in the vector x; the option replace = TRUE allows resampling with replacement
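
The selection functions above might be used as in the following sketch. The vector, the data frame, and the column names V1 and V2 are invented for the example, and the seed is fixed only so that repeated runs draw the same sample.

```r
x <- c(3, 1, 3, 2, 1, 3)

unique(x)              # duplicates suppressed, first occurrences kept in order
counts <- table(x)     # number of occurrences of each distinct value
print(counts)

# Hypothetical data frame with columns V1 and V2
df <- data.frame(V1 = c(5, 12, 8), V2 = c("a", "b", "c"))
small <- subset(df, V1 < 10, select = -V2)   # rows with V1 < 10, V2 dropped
print(small)

set.seed(1)                                  # repeatable draws
draw <- sample(x, 3)                         # without replacement
boot <- sample(x, 10, replace = TRUE)        # with replacement, may exceed length(x)
```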

prop.table(x,margin=)  table entries as fraction of marginal table

## Math

sin, cos, tan, asin, acos, atan, atan2, log, log10, exp
max(x)  maximum of the elements of x
min(x)  minimum of the elements of x
range(x)  minimum and maximum, as c(min(x), max(x))
sum(x)  sum of the elements of x
diff(x)  lagged and iterated differences of vector x
prod(x)  product of the elements of x
mean(x)  mean of the elements of x
median(x)  median of the elements of x
quantile(x,probs=)  sample quantiles corresponding to the given probabilities (defaults to 0,.25,.5,.75,1)
weighted.mean(x, w)  mean of x with weights w
rank(x)  ranks of the elements of x
var(x)  variance of the elements of x (calculated on n - 1); if x is a matrix or a data frame, var(x) or cov(x) gives the variance-covariance matrix
sd(x)  standard deviation of x
cor(x)  correlation matrix of x if it is a matrix or a data frame
var(x, y) or cov(x, y)  covariance between x and y, or between the columns of x and those of y if they are matrices or data frames
cor(x, y)  linear correlation between x and y, or correlation matrix if they are matrices or data frames
round(x, n)  rounds the elements of x to n decimals
log(x, base)  computes the logarithm of x with base base
scale(x)  if x is a matrix, centers and reduces the data; to center only use the option scale=FALSE, to reduce only center=FALSE (by default center=TRUE, scale=TRUE)

pmin(x,y,...)  a vector whose ith element is the minimum of x[i], y[i], ...
pmax(x,y,...)  id. for the maximum
cumsum(x)  a vector whose ith element is the sum from x[1] to x[i]
cumprod(x)  id. for the product
cummin(x)  id. for the minimum
cummax(x)  id. for the maximum
union(x,y), intersect(x,y), setdiff(x,y), setequal(x,y), is.element(el,set)  set functions
Re(x)  real part of a complex number
Im(x)  imaginary part
Mod(x)  modulus; abs(x) is the same
Arg(x)  angle in radians of the complex number
Conj(x)  complex conjugate
convolve(x,y)  compute the several kinds of convolutions of two sequences
fft(x)  Fast Fourier Transform of an array
mvfft(x)  FFT of each column of a matrix
filter(x,filter)  applies linear filtering to a univariate time series or to each series separately of a multivariate time series

Many math functions have a logical parameter na.rm=FALSE to specify missing data (NA) removal.
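
A few of the math functions above, including the na.rm parameter, are sketched below on invented data. The matrix `m` and the vectors are hypothetical.

```r
x <- c(2, 4, NA, 8)
mean(x)                  # NA, because one value is missing
mean(x, na.rm = TRUE)    # mean of the three non-missing values
range(x, na.rm = TRUE)   # smallest and largest non-missing value

y <- c(1, 5, 2, 4)
cumsum(y)                # running total
pmin(y, 3)               # element-wise minimum against 3 (recycled)
var(y)                   # sample variance, divisor n - 1

# Centering only: subtract the column means, leave the spread untouched
m <- cbind(a = c(1, 2, 3), b = c(10, 20, 30))
centered <- scale(m, center = TRUE, scale = FALSE)
colMeans(centered)       # both columns now average to zero
```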

## Matrices

t(x)  transpose
diag(x)  diagonal
%*%  matrix multiplication
solve(a,b)  solves a %*% x = b for x
solve(a)  matrix inverse of a
rowsum(x, group)  sums the rows of a matrix-like object within each level of group; rowSums(x) gives plain row sums and is fast
colSums(x)  fast column sums
rowMeans(x)  fast version of row means
colMeans(x)  id. for columns

## Advanced data processing

apply(X,INDEX,FUN=)  a vector or array or list of values obtained by applying a function FUN to margins (INDEX) of X
lapply(X,FUN)  apply FUN to each element of the list X
tapply(X,INDEX,FUN=)  apply FUN to each cell of a ragged array given by X with indexes INDEX
by(data,INDEX,FUN)  apply FUN to data frame data subsetted by INDEX
merge(a,b)  merge two data frames by common columns or row names
xtabs(a ~ b,data=x)  a contingency table from cross-classifying factors
aggregate(x,by,FUN)  splits the data frame x into subsets, computes summary statistics for each, and returns the result in a convenient form; by is a list of grouping elements, each as long as the variables in x

stack(x, ...)  transform data available as separate columns in a data frame or list into a single column
unstack(x, ...)  inverse of stack()
reshape(x, ...)  reshapes a data frame between wide format with repeated measurements in separate columns of the same record and long format with the repeated measurements in separate records; use direction="wide" or direction="long"
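
The matrix and data-processing entries above can be illustrated with a small sketch. The matrix, the data frames, and the column names grp, val, and label are made up for the example.

```r
# matrix() fills column by column: the columns are (2, 0) and (1, 3)
m <- matrix(c(2, 0, 1, 3), nrow = 2)
b <- c(5, 6)
sol <- solve(m, b)            # solves m %*% sol = b
m %*% sol                     # reproduces b

apply(m, 1, sum)              # apply sum over rows (margin 1), like rowSums(m)

# Hypothetical grouped data
df <- data.frame(grp = c("a", "b", "a", "b"), val = c(1, 2, 3, 4))
means <- tapply(df$val, df$grp, mean)     # mean of val within each group
agg <- aggregate(df["val"], by = list(grp = df$grp), FUN = sum)

# merge() joins on the columns the two data frames share, here grp
info <- data.frame(grp = c("a", "b"), label = c("first", "second"))
merged <- merge(df, info)
print(merged)
```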

## Strings

paste(...)  concatenate vectors after converting to character; sep= is the string to separate terms (a single space is the default); collapse= is an optional string to separate collapsed results

substr(x,start,stop)  substrings in a character vector; can also assign, as substr(x, start, stop) ...

..., seq(), and difftime() are useful. Date also allows + and -. ?DateTimeClasses gives more information. See also package chron.
as.Date(s) and as.POSIXct(s)  convert to the respective class; format(dt) converts to a string representation. The default string format is 2001-02-21. These accept a second argument to specify a format for conversion. Some common formats are:

%a, %A  Abbreviated and full weekday name.
%b, %B  Abbreviated and full month name.
%d  Day of the month (01-31).
%H  Hours (00-23).
%I  Hours (01-12).
%j  Day of year (001-366).
%m  Month (01-12).
%M  Minute (00-59).
%p  AM/PM indicator.
%S  Second as decimal number (00-61).
%U  Week (00-53); the first Sunday as day 1 of week 1.
%w  Weekday (0-6, Sunday is 0).
%W  Week (00-53); the first Monday as day 1 of week 1.
%y  Year without century (00-99). Don't use.
%Y  Year with century.
%z  (output only.) Offset from Greenwich; -0800 is 8 hours west of Greenwich.
%Z  (output only.) Time zone as a character string (empty if not available).
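
As a minimal sketch of how these format codes are used, the example below parses a date written in a non-default layout and formats it back. The input string is invented, and only locale-independent codes are used so the results do not depend on language settings.

```r
# Parse day/month/year input by passing the format as the second argument
d <- as.Date("21/02/2001", format = "%d/%m/%Y")

format(d)                 # default representation, year-month-day
format(d, "%Y")           # year with century
format(d, "%j")           # day of the year, zero-padded to three digits
format(d, "%d.%m.%Y")     # a custom layout

later <- d + 30           # Date supports + and -
gap <- later - d          # a difftime, here measured in days
months <- seq(d, by = "month", length.out = 3)
print(months)
```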

Where leading zeros are shown they will be used on output bu