Extract.data.frame {base} | R Documentation |
Extract or replace subsets of data frames.
## S3 method for class 'data.frame'
x[i, j, drop = ]

## S3 replacement method for class 'data.frame'
x[i, j] <- value

## S3 method for class 'data.frame'
x[[..., exact = TRUE]]

## S3 replacement method for class 'data.frame'
x[[i, j]] <- value

## S3 replacement method for class 'data.frame'
x$name <- value
x | data frame. |
i, j, ... | elements to extract or replace. For |
name | A literal character string or a name (possibly backtick quoted). |
drop | logical. If |
value | A suitable replacement value: it will be repeated a whole number of times if necessary and it may be coerced: see the Coercion section. If |
exact | logical: see |
Data frames can be indexed in several modes. When [ and [[ are used with a single index (x[i] or x[[i]]), they index the data frame as if it were a list. In this usage a drop argument is ignored, with a warning.
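A brief sketch of list-style indexing may help here. The data frame df and its columns are made up for illustration and do not come from this page.

```r
# Hypothetical data frame, used only for illustration.
df <- data.frame(id = 1:3, score = c(2.5, 4.0, 3.5))

one_col <- df["score"]     # single index with [: a one-column data frame
values  <- df[["score"]]   # single index with [[: the column itself

is.data.frame(one_col)
is.numeric(values)

# With a single index, 'drop' is ignored and a warning is signalled.
warned <- tryCatch({ df["score", drop = TRUE]; FALSE },
                   warning = function(w) TRUE)
warned
```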
Note that there is no data.frame method for $, so x$name uses the default method which treats x as a list. There is a replacement method which checks value for the correct number of rows, and replicates it if necessary.
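As a minimal sketch of the row-count check in the $ replacement method (df, flag, half and bad are illustrative names, not part of this page):

```r
# Hypothetical four-row data frame.
df <- data.frame(id = 1:4)

df$flag <- TRUE          # length 1: replicated to all four rows
df$half <- c("a", "b")   # length 2 divides 4: replicated twice

# A length that does not divide the number of rows is rejected.
bad <- tryCatch({ df$bad <- 1:3; "assigned" },
                error = function(e) "error")
bad
```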
When [ and [[ are used with two indices (x[i, j] and x[[i, j]]) they act like indexing a matrix: [[ can only be used to select one element. Note that for each selected column, xj say, typically (if it is not matrix-like), the resulting column will be xj[i], and hence relies on the corresponding [ method; see the examples section.
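The matrix-like mode can be sketched as follows, assuming a small made-up data frame (df, score and grp are illustrative only):

```r
# Hypothetical data frame with a numeric and a factor column.
df <- data.frame(id = 1:3, score = c(2.5, 4.0, 3.5))
df$grp <- factor(c("a", "b", "a"))

sub_vec <- df[2:3, "score"]            # rows 2 and 3 of one column: a vector
sub_df  <- df[2:3, c("id", "score")]   # a two-row, two-column data frame
one_el  <- df[[2, "score"]]            # exactly one element

# Each selected column xj is subset as xj[i], so the column's own "["
# method applies; subsetting a factor keeps its levels.
lev <- levels(df[1, "grp"])
lev
```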
If [ returns a data frame it will have unique (and non-missing) row names, if necessary transforming the row names using make.unique. Similarly, if columns are selected, column names will be transformed to be unique if necessary (e.g. if columns are selected more than once, or if more than one column of a given name is selected if the data frame has duplicate column names).
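A short sketch of the renaming, using a made-up data frame with character row names (df, r1 to r3, a and b are illustrative):

```r
# Hypothetical data frame with explicit row names.
df <- data.frame(a = 1:3, b = 4:6, row.names = c("r1", "r2", "r3"))

dup_rows <- df[c(1, 1, 2), ]   # row "r1" selected twice
row.names(dup_rows)            # "r1" "r1.1" "r2"

dup_cols <- df[c("a", "a")]    # column "a" selected twice
names(dup_cols)                # "a" "a.1"
```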
When drop = TRUE, this is applied to the subsetting of any matrices contained in the data frame as well as to the data frame itself.
The replacement methods can be used to add whole column(s) by specifying non-existent column(s), in which case the column(s) are added at the right-hand edge of the data frame and numerical indices must be contiguous to existing indices. On the other hand, rows can be added at any row after the current last row, and the columns will be in-filled with missing values. Missing values in the indices are not allowed for replacement.
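The rules for adding columns and rows might look like this in practice. The data frame and the helper names gap and na_idx are assumptions made for the sketch.

```r
# Hypothetical two-row data frame.
df <- data.frame(a = 1:2, b = c("x", "y"))

df["c"] <- c(TRUE, FALSE)   # new column, appended on the right
df[4, "a"] <- 10L           # new row 4; unassigned cells are filled with NA

# Numeric column indices must not leave a gap after the existing columns
# (df has three columns here, so index 6 would leave holes).
gap <- tryCatch({ df[6] <- 0; "assigned" },
                error = function(e) "error")

# Missing values in the indices are rejected for replacement.
na_idx <- tryCatch({ df[c(1, NA), "a"] <- 0L; "assigned" },
                   error = function(e) "error")
```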
For [ the replacement value can be a list: each element of the list is used to replace (part of) one column, recycling the list as necessary. If columns specified by number are created, the names (if any) of the corresponding list elements are used to name the columns. If the replacement is not selecting rows, list values can contain NULL elements which will cause the corresponding columns to be deleted. (See the Examples.)
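A minimal sketch of list-valued replacement, with made-up column names (a, b, c, p and q are illustrative):

```r
# Hypothetical three-column data frame.
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# One list element per selected column; a NULL element deletes its column.
df[c("b", "c")] <- list(NULL, c(70, 80, 90))
names(df)   # "a" "c"

# Columns created by number take their names from the list elements.
df[3:4] <- list(p = letters[1:3], q = c(TRUE, FALSE, TRUE))
names(df)   # "a" "c" "p" "q"
```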
Matrix indexing (x[i] with a logical or a 2-column integer matrix i) using [ is not recommended, and barely supported. For extraction, x is first coerced to a matrix. For replacement, a logical matrix (only) can be used to select the elements to be replaced in the same way as for a matrix.
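For completeness, a small sketch of logical matrix indexing on an all-numeric, made-up data frame (df and big are illustrative names):

```r
# Hypothetical all-numeric data frame.
df <- data.frame(a = c(-1, 2), b = c(3, -4))

# Replacement: df < 0 is a logical matrix with the same dimensions as df.
df[df < 0] <- 0
df

# Extraction coerces df to a matrix first, so the result is a plain vector.
big <- df[df > 2]
big
```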
Both [ and [[ extraction methods partially match row names. By default neither partially matches column names, but [[ will if exact = FALSE (and with a warning if exact = NA). If you want to do exact matching on row names, use match as in the examples.
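The matching rules can be sketched with a made-up data frame whose row and column names have unique prefixes (alpha, beta, value and label are illustrative):

```r
# Hypothetical data frame with character row names.
df <- data.frame(value = c(10, 20), label = c("x", "y"),
                 row.names = c("alpha", "beta"))

partial_row <- df["al", ]                        # "al" partially matches "alpha"
exact_row   <- df[match("al", row.names(df)), ]  # no exact match: a row of NAs

no_col  <- df[["val"]]                  # NULL: column names are matched exactly
par_col <- df[["val", exact = FALSE]]   # partial match on "value"
```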
For [, a data frame, list or a single column (the latter two only when dimensions have been dropped). If matrix indexing is used for extraction a matrix results. If the result would be a data frame an error results if undefined columns are selected (as there is no general concept of a 'missing' column in a data frame). Otherwise if a single column is selected and this is undefined the result is NULL.
For [[, a column of the data frame or NULL (extraction with one index) or a length-one vector (extraction with two indices).
For $, a column of the data frame (or NULL).
For [<-, [[<- and $<-, a data frame.
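A few of these return values, sketched on a made-up data frame (df, df2 and the undefined column name zz are illustrative):

```r
# Hypothetical data frame.
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

col_a   <- df[["a"]]       # a column
missing <- df[["zz"]]      # NULL: no such column
one_el  <- df[[2, "b"]]    # two indices: a length-one vector
dollar  <- df$zz           # NULL

# Selecting an undefined column where a data frame would result is an error.
res <- tryCatch(df[c("a", "zz")], error = function(e) "error")

# The replacement forms return a data frame.
df2 <- df
df2[1, "a"] <- 9L
is.data.frame(df2)
```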
The story over when replacement values are coerced is a complicated one, and one that has changed during R's development. This section is a guide only.
When [ and [[ are used to add or replace a whole column, no coercion takes place but value will be replicated (by calling the generic function rep) to the right length if an exact number of repeats can be used.
When [ is used with a logical matrix, each value is coerced to the type of the column into which it is to be placed.
When [ and [[ are used with two indices, the column will be coerced as necessary to accommodate the value.
Note that when the replacement value is an array (including a matrix) it is not treated as a series of columns (as data.frame and as.data.frame do) but inserted as a single column.
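Since this section is a guide only, the following is just a sketch of three of the cases above on a made-up data frame; it is not a complete account of the coercion rules.

```r
# Hypothetical data frame with one integer column.
df <- data.frame(a = 1:3)

# Two indices: the column is coerced to accommodate the value.
df[2, "a"] <- 2.5
class(df$a)        # the integer column is now "numeric"

# Whole-column replacement: the value is used as is, replicated to fit.
df["a"] <- "k"
df$a

# A matrix value is inserted as a single (matrix) column.
df[["m"]] <- matrix(1:6, nrow = 3)
ncol(df)           # 2
dim(df$m)          # 3 2
```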
The default behaviour when only one row is left is equivalent to specifying drop = FALSE. To drop from a data frame to a list, drop = TRUE has to be specified explicitly.
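The asymmetry between rows and columns can be sketched as follows (df and the result names are illustrative):

```r
# Hypothetical two-column data frame.
df <- data.frame(a = 1:3, b = 4:6)

one_row  <- df[1, ]                  # one row left: still a data frame
as_list  <- df[1, , drop = TRUE]     # explicit drop: a list
one_col  <- df[, "a"]                # one column left: dropped to a vector
kept_col <- df[, "a", drop = FALSE]  # kept as a one-column data frame
```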
Arguments other than drop and exact should not be named: there is a warning if they are and the behaviour differs from the description here.
subset which is often easier for extraction, data.frame, Extract.
sw <- swiss[1:5, 1:4]  # select a manageable subset

sw[1:3]      # select columns
sw[, 1:3]    # same
sw[4:5, 1:3] # select rows and columns

sw[1]                 # a one-column data frame
sw[, 1, drop = FALSE] # the same
sw[, 1]               # an (unnamed) vector
sw[[1]]               # the same

sw[1,]              # a one-row data frame
sw[1,, drop=TRUE]   # a list

sw["C", ]                       # partially matches
sw[match("C", row.names(sw)), ] # no exact match
try(sw[, "Ferti"])              # column names must match exactly

swiss[ c(1, 1:2), ] # duplicate row, unique row names are created

sw[sw <= 6] <- 6    # logical matrix indexing
sw

## adding a column
sw["new1"] <- LETTERS[1:5]   # adds a character column
sw[["new2"]] <- letters[1:5] # ditto
sw[, "new3"] <- LETTERS[1:5] # ditto
sw$new4 <- 1:5
sapply(sw, class)
sw$new4 <- NULL              # delete the column
sw
sw[6:8] <- list(letters[10:14], NULL, aa=1:5) # update col. 6, delete 7, append
sw

## matrices in a data frame
A <- data.frame(x=1:3, y=I(matrix(4:6)), z=I(matrix(letters[1:9],3,3)))
A[1:3, "y"] # a matrix
A[1:3, "z"] # a matrix
A[, "y"]    # a matrix

## keeping special attributes: use a class with a
## "as.data.frame" and "[" method:
as.data.frame.avector <- as.data.frame.vector

`[.avector` <- function(x,i,...) {
  r <- NextMethod("[")
  mostattributes(r) <- attributes(x)
  r
}

d <- data.frame(i= 0:7, f= gl(2,4),
                u= structure(11:18, unit = "kg", class="avector"))
str(d[2:4, -1]) # 'u' keeps its "unit"