Discussion:
[R] get latest dates for different people in a dataset
Tan, Richard
2015-01-23 23:43:51 UTC
Hi,

Can someone help with an R question?

I have a data set like:

Name CheckInDate Temp
John 1/3/2014 97
Mary 1/3/2014 98.1
Sam 1/4/2014 97.5
John 1/4/2014 99

I'd like to return a dataset that, for each Name, contains the row with the latest CheckInDate for that person. For the example above it would be

Name CheckInDate Temp
John 1/4/2014 99
Mary 1/3/2014 98.1
Sam 1/4/2014 97.5


Thank you for your help!

Richard


[[alternative HTML version deleted]]

______________________________________________
R-***@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
William Dunlap
2015-01-24 00:14:08 UTC
Here is one way. Sort the data.frame, first by Name, then break ties with
CheckInDate.
Then choose the rows that are the last in a run of identical Name values.
txt <- "Name CheckInDate Temp
John 1/3/2014 97
Mary 1/3/2014 98.1
Sam 1/4/2014 97.5
John 1/4/2014 99"
d <- read.table(header=TRUE,
    colClasses=c("character","character","numeric"), text=txt)
# the dates are read as day/month/year here
d$CheckInDate <- as.Date(d$CheckInDate, format="%d/%m/%Y")
isEndOfRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
dSorted <- d[order(d$Name, d$CheckInDate), ]
dLatestVisit <- dSorted[isEndOfRun(dSorted$Name), ]
dLatestVisit
  Name CheckInDate Temp
4 John  2014-04-01 99.0
2 Mary  2014-03-01 98.1
3  Sam  2014-04-01 97.5
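
To see what the isEndOfRun helper returns, here is a small sketch on a made-up, already sorted vector of names (the vector nm is only an illustration, not part of the original data). It flags the last element of each run of equal values, which for sorted input should agree with !duplicated(x, fromLast=TRUE).

```r
isEndOfRun <- function(x) c(x[-1] != x[-length(x)], TRUE)

# hypothetical, already sorted names (illustration only)
nm <- c("John", "John", "Mary", "Sam", "Sam", "Sam")

# each element is compared with its successor; the final element always ends a run
endFlags <- isEndOfRun(nm)
print(endFlags)

# for sorted input this is the same as asking "is there no later duplicate?"
sameAsDuplicated <- identical(endFlags, !duplicated(nm, fromLast = TRUE))
print(sameAsDuplicated)
```

Note that the helper assumes at least one element; for a zero-length vector it still returns a single TRUE.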


Bill Dunlap
TIBCO Software
wdunlap tibco.com
Göran Broström
2015-01-25 09:01:11 UTC
Post by William Dunlap
Here is one way. Sort the data.frame, first by Name then break ties with
CheckInDate.
Then choose the rows that are the last in a run of identical Name values.
I do it by sorting by the reverse order of CheckInDate (last date first)
within Name, then
dLatestVisit <- dSorted[!duplicated(dSorted$Name), ]
I guess it is faster, but who knows?
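
A minimal sketch of this reverse-sort idea on the example data from the question. Negating the numeric value of the date is only one possible way to get "last date first" and is an assumption here, as is reading the dates as day/month/year.

```r
# example data from the question; dates assumed to be day/month/year
d <- data.frame(
  Name = c("John", "Mary", "Sam", "John"),
  CheckInDate = as.Date(c("1/3/2014", "1/3/2014", "1/4/2014", "1/4/2014"),
                        format = "%d/%m/%Y"),
  Temp = c(97, 98.1, 97.5, 99),
  stringsAsFactors = FALSE
)

# sort by Name ascending and, within Name, by date descending
dSorted <- d[order(d$Name, -as.numeric(d$CheckInDate)), ]

# the first row seen for each Name is now that person's latest visit
dLatestVisit <- dSorted[!duplicated(dSorted$Name), ]
print(dLatestVisit)
```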

Göran
William Dunlap
2015-01-25 19:27:05 UTC
Post by Göran Broström
dLatestVisit <- dSorted[!duplicated(dSorted$Name), ]
I guess it is faster, but who knows?
You can find out by making a function that generates datasets of
various sizes and timing the suggested algorithms. E.g.,
makeData <-
function(nPatients, aveVisitsPerPatient, uniqueNameDate = TRUE){
  nrow <- trunc(nPatients * aveVisitsPerPatient)
  patientNames <- paste0("P",seq_len(nPatients))
  possibleDates <- as.Date(16001:17000, origin=as.Date("1970-01-01"))
  possibleTemps <- seq(97, 103, by=0.1)
  data <- data.frame(Name=sample(patientNames, replace=TRUE, size=nrow),
                     CheckInDate=sample(possibleDates, replace=TRUE, size=nrow),
                     Temp=sample(possibleTemps, replace=TRUE, size=nrow))
  if (uniqueNameDate) {
    data <- data[!duplicated(data[, c("Name", "CheckInDate")]), ]
  }
  data
}
funs <- list(
  f1 = function(data) {
    do.call(rbind, lapply(split(data, data$Name), function(x)
      x[order(x$CheckInDate),][nrow(x),]))
  },
  f2 = function(d) {
    isEndOfRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
    dSorted <- d[order(d$Name, d$CheckInDate), ]
    dSorted[isEndOfRun(dSorted$Name), ]
  },
  f3 = function(d) {
    # is the following how you did reverse sort on date (& fwd on name)?
    # Too bad that order's decreasing arg is not vectorized
    dSorted <- d[order(d$Name, -as.numeric(d$CheckInDate)), ]
    dSorted[!duplicated(dSorted$Name), ]
  },
  f4 = function(dta) {
    dta %>% group_by(Name) %>% filter(CheckInDate==max(CheckInDate))
  })

D <- makeData(nPatients=35000, aveVisitsPerPatient=3.7) # c. 129000 visits
library(dplyr)
Z <- lapply(funs, function(fun){
  time <- system.time( result <- fun(D) )
  list(time=time, result=result) })

sapply(Z, function(x)x$time)
#               f1   f2   f3   f4
#user.self  461.25 0.47 0.36 3.01
#sys.self     1.20 0.00 0.00 0.01
#elapsed    472.33 0.47 0.39 3.03
#user.child     NA   NA   NA   NA
#sys.child      NA   NA   NA   NA

# duplicated is a bit better than diff, dplyr rather slower, rbind much
# slower.

equivResults <- function(a, b) {
  # results have different classes and different orders, so only check
  # size and contents
  identical(dim(a),dim(b)) && all(a[order(a$Name),]==b[order(b$Name),])
}
sapply(Z[-1], function(x)equivResults(x$result, Z[[1]]$result))
#  f2   f3   f4
#TRUE TRUE TRUE

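Note that the various functions give different results if any patient comes
in twice on the same day. f4 includes both visits in the output; the others
include either the first or the last (as ordered in the original file).
With the default uniqueNameDate=TRUE, makeData drops such repeated
Name/CheckInDate pairs, which is why the results above agree.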
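
A small sketch of the same-day case described in this note, on hypothetical data in which John checks in twice on one day. To keep it free of package dependencies, the group-wise "equal to the max" filter is emulated with base R's ave(); that substitution is an assumption for illustration and is not the dplyr code timed above.

```r
# hypothetical data: John has two check-ins on the same day
d <- data.frame(
  Name = c("John", "John", "Mary"),
  CheckInDate = as.Date(c("2014-01-04", "2014-01-04", "2014-01-03")),
  Temp = c(99, 100.2, 98.1),
  stringsAsFactors = FALSE
)
dateNum <- as.numeric(d$CheckInDate)

# base-R stand-in for the group-wise filter: keeps every row tied for the max
allTied <- d[dateNum == ave(dateNum, d$Name, FUN = max), ]

# sort ascending and keep the last row of each run of names
isEndOfRun <- function(x) c(x[-1] != x[-length(x)], TRUE)
dUp <- d[order(d$Name, dateNum), ]
lastTied <- dUp[isEndOfRun(dUp$Name), ]

# sort descending on date and keep the first row per name
dDown <- d[order(d$Name, -dateNum), ]
firstTied <- dDown[!duplicated(dDown$Name), ]

print(allTied)
print(lastTied)
print(firstTied)
```

If exactly one row per person is needed, adding an explicit tie-breaker column to the order() call makes the choice deliberate rather than dependent on file order.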

Bill Dunlap
TIBCO Software
wdunlap tibco.com
Göran Broström
2015-01-25 20:38:45 UTC
See inline;
Post by William Dunlap
# is the following how you did reverse sort on date (& fwd on name)?
Yes; in fact I do this all the time in my applications (survival
analysis), where I have several records for each individual.
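
For readers on newer versions of R: as far as I know, order() now accepts a vector for decreasing when method="radix" is used (see ?order), so the reverse sort on date can be written without negating the date. This is a sketch under that assumption, using the example data from the question with dates read as day/month/year; note that the radix method sorts character keys in C-locale order.

```r
# example data from the question; dates assumed to be day/month/year
d <- data.frame(
  Name = c("John", "Mary", "Sam", "John"),
  CheckInDate = as.Date(c("1/3/2014", "1/3/2014", "1/4/2014", "1/4/2014"),
                        format = "%d/%m/%Y"),
  Temp = c(97, 98.1, 97.5, 99),
  stringsAsFactors = FALSE
)

# Name ascending, CheckInDate descending, one flag per sort key
ord <- order(d$Name, d$CheckInDate,
             decreasing = c(FALSE, TRUE), method = "radix")
dSorted <- d[ord, ]

# first row per Name is the latest visit
dLatest <- dSorted[!duplicated(dSorted$Name), ]
print(dLatest)
```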

Göran
Tan, Richard
2015-01-26 17:29:27 UTC
Thank you!


Chel Hee Lee
2015-01-24 01:09:15 UTC
# data: the example data.frame, with CheckInDate already converted to Date
do.call(rbind, lapply(split(data, data$Name), function(x)
    x[order(x$CheckInDate),][nrow(x),]))
     Name CheckInDate Temp
John John  2014-04-01 99.0
Mary Mary  2014-03-01 98.1
Sam   Sam  2014-04-01 97.5
Is this what you are looking for? I hope this helps.
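
The split/rbind code above builds one small data.frame per name, which can get slow with many names. A possible variant, sketched here as an assumption rather than as part of the original reply, splits only the row indices and subsets the data.frame once. The name d and the day/month/year date format are illustrative choices.

```r
# example data from the question; dates assumed to be day/month/year
d <- data.frame(
  Name = c("John", "Mary", "Sam", "John"),
  CheckInDate = as.Date(c("1/3/2014", "1/3/2014", "1/4/2014", "1/4/2014"),
                        format = "%d/%m/%Y"),
  Temp = c(97, 98.1, 97.5, 99),
  stringsAsFactors = FALSE
)
dateNum <- as.numeric(d$CheckInDate)

# for each Name, find the row index of its latest date
# (which.max returns the first maximum, so ties keep the earliest row in file order)
idx <- vapply(split(seq_len(nrow(d)), d$Name),
              function(i) i[which.max(dateNum[i])],
              integer(1))

# subset the original data.frame once
latest <- d[idx, ]
print(latest)
```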

Chel Hee Lee
David Barron
2015-01-24 12:56:00 UTC
Hi Richard,

You could also do it using the package dplyr:

dta <- data.frame(Name=c('John','Mary','Sam','John'),
                  CheckInDate=as.Date(c('1/3/2014','1/3/2014','1/4/2014','1/4/2014'),
                                      format='%d/%m/%Y'),
                  Temp=c(97,98.1,97.5,99))


library(dplyr)
dta %>% group_by(Name) %>% filter(CheckInDate==max(CheckInDate))

Source: local data frame [3 x 3]
Groups: Name

  Name CheckInDate Temp
1 Mary  2014-03-01 98.1
2  Sam  2014-04-01 97.5
3 John  2014-04-01 99.0
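
One thing worth checking before using any of these answers: the code in this thread reads the dates as day/month/year ('%d/%m/%Y'), while strings such as 1/3/2014 could equally be month/day/year. The sketch below only illustrates how the two readings differ; which one is right depends on the source of the data, which the thread does not state.

```r
x <- c("1/3/2014", "1/4/2014")

# day first, as used in the replies: 1 March and 1 April 2014
dayFirst <- as.Date(x, format = "%d/%m/%Y")

# month first: 3 January and 4 January 2014
monthFirst <- as.Date(x, format = "%m/%d/%Y")

print(dayFirst)
print(monthFirst)

# a value that cannot be a valid date under the chosen format becomes NA,
# so checking for NA after parsing can catch a wrong format
badParse <- as.Date("13/1/2014", format = "%m/%d/%Y")
print(is.na(badParse))
```

For the four example rows either reading picks the same latest row per person, but that need not hold for a larger file.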
Tan, Richard
2015-01-26 17:29:15 UTC
Thank you!

Tan, Richard
2015-01-26 17:29:08 UTC
Thank you!
