Discussion:
[R] alternative for multiple if_else statements
Kevin Wamae
2018-02-21 20:33:57 UTC
Hi, I am having trouble figuring out why if_else is behaving the way it is. It may be my code or the way the data is structured.

Below is a snapshot of a database I am working on. It represents a longitudinal survey of study participants in a trial with weekly follow-up.

The variable "survey_start" marks the start of the study-defined one-year follow-up (which we called "survey_year").

I am trying to populate all subsequent entries for each participant, per survey year, with the entry "survey" followed by an underscore and the respective year, e.g. survey_2014.

There are missing entries: the participant represented here, for example, wasn't available at the start of the 2015 survey. Also, some participants don’t have complete one-year follow-ups, but I still need to include them.

I have written two versions of the code. The first fails while the second works, the only differences being that I have reversed the order in which the entries are populated in the second (from 2007-2016 to 2016-2007) and removed the if_else statement for 2015. I also noticed that the second version, which spans the years 2007-2016 (less 2015), fails if a participant's entries start from 2010-2016.

Kindly assist in figuring this out... or better yet, suggest an alternative.

trialData <- structure(list(study = c("site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1"), studyno = c("child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1"), date = structure(c(16078, 16085, 16092,
16098, 16104, 16115, 16121, 16129, 16135, 16140, 16146, 16156,
16162, 16168, 16177, 16185, 16191, 16195, 16203, 16210, 16217,
16225, 16234, 16237, 16246, 16253, 16262, 16269, 16278, 16283,
16288, 16297, 16304, 16311, 16319, 16326, 16332, 16337, 16346,
16353, 16360, 16366, 16370, 16381, 16384, 16395, 16399, 16407,
16415, 16422, 16444, 16452, 16454, 16467, 16474, 16477, 16484,
16490, 16501, 16508, 16514, 16520, 16529, 16533, 16539, 16550,
16556, 16564, 16566, 16578, 16582, 16593, 16599, 16604, 16613,
16620, 16623, 16635, 16636, 16654, 16660, 16666, 16673, 16681,
16688, 16693, 16702, 16706, 16714, 16721, 16728, 16734, 16745,
16749, 16757, 16764, 16769, 16778, 16785, 16792, 16805, 16812,
16819, 16830, 16832, 16839, 16846, 16856, 16862, 16867, 16877,
16884, 16890, 16898, 16904, 16912, 16917, 16923, 16936, 16938,
16953, 16960, 16966, 16973, 16980), class = "Date"), year = c(2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L), month = c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L,
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L,
8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L,
12L, 12L, 12L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L), survey_start = c("", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "Y", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "Y", "", "", "", "", "", "", "", "",
"", "", "", "", "", "")), class = "data.frame", row.names = c(NA,
-125L), .Names = c("study", "studyno", "date", "year", "month",
"survey_start"))


code 1 fails:

library(dplyr)  # provides %>%, arrange(), group_by(), mutate() and if_else()
trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",
if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year == 2015 & study == "site_1"][1] & date < date[month == 3 & year == 2016 & study == "site_1"][1], "survey_2015",
if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1], "survey_2016","")))))))))))

code 2 works:

trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1] , "survey_2016",
if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",""))))))))))

______________________________________________________________________

This e-mail contains information which is confidential. It is intended only for the use of the named recipient. If you have received this e-mail in error, please let us know by replying to the sender, and immediately delete it from your system. Please note, that in these circumstances, the use, disclosure, distribution or copying of this information is strictly prohibited. KEMRI-Wellcome Trust Programme cannot accept any responsibility for the accuracy or completeness of this message as it has been transmitted over a public network. Although the Programme has taken reasonable precautions to ensure no viruses are present in emails, it cannot accept responsibility for any loss or damage arising from the use of the email or attachments. Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of KEMRI-Wellcome Trust Programme.
______________________________________________________________________

______________________________________________
R-***@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Eric Berger
2018-02-22 10:04:16 UTC
Hi,
1. I think the different ordering leads to different results for the
following reason:
    date[ some condition is true ][1]
will give you an NA if there are no rows where 'some condition' holds.
In the code that 'works' you don't hit such a situation, but in the
code that 'does not work' you presumably hit an NA before you get to the
result that you really want.
2. I am not a big fan of your "nested if" layout. I think you could rewrite
it more clearly, and without nesting, with something like
    trialData$survey_year <- rep(NA_character_, nrow(trialData))
    trialData$survey_year[ condition for survey_2007 ] <- "survey_2007"
    trialData$survey_year[ condition for survey_2008 ] <- "survey_2008"
    etc.
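Both points can be illustrated with a minimal sketch on made-up toy data (the data frame d and its values are assumptions, not the poster's data). It uses base R's ifelse() so that it runs without extra packages; dplyr::if_else() likewise returns NA wherever its condition is NA unless a `missing` value is supplied.

```r
# Toy data: one participant with a 2014 survey start and no 2013 rows at all.
d <- data.frame(
  date = as.Date(c("2014-05-01", "2014-06-01", "2015-04-10", "2015-05-01")),
  year = c(2014L, 2014L, 2015L, 2015L),
  month = c(5L, 6L, 4L, 5L),
  survey_start = c("Y", "", "", ""),
  stringsAsFactors = FALSE
)

# No row matches, so taking element [1] of the empty result gives NA.
start_2013 <- d$date[d$survey_start == "Y" & d$year == 2013][1]

# Any comparison against NA is NA, so this test is NA for every row.
in_2013 <- d$date >= start_2013

# The 2014 boundaries both exist, so this test is TRUE or FALSE everywhere.
start_2014 <- d$date[d$survey_start == "Y" & d$year == 2014][1]
end_2014 <- d$date[d$month == 4 & d$year == 2015][1]
in_2014 <- d$date >= start_2014 & d$date < end_2014

# Nested layout: the NA test comes first, so the 2014 branch is never used.
nested <- ifelse(in_2013, "survey_2013", ifelse(in_2014, "survey_2014", ""))

# Non-nested layout: an NA in a logical index selects nothing when the
# assigned value has length one, so the 2013 line is simply a no-op.
flat <- rep(NA_character_, nrow(d))
flat[in_2013] <- "survey_2013"
flat[in_2014] <- "survey_2014"
```

In this sketch nested is NA in every row, while flat labels the first two rows "survey_2014" and leaves the remaining rows NA.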
HTH,
Eric
Eric Berger
2018-02-22 12:15:37 UTC
Hi Kevin,
I ran the code on the full data set and was able to reproduce the problem
that you are facing.
My guess is that you have an error in your intuition and/or logic, and that
this relates to the use of the subscript [1].
Specifically, on the full dataset, the condition
    trialData$date[trialData$survey_start == "Y" & trialData$year == 2013 &
                   trialData$site == "site_1"]

yields 412 matches, of which there are 9 unique ones, specifically

April 2, 3, 4, 5, 8, 10, 11, 16, 17

In the full data set the first element that appears, i.e. subscript [1], is
"2013-04-04".

In the filtered data set the first element that appears is "2013-04-05".

I hope that is enough information for you to make further progress from
here.
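A minimal sketch of the effect described above, on made-up toy data (the ids, dates and the names global_start, own_start, label_global and label_own are illustrative assumptions). The dplyr code in the original post ran under group_by(studyno), so each [1] was taken within one participant; the `$`-indexed version evaluates over the whole table, so [1] is whichever matching row happens to come first overall.

```r
# Toy data: two participants whose 2013 surveys start on different days.
d <- data.frame(
  id = c("a", "a", "a", "b", "b", "b"),
  date = as.Date(c("2013-04-04", "2013-04-11", "2013-04-18",
                   "2013-04-04", "2013-04-08", "2013-04-15")),
  survey_start = c("Y", "", "", "", "Y", ""),
  stringsAsFactors = FALSE
)

# Ungrouped: one start date for everybody, taken from the first matching row.
global_start <- d$date[d$survey_start == "Y"][1]
label_global <- rep(NA_character_, nrow(d))
label_global[d$date >= global_start] <- "survey_2013"

# Per participant: look up each row's own start date via its id.
# match() returns the first start row for that id, or NA if there is none.
start_rows <- d[d$survey_start == "Y", ]
own_start <- start_rows$date[match(d$id, start_rows$id)]
label_own <- rep(NA_character_, nrow(d))
label_own[d$date >= own_start] <- "survey_2013"
```

With the single global start date, participant "b"'s visit on 2013-04-04 is labelled even though that participant's survey only starts on 2013-04-08; with the per-participant start date that row stays NA.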

Best,
Eric
Kevin Wamae wrote:
Dear Eric, wow, this seems to do the trick. But I have encountered a
problem.
I have tested it on the larger dataset (attached), and it seems to work on a
filtered dataset but not on the whole dataset. See the script below.
# load packages
library(dplyr)
library(data.table)  # fread() comes from data.table, not dplyr
# load data
trialData <- fread("trialData.txt") %>% mutate(date = as.Date(date, "%d/%m/%Y"))
# create blank variable
trialData$survey_year <- rep(NA_character_, nrow(trialData))
# attempt 1 fails: code for survey
trialData$survey_year[
  trialData$date >= trialData$date[trialData$survey_start == "Y" & trialData$year == 2013 & trialData$site == "site_1"][1] &
  trialData$date < trialData$date[trialData$month == 4 & trialData$year == 2014 & trialData$site == "site_1"][1]
] <- "survey_2013"
# filter trialData
trialData <- trialData %>% filter(id == "id_786/3")
# attempt 2 works: code for survey
trialData$survey_year[
  trialData$date >= trialData$date[trialData$survey_start == "Y" & trialData$year == 2013 & trialData$site == "site_1"][1] &
  trialData$date < trialData$date[trialData$month == 4 & trialData$year == 2014 & trialData$site == "site_1"][1]
] <- "survey_2013"
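Writing one such assignment per survey year gets long, so here is a hedged sketch of a table-driven variant applied one participant at a time. The end months in bounds are copied from the nested if_else in the original post; the helper name label_surveys, the toy data, and the choice to leave a window open when no end-month row exists are assumptions made for illustration, not part of the thread.

```r
# End-of-survey month (in the following year) for each survey year, copied
# from the original nested if_else; 2016 has no end boundary there.
bounds <- data.frame(
  year = 2007:2016,
  end_month = c(5L, 4L, 5L, 5L, 4L, 4L, 4L, 4L, 3L, NA)
)

# Hypothetical helper: label the rows of ONE participant's data frame.
label_surveys <- function(d, bounds) {
  out <- rep(NA_character_, nrow(d))
  for (i in seq_len(nrow(bounds))) {
    yr <- bounds$year[i]
    start <- d$date[d$survey_start == "Y" & d$year == yr][1]
    if (is.na(start)) next  # no start row for this year: skip it
    keep <- d$date >= start
    if (!is.na(bounds$end_month[i])) {
      end <- d$date[d$month == bounds$end_month[i] & d$year == yr + 1][1]
      # Assumption: if no row falls in the end month, leave the window open.
      if (!is.na(end)) keep <- keep & d$date < end
    }
    out[keep] <- paste0("survey_", yr)
  }
  out
}

# Toy participant: 2014 start in May, no 2015 start, 2016 start in May.
d <- data.frame(
  id = "child_1",
  date = as.Date(c("2014-04-20", "2014-05-10", "2014-12-01", "2015-04-02",
                   "2015-06-01", "2016-05-05", "2016-06-01")),
  survey_start = c("", "Y", "", "", "", "Y", ""),
  stringsAsFactors = FALSE
)
d$year <- as.integer(format(d$date, "%Y"))
d$month <- as.integer(format(d$date, "%m"))

# Apply the helper per participant and put the labels back in row order.
labels <- unsplit(lapply(split(d, d$id), label_surveys, bounds = bounds), d$id)
```

On the toy participant, rows two and three come out as "survey_2014", the last two rows as "survey_2016", and the rest stay NA, including the 2015 rows for which there is no start marker. Whether those 2015 rows should instead receive a label is a decision about the study design that this sketch does not make.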
*Date: *Thursday, 22 February 2018 at 13:05
*Subject: *Re: [R] alternative for multiple if_else statements
Hi,
1. I think the reason that the different ordering leads to different
date[ some condition is true ][1]
will give you an NA if there are no rows where 'some condition holds'.
In the code that 'works' you don't have such a situation, but in the
code that 'does not work' you presumably hit an NA before you get to the
result that you really want.
2. I am not a big fan of your "nested if" layout. I think you could
rewrite it more clearly - and without nesting - with something like
trialData$survey_year <- rep(NA_character_, nrow(trialData))
trialData$survey_year[ condition for survey_2007 ] <- "survey_2007"
trialData$survey_year[ condition for survey_2008 ] <- "survey_2008"
etc
HTH,
Eric
Hi, I am having trouble trying to figure out why if_else is behaving the
way it is, it may be my code or the way the data is structured.
Below is a snapshot of a database am working on and it represents a
longitudinal survey of study participants in a trial with weekly follow up.
The variable "survey_start" represents the start of the study-defined one
year follow up (which we called "survey_year").
I am trying to populate all subsequent entries for each participant, per
survey year, with the entry "survey" followed by an underscore and the
respective year, eg. survey_2014.
There are missing entries such as the participant represented here, wasn't
available at the start of the 2015 survey. Also, some participants don’t
have complete one-year follow ups but I still need to include them.
I have written two codes, first one fails while the second works, the only
difference being I have reversed the order in which the entries are
populated in the second code (from 2007-2016 to 2016-2007) and removed the
if_else statement for 2015. Also noticed, that for the second code, which
spans the years 2007-2016 (less 2015), if a participants entries start from
2010-2016, the code fails.
Kindly assist in figuring this out...or better yet, an alternative.
trialData <- structure(list(study = c("site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1"), studyno = c("child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1"), date = structure(c(16078, 16085, 16092,
16098, 16104, 16115, 16121, 16129, 16135, 16140, 16146, 16156,
16162, 16168, 16177, 16185, 16191, 16195, 16203, 16210, 16217,
16225, 16234, 16237, 16246, 16253, 16262, 16269, 16278, 16283,
16288, 16297, 16304, 16311, 16319, 16326, 16332, 16337, 16346,
16353, 16360, 16366, 16370, 16381, 16384, 16395, 16399, 16407,
16415, 16422, 16444, 16452, 16454, 16467, 16474, 16477, 16484,
16490, 16501, 16508, 16514, 16520, 16529, 16533, 16539, 16550,
16556, 16564, 16566, 16578, 16582, 16593, 16599, 16604, 16613,
16620, 16623, 16635, 16636, 16654, 16660, 16666, 16673, 16681,
16688, 16693, 16702, 16706, 16714, 16721, 16728, 16734, 16745,
16749, 16757, 16764, 16769, 16778, 16785, 16792, 16805, 16812,
16819, 16830, 16832, 16839, 16846, 16856, 16862, 16867, 16877,
16884, 16890, 16898, 16904, 16912, 16917, 16923, 16936, 16938,
16953, 16960, 16966, 16973, 16980), class = "Date"), year = c(2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L), month = c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L,
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L,
8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L,
12L, 12L, 12L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L), survey_start = c("", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "Y", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "Y", "", "", "", "", "", "", "", "",
"", "", "", "", "", "")), class = "data.frame", row.names = c(NA,
-125L), .Names = c("study", "studyno", "date", "year", "month",
"survey_start"))

code 1 fails:

trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year ==
2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 &
study == "site_1"][1], "survey_2007",
if_else(date >= date[survey_start == "Y" & year ==
2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 &
study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year ==
2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 &
study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year ==
2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 &
study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year ==
2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 &
study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year ==
2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 &
study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year ==
2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 &
study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year ==
2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 &
study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year ==
2015 & study == "site_1"][1] & date < date[month == 3 & year == 2016 &
study == "site_1"][1], "survey_2015",
if_else(date >= date[survey_start == "Y" & year ==
2016 & study == "site_1"][1], "survey_2016","")))))))))))

code 2 works:

trialData <- trialData %>% arrange(studyno, date) %>%
group_by(studyno) %>%
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year ==
2016 & study == "site_1"][1]
, "survey_2016",
if_else(date >= date[survey_start == "Y" & year
== 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 &
study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year
== 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 &
study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year
== 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 &
study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year
== 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 &
study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year
== 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 &
study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year
== 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 &
study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year
== 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 &
study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year
== 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 &
study == "site_1"][1], "survey_2007",""))))))))))
______________________________________________________________________
This e-mail contains information which is confidential. It is intended
only for the use of the named recipient. If you have received this e-mail
in error, please let us know by replying to the sender, and immediately
delete it from your system. Please note, that in these circumstances, the
use, disclosure, distribution or copying of this information is strictly
prohibited. KEMRI-Wellcome Trust Programme cannot accept any responsibility
for the accuracy or completeness of this message as it has been
transmitted over a public network. Although the Programme has taken
reasonable precautions to ensure no viruses are present in emails, it
cannot accept responsibility for any loss or damage arising from the use of
the email or attachments. Any views expressed in this message are those of
the individual sender, except where the sender specifically states them to
be the views of KEMRI-Wellcome Trust Programme.
______________________________________________________________________
[[alternative HTML version deleted]]
______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Kevin Wamae
2018-02-23 05:13:13 UTC
Permalink
Dear Eric, thank you for that observation.

I realised that some of the participants had duplicated “survey_start” dates, and once I corrected this, the code worked.

Regards
------------------
Kevin Wamae
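
For anyone hitting the same issue, here is a minimal sketch of one way to look for duplicated start flags. It assumes "duplicated" means more than one "Y" for the same participant in the same year, which is only one reading of the message above. The toy data are made up; only the column names studyno, year and survey_start come from the posted sample.

```r
# Toy data (made up): child_1 has two "Y" flags in 2013.
toy <- data.frame(
  studyno      = c("child_1", "child_1", "child_1", "child_2", "child_2"),
  year         = c(2013L, 2013L, 2014L, 2013L, 2014L),
  survey_start = c("Y", "Y", "Y", "Y", ""),
  stringsAsFactors = FALSE
)

# Keep only the rows flagged as a survey start.
starts <- toy[toy$survey_start == "Y", ]

# Count the flags per participant and year.
counts <- aggregate(survey_start ~ studyno + year, data = starts, FUN = length)
names(counts)[3] <- "n_starts"

# Participant/year combinations with more than one start flag.
dupes <- counts[counts$n_starts > 1, ]
print(dupes)
```

Any row in dupes is a participant/year where date[...][1] would silently pick one of several candidate start dates.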
From: Eric Berger <***@gmail.com>
Date: Thursday, 22 February 2018 at 15:16
To: Kevin Wamae <***@kemri-wellcome.org>
Cc: "R-***@r-project.org" <R-***@r-project.org>
Subject: Re: [R] alternative for multiple if_else statements

Hi Kevin,
I ran the code on the full data set and was able to reproduce the problem that you are facing.
My guess is that you have an error in your intuition and/or logic, and that this relates to the use of the subscript [1].
Specifically, on the full dataset, the condition
trialData$date[trialData$survey_start == "Y" & trialData$year == 2013 & trialData$site == "site_1"]

yields 412 matches, of which there are 9 unique ones, specifically

April 2,3,4,5,8,10,11,16,17

In the full data set the first element that appears, i.e. subscript[1], is "2013-04-04".

In the filtered data set the first element that appears is "2013-04-05".

I hope that is enough information for you to make further progress from here.

Best,
Eric
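
To illustrate the point about subscript [1], here is a small sketch with made-up data. Without grouping, date[condition][1] returns a single date for everybody (the first match in row order), whereas each participant may have their own start date. The per-participant version below uses base R's ave(); the column names id, date and survey_start follow the script in this thread, and everything else is an assumption for illustration.

```r
# Toy data (made up): two participants with different start dates.
visits <- data.frame(
  id           = c("a", "a", "b", "b"),
  date         = as.Date(c("2013-04-04", "2013-04-11", "2013-04-05", "2013-04-12")),
  survey_start = c("Y", "", "Y", ""),
  stringsAsFactors = FALSE
)

# Ungrouped: one threshold for the whole data set, taken from the first match.
global_start <- visits$date[visits$survey_start == "Y"][1]

# Per participant: earliest flagged date within each id (NA if none is flagged).
start_num   <- ifelse(visits$survey_start == "Y", as.numeric(visits$date), NA_real_)
first_start <- function(v) if (all(is.na(v))) NA_real_ else min(v, na.rm = TRUE)
visits$start_date <- as.Date(ave(start_num, visits$id, FUN = first_start),
                             origin = "1970-01-01")

print(global_start)
print(visits)
```

Filtering to participant "b" first would change the ungrouped result from 2013-04-04 to 2013-04-05, which matches the kind of discrepancy described above.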



On Thu, Feb 22, 2018 at 1:28 PM, Kevin Wamae <***@kemri-wellcome.org> wrote:
Dear Eric, wow, this seems to do the trick. But I have encountered a problem.

I have tested it on the larger dataset (attached): it works on a filtered dataset but not on the whole dataset. See the script below.

#load packages
library(dplyr)
library(data.table)  # provides fread()

#load data
trialData <- fread("trialData.txt") %>% mutate(date = as.Date(date,"%d/%m/%Y"))

#create blank variable
trialData$survey_year <- rep(NA_character_, nrow(trialData))

#attempt 1 fails: code for survey
trialData$survey_year[trialData$date >= trialData$date[trialData$survey_start == "Y" & trialData$year == 2013 & trialData$site == "site_1"][1] & trialData$date < trialData$date[trialData$month == 4 & trialData$year == 2014 & trialData$site == "site_1"][1]] <- "survey_2013"

#filter trialData
trialData <- trialData %>% filter(id == "id_786/3")

#attempt 2 works: code for survey
trialData$survey_year[trialData$date >= trialData$date[trialData$survey_start == "Y" & trialData$year == 2013 & trialData$site == "site_1"][1] & trialData$date < trialData$date[trialData$month == 4 & trialData$year == 2014 & trialData$site == "site_1"][1]] <- "survey_2013"



From: Eric Berger <***@gmail.com>
Date: Thursday, 22 February 2018 at 13:05
To: Kevin Wamae <***@kemri-wellcome.org>
Cc: "R-***@r-project.org" <R-***@r-project.org>
Subject: Re: [R] alternative for multiple if_else statements

Hi,
1. I think the reason that the different ordering leads to different results is because of the following:
date[ some condition is true ][1]
will give you an NA if there are no rows where 'some condition holds'.
In the code that 'works' you don't have such a situation, but in the code that 'does not work' you presumably hit an NA before you get to the result that you really want.
2. I am not a big fan of your "nested if" layout. I think you could rewrite it more clearly - and without nesting - with something like
trialData$survey_year <- rep(NA_character_, nrow(trialData))
trialData$survey_year[ condition for survey_2007 ] <- "survey_2007"
trialData$survey_year[ condition for survey_2008 ] <- "survey_2008"
etc
HTH,
Eric
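
Both points can be sketched on a toy example. The data are made up, and base R's ifelse() stands in for dplyr's if_else(); both return NA where the condition is NA. The flat version makes one simplifying assumption that is not in the original code: each survey runs until the next recorded start, so no explicit upper bound is used.

```r
# Toy data (made up), loosely following the posted column names.
toy <- data.frame(
  date = as.Date(c("2014-04-20", "2014-05-10", "2015-02-01",
                   "2016-04-05", "2016-05-01")),
  year = c(2014L, 2014L, 2015L, 2016L, 2016L),
  survey_start = c("", "Y", "", "Y", ""),
  stringsAsFactors = FALSE
)

# Point 1: no row matches year 2007, so the subset is empty and [1] gives NA.
start_2007 <- toy$date[toy$survey_start == "Y" & toy$year == 2007][1]
cond_2007  <- toy$date >= start_2007   # NA for every row

# An NA condition gives an NA result, so the inner branch never contributes.
nested <- ifelse(cond_2007, "survey_2007",
                 ifelse(toy$date >= as.Date("2014-05-10"), "survey_2014", ""))
print(nested)

# Point 2: flat assignment, skipping years that have no recorded start.
toy$survey_year <- NA_character_
for (yr in c(2007L, 2014L, 2016L)) {
  start <- toy$date[toy$survey_start == "Y" & toy$year == yr][1]
  if (is.na(start)) next               # nothing recorded for this year
  # Later years overwrite earlier ones because the loop runs in ascending order.
  toy$survey_year[toy$date >= start] <- paste0("survey_", yr)
}
print(toy)
```

With several participants the loop would need to run per participant (or use per-participant start dates), since [1] otherwise picks a single date for the whole data set.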

On Wed, Feb 21, 2018 at 10:33 PM, Kevin Wamae <***@kemri-wellcome.org> wrote:
Hi, I am having trouble trying to figure out why if_else is behaving the way it is, it may be my code or the way the data is structured.

Below is a snapshot of a database am working on and it represents a longitudinal survey of study participants in a trial with weekly follow up.

The variable "survey_start" represents the start of the study-defined one year follow up (which we called "survey_year").

I am trying to populate all subsequent entries for each participant, per survey year, with the entry "survey" followed by an underscore and the respective year, eg. survey_2014.

There are missing entries such as the participant represented here, wasn't available at the start of the 2015 survey. Also, some participants don’t have complete one-year follow ups but I still need to include them.

I have written two codes, first one fails while the second works, the only difference being I have reversed the order in which the entries are populated in the second code (from 2007-2016 to 2016-2007) and removed the if_else statement for 2015. Also noticed, that for the second code, which spans the years 2007-2016 (less 2015), if a participants entries start from 2010-2016, the code fails.

Kindly assist in figuring this out...or better yet, an alternative.

trialData <- structure(list(study = c("site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1"), studyno = c("child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1"), date = structure(c(16078, 16085, 16092,
16098, 16104, 16115, 16121, 16129, 16135, 16140, 16146, 16156,
16162, 16168, 16177, 16185, 16191, 16195, 16203, 16210, 16217,
16225, 16234, 16237, 16246, 16253, 16262, 16269, 16278, 16283,
16288, 16297, 16304, 16311, 16319, 16326, 16332, 16337, 16346,
16353, 16360, 16366, 16370, 16381, 16384, 16395, 16399, 16407,
16415, 16422, 16444, 16452, 16454, 16467, 16474, 16477, 16484,
16490, 16501, 16508, 16514, 16520, 16529, 16533, 16539, 16550,
16556, 16564, 16566, 16578, 16582, 16593, 16599, 16604, 16613,
16620, 16623, 16635, 16636, 16654, 16660, 16666, 16673, 16681,
16688, 16693, 16702, 16706, 16714, 16721, 16728, 16734, 16745,
16749, 16757, 16764, 16769, 16778, 16785, 16792, 16805, 16812,
16819, 16830, 16832, 16839, 16846, 16856, 16862, 16867, 16877,
16884, 16890, 16898, 16904, 16912, 16917, 16923, 16936, 16938,
16953, 16960, 16966, 16973, 16980), class = "Date"), year = c(2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L), month = c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L,
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L,
8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L,
12L, 12L, 12L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L), survey_start = c("", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "Y", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "Y", "", "", "", "", "", "", "", "",
"", "", "", "", "", "")), class = "data.frame", row.names = c(NA,
-125L), .Names = c("study", "studyno", "date", "year", "month",
"survey_start"))


code 1 fails:

trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",
if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year == 2015 & study == "site_1"][1] & date < date[month == 3 & year == 2016 & study == "site_1"][1], "survey_2015",
if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1], "survey_2016","")))))))))))

code 2 works:

trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1] , "survey_2016",
if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",""))))))))))

Ista Zahn
2018-02-22 13:57:37 UTC
Permalink
I don't fully understand the logic you are trying to implement, but
something along the lines of

foo <- cut(trialData$date,
           breaks = as.Date(c("2007-01-01",
                              "2008-05-01",
                              "2009-04-01",
                              "2010-05-01",
                              "2011-05-01",
                              "2012-04-01",
                              "2013-04-01",
                              "2014-04-01",
                              "2015-04-01",
                              "2016-03-01",
                              "2017-01-01")))

might work.

Best,
Ista
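
A small sketch of how the cut() idea could produce the "survey_YYYY" labels directly, using made-up dates and break points. The labels argument and the conversion to character are additions for illustration. Note that this uses fixed calendar breaks for everyone, so it only approximates the per-participant start dates discussed in this thread.

```r
# Made-up break points and matching labels (one label per interval).
breaks <- as.Date(c("2014-05-01", "2015-04-01", "2016-03-01", "2017-01-01"))
labels <- c("survey_2014", "survey_2015", "survey_2016")

# Made-up visit dates, including one before the first break.
visit_dates <- as.Date(c("2014-04-15", "2014-05-01", "2015-03-31",
                         "2015-04-01", "2016-06-27"))

# cut() on Dates uses right = FALSE by default, so each interval includes its
# start date and excludes the next break. Dates outside all intervals become NA.
survey_year <- as.character(cut(visit_dates, breaks = breaks, labels = labels))
print(data.frame(visit_dates, survey_year))
```

Dates that fall before the first break or on/after the last one come back as NA, which makes unlabelled rows easy to spot.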
Post by Kevin Wamae
Hi, I am having trouble trying to figure out why if_else is behaving the way it is, it may be my code or the way the data is structured.
Below is a snapshot of a database am working on and it represents a longitudinal survey of study participants in a trial with weekly follow up.
The variable "survey_start" represents the start of the study-defined one year follow up (which we called "survey_year").
I am trying to populate all subsequent entries for each participant, per survey year, with the entry "survey" followed by an underscore and the respective year, eg. survey_2014.
There are missing entries such as the participant represented here, wasn't available at the start of the 2015 survey. Also, some participants don’t have complete one-year follow ups but I still need to include them.
I have written two codes, first one fails while the second works, the only difference being I have reversed the order in which the entries are populated in the second code (from 2007-2016 to 2016-2007) and removed the if_else statement for 2015. Also noticed, that for the second code, which spans the years 2007-2016 (less 2015), if a participants entries start from 2010-2016, the code fails.
Kindly assist in figuring this out...or better yet, an alternative.
trialData <- structure(list(study = c("site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1", "site_1", "site_1", "site_1", "site_1", "site_1",
"site_1", "site_1"), studyno = c("child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1", "child_1", "child_1", "child_1", "child_1",
"child_1", "child_1"), date = structure(c(16078, 16085, 16092,
16098, 16104, 16115, 16121, 16129, 16135, 16140, 16146, 16156,
16162, 16168, 16177, 16185, 16191, 16195, 16203, 16210, 16217,
16225, 16234, 16237, 16246, 16253, 16262, 16269, 16278, 16283,
16288, 16297, 16304, 16311, 16319, 16326, 16332, 16337, 16346,
16353, 16360, 16366, 16370, 16381, 16384, 16395, 16399, 16407,
16415, 16422, 16444, 16452, 16454, 16467, 16474, 16477, 16484,
16490, 16501, 16508, 16514, 16520, 16529, 16533, 16539, 16550,
16556, 16564, 16566, 16578, 16582, 16593, 16599, 16604, 16613,
16620, 16623, 16635, 16636, 16654, 16660, 16666, 16673, 16681,
16688, 16693, 16702, 16706, 16714, 16721, 16728, 16734, 16745,
16749, 16757, 16764, 16769, 16778, 16785, 16792, 16805, 16812,
16819, 16830, 16832, 16839, 16846, 16856, 16862, 16867, 16877,
16884, 16890, 16898, 16904, 16912, 16917, 16923, 16936, 16938,
16953, 16960, 16966, 16973, 16980), class = "Date"), year = c(2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
2014L, 2014L, 2014L, 2014L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L), month = c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L,
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L,
8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L,
12L, 12L, 12L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L), survey_start = c("", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "Y", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
"", "", "", "", "", "", "Y", "", "", "", "", "", "", "", "",
"", "", "", "", "", "")), class = "data.frame", row.names = c(NA,
-125L), .Names = c("study", "studyno", "date", "year", "month",
"survey_start"))
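library(dplyr)  # provides %>%, arrange(), group_by(), mutate() and if_else()

# First attempt (fails): survey years tested in ascending order, 2007 to 2016.
trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%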
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",
if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year == 2015 & study == "site_1"][1] & date < date[month == 3 & year == 2016 & study == "site_1"][1], "survey_2015",
if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1], "survey_2016","")))))))))))
# Second attempt (reported to work): descending order, 2016 to 2007, without the 2015 branch.
trialData <- trialData %>% arrange(studyno, date) %>% group_by(studyno) %>%
mutate(survey_year = if_else(date >= date[survey_start == "Y" & year == 2016 & study == "site_1"][1] , "survey_2016",
if_else(date >= date[survey_start == "Y" & year == 2014 & study == "site_1"][1] & date < date[month == 4 & year == 2015 & study == "site_1"][1], "survey_2014",
if_else(date >= date[survey_start == "Y" & year == 2013 & study == "site_1"][1] & date < date[month == 4 & year == 2014 & study == "site_1"][1], "survey_2013",
if_else(date >= date[survey_start == "Y" & year == 2012 & study == "site_1"][1] & date < date[month == 4 & year == 2013 & study == "site_1"][1], "survey_2012",
if_else(date >= date[survey_start == "Y" & year == 2011 & study == "site_1"][1] & date < date[month == 4 & year == 2012 & study == "site_1"][1], "survey_2011",
if_else(date >= date[survey_start == "Y" & year == 2010 & study == "site_1"][1] & date < date[month == 5 & year == 2011 & study == "site_1"][1], "survey_2010",
if_else(date >= date[survey_start == "Y" & year == 2009 & study == "site_1"][1] & date < date[month == 5 & year == 2010 & study == "site_1"][1], "survey_2009",
if_else(date >= date[survey_start == "Y" & year == 2008 & study == "site_1"][1] & date < date[month == 4 & year == 2009 & study == "site_1"][1], "survey_2008",
if_else(date >= date[survey_start == "Y" & year == 2007 & study == "site_1"][1] & date < date[month == 5 & year == 2008 & study == "site_1"][1], "survey_2007",""))))))))))
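One plausible explanation for the behaviour described above, offered as a sketch rather than a diagnosis of the full dataset: when a participant has no row with survey_start == "Y" in a given year, date[...][1] evaluates to NA, every comparison against it is NA, and both base ifelse() and dplyr::if_else() return NA for an NA condition instead of falling through to the next branch. The vectors d, start and yr below are hypothetical stand-ins for one participant's date, survey_start and year, and base ifelse() is used so the example runs without extra packages.

```r
# Hypothetical stand-in for one participant with no "Y" flag in 2007.
d     <- as.Date(c("2014-05-04", "2014-05-12", "2014-05-19"))
start <- c("", "Y", "")
yr    <- c(2014L, 2014L, 2014L)

# Subsetting with an all-FALSE condition gives a zero-length vector,
# and taking [1] of that yields NA rather than an error.
start_2007 <- d[start == "Y" & yr == 2007][1]
is.na(start_2007)

# Every comparison against NA is NA, so the test is NA for all rows ...
test_2007 <- d >= start_2007

# ... and ifelse() returns NA for an NA test instead of trying the
# "else" branch, so the nested 2014 branch never contributes a value.
nested <- ifelse(test_2007, "survey_2007",
                 ifelse(d >= d[start == "Y" & yr == 2014][1], "survey_2014", ""))
nested

# Treating "no start date" as FALSE lets the later branches run.
fixed <- ifelse(!is.na(test_2007) & test_2007, "survey_2007",
                ifelse(d >= d[start == "Y" & yr == 2014][1], "survey_2014", ""))
fixed
```

If this is what is happening, the order of the branches matters because the first branch whose start date is missing turns the affected rows into NA before any later branch is consulted.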
______________________________________________________________________
This e-mail contains information which is confidential. It is intended only for the use of the named recipient. If you have received this e-mail in error, please let us know by replying to the sender, and immediately delete it from your system. Please note, that in these circumstances, the use, disclosure, distribution or copying of this information is strictly prohibited. KEMRI-Wellcome Trust Programme cannot accept any responsibility for the accuracy or completeness of this message as it has been transmitted over a public network. Although the Programme has taken reasonable precautions to ensure no viruses are present in emails, it cannot accept responsibility for any loss or damage arising from the use of the email or attachments. Any views expressed in this message are those of the individual sender, except where the sender specifically states them to be the views of KEMRI-Wellcome Trust Programme.
______________________________________________________________________
[[alternative HTML version deleted]]
______________________________________________
R-***@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Kevin Wamae
2018-02-23 05:11:20 UTC
Permalink
Dear Ista, thank you. Let me see how best I can implement this.

Regards
------------------
Kevin Wamae

On 22/02/2018, 16:58, "Ista Zahn" <***@gmail.com> wrote:

I don't fully understand the logic you are trying to implement, but
something along the lines of

foo <- cut(trialData$date,
           breaks = as.Date(c("2007-01-01",
                              "2008-05-01",
                              "2009-04-01",
                              "2010-05-01",
                              "2011-05-01",
                              "2012-04-01",
                              "2013-04-01",
                              "2014-04-01",
                              "2015-04-01",
                              "2016-03-01",
                              "2017-01-01")))

might work.

Best,
Ista
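Building on the cut() suggestion above, a possible refinement is to pass labels so the resulting levels read survey_2007 to survey_2016. This is only a sketch: the break dates are copied from the example above, they are fixed calendar dates rather than each participant's own "Y" start date, and the names survey_breaks, survey_labels and sample_dates are made up for illustration.

```r
# Break dates copied from the suggestion above; one label per interval.
survey_breaks <- as.Date(c("2007-01-01", "2008-05-01", "2009-04-01",
                           "2010-05-01", "2011-05-01", "2012-04-01",
                           "2013-04-01", "2014-04-01", "2015-04-01",
                           "2016-03-01", "2017-01-01"))
survey_labels <- paste0("survey_", 2007:2016)  # 11 breaks give 10 intervals

# Hypothetical sample dates standing in for trialData$date.
sample_dates <- as.Date(c("2014-01-08", "2014-05-12", "2016-03-15"))

# cut() on Date input uses left-closed intervals by default (right = FALSE),
# so a date equal to a break falls into the interval that starts there.
survey_year <- cut(sample_dates, breaks = survey_breaks, labels = survey_labels)
as.character(survey_year)
```

Dates outside the first and last break come back as NA, so the outer breaks need to cover the whole follow-up period.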
S Ellison
2018-02-26 11:56:39 UTC
Permalink
That many ifelse statements is obviously rather a pain.

Would you not have got what you want with

... paste("survey", year, sep="_")
?

If that is not what you're looking for (e.g. because 'year' is the observation year and not the study start year), perhaps something that picks the minimum year for a subject or other relevant group might work? For example:
paste("survey", ave(year, studyno, FUN=min), sep="_")
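To make the difference between the two suggestions above concrete, here is a small sketch on made-up data (the data frame toy and its values are hypothetical, not taken from the trial). The first column labels each row by its own observation year; the second labels every row of a participant by that participant's earliest year.

```r
# Hypothetical data: two participants observed over different years.
toy <- data.frame(
  studyno = c("child_1", "child_1", "child_1", "child_2", "child_2"),
  year    = c(2014L, 2015L, 2016L, 2010L, 2011L),
  stringsAsFactors = FALSE
)

# Label by the observation year of each row.
toy$by_row_year <- paste("survey", toy$year, sep = "_")

# Label by the earliest year seen for each participant; ave() applies
# FUN within each studyno group and returns a vector of the same length.
toy$by_first_year <- paste("survey", ave(toy$year, toy$studyno, FUN = min), sep = "_")

toy
```

Neither variant accounts for a survey year that starts part-way through the calendar year at the "Y" row, so both are approximations of the labelling described in the original post.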


S Ellison
*******************************************************************
This email and any attachments are confidential. Any use, copying or
disclosure other than by the intended recipient is unauthorised. If
you have received this message in error, please notify the sender
immediately via +44(0)20 8943 7000 or notify ***@lgcgroup.com
and delete this message and any copies from your computer and network.
LGC Limited. Registered in England 2991879.
Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK
Kevin Wamae
2018-02-26 13:12:46 UTC
Permalink
Dear Ellison, thank you for the feedback. We replaced dplyr::if_else with dplyr::case_when and it seems to do the trick.
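A likely reason the switch helps, shown as a sketch on made-up data rather than the actual code used: case_when() treats an NA condition as "no match" and moves on to the next line, whereas if_else() returns NA for that row. The data frame toy is hypothetical and only two survey years are spelled out.

```r
library(dplyr)

# Hypothetical participant with a "Y" start flag in 2014 only.
toy <- data.frame(
  date         = as.Date(c("2014-05-04", "2014-05-12", "2014-05-19")),
  year         = c(2014L, 2014L, 2014L),
  survey_start = c("", "Y", ""),
  stringsAsFactors = FALSE
)

toy <- toy %>%
  mutate(survey_year = case_when(
    # No 2007 start row exists, so this condition is NA for every row;
    # case_when() skips it and evaluates the next condition instead.
    date >= date[survey_start == "Y" & year == 2007][1] ~ "survey_2007",
    date >= date[survey_start == "Y" & year == 2014][1] ~ "survey_2014",
    TRUE ~ ""  # fallback for rows before any survey start
  ))

toy$survey_year
```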

Still, we have to write a separate statement for each survey year, but it's working.
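One way to avoid a statement per year, sketched here in base R and not something implemented in this thread, is to carry the year of the most recent "Y" row forward within each participant. The data frame toy and the helper carry() are hypothetical. Note the assumption: a survey year is taken to last until the next "Y" row, so a participant who missed a survey start keeps the previous label instead of getting a blank.

```r
# Hypothetical data: child_1 has start flags in 2014 and 2016, child_2 in 2015.
toy <- data.frame(
  studyno      = c(rep("child_1", 5), rep("child_2", 3)),
  date         = as.Date(c("2014-04-28", "2014-05-05", "2015-06-01",
                           "2016-03-07", "2016-03-14",
                           "2015-03-30", "2015-04-06", "2015-04-13")),
  survey_start = c("", "Y", "", "Y", "", "", "Y", ""),
  stringsAsFactors = FALSE
)
toy <- toy[order(toy$studyno, toy$date), ]

# Year of each start row, NA on all other rows.
start_year <- as.integer(format(toy$date, "%Y"))
start_year[toy$survey_start != "Y"] <- NA

# For one participant's row positions, count the start rows seen so far
# (0 means none yet) and look up the matching start year.
carry <- function(idx) {
  yrs <- start_year[idx]
  n   <- cumsum(!is.na(yrs))
  c(NA, yrs[!is.na(yrs)])[n + 1]
}
last_start <- ave(seq_len(nrow(toy)), toy$studyno, FUN = carry)

# Rows before the first start flag get an empty label.
toy$survey_year <- ifelse(is.na(last_start), "", paste0("survey_", last_start))
toy
```

If the fixed month cut-offs from the original code are required, the carried-forward year would need an extra check against those dates.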

Let me see how we can implement your suggestion.

Regards
------------------
Kevin Wamae