Say this is my data.

mydat <- structure(list(ItemRelation = c(158200L, 158204L),
                        DocumentNum = c(1715L, 1715L),
                        CalendarYear = c(2018L, 2018L),
                        X1 = c(0L, 0L), X2 = c(0L, 0L), X3 = c(0L, 0L),
                        X4 = c(NA, NA), X5 = c(107L, 105L), X6 = c(NA, NA)),
                   .Names = c("ItemRelation", "DocumentNum", "CalendarYear",
                              "X1", "X2", "X3", "X4", "X5", "X6"),
                   class = "data.frame", row.names = c(NA, -2L))

How can I replace X6 with the value of X5 in every row where X6 is NA?

In this example, the desired output would be:

  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5  X6
1       158200        1715         2018  0  0  0 NA 107 107
2       158204        1715         2018  0  0  0 NA 105 105
  • with(mydat, ifelse(is.na(X6), X5, X6)) Commented Aug 30, 2018 at 13:02
  • Try ifelse, e.g. ifelse(is.na(X6), X5, X6) Commented Aug 30, 2018 at 13:02
  • @RonakShah Looks like an answer to me. If you post it as a comment instead of an answer, it cannot be upvoted, downvoted, or accepted, and no one looking at the question queue will see that an answer has been posted (and possibly accepted), etc. Commented Aug 30, 2018 at 13:11
  • @RonakShah, the duplicate post you linked did not help me, but theforestecologist's solution works great. Commented Aug 30, 2018 at 13:18
  • @duckmayr Thanks. I did not post it as an answer because this question seemed like a duplicate and I was looking for one. :) Commented Aug 30, 2018 at 13:22

1 Answer

You can use sapply in base R:

mydat[, c("X5", "X6")] <- with(mydat, sapply(mydat[8:9], function(x) ifelse(is.na(X6), X5, X6)))

This gives the desired output:

  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5  X6
1       158200        1715         2018  0  0  0 NA 107 107
2       158204        1715         2018  0  0  0 NA 105 105

Explanation:

ifelse examines whether the X6 value for a given row is NA, and if so, selects the value of X5 from that row. If X6 is not NA, then just X6 is used.
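As a small illustration of that element-by-element behaviour, here is a sketch on two made-up vectors. The names x5, x6 and filled are hypothetical stand-ins, not the question's data; unlike the example data, only some entries of x6 are NA here.

```r
# Hypothetical vectors for illustration only (not taken from the question).
x5 <- c(107L, 105L, 99L)
x6 <- c(NA, 50L, NA)

# ifelse() is vectorized: it tests each element of x6 and takes the
# value from x5 where x6 is NA, otherwise it keeps the value from x6.
filled <- ifelse(is.na(x6), x5, x6)
print(filled)
```

The second element keeps its original value of 50 because it is not NA; only the missing entries are filled from x5.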

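sapply applies this function to each of the selected columns (mydat[8:9], i.e. X5 and X6). The row-by-row work is done by ifelse itself, which is vectorized over the rows of your data.frame.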

with evaluates the expression in an environment built from mydat, so that you can refer to its columns by name without using $ or [].
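Because ifelse is already vectorized, the same result can be had without sapply by assigning to X6 directly, which is the form suggested in the comments on this page. The sketch below rebuilds the question's data with data.frame() for self-containment (an assumption on my part; the question uses structure()). Note that the sapply version above also writes the same vector into X5, which is harmless here only because X6 is entirely NA.

```r
# Rebuild the example data from the question (X4 and X6 are entirely NA).
mydat <- data.frame(
  ItemRelation = c(158200L, 158204L),
  DocumentNum  = c(1715L, 1715L),
  CalendarYear = c(2018L, 2018L),
  X1 = c(0L, 0L), X2 = c(0L, 0L), X3 = c(0L, 0L),
  X4 = c(NA, NA),
  X5 = c(107L, 105L),
  X6 = c(NA, NA)
)

# Replace X6 with X5 only where X6 is NA; every other column is untouched.
mydat$X6 <- with(mydat, ifelse(is.na(X6), X5, X6))
print(mydat)
```

Assigning to mydat$X6 alone keeps X5 and the identifier columns exactly as they were, even when some X6 values are already present.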

3 Comments

How can this be a correct answer if the contents of X5 is copied over all other columns? Even ItemRelation and DocumentNum have been overwritten.
Your edit makes it worse. Column X6 is now replaced by two columns X6.X5 and X6.X6. This is caused by the unnecessary call to sapply(). IMHO, the correct base R solution would be mydat$X6 <- with(mydat, ifelse(is.na(X6),X5,X6)) but this has already been proposed by Ronak Shah.
@Uwe, I noticed that I totally misread things the first time, which is why I initially incorporated the sapply. I stuck with it since that's the approach that was "accepted" by the OP, but then I realized I also made a typo in my edits. It's now fixed. I agree that Ronak's answer is better. If he does not add it as an answer, I can incorporate it into mine.
