3

I've the following table

Owner Pet Housing_Type
A Cats;Dog;Rabbit 3
B Dog;Rabbit 2
C Cats 2
D Cats;Rabbit 3
E Cats;Fish 1

The code is as follows:

Data_Pets = structure(list(Owner = structure(1:5, .Label = c("A", "B", "C", "D", "E"), class = "factor"), Pets = structure(c(2L, 5L, 1L,4L, 3L), .Label = c("Cats ", "Cats;Dog;Rabbit", "Cats;Fish","Cats;Rabbit", "Dog;Rabbit"), class = "factor"), House_Type = c(3L,2L, 2L, 3L, 1L)), class = "data.frame", row.names = c(NA, -5L)) 

Can anyone advise me how I can create new columns based on the data in Pet column by creating a new column for each animal separated by ; to look like the following table?

Owner Cats Dog Rabbit Fish Housing_Type
A Y Y Y N 3
B N Y Y N 2
C N Y N N 2
D Y N Y N 3
E Y N N Y 1

Thanks!

2 Answers 2

2

One approach is to define a helper function that matches for a specific animal, then bind the columns to the original frame.

Note that some wrangling is done to get rid of whitespace to identify the unique animals to query.

f <- Vectorize(function(string, match) { ifelse(grepl(match, string), "Y", "N") }, c("match")) df %>% bind_cols( f(df$Pets, unique(unlist(strsplit(trimws(as.character(df$Pets)), ";")))) ) Owner Pets House_Type Cats Dog Rabbit Fish 1 A Cats;Dog;Rabbit 3 Y Y Y N 2 B Dog;Rabbit 2 N Y Y N 3 C Cats 2 Y N N N 4 D Cats;Rabbit 3 Y N Y N 5 E Cats;Fish 1 Y N N Y 

Or more generalized if you don't know for sure that the separator is ;, and whitespace is present, stringi is useful:

dplyr::bind_cols( df, f(df$Pets, unique(unlist(stringi::stri_extract_all_words(df$Pets)))) ) 
Sign up to request clarification or add additional context in comments.

Comments

1

You can use separate_rows and pivot_wider from tidyr library:

library(tidyr) library(dplyr) Data_Pets %>% separate_rows(Pets , sep = ";") %>% mutate(Pets = trimws(Pets)) %>% mutate(temp = row_number()) %>% pivot_wider(names_from = Pets, values_from = temp) %>% mutate(across(c(Cats:Fish), function(x) if_else(is.na(x), "N", "Y"))) %>% dplyr::relocate(House_Type, .after = Fish) 

which will generate:

# Owner Cats Dog Rabbit Fish House_Type # <fct> <chr> <chr> <chr> <chr> <int> # 1 A Y Y Y N 3 # 2 B N Y Y N 2 # 3 C Y N N N 2 # 4 D Y N Y N 3 # 5 E Y N N Y 1 

Data:

Data_Pets = structure(list(Owner = structure(1:5, .Label = c("A", "B", "C", "D", "E"), class = "factor"), Pets = structure(c(2L, 5L, 1L,4L, 3L), .Label = c("Cats ", "Cats;Dog;Rabbit", "Cats;Fish","Cats;Rabbit", "Dog;Rabbit"), class = "factor"), House_Type = c(3L,2L, 2L, 3L, 1L)), class = "data.frame", row.names = c(NA, -5L)) 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.