2

Totally new to R and I trying to solve this using the dplyr package. I want to filter out and return countries that both have Import and Export values and view them separately. I tried a lot of methods such as select and filter but have been unable to do so.

Country Year Quantity Description Import/Export A 2001 10 Frozen Export B 2001 50 Fresh Import B 2004 20 Frozen Export C 2003 30 Frozen Import C 2005 40 Fresh Export C 2006 60 Frozen Import D 2007 290 Fresh Import 

Ideally, the end result should be this:

Country Year Quantity Description Import/Export B 2001 50 Fresh Import B 2004 20 Frozen Export C 2003 30 Frozen Import C 2005 40 Fresh Export C 2006 60 Frozen Import 
0

4 Answers 4

3

We can group_by() Country, then filter all groups with any "Import/Export" =='Import' and any ""Import/Export"" == 'Export'

library(dplyr) df %>% group_by(Country) %>% filter(any(`Import/Export`=='Import') & any(`Import/Export`=='Export')) %>% ungroup() # A tibble: 5 x 5 Country Year Quantity Description `Import/Export` <chr> <dbl> <dbl> <chr> <chr> 1 B 2001 50 Fresh Import 2 B 2004 20 Frozen Export 3 C 2003 30 Frozen Import 4 C 2005 40 Fresh Export 5 C 2006 60 Frozen Import 

data

structure(list(Country = c("A", "B", "B", "C", "C", "C", "D"), Year = c(2001, 2001, 2004, 2003, 2005, 2006, 2007), Quantity = c(10, 50, 20, 30, 40, 60, 290), Description = c("Frozen", "Fresh", "Frozen", "Frozen", "Fresh", "Frozen", "Fresh"), `Import/Export` = c("Export", "Import", "Export", "Import", "Export", "Import", "Import" )), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame" )) 
Sign up to request clarification or add additional context in comments.

5 Comments

Amazing!! What would be the benefit of presenting the data as shown in the 2nd part?
what 2nd part??
structure(list(Country = c("A", "B", "B", "C", "C", "C", "D"), Year = c(2001, 2001, 2004, 2003, 2005, 2006, 2007), Quantity = c(10, 50, 20, 30, 40, 60, 290), Description = c("Frozen", "Fresh", "Frozen", "Frozen", "Fresh", "Frozen", "Fresh"), Import/Export = c("Export", "Import", "Export", "Import", "Export", "Import", "Import" )), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame" ))
this is your data in a reproducible form. If you run that code, it gets us your dataframe. It makes it easier to share the data. You can get the same code with dput(data)
You should always share your data this way, it makes it much easier for others to test the answers and manipulate your data.
3

Using data from @GuedesBF answer here is another dplyr way to filter groups which has both 'Import' and 'Export'.

library(dplyr) df %>% group_by(Country) %>% filter(all(c('Import', 'Export') %in% `Import/Export`)) %>% ungroup() # Country Year Quantity Description `Import/Export` # <chr> <dbl> <dbl> <chr> <chr> #1 B 2001 50 Fresh Import #2 B 2004 20 Frozen Export #3 C 2003 30 Frozen Import #4 C 2005 40 Fresh Export #5 C 2006 60 Frozen Import 

Comments

1

Using data.table

library(data.table) setDT(df)[df[, .I[all(c('Import', 'Export') %in% `Import/Export`)], Country]$V1] Country Year Quantity Description Import/Export 1: B 2001 50 Fresh Import 2: B 2004 20 Frozen Export 3: C 2003 30 Frozen Import 4: C 2005 40 Fresh Export 5: C 2006 60 Frozen Import 

Comments

0
library(tidyverse) data <- tribble( ~Country, ~Year, ~Quantity, ~Description, ~`Import/Export`, "A", 2001, 10, "Frozen", "Export", "B", 2001, 50, "Fresh", "Import", "B", 2004, 20, "Frozen", "Export", "C", 2003, 30, "Frozen", "Import", "C", 2005, 40, "Fresh", "Export", "C", 2006, 60, "Frozen", "Import", "D", 2007, 290, "Fresh", "Import" ) data #> # A tibble: 7 x 5 #> Country Year Quantity Description `Import/Export` #> <chr> <dbl> <dbl> <chr> <chr> #> 1 A 2001 10 Frozen Export #> 2 B 2001 50 Fresh Import #> 3 B 2004 20 Frozen Export #> 4 C 2003 30 Frozen Import #> 5 C 2005 40 Fresh Export #> 6 C 2006 60 Frozen Import #> 7 D 2007 290 Fresh Import selected_countries <- data %>% mutate(is_there = TRUE) %>% distinct(Country, `Import/Export`, is_there) %>% pivot_wider(names_from = "Import/Export", values_from = is_there) %>% filter(!is.na(Export) & !is.na(Import)) %>% pull(Country) %>% unique() selected_countries #> [1] "B" "C" data %>% filter(Country %in% selected_countries) #> # A tibble: 5 x 5 #> Country Year Quantity Description `Import/Export` #> <chr> <dbl> <dbl> <chr> <chr> #> 1 B 2001 50 Fresh Import #> 2 B 2004 20 Frozen Export #> 3 C 2003 30 Frozen Import #> 4 C 2005 40 Fresh Export #> 5 C 2006 60 Frozen Import 

Created on 2021-09-11 by the reprex package (v2.0.1)

2 Comments

Love it! What is the purpose of adding a column?
It's to indicate that the property is "there". Pivot wider requires a name and a value column. After the pivot, we have all the info need in one row per country.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.