I want to reproduce the following R code in Python, using something similar to the dplyr package. In R I would do this:

    library(dplyr)
    df <- data.frame(Countries=c('Brazil','Venezuela','Brazil, Colombia, Paraguay','Argentina','Peru','Andorra,Argentina,Chile,Uruguay'),
                     Code=c(1,2,3,4,5,6))
    df %>% filter(grepl('(Brazil|Argentina)',Countries))

Or even:

    a=strsplit(as.character(df$Countries),',')
    a=lapply(a,FUN=function(t) gsub(" ","",t))
    ele=unlist(lapply(a,FUN=function(t) any(t%in%c('Brazil','Argentina'))))
    (df[ele,])

The output that I want:

                            Countries Code
    1                          Brazil    1
    2      Brazil, Colombia, Paraguay    3
    3                       Argentina    4
    4 Andorra,Argentina,Chile,Uruguay    6

In Python I've tried this:

    import pandas as pd
    df = pd.DataFrame(dict(Countries=['Brazil','Venezuela','Brazil, Colombia, Paraguay','Argentina','Peru','Andorra,Argentina,Chile,Uruguay'],
                           Code=[1,2,3,4,5,6]))
    list_=['Brazil','Argentina']
    print(df.loc[df['Countries'].isin(list_)])

But the output only contains the rows whose whole value is exactly equal to one of the list items:

       Countries  Code
    0     Brazil     1
    3  Argentina     4
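
For reference, here is a minimal sketch of what I think the pandas equivalents of the two R snippets could look like. It assumes `Series.str.contains` is the closest counterpart to `grepl`, and that splitting each cell on commas reproduces the `strsplit` / `%in%` version. The names `pattern`, `has_any`, `by_regex` and `by_split` are made up for illustration, so treat this as a starting point rather than the definitive way to do it.

```python
import re

import pandas as pd

df = pd.DataFrame(dict(
    Countries=['Brazil', 'Venezuela', 'Brazil, Colombia, Paraguay',
               'Argentina', 'Peru', 'Andorra,Argentina,Chile,Uruguay'],
    Code=[1, 2, 3, 4, 5, 6]))
list_ = ['Brazil', 'Argentina']

# Approach 1: regex substring match, analogous to grepl('(Brazil|Argentina)', Countries).
# re.escape guards against regex metacharacters in the search terms.
pattern = '|'.join(re.escape(name) for name in list_)
by_regex = df[df['Countries'].str.contains(pattern)]


# Approach 2: split on commas, strip spaces, then test exact membership,
# analogous to the strsplit / gsub / %in% version.
# has_any is a hypothetical helper, not a pandas function.
def has_any(cell, wanted=frozenset(list_)):
    return any(part.strip() in wanted for part in cell.split(','))


by_split = df[df['Countries'].apply(has_any)]

print(by_regex)
print(by_split)
```

Both selections should keep the rows with `Code` 1, 3, 4 and 6. Unlike dplyr's `filter`, pandas keeps the original index labels (0, 2, 3, 5), so `reset_index(drop=True)` would be needed to renumber the rows. The two approaches are not identical in general: the regex version matches substrings, while the split version only matches whole country names.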