How to add jpg images information as a column in a data frame [closed]

Question

I have JPG images stored in a folder, for example 11_lion_king.jpg, 22_avengers.jpg, etc.

I have a data frame as below:

    data_movie.head()

    movie_id  genre
    11        ['action', 'comedy']
    22        ['animation', 'comedy']
    ..........

I want to add a new column, movie_image, to the data_movie data frame, with each JPG filename matched to the corresponding movie_id, as shown below:

    movie_id  genre                    movie_image
    11        ['action', 'comedy']     11_lion_king.jpg
    22        ['animation', 'comedy']  22_avengers.jpg
    .........

Help will be appreciated.

n1k31t4 · Accepted Answer · 2020-04-05 18:39:25Z

I assume you have a list of the filenames called movie_filenames:

    # Could get filenames with:
    # import os; movie_filenames = os.listdir("./folder/with/images/")
    movie_filenames = ["11_lion_king.jpg", "22_avengers.jpg"]
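If the folder also holds files that are not images, the listing may need a filter. Below is a minimal sketch of one way to do that with pathlib; the helper name list_jpg_filenames and the temporary demo folder are assumptions made for illustration, not part of the original answer.

```python
from pathlib import Path
import tempfile


def list_jpg_filenames(folder):
    """Return the sorted names of the .jpg files directly inside `folder`."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".jpg"
    )


# Demo on a throwaway folder (hypothetical contents) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["11_lion_king.jpg", "22_avengers.jpg", "notes.txt"]:
        (Path(tmp) / name).touch()
    movie_filenames = list_jpg_filenames(tmp)

print(movie_filenames)
```

In real use you would pass your own image folder to the helper instead of the temporary one.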

First create a mapping between the ID values and the filenames:

    # Use the "_" to split the filename and take the first item, the ID
    mapping = {f.split("_")[0]: f for f in movie_filenames}  # <-- a dictionary comprehension
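Note that the keys produced this way are strings ("11", "22"). If movie_id is stored as integers, one option is to build the mapping with integer keys instead. The following is only a sketch under that assumption; the helper build_id_mapping and the extra file poster.jpg are hypothetical and serve to show how names without a numeric prefix could be skipped.

```python
def build_id_mapping(filenames):
    """Map the integer ID prefix of each filename to the full filename.

    Filenames that do not look like "<digits>_<rest>" are skipped.
    """
    mapping = {}
    for name in filenames:
        prefix, sep, _rest = name.partition("_")
        if sep and prefix.isdecimal():
            mapping[int(prefix)] = name
    return mapping


# "poster.jpg" is a made-up file with no ID prefix; it is ignored.
mapping = build_id_mapping(["11_lion_king.jpg", "22_avengers.jpg", "poster.jpg"])
print(mapping)
```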

Now add a column of some empty values (whatever you like) that will hold the movie_image values:

    import pandas as pd

    data_movie["movie_image"] = pd.Series(dtype="object")  # will be filled with NaN values until populated

Now iterate over this mapping, inserting the movie filenames for the correct movie IDs:

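    for movie_id, movie_image_filename in mapping.items():
        # The mapping keys are strings, so this comparison assumes movie_id holds strings too
        data_movie.loc[data_movie.movie_id == movie_id, "movie_image"] = movie_image_filename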

This should produce the output dataframe you described.

As a side note (in case you are ever tempted): never load the actual images into a pandas dataframe. It is best to load them as NumPy arrays or something similar. Pandas DataFrames are in essence just annotated NumPy arrays anyway.

Nishant · Accepted Answer · 2020-04-05 20:17:39Z

Slight addendum to the above solution:

    # First create a mapping between the ID values and the filenames.
    # Use the "_" to split the filename and take the first item, the ID
    mapping = {f.split("_")[0]: f for f in movie_filenames}  # <-- a dictionary comprehension

    # Now iterate over this mapping, inserting the movie filenames for the correct movie IDs.
    # Casting movie_id to str lets it match the string keys of the mapping.
    for movie_id, movie_image_filename in mapping.items():
        data_movie.loc[data_movie.movie_id.astype(str) == movie_id, "movie_image"] = movie_image_filename

Alternatively, using the map function:

    mapping = {f.split("_")[0]: f for f in movie_filenames}
    data_movie["movie_image"] = data_movie['movie_id'].astype(str).map(mapping)
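To see how the map approach behaves end to end, here is a small self-contained sketch. The sample frame is made up to mirror the question, and the third row (movie_id 33, with no matching image) is a hypothetical addition that shows what happens to unmatched IDs: map leaves them as NaN, which makes them easy to find afterwards.

```python
import pandas as pd

# Sample data modelled on the question; row 33 is invented and has no image file.
data_movie = pd.DataFrame({
    "movie_id": [11, 22, 33],
    "genre": [["action", "comedy"], ["animation", "comedy"], ["drama"]],
})
movie_filenames = ["11_lion_king.jpg", "22_avengers.jpg"]

# String ID prefix -> filename
mapping = {f.split("_")[0]: f for f in movie_filenames}

# Cast the integer IDs to str so they match the string keys of the mapping
data_movie["movie_image"] = data_movie["movie_id"].astype(str).map(mapping)

# IDs that found no image end up as NaN in the new column
missing_ids = data_movie.loc[data_movie["movie_image"].isna(), "movie_id"].tolist()
print(data_movie)
print("IDs without an image:", missing_ids)
```

Compared with the loop, map does the lookup in a single vectorised step and needs no placeholder column beforehand.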
