
How can I safely convert a pandas DataFrame to an H2O frame?

    import h2o
    import pandas as pd

    df = pd.DataFrame({'col1': [1,1,2], 'col2': ['César Chávez Day', 'César Chávez Day', 'César Chávez Day']})
    hf = h2o.H2OFrame(df)  # gives error

UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 4: ordinal not in range(128)

Environment: Python 3.5, h2o 3.10.4.2

  • Your steps are correct; the error you are getting is unrelated to h2o. Try looking up the error message itself, or look at this Stack Overflow question: stackoverflow.com/questions/9942594/… Commented Mar 29, 2017 at 18:55

1 Answer


I agree that this is not an H2O-specific issue. This works for me (same H2O and Python version):

    import h2o
    import pandas as pd
    df = pd.DataFrame({'col1': [1,1,2], 'col2': ['César Chávez Day', 'César Chávez Day', 'César Chávez Day']})
    hf = h2o.H2OFrame(df)
    ## -- End pasted text --
    Parse progress: |█████████████████████████████████████████████████████████| 100%

    In [4]: hf
    Out[4]:
      col1  col2
    ------  ----------------
         1  César Chávez Day
         1  César Chávez Day
         2  César Chávez Day

    [3 rows x 2 columns]

    In [5]: type('César Chávez Day')
    Out[5]: str

My specs (you may need to change your default encoding):

    In [6]: import sys

    In [7]: sys.getdefaultencoding()
    Out[7]: 'utf-8'

This thread may help: How do I check if a string is unicode or ascii?
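On Python 3, sys.getdefaultencoding() always reports 'utf-8', so it may not be the setting that differs between two machines. A minimal diagnostic sketch, assuming the failure comes from some implicit encoding step (for example the locale or terminal encoding); the helper names encoding_report and non_ascii_values are made up for illustration and are not part of h2o or pandas:

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python may fall back on for implicit I/O."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "locale_preferred": locale.getpreferredencoding(False),
        # sys.stdout can be replaced by an object without .encoding
        "stdout": getattr(sys.stdout, "encoding", None),
    }


def non_ascii_values(values):
    """Return the strings that fail to encode with the 'ascii' codec."""
    bad = []
    for value in values:
        try:
            value.encode("ascii")
        except UnicodeEncodeError:
            bad.append(value)
    return bad


if __name__ == "__main__":
    for name, enc in encoding_report().items():
        print(name, "=", enc)
    print(non_ascii_values(["plain text", "César Chávez Day"]))
```

If locale_preferred or stdout shows an ASCII-style encoding (names such as 'ANSI_X3.4-1968' or 'US-ASCII') while default shows 'utf-8', the environment's locale settings are a plausible place to look, though this is only a guess about the cause here.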


4 Comments

My default encoding is already 'utf-8'. What else could it be?
Just checking: is your type('César Chávez Day') "str"?
Maybe try this (might be redundant though): df = pd.DataFrame({'col1': [1,1,2], 'col2': [str('César Chávez Day'), str('César Chávez Day'), str('César Chávez Day')]})
I understand that it is strange, but type('César Chávez Day') is str, and your second suggestion also causes the same Unicode error.
