I'm quite new to Python and I'm encountering a problem.

I have a dataframe in which one column holds the departure times of flights. The times are given as floats in the following format: 1100.0, 525.0, 1640.0, etc.

This is a pandas Series which I want to transform into a datetime series such as: S = [11.00, 5.25, 16.40, ...]

What I have tried already:

  • Converting my objects to strings:
S = [str(x) for x in S]
  • Using datetime.strptime:
S = [datetime.strptime(x, '%H%M.%S') for x in S]

But since the values are not all in the same format, it doesn't work.

  • Using parser from dateutil:
S = [parser.parse(x) for x in S]

I got the error:

 'Unknown string format'

  • Using pandas' to_datetime:
S = pd.to_datetime(S)

This doesn't give me the expected result.

Thanks for your answers!

5
  • Why don't you split on the point and then check the string length? If it is 3, then do datetime(0,0,0,int(string_before_point[0]),int(string_before_point[-2:]),after_point_string) and something similar if the length is 4. Commented Apr 3, 2019 at 16:47
  • can you show us an example that datetime.strptime(x,'%H%M.%S') does not work ? Commented Apr 3, 2019 at 16:49
  • @Nerdrigo Wouldn't it be far easier to just divide each element by 100? Commented Apr 3, 2019 at 16:50
  • So these are initially floating point numbers? Why not just do the simple math to separate out hours and minutes, and then pass those numbers to a datetime constructor. I don't see any reason to do string operations here. Commented Apr 3, 2019 at 16:50
  • Not a solution but a way to normalize the formats: s = [f'{x:07.2f}' for x in s]. That will make s = ['1100.00', '0525.00', '1640.00'] for s = [1100.0, 525.0, 1640.0]. Commented Apr 3, 2019 at 16:57
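
The arithmetic suggested in the comments above (split the number into hours and minutes, then call a constructor) can be sketched as follows. This is only a minimal sketch: it assumes every value is a non-negative HHMM number with no missing entries, and the helper name `hhmm_to_time` is hypothetical, not something from the question.

```python
from datetime import time

def hhmm_to_time(value):
    """Convert an HHMM-style number such as 525.0 into datetime.time(5, 25)."""
    # divmod by 100 splits 525 into (5, 25): hours and minutes.
    hours, minutes = divmod(int(value), 100)
    return time(hours, minutes)

S = [1100.0, 525.0, 1640.0, 25.0]
times = [hhmm_to_time(x) for x in S]
print([t.strftime("%H:%M") for t in times])
# prints ['11:00', '05:25', '16:40', '00:25']
```

Because no string handling is involved, three-digit and four-digit inputs are treated the same way. A value such as 2400.0 would raise a ValueError, since `datetime.time` only accepts hours from 0 to 23.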

4 Answers

1

Since it's a column within a dataframe (a Series), keeping it that way while transforming should work just fine.

import pandas as pd

S = [1100.0, 525.0, 1640.0]
se = pd.Series(S)  # Your column
# se:
# 0    1100.0
# 1     525.0
# 2    1640.0
# dtype: float64
setime = se.astype(int).astype(str).apply(lambda x: x[:-2] + ":" + x[-2:])

This transforms the floats into correctly formatted strings:

0    11:00
1     5:25
2    16:40
dtype: object

And then you can simply do:

df["your_new_col"] = pd.to_datetime(setime) 
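
One caveat with the slicing above: a departure time shortly after midnight, such as 25.0, becomes the string "25", and `x[:-2]` then leaves the hour part empty (":25"). If the column can contain such values, a possible variant is to zero-pad to four digits and pass an explicit format. The sketch below assumes the column has no missing values; the sample Series is a stand-in for the real column.

```python
import pandas as pd

se = pd.Series([1100.0, 525.0, 1640.0, 25.0])  # stand-in for the real column

# Zero-pad to four digits so every value reads as HHMM, e.g. 25.0 -> "0025".
padded = se.astype(int).astype(str).str.zfill(4)

# Parse with an explicit format; the date part defaults to 1900-01-01.
parsed = pd.to_datetime(padded, format="%H%M")

# Keep only the time of day as datetime.time objects.
times = parsed.dt.time
print(times.tolist())
```

Using `.dt.time` drops the placeholder date, which may be preferable when only the time of day matters.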

0

How about this?

(Added an if statement since some entries have 4 digits before decimal and some have 3. Added the use case of 125.0 to account for this)

from datetime import datetime

S = [1100.0, 525.0, 1640.0, 125.0]

for x in S:
    if str(x).find(".") == 3:
        x = "0" + str(x)
    print(datetime.strftime(datetime.strptime(str(x), "%H%M.%S"), "%H:%M:%S"))


0

You might give it a go as follows:

# Just initialising a state in line with your requirements
st = ["1100.0", "525.0", "1640.0"]
dfObj = pd.DataFrame(st)
# Casting the string column to float
dfObj_num = dfObj[0].astype(float)
# Getting the hour representation out of the number
df1 = dfObj_num.floordiv(100)
# Getting the minutes
df2 = dfObj_num.mod(100)
# Moving the minutes on the right-hand side of the decimal point
df3 = df2.mul(0.01)
# Combining the two Series
df4 = df1.add(df3)
# At this point can cast to other types

Result:

0    11.00
1     5.25
2    16.40

You can run this example to verify the steps for yourself, and you can also turn it into a function. Adjust it as needed to match your exact requirements.
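
As a rough illustration of wrapping these steps in a function, here is a minimal sketch. The name `hhmm_to_decimal` is hypothetical, and the final rounding to two decimals is an addition that guards against floating-point noise; it is not part of the steps above.

```python
import pandas as pd

def hhmm_to_decimal(series):
    """Turn HHMM-style values (e.g. 525.0) into hour.minute floats (e.g. 5.25)."""
    numeric = series.astype(float)
    hours = numeric.floordiv(100)      # 525.0 -> 5.0
    minutes = numeric.mod(100)         # 525.0 -> 25.0
    # Shift the minutes behind the decimal point and combine with the hours.
    return hours.add(minutes.mul(0.01)).round(2)

result = hhmm_to_decimal(pd.Series(["1100.0", "525.0", "1640.0"]))
print(result.tolist())
```

Note that the result is a display-style number rather than a true time value: 5.25 here stands for 05:25, not five and a quarter hours.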

Might be useful to go through this article about Pandas Series. https://www.geeksforgeeks.org/python-pandas-series/


0

There must be a better way to do this, but this works for me.

df = pd.DataFrame([1100.0, 525.0, 1640.0], columns=['hour'])
df['hour_dt'] = ((df['hour']/100).apply(str).str.split('.').str[0] + '.' +
                 df['hour'].apply((lambda x: '{:.2f}'.format(x/100).split('.')[1])).apply(str))
print(df)
     hour hour_dt
0  1100.0   11.00
1   525.0    5.25
2  1640.0   16.40
