I have a text file with a lot of values in scientific notation. However, instead of using E for the exponent (e.g. 2.0E-05), the values use D (e.g. 2.0D-05).

    LATI               LONGI             AREA              CO2
    -0.5548999786D+01  0.3167600060D+02  0.1000000000D+07  0.1375607300D+08
    -0.1823500061D+02  0.3668500137D+02  0.1000000000D+07  0.6878036500D+07
    -0.6650000215D+00  0.2960499954D+02  0.7500000000D+06  0.5086381000D+07
    -0.9671999931D+01  0.2264999962D+02  0.1000000000D+07  0.2657306000D+08
    -0.1321700001D+02  0.4895299911D+02  0.6893938750D+06  0.8595105000D+07
    -0.1152099991D+02  0.2493499947D+02  0.1000000000D+07  0.2615907200D+08

How can I replace all the D's with E's?

Based on another Stack Overflow answer, I wrote the following loop, but it's very slow and there is probably an easier way.

    for ind in range(len(df_fires.LATI)):
        val = df_fires.LATI[ind]
        df_fires.LATI[ind] = float(val.replace('D','E'))
        val = df_fires.LONGI[ind]
        df_fires.LONGI[ind] = float(val.replace('D','E'))

Example file: https://www.dropbox.com/s/5glujwqux6d0msh/test.txt?dl=0

2 Answers

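Use sed to replace every D with E in the file before parsing it with Python. Redirect with > rather than >>, so that rerunning the command overwrites test_new.txt instead of appending to it.

    sed -e 's:D:E:g' test.txt > test_new.txt

This rewrites every D in the file, including any in the header row, so check that no column name contains a D.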

If you would rather stay in Python, try this solution: https://stackoverflow.com/a/11332274/5196039
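As a rough sketch of a pure-Python route (not necessarily what the linked answer does), the D can be swapped for E only where it sits between a digit and an exponent, which leaves header text alone. The names fortran_to_float and parse_table are made up for this example, and the input is assumed to be whitespace-delimited with a single header row.

```python
import re

# A D (or d) that follows a digit or dot and precedes an optional sign plus a digit.
_FORTRAN_EXP = re.compile(r"(?<=[0-9.])[Dd](?=[+-]?[0-9])")


def fortran_to_float(token):
    """Convert a token such as '0.3167600060D+02' to a float (hypothetical helper)."""
    return float(_FORTRAN_EXP.sub("E", token))


def parse_table(text):
    """Split whitespace-delimited text into a header list and rows of floats."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return header, [[fortran_to_float(tok) for tok in row] for row in rows]


sample = """\
LATI LONGI AREA CO2
-0.5548999786D+01 0.3167600060D+02 0.1000000000D+07 0.1375607300D+08
-0.1823500061D+02 0.3668500137D+02 0.1000000000D+07 0.6878036500D+07
"""

header, rows = parse_table(sample)
print(header)
print(rows[0])
```

If the file is read with pandas instead, a function like fortran_to_float could probably be passed per column through the converters argument of pd.read_csv, so the values arrive as floats without a second pass.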


You can use apply to run a function on every element of a column.

I'm not sure it will be faster, since I only have a small dataset, but it is definitely less code:

    import pandas as pd

    columns = ['LATI', 'LONGI', 'AREA', 'CO2']
    data = [['-0.5548999786D+01', '0.3167600060D+02', '0.1000000000D+07', '0.1375607300D+08'],
            ['-0.1823500061D+02', '0.3668500137D+02', '0.1000000000D+07', '0.6878036500D+07'],
            ['-0.6650000215D+00', '0.2960499954D+02', '0.7500000000D+06', '0.5086381000D+07'],
            ['-0.9671999931D+01', '0.2264999962D+02', '0.1000000000D+07', '0.2657306000D+08'],
            ['-0.1321700001D+02', '0.4895299911D+02', '0.6893938750D+06', '0.8595105000D+07'],
            ['-0.1152099991D+02', '0.2493499947D+02', '0.1000000000D+07', '0.2615907200D+08']]
    df = pd.DataFrame(columns=columns, data=data)

    # Replace the Fortran-style D exponent marker with E in each listed column
    for column_name in columns:
        df[column_name] = df[column_name].apply(lambda x: x.replace('D', 'E'))

Output from df:

                    LATI  ...               CO2
    0  -0.5548999786E+01  ...  0.1375607300E+08
    1  -0.1823500061E+02  ...  0.6878036500E+07
    2  -0.6650000215E+00  ...  0.5086381000E+07
    3  -0.9671999931E+01  ...  0.2657306000E+08
    4  -0.1321700001E+02  ...  0.8595105000E+07
    5  -0.1152099991E+02  ...  0.2615907200E+08

4 Comments

I tried that, but I get AttributeError: ("'int' object has no attribute 'replace'", 'occurred at index DAY') at the updated_df line.
@shizishan Can you give us the data for the original dataframe in the question?
I added a link in the question. dropbox.com/s/5glujwqux6d0msh/test.txt?dl=0
I've updated my answer to only touch the columns shown in the question. Your dataset probably has columns holding int values, and ints don't have a replace method, which is what causes the error you see. Use apply only on the list of columns you want to convert.
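The point about integer columns could be handled along these lines: pick out only the string columns, do the replacement with the vectorized .str.replace, and convert the result to floats. This is a sketch, not a tested fix for the linked file. The DAY values below are invented for illustration, and it assumes every string column holds nothing but D-notation numbers.

```python
import pandas as pd

# Small stand-in for the real data; the DAY values are hypothetical.
df = pd.DataFrame({
    "DAY": [1, 2],
    "LATI": ["-0.5548999786D+01", "-0.1823500061D+02"],
    "CO2": ["0.1375607300D+08", "0.6878036500D+07"],
})

# Keep only columns that hold strings, so integer columns such as DAY are skipped.
text_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c])]

for col in text_cols:
    # Plain (non-regex) replacement on the whole column, then parse as numbers.
    df[col] = pd.to_numeric(df[col].str.replace("D", "E", regex=False))

print(df.dtypes)
```

Unlike the apply version above, this leaves the columns as floats rather than strings, which is what the loop in the question was doing with float().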
