5

I need to read columns of complex numbers in the format:

    # index; (real part, imaginary part); (real part, imaginary part)
    1 (1.2, 0.16) (2.8, 1.1)
    2 (2.85, 6.9) (5.8, 2.2)

NumPy seems great for reading in columns of data with only a single delimiter, but the parentheses seem to ruin any attempt at using numpy.loadtxt().

Is there a clever way to read in the file with Python, or is it best to just read the file, remove all of the parentheses, then feed it to NumPy?

This will need to be done for thousands of files so I would like an automated way, but maybe NumPy is not capable of this.

3 Answers

5

Here's a more direct way than @Jeff's answer: tell loadtxt to load the data straight into a complex array, using a helper function parse_pair that maps (1.2,0.16) to 1.20+0.16j:

    >>> import re
    >>> import numpy as np
    >>> pair = re.compile(r'\(([^,\)]+),([^,\)]+)\)')
    >>> def parse_pair(s):
    ...     return complex(*map(float, pair.match(s).groups()))
    ...
    >>> s = '''1 (1.2,0.16) (2.8,1.1)
    ... 2 (2.85,6.9) (5.8,2.2)'''
    >>> from cStringIO import StringIO
    >>> f = StringIO(s)
    >>> np.loadtxt(f, delimiter=' ', dtype=np.complex,
    ...            converters={1: parse_pair, 2: parse_pair})
    array([[ 1.00+0.j  ,  1.20+0.16j,  2.80+1.1j ],
           [ 2.00+0.j  ,  2.85+6.9j ,  5.80+2.2j ]])

Or in pandas:

    >>> import pandas as pd
    >>> f.seek(0)
    >>> pd.read_csv(f, delimiter=' ', index_col=0, names=['a', 'b'],
    ...             converters={1: parse_pair, 2: parse_pair})
                 a           b
    1  (1.2+0.16j)  (2.8+1.1j)
    2  (2.85+6.9j)  (5.8+2.2j)
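The session above is Python 2. As the comments on this answer note, on Python 3 loadtxt may hand the converters bytes instead of str, which neither the regex nor complex() accepts. Here is a minimal sketch of converters that tolerate both; `_as_text` and `parse_plain` are hypothetical names introduced for illustration, and the regex is loosened to allow a space after the comma:

```python
import re

# Matches "(re,im)", allowing optional whitespace around either number.
PAIR = re.compile(r'\(\s*([^,\s)]+)\s*,\s*([^,\s)]+)\s*\)')

def _as_text(s):
    """Decode bytes to str; leave str untouched (hypothetical helper)."""
    return s.decode() if isinstance(s, bytes) else s

def parse_pair(s):
    """Map '(1.2,0.16)' or b'(1.2,0.16)' to the complex number 1.2+0.16j."""
    match = PAIR.match(_as_text(s).strip())
    if match is None:
        raise ValueError("not a (re,im) pair: %r" % (s,))
    real, imag = map(float, match.groups())
    return complex(real, imag)

def parse_plain(s):
    """Converter for ordinary numeric columns such as the index."""
    return complex(float(_as_text(s)))

print(parse_pair(b'(1.2,0.16)'))   # bytes input
print(parse_pair('(2.85, 6.9)'))   # str input with a space after the comma
print(parse_plain(b'1'))
```

These would presumably be passed as converters={0: parse_plain, 1: parse_pair, 2: parse_pair} together with dtype=complex. That part is an assumption and has not been checked against every NumPy release. Note also that a space after the comma still breaks splitting on delimiter=' ', so the file itself would need the compact (1.2,0.16) form.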

6 Comments

I had never heard about pandas. Seems pretty useful. Thanks!
Running this exact example on my machine returns an error: "ValueError: complex() arg is a malformed string". Any explanation?
@JoeyDumont I just tried it again and it still worked with python 2.7.6 / numpy 1.7.1. What python / numpy versions are you using?
@JoeyDumont Actually, I just tried with python 3.3 and got a similar error. Tracking it down a bit, it seems that numpy is reading its inputs as bytes, and for some reason e.g. complex(b'1') fails with a TypeError whereas float(b'1') works just fine. This should be a problem whenever you read anything with loadtxt(..., dtype=complex); you can work around it by adding a converter for every other column, e.g. lambda x: complex(x.decode() if isinstance(x, bytes) else x). parse_pair will also need its input decoded before matching it against the regex. Annoying....
I use Python 2.7.6 / numpy 1.8.0. I'm not sure how to change parse_pair to try your solution.
4

Since this issue is still not resolved in pandas, let me add another solution. You could modify your DataFrame with a one-liner after reading it in:

    import pandas as pd
    df = pd.read_csv('data.csv')
    df = df.apply(lambda col: col.apply(lambda val: complex(val.strip('()'))))
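The one-liner relies on Python's built-in complex() parsing strings such as "1.2+0.16j", so it assumes the cells are already in a+bj form rather than the comma-separated pairs from the question. A quick sketch of what complex() accepts and rejects; `try_complex` is a hypothetical helper used only for illustration:

```python
def try_complex(text):
    """Return complex(text), or None if the string cannot be parsed."""
    try:
        return complex(text)
    except ValueError:
        return None

samples = ["1.2+0.16j", "(1.2+0.16j)", "2.85-6.9j", "1+-3j", "1.2, 0.16"]
for text in samples:
    # The last two samples are rejected and print None.
    print(repr(text), "->", try_complex(text))
```

The last two samples show why "(1.2, 0.16)" cells need a conversion step first, and why a "+-" sequence has to be collapsed to "-" before calling complex().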


2

If your file only has 5 columns like you've shown, you could feed it to pandas with a regex for conversion, replacing the parentheses with commas on every line. After that, you could combine them as suggested in this SO answer to get complex numbers.

Pandas makes it easier because its read_csv method accepts a regex as the delimiter, which is the advantage over the numpy version. That lets you write clearer code and use a converter like this:

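    import pandas as pd
    from io import StringIO  # on Python 2: from StringIO import StringIO

    f_str = "1 (2, 3) (5, 6)\n2 (3, 4) (4, 8)\n3 (0.2, 0.5) (0.6, 0.1)"
    buf = StringIO(f_str)  # file-like object standing in for the real file

    def complex_converter(txt):
        # "2, 3" -> "2+3j"; "2, -3" -> "2+-3" -> "2-3j"
        txt = txt.strip("()").replace(", ", "+").replace("+-", "-") + "j"
        return complex(txt)

    # A regex delimiter requires the python engine; header=None keeps the
    # first data row from being consumed as column names.
    df = pd.read_csv(buf, delimiter=r" \(|\) \(", engine="python", header=None,
                     converters={1: complex_converter, 2: complex_converter},
                     index_col=0)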

EDIT: Looks like @Dougal came up with this just before I posted this...really just depends on how you want to handle the complex number. I like being able to avoid the explicit use of the re module.
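The "strip the parentheses, then combine the real and imaginary columns" idea from the start of this answer can also be sketched without pandas. The version below assumes each row is an integer index followed by (re, im) pairs, as in the question; `NUMBER` and `parse_line` are hypothetical names, and one sample value is made negative on purpose. Building complex(re, im) from two floats avoids the "+-" string problem mentioned in the comments:

```python
import re

# A signed decimal number, optionally with an exponent.
NUMBER = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

def parse_line(line):
    """Split 'idx (re, im) (re, im) ...' into (index, [complex, ...])."""
    values = [float(tok) for tok in NUMBER.findall(line)]
    # Expect one index plus an even count of real/imaginary parts.
    if len(values) < 3 or len(values) % 2 == 0:
        raise ValueError("expected an index followed by (re, im) pairs: %r" % line)
    index, rest = int(values[0]), values[1:]
    pairs = [complex(re_part, im_part)
             for re_part, im_part in zip(rest[0::2], rest[1::2])]
    return index, pairs

text = """# index; (real part, imaginary part); (real part, imaginary part)
1 (1.2, 0.16) (2.8, 1.1)
2 (2.85, 6.9) (5.8, -2.2)
"""

# Skip blank lines and the '#' header comment.
rows = [parse_line(line) for line in text.splitlines()
        if line.strip() and not line.startswith('#')]
for index, pairs in rows:
    print(index, pairs)
```

The lists of pairs could then be handed to np.array(..., dtype=complex) if a NumPy array is needed.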

3 Comments

I had never heard about pandas. Seems pretty useful. Thanks!
Just a note if anyone else looks at this for help, your complex_converter does not work if the imaginary part is negative. The complex method seems to not work for "1+-3j"
@f4hy fixed it to replace "+-" with "-". Works for negative imaginary numbers now :)
