14

I am trying to parse a list of email addresses, removing the username and '@' symbol and leaving only the domain name.

Example: [email protected]
Desired output: gmail.com

I have accomplished this with the following code:

    for row in cr:
        emailaddy = row[0]
        (emailuser, domain) = row[0].split('@')
        print domain

but my issue is when I encounter an improperly formatted email address. For example, if the row contains "aaaaaaaaa" (instead of a valid email address), the program crashes with the error:

    (emailuser, domain) = row[0].split('@')
    ValueError: need more than 1 value to unpack

(as you would expect). Rather than check every email address for validity, I would rather just not grab the domain for that record and move on to the next one. How can I properly handle this error and just move on?

So for the list of:

    [email protected]
    [email protected]
    youououou
    [email protected]

I would like the output to be:

    gmail.com
    hotmail.com
    yahoo.com

Thanks!

3
  • docs.python.org/tutorial/errors.html Commented Feb 28, 2012 at 5:54
  • 1
    Just to note, about the only safe way to validate an email is to send an email to the address asking the account holder to verify that they've received it. Commented Feb 28, 2012 at 10:48
  • Note that several @'s are perfectly fine by the email spec. So it's much more sensible to do rfind("@") or something.. Commented Feb 28, 2012 at 16:36
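The suggestion in the last comment could look roughly like this. It is only a sketch in Python 3: the helper name `domain_after_last_at` and the sample addresses are made up for illustration, and it assumes the domain is simply whatever follows the last '@'.

```python
def domain_after_last_at(address):
    """Return the text after the last '@', or None if there is no '@'."""
    index = address.rfind('@')  # rfind returns -1 when '@' is absent
    if index == -1:
        return None
    return address[index + 1:]


# Hypothetical sample data, not taken from the question.
samples = ['alice@gmail.com', 'youououou', '"odd@local"@example.org']
for address in samples:
    print(address, '->', domain_after_last_at(address))
```

Returning None for entries without an '@' leaves it to the caller to decide whether to skip them.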

8 Answers

28

You want something like this?

    try:
        (emailuser, domain) = row[0].split('@')
    except ValueError:
        continue
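For context, here is how that try/except might sit inside the full loop. This is a sketch in Python 3, assuming `cr` is a `csv.reader` (the question does not say what it is); the in-memory sample data and the `domains` list are hypothetical.

```python
import csv
import io

# Hypothetical stand-in for the question's `cr`; assumed to be a csv.reader.
raw = io.StringIO('alice@gmail.com\nbob@hotmail.com\nyouououou\ncarol@yahoo.com\n')
cr = csv.reader(raw)

domains = []
for row in cr:
    try:
        emailuser, domain = row[0].split('@')
    except ValueError:
        # Raised when the split does not yield exactly two parts,
        # i.e. no '@' at all or more than one: skip this record.
        continue
    domains.append(domain)

print('\n'.join(domains))
```

Note that this also skips addresses containing more than one '@', because unpacking three or more parts into two names raises ValueError as well.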

7

You can just filter out the addresses that do not contain '@'.

    >>> [mail.split('@')[1] for mail in mylist if '@' in mail]
    ['gmail.com', 'hotmail.com', 'yahoo.com']

4

What about

    splitaddr = row[0].split('@')
    if len(splitaddr) == 2:
        domain = splitaddr[1]
    else:
        domain = ''

This also handles cases like aaa@bbb@ccc by treating them as invalid (domain is set to '').

2

Try this

    In [28]: b = ['[email protected]', '[email protected]', 'youououou', '[email protected]']
    In [29]: [x.split('@')[1] for x in b if '@' in x]
    Out[29]: ['gmail.com', 'hotmail.com', 'yahoo.com']

2

This does what you want:

    l = ["[email protected]", "[email protected]", "youououou", "[email protected]", "amy@[email protected]"]
    for e in l:
        if '@' in e:
            l2 = e.split('@')
            print l2[-1]
        else:
            print

Output:

    gmail.com
    hotmail.com
    yahoo.com
    youso.com

It handles the case where an email address has more than one '@' by taking only the part to the right of the last one.

1

    if '@' in row[0]:
        user, domain = row[0].split('@')
        print domain

1

Maybe the best solution is to avoid exception handling altogether. You can do this with the built-in string method partition(). It is similar to split(), but it always returns a 3-tuple, so unpacking the result does not raise ValueError when the separator is not found.
Read more:
https://docs.python.org/3/library/stdtypes.html#str.partition
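A minimal sketch of that idea in Python 3 follows. The helper name `domain_or_empty` and the sample addresses are hypothetical, and returning '' for entries without an '@' is just one possible convention.

```python
def domain_or_empty(address):
    """Return the part after the first '@', or '' if there is no '@'."""
    # partition always returns (head, separator, tail); when '@' is
    # missing, both separator and tail are empty strings.
    _user, separator, domain = address.partition('@')
    return domain if separator else ''


# Hypothetical sample data, not taken from the question.
samples = ['alice@gmail.com', 'bob@hotmail.com', 'youououou', 'carol@yahoo.com']
domains = [d for d in map(domain_or_empty, samples) if d]
print(domains)
```

partition() splits at the first '@'. If the domain should be whatever follows the last '@', rpartition() works the same way but searches from the right.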

0

We can treat a string without an '@' symbol as a bare username:

    try:
        (emailuser, domain) = row[0].split('@')
        print "Email User : " + emailuser
        print "Email Domain : " + domain
    except ValueError:
        emailuser = row[0]
        print "Email User Only : " + emailuser

Output:

    Email User : abc
    Email Domain : gmail.com
    Email User : xyz
    Email Domain : gmail.com
    Email User Only : usernameonly
