2
$\begingroup$

I'm building a system to classify emails into categories (positive, negative, out of office, etc.), and I'm looking for a dataset of already-classified emails so that I don't have to hand-label a database of 70k emails.

I know the Enron email dataset exists, but is there a version of it with classified emails? Or is there any other dataset of already-classified emails?

$\endgroup$
1
  • 1
    $\begingroup$ I assume this is a question for the Open Data SE. $\endgroup$ Commented Nov 15, 2016 at 17:03

2 Answers

2
+50
$\begingroup$

You can download the corpus from this site. To the best of my knowledge, this is the most complete email corpus available. A project to label a subset of this corpus can be found on this UC Berkeley site. I am not sure, though, whether these emails carry the labels you need for training.

$\endgroup$
1
  • $\begingroup$ Even if the categories aren't exactly what I'm looking for, it's a really good working basis! I'll wait a day or two before accepting your answer in case a better option comes up. $\endgroup$ Commented Nov 18, 2016 at 9:13
0
$\begingroup$

The paper "The Enron Corpus: A New Dataset for Email Classification Research" describes the kind of dataset you want.

The paper mentions the following link to download the data set:

https://www.cs.cmu.edu/~./enron/

Additionally, the paper cites several other papers that used smaller email classification datasets, which may not be of much use given this larger dataset.
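If folder names are acceptable as a starting set of labels, a minimal sketch for reading the corpus could look like the following. It assumes the archive unpacks to a `<root>/<user>/<folder>/<message file>` layout with one raw email per file; the function name `load_labeled_emails` and that layout are assumptions for illustration, not something taken from the paper.

```python
import email
import os
from email import policy


def load_labeled_emails(root):
    """Yield (folder_label, subject, body) for every message under root.

    Assumed layout (hypothetical): root/<user>/<folder>/<message file>.
    """
    for user in sorted(os.listdir(root)):
        user_dir = os.path.join(root, user)
        if not os.path.isdir(user_dir):
            continue
        for dirpath, _dirnames, filenames in os.walk(user_dir):
            # The folder path relative to the user directory serves as the label.
            label = os.path.relpath(dirpath, user_dir)
            if label == ".":
                continue  # skip files that are not inside any folder
            for name in sorted(filenames):
                with open(os.path.join(dirpath, name), "rb") as fh:
                    msg = email.message_from_binary_file(fh, policy=policy.default)
                # Prefer the plain-text part; fall back to an empty body.
                part = msg.get_body(preferencelist=("plain",))
                text = part.get_content() if part is not None else ""
                yield label, str(msg["subject"] or ""), text
```

Calling `list(load_labeled_emails(path))` on the unpacked directory would give (label, subject, body) tuples that can be fed to a classifier. Folder names reflect each user's own filing habits, so they would probably need to be mapped or re-labeled to get categories such as "out of office".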

$\endgroup$
3
  • $\begingroup$ No. This paper is about the number of emails per user, the number of messages per thread, and the number of folders per user, not about classifying email content. $\endgroup$ Commented Nov 15, 2016 at 15:58
  • $\begingroup$ @JérémyPouyet The paper does mention the task of classifying emails into folders. Please go through section 2 on page 3. $\endgroup$ Commented Nov 15, 2016 at 16:18
  • $\begingroup$ @JérémyPouyet I have added the link to the dataset from the paper. $\endgroup$ Commented Nov 15, 2016 at 16:35
