It seems to me that the process you're currently following is this, which fails with your out-of-memory error:
- Create several data files
- Concatenate them together
- Sort the result, discarding duplicate records (rows)
I think you should be able to perform the following process instead:
- Create several data files
- Sort each one independently, discarding its duplicates (`sort -u`)
- Merge the resulting set of sorted data files, discarding duplicates (`sort -m -u`)
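
As a rough sketch of the second process, something like the following might work. The file names (`part1.txt`, `part2.txt`, `merged.txt`) and the sample data are made up for illustration, so substitute your own files. Since `sort -m` only merges and assumes every input is already sorted, I've pinned the locale so that both steps use the same collation order.

```shell
#!/bin/sh
set -eu

# Use one collation order for both the per-file sort and the merge.
LC_ALL=C
export LC_ALL

# Hypothetical sample data in a scratch directory; replace with your own files.
workdir=$(mktemp -d)
printf 'b\na\nb\n' > "$workdir/part1.txt"
printf 'c\na\nc\n' > "$workdir/part2.txt"

# Step 1: sort each file independently, discarding its duplicates.
for f in "$workdir/part1.txt" "$workdir/part2.txt"; do
    sort -u "$f" > "$f.sorted"
done

# Step 2: merge the already-sorted files, discarding duplicates across them.
sort -m -u "$workdir/part1.txt.sorted" "$workdir/part2.txt.sorted" > "$workdir/merged.txt"

cat "$workdir/merged.txt"
```

The merge step only has to walk through the sorted inputs in parallel rather than sort everything from scratch, which is why I'd expect it to need far less memory than sorting the concatenated file.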