Please consider the following data model:

    import Data.Set (Set)
    import qualified Data.Set as Set
    import Data.Text (Text)

    data Artist = Artist Text deriving (Eq, Ord)
    data Song = Song Artist Text deriving (Eq, Ord)
    data Catalogue = Catalogue (Set Artist) (Set Song)

Each Artist is referenced from both the Songs and the Catalogue. The Catalogue holds the set of all artists referenced by its Songs, so the same Artist values are referenced from two places.
Suppose we were to generate the Catalogue value using multiple applications of the following function:

    insertSong :: Song -> Catalogue -> Catalogue
    insertSong song@(Song artist title) (Catalogue artists songs) =
      Catalogue (Set.insert artist artists) (Set.insert song songs)

Evidently, the Catalogue then gets filled with references to the same Artist values that the Songs refer to, which saves memory because no copies of those values are stored.
The problem is that when I recreate the catalogue from serialized data by deserializing the set of artists and the set of songs separately, the application uses far more memory than when it built the same Catalogue value with insertSong. I suspect the cause is that the sharing between the Artist values referenced from the Songs and those referenced from the Catalogue is lost, so I end up with duplicate copies of each Artist occupying extra memory.
The only solution I see is to deserialize the set of artists first, and then deserialize the set of songs while explicitly replacing each Song's Artist with the equal value from the first set.
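To make the idea concrete, here is a minimal sketch of that replacement step, assuming both sets have already been deserialized. The names `canonical`, `reshareSongs` and `rebuildCatalogue` are hypothetical helpers made up for illustration, and the `deriving` clauses are only there so the snippet compiles on its own. I have not measured whether this actually restores the memory footprint.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set
import Data.Text (Text)

data Artist = Artist Text deriving (Eq, Ord, Show)
data Song = Song Artist Text deriving (Eq, Ord, Show)
data Catalogue = Catalogue (Set Artist) (Set Song) deriving (Eq, Show)

-- Hypothetical helper: return the element stored in the pool that is
-- equal to x, or x itself when the pool holds no equal element.
canonical :: Ord a => Set a -> a -> a
canonical pool x =
  case Set.lookupGE x pool of
    Just y | y == x -> y
    _               -> x

-- Hypothetical helper: rebuild every Song so that its Artist field is
-- the value stored in the artist set rather than a separate copy.
-- mapMonotonic is safe here because each rebuilt Song compares equal
-- to the original, so the ordering of the set is unchanged.
reshareSongs :: Set Artist -> Set Song -> Set Song
reshareSongs artists = Set.mapMonotonic reshare
  where
    reshare (Song artist title) =
      let artist' = canonical artists artist
      -- Force the lookup now, so the new Song does not keep a thunk
      -- that still references the old copy of the Artist.
      in artist' `seq` Song artist' title

-- Assemble the Catalogue from the two separately deserialized sets.
rebuildCatalogue :: Set Artist -> Set Song -> Catalogue
rebuildCatalogue artists songs =
  Catalogue artists (reshareSongs artists songs)
```

The `seq` matters because the fields of Song are lazy: without it, each Song would hold an unevaluated lookup that keeps the duplicate Artist alive until something forces it.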
So my questions are:
- Am I right in my suspicion?
- Will the solution I see work?
- Are there any better ways to solve this?