I increment unread_messages_count for all chat members who are not currently in that chat.
This optimises the app for inactive (non-)users, affecting the experience of actually present, online, active users. Looking at it from that angle should make it obvious you want the opposite.
Precomputed counts (tally tables) aren't inherently a bad idea, just not necessarily the way you do it, or for this particular use case. It's great to have those already prepared for when a user comes back, but it should stay out of the way of active users' traffic. Also, consider how much lag that really saves you - if it isn't perceivably faster than checking on demand, it's not worth it.
Is this a good approach for implementing unread message counters
You didn't really show or describe your implementation in enough detail. There's nothing wrong with it conceptually, it's done all the time. If "message triggers multiple updates" means actual triggers hold back every insert until it updates the counts in chat_members, that's bad.
is there a more efficient / standard pattern for this kind of feature?
avoid updating many rows on every new message while still being able to show an accurate unread counter
As already proposed by @deceze♦: just don't do it.
You only really need it when a user comes back. Save a bookmark where they left off and count whatever appeared after that.
create table chat_members ( id serial primary key, user_id int not null references users(id), chat_id int not null references chats(id), last_read_message_id int not null references messages(id) );
select chat_id, count(*) from users as u join chat_members as cm on cm.user_id=u.id join messages as m using(chat_id) where user_id = :userid and m.id > cm.last_read_message_id --also dictates our min(m.sent_at) group by chat_id order by min(m.sent_at)--fetch most recently visited threads first limit 10;--you can fetch the rest after loading the initial view
Problem with that is, if you keep maintaining last_read_message_id the same way you maintained unread_messages_count, it just shifts and spreads the problem, or even risks amplifying it. You'd trade one big update of N users per incoming message, for N separate updates, one for each user, each read, and those updates also get in the way of online, active readers. Before, offline participants slowed down those online, now everyone online would slow down everyone else online.
You need to decouple one from the other:
- Collect the
last_read_message_id when the user leaves a given chat (in a detectible way closes, minimises, switches away from it). If it's hard to detect they're leaving the currently active chat, it can be dictated by scroll and view time sensitivity, reported in intervals or batches, or even assumed based on received heartbeats - you can simply move it up to the timestamp of the last signal from them. Point is not to move their read head on each message individually - the specifics are up to how smart you want to be with your strategy. Which, by the way, could be adjustible to your current load. - Keep it entirely app-side - each time there's a push/publish/notify event, the app can increment the given chat's counter internally. If its memory/cache gets wiped, you can re-establish some estimates based on last login timestamp. Multiple clients of the same user need to sync up, so this
last_read_message_id "read head" is still useful. - Even if you move it app-side, you might still want to see chats going cold, without having to fetch anything from potentially unreachable endpoints. You can make them periodically report that, which again justifies keeping track of their read head position.
- You probably need to paginate over long chats anyways, so you could always start by simply reporting back your pagination head, limiting the number of
last_read_message_id updates to one per page fetched. All first-time chat visits probably want to start at most recent - in those cases and whenever user finishes catching up, you're back to the heartbeats/batches. But all future re-visits, should start from last read and let the user scroll, fetching X messages in each page at a time, giving you an easy way to bundle up and reduce multiple updates down to one.
It could also make sense to differentiate between long unread and recent - a chat blowing up with 900 messages overnight isn't the same as one that accumulated 900 over a year since you've lost interest in it. On the other hand, an unresponded message from yesterday might be more urgent than a new one from a minute ago. Measuring relative to last_read_message_id you can figure out
- how many messages behind you are, but also
- over how long,
- how many different people spoke out (or were those just two people arguing publicly),
- how much text that really is (as opposed to emojis, reactions) etc,
Compared to the unread_messages_count, it just has more uses.
like other chat apps do
You've probably seen examples of chat bubbles not showing any numbers, showing numbers that you never really cared to check and see if they were really correct: using estimates is just more efficient.
For only a few messages, the user probably wants to know if there's anything new at all, not how much exactly - no numbers, just an indicator. If they're far behind, they probably want to know that too, but still not exactly by how much - that's a "20+". Removing the need for exact counters lets you use built-in pg_stat histograms to get them for free, in constant time, or periodically precompute your own from pg_cron, or loosen up how often you configure your clients to report their pagination heads, heartbeats etc.