Should I let objects, whose copying is "costly", be naively copyable?

Question

I'm devising an API - or actually, a wrapper API for another, lower-level API - in a programming language with objects.

This API represents some entity E which is reference-counted (for example, a hardware resource), but the reference counting is done by the underlying API, or by the OS kernel etc., so there are lower-level "increase ref count" and "decrease ref count" functions for these entities. Let's assume we consider the refcount increases/decreases to be "expensive" actions.

So, I'm designing a class for these entities. An object of class E "holds" one unit of ref count, i.e. we increase the ref count on its construction and decrease it on its destruction.

I am now mulling over the reference ownership and copyability/assignability of this class. Let's assume it's copyable if-and-only-if it's assignable, and discuss those two features as just "copyability". So, should such E objects be copyable, and how?

The options I can think of:

1. Make class E naively copyable, supporting e_2 = e_1 statements; the copy code will increase the refcount of the underlying entity (the one e_1 refers to) by 1, i.e. do the expensive thing.
2. Make class E movable but not (naively) copyable, i.e. e_2 = e_1 would typically fail to compile; to actually copy an E and increase the reference count, we would have an E::clone() method.
3. Make class E support both owning-reference and non-owning-reference instances, i.e. possibly not increase and decrease the refcount, in which case an instance depends on another, owning, reference to stay alive while being used. In this case, the naive assignment e_2 = e_1 would create a non-owning reference e_2 (even if e_1 is an owning reference), without touching the refcount.
4. Separate the notion of ownership from the class, so that E is always a non-owning reference, but we have an owner<T> class template, or generic class, which adds the ownership semantics and from which one can get non-owning E instances.
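
As a rough illustration of the second option (movable but not copyable, with an explicit clone()), here is a minimal sketch. The names raw_handle, invalid_handle, lowlevel_retain and lowlevel_release, as well as the call counters, are hypothetical stand-ins for the underlying API and are not taken from any real library.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the lower-level API. A real wrapper would call
// the underlying library or OS here; the counters only make the cost visible.
using raw_handle = int;
constexpr raw_handle invalid_handle = -1;
static int g_retain_calls = 0;
static int g_release_calls = 0;
static void lowlevel_retain(raw_handle) { ++g_retain_calls; }
static void lowlevel_release(raw_handle) { ++g_release_calls; }

class E {
public:
    // Acquires one unit of reference count on construction.
    explicit E(raw_handle h) : handle_(h) { lowlevel_retain(handle_); }

    // Copying is disabled, so `e_2 = e_1` does not compile.
    E(const E&) = delete;
    E& operator=(const E&) = delete;

    // Moving transfers the held unit without touching the refcount.
    E(E&& other) noexcept
        : handle_(std::exchange(other.handle_, invalid_handle)) {}
    E& operator=(E&& other) noexcept {
        if (this != &other) {
            reset();
            handle_ = std::exchange(other.handle_, invalid_handle);
        }
        return *this;
    }

    // Gives back the held unit, unless this object was moved from.
    ~E() { reset(); }

    // The expensive operation is spelled out at the call site.
    E clone() const {
        assert(handle_ != invalid_handle);
        return E(handle_);
    }

    raw_handle get() const { return handle_; }

private:
    void reset() {
        if (handle_ != invalid_handle) {
            lowlevel_release(handle_);
            handle_ = invalid_handle;
        }
    }
    raw_handle handle_;
};

static_assert(!std::is_copy_constructible<E>::value, "E must not be copyable");
```

With this sketch, `E b = a;` fails to compile, `E b = std::move(a);` makes no low-level call, and `E c = b.clone();` performs exactly one retain.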

What are the pros and cons you find for each of these options?

Notes:

I used C++ for my pseudocode, and I am writing this in C++, but if you can answer this more generally, please do.
If you need more contextual information to provide an answer, please ask for it, or answer "if X then A, otherwise B".

What exactly does E model? Does it model a reference to an entity, meaning that if I create a copy, it is natural that changes made to one are reflected in the other? Or does it model the entity itself, so that changes made to a copy should not be made to the original? That is very fundamental in deciding to what extent creating copies should/can be supported, and how.
– Bart van Ingen Schenau, Commented Jan 17 at 8:17
"Make class E movable but not (naively) copyable" is the default behaviour in Rust. In fact, Rust has no implicit "non-naive" copies at all, and even naive copies are disabled by default. I strongly suggest you do that.
– freakish, Commented Jan 17 at 8:41
Of course this sounds like a circular problem: how do I implement .clone() then? Rust allows implicit copies of things that explicitly implement the Copy trait (a.k.a. interface). This is just a marker: you don't actually write any copying code for Copy, and such copies are always naive in Rust (byte by byte). This includes primitive types like integers, bools, floats, etc. But the vast majority of types are not Copy, especially types that allocate, like strings or vectors.
– freakish, Commented Jan 17 at 9:17
@einpoklum yes, I understand. What I'm saying is that implicit copies are just problematic, especially in C++, which allows arbitrary code to run under an x = y statement. It's just bad. And explicit .clone() calls are fine: they make you think twice before copying. In Rust you actually have the Arc type (which stands for atomic reference count), which matches your case well: it keeps a ref count, which is bumped on a .clone() call and decreased when the Arc instance goes out of scope (Rust also has RAII, like C++).
– freakish, Commented Jan 17 at 9:23
Please check your assumptions about the cost of reference counting. By measuring :-). For example, reference counting on Apple’s ARM implementation is quite cheap. Cheap enough that you wouldn’t worry about it. Cheap enough to use copy-on-demand in quite simple classes. (Which means that if you modify an object with reference count >= 2, you need to create a copy first.)
– gnasher729, Commented Jan 17 at 11:13
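
The "copy-on-demand" idea in the last comment (more commonly called copy-on-write) can be sketched roughly as follows. This is only an illustration under assumptions: CowText is a hypothetical class, std::shared_ptr stands in for whatever reference counting is actually used, and the use_count() check is only a reliable test in single-threaded code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical value type whose payload is shared until someone writes to it.
class CowText {
public:
    explicit CowText(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // Copies are cheap: they share the payload and bump the refcount.
    CowText(const CowText&) = default;
    CowText& operator=(const CowText&) = default;

    const std::string& read() const { return *data_; }

    // Before modifying, detach if the payload is shared (refcount >= 2).
    void append(const std::string& more) {
        assert(data_ != nullptr);
        if (data_.use_count() > 1) {
            // The deferred deep copy happens only here.
            data_ = std::make_shared<std::string>(*data_);
        }
        data_->append(more);
    }

    // True if both objects currently point at the same payload.
    bool shares_payload_with(const CowText& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::string> data_;
};
```

Copying a CowText only bumps the count; the deep copy is paid for by the first writer, and only if the payload is still shared at that point.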

candied_orange · Accepted Answer · 2025-01-18 21:31:51Z

Reference counting is for counting references, that is, references that point to the same address. If your clone makes a deep copy at a new address, where state can vary independently and which consumes its own memory, then that copy needs its own reference count. In RAII terms, each deep copy is its own resource.

There is a thing called instance counting where deep copies would be counted. It's not the same thing.
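
To make the distinction concrete, here is a small sketch that uses std::shared_ptr as a stand-in for the reference-counted entity. The Counts struct and the function name are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Counts {
    long original;             // owners of the original object
    long deep_copy;            // owners of the deep copy
    bool deep_shares_address;  // does the deep copy live at the same address?
};

inline Counts compare_reference_and_deep_copy() {
    auto original = std::make_shared<std::vector<int>>(3, 42);

    // A second reference: same address, the shared count goes from 1 to 2.
    std::shared_ptr<std::vector<int>> alias = original;

    // A deep copy: new address, independent state, its own count of 1.
    auto deep = std::make_shared<std::vector<int>>(*original);
    deep->push_back(7);  // does not affect *original
    assert(original->size() == 3);

    return Counts{original.use_count(), deep.use_count(),
                  alias.get() == deep.get()};
}
```

Inside the function, the original ends up with two owners, while the deep copy is a separate resource with a single owner of its own.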

edited Jan 18 at 21:31

answered Jan 17 at 12:54

This is true when the reference counting is just for memory management (e.g. ObjC/Swift objects, Rust Rc, C++ std::shared_ptr, etc.). But reference counting could also apply to other, more expensive resources, like inodes on a file system, resources in a kernel, or entries in a DB. These could be out-of-process or even over the network, so they would need special consideration.
– Alexander, Commented Jan 18 at 17:56
@Alexander how would that ever make "refcount increases/decreases as "expensive" actions"? This sounds more like instance counting. Not the same thing.
– candied_orange, Commented Jan 18 at 18:35
Nope, I do mean reference counting. The question of "does anybody still care about this, or can I free/delete/erase it?" generalizes beyond just memory and just a single process. Inodes on a file system are reference counted: when you open a file, the kernel +1s an inode's rc on-disk to ensure it doesn't get deleted from under you, while you still have it open. E.g. this is why you can keep editing a file, even after its last hardlink has been removed from the file system tree. Same idea applies to DBs. Incrementing a reference count on a DB row usually involves making a network request.
– Alexander, Commented Jan 18 at 18:48
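
The file-system behaviour described in this comment can be observed with a short POSIX sketch. It assumes a POSIX system with a writable /tmp; the function name and the file name pattern are made up for the illustration, and the sketch only shows that an open file stays usable after its last name is removed, not how the kernel does its bookkeeping.

```cpp
#include <cassert>    // used by the check in the usage note
#include <cstddef>
#include <stdlib.h>   // mkstemp (POSIX)
#include <string>
#include <unistd.h>   // unlink, write, lseek, read, close

// Returns what could be read back from a file after its only name was removed.
// Returns an empty string if any step fails.
inline std::string read_after_unlink() {
    char path[] = "/tmp/refcount_demo_XXXXXX";
    int fd = mkstemp(path);  // creates and opens a fresh file
    if (fd < 0) return std::string();

    unlink(path);  // remove the last hard link; the open descriptor remains

    const char msg[] = "still here";
    const ssize_t len = static_cast<ssize_t>(sizeof msg - 1);
    std::string result;
    if (write(fd, msg, static_cast<std::size_t>(len)) == len &&
        lseek(fd, 0, SEEK_SET) == 0) {
        char buf[32] = {};
        ssize_t n = read(fd, buf, sizeof buf - 1);
        if (n > 0) result.assign(buf, static_cast<std::size_t>(n));
    }
    close(fd);  // last reference gone: the storage can now be reclaimed
    return result;
}
```

On such a system, `assert(read_after_unlink() == "still here");` is expected to hold: the file has no name left, yet it remains readable and writable until the descriptor is closed.
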
@Alexander none of that is a deep copy. I think two different kinds of counting are being conflated here.
– candied_orange, Commented Jan 18 at 21:10
Oh, I see what happened here. I got confused between what you wrote in the comment above (about RC being less expensive on Apple's M series chips) and what you wrote within this answer. My original point was about RC being cheap, which can be true for in-memory RC, but other kinds (e.g. FS, DB) can still be expensive.
– Alexander, Commented Jan 18 at 22:44
