33

I have a question very similar to

How do I allocate a std::string on the stack using glibc's string implementation?

but I think it's worth asking again.

I want an std::string with local storage that overflows into the free store. std::basic_string provides an allocator as a template parameter, so it seems like the thing to do is to write an allocator with local storage and use it to parameterize the basic_string, like so:

std::basic_string< char, std::char_traits<char>, inline_allocator<char, 10> > x("test"); 

I tried to write the inline_allocator class that would work the way you'd expect: it reserves 10 bytes for storage, and if the basic_string needs more than 10 bytes, then it calls ::operator new(). I couldn't get it to work. In the course of executing the above line of code, my GCC 4.5 standard string library calls the copy constructor for inline_allocator 4 times. It's not clear to me that there's a sensible way to write the copy constructor for inline_allocator.

In the other StackOverflow thread, Eric Melski provided this link to a class in Chromium:

http://src.chromium.org/svn/trunk/src/base/stack_container.h

which is interesting, but it's not a drop-in replacement for std::string, because it wraps the std::basic_string in a container so that you have to call an overloaded operator->() to get at the std::basic_string.

I can't find any other solutions to this problem. Could it be that there is no good solution? And if that's true, then are the std::basic_string and std::allocator concepts badly flawed? I mean, it seems like this should be a very basic and simple use case for std::basic_string and std::allocator. I suppose the std::allocator concept is designed primarily for pools, but I think it ought to cover this as well.

It seems like the rvalue-reference move semantics in C++0x might make it possible to write inline_allocator, if the string library is re-written so that basic_string uses the move constructor of its allocator instead of the copy constructor. Does anyone know what the prospect is for that outcome?

My application needs to construct a million tiny ASCII strings per second, so I ended up writing my own fixed-length string class based on Boost.Array, which works fine, but this is still bothering me.

4
  • 4
    As someone who's wrestled with custom allocators in the past, I'm very interested in hearing an authority speak on this topic. Commented Mar 30, 2011 at 21:48
  • +1 for "free store". Oh, and because it's a good question. Commented Mar 30, 2011 at 22:51
  • 1
    The link to Generic: A Policy-Based basic_string Implementation is terrific discussion of basic_string storage. I still wonder, though, if it is possible to write the inline_allocator class in my question. A brief glance at Alexandresu's article doesn't tell me the answer, I'll have to spend some time with it. Commented Mar 30, 2011 at 23:51
  • The link to Chromium's stack_container.h is dead. Here's an updated one: src.chromium.org/viewvc/chrome/trunk/src/base/containers/… Commented Jul 31, 2013 at 23:56

4 Answers 4

17

Andrei Alexandrescu, C++ programmer extraordinaire who wrote "Modern C++ Design" once wrote a great article about building different string implementations with customizable storage systems. His article (linked here) describes how you can do what you've described above as a special case of a much more general system that can handle all sorts of clever memory allocation requirements. This doesn't talk so much about std::string and focuses more on a completely customized string class, but you might want to look into it as there are some real gems in the implementation.

Sign up to request clarification or add additional context in comments.

1 Comment

That's a great article by Alexandrescu, thanks. This is a very good answer to the question of how to customize storage for basic_string.
10

C++2011 is really going to help you here :)

The fact is that the allocator concept in C++03 was crippled. One of the requirement was that an allocator of type A should be able to deallocate memory from any other allocator from type A... Unfortunately this requirement is also at odds with stateful allocators each hooked to its own pool.

Howard Hinnant (who manages the STL subgroup of the C++ commitee and is implementing a new STL from scratch for C++0x) has explored stack-based allocators on his website, which you could get inspiration from.

3 Comments

That's very interesting. I looked at Hinnant's stack_alloc class and it works splendidly for std::vector but alas, not for std::basic_string. The problem is again, I believe, in the way std::basic_string uses the stack_alloc copy constructor.
And I think it is permitted to use the copy constructor that way because of the requirement you described.
@James: yes, unfortunately since the standard all but imply that the allocators are stateless (or at the very least that all allocators of the same type more or less share the same state, somehow), copies are perfectly allowed.
6

This is generally unnecessary. It's called the "short string optimization", and most implementations of std::string already include it. It may be hard to find, but it's usually there anyway.

Just for example, here's the relevant piece of sso_string_base.h that's part of MinGW:

 enum { _S_local_capacity = 15 }; union { _CharT _M_local_data[_S_local_capacity + 1]; size_type _M_allocated_capacity; }; 

The _M_local_data member is the relevant one -- space for it to store (up to) 15 characters (plus a NUL terminator) without allocating any space on the heap.

If memory serves, the Dinkumware library included with VC++ allocates space for 20 characters, though it's been a while since I looked, so I can't swear to that (and tracking down much of anything in their headers tends to be a pain, so I prefer to avoid looking if I can).

In any case, I'd give good odds that you've been engaged in that all-too-popular pass-time known as premature optimization.

7 Comments

VC++ allocates 16 bytes for the SSO, which in my opinion is too small to be generally useful as std::wstring is more commonly used in Windows code than std::string and that only allows for 8 characters in a std::wstring (or 7 if .c_str() is used frequently).
Doesn't the header belong to MinGW library extensions (not part of std::string)?
@UncleBens: I didn't think so, but I could be mistaken. There are also alternatives (e.g., STLPort, libc++) that definitely do use SSO in their implementations of std::string.
@ildjarn: Looking again, you're quite right, and I agree that is a bit on the small size. I still think it would be easier to modify it and re-compile the standard library than try to hack it on afterwards though.
@James Brock: Surely -- and I never doubted it. As for controlling the threshold, in most cases it's a matter of finding the enum in the header, modifying it as needed, and (possibly) re-compiling the standard library to match (but since it's a template, there's a good chance you only need to modify the header).
|
2

I believe the code from Chromium just wraps things into a nice shell. But you can get the same effect without using the Chromium wrapper container.

Because the allocator object gets copied so often, it needs to hold a reference or pointer to the memory. So what you'd need to do is create the storage buffer, create the allocator object, then call the std::string constructor with the allocator.

It will be a lot wordier than using the wrapper class but should get the same effect.

You can see an example of the verbose method (still using the chromium stuff) in my question about stack vectors.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.