There has been a lot of pushback on questions seeking examples of certain things in different languages, but also a lot of people noting that they see value in surfacing different approaches taken in real-world usage that they haven't encountered. I'd like to try to assemble a consensus "good" question in that vein for this site, or establish that it's not achievable, and to do that I want to try to workshop it into shape.
I have a topic in mind that
- is about semantics, rather than syntax
- has a restricted scope of relevant languages, not just "what are some ..."
- has genuine contrasting examples that I know of.
This draws on a long-standing entry in my list of possible academic studies. All this is to say that I believe the topic of the question is worthy of study, non-trivial, and not endless. I think that lets it sidestep those particular objections to "examples of syntax for X" questions, so the discussion can focus on the actual nature of this type of question. However, I haven't posted it because, frankly, I can't figure out how to frame it as a question that is a good fit for the site.
I'm going to put the direct version below, and invite answers proposing how it could be shaped into a form that is suitable for the site. We might establish that there isn't a suitable version, which will be good to figure out too, or that actually everyone is fine with this version. I want to separate this discussion from thinking about concrete examples on the main site where people have had a more direct involvement already, so it's a fresh question nobody's had any engagement with before.
How have modern languages dealt with (Unicode) strings?
Languages developed over the last fifteen years or so have been well within the "ubiquitous Unicode" era, and have been able to design their string types accordingly.
I'm looking for examples of how different real-world languages with their primary release from 2008-2018 represent textual strings, how access within strings (e.g. indexing or iteration) behaves at the language level, how string equality works, any performance or semantic tradeoffs made within that, and where appropriate how these choices have been received by programmers in the time since.
Relevant aspects of Unicode itself might include Unicode transformation formats, normalisation, codepoints, code units, and grapheme clusters; language syntax is relevant only as far as it's supporting the semantics.
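To make those terms concrete, here is a minimal sketch in Rust, chosen purely for illustration and using only the standard library. The helper `measure` is a hypothetical name of my own, not part of any API. It takes the two canonically equivalent spellings of "é" (precomposed U+00E9, and "e" followed by combining U+0301) and counts UTF-8 bytes, Unicode scalar values, and UTF-16 code units for each.

```rust
/// Hypothetical helper for illustration only.
/// Returns (UTF-8 bytes, Unicode scalar values, UTF-16 code units) for `s`.
fn measure(s: &str) -> (usize, usize, usize) {
    (s.len(), s.chars().count(), s.encode_utf16().count())
}

fn main() {
    // Two canonically equivalent spellings of the same user-perceived character.
    let precomposed = "\u{e9}"; // U+00E9
    let decomposed = "e\u{301}"; // U+0065 followed by combining U+0301

    for &(label, s) in &[("precomposed", precomposed), ("decomposed", decomposed)] {
        let (bytes, scalars, utf16) = measure(s);
        println!(
            "{}: {} bytes, {} scalar values, {} UTF-16 code units",
            label, bytes, scalars, utf16
        );
    }
    // precomposed: 2 bytes, 1 scalar values, 1 UTF-16 code units
    // decomposed: 3 bytes, 2 scalar values, 2 UTF-16 code units

    // `==` on `&str` compares the underlying bytes; no normalisation is applied.
    println!("equal: {}", precomposed == decomposed); // equal: false
}
```

Both strings are a single grapheme cluster, yet every count above differs between them and the equality test fails. Rust's standard library does not segment grapheme clusters at all, so that count is absent here. These are exactly the kinds of choices I would expect answers to compare across languages.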
For context on answers here: at the very least Rust, Swift, Go, and Raku all meet these criteria, all have deliberately addressed Unicode in their native string type, and all have made very different choices from one another within a similar timeframe. There are certainly real answers to this question that have substance. What I want to establish here is a consensus on how to elicit useful answers with a question like this, or to establish that it just can't be done.