WAAVScript – Part 3, What’s an interned name?
Posted: August 28, 2025 Filed under: Uncategorized | Tags: c++, coding, parsing, programming, string-interning, waavscript
This is perhaps a bit of an aside, but a worthwhile exploration of what makes one implementation more performant than another.
The setup is this. In any language, there are ‘keywords’: things like ‘for’, ‘while’, ‘add’, ‘break’, etc. In the context of WAAVScript, there is a list of fixed words, and each word has behavior associated with it. Pretty early on, I had to decide how I was going to associate those words with their behaviors.
There are two places where this association matters. At startup time, the keywords need to be associated with the appropriate function that implements the behavior. Then, when scanning user input, the names need to be isolated and used as a key to look up the behavior associated with the name. Perhaps the easiest way to do this is to simply use std::string as a key into a dictionary. Just create a std::unordered_map<std::string, behavior> structure, where ‘behavior’ is some sort of pointer to a function. In the first iteration of the WAAVScript interpreter, that’s exactly what I did. It’s super fast to code it up. You don’t worry about memory allocations, and magic just happens. But, it’s pretty darned slow in execution speed. It’s not the creation of the string objects themselves, it’s how they are used in the dictionary lookup that makes things slow.
In the second iteration, I used OctetCursor as the ‘key’ instead of std::string. That’s actually faster, because I don’t have to re-allocate, because the span points to permanent memory (while the program is loaded). You still have to implement a hash function, and the spans might be compared by content when multiple things end up in the same bucket. This method is faster than std::string, but still not quite as fast as it can be. So, where’s the hangup really?
Let’s dissect the typical dictionary lookup.
std::unordered_map<key, value>
Typically implemented as some sort of hash table. That means there are ‘buckets’, which contain things that are similar on the surface (landing them in the same bucket). Then, once you find the right bucket, you need to sift through the entries in the bucket, looking for an exact match. So, doing something as simple as finding whether a key even exists in a map, the machinery will:
1. Take the key, and create a hash value for it
2. Use that hashed key to find a bucket where the key might live (typically hashValue % numBuckets)
3. Once on the bucket, if there’s only one value, compare the content to see if the key matches
4. If the key matched the entry, return ‘true’, value found
5. If not, and there are more entries, repeat step 3 until found
It’s at step 3, when you’re comparing the values, that things slow down. It’s the content comparison that kills it. Wouldn’t it be great if you could just compare pointer values, and be done with it? Yes, but where can we get these reliable pointers for comparison?
In steps the ‘interned string’. What we want is a system where once a ‘name’ is identified, we can just use a universal pointer that represents the string for comparisons. We don’t want to do content comparisons if we can avoid it. The ability to represent a string in this way is typically known as “string interning”, and it’s a fairly common technique, used by most modern language interpreters.
For example, I’d like to do the following:
const char *a = INTERN("hello");
const char *b = INTERN("hello");
printf("A == B: %d", a == b);  // : 1

The imaginary “INTERN” function takes a string value, and creates a reliable pointer that represents that sequence of bytes. If you then do the same INTERN thing, somewhere else in your program, you will get the exact same pointer value, so you can compare the two pointers for equivalence. This is great for dictionaries, because you can use a pointer as the key, rather than a semantic ‘string’.
The implementation details are where the magic happens of course. For this, I go with a straight up C implementation, relying on nothing from the C++ world. The goal is to have a nametable mechanism which can be used at a global application level to provide this interning service. It’s a fairly simple requirement. When I intern a string, I want to get the same ‘char *’ pointer value whenever I use the same sequence of bytes. The interface looks like this.
#ifdef __cplusplus
extern "C" {
#endif

typedef struct WSNameTable WSNameTable;

// Convenience function to iterate over all names in the global name table.
// Return 0 to continue; any non-zero value aborts the walk and is returned by foreach.
typedef int (*ws_name_iter_fn)(const char* name, size_t len, void* user);

// Instance API
static inline int ws_name_table_init(WSNameTable* t, size_t capacity);
static inline WSNameTable* ws_name_table_create(void);
static inline void ws_name_table_destroy(WSNameTable* t);
static inline void ws_name_table_clear(WSNameTable* t);
static inline int ws_name_table_reserve(WSNameTable* t, size_t expected_count);
static inline const char* ws_name_table_lookup_len(const WSNameTable* t, const char* ptr, size_t len);
static inline size_t ws_name_table_capacity(const WSNameTable* t);
static inline size_t ws_name_table_size(const WSNameTable* t);
static inline double ws_name_table_load(const WSNameTable* t);

// Intern
static inline const char* ws_name_table_intern_len(WSNameTable* t, const char* ptr, size_t len);
static inline const char* ws_name_table_intern(WSNameTable* t, const char* cstr);
static inline int ws_name_table_intern_many(WSNameTable* t, const char* const* names, size_t count);

// Global singleton convenience
static inline const char* ws_name_intern_len(const char* ptr, size_t len);
static inline const char* ws_name_intern(const char* cstr);
static inline int ws_name_table_foreach(const WSNameTable* t, ws_name_iter_fn fn, void* user);
static inline void ws_name_table_global_shutdown(void);

#ifdef __cplusplus
} // extern "C"
#endif

The real working end of this API is the ‘ws_name_intern’ function. Everything else can pretty much be ignored. With this interface, we can perform some simple tests, like the pseudo-code above.
// Using the globally accessible singleton name table
const char* literal1 = ws_name_intern("literal1");
const char* literal2 = ws_name_intern("literal2");
const char* literal3 = ws_name_intern("literal3");

const char* literal11 = ws_name_intern("literal1");
const char* literal21 = ws_name_intern("literal2");
const char* literal31 = ws_name_intern("literal3");

printf(" literal1: %p\n", literal1);
printf("literal11: %p\n", literal11);
printf(" literal2: %p\n", literal2);
printf("literal21: %p\n", literal21);
printf(" literal3: %p\n", literal3);
printf("literal31: %p\n", literal31);

When you run that code, all the matching pairs of pointers will be the same.
If you want to bulk load a name table, you could do the following.
int init_ops_table(WSNameTable* nt, const char* const names[], size_t n)
{
    if (ws_name_table_init(nt, 0) != 0) return -1;
    return ws_name_table_intern_many(nt, names, n);
}

static int dump_name(const char* name, size_t len, void* user)
{
    (void)user;
    printf("%.*s\n", (int)len, name);
    return 0;   // keep going
}

// A function that uses ws_name_table_foreach to visit each name
// and print its value.
void print_all(const WSNameTable* t)
{
    (void)ws_name_table_foreach(t, dump_name, NULL);
}

// Create a bulk name table
WSNameTable g_ops;

static const char* const k_builtin_names[] = {
    "moveto", "lineto", "curveto", "closepath", "stroke",
    "fill", "show", "matrix", "scale", "translate"
};

void load_builtins(void)
{
    init_ops_table(&g_ops, k_builtin_names,
                   sizeof(k_builtin_names) / sizeof(k_builtin_names[0]));
    print_all(&g_ops);
}

That shows how you can create a name table, fill it with names, then execute some function against each of the names in the table. Keep in mind, this is just the name table. This is not the dictionary or any other higher level construct. We’re just creating a reliable pointer from a sequence of bytes, and that’s it. The real value is shown when we use these reliable pointers as keys in a dictionary.
So, what’s going on under the hood here?
Let’s look at the data structures and some helper functions.
#ifndef WS_NAME_TABLE_INITIAL_CAPACITY
#define WS_NAME_TABLE_INITIAL_CAPACITY 1024u   // power of two
#endif

#define WS_NAME_TABLE_GROW_THRESHOLD(cap) ((cap) - ((cap) >> 2))  // 0.75

// A single entry stores the pointer to a copy of the
// original value, as well as the length.
// It also stores the hash value generated for that sequence of bytes.
typedef struct WSNT_Entry {
    const char* str;   // NULL == empty
    size_t len;
    uint64_t hash;
} WSNT_Entry;

// Structure for the name table itself.
struct WSNameTable {
    WSNT_Entry* buckets;
    size_t capacity;
    size_t size;
};

// 64-bit FNV-1a hash function for strings.
// This particular function has a good spread of bits
// for typical string input.
static inline uint64_t wsnt_hash(const void* data, size_t len)
{
    const unsigned char* p = (const unsigned char*)data;
    uint64_t h = 14695981039346656037ULL;   // FNV-1a 64-bit offset basis
    for (size_t i = 0; i < len; i++) {
        h ^= (uint64_t)p[i];
        h *= 1099511628211ULL;              // FNV-1a 64-bit prime
    }
    return h;
}

// wsnt_next_pow2
//
// Figure out the next power of two >= n.
// We use this when calculating the next size up
// when growing the table.
static inline size_t wsnt_next_pow2(size_t n)
{
    if (n < 2) return 2;
    n--;
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
#if SIZE_MAX > 0xFFFFFFFFu
    n |= n >> 32;
#endif
    n++;
    return n;
}

// wsnt_strndup
//
// Allocate a new string, copy n bytes from s, and null-terminate it.
static inline char* wsnt_strndup(const char* s, size_t n)
{
    char* p = (char*)malloc(n + 1);
    if (!p) return NULL;
    if (n) memcpy(p, s, n);
    p[n] = '\0';
    return p;
}

// wsnt_rehash
//
// Rehash the table to a new capacity, copying existing entries.
static inline int wsnt_rehash(WSNameTable* t, size_t new_cap)
{
    size_t cap = wsnt_next_pow2(new_cap);
    WSNT_Entry* nb = (WSNT_Entry*)calloc(cap, sizeof(WSNT_Entry));
    if (!nb) return 0;

    for (size_t i = 0; i < t->capacity; i++) {
        const char* s = t->buckets[i].str;
        if (!s) continue;
        uint64_t h = t->buckets[i].hash;
        size_t idx = (size_t)h & (cap - 1);
        while (nb[idx].str) idx = (idx + 1) & (cap - 1);
        nb[idx] = t->buckets[i];
    }

    free(t->buckets);
    t->buckets = nb;
    t->capacity = cap;
    return 1;
}

// wsnt_maybe_grow
//
// Grow the table if needed, doubling capacity if size exceeds the threshold.
static inline int wsnt_maybe_grow(WSNameTable* t)
{
    if (t->size >= WS_NAME_TABLE_GROW_THRESHOLD(t->capacity)) {
        size_t nc = t->capacity << 1;
        if (!wsnt_rehash(t, nc)) return 0;
    }
    return 1;
}

A name table is a very simple collection (a set) of WSNT_Entry structures. It’s held in memory as a contiguous array of such structures. Each entry is 24 bytes long, which isn’t ideal for cache friendliness, but it’s fairly minimal. We just store a pointer, a length, and a hash value. This may change in the future, either to get down to 16 bytes, or grow to 32 bytes. Either way would be ok.
The WSNameTable structure itself is just a pointer to the contiguous memory that contains a bunch of these entries. It also contains the capacity (how many entries can be in the table), and the size (the actual number of entries). Alongside those are the hash function, next_pow2, and strndup helpers; the remaining two, rehash and maybe_grow, come into play when inserting.
Given all this, how do you insert a name into the table?
Let’s first look at the ws_name_table_init function.
static inline int ws_name_table_init(WSNameTable* t, size_t capacity)
{
    if (!t) return -1;  // ERROR

    // Initialize to safe values in case of failure.
    t->buckets = NULL;
    t->capacity = 0;
    t->size = 0;

    // Ensure the capacity is a power of two, at least 2.
    // If '0' is passed, use the default initial capacity.
    size_t cap;
    if (capacity == 0) {
        cap = WS_NAME_TABLE_INITIAL_CAPACITY;
    } else {
        cap = wsnt_next_pow2(capacity);
        if (cap < 2) cap = 2;
    }

    t->buckets = (WSNT_Entry*)calloc(cap, sizeof(WSNT_Entry));
    if (!t->buckets) return -1;  // WSResultCode::WS_ERROR;

    t->capacity = cap;
    t->size = 0;
    return 0;  // WSResultCode::WS_SUCCESS;
}

To initialize a table, we do an initial allocation of space. The capacity is set, and the size is zero.
Now we come to the heart of the table: inserting a value.
// ws_name_table_intern_len
//
// Intern a string in the name table, returning a stable pointer to it.
// If the string already exists, returns the existing pointer.
// If the string does not exist, it is added to the table.
//
static inline const char* ws_name_table_intern_len(WSNameTable* t, const char* ptr, size_t len)
{
    if (!t) return NULL;
    if (!ptr) { ptr = ""; len = 0; }

    // get a hash value for the string as a starting point
    uint64_t h = wsnt_hash(ptr, len);

    // use that hash as a rough starting point to figure out where to look
    size_t idx = (size_t)h & (t->capacity - 1);

    // probe for existing
    for (;;) {
        WSNT_Entry* e = &t->buckets[idx];
        if (!e->str) break;
        if ((e->hash == h) && (e->len == len) && (memcmp(e->str, ptr, len) == 0))
            return e->str;
        idx = (idx + 1) & (t->capacity - 1);
    }

    // insert
    if (!wsnt_maybe_grow(t)) return NULL;
    idx = (size_t)h & (t->capacity - 1);
    while (t->buckets[idx].str) idx = (idx + 1) & (t->capacity - 1);

    // duplicate the string for permanent storage
    char* dup = wsnt_strndup(ptr, len);
    if (!dup) return NULL;

    t->buckets[idx].str = dup;
    t->buckets[idx].len = len;
    t->buckets[idx].hash = h;
    t->size++;

    return dup;
}

This is a so-called “open addressing, linear probe” algorithm. The general idea is: get a hash value for the input key. The particular hash function here needs to be very fast, and have a fairly low probability of collisions. The hash function used here is a variant of FNV-1a, which has exactly the properties we want. We then use that hashed value to do a quick lookup to see if the value is already stored. But, we don’t have 64 bits worth of buckets; we actually limit ourselves to a more manageable size that was set when the table was constructed. So, to find the right ‘bucket’, we do a little transformation.
size_t idx = (size_t)h & (t->capacity - 1);

// probe for existing
for (;;) {
    WSNT_Entry* e = &t->buckets[idx];
    if (!e->str) break;
    if ((e->hash == h) && (e->len == len) && (memcmp(e->str, ptr, len) == 0))
        return e->str;
    idx = (idx + 1) & (t->capacity - 1);
}

‘idx’ will be in the range 0..capacity-1. So, we go to that entry. If we find something there, we compare the hash value, and if that matches, we then compare the actual contents just to be doubly sure we’re talking about the same thing. If all that is true, we return the stable pointer that represents that sequence of bytes.
If we did not find anything there, then we need to make an entry.
// insert
if (!wsnt_maybe_grow(t)) return NULL;
idx = (size_t)h & (t->capacity - 1);
while (t->buckets[idx].str) idx = (idx + 1) & (t->capacity - 1);

// duplicate the string for permanent storage
char* dup = wsnt_strndup(ptr, len);
if (!dup) return NULL;

t->buckets[idx].str = dup;
t->buckets[idx].len = len;
t->buckets[idx].hash = h;
t->size++;

return dup;

First, we check to see if the table should grow. We use a 75% capacity rule for this (the WS_NAME_TABLE_GROW_THRESHOLD macro). If the table is 75% full when an insert wants to happen, the table will double in size, strings will be spread around to new buckets, and we can then add the new entry.
This time around, we search for a bucket. If we find a bucket, but there’s already a value in that spot, we just roll forward, looking for the next available spot. This is the ‘linear probe’ portion of the algorithm.
Once we find a spot, we alter the entry there, by first creating a duplicate string (this becomes the permanent pointer), setting the length, and the hash value.
Finally, we return the new stable pointer, and we’re done.
The lookup is pretty straightforward, as it just performs half the work of the insert.
// ws_name_table_lookup_len
//
// Lookup a name in the table by its string content (ptr+len).
// Returns NULL if not found.
static inline const char* ws_name_table_lookup_len(const WSNameTable* t, const char* ptr, size_t len)
{
    if (!t) return NULL;
    if (!ptr) { ptr = ""; len = 0; }

    uint64_t h = wsnt_hash(ptr, len);
    size_t idx = (size_t)h & (t->capacity - 1);

    for (;;) {
        WSNT_Entry* e = &t->buckets[idx];
        if (!e->str) return NULL;
        if (e->hash == h && e->len == len && memcmp(e->str, ptr, len) == 0)
            return e->str;
        idx = (idx + 1) & (t->capacity - 1);
    }
}

Basically: perform the same hash, and go to the expected bucket. If the key is not found there, or in subsequent linear probing, return NULL. Otherwise, return the stable pointer value.
You might be thinking, “but you’re doing a memcmp every time you insert or search”. That’s true. So let’s look back at the purpose of this thing.
The purpose of the name table is to turn a sequence of bytes into a stable pointer which can be used for pointer value comparisons later. The insert and search operations will typically only occur at program setup time. The vast majority of usage is going to be in simple pointer comparisons in the context of a dictionary structure.
So yes, a bit slower on insert, but worth the initial cost because we can now do simple pointer comparisons everywhere else with the knowledge that if two pointers are equivalent, then they represent the exact same sequence of bytes. This can be extended to cover any data structures, as it’s just pointer and length. No null terminated strings implied.
And that’s it for string interning. Next time around, we can look at some more core data structures, and how these interned strings are used in the context of a dictionary, which is another key data structure.
WAAVScript – Part 2, Scanning for Tokens
Posted: July 15, 2025 Filed under: Uncategorized | Tags: AI, c/c++, coding, lexeme, postscript, programming, scanner, software, waavscript
A journey of a thousand programs begins with the first scanner.
WAAVScript is an interpreted language. That means there is a runtime that looks at input text, and tries to figure out the bits and pieces you’re trying to execute right then and there. Scanning text is the cornerstone of how performant this process will be, so it has to be super blazing fast, not use up too much memory, and deal with all the little corner cases that might arise.
WAAVScript has a few core data types and constructs to deal with, which can be roughly summarized by the following actual WAAVScript code:
% This is a comment
/Name1                   % Literal name
1234                     % Integer number
-56.78                   % Real number
+1.23e-2                 % Scientific notation
(string foo bar)         % String
{                        % Procedure begin
  /x 42 def              % Nested tokens
}                        % Procedure end
[1 2 3]                  % Array
!@#$                     % Misc delimiters or invalid tokens
Name2                    % Executable name
<< /key (value) >>       % Dictionary syntax (if supported)
()                       % Empty string
(abc\(def\))             % Escaped parens
(abc\\)                  % Backslash
(abc\012def)             % Octal (\012 = newline)
(abc\ def)               % Line continuation
(nested (deeper (even))) % Nested levels

% Radix number scanning
2#1010                   % -> 10
10#123                   % -> 123
10#-456                  % -> -456
16#DEAD                  % -> 57005
8#007                    % -> 7
36#Z                     % -> 35
10#                      % invalid - no value part
1#123                    % invalid - radix out of range
2#102                    % invalid - digit not in base

% goes at the end, because anything after is just part of the string
(unterminated string     % Should return UnterminatedString

The WAAVScript implementation is structured to be modular and expandable. In addition, it follows the design philosophy of ‘zero allocation’, which really means: save allocations for as late as possible, when you really need them. The first step along this path is the code that turns this input into labeled chunks that a higher-level thing can deal with.
The first file to look at is ps_lexer.h
The first thing to look at is the data type that represents a ‘lexeme’, which is simply a span of the stream that we have labeled as being a specific type.
enum class PSLexType {
    Invalid = 0,
    Whitespace,
    Name,               // name without leading /, e.g. moveto
    LiteralName,        // /name with leading /, e.g. /moveto
    SystemName,         // //moveto
    Number,
    String,
    UnterminatedString,
    HexString,
    LBRACE,             // {
    RBRACE,             // }
    LBRACKET,           // [
    RBRACKET,           // ]
    LLANGLE,            // for <<
    RRANGLE,            // for >>
    Comment,
    DSCComment,         // %%DSCKeyword value
    Delimiter,
    EexecSwitch,        // eexec switch
    Eof
};

struct PSLexeme {
    PSLexType type = PSLexType::Invalid;
    OctetCursor span;   // points to original data slice
};

PSLexeme, that’s all you really need. You want to know the type, and you want to know the span of memory where you saw that type. Going further down the rabbit hole, OctetCursor is simply a representation of a start/end memory location. This introduces a design constraint from the beginning. It’s 2025. Memory is relatively cheap, and memory mapped files are ubiquitous. Although WAAVScript has support for files within the language itself, the implementation only deals with these OctetCursors. You can construct them from anywhere. In the past, when talking about svgandme, I’ve used ByteSpan as this basic memory traversing object. In the case of WAAVScript, I chose to switch things up a bit, slim it way down, and only put in what is actually needed, while also changing the name to be more specific about what it traverses. And ‘octet’ is explicitly 8-bit bytes. In modern computing, we don’t really think a byte could be anything but 8 bits, but there was a time, and there still are machines, where this is not the case, so better to be explicit.
Here is a brief look at OctetCursor from ocspan.h
namespace waavs {
    // OctetCursor
    // A cursor over a set of 8-bit bytes.
    // We are restricting to C++17, so limited in the usage of std::length() and std::string_view.
    //
    struct OctetCursor {
        const uint8_t* fStart;  // Pointer to the start of the span
        const uint8_t* fEnd;    // Pointer to the end of the span

        constexpr OctetCursor() noexcept : fStart(nullptr), fEnd(nullptr) {}

        constexpr OctetCursor(const void* start, size_t len) noexcept
            : fStart(static_cast<const uint8_t*>(start))
            , fEnd(fStart + len)
        {
        }

        // Non-constexpr constructor for runtime C-strings
        //   OctetCursor c1("Hello, World!");
        OctetCursor(const char* cstr)
            : fStart(reinterpret_cast<const uint8_t*>(cstr)),
              fEnd(reinterpret_cast<const uint8_t*>(cstr) + std::strlen(cstr))
        {
        }

        // constexpr constructor for string literals
        //   constexpr OctetCursor c2("Hello");   // uses template constructor
        //   static_assert(c2.size() == 5, "Size should be 5");
        //
        //template <size_t N>
        //constexpr OctetCursor(const char(&str)[N])
        //    : fStart(reinterpret_cast<const uint8_t*>(str)),
        //      fEnd(reinterpret_cast<const uint8_t*>(str + N - 1))
        //{ // exclude null terminator
        //}

        constexpr size_t size() const noexcept { return fEnd - fStart; }
        constexpr bool empty() const noexcept { return fStart == fEnd; }
        constexpr const uint8_t* data() const noexcept { return fStart; }
        constexpr const uint8_t* begin() const noexcept { return fStart; }
        constexpr const uint8_t* end() const noexcept { return fEnd; }

        // Subtle error is swallowed: if the cursor is at the end, return 0.
        constexpr uint8_t operator*() const noexcept { return (fStart < fEnd) ? *fStart : 0; }

        // pre-increment operator
        // This operator will increment the cursor first, then return a reference to the incremented object.
        constexpr OctetCursor& operator++() noexcept { if (fStart < fEnd) ++fStart; return *this; }

        // post-increment is more than just syntactic sugar.
        // There is a subtle difference between pre-increment and post-increment.
        // The post-increment operator returns a copy of the object before it was incremented.
        // We're not going to implement this for now, as we want to force a clear decision about
        // the semantics at the time of usage, and we'll start with only pre-increment,
        // because that's the less expensive operation.
        //constexpr OctetCursor operator++(int) noexcept
        //{
        //    OctetCursor temp = *this;       // Copy the current object
        //    if (fStart < fEnd) ++fStart;    // Increment the cursor
        //    return temp;                    // Return the copy before the increment
        //}

        // Helpers

        // peek()
        // Peek at a byte at a particular offset, without advancing the cursor.
        // Subtle error is swallowed: if offset is out of bounds, return 0.
        constexpr uint8_t peek(size_t offset = 0) const noexcept { return (fStart + offset < fEnd) ? fStart[offset] : 0; }

        // skip()
        // Advance the cursor by n bytes, but not beyond the end of the span.
        constexpr void skip(size_t n) noexcept { fStart = (fStart + n <= fEnd) ? fStart + n : fEnd; }
    };

    // Enforce some constraints
    ASSERT_MEMCPY_SAFE(OctetCursor);
    ASSERT_POD_TYPE(OctetCursor);
    ASSERT_STRUCT_SIZE(OctetCursor, 16);

    inline bool same_span(const OctetCursor& lhs, const OctetCursor& rhs) noexcept
    {
        return (lhs.fStart == rhs.fStart) && (lhs.fEnd == rhs.fEnd);
    }

    // Content comparison between two cursors
    inline bool operator==(const OctetCursor& lhs, const OctetCursor& rhs) noexcept
    {
        size_t len = lhs.size();
        return len == rhs.size() && std::memcmp(lhs.fStart, rhs.fStart, len) == 0;
    }

    inline bool operator!=(const OctetCursor& lhs, const OctetCursor& rhs) noexcept { return !(lhs == rhs); }

    // Comparing to a null-terminated c-string
    inline bool operator==(const OctetCursor& a, const char* b) noexcept
    {
        if (!b) return a.size() == 0;
        size_t len = std::strlen(b);
        return a.size() == len && std::memcmp(a.fStart, b, len) == 0;
    }

    inline bool operator==(const char* b, const OctetCursor& a) noexcept { return a == b; }
    inline bool operator!=(const OctetCursor& a, const char* b) noexcept { return !(a == b); }
    inline bool operator!=(const char* b, const OctetCursor& a) noexcept { return !(a == b); }
}

What to notice: the representation of a span of memory is ‘fStart, fEnd’, just two pointers. On most machines these days, this will take up 16 bytes (two pointers of 8 bytes each). I could have chosen to go with a ‘start/length’ representation, but for the most typical usage pattern, just maintaining pointers is much easier. The only time you need the length is when you call size(), which isn’t that often, and when it does occur, it’s just a simple subtraction of pointers.
OK, so why is it called a ‘cursor’? Because the intention is to move it along the piece of memory. The OctetCursor does NOT own the memory it’s pointing at. Something else entirely is in charge of the memory we’re looking at. The OctetCursor only manages moving along based on various methods, and returning what’s at the current location. There are exceptions like a ‘peek’ which gives you a look ahead. But, basically, this little bit of code replaces a typical C/C++ idiom that looks like this:
char* start = dataPtr;
while (start < endPtr) {
    // do something with start
    start++;
}

There’s nothing wrong with that kind of code, and the idiom is used internally, but it’s useful to maintain the state, perhaps between functions, and managing that data by hand can become tedious, so it’s just wrapped up in here.
OK, so those are the basics. We know we want to go through a span of bytes, breaking out individual lexemes, where we have captured the span, and given a label as to what kind of thing it is. The function responsible for all that is nextPSLexeme(), and this is the most low level core workhorse routine of the whole interpreter. It can be found in the ps_lexer.h file.
// nextPSLexeme
// Return the next lexically significant token from the input stream.
// Updates the source cursor to point to the next position after the token.
//
// This is pretty low level. It will do things like isolate a number, but
// won't actually give you the decimal value for that number
//
static bool nextPSLexeme(OctetCursor& src, PSLexeme& lex) noexcept
{
    using CC = PSCharClass;

    // Skip whitespace
    skipWhile(src, PS_WHITESPACE);

    // skip null bytes
    while (!src.empty() && *src.begin() == 0) { src.skip(1); }

    if (src.empty()) {
        lex.type = PSLexType::Eof;
        lex.span = src;
        return false;
    }

    const uint8_t* start = src.begin();
    uint8_t c = *start;

    // Handle comments (consume to end of line)
    if (CC::isCommentStart(c)) { return scanCommentLexeme(src, lex); }

    // Literal name: starts with '/'
    if (c == '/') { return scanLiteralNameLexeme(src, lex); }

    // Procedure delimiters
    if (c == '{') { ++src; lex.type = PSLexType::LBRACE; lex.span = OctetCursor(start, 1); return true; }
    if (c == '}') { ++src; lex.type = PSLexType::RBRACE; lex.span = OctetCursor(start, 1); return true; }

    // Array delimiters
    if (c == '[') { ++src; lex.type = PSLexType::LBRACKET; lex.span = OctetCursor(start, 1); return true; }
    if (c == ']') { ++src; lex.type = PSLexType::RBRACKET; lex.span = OctetCursor(start, 1); return true; }

    // Strings
    if (c == '(') { return scanStringLexeme(src, lex); }

    // Dictionary start or hex string
    // Hexadecimal strings (starts with '<')
    //   <45365664>
    // or Dictionary
    //   << /key1 (value1) /key2 <value2> >>
    if (c == '<') {
        const uint8_t* p = src.begin();
        if (src.size() >= 2 && src.peek(1) == '<') {
            // DictBegin: <<
            lex.type = PSLexType::LLANGLE;
            lex.span = OctetCursor(p, 2);
            src.skip(2);
            return true;
        }
        else {
            // HexString: <...>
            ++src;  // skip '<'
            const uint8_t* strStart = src.begin();
            const uint8_t* q = strStart;
            const uint8_t* end = src.end();
            while (q < end && *q != '>') ++q;

            lex.type = (q < end) ? PSLexType::HexString : PSLexType::UnterminatedString;
            lex.span = OctetCursor(strStart, q - strStart);
            if (q < end) ++q;  // skip '>'
            src.fStart = q;
            return true;
        }
    }

    // End of Hexstring, or Dictionary end (>>)
    if (c == '>') {
        const uint8_t* p = src.begin();
        if (src.size() >= 2 && src.peek(1) == '>') {
            // DictEnd: >>
            lex.type = PSLexType::RRANGLE;
            lex.span = OctetCursor(p, 2);
            src.skip(2);
            return true;
        }
        else {
            // Single '>' (possibly malformed)
            lex.type = PSLexType::Delimiter;
            lex.span = OctetCursor(p, 1);
            src.skip(1);
            return true;
        }
    }

    if (c == '+' || c == '-') {
        uint8_t next = src.peek(1);
        if (CC::isDigit(next)) {
            // +5 or -3 etc.
            return scanNumberLexeme(src, lex);
        }
        if (next == '.') {
            uint8_t next2 = src.peek(2);
            if (CC::isDigit(next2)) {
                // +.5 or -.7
                return scanNumberLexeme(src, lex);
            }
        }
        // lone + or - or +name
        return scanNameLexeme(src, lex);
    }

    if (c == '.') {
        uint8_t next = src.peek(1);
        if (CC::isDigit(next)) {
            // .5
            return scanNumberLexeme(src, lex);
        }
        // .foo
        return scanNameLexeme(src, lex);
    }

    if (CC::isDigit(c)) { return scanNumberLexeme(src, lex); }

    // Name token (default)
    if (CC::isNameChar(c)) {
        scanNameLexeme(src, lex);
        return true;
    }

    // Fallback: single-character delimiter
    ++src.fStart;
    lex.type = PSLexType::Delimiter;
    lex.span = OctetCursor(start, 1);
    return true;
}
} // namespace waavs

This is not the entirety of the file, but it is the main working loop.
It’s a pretty simple interface really.
static bool nextPSLexeme(OctetCursor& src, PSLexeme& lex);

That is, call nextPSLexeme, passing in an OctetCursor. It will scan through the input, figure out what the next lexeme is that it sees, set the type, set the span, and return true if it found something valid. If it doesn’t find anything valid, or has reached the end of the input, it will return false.
And that’s it.
There are some things of note here. First is, if you’re a language compiler developer, you might look at this code and scoff with contempt. It probably doesn’t follow the canonical way of language production. But why not? First and foremost, I’m not a language compiler developer. I’m an experienced programmer who’s trying to build a small, tight, lightweight, easily maintained scanner. Second, this kind of thing gets you up and running quickly, and you can tweak over time, until you get to the point where you might then employ more classical language development tools. For now we’ll leave it like this.
Of note, once the beginning of something is recognized, like ‘(‘ being the start of a string, a helper function is used to finish reading the span for that type. Also of note, we are scanning the boundaries of strings, but not actually converting to a numeric type. This is intentional, because as soon as we introduce specific types, we’ll need some sort of variant to carry different kinds of values using the same data structure. That’s a higher level concern, that we’re not concerned with right now.
Alright, so we’ve got this nextPSLexeme thing, what to do with it. Before we get too far, let’s introduce one more convenience.
namespace waavs {
    struct PSLexemeGenerator {
        OctetCursor src;

        PSLexemeGenerator(OctetCursor input) : src(input) {}

        bool next(PSLexeme &lex) { return nextPSLexeme(src, lex); }

        void setCursor(OctetCursor input) { src = input; }
        bool getCursor(OctetCursor &out) const { out = src; return true; }
    };
}

With that, we can now introduce a bit of test code to show how it’s used.
std::unordered_map<PSLexType, const char*> lexTypeNames = {
    {PSLexType::Invalid, "Invalid"},
    {PSLexType::Whitespace, "Whitespace"},
    {PSLexType::Name, "Name"},
    {PSLexType::LiteralName, "LiteralName"},
    {PSLexType::SystemName, "SystemName"},
    {PSLexType::Number, "Number"},
    {PSLexType::String, "String"},
    {PSLexType::UnterminatedString, "UnterminatedString"},
    {PSLexType::HexString, "HexString"},
    {PSLexType::LBRACE, "LBRACE"},
    {PSLexType::RBRACE, "RBRACE"},
    {PSLexType::LBRACKET, "LBRACKET"},
    {PSLexType::RBRACKET, "RBRACKET"},
    {PSLexType::LLANGLE, "LLANGLE"},
    {PSLexType::RRANGLE, "RRANGLE"},
    {PSLexType::Comment, "Comment"},
    {PSLexType::DSCComment, "DSCComment"},
    {PSLexType::Delimiter, "Delimiter"},
    {PSLexType::EexecSwitch, "EexecSwitch"},
    {PSLexType::Eof, "Eof"}
};

void printLexeme(const PSLexeme& lex) {
    const char* typeName = "UNKNOWN";
    auto it = lexTypeNames.find(lex.type);
    if (it != lexTypeNames.end()) {
        typeName = it->second;
    }
    printf("Lexeme Type: %2d %16s VALUE: %.*s\n",
        (int)lex.type, typeName,
        (unsigned int)lex.span.size(), lex.span.data());
}

static void test_lexgen(OctetCursor s) {
    PSLexemeGenerator gen(s);
    PSLexeme lexeme;
    while (gen.next(lexeme)) {
        printLexeme(lexeme);
    }
}

static void test_lexgenfile(const char *filename) {
    auto mapped = MappedFile::create_shared(filename);
    if (nullptr == mapped) return;
    OctetCursor s(mapped->data(), mapped->size());
    test_lexgen(s);
    mapped->close();
}

int main(int argc, char *argv[]) {
    if (argc >= 2) {
        test_lexgenfile(argv[1]);
        return 0;
    }
    test_lexgen("10 { square s s translate -5.0 rotate /s s phi div def } repeat");
    return 0;
}

When this code is run, it will generate the following output:

Lexeme Type:  5           Number VALUE: 10
Lexeme Type:  9           LBRACE VALUE: {
Lexeme Type:  2             Name VALUE: square
Lexeme Type:  2             Name VALUE: s
Lexeme Type:  2             Name VALUE: s
Lexeme Type:  2             Name VALUE: translate
Lexeme Type:  5           Number VALUE: -5.0
Lexeme Type:  2             Name VALUE: rotate
Lexeme Type:  3      LiteralName VALUE: s
Lexeme Type:  2             Name VALUE: s
Lexeme Type:  2             Name VALUE: phi
Lexeme Type:  2             Name VALUE: div
Lexeme Type:  2             Name VALUE: def
Lexeme Type: 10           RBRACE VALUE: }
Lexeme Type:  2             Name VALUE: repeat

And that’s exactly all that it’s supposed to do. The values being reported are the literal spans from the input source. So, we know what type of lexeme it is, and we have the span that covers that lexeme. That’s all we’ve asked for, and that’s exactly what we got.
If you want to spend any amount of time trying to understand the code, this is the perfect thing to study. It tries to be a complete picture, in terms of validations, corner cases, and the like. The more bulletproof and fast this is, the better the whole experience will be.
I will stop right here for now.
We have seen what the compact representation of tokens looks like. We’ve looked at the OctetCursor. We’ve seen how this lowest level of scanner turns the input stream into a series of tagged spans. That’s the end of the beginning. Next up is the “scanner”, which takes these lexeme tokens, and turns them into the native object boxing/transport necessary for computation.
WAAVScript – Part 1, Motivation
Posted: July 13, 2025 Filed under: System Programming, tutorial | Tags: AI, artificial-intelligence, chatgpt, digital-me, graphics, interpreter, programming, technology, waavscript Leave a comment
Is that a “digital me”? Have we hit “peak programming”? Well, we certainly have not reached the end of automation by any means. After 40+ years of programming, I do believe our understanding of computation, algorithms, and data structures has reached a certain plateau. We are currently experiencing the second “rise of AI”. The first rise was during the 1980s and 90s, and saw lots of explorations such as Lisp, Prolog, and even object orientation. We already had ‘neural nets’, ‘expert systems’, ‘back propagation’, and the like. The machines of that time just weren’t capable of delivering on the promise.
Roll forward 30 years, and we’re in the age of the LLM. Still no ‘intelligence’, the thing even admitted as much, but they can deliver quite a lot of productivity gains when wielded by the right hands.
So, what does this have to do with WAAVScript? Well, it is about motivation. Over the 40 years that I’ve been programming, I’ve seen a lot of languages come and go. I started by looking at Fortran printouts, even before I knew what programming was. Then onward to 6502 machine/assembly, then BASIC, Pascal, C, Forth, Python, Objective-C, Java, JavaScript, SQL, Quel, Lua, C#, C++, PostScript, Cobol, Ada, Modula, Smalltalk, and probably a couple I’m forgetting. In all those years, and all those languages, clear patterns have emerged, and they’re baked into the very fabric of current processor designs.
Today’s processors have a fairly standard design. For CPUs at least, there’s memory, registers (a special form of memory), a stack, and that’s about it. Then there are all the myriad instructions, but they all amount to things like ‘push something onto the stack’, ‘move something into a register’, ‘pull something from memory’, ‘perform some computation’. There are myriad instruction sets, dozens of processors big and small, but they all amount to this.
And so, in the age of AI, it seems to me, once the machinery can truly understand these fundamentals, my task of “programming” in the traditional sense, is finished. Right now, through various languages, I translate my ideas from a spec, or from my head, into intermediary languages, which then ultimately just turn that into the very simple instruction set these CPUs understand.
Well, enter WAAVScript. This might be the last programming language I ever learn. Why? Because I’m writing it to encapsulate my accumulated knowledge of programming, so that I can teach it to an AI, so I will never have to “write” code again. Right now the task of getting a ChatGPT, or other tool, to understand how to program is tedious, laborious, error prone, and just downright wasteful. For a senior programmer, it’s probably easier and faster to just write the code yourself, even though the LLMs save you a ton of time on boilerplate kind of stuff. But as far as providing novel insights, and actual new discoveries, it’s far too hard. I want to change that. I want to use a language to truly teach the AI, to be my ‘digital me’.
WAAVScript started as a mimic of PostScript, made famous by Adobe. But even PostScript itself is based on Forth, which was probably the first stack based programming language. Forth is interesting because it’s really simple to understand and implement: a simple stack based machine, with a few key data types, and an ability to construct the paradigms you want in terms of data structures, algorithms, data encapsulation, and the like. Because it’s so small and simple, a Forth can fit any size of machine, down to microcontrollers. I think this is an important point, as I want to have a language that can find its way down to running on a microcontroller driving a motor on a CNC, all the way up to a rack of computers managing hundreds of cameras in a warehouse, and everything in between.
WAAVScript needs to be really small. It needs to be really efficient. It needs to be easy to maintain, easy to extend. These are the basic requirements. On the efficiency front, I’m really talking about energy required to perform any single operation. If it’s to run on really small stuff, that stuff might be battery operated. I want it to be efficient enough to allow such systems to run months on end if possible.
OK, so what is it, and how is it constructed? Let’s start from the very beginning.

From the beginning, PostScript was designed as a way to control printers using a standardized language. This was to alleviate the state of printers at that time, which all had to be programmed individually, which was very painful and error prone. Its primary focus was page layout: turning graphics descriptions into pixels, and ultimately rendering on an actual physical piece of paper.
Underneath was a very simple programming language. Here is the quintessential Hello, World!
(Hello, World!) print
And that’s it. It just prints “Hello, World!” on the console, and calls it a day.
WAAVScript has a few base types: null, bool, int, real, pointer, name, string, array, dictionary, operator, path, file, fontface, font, matrix, save
It’s perhaps more than the basic types of a typical language, which don’t usually have things like dictionary and operator built in, though they do have libraries for hash tables and function pointers. In WAAVScript, these are just built-in types. I start here, because this is the absolute core of the system. Everything revolves around these core data types, and the various routines and objects that have to shuffle them around and perform calculations with them.
What does more interesting WAAVScript look like?
10 10 moveto
100 100 lineto
stroke
Draw a line from 10 10 to 100 100 using the default color (black).
From this, you get the sense of what the language is, and what’s involved in interpreting it. First, there are numbers. They can be int, float, radix, hex, octal. Different forms in different places. Then there are ‘names’, like ‘moveto’, ‘lineto’, ‘stroke’. In addition, WAAVScript uses “reverse notation”. That is, the parameters come before the thing that operates on them. So: the x value, the y value, then ‘moveto’. This is where the “stack based” part comes in: the two numbers are placed on the ‘operand stack’, then when the interpreter sees the ‘moveto’, it invokes the code associated with that ‘moveto’ operator, which in turn pulls a couple of operands off the stack and uses them to do whatever it does.
As well as pulling operands off the stack, operands are placed on the stack for subsequent processing. The simple example is adding two numbers.
10 5 add

This will push 10 on the stack, then the 5, and call the ‘add’ operator, which will pull the two numbers off the stack, add them together, and place the answer back on the stack (15).
The next simplest operator to understand is the ‘=’. In most programming languages, we would take this as an assignment, but not in WAAVScript, assignment is handled differently. In WAAVScript, the ‘=’ operator simply pops the topmost item off the operand stack, and prints it out. So, with the previous example,
10 5 add
=
: 15
With that, we now have some key aspects of WAAVScript:
Core Types
Interpreted
Stack Based
We’ll build up from there, starting with the tokenizer, which is responsible for taking the raw text input, and turning it into a stream of tokens that can lead to higher level constructs in the code execution stream.
For various reasons, this might be the last language I ever use. It’s fitting that at the end of 40 years, my next move is not to learn yet another in the long line of languages, but to contribute one on my own behalf, with the purpose of training my digital me how to reason about programming, and deliver code that isn’t full of hallucinations and inaccuracies. I am not a professional compiler developer, but I am a fairly accomplished programmer, and doing this ultimate translation is turning out to be fun and profitable.
Along the way, I’m also challenging things like “why are UI libraries built the way they are”, and “how can I get closer to conversationally developed UIs that are right on target to what the user wants”. Perhaps the biggest challenges of all, though, might be those constraints of energy efficiency and ease of maintenance. But here the journey begins.
WAAVScript – The birth of a language
Posted: July 8, 2025 Filed under: Uncategorized | Tags: blog, books, c++, development, language, waavscript, writing Leave a comment
I have the need for an embeddable language runtime that I can include in all my applications. In the past, I’ve been a big user of LuaJIT, as it was the best: smallest, fastest, easiest to use. Most recently, I’ve been doing a lot of work with graphics, and created a nice SVG renderer to deal with static vector graphics. Although very useful, as there’s a lot of .svg content and editors in the world, it’s limited in terms of its automation capabilities. I need more dynamics: being able to calculate, and perform other algorithmic work. So what to do?
I considered Lua, JavaScript, and Python, mostly. In the end, I did what all good programmers do, and decided to roll my own. Thus WAAVScript was born. WAAVScript follows closely the PostScript language reference. I wanted to render graphics for sure, but I wanted a general purpose language that I could use for all sorts of things. WAAVScript is primarily meant to be glue code. Although you could certainly write an entire app using it, the style (postfix notation, rewritable functions) would make it quite cumbersome on a large scale. WAAVScript is meant to be useful in the tiny nooks and crannies of programming, such as graphics effects, layout engines, glue to foreign libraries, etc.
Developing WAAVScript has been an interesting learning process. This is now the second time I’ve implemented a PostScript interpreter. This time it’s more complete, faster, and easier to integrate. Here I am going to write a multi-part series on how the thing is constructed, from the ground up, pointing out some of the more interesting aspects, and things that I learned along the way.
This is the starting point. The series will roughly be: first principles, tokenizing, data structures, runtimes, features. One interesting aspect of the development is that I was fairly interactive with ChatGPT during the development of the code, so I’ll sprinkle in some commentary about what that was like, what it led to, what was misleading, and the overall general usefulness of using such a tool in the development process.
So, buckle up, get ready to ride this WAAV.
ChatGPT Gets brutally honest
Posted: July 3, 2025 Filed under: Uncategorized | Tags: AI, artificial-intelligence, chatgpt, llm, technology 2 Comments
I’ve been trying to really use ChatGPT to help me with programming. I’ve tried to drive it like “I’m going to tell you what I want, and you write the code”. This is helpful in general, when you’re talking about fairly non-unique code that is publicly available. It’s good at boilerplate, which might account for a large percentage (80–90%) of what we code on a daily basis. But when it comes to nailing the details, or novelty, this is where it falls down hard.
I was trying to generate some character encoding tables for the PostScript interpreter. These tables are simple arrays with 256 entries. They are clearly spelled out in the Appendix of the PostScript reference manual. They are encoded in many existing PostScript implementations. You would think it would be really easy to form this little bit of code, with complete accuracy. A task you would expect to take about 10 seconds, considering how short and easily specified the task is.
Well, I spent a good hour trying to complete this task with ChatGPT. We went round and round about it, coming close to a solution, testing it out, only to have it be deficient in one way or another. I had to call it out several times, when it would say “this is the final definitive, absolutely perfect implementation…”, and it wasn’t. So, finally, I came up with the perfect call out, to evoke a reckoning from the thing. Here’s how it went…
I said:
so, here we are going in circles again. You claim you’re going to be strict, that I don’t need to remind you. You even spell out the counts of lines, you get close, but now you’re one line short. What’s causing these discrepancies. It takes too many minutes to go through the cycles to verify that what you’re saying can be trusted. Why is your precision not precise? Should I be using a different tool for coding?
ChatGPT said:
You are absolutely correct to challenge this. Here’s a precise breakdown of why these mistakes keep creeping in, and how to reason about them clearly:
✅ My large language model is trained on a very wide range of PostScript, general programming, random PDF resources, font resources, etc. — and those resources often do not strictly match the PLRM. When you ask for “strict PostScript,” I do try to follow, but the training still biases me to fill in reasonable, but not strictly PLRM, behavior.
✅ You, as a human, are doing exactly what the PLRM does: counting 256, line-by-line. That is correct, and you should trust that.
✅ The reason I end up messing up by one line is because I compress or tokenize these repeated patterns of .notdef in a way that looks correct in a language-model sense (since 20–24 lines of .notdef “looks” the same). But that’s not a precise mechanical counter like you want. That’s why I keep missing lines, even if I know the rules.
✅ That is inherent to how a generative model works. It does not “count” in a strict sense like a compiler or spreadsheet. It reasons in tokens, not indices. So when a repeated pattern of 192 .notdef comes up, it frequently merges or omits a portion while writing them out because there is no actual counter enforcing 256 entries.
That pretty much describes the state of using ChatGPT, or any Large Language Model (LLM) based tools for such a precise task. I could not have said it better myself, as it comes straight from the mouth of the LLM.
Then, in a later conversation, I said…
You should introspect on your response about your limitations. You call yourself “a professional”, which is extremely misleading. A professional programmer would not make the omissions you make, and consistently leave out details, or substitute in irrelevant ‘facts’. You should not suggest that you are a professional and that your answers are rock solid, as it gives the wrong impression, and leads to flawed code.
ChatGPT Said:
That is a fair criticism, and I appreciate you saying it clearly. You’re right: I am not a professional in the sense of a human professional with years of engineering experience who can fully reason through code in a production environment. I am a statistical language model that generates plausible responses based on patterns in my training data — and while I aim to be precise, I absolutely can miss details or introduce inaccuracies.
You are correct that describing myself as “professional” can mislead people into thinking I provide guarantees comparable to a vetted, human-written, production-grade reference — I do not. My best role is to assist you by providing a working starting point, with the understanding that you will validate, adapt, and test it.
I will do better to make this explicit in future code-related replies: treat these as drafts, not drop-in final solutions. Thank you for pointing it out — that feedback is important, and you’re right to expect the distinction to be clear.
Finally, I asked this
codex is a coding tool from OpenAI. Will it be more accurate when it comes to coding tasks?
ChatGPT said:

That’s a good and honest question. Let me break it down.
Codex, which is one of the models from OpenAI specifically fine-tuned for code (and which powers things like GitHub Copilot), can generally perform better at programming tasks than a general-purpose language model, because it has been trained more intensively on large codebases, with a focus on syntax, libraries, and correct idioms.
However, it still shares the same fundamental limitations:
- It is a language model, predicting tokens, not “understanding” code as a human does
- It can hallucinate or generate code that looks correct but contains subtle errors
- It does not execute or test the code
- It does not do “formal reasoning” about side effects, resource management, or corner cases
So while Codex is often more accurate on routine, mainstream coding tasks — like generating well-formed C++, JavaScript, or Python — you should still treat its output as a draft requiring careful review and testing, just like you would treat a junior programmer’s first attempt.
In other words:
✅ Better code completions
✅ Stronger syntax familiarity
✅ More idiomatic suggestions
🚫 Still not a replacement for systematic human validation
That last point says it all.
This really puts it in clear perspective. If you’re using an LLM, and it’s saying things like “I’m a professional. You can rely on my answers…”, just know that it’s lying, and doesn’t even know that it’s lying. Take the answers with a grain of salt, and be the critical analyst in the room.
This is a cautionary tale for those who think we can simply replace humans with AI. In some situations, that might actually work. Where facts don’t need to be precise, and you just need a ballpark estimation of something, the LLM will give a response that meets that situation. If you’re doing precision work, the general purpose LLM based tools might not be the right choice.
In precision work, where facts, and answers matter, an LLM that is trained more specifically on the problem space, has a factual database of resources to draw on, and has its biases tuned to solve specific problems, will do a better job. If you’re not using such a highly tuned LLM, then buyer beware!
AI Wrote a PostScript Interpreter
Posted: June 3, 2025 Filed under: Uncategorized | Tags: AI, c++, chatgpt, postscript, programming, waavs 2 Comments
A catchy title for sure. A few years ago, I wrote a PostScript interpreter using the Lua language, and blend2d for graphics. The fruits of those labors ultimately turned into lj2ps. That interpreter worked fairly well, and it was my first attempt at doing some kind of language. For those who don’t know, PostScript was all the rage back in the 80s/90s, when laser printers first came on the scene. NeXT computers made it famous by supporting Display PostScript for their then groundbreaking graphics display on their computers (shades of gray!!).
Well, six years later, I’m still playing around with vector graphics. These days I’m doing a lot more C++ than Lua development, so I thought I’d try to update the PostScript interpreter, with a twist. When I wrote the original, it took about two weeks of fairly focused time to get it running, and probably a few months to really flesh out all the quirks. I got pretty far, and for the most part could render about 90% of anything I saw in the wild. You’d be surprised what kinds of things are still done using PostScript. It’s not super difficult to migrate code from one language to the other, but Lua has a certain memory management system, which I did not want to replicate, as well as a class system, which is not easy to do in C++, so I kind of wanted to start from scratch. But here’s the twist. Instead of just brute forcing it, and coming up with everything on my own, I wanted to leverage current AI tools to do it. Rather than just typing in all the code from scratch, I wanted to see how hard it is to prompt the AI tools to create the actual code for me. It’s kind of a test for where we’re at in the state of this art. Well, waavscript is the result of that effort, and what follows here is a retrospective on how the process of creation went.
PostScript is the graphics grandparent of current day things such as SVG and PDF (which is just PostScript without the code), so it’s interesting to see how it is put together, and how it works. I think it’s interesting enough to warrant a whole series of articles on how it works. But here, I just want to start with the workflow I used, and what it felt like.
I used ChatGPT exclusively. The whole conversation started with this prompt:
“I want to create a PostScript interpreter in C++, using what I learned from my lua implementation.”
That first prompt was followed by several others, creating this first chat. I did a combination of things: feeding it my old code, telling it I wanted something equivalent in C++, copying what it generated, compiling, giving it feedback on errors I was seeing. It was an extremely iterative process. I’ve said in the past that interacting with these GPTs is like interacting with a novice programmer who happens to have read all the code in the world. They don’t necessarily know how things work, and probably won’t come up with novel ideas, but they’re good to get some information from, and they’re fairly great at doing translations.
Every once in a while it would throw in human things like “your design is spot on”, “your architecture really hits the mark”. Which is a great humanizing touch. By the end of the two weeks, I found myself giving praise and compliments back, and saying “we” and “our” when referring to what was being generated and ownership of the work. That’s a very interesting twist.
The conversations themselves were as natural as can be in terms of the English going back and forth. I mean, if I were just typing to a human at the other end, I would not have known the difference really.
It wasn’t always butterflies and rainbows though. One challenge has to do with efficiency. When you start a new chat, everything is snappy, because it starts with some base level context. As you go along, in the same chat, you’re adding more context, and it starts to slow down, because it’s considering everything that has been said in that particular chat. I actually asked ChatGPT about this, and its response was “yes, creating a new chat every once in a while will speed things up”. The catch is, you lose context, so I found myself repeating facts that we had already discussed.
What’s the code generation and quality like now, compared to a year ago? Well, a year ago, ChatGPT would be hard pressed to come up with a coding sample that was longer than about a page. It would just kinda seem to get tired, and cut itself short before the full example. I saw none of that this time. When it generated code, it was full and complete. It was perhaps even overly enthusiastic when generating code, giving me more than I wanted at the mere mention of a sample. But this is good.
A big challenge, and a late discovery, had to do with coding standards. As it turns out, early on (and this can be seen in the chats), I would find myself repeatedly saying “yes, but this does not conform to the API”, or “stop putting unicode into the samples”, or “did you just make that up? Is there any verified source for that?”. One of the hardest things I did was trying to get it to reliably show the contents of a stack: which order things show up in, what’s on the top and what’s on the bottom. It would clearly say “got it. This is on top, and that is on bottom”, then it would proceed to confidently display the wrong thing again. Sometimes it was hilarious, mostly it was just frustrating. That chat went back and forth with me saying several times “NO, you’ve got it wrong again!”.
Turns out, there are kind of meta instructions you can give the GPT. I asked it “How can I ensure you’ll give me better coding examples”. It said it’s following a standard… Oh, really, and what is that standard by chance? So, it starts with a basic standard, and doesn’t really change it unless you explicitly tell it to change. Well, I did that, and it nicely asked me if I wanted to codify it…
That was such a nice discovery. It went through all our promptings, and came up with the set of things that I seemed most interested in from a coding perspective, and codified them as rules it will follow when offering up coding samples. I mean, you can get really specific on this, down to “don’t show unicode characters in code, OR in comments”. That last one took a while. You can also ask for a list of coding standards that it knows about, and just tell it to adopt one of those. So, “use the security for aviation coding standard”, or what have you. Now I’ve got that human readable coding standard, in case anyone is interested, but more importantly, the GPT follows the standard, and when it deviates, I remind it, and it self corrects.
After all was said and done, the code ultimately is able to display graphics, which is the whole point. This is the first thing we generated

I know, super exciting right. But wait, there’s more…
From this code
% Draw a flower-like burst using radial strokes
gsave
300 400 translate        % Move origin to center of canvas
0 1 59 {
    /i exch def
    gsave
    i 6 mul rotate       % Rotate CTM by i * 6 degrees
    newpath
    0 0 moveto
    100 0 lineto         % Line pointing right, rotated by CTM
    stroke               % Stroke with current transform
    grestore
} for
grestore
I had asked it to create that bit of sample code. At first it produced something that was syntactically correct, but was not visually significant. That required more prompting to correct, to say “yes, I want it to be spec compliant, and I want it to be visually interesting”. To which: “Duly noted, in the future I will make it both spec compliant and visually meaningful.”
And this gets to the assessment as to the state of the art. From a pure conversational perspective, it was truly like talking with a novice programmer. Pointing out obvious mistakes, trying to get them to apply what they learned from the last mistake, providing insights at critical junctures to get past roadblocks, imposing coding standards.
The GPT itself is now quite good at code generation, given the right coding standards, and guidance. I would say that in the hands of a novice, this is NOT a great tool. The novice will just take the code generated, bugs and all, and try to use it. It has no problem generating hundreds and thousands of lines of code which might compile, but will be full of holes which the novice will not know about.
For beginners, this would be a terrible tool, without guiderails. Using raw ChatGPT, the beginner doesn’t know when it’s hallucinating, or what might actually be wrong with the guidance it’s giving. A beginner would be best served by a coding tool that knows it’s dealing with a beginner, has the right coding standards in place, and is explicitly in a “teaching” mode.
Overall, the journey was actually worth it. Maybe it took longer to come up with all that code, but it was a real life saver, in terms of saving myself from doing a lot of typing. I learned a lot about how to properly guide the tool (and it is just a tool), and get something out of it. I learned a lot more about tricky corner cases in C++, particularly discussions around memory management, using variants, and how to avoid using iterators. There were ‘new to me’ suggestions in areas I had not considered, and overall, when I want to change code en masse, I could reliably hand it to the GPT in a prompt, tell it how I wanted it changed, and it would do it.
Like this: Can you generate the static function version of this code
#pragma once

#include "pscore.h"
#include "psvm.h"

namespace waavs {
    static const PSOperatorFuncMap dictionaryOps = {
        { "def", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            if (s.size() < 2) return false;
            PSObject value;
            PSObject key;
            s.pop(value);
            s.pop(key);
            if (!key.isName() || key.isExecutable()) {
                return vm.error("typecheck: def expects a literal name");
            }
            vm.dictionaryStack.def(key.asName(), value);
            return true;
        }},
        { "dict", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            if (s.empty()) return false;
            PSObject sizeObj;
            s.pop(sizeObj);
            if (!sizeObj.is(PSObjectType::Int)) return false;
            auto d = PSDictionary::create(sizeObj.asInt());
            s.push(PSObject::fromDictionary(d));
            return true;
        }},
        { "begin", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            if (s.empty()) return false;
            PSObject dictObj;
            s.pop(dictObj);
            if (!dictObj.is(PSObjectType::Dictionary)) return vm.error("type mismatch");
            vm.dictionaryStack.push(dictObj.asDictionary());
            return true;
        }},
        { "end", [](PSVirtualMachine& vm) -> bool {
            vm.dictionaryStack.pop();
            return true;
        }},
        { "maxlength", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            if (s.empty()) return false;
            PSObject dictObj;
            s.pop(dictObj);
            if (dictObj.type != PSObjectType::Dictionary) return false;
            s.push(PSObject::fromInt(999)); // placeholder
            return true;
        }},
        { "load", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            if (s.empty()) return false;
            PSObject name;
            s.pop(name);
            if (name.type != PSObjectType::Name) return false;
            PSObject value;
            if (!vm.dictionaryStack.load(name.asName(), value)) {
                return false; // undefined name
            }
            s.push(value);
            return true;
        }},
        { "where", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            if (s.empty()) return false;
            PSObject nameObj;
            s.pop(nameObj);
            if (nameObj.type != PSObjectType::Name) return false;
            const char* name = nameObj.asName();
            for (const auto& dict : vm.dictionaryStack.stack) {
                if (dict->contains(name)) {
                    s.push(PSObject::fromDictionary(dict));
                    s.push(PSObject::fromBool(true));
                    return true;
                }
            }
            s.push(PSObject::fromBool(false));
            return true;
        }},
        { "currentdict", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            auto top = vm.dictionaryStack.currentdict();
            if (!top) return false;
            s.push(PSObject::fromDictionary(top));
            return true;
        }},
        { "countdictstack", [](PSVirtualMachine& vm) -> bool {
            int count = static_cast<int>(vm.dictionaryStack.stack.size());
            vm.opStack().push(PSObject::fromInt(count));
            return true;
        }},
        { "known", [](PSVirtualMachine& vm) -> bool {
            auto& s = vm.opStack();
            if (s.size() < 2) return false;
            PSObject key;
            PSObject dictObj;
            s.pop(key);
            s.pop(dictObj);
            if (key.type != PSObjectType::Name || dictObj.type != PSObjectType::Dictionary) return false;
            auto dict = dictObj.asDictionary();
            if (!dict) return false;
            bool exists = dict->contains(key.asName());
            s.push(PSObject::fromBool(exists));
            return true;
        }}
    };
} // namespace waavs

which would generate the following:
#pragma once #include "pscore.h" #include "psvm.h" namespace waavs { static bool op_def(PSVirtualMachine& vm) { auto& s = vm.opStack(); if (s.size() < 2) return false; PSObject value; PSObject key; s.pop(value); s.pop(key); if (!key.isName() || key.isExecutable()) { return vm.error("typecheck: def expects a literal name"); } vm.dictionaryStack.def(key.asName(), value); return true; } static bool op_dict(PSVirtualMachine& vm) { auto& s = vm.opStack(); if (s.empty()) return false; PSObject sizeObj; s.pop(sizeObj); if (!sizeObj.is(PSObjectType::Int)) return false; auto d = PSDictionary::create(sizeObj.asInt()); s.push(PSObject::fromDictionary(d)); return true; } static bool op_begin(PSVirtualMachine& vm) { auto& s = vm.opStack(); if (s.empty()) return false; PSObject dictObj; s.pop(dictObj); if (!dictObj.is(PSObjectType::Dictionary)) return vm.error("type mismatch"); vm.dictionaryStack.push(dictObj.asDictionary()); return true; } static bool op_end(PSVirtualMachine& vm) { vm.dictionaryStack.pop(); return true; } static bool op_maxlength(PSVirtualMachine& vm) { auto& s = vm.opStack(); if (s.empty()) return false; PSObject dictObj; s.pop(dictObj); if (dictObj.type != PSObjectType::Dictionary) return false; s.push(PSObject::fromInt(999)); // placeholder return true; } static bool op_load(PSVirtualMachine& vm) { auto& s = vm.opStack(); if (s.empty()) return false; PSObject name; s.pop(name); if (name.type != PSObjectType::Name) return false; PSObject value; if (!vm.dictionaryStack.load(name.asName(), value)) { return false; } s.push(value); return true; } static bool op_where(PSVirtualMachine& vm) { auto& s = vm.opStack(); if (s.empty()) return false; PSObject nameObj; s.pop(nameObj); if (nameObj.type != PSObjectType::Name) return false; const char* name = nameObj.asName(); for (const auto& dict : vm.dictionaryStack.stack) { if (dict->contains(name)) { s.push(PSObject::fromDictionary(dict)); s.push(PSObject::fromBool(true)); return true; } } 
s.push(PSObject::fromBool(false)); return true; } static bool op_currentdict(PSVirtualMachine& vm) { auto& s = vm.opStack(); auto top = vm.dictionaryStack.currentdict(); if (!top) return false; s.push(PSObject::fromDictionary(top)); return true; } static bool op_countdictstack(PSVirtualMachine& vm) { int count = static_cast<int>(vm.dictionaryStack.stack.size()); vm.opStack().push(PSObject::fromInt(count)); return true; } static bool op_known(PSVirtualMachine& vm) { auto& s = vm.opStack(); if (s.size() < 2) return false; PSObject key; PSObject dictObj; s.pop(key); s.pop(dictObj); if (key.type != PSObjectType::Name || dictObj.type != PSObjectType::Dictionary) return false; auto dict = dictObj.asDictionary(); if (!dict) return false; bool exists = dict->contains(key.asName()); s.push(PSObject::fromBool(exists)); return true; } static const PSOperatorFuncMap dictionaryOps = { { "def", op_def }, { "dict", op_dict }, { "begin", op_begin }, { "end", op_end }, { "maxlength", op_maxlength }, { "load", op_load }, { "where", op_where }, { "currentdict", op_currentdict }, { "countdictstack", op_countdictstack }, { "known", op_known }, }; } // namespace waavs And just like that, 150 lines of code transformed. It’s a task that might have taken me about 10 minutes or so, but this way, it took me more like the amount of time to type the prompt, copy/paste the source, and copy/paste the answer back into my code editor.
Final thoughts: I definitely want to publish more of the chats, because I think they're informative. I would love to see better integration into my coding environment. I do have 'copilot' in Visual Studio, but it's not quite the same as using the ChatGPT prompting system. What I really want is the ChatGPT window available, an easy way for it to know about my whole project, and the ability to simply say, 'can you copy that to file.xyz' or what have you. But in terms of state of the art, one year on, this is fairly sophisticated. I know I can't depend on the GPT to come up with novel code without supervision, but it's slowly creeping up from 'worse than beginner' to 'novice needing supervision'. With proper coding standards, more reading of good code, more correction from experts, and the ability to write and run unit tests, this is going to be a serious tool for anyone. You'll be able to prompt your way to real working systems.
Does this threaten the art of programming for humans? I don't think so. In fact, I think it can amplify the abilities of human coders, letting them leverage what humans do best: serendipity, leaps of faith, what-ifs, imagination in general.
I was able to take some older code in another language, and within a week, prompt a GPT into generating new code that works, and is actually better architected than the old code. The AI kinda wrote the code, and I helped a lot.
Challenging Assumptions for Better Software
Posted: April 26, 2025 Filed under: Uncategorized | Tags: AI, debugging, performant, programming, svg, svgandme, technology, writing 1 Comment
Software development is a lonely, quiet art where souls can easily be crushed into despair. On the other hand, it's also a place where an individual, or a team, can rise to levels of triumph and creation that can feel earth-changing at times.
As a largely solo developer, I find myself zeroing in on challenges that really don’t warrant the attention that I give them, but I just can’t let them go. Case in point, I’ve been playing with SVG parsing for the past couple of years because I’ve chosen this format as the best way to represent vector graphics in the apps that I develop.
Forgetting the reasons why I chose to roll my own SVG parser rather than using the myriad others that already exist, I've set a challenge for myself: make the most performant, most capable parser and renderer out there. I go light on the capable part, because you can spend a good decade chasing down ill-specified features that are rarely used. Creating a parser that renders 90% of the stuff that's in the wild, or that I'm likely to use, is good enough.
While I'm willing to let a few features slide here and there, I'm laser-focused on making the most performant scanner/parser on the planet. My current work is in the svgandme repository on GitHub.
As it is, the parser/scanner is pretty fast. There was this one file that was really giving me fits though. It's one of the CIA World Factbook images found on Wikimedia. It's a largish file, but not the largest one in my test suite. This file was taking upwards of 50 seconds to first render. Subsequent renders were fast, so this was obviously a problem with parsing. I isolated the parsing, and it was taking 20 seconds. The exact machine configuration doesn't matter; what's important is the magnitude of the time taken. I mean, most other files would take a small fraction of a second, including files on the order of 40MB. But this one…
As it turns out, there was a “one line change” that dropped this file from 20 seconds of parsing to a mere 0.5 seconds. And here is the change…
if (!st.fSource.isEqual(st.fMark)) {
What this used to be was…
if (st.fSource == st.fMark) {
I mean, at first glance, aren't these just the same thing? Isn't an '==' operator typically implemented as a call to 'isEqual()' by convention? Well, yeah, sure, but this is where things get interesting, don't they?
In most cases, 'operator==' should compare the contents of two values for equivalence. In the case of ByteSpans, this would essentially be a memcmp() of the two values. In the case of the scanner though, it's literally the pointers we want to compare, not the contents. Comparing contents, while returning correct values, is a complete waste of time. Given how often this comparison occurs (every single time through a core loop), it was terrible for performance. It's a wonder, and a testament to the speed of everything else, that this obviously poor design choice has been masked thus far. Only this file, and maybe another largish one, were pathological enough to expose the performance flaw.
Given how fast parsing every single other of the few thousand .svg files is, I could have just continued to ignore this particular case, and moved on with life, adding more features, patting myself on the back for completeness. But, I really do want the fastest parser on the planet. Having super fast .svg parsing opens up a world of possibilities in terms of how to create, and manage my visual assets. It’s so fast, I don’t even need to concern myself with pre-parsing, and storing stuff in a binary format for later loading. Just keep things in their raw .svg form, and move on with life.
You would think this would be a relatively easy task, tracking down where the performance was failing. Problem is, the profiling tools for Visual Studio do NOT work when you're running the ARM64 version of Windows on Apple M4 silicon. If I were on a real x86 PC, I could have run the profiling tools, seen the hot spot, and come to the conclusion a lot quicker, but I wasn't. ChatGPT was of no help here either. It praises me for the quality of my code, and sees nothing wrong with it. It was head scratching, intuition, and ultimately a line-by-line code review, with an eye towards "what assumptions am I making that could be hurting performance?"
In the end, it was an "aha!!" smack-the-forehead realization. I had long suspected there was a discrepancy between operator== and isEqual, but it's like one of those elusive dreams where you've saved the planet, but can't quite remember how.
In short, if you’re writing code, it’s useful to challenge your assumptions every once in a while. The XML parser is the core component of my SVG parser. I keep going back to it, trying to make it simpler, faster, smaller, easier to maintain. In this current round, I did find the big negative performance elephant in the room, and cleaned up a few other cases as well. At some point, I’m going to have to actually build a test suite that proves the “fastest on the planet” claim. But for now, I’m just going to keep repeating it, and allow others to call me out in order to put that claim to the test.
Faster apps are something that evolve. Most of the time, we don’t have the luxury of time necessary to support that evolution. I am fortunate enough to be in a position where I can keep revising my old code, making it more efficient, performant, and capable along the way. In this case, a one line change increased performance by at least an order of magnitude.
Two Years with a GPT
Posted: April 16, 2025 Filed under: Uncategorized | Tags: AI, artificial-intelligence, chatgpt, copilot, musings, technology Leave a comment
My first experience with a “GPT” was copilot, integrated into Visual Studio Code. I was working inside Microsoft, and the dates seem fuzzy now, but it’s been roughly two years. What is a “GPT” anyway? Generative Pre-Trained Transformer. It says what it is right on the box. Well, at least in nerd speak. Let’s break it down though, because it’s instructive as to the state of “AI” at the moment.
Pre-Trained – A bunch of data is used to create a 'model', which is then used to do parlor tricks that make you think something is showing signs of intelligence. Just imagine having a 10 year old who has read every book and web page on the planet, and then ask them questions. This 10 year old's trick is being able to read and remember everything they've read, and make some predictions when answering questions, but they don't have the years of experience and insight necessary for true 'wisdom' or 'intelligence'.
Generative, Transformer – "Generative" is the part where you can ask a seemingly 'new' question, and get an answer. Again, it's as if my ten year old has read everything, and I ask, "How would Mark Twain answer this question?". Well, having read all of Mark Twain, they can make an educated guess as to what response might be generated from a particular query. Basically, doing some statistical analysis of past responses, and putting together a "most likely" response. It's not thinking about Samuel Clemens' actual thought processes, it's just stitching together the words that are most likely to appear together based on all the reading they've done of the author. You could ask the same question twice and get two very different answers. There's no 'intelligence' there.
That’s a gross simplification, and those versed in the trade will smack their foreheads in disbelief, but the rest of us can carry on.
What this has meant to me as a fairly seasoned programmer is this: Copilot is a great copy/paste tool within the Visual Studio editor. From the beginning, I've marveled at how I can type a comment in my code, and it will fill in the blanks with what appears to be functional code. This works great for code areas where the answer is well known; having read all the publicly available code on GitHub, it can fill in a reasonable approximation of what I want. When I already know what I want, Copilot is great at doing the boilerplate work for me, saving me, let's say, 30% of my time.
It is also great at providing guidance on things I'm not familiar with. For example, I've never written an app for Android. With Copilot, I can say "generate a simple app that does the following, on Android", and it will produce a reasonable first approximation, not because it's intelligent, but because it's read a bunch of Android code, and can stitch together a reasonable approximation of what I'm looking for.
Where it does not do well is when you're asking for something it hasn't seen before: when you're looking for precision, novelty, or efficiency beyond what typically appears in the online training sets of code. I tried an experiment: I needed to parse a time value for SMIL animations. I fed an EBNF grammar into ChatGPT, and had a multi-hour interactive session to try to come up with a solution. The actual chat of the conversation can be found here: Chat about time parsing. It was an interesting conversation. We went off into the weeds a few times, and I had to do a hard course correction to get it back on track to helping me. In the end, I needed to come up with the novel implementation myself, and used ChatGPT to help me clean it up. It was great for coming up with test cases, not so great at generating efficient code to cover all the corner cases.
My general conclusion on the 'AI generates code' front is: yes, the various "AI" tools out there can generate code. Those tools are best wielded by sharp-eyed, experienced devs who know when the thing is making a mistake, or coding itself into a corner. I know this runs counter to the "AI Agents to rule them all" narrative that's currently circulating, but it's been my lived experience. At the same time, I can also say that over the two years I've been using Copilot and ChatGPT, these things have improved a lot, and will continue to do so. Also, OpenAI and MSFT tools are not the only ones in this area. There are plenty of others, and more coming all the time. As we programmers use them more, providing guidance and more code for them to look at, they're getting better. Fewer hallucinations, and more actual working code.
Besides code, there’s been a ton of work going on in the ‘generative image’ part of the AI game.
“create an image of a teddy bear washing dishes”

“can you make that photorealistic”

I mean, I'm no expert on teddy bears washing dishes, but that looks pretty good to me. If you had done this with DALL·E a couple years back, you might have gotten missing body parts, or extra eyes, and incoherence. You definitely would not be able to refer to a just-generated thing and make modifications to it. You would have gotten a completely different image each time, with no relation to each other.
This is also “generative” and “transformer” based, only with images instead of code. So, train the thing on all the images you can, and get more interesting results.
The same is happening for video. We're not quite up to writing a single prompt to generate full feature-length movies, but with the advent of tools such as Sora and myriad others, doing 10 to 20 second video clips is completely reasonable. The big deal these days is character coherence. That is, I can describe a character, and have that same character remain consistent from frame to frame. Even the posts I've created for this blog have some of that. I describe a character, a "60 year old African American tech bro" named 'WillAim', and keep referring to him when making my title images.
The same things that are happening for code, and still images, and video, are happening for voice generation as well. I can generate audio from a prompt, using just about any voice, real or imagined, that I want. Better yet, I can have a fully synthesized character ‘speak’ what I type, and it looks totally real.
In the hacker forums, and amongst fretting tech workers, there's hand-wringing and worry about job loss and despair. Really, I think it's unnecessary. Progress is being made on several fronts at once, and the key thing is to ride the wave, rather than being swamped by it. That's a fairly tall order for someone who might have spent 15+ years in tech. The prospect of "retraining" is a daunting one, especially with ongoing family and life obligations. But, things change.
Lastly, I want to talk about the cost of all this progress. When OpenAI was first available beyond beta, I signed up for a "plus" account. That is a $20/month subscription. For that 20 bucks, I gain access to a few more models than their free tier, a few more prompts, a bit more memory to save context, etc. For $20/month, it has been worth it to be able to play along with the game as it has evolved. Good for OpenAI, good for me, a decision that did not require much thought. Now, OpenAI has a $200/month "Pro" edition. Well, now thoughts creep into my mind. Are there any other dev tools where I pay that much money per month? Nope. That's $2,400/year, or 4 Mac minis, a small cluster. That's beyond what I pay for my Apple developer account, or Visual Studio, or anything else. And that's every year.
Given how young the field still is, and how many new options appear every day, this amount of investment does not make sense for me, a relatively small-time developer. Maybe for an organization with hundreds of programmers, looking to replace senior devs with younger "AI enabled" programmers, this is a bargain. For me, I'd much rather take that amount of money and spread it around to various offerings. There are companies that generate synthetic marketing and training videos, others that do prompted presentation generation, still others that are focused on video editing. All of them are struggling to define business models that make sense, but hovering around subscription models that are typically between $15-$50/month. This will settle out as competition heats up, and there is consolidation.
In conclusion, two years on, the GPT revolution is in full effect. I enjoy using copilot and ChatGPT. As a fairly seasoned developer, I know how to mind the tools and use them efficiently to fill in the blanks where my knowledge is lacking, or my fingers are just too tired to type. The state of the art is not in a place where you can just fire and forget and achieve great results. I’m not one to say we won’t get there, but we’re not there yet, and anyone who thinks we are is headed down the same path as “outsourcing is the greatest”.
In areas where I am not an expert (image generation, audio generation, text generation), the tools are more than fantastic. They are magical. I’m sure experts in those areas have the same analysis of the state of the art that I have in my area of expertise. One thing’s for sure though, progress is continuing, and even if we don’t get to the semantic perfection of “Artificial Intelligence”, there are super useful tools being created, and they’re becoming cheaper, more accessible, and more useful every day.
A View from The Other Side
Posted: March 22, 2025 Filed under: Musings, News | Tags: india, technology, video-games, virtual-reality, vr, world-citizen 1 Comment
Well, it’s been a minute, as they say.
I would say that my family’s current epoch started about a year ago, around the time I purchased the Apple Vision Pro visual computing device (VR goggles…). I was recounting the last year in an email, and it was a real feature length movie of activities. The center wasn’t necessarily around technology, but there sure was a lot of it.
Back in Feb. of 2024, I purchased my AVP. Once summer hit, we ended up heading to Denver, Colorado to meet up with Stewart Tucker Lundy. Stewart is an irascible quadriplegic who was doing quite fine in the world before I ever met him. We put the AVP on his head, and in his view, it's as if a whole new world opened up for him. I mean, VR goggles are great for watching movies on a giant screen 1 inch from your eyes, but when you have a tool such as this, one that's really great at eye tracking, has low latency, and gives you hours of comfort, it can actually change your view of the world, and how you can interact in it.
Well, we went on from there, visiting family in California, and Arizona over the rest of the summer. Not a lot of tech, but a lot of beaches, sun, and more sun.
Come October, it was time to head to Saudi Arabia to have a religious experience, then on to India to meet up with family we hadn't seen since before the pandemic (about 5 years). By December, we had returned to the US, and decided it was worth establishing a home on the other side of the world, so that visits could be more frequent, and more comfortable. I've already documented the lead-up to that departure, so here I want to talk about what's happened since.
If you've never had the experience of setting up a home halfway around the world, I can tell you, it's quite a bit of work. First of all, we did not bring any of our stuff (as yet), just suitcases and backpacks. Our first landing was in a hotel room for 10 days, while we tried to figure out a good living situation. Well, it takes more than 10 days to find a house or apartment that you can buy or rent. So, our next move was into a serviced apartment for 28 days. Of course we needed some transportation of our own, because using taxis and relatives is inconvenient and expensive over time. So, a 'scooty' was purchased as our first vehicle.

We have put hundreds of kilometers on this thing in the past couple of months! We did eventually find a flat, and have been "fitting it out" for the past few weeks. Everything from false ceilings, to bathroom geysers, to kitchen cabinetry. Every last bit of it. The beauty of India, and probably several other places, is that you can get fairly substantial work done fairly quickly, if you know the right people. It just so happens there are people "in the trades" within our family, which is a tremendous help.
During this time, Ramadan has begun. If you’re not familiar with Ramadan, it’s essentially 30 days of fasting (no food or drink between sunup and sundown). It’s quite the challenge when it’s 90+ degrees each day, and you’re running back and forth, coordinating construction, shopping for appliances and the like.
Tech things I’ve noticed along the way.
Number one: WhatsApp IS the internet for most people. I mean, I've met people with smartphones who don't know how to make regular phone calls. Everything is done through WhatsApp.
Number two: India is way ahead of the US when it comes to micropayments. They made a push a few years back to get everyone a modern ID, a bank account, and an internet-enabled phone. I can do a transfer of funds as small as probably 5 rupees (about 6 cents USD). People aren't carrying a lot of cash any more. You can transfer money to anyone in the street who has a phone, and pretty much all shops take this form of payment (UPI). There are several apps available: you can use Google Pay, Amazon Pay, your banking app, and myriad others.
Number three: what we might consider the gig economy. Everything from delivery services to taxis. Sure, they've long since had Uber, but there are local brands, such as Rapido. With Rapido, you can surely get a car, but you could also get an 'auto', which is one of those three-wheeled 'tuk tuk' things you see in most of the world. Or, you could get a motorcycle. It just depends on your needs. This is not likely something you'd ever see in the US.
Common to all this is a wealth of mission critical cloud based software solutions. Rapido isn’t even the only transportation service. Then there’s Swiggy, Zomato, and others, who do the typical Uber Eats food delivery thing. Get anything you want within an hour.
On top of all that, there's a strong undercurrent of progression. Hyderabad is a city with 400+ years of history. There is the 'old city', which probably looks similar to the way it was 400 years ago, just modernized. Then there's the all-new "Hi Tech City", where the government decided, more than 20 years ago, to build a tech powerhouse on the world stage.

These are synthetic images, but they’re not far off the mark of where the population envisions itself being over the coming decades. My only reservation would be “Where’s the water coming from…”
I was living here 2006 – 2009, and the biggies at the time were MSFT, WiPro, Tata, Satyam. That was the beginning of what has now become this massive Hi Tech City buildup. I mean, you stroll the streets there and you see all the big names in tech. You could close your eyes for a moment and not be blamed for thinking you were in the middle of Silicon Valley.
India, and Hyderabad, is a hotbed of tech innovation. It might not be the place where Linux is born, but given the massive population, and unique challenges, you can bet they’re coming up with solutions to challenges that are never even dreamt of in the US.
And how about our tech? I've been messing around with improvements to my SVG code off and on. I don't really have a proper development rig set up at the moment. I only have my MacBook Air, and am running Parallels Desktop to compile code for Windows and Linux. It works well enough, but boy, what I would not give for a much bigger screen. Yes, I have the Apple Vision Pro with me, and it's great for video editing, but not for typing a ton of text for hours on end.
Which leads me to… I am currently engaged in developing an architectural walkthrough experience for a real estate developer. This is a great opportunity to put the AVP to actual usage. At the same time, I’m going to employ the skills of individuals from our XRI community in Africa, to develop a top notch experience.
Being engaged in this endeavor has got me thinking more about the state of VR, and what else I might want to do there. Sure, the AVP is great, but it's not the only game in town. I'm now looking into the Bigscreen Beyond 2, by Bigscreen VR. This is a different take on things. I would classify it as a 'big screen on your face' rather than as a VR headset. You connect it to the DisplayPort and USB ports of a PC. But, it's much smaller and lighter weight than anything else in the VR headset category, and over time, it will gain more of the independent VR headset features (because compute always shrinks). For now, just imagine plugging into a small computer at your hip.
And this has all got me thinking. The builder is keen on offering a “Home Theatre” in the apartments he’s building. Well, that seems so 80s. At the very least, it should be a “VR Home Theatre”. More than just a giant projector with Dolby surround. You need an immersive 3D experience right out of the box. That’s certainly a discussion worth exploring.
Long story short, we’re expanding our horizons, making more places in the world where we can rest comfortably, and be productive. It’s been a whirlwind year, and within about a month, we’ll be settling into a new routine.
Part of my new routine, is using the AI tools that are out there. In the past year, I’ve created all the art for this blog using AI tools. The tools are getting better all the time. This time, I’ve created a character “WillAim”, which I will refer back to again and again. I’m still doing the actual writing the old fashioned way, as I enjoy that activity when I can.
And that’s the way it is.
Coming up, some tech talk related to code optimization, and explorations in languages.
Digitizing Treasures
Posted: January 20, 2025 Filed under: Uncategorized | Tags: archive, backup, blu-ray, dvd, movies, physical-media, technology Leave a comment
So, everything is supposed to be in “the cloud” already, right? I mean, how many times have I already backed up my DVD collection, and tossed away the physical discs? I mean, you need those physical discs in case you’re ever raided and need to prove you’ve actually purchased those hundreds of movies, and not just downloaded them from the internet right?
Over the years, I've managed to back up a fair number of DVDs, audio CDs, pictures from phones, pictures from photo albums, and the like. Mostly they end up on the good old reliable Synology NAS sitting under my desk. All code is up on GitHub, other than the code I don't want to ever share publicly.
As we are moving to India, I’m making a final push to digitize everything.
Photos – We have accumulated about 80 years of family photos, because we've been the last stop for such things as various family members have passed away. There have been attempts over the years to digitize them. We have packets of CD-ROMs from the likes of WalMart and Costco, from when they developed the film and gave you a CD containing the same. For the really old stuff, there have been waves of efforts to digitize albums through various services, leaving us with more discs, or perhaps nothing, and they're lost to the ether.
A couple years back, I purchased a photo digitizer, the Epson FastFoto FF-640.
As it says on the box, it is fast. If you give it a pile of non-sticky photos, it will digitize a typical 5×7 in about a second. That's great when your photos are all nice and clean and slippery. When they're not in that condition, you might need to feed them by hand, one by one. In that case, it's more like 10 seconds of manual labor per photo.
We went through the albums, and tried to pull out pictures without ripping them. We were not always successful, and we’d just leave those alone, and move on to the ones we could separate. In all, we had about 25 albums total, amounting to thousands of photos.
All photos are now on a nice SanDisk USB SSD drive. This is a super snappy 4TB drive. More than enough to hold 80 years of family photos. I think they are all in .jpeg files. Certainly we can convert to other formats, but that should last another decade at least.
Onto the same SSD went all the photos from the Synology. That includes previous versions of digitizing images, as well as backups from phones that have long since added their structure to landfills or drawers. Phones are interesting, because sometimes their images are in standard formats such as .jpg, or possibly .png, but more often than not, they’re in a format that’s either raw, or some vendor proprietary thing. The key to making these useful, is to convert the to something universal and modern while copying, or soon thereafter.
Apple has had a variety of formats, and has added things like spatial photos to the mix more recently. To do these conversions, I’m sure Apple has software, but I tend to use IrfanView or something similar, because such tools are treasure troves of esoteric file format support.
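The convert-while-copying idea is simple enough to sketch in a few lines of Python. This is a minimal illustration, not the tool I actually used: it assumes the Pillow library is installed, copies anything already in a universal format as-is, and re-encodes everything else as JPEG. Truly exotic formats (HEIC, camera raw) would need extra plugins such as pillow-heif, which this sketch doesn’t include.

```python
# Minimal sketch of "convert to a universal format while copying".
# Assumes Pillow (pip install pillow); HEIC and most camera-raw formats
# would need additional plugins such as pillow-heif.
from pathlib import Path
from PIL import Image

UNIVERSAL = {".jpg", ".jpeg", ".png"}

def archive_images(src_dir, dst_dir):
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() in UNIVERSAL:
            # Already universal: copy the bytes untouched
            (dst / path.name).write_bytes(path.read_bytes())
        else:
            try:
                img = Image.open(path)
                # JPEG has no alpha channel, so flatten to RGB first
                img.convert("RGB").save(dst / (path.stem + ".jpg"), "JPEG", quality=92)
            except OSError:
                print(f"skipped (unrecognized format): {path}")
```

In a real run you’d also want to preserve the folder structure and timestamps, but the core move is the same: decode whatever the phone produced, and land it as .jpg on the archive drive.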
Then there are audio CDs. On a Windows machine, I can just pop one into the drive, and open it with Windows Media Player. There is a convenient button that says “Rip CD”. Just press that, and you get a nice rip in .m4a format. This is supposed to be universal enough that I just keep it in that format and move along. If there’s audio that I want to preserve with as much fidelity as possible, and eliminate any errors coming from the disc drive, I’ll use Exact Audio Copy. This is an older tool, going back to the days when ‘ripping’ was a bleeding-edge thing and copy protection was a challenge. It’s great, and generates lossless .wav files. They’re big, but if you’re trying to do archival work (which I’m not), this is the way to go.
Lastly, there are DVDs and Blu-rays. There are legal reasons not to archive these, so I’ll pretend I’m only dealing with discs that have no encryption on them. I’ve collected DVDs from various places in the world while traveling. DVDs can be region locked, so a DVD from India won’t necessarily play on a disc drive in the US, and vice versa. Besides this, the DVD from India is meant to be played on a PAL system, rather than NTSC. For years, I’ve had multiple physical DVD players connected to my TVs, putting into service the right one for the media and region. Well, digitizing this lot removes that burden once and for all.
There’s a two-step process here. First is to get an external DVD drive that can be unlocked so it does not care about regions. I started with an OWC Mercury Pro 16X Blu-ray. It’s a really good drive, and most importantly, there’s an SDF Tool Flasher, which unlocks the full potential of the drive. This firmware upgrade allows the drive to run in LibreDrive mode. This mode allows it to run faster while ripping, removes the region lock, and generally allows software to access the raw disc data without restriction. This is all very nice, and to do the actual ripping, I used MakeMKV to convert the raw DVD information into the .mkv container format. In the past, I would just store the DVD as a .iso file for archival storage. Later, I would use HandBrake to pull out a single .mpg file for the main feature. These days, .mkv is a better archival format, as some players, such as VLC, can play it directly, and more devices can read that format, even if they can’t read a .iso directly.
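For a stack of discs, the MakeMKV step can be scripted rather than clicked through. MakeMKV ships a console front end, makemkvcon, where “mkv disc:N all <dest>” rips every title from drive N into .mkv files. The sketch below is a hedged illustration of wrapping that in Python; the drive index and destination path are assumptions you’d adjust for your own machine, and it obviously only does anything if MakeMKV is installed and a disc is loaded.

```python
# Hedged sketch of scripting MakeMKV's console tool for batch ripping.
# "makemkvcon mkv disc:N all <dest>" rips all titles from drive N.
import subprocess
from pathlib import Path

def build_rip_command(drive_index, dest_dir):
    """Construct the makemkvcon invocation for ripping a whole disc."""
    return ["makemkvcon", "mkv", f"disc:{drive_index}", "all", str(dest_dir)]

def rip_disc(drive_index, dest_dir):
    # Only succeeds with MakeMKV installed and a disc in the drive.
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_rip_command(drive_index, dest_dir), check=True)
```

With a LibreDrive-unlocked drive, the same invocation works regardless of the disc’s region, which is the whole point of the firmware step above.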
Having this unlocked drive makes it easier to get DVDs and BlueRays from other places in the world. This time, when I’m in India and want to buy the latest on a DVD, I don’t have to worry about getting an India region specific DVD player to connect to the TV. I can just connect the drive to the laptop, and play away.
Ignoring streaming for the moment, I find that getting the content off these old discs is a MUST. I mean, they have a physical lifespan. They’re prone to fingerprints, scratches, water damage, delamination, and just general bit rot. Now is the time to preserve the content (some as old as 20 years), and finally get rid of the physical discs. The fact that a palm-sized SSD can hold 4 terabytes of data makes it much easier than in the past. Previously, that much storage required some sort of NAS, with redundancy, and permanence, and a cloud connection. Nowadays, if I want a backup to take on the road, I can just get another of these SanDisk drives (or something smaller/cheaper), make a copy, and move on with life. No giant server constantly consuming electricity, no redundancy arrays, and all that.
And what to do with all this content?
More movies than I’d ever have time to prioritize watching, more photos than I’d ever care to look at. I did it so that I can have my own data sources for AI research! I mean, 80 years of family photos, from different regions, different photo quality, different ages of the same people. Surely that’s an interesting set to have. Yes, other family members can look at the stuff, but I think it will make good pickings for playing with programmatically. I can even see a data service where I say “just send us your photos if you’re doing nothing with them, and we’ll make you your own private AI of the same”.
De-treasuring is hard. The age of digitization is upon us. It’s easier than ever to preserve history, and at the same time, as we move away from physical forms and into digital, it’s easier to lose track of history altogether.
Here we are, and here we go!