Let me first address the assertion that "Reads will necessarily require a decompression step." Some common aggregate expressions can be evaluated without decompression.
Say we have 1,000 rows with a dollar-value column. For the sake of illustration, the first 600 rows are all $5 and the remaining 400 are $7.
Uncompressed, getting the total dollar value requires loading 1,000 values and performing 999 additions. Similarly, a count requires touching all 1,000 values. These are very common analytical expressions.
With run-length encoding the column can be stored as [(600,$5),(400,$7)]. The total value can be calculated as ((600 * 5) + (400 * 7)) and the count as (600 + 400). Far fewer values are loaded and far fewer calculations are performed. Repeat this across the very many compressed blocks an OLAP system may hold and the performance improvement can be noticeable.
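As a sketch (plain Python standing in for a columnar engine; the run layout is the only assumption), the sum and count can be computed directly over the runs:

```python
# Run-length encoded column: each entry is (run_length, value).
# Names are illustrative; a real engine stores this per block.
runs = [(600, 5), (400, 7)]

# SUM(amount): weight each run's value by its length -- no decompression.
total = sum(length * value for length, value in runs)   # 600*5 + 400*7 = 5800

# COUNT(amount): just add the run lengths.
count = sum(length for length, _ in runs)               # 600 + 400 = 1000
```

Two multiplications and three additions instead of a thousand loads, with the same answers.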
Usually the system will capture summary information per block, per column, as it is compressed. Minimum, maximum, total, row count & null count are typical. Using this metadata, stored in the DBMS catalog, some results can be found without even reading table data.
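A minimal sketch of such per-block metadata (the fields mirror the list above; the class name and figures are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class BlockStats:
    """Hypothetical per-block, per-column summary row in the catalog."""
    minimum: float
    maximum: float
    total: float
    row_count: int
    null_count: int

# Two compressed blocks' worth of catalog entries (invented figures).
catalog = [
    BlockStats(minimum=5, maximum=7, total=5800, row_count=1000, null_count=0),
    BlockStats(minimum=3, maximum=9, total=6200, row_count=1000, null_count=10),
]

# SELECT COUNT(*), SUM(amount) answered from metadata alone:
row_count = sum(b.row_count for b in catalog)       # 2000
grand_total = sum(b.total for b in catalog)         # 12000

# Block pruning: a predicate such as amount > 8 can skip any block
# whose maximum rules it out, before any table data is read.
candidates = [b for b in catalog if b.maximum > 8]  # only the second block
```

This is the same idea as zone maps: the predicate is tested against the summary first, and only blocks that might qualify are ever decompressed.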
Second, the assertion that "Fast datastore reads in a database come from ... the ability to selectively read only the relevant data in a random-access manner" is not true. Snowflake, for example, holds data internally in a columnar format and does not allow the creation of secondary indexes at all, yet it is widely held to be a capable OLAP system.
Further, there is a class of DBMS built around GPUs. They generally forgo indexing altogether, instead relying on the GPU's high memory bandwidth and massive parallelism to treat every access as a table scan. These have not gained wide popularity, but they exist.
Third, the assertion that "Indexing into a compressed data store is impossible" is false. SQL Server, for one, permits clustered columnstore indexes (CCI). "Clustered" means they constitute the on-disk sequencing of the base table rows. As columnstores they are compressed. Secondary B-tree indexes can be defined on a table stored as a CCI.
Rows in a compressed block are still in some sequence. Each set of column values in the block is an array, and any compression of that array retains the original ordering. A secondary index can, at its leaf level, hold a pointer to the block plus an offset within it. Compression algorithms are designed so that returning the i-th value of a compressed column is straightforward.
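Sketching that lookup for a run-length-encoded block (cumulative run lengths plus a binary search; the layout is an assumption for illustration, not any particular product's format):

```python
import bisect
from itertools import accumulate

# RLE-compressed column for one block: (run_length, value) pairs.
runs = [(600, 5), (400, 7)]

# Cumulative run lengths, e.g. [600, 1000], computed once per block.
bounds = list(accumulate(length for length, _ in runs))

def value_at(offset):
    """Return the value at a row offset within the block, without
    decompressing: binary-search for the run containing the offset."""
    return runs[bisect.bisect_right(bounds, offset)][1]

# A secondary index leaf holding (block, offset=650) would resolve to:
value_at(650)   # 7
```

The lookup is O(log runs) per row rather than O(1), but no decompression of the block is needed to serve an index probe.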
Compressed data is often around 20% of the size of the uncompressed data, meaning loading it takes roughly one fifth the time. The corollary is that five times as much data can be retained in RAM, ready to answer the next query, further improving performance.
Having many column values adjacent in memory facilitates moving them en masse through the memory hierarchy and onto cores, where they are amenable to SIMD techniques. When a predicate eliminates rows, this can be recorded in an associated bitmap. Each predicate can be evaluated in parallel on a separate core and the bitmaps ANDed together at the end.
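A sketch of that bitmap scheme (Python integers stand in for hardware bitsets; a real engine would evaluate each predicate with SIMD over vector registers, and the data here is invented):

```python
# Two columns of the same five rows.
amounts = [5, 7, 5, 9, 7]
regions = ["E", "W", "E", "E", "W"]

def bitmap(values, predicate):
    """One bit per row; bit i is set when predicate(values[i]) holds."""
    bits = 0
    for i, v in enumerate(values):
        if predicate(v):
            bits |= 1 << i
    return bits

# Each predicate produces its own bitmap, independently (so in parallel):
b1 = bitmap(amounts, lambda v: v > 6)      # rows 1, 3, 4 -> 0b11010
b2 = bitmap(regions, lambda r: r == "E")   # rows 0, 2, 3 -> 0b01101

# Combining them is a single bitwise AND per machine word:
qualifying = b1 & b2                       # only row 3   -> 0b01000
```

The AND step touches one word per 64 rows, which is why engines prefer carrying row eliminations as bitmaps rather than materializing intermediate row sets.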