I have a method that removes common prefixes from pairs of strings, and have been trying to create a generator for it. It is trivial to generate pairs of random strings, but how can I enforce that many of the pairs have common prefixes? Simply filtering the pairs generates an insufficient number of examples, so I'm trying to create a custom generator to satisfy the requirements.
Here is what I have now; it works, but I'd like to generate better parameters:
    (ns liblevenshtein.distance
      (:require [clojure.spec.alpha :as spec]
                [clojure.spec.gen.alpha :as gen]))

    (spec/def ::word
      (spec/and string? (complement nil?)))

    (spec/def ::v-w
      (spec/cat :v ::word, :w ::word))

    (spec/def ::non-empty-v-w
      (spec/and ::v-w
                (fn [{:keys [v w]}]
                  (and (not-empty v)
                       (not-empty w)))))

    (defn- drop-common-prefix [v w]
      (loop [v v, a (.charAt v 0), s (.substring v 1),
             w w, b (.charAt w 0), t (.substring w 1)]
        (if (and (= a b)
                 (not-empty s)
                 (not-empty t))
          (recur s (.charAt s 0) (.substring s 1)
                 t (.charAt t 0) (.substring t 1))
          [v a s, w b t])))

    (spec/fdef drop-common-prefix
      :args ::non-empty-v-w
      :ret (spec/tuple string? char? string?, string? char? string?)
      :fn (fn [{{:keys [v w]} :args, [v' a s, w' b t] :ret}]
            (and (= v' (str a s))
                 (.endsWith v v')
                 (= w' (str b t))
                 (.endsWith w w'))))

Experimenting with generators, I've come up with the following. It generates pairs of strings satisfying my requirement, but I do not know how to split them into the arguments for my function:
    user=> (def prefix-pair-gen
             (gen/fmap
               (fn [[u v w]]
                 [(str u v) (str u w)])
               (spec/gen
                 (spec/coll-of string? :kind vector? :count 3))))
    #'user/prefix-pair-gen
    user=> (spec/def ::prefix-pair
             (spec/with-gen
               (spec/coll-of string? :kind vector? :count 2)
               (fn [] prefix-pair-gen)))
    :user/prefix-pair
    user=> (gen/sample (spec/gen ::prefix-pair))
    (["" ""]
     ["c" "cR"]
     ["lZ" "2F"]
     ["8a" "8a4"]
     ["n1D8CSq" "n1D8Gb1k"]
     ["X4PO" "X4Pu"]
     ["eAVM1" "eAVM1qg"]
     ["5e3DkZ6i" "5e3DkZv4Y"]
     ["3P7210" "3P7245cHM"]
     ["1c4D2j4UUK738" "1c4D2joFjd"])
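One direction that might work, offered as a sketch rather than a confirmed solution: because a `spec/cat` regex spec conforms a plain two-element vector, a generator that produces `[v w]` vectors can be attached directly to an args-shaped spec with `spec/with-gen`. The names `prefix-args-gen` and `::prefixed-v-w` are hypothetical, and the snippet assumes `org.clojure/test.check` is on the classpath. The suffixes are forced to be non-empty so that the generated values pass the non-empty check.

```clojure
(require '[clojure.spec.alpha :as spec]
         '[clojure.spec.gen.alpha :as gen])

;; Hypothetical generator: u is a shared prefix (possibly empty),
;; v and w are non-empty suffixes, so both results are non-empty.
(def prefix-args-gen
  (gen/fmap (fn [[u v w]]
              [(str u v) (str u w)])
            (gen/tuple (gen/string-alphanumeric)
                       (gen/not-empty (gen/string-alphanumeric))
                       (gen/not-empty (gen/string-alphanumeric)))))

;; Hypothetical spec with the same shape as ::non-empty-v-w,
;; but with the custom generator attached.
(spec/def ::prefixed-v-w
  (spec/with-gen
    (spec/and (spec/cat :v string?, :w string?)
              (fn [{:keys [v w]}]
                (and (not-empty v) (not-empty w))))
    (fn [] prefix-args-gen)))

;; Each sample is a [v w] vector that can be applied as the argument list.
(gen/sample (spec/gen ::prefixed-v-w) 5)
```

If the `:gen` option of `clojure.spec.test.alpha/check` behaves as its docstring describes (a map from spec names to no-arg generator-returning functions), the same generator could presumably be supplied as an override for `::non-empty-v-w` instead of defining a second spec.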