How to make illegal values unrepresentable? [duplicate]

Question

A method of design in Functional Programming is making illegal states unrepresentable. I always see this being accomplished with the structure of types, but what about the value of types?

What if I have a string called Email and I only want it to hold a valid email address (checked against some Regex)? How can I do this in a functional way (without using OOP)?

Make an immutable type whose constructor throws if the data is invalid. — ildjarn
– ildjarn, Commented Feb 20, 2017 at 10:08
In Haskell, you can write down the type RegExp of regular expressions: switch on DataKinds, and now you can write down the type of data valid for a given regexp RegData :: RegExp -> *, and the singleton types for regexps REGEXP :: RegExp -> *. You'll probably also need a validating parser, regParse :: REGEXP r -> String -> Maybe (RegData r), so you can go from strings to validated data, given the singleton value for the regexp you want. If you can write a regexp recognizer, then regParse is an evidence-producing refinement of the same idea. — pigworker
– pigworker, Commented Feb 20, 2017 at 10:25
"a valid email address (checked against some Regex)" Not a very good example considering it's really hard to create a regex that can properly match the full set of valid email addresses. — JAB
– JAB, Commented Feb 20, 2017 at 16:35
@Kurren Don't suppose that people's names don't contain numbers, either :-). — Blackhole
– Blackhole, Commented Feb 20, 2017 at 19:57

chi · Accepted Answer · 2017-02-20 14:18:05Z

A common idiom is to use a smart constructor.

module Email (email, fromEmail, Email()) where -- export the type, but not the constructor newtype Email = Email String -- export this email :: String -> Maybe Email email s | validEmail s = Just (Email s) | otherwise = Nothing -- and this fromEmail :: Email -> String fromEmail (Email s) = s

This will verify emails at runtime, not compile time.

For compile time verification, one would need to exploit a GADT-heavy variant of String, or use Template Haskell (metaprogramming) to do the checks (provided the email value is a literal).

Dependent types can also ensure values are of the right form, for those languages that support them (e.g. Agda, Idris, Coq). F-star is a variant of F# that can verify preconditions/postconditions, and achieve some advanced static checks.

If possible, I would love to see the GADT way of handling it. :)
@Sibi I'm not really doing that! :-P It would probably involve something like a singleton data C (c::Char) where A :: C 'a' ; ... and then data S (s::String) where Nil :: S "" ; Cons :: C c -> S s -> S (c:s). Then you have data Email where Email :: S s -> Proof s Start -> Email where Proof s Start painfully encodes a DFA at the type level for checking that the string is really an email. For final states you have Pi :: Proof "" End and for transitions "q1-c->q2" middle Pj :: C c -> Proof s q2 -> Proof (c:s) q1 etc. Painful and not convenient at all, IMO :-P
Ah, okay! I will see if I can come up with that implementation tonight, just for fun. :-)

Fyodor Soikin · Accepted Answer · 2017-02-20 13:56:24Z

I'd posit, the same way you do all your runtime error handling?

If you did it "with classes and properties for encapsulation", you'd throw an exception (ie in the setter) that some code, somewhere higher up in the call chain, would have to watch out for. It's not your "classes and properties" that magically solve this, it's your discipline to throw and catch exceptions. In most any FP language you have a wide arsenal of representations for signaling erroneous values/inputs, from simple Maybe or more detailed Either (or whatever these are called in F# ;), to full-blown exceptions, to forcing immediate-halt-with-stderr-message. As suits the current app/lib context.

"making illegal states unrepresentable" in types is for pre-empting as many easy-to-make-in-the-heat-of-the-moment developer mistakes as the type system / compiler understands how-to: not user error.

There's of course academic explorations and research into how we can shift handling ever-more classes of bugs to the static (compilation) side, in Haskell there's LiquidHaskell a prominent example. But until you have a time-machine, you can't retroactively prevent compilation of your program if the input it reads once compiled is erroneous :D in other words, the only way to prevent wrong email addresses is to impose a GUI that cannot possibly let one through.

In F#, Maybe and Either are Option and Choice (or Result in F# 4.1).

duplode · Accepted Answer · 2018-04-19 04:40:53Z

I usually do the way @chi has done. As he stated, you can also you Template Haskell to do checks on the provided email at compile time. An example of doing that:

#!/usr/bin/env stack {- stack --resolver lts-8.2 exec ghci --package email-validate --package bytestring -} {-# LANGUAGE StandaloneDeriving #-} {-# LANGUAGE DeriveLift #-} {-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE QuasiQuotes #-} import Language.Haskell.TH import Language.Haskell.TH.Quote import Language.Haskell.TH.Syntax import Data.ByteString.Char8 import Text.Email.Validate instance Lift ByteString where lift b = [|pack $(lift $ unpack b)|] instance Lift EmailAddress where lift email = lift (toByteString email) email :: QuasiQuoter email = QuasiQuoter { quoteExp = \str -> let (item :: EmailAddress) = case (validate (pack str)) of Left msg -> error msg Right email -> email in [|item|] }

Now, if you load this up in ghci:

> :set -XQuasiQuotes > [email|[email protected]|] "[email protected]" > [email|invalidemail|] <interactive>:6:1: error: • Exception when trying to run compile-time code: @: not enough input CallStack (from HasCallStack): error, called at EmailV.hs:36:28 in main:EmailV Code: quoteExp email "invalidemail" • In the quasi-quotation: [email|invalidemail|]

You can see how you get compile error on the invalid input.

And that can all be done in one file for experimentation? Very cool.
Yes. I use the exec ghci option specifically from stack in the script: docs.haskellstack.org/en/stable/GUIDE/#loading-scripts-in-ghci

Nikita Volkov · Accepted Answer · 2017-02-21 07:51:52Z

As it appears, both answers of @chi and @Sibi are what Refinement Types are about. I.e., the types which enclose other types, while restricting the range of supported values with a validator. The validation can be done both at run-time and compile-time depending on the use-case.

It just so happens that I've authored "refined", a library, which provides abstractions for both cases. Follow the link for an extensive introduction.

To apply this library in your scenario, in one module define the predicate:

import Refined import Data.ByteString (ByteString) data IsEmail instance Predicate IsEmail ByteString where validate _ value = if isEmail value then Nothing else Just "ByteString form an invalid Email" where isEmail = error "TODO: Define me" -- | An alias for convenince, so that there's less to type. type EmailBytes = Refined IsEmail ByteString

Then use it in any other module (this is required due to Template Haskell).

You can construct the values both at compile-time and run-time:

-- * Constructing ------------------------- {-| Validates your input at run-time. Abstracts over the Smart Constructor pattern. -} dynamicallyCheckedEmailLiteral :: Either String EmailBytes dynamicallyCheckedEmailLiteral = refine "[email protected]" {-| Validates your input at compile-time with zero overhead. Abstracts over the solution involving Lift and QuasiQuotes. -} staticallyCheckedEmailLiteral :: EmailBytes staticallyCheckedEmailLiteral = $$(refineTH "[email protected]") -- * Using ------------------------- aFunctionWhichImpliesThatTheInputRepresentsAValidEmail :: EmailBytes -> IO () aFunctionWhichImpliesThatTheInputRepresentsAValidEmail emailBytes = error "TODO: Define me" where {- Shows how you can extract the "refined" value at zero cost. It makes sense to do so in an enclosed setting. E.g., here you can see `bytes` defined as a local value, and we can be sure that the value is correct. -} bytes :: ByteString bytes = unrefine emailBytes

Also please beware that this is just a surface of what Refinement Types can cover. There's actually much more useful properties to them.

This is more of an ad than an answer; could you at least provide an example of how your library works here?
Thank you! It's been hard to find practical examples of refined or dependent types; used to solve common & mundane app development tasks. Your library is essentially a layer of abstraction and simplification of LH?

Community · Accepted Answer · 2017-05-23 11:33:24Z

This was answered for me recently.

Here's the post.

The context of your question is in regards to a valid email. The overall structure of your code will leverage Active Patterns:

module File1 = type EmailAddress = private | Valid of string | Invalid of string let createEmailAddress (address:System.String) = if address.Length > 0 then Valid address else Invalid address // Exposed patterns go here let (|Valid|Invalid|) (input : EmailAddress) : Choice<string, string> = match input with | Valid str -> Valid str | Invalid str -> Invalid str module File2 = open File1 let validEmail = Valid "" // Compiler error let isValid = createEmailAddress "" // works let result = // also works match isValid with | Valid x -> true | _ -> false

Collectives™ on Stack Overflow

How to make illegal values unrepresentable? [duplicate]

5 Answers 5

3 Comments

1 Comment

2 Comments

4 Comments

Comments

Linked

Hot Network Questions