
Languages like Haskell allow you to create your own operators. The following answer explains which punctuation characters are allowed in operators: https://stackoverflow.com/a/10548541/783743

Languages like JavaScript, on the other hand, do not allow punctuation characters (besides $ and _) in variable names. [1]

I am writing a compiler which compiles a subset of Haskell to JavaScript and I don't know how to convert the operators into valid JavaScript identifiers.

Hence I decided to map each punctuation character to a basic Latin lowercase letter (i.e. a-z). For example:

& = a
| = l
@ = q
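
A minimal sketch of this per-character mapping, written in Python purely for illustration (the compiler itself could be written in any language). Only the three pairs above come from the question; the table name `OPERATOR_LETTERS` and the function `operator_to_identifier` are hypothetical, and the remaining characters are deliberately left unmapped:

```python
# Partial table: only these three pairs appear in the question.
# Every other punctuation character is intentionally left unassigned.
OPERATOR_LETTERS = {"&": "a", "|": "l", "@": "q"}


def operator_to_identifier(op: str) -> str:
    """Replace each punctuation character in `op` with its assigned letter."""
    try:
        return "".join(OPERATOR_LETTERS[ch] for ch in op)
    except KeyError as err:
        # A character without an assigned letter is reported, not guessed.
        raise ValueError(f"no letter assigned to {err.args[0]!r}") from None
```

With this table, `operator_to_identifier("&&")` gives `aa`, while an operator such as `>>=` raises an error until letters are chosen for `>` and `=`.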

However, instead of deciding the character mapping myself, I first want to know whether anybody else has already done the same thing, or whether there is a standard that defines the mapping.

I realize that this question could become primarily opinion-based (which for some reason is strictly disallowed on Stack Overflow). Hence I'm only looking for canonical answers which state definitively that "this is the way to do it" (perhaps with a link). If you want to opine, you can do so in the comments.

There are currently 19 characters which I wish to map to letters:

! # $ % & * + . / < = > ? @ \ ^ | - ~ 

Although $ is a valid character for identifiers in JavaScript, it would be nice to map it to a letter too.


[1] Property names can contain special characters, but that's an ugly hack.

  • Haskell -> JS? Commented Jun 18, 2014 at 6:59
  • The question is: do you wish your js code to be human readable or not? Commented Jun 18, 2014 at 8:20
  • @didierc In my opinion True.aa(True) is more human readable than True["&&"](True). The latter case is more descriptive but in my opinion it looks ugly. Commented Jun 18, 2014 at 8:43
  • What I mean is: if you care about readability, of course you'll try to stick to common idioms (methods rather than array selectors), but if you don't, then it might make your life simpler to use whichever approach allows a direct mapping from Haskell identifiers to JS ones. Commented Jun 18, 2014 at 8:51
  • @didierc Yes, I do want the generated code to be readable. I would like people to be able to understand the generated code and integrate it with their JavaScript applications. Commented Jun 18, 2014 at 9:26

1 Answer

GHC uses what it calls z-encoding. For example, >>= is encoded as zgzgze. See https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/SymbolNames
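
A rough sketch of the idea in Python, to show why the two-character codes are unambiguous. The table is a partial subset written from memory of the linked page, so treat every entry as an assumption to check against it; `Z_CODES` and `z_encode` are illustrative names, not part of GHC:

```python
# Partial z-encoding table (assumed subset; verify against the GHC wiki page).
# Each symbol becomes 'z' plus a letter, and a literal 'z' is doubled, so an
# encoded operator can never collide with an ordinary alphanumeric name.
Z_CODES = {
    "&": "za", "|": "zb", "^": "zc", "$": "zd", "=": "ze", ">": "zg",
    "#": "zh", ".": "zi", "<": "zl", "-": "zm", "!": "zn", "+": "zp",
    "\\": "zr", "/": "zs", "*": "zt", "%": "zv",
    "z": "zz", "Z": "ZZ",
}


def z_encode(name: str) -> str:
    """Encode `name` so that it contains only ASCII letters and digits."""
    out = []
    for ch in name:
        if ch in Z_CODES:
            out.append(Z_CODES[ch])
        elif ch.isascii() and ch.isalnum():
            out.append(ch)  # plain letters and digits pass through unchanged
        else:
            # Characters outside this partial table are not handled here.
            raise ValueError(f"character {ch!r} is not covered by this sketch")
    return "".join(out)
```

Here `z_encode(">>=")` yields `zgzgze`, matching the example above, while a function named `gge` stays `gge`, so the two never clash.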


5 Comments

I appreciate the fact that you found out what GHC officially does. Hence +1. Nevertheless, expanding each punctuation character to a two-character code doubles the length of operators. When readability and understandability count, that is unacceptable.
The reason for expanding to two characters is to be completely unambiguous. You wouldn't want a function gge to conflict with the >>= operator. If you know that names don't mix symbols and letters, then you can get away with only an operator marker at the start of the name, say op_gge.
True. I was thinking along the lines of simply converting && to aa. However, if there's already a function named aa then I would compile it to $aa. Since $ is not a valid character in varids in Haskell and $ is allowed in identifiers in JavaScript, this would resolve all ambiguities while also keeping the length of the symbol to a minimum.
But if the $aa symbol is already taken, you'll have to find another way. C simply prepends an underscore to every symbol, and the same problem arises there, though the standard used to discourage such names for anything other than system/compiler code. You don't really have that luxury.
@didierc The $aa symbol can never be taken because Haskell doesn't allow $ in varids. The compiled JavaScript code will be namespaced. Hence it wouldn't cause any naming conflicts there either.
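
The `$`-prefix fallback discussed in these comments could be sketched as follows, in Python for illustration only. It assumes the compiler knows the set of ordinary identifiers in scope (called `taken` here); the names `OPERATOR_LETTERS` and `js_name_for_operator` are hypothetical, and the table holds only the pairs given in the question:

```python
# Only the three pairs from the question; the rest are left unassigned.
OPERATOR_LETTERS = {"&": "a", "|": "l", "@": "q"}


def js_name_for_operator(op: str, taken: set) -> str:
    """Map an operator to letters, prefixing '$' only on a name clash."""
    base = "".join(OPERATOR_LETTERS[ch] for ch in op)
    # Ordinary Haskell identifiers cannot contain '$', so the prefixed
    # form cannot clash with any of them.
    return "$" + base if base in taken else base
```

So `js_name_for_operator("&&", set())` gives `aa`, and `js_name_for_operator("&&", {"aa"})` gives `$aa`. One trade-off of this sketch is that the emitted name depends on which identifiers happen to be in scope; always adding a marker, as with `op_gge`, avoids that at the cost of longer names.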
