37

Let's say I have the logic below. How would I write that in functional programming?

public int doSomeCalc(int[] array) {
    int answer = 0;
    if (array != null) {
        for (int e : array) {
            answer += e;
            if (answer == 10) break;
            if (answer == 150) answer += 100;
        }
    }
    return answer;
}

The examples in most blogs and articles I see just explain the simple case of one straightforward math function, say 'sum'. But I have logic similar to the above written in Java and would like to migrate it to functional code in Clojure. If we can't do the above in FP, then the way FP is promoted doesn't state this limitation explicitly.

I know that the above code is totally imperative. It was not written with any forethought of migrating it to FP in the future.

  • 1
    Note that the combination of break and return answer can be replaced by a return inside the loop. In FP you could implement this early return using continuations, see e.g. en.wikipedia.org/wiki/Continuation Commented Jan 3, 2018 at 14:00
  • 2
    @Giorgio continuations would be an enormous overkill here. It's a loop anyway; to call its next iteration you do a tail call, so to break you just don't call it anymore and just return the answer. For nested loops, or other complicated control flow, that's where you might use continuations instead of having to restructure your code to use the above simple technique (which should always be possible, but may lead to an overly complex code structure which would more or less explicate the continuation; and for more than one exit point you'd certainly need them). Commented Jan 3, 2018 at 14:28
  • 8
    In this case: takeWhile. Commented Jan 3, 2018 at 14:39
  • 2
    @WillNess: I just wanted to mention it because it can be used to leave a complex computation at any point. It is probably not the best solution for the OP's concrete example. Commented Jan 3, 2018 at 16:14
  • @Giorgio you're right, it is the most comprehensive one, in general. actually this question is very broad, IYKWIM (i.e. would be closed on SO in a heartbeat). Commented Jan 3, 2018 at 16:18

7 Answers

46

The closest equivalent to looping over an array in most functional languages is a fold function, i.e. a function that calls a user-specified function for each value of the array, passing an accumulated value along the chain. In many functional languages, fold is augmented by a variety of additional functions that provide extra features, including the option to stop early when some condition arises. In lazy languages (e.g. Haskell), stopping early can be achieved simply by not evaluating any further along the list, which will cause additional values to never be generated. Therefore, translating your example to Haskell, I would write it as:

doSomeCalc :: [Int] -> Int
doSomeCalc values = foldr1 combine values
  where combine v1 v2 | v1 == 10  = v1
                      | v1 == 150 = v1 + 100 + v2
                      | otherwise = v1 + v2

Breaking this down line by line in case you're not familiar with Haskell's syntax, it works like this:

doSomeCalc :: [Int] -> Int 

Defines the type of the function, accepting a list of ints and returning a single int.

doSomeCalc values = foldr1 combine values 

The main body of the function: given the argument values, return foldr1 called with the arguments combine (which we'll define below) and values. foldr1 is a variant of the fold primitive that takes its starting accumulator from the list itself rather than as a separate argument (hence the 1 in the function name) and combines the elements right-associatively, i.e. it is a right fold (hence the r in the function name). So foldr1 f [1,2,3] is equivalent to f 1 (f 2 3) (or f(1,f(2,3)) in more conventional C-like syntax).

 where combine v1 v2 | v1 == 10 = v1 

Defining the combine local function: it receives two arguments, v1 (the current element) and v2 (the already-combined result of the rest of the list). When v1 is 10, it just returns v1. In that case v2 is never evaluated, so the rest of the list is never processed and the loop stops here.

 | v1 == 150 = v1 + 100 + v2 

Alternatively, when v1 is 150, adds an extra 100 to it, and adds v2.

 | otherwise = v1 + v2 

And, if neither of those conditions is true, just adds v1 to v2.

Now, this solution is somewhat specific to Haskell, because the fact that a right fold terminates if the combining function doesn't evaluate its second argument is caused by Haskell's lazy evaluation strategy. I don't know Clojure, but I believe it uses strict evaluation, so I would expect it to have a fold function in its standard library that includes specific support for early termination. This is often called foldWhile, foldUntil or similar.

A quick look at the Clojure library documentation suggests that it is a little different from most functional languages in naming, and that fold isn't what you're looking for (it's a more advanced mechanism aimed at enabling parallel computation) but reduce is the more direct equivalent. Early termination occurs if the reduced function is called within your combining function. I'm not 100% sure I understand the syntax, but I suspect what you're looking for is something like this:

(reduce (fn [v1 v2]
          (if (= v1 10)
            (reduced v1)
            (+ v1 v2 (if (= v1 150) 100 0))))
        array)

NB: both translations, the Haskell and the Clojure one, are not quite right for this specific code, but they convey the general gist of it -- see the discussion in the comments below for the specific problems with these examples.
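For reference, a corrected Clojure sketch that applies both checks to the running total after each element is added, as the Java code does, could look roughly like this (several of the answers below arrive at essentially the same shape):

(reduce (fn [answer e]
          (let [answer (+ answer e)]
            (cond
              (= answer 10)  (reduced answer)   ; stop early, like break
              (= answer 150) (+ answer 100)
              :else          answer)))
        0
        array)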

  • 11
    the names v1 v2 are confusing: v1 is a "value from array", but v2 is the accumulated result. And your translation is erroneous, I believe: the OP's loop exits when the accumulated (from the left) value hits 10, not when some element in the array does. Same with the incrementing by 100. If you want to use folds here, use a left fold with early exit, some variation on foldlWhile. Commented Jan 3, 2018 at 11:56
  • 2
    it is funny how the most wrong answer gets the most upvotes on SE... it's OK to make mistakes; you too are in good company :). But the knowledge discovery mechanism on SO/SE is definitely broken. Commented Jan 3, 2018 at 12:41
  • 1
    The Clojure code is almost correct, but the condition (= v1 150) tests the value before v2 (i.e. e) has been added to it. Commented Jan 3, 2018 at 13:05
  • 1
    Breaking this down line by line in case you're not familiar with Haskell's syntax -- You are my hero. Haskell is a mystery to me. Commented Jan 3, 2018 at 13:16
  • 15
    @WillNess It’s upvoted because it’s the most immediately understandable translation and explanation. The fact that it’s wrong is a shame but relatively unimportant here because the slight errors don’t negate the fact that the answer is otherwise helpful. But of course it should be corrected. Commented Jan 3, 2018 at 14:32
33

You could easily convert it to recursion, and the recursive call is in tail position, so it can be tail-call optimized.

Pseudocode:

public int doSomeCalc(int[] array) {
    return doSomeCalcInner(array, 0);
}

public int doSomeCalcInner(int[] array, int answer) {
    if (array is empty) return answer;
    // not sure how to efficiently implement head/tail array split in Clojure
    var head = array[0]   // first element of array
    var tail = array[1..] // remainder of array
    answer += head;
    if (answer == 10) return answer;
    if (answer == 150) answer += 100;
    return doSomeCalcInner(tail, answer);
}
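For what it's worth, a rough Clojure sketch of that pseudocode (the names are invented for this example) could use first/rest for the head/tail split and recur for the tail call:

(defn do-some-calc
  ([array] (do-some-calc array 0))
  ([array answer]
   (if (empty? array)
     answer
     (let [answer (+ answer (first array))]        ; answer += head
       (cond
         (= answer 10)  answer                     ; early exit
         (= answer 150) (recur (rest array) (+ answer 100))
         :else          (recur (rest array) answer))))))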
  • 14
    Yep. The functional equivalent to a loop is tail-recursion, and the functional equivalent to a conditional is still a conditional. Commented Jan 3, 2018 at 8:15
  • 4
    @JörgWMittag I'd rather say tail recursion is the functional equivalent to GOTO. (Not quite as bad, but still pretty awkward.) The equivalent to a loop, as Jules says, is a suitable fold. Commented Jan 3, 2018 at 10:28
  • 6
    @leftaroundabout I'd disagree actually. I'd say tail recursion is more constrained than a goto, given the need to jump to itself and only in tail position. It is fundamentally equivalent to a looping construct. I'd say recursion in general is equivalent to GOTO. In any case, when you compile tail recursion it mostly just boils down to a while (true) loop with the function body in where early return is just a break statement. A fold, whilst you are correct about it being a loop, is actually more constrained than a general looping construct; it is more like a for-each loop Commented Jan 3, 2018 at 11:02
  • 1
    @J_mie6 the reason I consider tail recursion more as a GOTO is that you need to do fiddly bookkeeping of what arguments in what state are passed to the recursive call, to ensure it actually behaves as intended. That is not necessary to the same degree in decently written imperative loops (where it's pretty clear what the stateful variables are and how they change in each iteration), nor in naïve recursion (where usually not much is done with the arguments, and instead the result is assembled in a quite intuitive way). ... Commented Jan 3, 2018 at 12:19
  • 1
    ... As for folds: you're right, a traditional fold (catamorphism) is a very specific kind of loop, but these recursion schemes can be generalised (ana- / apo- / hylomorphisms); collectively these are IMO the proper replacement for imperative loops. Commented Jan 3, 2018 at 12:19
14

I really like Jules' answer, but I wanted to additionally point out something people often miss about lazy functional programming, which is that not everything has to be "inside the loop." For example:

baseSums = scanl (+) 0
offsets = scanl (\offset sum -> if sum == 150 then offset + 100 else offset) 0
zipWithOffsets xs = zipWith (+) xs (offsets xs)
stopAt10 xs = if 10 `elem` xs then 10 else last xs
result = stopAt10 . zipWithOffsets . baseSums

result [1..]          -- 10
result [11..1000000]  -- 500000499945

You can see that each part of your logic can be calculated by a separate function and then composed together. This allows for smaller functions, which are usually much easier to troubleshoot. For your toy example, perhaps this adds more complexity than it removes, but in real-world code the split-apart functions are often much simpler than the whole.

    the logic is scattered all over here; this code won't be easy to maintain. stopAt10 is not a good consumer. Your answer is better than the one you cite in that it correctly isolates the basic producer scanl (+) 0 of values. Their consumption should incorporate the control logic directly though; it is better implemented with just two spans and a last, explicitly. That would closely follow the original code structure and logic, too, and be easy to maintain. Commented Jan 3, 2018 at 17:27
6

Most list processing examples you will see use functions like map, filter, sum etc. which operate on the list as a whole. But in your case you have a conditional early exit - a rather uncommon pattern which is not supported by the usual list operations. So you need to drop down an abstraction level and use recursion - which is also closer to how the imperative example looks.

This is a rather direct (probably not idiomatic) translation into Clojure:

(defn doSomeCalc
  ([lst] (doSomeCalc lst 0))
  ([lst sum]
   (if (empty? lst)
     sum
     (if (= sum 10)
       sum
       (let [sum (+ sum (first lst))
             sum (if (= sum 150) (+ sum 100) sum)]
         (recur (rest lst) sum))))))

Edit: Jules points out that reduce in Clojure does support early exit. Using that is more elegant:

(defn doSomeCalc [lst]
  (reduce (fn [sum val]
            (if (= sum 10)
              (reduced sum)
              (let [sum (+ sum val)
                    sum (if (= sum 150) (+ sum 100) sum)]
                sum)))
          0
          lst))

In any case, you can do anything in functional languages that you can do in imperative languages, but you often have to change your mindset somewhat to find an elegant solution. In imperative coding you think of processing a list step by step, while in functional languages you look for an operation to apply to all the elements of the list.

  • see the edit I just added to my answer: Clojure's reduce operation supports early exit. Commented Jan 3, 2018 at 8:55
  • @Jules: Cool - that is probably a more idiomatic solution. Commented Jan 3, 2018 at 8:57
  • Incorrect - or is takeWhile not a 'common operation'? Commented Jan 3, 2018 at 14:41
  • @jcast - while takeWhile is a common operation, it isn't especially useful in this case, because you need the results of your transformation before you can decide whether to stop. In a lazy language this doesn't matter: you can use scan and takeWhile on the results of scan (see Karl Bielefeldt's answer, which while it doesn't use takeWhile could easily be rewritten to do so), but for a strict language like clojure this would mean processing the entire list and then discarding results afterwards. Generator functions could solve this, however, and I believe clojure supports them. Commented Jan 4, 2018 at 10:12
  • @Jules take-while in Clojure produces a lazy sequence (according to the docs). Another way to tackle this would be with transducers (maybe the best one). Commented Jan 4, 2018 at 10:13
5

As pointed out by other answers, Clojure has reduced for stopping reductions early:

(defn some-calc [coll]
  (reduce (fn [answer e]
            (let [answer (+ answer e)]
              (case answer
                10  (reduced answer)
                150 (+ answer 100)
                answer)))
          0
          coll))

This is the best solution for your particular situation. You can also get a lot of mileage from combining reduced with transduce, which lets you use transducers from map, filter etc. However it is far from a complete answer to your general question.
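To sketch the reduced-with-transduce combination just mentioned, here is a minimal example with a placeholder (map identity) transducer in front of the same step function:

(defn calc-step [answer e]
  (let [answer (+ answer e)]
    (case answer
      10  (reduced answer)
      150 (+ answer 100)
      answer)))

;; completing adds the one-argument "finish" arity that transduce expects
(transduce (map identity) (completing calc-step) 0 [1 2 3 4 5])
;; => 10

Any other transducer (a filter, a mapping step, etc.) could be composed in place of (map identity) without touching the early-exit logic itself.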

Escape continuations are a generalized version of break and return statements. They are directly implemented in some Schemes (call-with-escape-continuation), Common Lisp (block + return, catch + throw) and even C (setjmp + longjmp). More general delimited or undelimited continuations as found in standard Scheme or as continuation monads in Haskell and Scala can also be used as escape continuations.

For example, in Racket you could use let/ec like this:

(define (some-calc ls)
  (let/ec break                            ; let break be an escape continuation
    (foldl (lambda (e answer)
             (let ([answer (+ answer e)])
               (case answer
                 [(10)  (break answer)]    ; return answer immediately
                 [(150) (+ answer 100)]
                 [else  answer])))
           0
           ls)))

Many other languages also have escape-continuation-like constructs in the form of exception handling. In Haskell you could also use one of the various error monads with foldM. Because they are primarily error-handling constructs, using exceptions or error monads for early returns is usually culturally unacceptable and possibly quite slow.

You can also drop down from higher-order functions to tail calls.

When using loops, you enter the next iteration automatically when you reach the end of the loop body. You can enter the next iteration early with continue or exit the loop with break (or return). When using tail calls (or Clojure's loop construct, which mimics tail recursion), you must always make an explicit call to enter the next iteration. To stop looping you just don't make the recursive call and return the value directly:

(defn some-calc [coll]
  (loop [answer 0, [e & es :as coll] coll]
    (if (empty? coll)
      answer
      (let [answer (+ answer e)]
        (case answer
          10  answer
          150 (recur (+ answer 100) es)
          (recur answer es))))))
  • 1
    Re using error monads in Haskell, I don't believe there is any real performance penalty here. They tend to get thought of along the lines of exception handling, but they don't work the same way and there isn't any stack walking required, so really shouldn't be a problem if used this way. Also, even if there's a cultural reason not to use something like MonadError, the basically-equivalent Either has no such bias towards only error handling, so can easily be used as a substitute. Commented Jan 4, 2018 at 10:26
  • @Jules I think returning Left does not prevent the fold from visiting the entire list (or other sequence). Not intimately familiar with Haskell standard library internals though. Commented Jan 4, 2018 at 10:30
2

The intricate part is the loop. Let us start with that. A loop is usually converted to functional style by expressing the iteration with a single function. An iteration is a transformation of the loop variable.

Here is a functional implementation of a general loop:

loop : v -> (v -> v) -> (v -> Bool) -> v
loop init iter cond_to_cont =
    if cond_to_cont init
    then loop (iter init) iter cond_to_cont
    else init

It takes an initial value of the loop variable, a function that expresses a single iteration (on the loop variable), and a condition for continuing the loop.
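As a point of reference, roughly the same general loop could be sketched in Clojure like this (the names are invented here):

(defn loop-fp [init iter cond-to-cont]
  (if (cond-to-cont init)
    (recur (iter init) iter cond-to-cont)   ; next iteration as a tail call
    init))

(loop-fp 0 inc #(< % 5))
;; => 5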

Your example uses a loop over an array which can also break out early. In your imperative language this capability is baked into the language itself; in functional programming it is usually implemented at the library level. Here is a possible implementation:

module Array (foldlc) where

foldlc : v -> (v -> e -> v) -> (v -> Bool) -> Array e -> v
foldlc init iter cond_to_cont arr =
    loop (init, 0)
         (λ (val, next_pos) -> (iter val (at next_pos arr), next_pos + 1))
         (λ (val, next_pos) -> and (cond_to_cont val) (next_pos < size arr))

In it:

I use a (val, next_pos) pair which contains the loop variable visible from outside and the position in the array, which this function hides.

The iteration function is slightly more complex than in the general loop; this version makes it possible to use the current element of the array. (It is in curried form.)

Such functions are usually named "fold".

I put an "l" in the name to indicate that the accumulation of the elements of the array is done in a left-associative manner, to mimic the habit of imperative programming languages of iterating over an array from low to high index.

I put a "c" in the name to indicate that this version of fold takes a condition that controls if and when the loop is stopped early.

Of course such utility functions are likely to be readily available in the base library shipped with the functional programming language used. I wrote them here for demonstration.
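A possible Clojure sketch of the same foldlc (again, names invented here) might be:

(defn foldlc [init iter cond-to-cont coll]
  (loop [acc init, s (seq coll)]
    (if (and (cond-to-cont acc) s)
      (recur (iter acc (first s)) (next s))
      acc)))

;; stops as soon as the accumulated value fails the condition,
;; even on an infinite sequence
(foldlc 0 + #(< % 100) (range))
;; => 105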

Now that we have all the tools that the imperative language provides built in, we can turn to implementing the specific functionality of your example.

The variable in your loop is a pair ('answer', a boolean that encodes whether to continue).

iter : (Int, Bool) -> Int -> (Int, Bool)
iter (answer, cont) collection_element =
    let new_answer = answer + collection_element
    in case new_answer of
        10  -> (new_answer, false)
        150 -> (new_answer + 100, true)
        _   -> (new_answer, true)

Note that I used a new "variable" 'new_answer'. This is because in functional programming I cannot change the value of an already initialized "variable". I do not worry about performance; the compiler may reuse the memory of 'answer' for 'new_answer' via lifetime analysis, if it decides that is more efficient.

Incorporating this into the foldlc function developed earlier:

doSomeCalc : Array Int -> Int
doSomeCalc arr = fst (Array.foldlc (0, true) iter snd arr)

"Array" here is the name of the module that exports the function foldlc.

"fst" and "snd" stand for functions that return the first and second component of their pair parameter:

fst : (x, y) -> x
snd : (x, y) -> y

In this case "point-free" style increases the readability of the implementation of doSomeCalc:

doSomeCalc = Array.foldlc (0, true) iter snd >>> fst 

(>>>) is function composition: (>>>) : (a -> b) -> (b -> c) -> (a -> c)

It is the same as above, just the "arr" parameter is left out from both sides of the defining equation.
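Expressed in Clojure terms, the whole construction might look like the following sketch, reusing the hypothetical foldlc from above:

(defn do-some-calc [arr]
  (first
    (foldlc [0 true]                          ; the (answer, continue?) pair
            (fn [[answer _] e]
              (let [answer (+ answer e)]
                (case answer
                  10  [answer false]
                  150 [(+ answer 100) true]
                  [answer true])))
            second                            ; continue while the flag is true
            arr)))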

One last thing: checking for the case (array == null). In better-designed programming languages, and even in badly designed languages with some basic discipline, one would rather use an optional type to express non-existence. This does not have much to do with functional programming, which the question is ultimately about, so I do not deal with it here.
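That said, in Clojure the nil case conveniently falls away for the fold-based versions above, since reduce with an explicit initial value simply returns that value for a nil collection:

(reduce + 0 nil)
;; => 0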

1

First, rewrite the loop slightly, such that each iteration of the loop either early-exits, or mutates answer exactly once:

public int doSomeCalc(int[] array) {
    int answer = 0;
    if (array != null) {
        for (int e : array) {
            if (answer + e == 10)
                return answer + e;
            else if (answer + e == 150)
                answer = answer + e + 100;
            else
                answer = answer + e;
        }
    }
    return answer;
}

It should be clear that the behavior of this version is exactly the same as before, but now, it's much more straightforward to convert to recursive style. Here's a direct Haskell translation:

doSomeCalc :: [Int] -> Int
doSomeCalc = recurse 0
  where
    recurse :: Int -> [Int] -> Int
    recurse answer [] = answer
    recurse answer (e:array)
      | answer + e == 10  = answer + e
      | answer + e == 150 = recurse (answer + e + 100) array
      | otherwise         = recurse (answer + e) array

Now it's purely functional, but we can improve it from both an efficiency and readability standpoint by using a fold instead of explicit recursion:

import Control.Monad (foldM)

doSomeCalc :: [Int] -> Int
doSomeCalc = either id id . foldM go 0
  where
    go :: Int -> Int -> Either Int Int
    go answer e
      | answer + e == 10  = Left (answer + e)
      | answer + e == 150 = Right (answer + e + 100)
      | otherwise         = Right (answer + e)

In this context, Left early-exits with its value, and Right continues the recursion with its value.


This could now be simplified a bit more, like this:

import Control.Monad (foldM)

doSomeCalc :: [Int] -> Int
doSomeCalc = either id id . foldM go 0
  where
    go :: Int -> Int -> Either Int Int
    go answer e
      | answer' == 10  = Left 10
      | answer' == 150 = Right 250
      | otherwise      = Right answer'
      where answer' = answer + e

This is better as final Haskell code, but it's now a bit less clear how it maps back to the original Java.
