Recursive combination of elements using a function.
Posted on 22 September 2016
FOLD: Higher Order Function
Recursive combination of elements using a function.
Posted on September 22, 2016
In functional programming, fold (also termed reduce, accumulate, aggregate, compress, or inject) refers to a family of higher-order functions that recursively combine elements of a data structure using a function.
To In this article we’re going to be focusing on loops that make use of an accumulation variable, for example:
Sum of values
Product of values
Concatenation (sum of strings)
Scenario 1: Sum of Integer Values
We have an array of integers and need to return their combined sum.
Ok, quite easy. We created a pure function with a values parameter to accept an array of integers to be summed; but what if instead of summing the values, we need to find the product of the values.
Ok, again a fairly easy task… but did you notice how much of the code is similar between the sum and product functions? Essentially we only changed the name, and these two lines:
Ok, but how can we do this in a more “functional way”?
I say “more functional way”, because if you look at both of the above functions; they meet the requirements of a Pure Function, i.e. no side effects.
Well…, so let’s frame this question slightly differently: how can functional techniques assist us to reduce the amount of code duplication. So for this section we’re going to use a Higher-order Function;
Fold (reduce, accumulate, aggregate).
This takes a function, a collection of items and returns a value that is created by combining the items.
#include <iostream>
#include <vector>
#include <numeric>
#include <functional>
typedefstd::multiplies<int>mult;intmain(){// For the sake of legibility, I'll declare an initialised vector.
std::vector<int>v{1,2,3,4,5,6,7,8,9};std::cout<<std::accumulate(v.begin(),v.end(),0)<<std::endl;std::cout<<std::accumulate(v.begin(),v.end(),1,mult())<<std::endl;return0;}
I agree it’s less code, … but how exactly does that work, confused?
What’s is most likekly confusing the picture a bit is that we’ve chosen to use some Syntactic Sugar; shortened syntax within a programming language that is designed to make things easier to read or to express. Before we try to understand what’s going on; let’s first remove the sugar and reveal the expanded version:
#include <iostream>
#include <vector>
#include <numeric>
#include <functional>
intmain(){// For the sake of legibility, I'll declare an initialised vector.
std::vector<int>v{1,2,3,4,5,6,7,8,9};std::cout<<std::accumulate(v.begin(),v.end(),0,[](inttotal,intvalue){returntotal+value;})<<std::endl;std::cout<<std::accumulate(v.begin(),v.end(),1,[](inttotal,intvalue){returntotal*value;})<<std::endl;return0;}
Ok, so let me help you with understanding of this; we basically have a function called
aggregate
accumulate
array_reduce
fold
reduce
that operates on a list of values (the array) and also takes two additional parameter values:
The Initial or starting value; which in this case is 0 for sum, and 1 for product.
A Closure (or a block of code to process the values); this closure receives two values: total & value; to which we then apply an operator, for example: + for sum, and * for product.
Process: operation-by-operation:
The process starts with an initial value of 0 for the accumulator. The closure is then iteratively passed the current accumulator and the next value in the sequence, the closure then returns the sum of the accumulator plus the value; the result is then stored in the accumulator; overriding the previous value.
Underlying Operational Code
Too further alleviate any confusion, let’s have a look at a function for this:
C#
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
publicstaticTSourceAggregate<TSource>(thisIEnumerable<TSource>source,Func<TSource,TSource,TSource>func){Check.SourceAndFunc(source,func);// custom foreach so that we can efficiently throw an exception
// if zero elements and treat the first element differently
using(varenumerator=source.GetEnumerator()){if(!enumerator.MoveNext())throwEmptySequence();TSourcefolded=enumerator.Current;while(enumerator.MoveNext())folded=func(folded,enumerator.Current);returnfolded;}}
// Source from openjdk.java.netintreduce(intidentity,java.util.function.IntBinaryOperatorop){intresult=identity;for(intelement:thisstream)result=accumulator.applyAsInt(result,element)returnresult;}
sub reduce($$$){my($fn,$initial,$values)=@_;$accumulator=$initial;foreach$value(@{$values}){$accumulator=&$fn($value,$accumulator);}return$accumulator;}
Does this seem at all familiar? Go back and look at the two functions (sum & product) that we defined at the beginning of this article. Yes, it’s essentially the same process; the only key differences here are:
- use of a enumerator to step through the sequence as opposed to the `foreach`
- a closure called `func` that receives two parameters:
- `folded`; the accumulated value
- `enumerator.Current`; which is the last value retrieved from the sequence.
This basically says, we receive two values total, value, and then proceed to add them together total + value. This is a rudimentary example of a block of code (closure); far more complex logic and/or process can be performed in a closure. The only restrictions is that your block of code at the end, conforms to type of the closure:
i.e. We are provided two values; we add our processing code and finally we received a TSource, which is just a generic placeholder for the return value (the accumulated result). To further understand this function definition you should refer to Microsoft’s .Net definition: https://msdn.microsoft.com/en-us/library/bb548651(v=vs.110).aspx.
Conclusion
Whew, ok that’s enough for this article… I really hope you’re were able to follow along; please feel free to ask for more examples if you need them, even for more clarity on anything that you’re struggling with.
Performance Note:
It's also important to appreciate that the underlying code for Higher-order Functions are built by the compiler teams of each language, meaning they would have taken additional steps to ensure that this code is highly optimized.
Bonus
As a bonus we’re going to include code on how to solve the first Euler challenge using:
aggregate
accumulate
array_reduce
fold
reduce
The problem is as follows:
Multiples of 3 and 5: If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.
The solution is intentionally hidden, to give you an opportunity to try this yourself. Click on the Show / Hide button to see our solution.