Recursion and closures
This session covers two programming techniques which are very useful but I have found people often haven't come across or used in anger.
Recursion is when a function calls itself. This can really lend itself to neatly writing certain algorithms, and in those cases is often a lot simpler than an iterative algorithm.
Closures are functions which capture their environment. Often this means that there is a more general function which has some of its options bound to specific values, yielding a more specific function which is then called just with the unbound arguments. These work really neatly in languages such as R where variables can be functions.
Another similar topic I might add here would be generators, which are functions which
return but retain their internal state. In python this is easily done with the yield
keyword. However these aren't implemented in many languages yet so I will
omit them here, but you may want to look them up.
This session will have a brief overview and example, followed by a more challenging example for you to try. Solutions are at the end, but try not to look at them until you've given it a go.
I am going to give examples in rust, but you can use python or R just as easily for this module.
Recursion
The golden rule of recursion: make sure there is an exit condition for the function
For some reason the example often used for recursion is calculating the Fibonacci function \( f(x) = f(x - 1) + f(x - 2) \):
#![allow(unused)] fn main() { pub fn fibonacci_recursive(x: i32) -> i32 { // Remember the '_' branch is 'else' match x { 0 => 0, 1 => 1, _ => fibonacci_recursive(x - 1) + fibonacci_recursive(x - 2) } } let fib: Vec<i32> = (0..10).map(|x| fibonacci_recursive(x)).collect(); println!("{fib:?}"); }
But this clearly wasteful as we evaluate the same values repeatedly. This is a good
place to note the cached
library which
stores previous runs of the function with the same input parameters, and makes the
above code a lot more efficient:
use cached::proc_macro::cached;
#[cached]
pub fn fibonacci_recursive(x: i32) -> i32 {
match x {
0 => 0,
1 => 1,
_ => fibonacci_recursive(x - 1) + fibonacci_recursive(x - 2)
}
}
let fib: Vec<i32> = (0..10).map(|x| fibonacci_recursive(x)).collect();
println!("{fib:?}");
Note there is a similar decorator available in python.
This function is probably more obviously done with an iterative function anyway:
#![allow(unused)] fn main() { pub fn fibonacci_iterative(x: i32) -> i32 { if x == 0 || x == 1 { return x } else { let (mut fib, mut fib_minusone, mut fib_minustwo) = (0, 0, 1); for _ in 1..=x { fib = fib_minusone + fib_minustwo; fib_minustwo = fib_minusone; fib_minusone = fib; } fib } } let fib: Vec<i32> = (0..10).map(|x| fibonacci_iterative(x)).collect(); println!("{fib:?}"); }
For something more complex to which recursion is well suited, let's try something we all fondly remember, the flood fill from MS paint.
Let's reuse our PixelArray
image code from the data structures module:
#![allow(unused)] fn main() { use core::fmt; extern crate bmp; use bmp::Pixel; struct PixelArray { pixels: Vec<f64>, height: usize, width: usize } impl PixelArray { fn get_pixel(&self, x: usize, y: usize) -> f64 { let index = self.get_index(x, y); self.pixels[index] } fn set_pixel(&mut self, x: usize, y: usize, value: f64) { let index = self.get_index(x, y); self.pixels[index] = value; } fn get_index(&self, x: usize, y: usize) -> usize { if !self.valid_index(x, y) { panic!("Invalid index!"); } x + y * self.width } fn valid_index(&self, x: usize, y: usize) -> bool { x < self.width && y < self.height } fn from_bitmap(file: &str) -> Self { let img = bmp::open(file).unwrap_or_else(|e| { panic!("Failed to open: {}", e); }); let mut image_array = Self::empty_array(img.get_height() as usize, img.get_width() as usize); for y in 0..image_array.height { for x in 0..image_array.width { let px = img.get_pixel(x as u32, y as u32); let px_avg = px.b as f64; image_array.set_pixel(x, y, px_avg); } } image_array } fn to_bitmap(&self, file: &str) { let mut img = bmp::Image::new(self.width as u32, self.height as u32); for y in 0..self.height { for x in 0..self.width { let px_val = (self.get_pixel(x, y)) as u8; let px = Pixel::new(px_val, px_val, px_val); img.set_pixel(x as u32, y as u32, px); } } img.save(file).unwrap_or_else(|e| { panic!("Failed to save: {}", e) }); } } }
Consider this pre-prepared tiger:
Right click and save the bitmap to get your input image.
What we want to do is fill in all of the white pixels in the bounded region as black. One
way of doing this is to turn the pixel we click on, the start_pixel
, black. Then we can check
its neighbours, filling them in as black if they are currently white, and not doing anything
if they are already black. Then, for each of the originally white pixels, we would check its neighbours
and do the same thing. As we don't know the start and end of the bounded regions it's hard to do this in
a loop. However, the function of checking white/black and recolouring, then doing the same on the neighbours
is quite easy. We just need to call the function again within itself. This is recursion.
So, have a look at flood_fill()
function in the code below which shows how to put this
together to make a paint bucket function:
use std::collections::HashSet; extern crate bmp; use bmp::Pixel; impl PixelArray { pub fn flood_fill(&mut self, start: (usize, usize), filled: &mut HashSet<(usize, usize)>) { filled.insert(start); // Keep track of nodes already processed // Change this pixel from white to black if self.get_pixel(start.0, start.1) == 255.0 { self.set_pixel(start.0, start.1, 0.0); // Fill from left pixel outward let pixel_left = (start.0.wrapping_sub(1), start.1); if self.valid_index(pixel_left.0, pixel_left.1) && !filled.contains(&pixel_left) { self.flood_fill(pixel_left, filled); } // Fill from right pixel outward let pixel_right = (start.0.wrapping_add(1), start.1); if self.valid_index(pixel_right.0, pixel_right.1) && !filled.contains(&pixel_right) { self.flood_fill(pixel_right, filled); } // Fill from top pixel outward let pixel_top = (start.0, start.1.wrapping_add(1)); if self.valid_index(pixel_top.0, pixel_top.1) && !filled.contains(&pixel_top) { self.flood_fill(pixel_top, filled); } // Fill from bottom pixel outward let pixel_bottom = (start.0, start.1.wrapping_sub(1)); if self.valid_index(pixel_bottom.0, pixel_bottom.1) && !filled.contains(&pixel_bottom) { self.flood_fill(pixel_bottom, filled); } } } } fn main() { // Load the image let mut image = PixelArray::from_bitmap("tiger_lines.bmp"); // let start_pixel = (239, 179); // does a stripe let start_pixel = (221, 173); // does most of the tiger let mut filled = HashSet::new(); image.flood_fill(start_pixel, &mut filled); // Save the filled image image.to_bitmap("tiger_fill.bmp"); }
You should get something like this:
Try varying the start position.
Merge sort
Here is a more challenging problem to try, implementing a merge sort using a recursive algorithm. A merge sort works well when lists are nearly sorted. The idea is that we will take an unsorted list, split it into smaller lists, sort those smaller lists, then merge these sorted lists back together in order.
First, we need a merge function that will combine two sorted lists. We can do this by keeping two pointers (or iterators, in rust) which move down the lists. We take the currently smallest of the two, and move the pointer forward one in its list. This only goes through the lists once so is linear time.
#![allow(unused)] fn main() { pub fn merge(list1: &[i32], list2: &[i32]) -> Vec<i32> { let out_len = list1.len() + list2.len(); let mut list_out = Vec::with_capacity(out_len); let mut list1_iter = list1.iter(); let mut list2_iter = list2.iter(); let mut list1_item = list1_iter.next().unwrap(); let mut list2_item = list2_iter.next().unwrap(); while list_out.len() < out_len { if list1_item < list2_item { list_out.push(*list1_item); list1_item = list1_iter.next().unwrap_or(&i32::MAX); } else { list_out.push(*list2_item); list2_item = list2_iter.next().unwrap_or(&i32::MAX); } } list_out } }
First, use assert!()
to test this works with some simple input.
Then, try and write a recursive function to take two lists and sort them.
A handy function in rust is let (left, right) = list.split_at(index);
. If you keep halving
the left and right lists, sorting them, then merging them in order (using the function above)
you'll get a sorted list. Also note that if you call merge()
on two lists both of length 1, they
will be sorted (as a list of length 1 is already sorted by definition).
Some of the rust borrowing rules can be a bit of a headache here so if you are finding them annoying switch to python or R, the code will be very similar.
The golden rule of recursion: make sure there is an exit condition for the function
A solution can be found at the end.
Closures
Closures capture some of the environment when they are defined. They are really useful when you need to match a function's arguments. Typically this will be a function where you can't change the arguments, usually from an external library.
Here are some examples of doing this in python, rust and R.
python
Use the partial
function from the functools
library:
from functools import partial
def linear_regression_predict(x, slope, intercept):
return intercept + x * slope
model1 = partial(linear_regression_predict, slope=-2, intercept=3)
predict_values = [-1, 0, 1, 5]
model1_predictions = [model1(x) for x in predict_values]
# [5, 3, 1, -7]
Here we have a more general function which predicts the y-axis value for a linear regression, given an input x value, slope and intercept. If we have fitted the slope and intercept we can 'bind' or 'capture' these to make a more specific predict function for that model. This is then more convenient to use elsewhere.
rust
The same example in rust:
pub fn linear_regression_predict(x: f64, slope: f64, intercept: f64) -> f64 { intercept + x * slope } pub fn main() { let slope = -2.0; let intercept = 3.0; let model1 = |x: f64| -> f64 { linear_regression_predict(x, slope, intercept) }; let predict_values = vec![-1.0, 0.0, 1.0, 5.0]; let model1_predictions: Vec<f64> = predict_values.iter().map(|x| model1(*x)).collect(); println!("{model1_predictions:?}"); // [5, 3, 1, -7] }
Here you use ||
to define the input variables for the new function. Adding the move
keyword
before this means the function will additionally take ownership of the other captured variables.
Note also the syntax for .map()
also uses a closure. Closures can be used anywhere you want
to define a unnamed function, which is useful for very short functions. These are sometimes
called lambdas. We could use this to replace the linear_regression_predict()
function entirely,
making the code more concise:
pub fn main() { let slope = -2.0; let intercept = 3.0; let predict_values = vec![-1.0, 0.0, 1.0, 5.0]; let model1_predictions: Vec<f64> = predict_values.iter().map(|x| intercept + x * slope).collect(); println!("{model1_predictions:?}"); // [5, 3, 1, -7] }
R
R has really great support for closures, as variables can be set to functions with no extra work:
linear_regression_predict <- function(x, slope, intercept) {
intercept + x * slope
}
slope <- -2.0
intercept <- 3.0
model1 <- function(x) {linear_regression_predict(x, slope, intercept)}
predict_values <- c(-1.0, 0.0, 1.0, 5.0)
model1_predictions <- sapply(predict_values, model1)
model1_predictions
}
We can use sapply()
in this way as the function operates on the value in the list.
Otherwise we would have to change the arguments to take a vector and give slope
and intercept
with every call.
Alternatively we could use apply()
which has a ...
argument allowing other parameters to be passed,
but with any other nesting of functions this gets complicated.
Optimiser using a closure
Let's stick with R and use a closure to make running an optimiser possible. A common example is Rosenbrock's function: \[f(x,y) = (a-x)^2 + b(y-x^2)^2 \]
As an R function:
rosenbrock <- function(a, b, x, y) {
(a-x)^2 + b*(y-x^2)^2
}
The optim()
function can be used to maximise or minimise a function, but the first
argument par
needs to be a vector of parameters, which doesn't match the four defined
by that function.
Write a closure around rosenbrock()
which captures a = 1
, b = 100
and uses
x = input[1]
, y = input[2]
so it can be passed to optim()
. By default optim()
finds the minimum -- what is it, and at which co-ordinates?
If you are feeling brave, also define the derivative (for any \( a \) and \( b \)), and try some different optimisers.
Solutions are at the end.
Solutions
Merge sort
Merge sort:
pub fn merge(list1: &[i32], list2: &[i32]) -> Vec<i32> { let out_len = list1.len() + list2.len(); let mut list_out = Vec::with_capacity(out_len); let mut list1_iter = list1.iter(); let mut list2_iter = list2.iter(); let mut list1_item = list1_iter.next().unwrap(); let mut list2_item = list2_iter.next().unwrap(); while list_out.len() < out_len { if list1_item < list2_item { list_out.push(*list1_item); list1_item = list1_iter.next().unwrap_or(&i32::MAX); } else { list_out.push(*list2_item); list2_item = list2_iter.next().unwrap_or(&i32::MAX); } } list_out } pub fn merge_sort(list: Vec<i32>) -> Vec<i32> { let (left, right) = list.split_at(list.len() / 2); let mut left = left.to_vec(); let mut right = right.to_vec(); if left.len() > 1 { left = merge_sort(left); } if right.len() > 1 { right = merge_sort(right); } merge(&left, &right) } fn main() { let mut list = vec![1,5,4,2,3,3,22,1,1]; list = merge_sort(list); println!("{list:?}"); }
Optimiser
start_value <- c(-1.2, 1)
rosenbrock <- function(a, b, x, y) {
(a-x)^2 + b*(y-x^2)^2
}
rosen <- function(x) {rosenbrock(1, 100, x[1], x[2])} # or we could write a <- 1; b <- 100 above
optim(start_value, rosen)
# With the gradient
gradient <- function(a, b, x, y) {
ddx <- -2*a + 4*b*x^3 - 4*b*x*y + 2*x
ddy <- 2*b*(y-x^2)
c(ddx, ddy)
}
rosen_g <- function(x) {gradient(1, 100, x[1], x[2])}
optim(start_value, rosen, rosen_g, method = "BFGS", control = list(trace = 1))