Curve fitting with genetic algorithm
Before reading this blog, it would be great if you could read and familiarize yourself with Guessing a Number with Genetic Algorithm.
In this blog we would like to fit a line with a genetic algorithm for the equation \(y = 7x + 5\). So let’s write the equation in Julia:
y(x) = 7x + 5
Output:
y (generic function with 1 method)
Now let’s assign x to the range 0 to 5 in steps of 0.1:
x = 0 : 0.1 : 5
Output:
0.0:0.1:5.0
Now let’s plot the line
using Plots
scatter_plot = scatter(x, y.(x), xlims = (-0.1, 5.5), ylims = (0, 50))
There is a feature of Julia which I want you to remember. Let’s say we have a tuple (5, 7) and we assign it to two variables m and c as shown below:
m, c = (5, 7)
println("m = ", m)
println("c = ", c)
Output:
m = 5
c = 7
Then the first variable will take the first value of the tuple and the second will take the second value. We will be using this technique throughout this blog.
For a genetic algorithm, mutation (change) is very important. Let’s write a mutation function for a two-valued tuple as shown:
function mutate(value, mutations = 10)
[value + rand() - 0.5 for i in 1:mutations]
end
function mutate(mc::Tuple, number_of_mutations = 10)
m, c = mc
ms = mutate(m, number_of_mutations)
cs = mutate(c, number_of_mutations)
[(ms[i], cs[i]) for i in 1:number_of_mutations]
end
Output:
mutate (generic function with 4 methods)
and we test that function:
mutate((1, 2))
10-element Vector{Tuple{Float64, Float64}}:
(1.1205168123693385, 2.41514037266611)
(0.5079882676843603, 1.8217999723149112)
(1.1581739571074665, 2.459460436379917)
(0.5796738025677814, 1.9702636743774655)
(1.0321606616260794, 1.7632954510809746)
(1.3545156402167784, 1.8782505949943635)
(1.3407759822406362, 2.3597713445300306)
(0.8524954649600578, 2.4966166282430047)
(0.991673117157789, 2.4749204653332812)
(0.9933905544923662, 1.5623361787716572)
Now let me explain how it works. First we have a mutation function that takes a single value and mutates it, as shown below; to learn more about it you may go to the previous blog.
function mutate(value, mutations = 10)
[value + rand() - 0.5 for i in 1:mutations]
end
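For instance (an optional check; your numbers will differ, since rand() is random), mutating a single value gives 10 values scattered within ±0.5 of it:
mutate(5)   # returns a 10-element Vector{Float64} with values between 4.5 and 5.5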
Now we kind of want to extend this mutate function to tuples, so we write a declaration for it as shown:
function mutate(value, mutations = 10)
[value + rand() - 0.5 for i in 1:mutations]
end
function mutate(mc::Tuple, number_of_mutations = 10)
end
Note that unlike mutate(value, mutations = 10), the method mutate(mc::Tuple, number_of_mutations = 10) clearly says it needs a tuple. Now imagine I pass a tuple to mutate, like mutate((1, 2)); Julia will dispatch the call to mutate(mc::Tuple, number_of_mutations = 10) and not to mutate(value, mutations = 10).
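If you want to see this dispatch for yourself (a small optional check; the @which macro is available by default in the REPL and in IJulia), you can ask Julia which method handles each call:
@which mutate(3)        # points to mutate(value, mutations = 10)
@which mutate((1, 2))   # points to mutate(mc::Tuple, number_of_mutations = 10)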
So now let’s define the tuple mutate.
Let’s assume the tuple will have only two values, so the first value should be assigned to a variable m and the second one to c as shown:
function mutate(mc::Tuple, number_of_mutations = 10)
m, c = mc
end
Now all we need to do is to mutate m and c, number_of_mutations times each, and combine them. To mutate m and c we use the following code:
ms = mutate(m, number_of_mutations)
cs = mutate(c, number_of_mutations)
In the above code, mutate(m, number_of_mutations) and mutate(c, number_of_mutations) call mutate(value, mutations = 10). We store the mutations of m in the variable ms (I pronounce it as "emms") and those of c in cs ("seas"). Let’s plug this code into our function as shown:
function mutate(mc::Tuple, number_of_mutations = 10)
m, c = mc
ms = mutate(m, number_of_mutations)
cs = mutate(c, number_of_mutations)
end
Now we combine these mutations to form an Array of Tuples and return it as shown:
function mutate(mc::Tuple, number_of_mutations = 10)
m, c = mc
ms = mutate(m, number_of_mutations)
cs = mutate(c, number_of_mutations)
[(ms[i], cs[i]) for i in 1:number_of_mutations]
end
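As a side note (an equivalent way to pair up the two vectors, if you prefer it), the last line could also be written as:
collect(zip(ms, cs))   # pairs ms[i] with cs[i], giving the same Vector of Tuples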
Now let’s write an error function. Say our algorithm recommends an m and a c; for a given x, the prediction will be (m * x + c). During training, the actual value of y may differ from our prediction, so we subtract y from our prediction to get the error, as shown:
∆(m, c, x, y) = (m * x + c) - y
Output:
∆ (generic function with 1 method)
Let’s say our m is 5 and c is 4; let’s see what our error is when x is 10, as shown below:
∆(5, 4, 10, y(10))
Output:
-21
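That checks out: our prediction is \(5 \times 10 + 4 = 54\), while the actual value is \(y(10) = 7 \times 10 + 5 = 75\), and \(54 - 75 = -21\).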
Now suppose someone has 10 values of \(x\) and the corresponding \(y\), and wants us to fit a line to predict unknown values. Once we decide on a value of \(m\) and \(c\), we need to know the total error over all our predictions, so let’s write a total_error function as shown:
function total_error(m, c, x, y)
ΣΔ = 0
for i in 1:length(x)
ΣΔ += abs(∆(m, c, x[i], y[i]))
end
ΣΔ
end
Output:
total_error (generic function with 1 method)
In total_error, we have a variable ΣΔ to store the total error; we initialize it to zero as shown:
function total_error(m, c, x, y)
ΣΔ = 0
end
Now for each element of x
function total_error(m, c, x, y)
ΣΔ = 0
for i in 1:length(x)
end
end
we add the absolute value of the error to the total error variable ΣΔ
function total_error(m, c, x, y)
ΣΔ = 0
for i in 1:length(x)
ΣΔ += abs(∆(m, c, x[i], y[i]))
end
end
we return the total error variable ΣΔ
function total_error(m, c, x, y)
ΣΔ = 0
for i in 1:length(x)
ΣΔ += abs(∆(m, c, x[i], y[i]))
end
ΣΔ
end
Now let’s check the total_error function:
total_error(5, 4, [1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
Output:
80
Seems to work!
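That checks out: for each \(x\) the prediction \(5x + 4\) differs from the target \(x\) by \(4x + 4\), so the absolute errors for \(x = 1, \dots, 5\) are \(8, 12, 16, 20, 24\), which sum to \(80\).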
Let’s now write our top_survivors function. It takes an Array of \(m\) and \(c\) Tuples as mcs (I spell it as "emm sees"), arrays of training inputs namely x_train and y_train, and selects the top_percent of the mcs that deviate the least from y_train for the given x_train.
function top_survivors(mcs, x_train, y_train, top_percent = 10)
    # pair every (m, c) candidate with its total error on the training data
    errors_and_values = []
    for mc in mcs
        m, c = mc
        error = total_error(m, c, x_train, y_train)
        push!(errors_and_values, (error, mc))
    end
    # sorting the tuples puts the smallest errors (the fittest candidates) first
    sorted_errors_and_values = sort(errors_and_values)
    # keep only the top_percent best candidates
    end_number = Int(length(mcs) * top_percent / 100)
    sorted_errors_and_values[1:end_number]
end
Output:
top_survivors (generic function with 2 methods)
I would encourage you to read my previous Genetic Algorithm blog to understand the top_survivors function.
Now let’s set up the needed inputs for top_survivors; we get the initial mcs by mutating (0, 0):
x_train = [1, 2, 3, 4, 5]
y_train = y.(x_train)
mcs = mutate((0, 0))
Output:
10-element Vector{Tuple{Float64, Float64}}:
(0.16087862232389538, 0.018092305432591882)
(0.08312155751992667, -0.022630590705241538)
(-0.32885966826028445, -0.4936484604469795)
(-0.19041016854936288, 0.4707757045419352)
(0.011998626752193209, -0.276041600093212)
(-0.18351337891221542, -0.31274629114220875)
(0.014823442262999142, 0.03761731346306618)
(-0.19617757028312166, -0.11227454061420317)
(0.07142332361410042, -0.2765731664220268)
(0.3720719485797561, 0.44332686459739223)
Let’s test it out:
top_survivors(mcs, x_train, y_train)
Output:
1-element Vector{Any}:
(122.2022864483167, (0.3720719485797561, 0.44332686459739223))
Looks like our m of 0.3720719485797561 and c of 0.44332686459739223 are the fittest, with an error of 122.2022864483167. Now all we need to do is to select top survivors across many generations, so that we get a near perfect fit as shown:
generations = 40
top_survivor = nothing
mc = (0, 0)
for i in 1:generations
    mcs = mutate(mc)                                        # mutate the current best (m, c)
    top_survivor = top_survivors(mcs, x_train, y_train)[1]  # pick the fittest candidate
    _error, mc = top_survivor                               # carry it into the next generation
end
hm, hc = mc
p = scatter(x_train, y_train)
h(x) = hm * x + hc
plot!(p, x, h.(x))
Output:
The code below is almost the same as the code above, but note the @gif macro, which collects the plot images into a GIF file so that we can see it as an animation:
generations = 40
top_survivor = nothing
mc = (0, 0)
@gif for i in 1:generations
mcs = mutate(mc)
top_survivor = top_survivors(mcs, x_train, y_train)[1]
_error, mc = top_survivor
hm, hc = mc
p = scatter(x_train, y_train)
h(x) = hm * x + hc
plot!(p, x, h.(x), ylims = (0, 50))
end
Output:
┌ Info: Saved animation to
│ fn = /Users/mindaslab/data_science_with_julia/code/tmp.gif
└ @ Plots /Users/mindaslab/.julia/packages/Plots/g581z/src/animation.jl:104
See how in 40 generations our algorithm nearly fits a line to y(x) = 7x + 5. You may change the 7 and 5 in y(x) to anything, and over successive generations our algorithm will fit a line to the points better and better, for example as sketched below.
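For example (a small sketch; it assumes you rerun the earlier cells after the change), you could retrain on a different line:
y(x) = 3x - 2          # redefine the target line with any slope and intercept you like
y_train = y.(x_train)  # regenerate the training outputs
# now rerun the generations loop above; mc should converge towards roughly (3, -2)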
When I tried with the original y(x) = 7x + 5, these were the final values I got for \(m\) and \(c\):
mc
Output:
(6.838106002778459, 5.334256635688782)
On your computer the values could vary, but the line will be a near perfect fit for the points.
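If you are curious how good the fit is numerically, you can (as an optional check) feed the final mc back into total_error:
hm, hc = mc                            # unpack the fitted slope and intercept
total_error(hm, hc, x_train, y_train)  # total absolute error of the fitted line; should be small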
You can get the code for this blog here https://gitlab.com/data-science-with-julia/code/-/blob/master/genetic_alogorithms_line_fitting.ipynb.