Brute-force all the things?
These days it’s hard to tell whether the computer saves us more time than it wastes. However recently I had an experience programming in Ruby that demonstrated to me that the computer can be our modern time-saving friend, especially when wielding a language like Ruby, delicately designed to just “get out of your way” and let you program. The story involves number crunching, eyebrow scrunching, and in the end, an unabashed brute-force beauty.
Backstory: A need (knee’d?) is born
So I like to run. A lot. I find running to be one of the number one things that clears my head and helps me craft code. Currently I’m training for the Staten Island Half this coming October. A number of training regimens prescribe running at different paces on different days in order to train the multitude of systems that come together synergistically to propel you down the road for any length of time.
Many of these pace tables are based on your current 5 kilometer race time. You find your 5K time and the run type you’re assigned for the day, and it tells you what your pace-per-mile/km should be. For example, while your prescribed pace for a 5K run might be 7.5 minutes per mile, the pace table for a long training run would have you running 10 minute miles.
Being a software developer at heart, I found manually looking up the paces by opening a book and flipping pages to be tedious and time-consuming. So I did what anyone would do: I went about writing an app in Ruby to do it for me. And now I found myself running barefoot careening toward hurdle number #1.
Hurdle #1 – Getting the data
How would I get the data from the pace tables into my app? I could store it all in a database, but how would I get it in there? I didn’t really feel like typing it manually, or attempting some sort of clunky OCR process. The whole reason I was doing this in the first place was to save time! So I decided that I would reverse engineer the formulas used to calculate the pace data – Which led me to do a cartwheeling 360° faceplant in front of hurdle #2.
Hurdle #2 – Math?!
So what were these equations? I don’t have a strong math background, and the background I do have is rusty with disuse. If it was your neighborhood street mural it would be marred by years of rain, road salt and graffiti. I did some research on the internets in search of athletic formulas but didn’t get as far as I’d have hoped – While some of the pace tables were easy to calculate because they were based on formulas created back in the 70s for estimating runner performance at varying distances, the majority of the pace equations remained as elusive as Meb Keflezighi from the back of the pack.
I tried a few naive approaches at calculating a few paces, but my unit tests quickly showed that they were very naive indeed. Nothing like a few tests to make you feel like a pair of old unwashed gym shorts. That’s when I decided that sometimes TDD really stands for Terribly Depressing Development. :)
So I plotted some of the data on a graph in Excel, and lo and behold! The data did not progress linearly. In each case, there was a slight curve. That’s what was throwing me off.
Up to this point, I had the program calculate the slope of the line between the first and last pace in the table, and then use the provided 5K time to find the correct point on the “line”. This approach was getting me close, but still wasn’t accurate enough because of that mysterious curve! Which led me to curvy hurdle #3.
class LongRun < RunType
def self.fastest_pace_km
HumanTime.new('4:00')
end
def self.slowest_pace_km
HumanTime.new('10:32')
end
def self.calc_pace(five_k_time)
x2 = ((five_k_time.total_minutes * 2) - 27) - 1
HumanTime.from_minutes((slope * x2 + fastest_pace_km.total_minutes))
end
private
def self.slope
(slowest_pace_km.total_minutes - fastest_pace_km.total_minutes) / (57 - 1)
end
end
Hurdle #3 – A curve
So how would I calculate this curve in the data? As stated earlier, “Math skills? What math skills?” I began looking into calculating points on curves, but it was taking too long, and it was my laziness which spawned this project in the first place. It was time to put the computer to work.
First of all, rather than calculating an actual curve, I just fudged it with a triangle. The high point is in the middle of the data line, and it tapers off to each endpoint. It was a lot easier to program (for me anyway), and still accurate enough for calculating a pace table used for distance running. You’ll notice in the code snippet that I still named this high point in the data the “radius” because at some point I would like to calculate this more accurately. But for now it’s working. (View the code in its full context , which is part of my Runby-Pace project on GitHub.)
def calc(five_k_time)
five_k_time = RunbyPace::PaceTime.new(five_k_time)
x2 = ((five_k_time.total_minutes * 2) - (MIDPOINT_X - 1)) - 1
RunbyPace::PaceTime.from_minutes(slope * x2 + @fastest_pace_km.total_minutes + curve_minutes(x2))
end
private
def curve_minutes(x_axis)
return 0 if @midpoint_radius_divisor == 0
midpoint_reduction = x_axis
midpoint = MIDPOINT_X
if midpoint_reduction > midpoint
midpoint_reduction = midpoint - (midpoint_reduction - midpoint)
midpoint_reduction = 0 if midpoint_reduction < 0
end
# TODO: Use an actual curve instead of a triangle to calculate the number of minutes to add.
midpoint_reduction / @midpoint_radius_divisor / 60
end
Hurdle #4 – Repetitive calculations (Brute-force anyone?)
But remember that each run type has it’s own curve height. How would I calculate that? My answer: I wouldn’t. That’s what computers are for. I just set the initial curve radius to be the same as half of the length of the data line - a circle. This creates a huge and wildly inaccurate curve, so I needed a divisor to cut it down to size. I could sit around with a calculator for a while calculating the divisor for each type of run, or I could just write this method to brute-force the divisor (Full version ):
def self.find_divisor(golden_paces, allowable_deviation = '00:01')
_, first_pace = golden_paces.first
last_pace = golden_paces[:'42:00']
viable_divisors = []
(1.0..5.0).step(0.025) do |candidate_divisor|
viable_divisor = nil
golden_paces.each do |five_k, golden_pace|
five_k_time = RunbyPace::PaceTime.new(five_k.to_s)
pace_data = RunbyPace::PaceData.new(first_pace, last_pace, candidate_divisor)
calculated_pace = pace_data.calc(five_k_time)
if !calculated_pace.almost_equals?(golden_pace, allowable_deviation)
viable_divisor = nil
break
end
viable_divisor = candidate_divisor
end
if viable_divisor != nil
viable_divisors << viable_divisor
end
end
if viable_divisors.length > 0
# puts viable_divisors
midpoint = (viable_divisors.length - 1) / 2
return viable_divisors[midpoint]
end
end
The code runs in less than a second. Now that is computers at their best. Tell them what to do. Make sure you have some tests that prove it does what you think you’re telling it to do, and then let it rip! And it does it, over and over and over again. Perfectly. Every time. Man and machine. Now if you’ll excuse me, I’m going for a run.