Category Archives: AWS

An update! And interesting algorithms…

Hello! Has it really been 8 months since I last posted? My intent to keep the blog updated certainly didn’t fan out over the summer! Though now the cooler weather is here, I have more time on my hands so I intend on changing my habbits!

Firstly, I apologise for the new look! I’m still getting the hand of WordPress themes. I’m a fan of the minimalist look, though this “Twenty fourteen” certainly looks a little too spartan! I’ve never been great at the “Making it pretty” part of web design, so stick with me an hopefully things will improve!

The thing I am better at is back end coding. I particularly enjoy problem solving, so when the team behind the cycling club came to me with a suggestion that one of the site functions could do with some improvements in speed,  I jumped at the chance!

The function in question relates to validating GPS trails – the majority of the rides organised are attended my masses of riders who start at a given point, visit certain “Checkpoints” to get a receipt, signature, stamp or some other proof of visit, and then (though not always) return to the start. These are the types of ride I attend, though another way of gaining points is to ride a “DIY Permanent” whereby you track your progress using a GPS device and then submit the trail, which is checked against known records of the trail.

Using libraries of functions, this part of the task is fairly straightforward and is carried out by visiting each point of the expected ride, finding the closes point from the riders trail and comparing that to within a certain level tolerance. The problem comes when there are several of these GPS trails to be checked and the riders have recorded a huge number of points along the way! One example I used was a 200km trail that had a suggested GPS trail of just over 1000 points.

The user had submitted 67,000 points along their journey! Although running locally, the software managed the impressive task within 40 seconds, I’m told that wasn’t the biggest that has been seen!

So how could the times be improved? The task is currently pretty much sequential, so only one processing core is being used so one option would be to break up the task into smaller chunks to allow for some parallel processing of the trail. Though given the cost of servers with multiple accessible cores over our current offering, I didn’t give this much consideration.

So how to simplify the task? Researching “Line simplification” may lead you to Ramer, Douglas & Peucker’s algorithm. Specifically wikipedia’s entry:

So how does it work? (For people not wanting to look just yet…) You take the start and finish points, draw a line between, and then find the point that is furthest from this line… If this point is outside of the tolerance required, mark this point as a mid point and create two new lines meeting here. Repeat this process for the new lines created, et voila! You eventually get lines representing the original route, with slightly decreased resolution but potentially a huge compression rate!

And the results of carrying out the filtering? The trail was reduced from 67,000 points to an impressed 1,200! Running the same path validation algorithm over this new track resulted in an execution speed of just 10 seconds! a 400% improvement!

Unfortunately it wasn’t all good news. As you can imagine, comparing all these points is still a computing intensive activity and took a full 20 seconds to carry out. Still, 30 seconds total compared to 37 was a 24% improvement.

So, the simplest option that was considered, of simply “missing out points” from the control trail currently looks the best option in terms of implementation ease and improvement (Using 1 in 4 points on the control trail and comparing against the full path of 67,00 user points took 17 seconds, incidentally) though we are trying one last ditch attempt to get our riders to tone down the data sampling!


And just in case anyone is interested in the Java implementation of the algorithm (you’ll need your own implementation of lineToPoint()):


private List <Coordinate> trim(List <Coordinate> coords, int epsilon) {
 double dmax = 0;
 int index = 0;
 int end = coords.size() - 1;

// Find furthest point from line start to end
 List<Coordinate> theLine = new ArrayList();
for (int i = 1; i < end - 1; i++) {
 double dist = lineToPoint(theLine, coords.get(i));
 if (dist > dmax) {
 dmax = dist;
 index = i;

 List <Coordinate> resultList = new ArrayList();
 // If max distance is greater than epsilon, recursively simplify
 if ( dmax > epsilon ) {
 List <Coordinate> recResults1 = trim(coords.subList(0, index), epsilon);
 List <Coordinate> recResults2 = trim(coords.subList(index, end), epsilon);
 // Build the result list
 } else {
 return resultList;

AWS, a new comer’s views

As an IT professional it can be a daunting task trying to keep up with new technologies. A lot of job specifications I’ve seen have asked for Amazon or other cloud experience, so I decided it was time I had a look!

Previously I’ve administered several virtual servers hosted by hosting companies, all of which can be classed as cloud computing, given that the machine doesn’t actually exist and all of the processing is done “in the cloud”, so what are my initial thoughts having delved into AWS for the first time?

Mainly how easy it is to get started! The interface certainly looks daunting, though I managed to get a virtual server up and running on the EC2 (Elastic cloud computing) service in less than 10 minutes! And what’s better, they provide a modest machine known as a t2.micro, absolutely free of charge for the first year!

Delving a little deeper, following Ryan Kroonenberg’s “AWS Certified Developer 2016” course on Udacity, ( I was surprised by how easy it all was! I even manged to get a load balancer set up, linked to an EC2 server, and managed to Link to the Tomcat server seamlessly from Netbeans! And it was still free! (So long as you pick the right machine…) Admittedly the choice of course was biased towards getting certified rather than actually learning AWS, but it provides an excellent starting point!

Having developed an idea for a future Android app, the environment looks like the perfect place to host the back end processing given it’s scalable nature and high availability. Plus it will earn me some brownie points for the CV! Watch this space for details of the App in the future!