
An update! And interesting algorithms…

Hello! Has it really been 8 months since I last posted? My intent to keep the blog updated certainly didn’t pan out over the summer! Now that the cooler weather is here, though, I have more time on my hands, so I intend to change my habits!

Firstly, I apologise for the new look! I’m still getting the hang of WordPress themes. I’m a fan of the minimalist look, though this “Twenty Fourteen” certainly looks a little too spartan! I’ve never been great at the “making it pretty” part of web design, so stick with me and hopefully things will improve!

What I am better at is back-end coding. I particularly enjoy problem solving, so when the team behind the cycling club site came to me suggesting that one of the site’s functions could do with some improvements in speed, I jumped at the chance!

The function in question relates to validating GPS trails. The majority of the rides organised are attended by masses of riders who start at a given point, visit certain “checkpoints” to collect a receipt, signature, stamp or some other proof of visit, and then (though not always) return to the start. These are the types of ride I attend, though another way of gaining points is to ride a “DIY Permanent”, whereby you track your progress using a GPS device and then submit the trail, which is checked against known records of the route.

Using existing library functions, this part of the task is fairly straightforward: visit each point of the expected ride, find the closest point in the rider’s trail, and check that the distance between them is within a certain tolerance. The problem comes when there are several of these GPS trails to be checked and the riders have recorded a huge number of points along the way! One example I used was a 200km route with a suggested GPS trail of just over 1,000 points.
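The validation step described above can be sketched roughly like this. This is just an illustrative sketch, not the site’s actual code: the `Coordinate` record and `haversineMetres()` helper are stand-ins for whatever types and distance functions the real libraries provide.

```java
import java.util.List;

class TrailValidator {

    // Minimal stand-in for whatever coordinate type the real libraries use.
    record Coordinate(double lat, double lon) {}

    // Great-circle distance in metres between two points (standard haversine formula).
    static double haversineMetres(Coordinate a, Coordinate b) {
        double r = 6371000; // mean Earth radius in metres
        double dLat = Math.toRadians(b.lat() - a.lat());
        double dLon = Math.toRadians(b.lon() - a.lon());
        double h = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(a.lat())) * Math.cos(Math.toRadians(b.lat()))
                   * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * r * Math.asin(Math.sqrt(h));
    }

    // For each point on the expected route, find the nearest point in the rider's
    // trail; the ride passes if no expected point is further away than the tolerance.
    static boolean validate(List<Coordinate> expected, List<Coordinate> ridden,
                            double toleranceMetres) {
        for (Coordinate control : expected) {
            double nearest = Double.MAX_VALUE;
            for (Coordinate p : ridden) {
                nearest = Math.min(nearest, haversineMetres(control, p));
            }
            if (nearest > toleranceMetres) {
                return false;
            }
        }
        return true;
    }
}
```

Note the nested loop: with roughly 1,000 expected points against 67,000 ridden points, that is tens of millions of distance calculations, which is exactly where the execution time goes.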

The rider had submitted 67,000 points along their journey! Although, running locally, the software managed the task in an impressive 40 seconds, I’m told that wasn’t the biggest trail that has been seen!

So how could the times be improved? The task is currently almost entirely sequential, so only one processing core is used. One option would be to break the task into smaller chunks to allow some parallel processing of the trail, though given the cost of servers with multiple accessible cores compared to our current offering, I didn’t give this much consideration.

So how to simplify the task? Researching “line simplification” may lead you to the Ramer–Douglas–Peucker algorithm; specifically, Wikipedia’s entry: https://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm.

So how does it work? (For people not wanting to look just yet…) Take the start and finish points and draw a line between them, then find the point that lies furthest from this line. If this point is outside the required tolerance, mark it as a midpoint and create two new lines meeting there. Repeat the process for each new line, et voilà! You eventually get a set of lines representing the original route, with slightly reduced resolution but a potentially huge compression rate!

And the results of carrying out the filtering? The trail was reduced from 67,000 points to an impressive 1,200! Running the same path validation algorithm over this new track took just 10 seconds: a four-fold improvement!

Unfortunately it wasn’t all good news. As you can imagine, comparing all these points during the simplification step is still computationally intensive, and took a full 20 seconds to carry out. Still, 30 seconds in total compared to the original 40 is a 25% improvement.

So, the simplest option that was considered, simply “missing out points” from the control trail, currently looks the best in terms of implementation ease and improvement (using 1 in 4 points on the control trail and comparing against the full 67,000-point user trail took 17 seconds, incidentally), though we are making one last-ditch attempt to get our riders to tone down their data sampling!
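The “missing out points” option is just naive downsampling: keep every nth point, and make sure the final point survives so the route still ends where it should. A minimal sketch of that idea (not the production code):

```java
import java.util.ArrayList;
import java.util.List;

class Downsample {

    // Keep every nth point of a trail, always retaining the final point.
    static <T> List<T> everyNth(List<T> points, int n) {
        List<T> kept = new ArrayList<>();
        if (points.isEmpty()) {
            return kept;
        }
        for (int i = 0; i < points.size(); i += n) {
            kept.add(points.get(i));
        }
        // Re-add the last point if the stride skipped it, or the route ends early.
        T last = points.get(points.size() - 1);
        if (!kept.get(kept.size() - 1).equals(last)) {
            kept.add(last);
        }
        return kept;
    }
}
```

Applied with n = 4 to the 1,000-point control trail, this leaves roughly 250 points to compare against, which accounts for the 17-second figure above.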


And just in case anyone is interested, here is the Java implementation of the algorithm (you’ll need your own implementation of lineToPoint()):


private List<Coordinate> trim(List<Coordinate> coords, double epsilon) {
    double dmax = 0;
    int index = 0;
    int end = coords.size() - 1;

    // Find the point furthest from the line joining the first and last points
    List<Coordinate> theLine = new ArrayList<>();
    theLine.add(coords.get(0));
    theLine.add(coords.get(end));
    for (int i = 1; i < end; i++) {
        double dist = lineToPoint(theLine, coords.get(i));
        if (dist > dmax) {
            dmax = dist;
            index = i;
        }
    }

    List<Coordinate> resultList = new ArrayList<>();
    // If the max distance is greater than epsilon, recursively simplify both halves
    if (dmax > epsilon) {
        // subList's upper bound is exclusive, so include the split point in both halves
        List<Coordinate> recResults1 = trim(coords.subList(0, index + 1), epsilon);
        List<Coordinate> recResults2 = trim(coords.subList(index, coords.size()), epsilon);

        // Build the result list, dropping the duplicated split point
        resultList.addAll(recResults1);
        resultList.remove(resultList.size() - 1);
        resultList.addAll(recResults2);
    } else {
        // Everything between the endpoints is within tolerance; keep only the endpoints
        resultList.add(coords.get(0));
        resultList.add(coords.get(end));
    }
    return resultList;
}
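For anyone wanting a starting point for lineToPoint(), here is one possible sketch. It treats the coordinates as planar x/y values and uses the standard cross-product formula for perpendicular distance, which is a reasonable approximation over short segments; the `Coordinate` record here is a hypothetical stand-in, and a real implementation might well use proper geographic distances instead.

```java
import java.util.List;

class Geometry {

    // Hypothetical planar coordinate type; swap in your own GPS coordinate class.
    record Coordinate(double x, double y) {}

    // Perpendicular distance from point p to the infinite line through the
    // two points in 'line', via |cross product| / |segment length|.
    static double lineToPoint(List<Coordinate> line, Coordinate p) {
        Coordinate a = line.get(0);
        Coordinate b = line.get(1);
        double dx = b.x() - a.x();
        double dy = b.y() - a.y();
        double length = Math.hypot(dx, dy);
        if (length == 0) {
            // Degenerate segment: fall back to point-to-point distance.
            return Math.hypot(p.x() - a.x(), p.y() - a.y());
        }
        return Math.abs(dx * (a.y() - p.y()) - (a.x() - p.x()) * dy) / length;
    }
}
```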

What a week!

What a week it has been! We returned to Kent from visiting family in Derbyshire, unaware of the events in store…

It all started Monday morning with an ominous email reading “The server has disappeared!”. Never something you want to see, least of all on a Monday! The usual ping test confirmed the host was down, though the hosting company’s routers confirmed that the host itself was unreachable, so at least the DNS was not to blame.

Checking the provider’s news feed revealed they’d had a server outage the previous day: confirmation that something had gone awry at their end. After finding someone with the password for the dashboard, we raised a support ticket and got the server running again! Unfortunately, that was only the start of the problems!

As the server had gone offline unexpectedly, the file system had a few errors and so had been mounted read-only until we could run fsck over it to correct any problems. Being a hosted server, this had to be requested from the ever-helpful support team. That done, we managed approximately half an hour before the server switched the file system back to read-only mode! After several attempts to stabilise things, the decision was made to request a new server image and perform an emergency migration… not something to be taken lightly, given the server was still running Ubuntu 10.04!

With the new server freshly running 14.04 and updated, the long, slow process of installing the LAMP stack commenced, followed by the reinstatement of the databases and website pages. By Tuesday evening, everything appeared to be running. Or so it seemed! It turned out the website was relying on features disabled in the newly installed version of PHP! After trying a few code kludges to work around the matter, time was pressing on, so a downgrade looked the best option. But how to carry it out? A good old compile from source! That always takes forever in a server environment, since servers aren’t geared up for development, so package after package had to be added before PHP was finally compiled, installed, and back up and running!

So, by Wednesday evening, the majority of the functionality had been reinstated… A problem reading generated Excel spreadsheets was traced to bugs in a no-longer-supported PHP module (http://pear.php.net/bugs/bug.php?id=19284&edit=3), and usability issues relating to cookie usage across domains were rectified (rather than responding to several addresses, the site now forwards requests to one domain), so we’re up and ready for the weekend!

So what did I learn from the week?

  • Always keep up-to-date backups of the /etc/ directory! Luckily I had a copy I’d made last year available, but it would have been much easier had it been kept current!
  • PHP is evil…
  • And PHP code from last decade isn’t going to be happy on a fresh server.
  • Ask for passwords for the server’s control panel! I hadn’t asked, as I didn’t foresee a need for them…

So here’s to the weekend, and the 10-mile run on Sunday! (http://canterbury10.co.uk/)

With thanks to:

The support staff at Hosting UK (https://hostinguk.net/), who did all they could to ensure we were back up and running as quickly as possible.

My colleagues at AUKWEB (http://www.aukweb.net/), all volunteers, who set up the old server, coded the site, and assisted where they could in the migration.

Here’s to another decade of uptime!