Big Data in Sailing – International Moth data logger project
Background & objectives
Big Data is a wide-ranging term describing tools and techniques used to analyse complex, large-volume data sources in ways that were previously not possible using traditional computing. New applications continue to emerge, with highly diverse examples of innovation ranging from the LAPD reducing crime rates by building predictive models to car insurers calculating premiums based on driving styles.
Professional sport has taken great interest in Big Data, both for performance analysis and for backroom management with Moneyball-style analytics. Given that I’m a keen sailor in the International Moth class and we specialise in helping organisations adopt the latest Information Management techniques, we thought we would try to apply some of our expertise to this domain. There are a number of challenges to overcome in doing so, including a lack of commercially available sensors, so we’ve decided to create a Data Logger project, which we will be covering on our Blog. The aim is to build a data logger from a variety of sensors using the Arduino open-source electronics platform. This can then be used for repeatable analysis with the latest Big Data tools to gain performance insights. In this first post I’ll talk about the motivations for the project, the challenges, and where we are with prototyping our sensor package and analysis toolset.
The principal objectives of the project are to analyse data in order to:
- Achieve a repeatable way of assessing whether kit and settings changes impact performance.
- Analyse technique and compare approaches across helms.
- Collate data to identify patterns, insights and trends that we may not currently be aware of.
Using Big Data in Sailing presents its own set of challenges distinct from other sports. A large part of what makes sailing exciting is the sheer number of variables that affect performance and how difficult they are to model, understand and predict. Before racing, a wide range of variables can affect performance, such as kit selection, boat design and rig setup. Some of these are easier to measure than others: rig tension can be measured using a gauge, whereas measuring the shape of a sail when rigged is somewhat more difficult. During racing a whole host of new variables emerge, from weather conditions, tide and wind to sailor characteristics (ability, weight, fitness etc.).
Winners of the America’s Cup, Team Oracle, showed that leveraging data can help overcome seemingly impossible odds by achieving one of the most impressive sporting comebacks of all time. They were able to do this by investing heavily not only in hardware and software but also in sensors and trained staff. Team Oracle used around 300 sensors generating 3000 datapoints, ten times per second. Combining this with historical data, video and other datasets, it’s unsurprising they needed Oracle Exadata kit. Fortunately for our project the Moth is considerably smaller, is single-handed and has fewer controls and settings to change, so data volumes, whilst still significant, should be smaller.
We always recommend prototyping projects before committing to large and complex builds. To clarify some of my thinking, last season I installed a waterproof housing on my Moth so that I could use my Android smartphone as a temporary data logger. This proved to be a relatively easy way of collecting initial accelerometer, gyroscope and GPS data. It helped to progress thinking about which analytics were worth generating, how to streamline calculations and how to use the data in practice to make tuning decisions. I set the device to capture GPS coordinates, gyroscope and accelerometer (x, y, z) data twice per second. From this I was able to calculate real-time speed, average speeds and bearing.
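As a rough illustration of how speed and bearing can be derived from consecutive GPS fixes, here is a minimal Python sketch. It is not the code used on the boat: the function names, the (timestamp, latitude, longitude) tuple layout and the spherical-Earth radius are all assumptions made for the example, and it uses the standard haversine and initial-bearing formulas.

```python
import math

EARTH_RADIUS_M = 6_371_000.0    # mean Earth radius (spherical approximation, an assumption)
MS_TO_KNOTS = 3600.0 / 1852.0   # 1 knot = 1852 m per hour


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north (0 to 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def speed_knots(fix_a, fix_b):
    """Speed over the ground between two fixes, each a (timestamp_s, lat, lon) tuple."""
    dt = fix_b[0] - fix_a[0]
    if dt <= 0:
        raise ValueError("fixes must be in increasing time order")
    distance = haversine_m(fix_a[1], fix_a[2], fix_b[1], fix_b[2])
    return distance / dt * MS_TO_KNOTS
```

At two fixes per second the distance between consecutive points is only a few metres, so GPS noise can dominate a single-step speed; averaging over several fixes is one way to smooth that out.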
The analytics below are screenshots from Tableau, a Data Visualisation tool, for the first day of last year’s Parkstone Grand Prix, where fortunately the wind was reasonably consistent in terms of direction and speed. Using estimated windspeed and direction I calculated estimated polars (speed by sailing direction) and absolute Velocity Made Good (VMG), and coded some rules to determine lap numbers and times, port and starboard tacks, and when a tack or a gybe had been completed. One thing that became apparent during the prototyping was that the sensors do from time to time throw up errors that can be difficult to spot but can be corrected or filtered out. An example of this was when the number of connected satellites for the GPS fell to a level that generated inaccurate location fixes. To get around these issues I used our Data Mining & Profiling partner’s tool X88 Pandora to build a rule that compared average speed against leg, bearing and number of connected satellites. This should prove useful in the future as the project progresses.
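To make the VMG and tack rules a little more concrete, here is a hedged Python sketch of the kind of calculation involved. It is an illustration rather than the rules I actually coded: the function names are made up, course over the ground stands in for the boat’s heading (which ignores leeway and tide), and the satellite and speed thresholds in the filter are placeholder assumptions, much simpler than the X88 Pandora rule described above.

```python
import math


def signed_angle_deg(angle):
    """Wrap any angle in degrees into the range (-180, 180]."""
    wrapped = angle % 360.0
    return wrapped - 360.0 if wrapped > 180.0 else wrapped


def relative_wind_deg(course_deg, wind_from_deg):
    """Angle of the wind relative to the bow; positive means wind over the starboard side."""
    return signed_angle_deg(wind_from_deg - course_deg)


def tack_side(course_deg, wind_from_deg):
    """'starboard' when the wind comes over the starboard side, otherwise 'port'."""
    return "starboard" if relative_wind_deg(course_deg, wind_from_deg) >= 0 else "port"


def vmg_knots(speed_kn, course_deg, wind_from_deg):
    """Velocity made good towards the wind: positive upwind, negative downwind."""
    return speed_kn * math.cos(math.radians(relative_wind_deg(course_deg, wind_from_deg)))


def is_plausible_fix(satellites, speed_kn, min_satellites=5, max_speed_kn=35.0):
    """Crude quality check; both thresholds are assumed values, not measured ones."""
    return satellites >= min_satellites and 0.0 <= speed_kn <= max_speed_kn
```

Taking the absolute value of `vmg_knots` gives a single figure that can be compared across upwind and downwind legs, which is how I read “absolute VMG” here.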
So what did I learn from the data collected that day? One of the first insights was that my top upwind and downwind speeds do not appear to be evenly spread across port and starboard tack. I was considerably faster downwind on port tack and upwind on starboard tack. From the mapping data these differences didn’t appear to be due to significantly overstanding lay-lines and thus gaining higher speeds at the expense of VMG. Given that I was using estimated wind direction and speed numbers, the differences could instead be due to coincidental gusts whilst sailing on those tacks. I don’t think this was the case either, however, as the wind was reasonably consistent and, particularly upwind, the difference is a very noticeable couple of knots. This got me thinking about other key settings that could be noticeably different from tack to tack, such as the wand on the bow of the Moth, which is mounted to one side. Its relative length changes from tack to tack, especially upwind where the boat is sailed heeled over on top of the helm. At the time my wand controls were poorly set up, making it difficult to adjust length from tack to tack, so I decided to test this hypothesis by fitting a longer wand and logging data with an equally long wand on both tacks. Sure enough, this reduced the difference in average speeds from tack to tack.
Another interesting analytic is comparing speed to boat trim, i.e. the angle of the hull and foils relative to the water. In a Moth this can be adjusted by moving your body weight back and forward or by twisting the tiller to adjust the rudder angle. There’s a lot of debate in the Moth fleet about the optimum setting for downwind sailing. Some suggest keeping the bow down so that the boat can be driven hard in the waves; others suggest keeping the bow up and reasonably level so that foil lift is generated more from the cross section of the mainfoil than from more flap down, which adds drag and reduces speed. The scattergram shows that there is a cluster of faster speeds when the bow is slightly dropped, with faster speeds particularly on port tack. There is the odd high speed where the bow is slightly raised, but given this was during racing I wasn’t deliberately experimenting with different trim settings. I think this analytic will be useful in the future to shed some more light on this area and work out the optimum level of trim. I suspect the way forward will be to start doing tuning runs once I have accurate, real-time wind direction and strength figures and can produce similar analysis for a constant VMG.
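For anyone wondering how a trim angle can come out of a phone, here is a minimal Python sketch of one approach: estimating pitch from the accelerometer’s gravity reading and then averaging speed per trim band. It rests on several assumptions that are mine for the example: the phone is fixed with its x axis pointing at the bow and its z axis pointing up, the readings follow the Android convention (a device lying flat reads roughly +9.81 on z), and the boat’s own accelerations are small enough to ignore, which will not hold in waves. The function names and the one-degree bin width are also illustrative.

```python
import math
from collections import defaultdict


def pitch_deg(ax, ay, az):
    """Bow-up pitch in degrees from one accelerometer sample (m/s^2), assuming x = bow, z = up."""
    return math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))


def mean_speed_by_trim(samples, bin_width_deg=1.0):
    """Average speed per trim band.

    samples is an iterable of (pitch_deg, speed_kn) pairs; the result maps the
    lower edge of each pitch band to the mean speed recorded within it.
    """
    buckets = defaultdict(list)
    for pitch, speed in samples:
        band_start = math.floor(pitch / bin_width_deg) * bin_width_deg
        buckets[band_start].append(speed)
    return {band: sum(vals) / len(vals) for band, vals in sorted(buckets.items())}
```

In practice the raw pitch would need smoothing (or fusing with the gyroscope) before a table like this means much, but it gives a numeric counterpart to the scattergram.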
Tacking and gybing tend to be areas where significant gains can be made by helms who are able to keep speeds up and stay on the foils whilst changing direction. Techniques for doing this vary slightly, particularly across Moth hull designs, and what works for one person might not work for another. As expert Nathan Outteridge shows below, a lot of things happen during a tack. The average and actual 10 second gybe and tack breakdowns were useful for working out which were my best tacks and gybes and then looking into why. My worst tacks generally seem to be those where I don’t have good speed going into the tack and don’t have a slight amount of windward heel; I’m either flat or heeled to leeward. For gybes, one thing I’m looking forward to analysing is whether slow, arched gybes or fast gybes are better for average speed and VMG. Either way, looking at these analytics first makes measuring improvement and finding the relevant video footage to troubleshoot a lot easier. Going forward I suspect I’ll try to automate some of this analysis by calculating how flat the curves are.
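One possible way to automate the 10 second breakdowns is sketched below in Python. This is an assumption about how it could be done rather than what I have built: a manoeuvre is taken to be any sample where the tack side flips (so it does not distinguish tacks from gybes), the window is centred on that sample, and “flatness” is approximated by the share of entry speed retained at the slowest point. The function names and dictionary keys are hypothetical; the 2 Hz default matches the logging rate mentioned earlier.

```python
def find_manoeuvres(sides):
    """Indices where the tack side ('port' / 'starboard') differs from the previous sample."""
    return [i for i in range(1, len(sides)) if sides[i] != sides[i - 1]]


def window_summary(speeds, index, sample_hz=2, window_s=10):
    """Summarise boat speed in a window centred on a manoeuvre.

    'retained' is minimum speed divided by entry speed: close to 1.0 means a
    flat speed curve, lower values mean more speed was lost.
    """
    half = int(sample_hz * window_s / 2)
    lo = max(0, index - half)
    hi = min(len(speeds), index + half)
    window = speeds[lo:hi]
    entry = window[0]
    slowest = min(window)
    return {
        "entry": entry,
        "min": slowest,
        "mean": sum(window) / len(window),
        "retained": slowest / entry if entry > 0 else 0.0,
    }
```

Ranking manoeuvres by `retained` would give a quick shortlist of the best and worst ones to look up in the video footage.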
Scoping and next steps
The use of a smartphone was great for helping to form ideas around next steps and helped to find a number of insightful things buried in the data. To take the project forward, however, we really want to use a more advanced suite of sensors that can be fitted to multiple boats. To do this I’ve been brushing up on my electronics knowledge by learning about the Arduino platform and prototyping different sensor configurations. I’ve also got a wind speed and direction finder on order to get more accurate wind data. The eventual aim is to also incorporate other real-time sensor data such as wand movements and angle, mainsheet pulls, sail shape and angle, height above the water, foil angles, steering movement and other measurements. This will take time, however, given I don’t have a great deal of electronics expertise.
In the next blog I’ll spend some time going through the Data Architecture and IT components we are using, as well as some more detail on progress with the sensors. Stay tuned.