
Big Data in Sailing – International Moth data logger project

Data Logger used to collect Moth sailing data


Background & objectives

Big Data is a wide-ranging term describing tools and techniques used to analyse complex, large-volume data sources in ways that were not previously possible using traditional computing. New applications continue to emerge, with highly diverse examples of innovation ranging from the LAPD reducing crime rates by building predictive models to car insurers calculating premiums based on driving styles.

Professional sport has taken great interest in Big Data for performance analysis and for backroom management with Moneyball-style analytics. Given I’m a keen sailor in the International Moth class and we specialise in helping organisations adopt the latest Information Management techniques, we thought we would try to apply some of our expertise to this domain. There are a number of challenges to overcome in doing so, including a lack of commercially available sensors, so we’ve decided to create a Data Logger project which we will be covering on our Blog. The aim is to build a data logger with a variety of sensors using the Arduino open-source electronics platform. This can then be used for repeatable analysis using the latest Big Data tools to gain performance insights. In this first post I’ll talk about the motivations for the project, the challenges, and where we are with prototyping our sensor package and analysis toolset.

The principal objectives of the project are to analyse data in order to:

  • Achieve a repeatable way of assessing whether kit and settings changes impact performance.
  • Analyse technique and compare approaches across helms.
  • Collate data to identify patterns, insights and trends that we may not currently be aware of.

 

 

Challenges

Using Big Data in Sailing presents its own set of challenges distinct from other sports. A large part of what makes it an exciting sport is the sheer number of variables that impact performance and how difficult these variables are to model, understand and predict. Before racing there is a wide range of variables that can impact performance, such as kit selection, boat design and rig setup. Some of these variables are easier to measure than others. Rig tension, for example, can be measured using a gauge; measuring the shape of a sail when rigged, on the other hand, can be somewhat more difficult. During racing a whole host of new variables also emerge, from weather conditions, tide and wind to sailor characteristics (ability, weight, fitness etc.).

Winners of the America’s Cup, Team Oracle, proved that leveraging data can help overcome the impossible by achieving one of the most impressive sporting comebacks of all time. They were able to do this by investing heavily not only in hardware and software but also in sensors and trained staff. Team Oracle used around 300 sensors generating 3000 datapoints, ten times per second. Combining this with historical data, video and other datasets, it’s unsurprising they needed Oracle Exadata kit. Fortunately for our project, however, the Moth is considerably smaller, is single-handed and has fewer controls and settings to change, so data volumes, whilst still significant, should be smaller.

Prototyping

We always recommend prototyping projects before committing to large and complex builds. To clarify some of my thinking, last season I installed a waterproof housing on my Moth so that I could use my Android smartphone as a temporary data logger. This proved to be a relatively easy way of collecting initial accelerometer, gyroscope and GPS data. It helped to progress thinking about which analytics were worth generating, how to streamline calculations and how to use the data in practice to make tuning decisions. I set the device to capture GPS coordinates, gyroscope and accelerometer (x, y, z) data twice per second. From this I was able to calculate real-time speed, average speeds and bearing.
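As a rough illustration of how speed and bearing can be derived from successive GPS fixes, here is a minimal Python sketch. It is not the code used for the prototype: the function names, the `(time, lat, lon)` fix layout and the spherical-Earth (haversine) approximation are all assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0   # mean Earth radius (spherical approximation)
MS_TO_KNOTS = 1.943844       # metres per second to knots


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def speed_knots(fix_a, fix_b):
    """Speed between two fixes, each a (time_seconds, lat, lon) tuple (hypothetical layout)."""
    dt = fix_b[0] - fix_a[0]
    if dt <= 0:
        raise ValueError("fixes must be in increasing time order")
    return haversine_m(fix_a[1], fix_a[2], fix_b[1], fix_b[2]) / dt * MS_TO_KNOTS


def average_speed_knots(fixes):
    """Average speed over a list of fixes: total distance sailed / elapsed time."""
    if len(fixes) < 2:
        raise ValueError("need at least two fixes")
    dist = sum(
        haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(fixes, fixes[1:])
    )
    return dist / (fixes[-1][0] - fixes[0][0]) * MS_TO_KNOTS
```

At two fixes per second the distance between consecutive points is only a few metres, so GPS noise can dominate a single-interval speed; averaging over several fixes, as `average_speed_knots` does, is one simple way to smooth that.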

The analytics below are screenshots from Tableau, a Data Visualisation tool, for the first day of last year’s Parkstone Grand Prix, where fortunately the wind was reasonably consistent in terms of direction and speed. Using estimated wind speed and direction I calculated estimated polars (speed by sailing direction) and absolute Velocity Made Good (VMG), and coded some rules to determine lap numbers and times, port and starboard tacks, and when a tack or a gybe had been completed. One thing that became apparent during the prototyping was that the sensors do from time to time throw up errors that can be difficult to spot but can be corrected or filtered out. An example of this was when the number of connected satellites for the GPS fell to a number that generated inaccurate fixes for location data. To get around these issues I used our Data Mining & Profiling partner’s tool X88 Pandora to build a rule that compared average speed against leg, bearing and number of connected satellites. This should prove useful in the future as the project progresses.
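The calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Tableau or X88 Pandora rules used in the prototype: the function names are hypothetical, wind direction is assumed to be given as the compass direction the wind blows from, and the satellite and speed thresholds are placeholder values rather than figures from the project.

```python
import math


def signed_angle_deg(a, b):
    """Smallest signed difference a - b in degrees, in the range (-180, 180]."""
    d = (a - b + 180.0) % 360.0 - 180.0
    return 180.0 if d == -180.0 else d


def vmg_knots(speed, course_deg, wind_from_deg):
    """Velocity made good towards the wind: positive upwind, negative downwind."""
    twa = signed_angle_deg(course_deg, wind_from_deg)  # angle between course and wind
    return speed * math.cos(math.radians(twa))


def absolute_vmg_knots(speed, course_deg, wind_from_deg):
    """Magnitude of VMG, so upwind and downwind legs can be compared directly."""
    return abs(vmg_knots(speed, course_deg, wind_from_deg))


def tack(course_deg, wind_from_deg):
    """'starboard' if the wind comes over the starboard side of the boat, else 'port'."""
    wind_off_bow = signed_angle_deg(wind_from_deg, course_deg)
    return "starboard" if wind_off_bow > 0 else "port"


def is_plausible_fix(satellites, speed, min_satellites=5, max_speed=35.0):
    """Crude quality filter; both thresholds are placeholder assumptions."""
    return satellites >= min_satellites and 0.0 <= speed <= max_speed
```

For example, with the wind from due north (0°), a boat doing 10 knots on a course of 45° has the wind over its port side and a VMG of about 7.1 knots. A change in the value returned by `tack` between consecutive samples is one possible way to flag that a tack or gybe has taken place.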

So what did I learn from the data collected that day? One of the first insights was that my top upwind and downwind speeds do not appear to be evenly spread across port and starboard tack. I was considerably faster downwind on port tack and upwind on starboard tack. From the mapping data these differences didn’t appear to be due to significantly overstanding lay-lines and thus gaining higher speeds at the expense of VMG. Given I was using estimated wind direction and speed numbers, the differences could be due to coincidental gusts whilst sailing on those tacks. I don’t think this was the case either, however, as the wind was reasonably consistent and, particularly upwind, the difference is a hugely noticeable couple of knots. This got me thinking about other key settings that could be noticeably different from tack to tack, such as the wand on the bow of the Moth, which is mounted to one side. Its relative length changes from tack to tack, especially upwind where the boat is sailed heeled over on top of the helm. At the time my wand controls were poorly set up, making it difficult to adjust length from tack to tack, so I decided to take this hypothesis forward by fitting a longer wand and logging data with an equally long wand on both tacks. Sure enough, this reduced the difference in average speeds from tack to tack.

Another interesting analytic is comparing speed to boat trim, i.e. the angle of the hull and foils relative to the water. In a Moth this can be adjusted by moving your body weight back and forward or by twisting the tiller to adjust the rudder angle. There’s a lot of debate in the Moth fleet about the optimum setting for downwind sailing. Some suggest keeping the bow down so that the boat can be driven hard in the waves; others suggest keeping the bow up and reasonably level so that foil lift is generated more from the cross section of the mainfoil than through more flap down, which adds to drag and reduces speed. The scattergram shows that there is a cluster of faster speeds when the bow is slightly dropped, with faster speeds particularly on port tack. There is the odd high speed where the bow is slightly raised; however, given this was during racing, I wasn’t deliberately experimenting with different trim settings. I think this analytic will be useful in the future to try and shed some more light on this area and work out what the optimum level of trim is. I suspect the way forward will be to start doing tuning runs once I have accurate, real-time wind direction and strength figures and can produce similar analysis for a constant VMG.

Tacking and gybing tend to be areas where significant gains can be made by helms who are able to keep speeds up and stay on the foils whilst changing direction. Techniques to do this vary slightly, particularly across Moth hull designs, and what works for one person might not work for another. As expert Nathan Outteridge shows below, a lot of things happen during a tack. The average and actual 10-second gybe and tack breakdowns were useful for working out which were my best tacks and gybes and then looking into why. My worst tacks generally seem to be where I don’t have good speed going into the tack and don’t have a slight amount of windward heel: I’m either flat or heeled to leeward. For gybes, one thing I’m looking forward to analysing is whether slow, arched gybes or fast gybes are better for average speed and VMG. Either way, looking at these analytics first makes measuring improvement and finding the relevant video footage to troubleshoot a lot easier. Going forward I suspect I’ll try and automate some of this analysis by calculating how flat the curves are.
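One possible way to put a number on how flat a speed curve is through a manoeuvre is sketched below. This is an assumption about how the automation might look, not something built for the project: the function name and the chosen metrics (mean speed, minimum speed, speed retained relative to entry speed, and relative spread) are hypothetical.

```python
from statistics import mean, pstdev


def manoeuvre_flatness(speeds):
    """Summarise a window of speed samples (knots) spanning one tack or gybe.

    The first sample is treated as the entry speed. A perfectly flat curve
    gives retention 1.0 and variation 0.0; a deep dip lowers retention and
    raises variation.
    """
    if len(speeds) < 2:
        raise ValueError("need at least two speed samples")
    avg = mean(speeds)
    slowest = min(speeds)
    entry = speeds[0]
    return {
        "mean_knots": avg,
        "min_knots": slowest,
        # fraction of entry speed kept at the slowest point of the manoeuvre
        "retention": slowest / entry if entry > 0 else 0.0,
        # standard deviation relative to the mean (coefficient of variation)
        "variation": pstdev(speeds) / avg if avg > 0 else float("inf"),
    }
```

At two samples per second, a 10-second window gives around 20 speed values per manoeuvre. Ranking tacks and gybes by `retention` or `variation` would be one way to pick out the best and worst ones before going to the video footage.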

Scoping and next steps

The use of a smartphone was great for helping to form ideas around next steps and helped to find a number of insightful things buried in the data. To take the project forward, however, we really want to use a more advanced suite of sensors that can be fitted to multiple boats. To do this I’ve been brushing up on my electronics knowledge by learning about the Arduino platform and prototyping different sensor configurations. I’ve also got a wind speed and direction finder on order to get more accurate wind data. The eventual aim is to also incorporate other real-time sensor data such as wand movements and angle, mainsheet pulls, sail shape and angle, height above the water, foil angles, steering movement and other measurements. This will take time, however, given I don’t have a great deal of electronics expertise.

In the next blog I’ll spend some time going through the Data Architecture and IT components we are using, as well as some more detail on progress with the sensors. Stay tuned.


New company website launch


We are very pleased to announce the launch of our new company website.  In keeping with our business philosophy we have tried to keep the site as Lean as possible whilst maximising the use of interactive and educational features. We really do believe there has never been a more exciting time to be working with Data and Information and we hope this is apparent in the site.

We have always been Information Management practitioners and have tried to illustrate this on the site. One of the features we created is a video on the homepage which showcases some of the tools we use for Data Modelling, Data Architecture and Data Profiling projects with clients. The tools featured are X88 Pandora, Gephi and Sparx Enterprise Architect. Pandora is a leading Data Quality Management and Profiling tool, whilst we often use Gephi and Sparx for traditional Data Modelling and for graph-based analyses of Data Modelling and Data Architecture requirements.

2015 looks set to be another very exciting year for our profession and we will be hosting a number of online webinars and events in central London as the year progresses. To stay informed of these and other news please do subscribe to our Newsletter.

 


Trade Data Profiling – Essential Tips


Trade Data has never been more important

Financial Trade Reporting and analysis requirements, particularly for Investment Banks, continue to grow in terms of volumes, depth and complexity. Whilst many Financial institutions retain their trading data in various formats, many have struggled to provision this data to third parties in well understood, cleansed and standardised formats. Often trade repositories developed to become a single version of the truth for trade data have simply moved upstream data issues into a single location downstream.

Many non-investment banking firms such as Insurers and Asset Managers are now recognising the value of implementing trade repositories and Regulators have similarly suggested that existing regulations such as EMIR and Dodd-Frank will be further extended.

 

How do I take control of my Trade Data?

Prior to building IT components or purchasing tools and services, we always recommend understanding the structure, quality and content of your Data. Given large daily volumes and often complex hierarchical data structures (such as for exotic derivatives), however, this is often easier said than done.

Data Profiling Trade Data including Security, Book, Counterparty and Transaction data is something we specialise in. This involves using the latest Data Science, Data Mining and automated discovery techniques to acquire a deep understanding of your data.   This enables you to discover issues such as Data Quality problems that may need to be resolved in order to ensure regulatory compliance and gain further insights into trading patterns and behaviour. To help we have put together a handy white paper sharing some of our best practice tips for getting the most from your Data Profiling of Financial transactions.

 

Download our free White Paper on Trade Data Profiling here.


 
