I recently completed an MSc in Statistics with Data Science at the School of Mathematics, University of Edinburgh, United Kingdom. Before that, I completed a BSc in Computer Science at the same university's School of Informatics, focusing mainly on machine learning and its applications.
I am now working at Nibble as a data scientist, mainly looking at ways of using machine learning and statistics to improve a fun negotiation chatbot that helps customers to achieve discounts on various e-commerce websites with products ranging from karaoke machines and coffee capsules to high-end luxury watches.
Data Scientist • September 2021 - Ongoing
Core data scientist of a conversational AI product enabling e-commerce retailers to deliver personalized discounts to customers via an engaging negotiation agent.
Junior Developer (Intern) • May 2021 - September 2021
Software engineer focusing on adding new features to support the growth of the start-up, and providing foundations for advancing the use of machine learning within the company.
MSc Statistics with Data Science - Distinction • 2020 - 2021
Despite the merits of renewable energy, the development of offshore wind farms directly poses the risk of disrupting natural habitats and foraging areas of seabirds. In order to mitigate these impacts, known foraging areas can be designated as marine special protection areas.
To identify more foraging areas, the Joint Nature Conservation Committee has provided data consisting of the GPS location of an observation boat and the instantaneous and continuous behaviour of common (Sterna hirundo), Arctic (Sterna paradisaea), roseate (Sterna dougallii) and Sandwich (Sterna sandvicensis) terns on foraging trips from Coquet Island near Amble, Northumberland, United Kingdom. The data were collected in 2010 and 2011 during the incubation and chick-rearing breeding stages.
We investigate whether hidden Markov models (HMMs) can be effectively used in an unsupervised manner to identify foraging locations based on a sequence of step lengths and angles derived from GPS data. Using the recorded behavioural data, we validate our models and evaluate their effectiveness with sensitivity and specificity metrics. Our focus is solely on the chick-rearing stage, performing year-by-year analyses across all species.
Our results show that overall HMMs are an effective approach, although sensitive to initial parameters. Despite fitting a mixture model to observations to obtain initial parameters, we were unable to fit a suitable model for 2010 Sandwich tern data. Visual inspection of observation histograms suggests that there are only two clear behavioural states in terns, presumably foraging and non-foraging.
Small sample sizes in 2010 render year-by-year comparison challenging, but based on the more reliable 2011 analysis, HMMs tended to predict the non-foraging class more accurately, with high specificity (≥ 0.84 for all species), but were less effective at identifying foraging locations correctly, i.e. lower sensitivity (between 0.56 and 0.74 for all species). However, considering the model was unsupervised, these are still impressive results and it is worth investigating how to improve the models.
Our year-by-year findings show that 2011 predictions were generally more accurate, although this is likely due to the larger sample size. There were few behavioural differences between the years, except that the roseate terns all followed the same route in 2011 and had much more predictable foraging behaviour than all terns in 2010 and 2011 (foraging almost exclusively in the shallower water near the coast of Amble, perhaps due to changes in weather conditions or prey availability). As there is a general shortage of year-to-year analyses in the literature, we are unable to compare this result with others.
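As a rough illustration of how step lengths and turning angles can be derived from a sequence of GPS fixes, here is a minimal Python sketch. It assumes a spherical Earth (haversine distance) and fixes given as (latitude, longitude) pairs in degrees; the function names and the example coordinates are hypothetical and not taken from the project.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical-Earth assumption


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_rad(p, q):
    """Initial bearing from p to q, in radians clockwise from north."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.atan2(y, x)


def steps_and_turns(track):
    """Return (step_lengths, turning_angles) for a list of (lat, lon) fixes."""
    pairs = list(zip(track, track[1:]))
    steps = [haversine_m(p, q) for p, q in pairs]
    bearings = [bearing_rad(p, q) for p, q in pairs]
    # Wrap each change in bearing into the interval [-pi, pi).
    turns = [(b2 - b1 + math.pi) % (2 * math.pi) - math.pi
             for b1, b2 in zip(bearings, bearings[1:])]
    return steps, turns


# Made-up track: one step due north, then one step due east.
track = [(55.0, -1.5), (55.001, -1.5), (55.001, -1.499)]
steps, turns = steps_and_turns(track)
```

A track of n fixes yields n - 1 step lengths and n - 2 turning angles, and these two sequences are the kind of observations an HMM would be fitted to.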
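For reference, sensitivity and specificity can be computed from paired true and predicted states as in the sketch below. The state labels and the example sequences are made up for illustration and are not results from the project.

```python
def sensitivity_specificity(true_states, predicted_states, positive="foraging"):
    """Return (sensitivity, specificity), treating `positive` as the positive class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_states, predicted_states):
        if truth == positive:
            if pred == positive:
                tp += 1  # foraging correctly identified
            else:
                fn += 1  # foraging missed
        else:
            if pred == positive:
                fp += 1  # non-foraging wrongly flagged as foraging
            else:
                tn += 1  # non-foraging correctly identified
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Hypothetical labels, for illustration only.
truth = ["foraging", "foraging", "other", "other", "other"]
pred = ["foraging", "other", "other", "other", "foraging"]
sens, spec = sensitivity_specificity(truth, pred)
```

Here sensitivity is the fraction of true foraging observations that the model labels as foraging, and specificity is the fraction of true non-foraging observations that it labels as non-foraging.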
Large fluctuations in taxi usage are often influenced by real-world events and phenomena. We can apply anomaly detection to taxi passenger counts in an attempt to uncover causes such as holidays and sporting events, and even more extreme causes such as storms or terrorist attacks. As an example use-case, taxi and ride-hailing services such as Uber regularly use anomaly detection to increase pricing to match demand during periods of unusually high passenger counts, also known as surge pricing.
In this paper, we aim to investigate the effectiveness of modern machine learning algorithms such as recurrent neural networks on the task of anomaly detection in taxi passenger time series data.
Along with neural networks, Bayesian models have also been of particular interest for anomaly detection as they attempt to directly model uncertainty. Uncertainty modelling is a core aspect of anomaly detection as it allows us to quantify how strongly we believe an observation to be anomalous.
We explore a neural network architecture proposed by Zhu & Laptev at Uber, which aims to model prediction uncertainty through the use of approximate Bayesian inference in recurrent neural networks. We implement and compare a non-Bayesian version of the network which identifies anomalies using a threshold, and a Bayesian version of the network which uses uncertainty modelling to construct a prediction interval for detecting anomalies.
In particular, we apply these models to the TLC Trip Record Dataset, consisting of passenger counts of taxi trips in New York City from July 1st 2014 to January 31st 2015.
Our results show that the proposed neural network architecture is effective at forecasting and anomaly detection when implemented in both a non-Bayesian and Bayesian manner. Within the dataset, both models were able to identify some anomalies that had known causes such as New Year’s Eve and the New York Marathon.
The Bayesian approach yielded an RMSE of 0.081 on held-out test data, compared to 0.121 from the non-Bayesian approach. This suggests that the Bayesian approach generalizes better in forecasting and detecting anomalous counts in future unseen data.
Both methods required a very large threshold/interval in order to correctly identify anomalies. The non-Bayesian approach was based on the 99.9th percentile of the errors between the forecasts generated by the model and the true values, and the Bayesian approach required a prediction interval of 99.95% in order to yield acceptable results.
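The thresholding idea behind the non-Bayesian approach can be sketched as follows. This is a minimal illustration only: it assumes absolute forecast errors, a linearly interpolated percentile, and a separate set of reference errors from which the threshold is taken. The function names and these choices are assumptions, not details confirmed by the paper.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def flag_anomalies(actual, forecast, reference_errors, q=99.9):
    """Flag points whose absolute forecast error exceeds the q-th percentile
    of a set of reference errors (which errors to use is an assumption here)."""
    threshold = percentile(reference_errors, q)
    return [abs(a - f) > threshold for a, f in zip(actual, forecast)]


# Made-up reference errors 0.00, 0.01, ..., 1.00 and two made-up test points.
reference_errors = [0.01 * i for i in range(101)]
flags = flag_anomalies([1.0, 5.0], [1.1, 1.0], reference_errors)
```

A Bayesian variant would instead flag an observation when it falls outside a prediction interval built from the model's estimated predictive uncertainty, rather than using a fixed error threshold.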
BSc (Hons) Computer Science - 1st class • 2016 - 2020
Head gestures are a simple, yet expressive form of human-to-human interaction, used as a medium for conveying ideas and emotions. Despite their simplicity as a form of communication, the accurate modelling and recognition of human head gestures has posed many challenges and provided many opportunities for machine learning research. The frequent use of motion-tracking devices, video, virtual-reality headsets and motion capture systems in the modern age of technology further motivates the need for effective head gesture recognition systems.
In this dissertation, we focused primarily on the task of isolated head gesture recognition on rotation signals obtained from motion capture data. For this task, we performed in-depth research, application and evaluation of various widely-used sequence classification algorithms, including k-Nearest Neighbors with Dynamic Time Warping, Hidden Markov Models, Feed-Forward Neural Networks, and Recurrent Neural Networks with Long Short-Term Memory (LSTM). Comparisons between classifiers were done on the basis of recognition performance, which was measured with F1 score, and efficiency, which was measured in terms of peak memory consumption and fitting/prediction times.
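To give a flavour of the simplest of these classifiers, here is a minimal sketch of k-Nearest Neighbors with Dynamic Time Warping. It assumes univariate sequences and an absolute-difference local cost, whereas real rotation signals would be multivariate; the function names, gesture labels and toy data are hypothetical.

```python
import math
from collections import Counter


def dtw_distance(a, b):
    """DTW distance between two 1-D sequences, with absolute difference as local cost."""
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def knn_dtw_predict(train, query, k=1):
    """Majority vote among the k training sequences closest to `query` under DTW.

    `train` is a list of (sequence, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dtw_distance(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy, made-up training set: a "nod"-like oscillation and a flat "still" signal.
train = [([0, 1, 0, -1, 0], "nod"), ([0, 0, 0, 0, 0], "still")]
prediction = knn_dtw_predict(train, [0, 1, 1, 0, -1, 0])
```

Because DTW allows one sequence to be locally stretched against the other, the query above matches the "nod" template despite its different length. The quadratic cost of each comparison is one reason prediction time matters when comparing against the other classifiers.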
The most effective method of modelling gestures was a bidirectional multi-layer LSTM, which yielded an accuracy of 53.75±1% and an F1 score of 52.30±1%. This result is a vast improvement of +15.1% F1 score over previous works on the same dataset.
GCE/A-Levels (Mathematics - A*, Physics - A, Computing - B) • 2007 - 2016
In addition to being able to effectively handle, prepare and gather insight from data, I have a solid understanding of many machine learning algorithms, including but not limited to:
I have also had practice using these methods in specific application areas, such as using HMMs to perform part-of-speech tagging or automatic speech recognition, and using a variant of k-Means clustering to identify similarities in customer behaviour.
I am also able to confidently conduct statistical analyses involving hypothesis tests, maximum likelihood estimation, confidence intervals, imputation methods for missing data, and much more – all with the help of the R programming language. I am currently learning more about Bayesian statistics, and in particular, computational methods such as Markov Chain Monte Carlo.
Programming Languages and Libraries:
I mostly use Ruby for my general scripting needs, but also use it to design web-related things such as APIs, frameworks and websites. For tasks involving data science and machine learning, I am very comfortable working with both Python and R.
During my time at university, I developed many coursework-related and personal projects ranging from machine learning packages to websites, web frameworks and APIs. Most of these projects were written in Ruby and Python, but I have also done a few in other languages such as R. You can find all of my projects on my GitHub profile.
Here is a brief list of some of the larger projects that I have worked on, or am currently working on:
A machine learning interface for isolated sequence classification algorithms in Python.
Created as part of my Honours project research involving the use of sequential classifiers for automatic detection and recognition of head gestures in motion capture data.
Micro-framework, application generator and CLI wrapped around the Sinatra DSL.
Designed to simplify the process of getting new Sinatra applications up and running by providing commands for quick scaffolding and MVC file generation.