About Me

I have a strong background in software engineering, developed both through personal projects, such as creating software packages, and through studying computer science at the School of Informatics, University of Edinburgh.

At the same time, I have been building a solid background in statistics and machine learning. This was initially motivated by a number of machine learning courses I took, which led me to enrol on a statistics-focused graduate programme at the School of Mathematics, University of Edinburgh.

Following on from this, I now work at Nibble as a data scientist, looking at ways of using machine learning and statistics to improve a conversational negotiation agent that helps customers obtain personalized discounts on various e-commerce websites.

Contact Details

Edwin Onuonga
United Kingdom
[email protected]


Nibble Technology

Data Scientist September 2021 - Ongoing

Core data scientist focusing on understanding user negotiation styles and strengthening the use of data to drive company-wide decisions. The product is a conversational AI that enables e-commerce retailers to deliver personalized discounts to customers via an engaging negotiation agent.

Nibble Technology

Junior Developer (Intern) May 2021 - September 2021

Software engineer focusing on adding new features to support the growth of the start-up, as well as improving the core negotiation algorithm and providing foundations for advancing the use of machine learning within the company.


School of Mathematics, University of Edinburgh

MSc Statistics with Data Science - Distinction 2020 - 2021

Research projects:

  • Hidden Markov Models as Tools to Identify Seabird Foraging Areas
    Supervised by: Dr. Gail Robertson and Prof. Finn Lindgren
    Industry partner: Joint Nature Conservation Committee

    Despite the merits of renewable energy, the development of offshore wind farms directly poses the risk of disrupting natural habitats and foraging areas of seabirds. In order to mitigate these impacts, known foraging areas can be designated as marine special protection areas.

    To identify more foraging areas, the Joint Nature Conservation Committee has provided data consisting of the GPS location of an observation boat and the instantaneous & continuous behaviour of common (Sterna hirundo), Arctic (Sterna paradisaea), roseate (Sterna dougallii) and sandwich (Sterna sandvicensis) terns on foraging trips from Coquet Island near Amble, Northumberland, United Kingdom. This was collected in 2010/2011 during incubation and chick-rearing breeding stages.

    We investigate whether hidden Markov models (HMMs) can be effectively used in an unsupervised manner to identify foraging locations based on a sequence of step lengths and angles derived from GPS data. With the use of the recorded behavioural data, we validate our models and evaluate their effectiveness through the use of sensitivity and specificity metrics. Our focus is solely on the rearing stage, performing year-by-year analyses between all species.

    Our results show that, overall, HMMs are an effective approach, although they are sensitive to initial parameters. Despite fitting a mixture model to observations to obtain initial parameters, we were unable to fit a suitable model for 2010 sandwich tern data. Visual inspection of observation histograms suggests that there are only two clear behavioural states in terns: presumably foraging and non-foraging.

    Small sample sizes in 2010 render year-by-year comparison challenging, but based on the more reliable 2011 analysis, HMMs tended to predict the non-foraging class more accurately, with high specificity (≥ 0.84 for all species), but were less effective at identifying foraging locations correctly, i.e. lower sensitivity (between 0.56 and 0.74 for all species). However, considering the model was unsupervised, these are still impressive results and it is worth investigating how to improve the models.

    Our year-by-year findings show that 2011 predictions were generally more accurate, although this is likely due to the larger sample size. There were few behavioural differences between the years, except that the roseate terns all followed the same route in 2011 and had much more predictable foraging behaviour than all terns in 2010 and 2011 (foraging almost exclusively in the shallower water near the coast of Amble, perhaps due to changes in weather conditions or prey availability). As there is a general shortage of year-to-year analyses in the literature, we are unable to compare this result with others.
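
As a rough illustration of the step lengths and angles mentioned in this abstract, the sketch below derives both from a sequence of positions. It assumes the GPS fixes have already been projected to planar coordinates in metres, and the function name is hypothetical; it is not the code used in the project.

```python
import math


def step_lengths_and_turning_angles(points):
    """Derive movement features from a track of (x, y) positions.

    Assumes planar coordinates (e.g. metres after projection), not raw
    latitude/longitude. Returns (steps, angles), where angles[i] is the
    change in heading between step i and step i + 1, in radians.
    """
    steps = []
    headings = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        steps.append(math.hypot(dx, dy))     # distance travelled in this step
        headings.append(math.atan2(dy, dx))  # direction of travel

    angles = []
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the heading difference into the range [-pi, pi).
        diff = (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
        angles.append(diff)
    return steps, angles


# A track that moves one unit east and then one unit north.
steps, angles = step_lengths_and_turning_angles([(0, 0), (1, 0), (1, 1)])
```

A track of n positions yields n - 1 step lengths and n - 2 turning angles, which would then form the observation sequence given to the HMM.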

  • Anomaly Detection with Bayesian Neural Networks for Time Series Data
    Supervised by: Dr. Michael Allerhand and Alastair Hamilton
    Industry partner: Lloyds Banking Group

    Large fluctuations in taxi usage are often influenced by real-world events and phenomena. We can apply anomaly detection to taxi passenger counts in an attempt to uncover causes such as holidays and sporting events, and even more extreme causes such as storms or terrorist attacks. As an example use-case, taxi and ride-hailing services such as Uber regularly use anomaly detection to increase pricing to match demand during periods of unusually high passenger counts, also known as surge pricing.

    In this paper, we aim to investigate the effectiveness of modern machine learning algorithms such as recurrent neural networks on the task of anomaly detection in taxi passenger time series data.

    Along with neural networks, Bayesian models have also been of particular interest for anomaly detection as they attempt to directly model uncertainty. Uncertainty modelling is a core aspect of anomaly detection as it allows us to quantify how strongly we believe an observation to be anomalous.

    We explore a neural network architecture proposed by Zhu & Laptev at Uber, which aims to model prediction uncertainty through the use of approximate Bayesian inference in recurrent neural networks. We implement and compare a non-Bayesian version of the network which identifies anomalies using a threshold, and a Bayesian version of the network which uses uncertainty modelling to construct a prediction interval for detecting anomalies.

    In particular, we apply these models to the TLC Trip Record Dataset, consisting of passenger counts of taxi trips in New York City from July 1st 2014 to January 31st 2015.

    Our results show that the proposed neural network architecture is effective at forecasting and anomaly detection when implemented in both a non-Bayesian and Bayesian manner. Within the dataset, both models were able to identify some anomalies that had known causes such as New Year’s Eve and the New York Marathon.

    The Bayesian approach yielded an RMSE of 0.081 on held-out test data, compared to 0.121 from the non-Bayesian approach. This suggests that the Bayesian approach generalizes better in forecasting and detecting anomalous counts in future unseen data.

    Both methods required a very large threshold/interval in order to correctly identify anomalies. The non-Bayesian approach was based on the 99.9th percentile of the errors between the forecasts generated by the model and the true values, and the Bayesian approach required a prediction interval of 99.95% in order to yield acceptable results.
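
The non-Bayesian thresholding idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the percentile uses a simple nearest-rank rule, and the threshold is estimated from a separate set of reference forecasts; the project itself may have differed on any of these points.

```python
import math


def nearest_rank_percentile(values, q):
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def fit_error_threshold(actual, forecast, q=99.9):
    """Estimate an anomaly threshold from absolute forecast errors."""
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return nearest_rank_percentile(errors, q)


def flag_anomalies(actual, forecast, threshold):
    """Return the indices whose absolute forecast error exceeds the threshold."""
    return [
        i
        for i, (a, f) in enumerate(zip(actual, forecast))
        if abs(a - f) > threshold
    ]


# Toy reference data with absolute errors 1, 2, ..., 10 (illustrative only).
reference_actual = [10] * 10
reference_forecast = [10 - k for k in range(1, 11)]
threshold = fit_error_threshold(reference_actual, reference_forecast, q=90)

# New observations: only the second one has an error above the threshold.
anomalies = flag_anomalies([10, 10, 10], [9, 0, 10.5], threshold)
```

In the Bayesian variant described above, the fixed threshold would be replaced by a prediction interval around each forecast, with observations outside the interval flagged instead.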


  • Stochastic Modelling
  • Fundamentals of Operational Research
  • Credit Scoring
  • Biomedical Data Science
  • Generalized Regression Models
  • Bayesian Theory
  • Bayesian Data Analysis
  • Statistical Methodology
  • Applied Statistics
  • Incomplete Data Analysis

School of Informatics, University of Edinburgh

BSc (Hons) Computer Science - 1st class 2016 - 2020

Research project:

  • Automatic detection and classification of human head gestures
    Supervised by: Dr. Hiroshi Shimodaira

    Head gestures are a simple, yet expressive form of human-to-human interaction, used as a medium for conveying ideas and emotions. Despite their simplicity as a form of communication, the accurate modelling and recognition of human head gestures has posed many challenges and provided many opportunities for machine learning research. The frequent use of motion-tracking devices, video, virtual-reality headsets and motion capture systems in the modern age of technology further motivates the need for effective head gesture recognition systems.

    In this dissertation, we focused primarily on the task of isolated head gesture recognition on rotation signals obtained from motion capture data. For this task, we performed in-depth research, application and evaluation of various widely used sequence classification algorithms, including k-Nearest Neighbors with Dynamic Time Warping, Hidden Markov Models, Feed-Forward Neural Networks, and Recurrent Neural Networks with Long Short-Term Memory (LSTM). Classifiers were compared on the basis of recognition performance, measured with F1 score, and efficiency, measured in terms of peak memory consumption and fitting/prediction times.

    The most effective method of modelling gestures was a bidirectional multi-layer LSTM, which yielded an accuracy of 53.75±1% and an F1 score of 52.30±1%. This result is a vast improvement of +15.1% F1 score over previous works on the same dataset.
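
To give a flavour of one of the baseline classifiers named in this abstract, the sketch below shows a nearest-neighbour classifier using Dynamic Time Warping. It is deliberately simplified: it handles univariate sequences and a single neighbour (k = 1), whereas the dissertation worked with rotation signals and k-Nearest Neighbors. The function names and gesture labels are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances while b repeats an element
                cost[i][j - 1],      # b advances while a repeats an element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def nearest_neighbour_label(query, examples):
    """Return the label of the training example closest to query under DTW.

    examples is a list of (sequence, label) pairs.
    """
    best_sequence, best_label = min(
        examples, key=lambda example: dtw_distance(query, example[0])
    )
    return best_label


# Toy training set with made-up labels.
training = [([0, 1, 0], "nod"), ([0, -1, 0], "shake")]
predicted = nearest_neighbour_label([0, 1, 1, 0], training)
```

Because DTW allows elements to be repeated during alignment, the query above matches the first training sequence exactly despite being one sample longer.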


  • Algorithms, Data Structures and Learning
  • Introductory Applied Machine Learning
  • Machine Learning Practical
  • Machine Learning and Pattern Recognition
  • Extreme Computing
  • Database Systems
  • Processing Formal and Natural Languages
  • Automatic Speech Recognition
  • Natural Language Understanding, Generation and Machine Translation
  • Foundations of Natural Language Processing

The English College, Dubai

GCE/A-Levels (Mathematics - A*, Physics - A, Computing - B) 2007 - 2016


  • Heriot-Watt University Programming Challenge
  • IT Department Assistance


In addition to being able to effectively handle, prepare and gather insight from data, I have a solid understanding of many machine learning algorithms, including but not limited to:

  • Generalized Linear Models
  • Logistic Regression
  • k-Nearest Neighbor Classifiers
  • Neural Networks (incl. RNN & CNN)
  • Hidden Markov Models
  • Mixture Models
  • Decision Trees & Random Forests
  • k-Means Clustering

I have also had practice using these methods in specific application areas, such as using HMMs to perform part-of-speech tagging or automatic speech recognition, and using a variant of k-Means clustering to identify similarities in customer behaviour.

Programming Languages and Libraries:

Python, Ruby, R, PostgreSQL, HTML, JavaScript, CSS, SASS
VS Code, RStudio, Git, GitHub, macOS, Bash, Conda, LaTeX

I mostly use Ruby for my general scripting needs, but also use it to design web-related things such as APIs, frameworks and websites. For tasks involving data science and machine learning, I am very comfortable working with both Python and R.

I have also written and published a number of libraries in both Python and Ruby on the public repositories PyPI and RubyGems.


eonu's GitHub chart

During my time at university, I developed many coursework-related and personal projects, ranging from machine learning packages to websites, web frameworks and APIs. Most of these projects were written in Ruby and Python, but I have also written a few in other languages such as R. You can find all of my projects on my GitHub profile.

Here is a brief list of some of the larger projects that I have worked on, or am currently working on:



A machine learning interface for isolated sequence classification algorithms in Python.

Created as part of my Honours project research involving the use of sequential classifiers for automatic detection and recognition of head gestures in motion capture data.



Micro-framework, application generator and CLI wrapped around the Sinatra DSL.

Designed to simplify the process of getting new Sinatra applications up and running by providing commands for quick scaffolding and MVC file generation.