Machine Learning – An automotive analogy



Progress in emerging technologies, such as machine learning, is creating alternatives to labour-intensive risk modelling activities. Banks will require vision, investment and enduring strategic actions to leverage the full range of potential benefits.
The roadmap defined for autonomous electric cars by tech giants and car manufacturers includes: changes to the usage and storage of fuel; investment in talent, tools and infrastructure; the evolution of next-generation maps and levels of automation; and the overcoming of regulatory challenges. Recent developments have sparked debates on the impact on the economy, infrastructure and regulations.
Similar roadmaps should be defined, and dialogues pursued, on the increasing use of machine learning within financial institutions.
A significant use case is risk modelling, where benefits could include:
- Automation of labour-intensive and error-prone processes such as data cleansing
- Development of models capable of generating greater insights, accuracy and pattern identification using vast amounts of data
- Reduced timelines required for model development, validations and re-calibrations
- Competitive advantage for early adopters
Prepare for the road-trip:
Fuel:
Evolution from oil to electricity in the automotive industry required technological progress in both batteries and electrical engines. Similarly, machine learning ‘fuel’ is data captured on ‘batteries’ powered by progress in data storage and cloud computing.
Banks have a tremendous opportunity to dramatically improve risk modelling by using machine learning to make sense of large, unstructured and semi-structured datasets, and to monitor the outputs of primary models to evaluate how well they are performing.
To take advantage of this, firms should determine the datasets required for their specific needs (model development, machine learning training, validation). They should leverage the increasing availability of data from internal and external sources, and define a roadmap that improves data quality whilst minimising dependency on third-party data where possible. The data requirements should be considered across several dimensions, such as volume, variety, velocity and veracity. Remember: the world’s most valuable resource is no longer oil, but data.
Talent, tools and infrastructure:
Highly skilled resources in this area are scarce and in demand. Banks, fin-techs and non-financial institutions are increasingly searching and competing for data scientists and machine learning professionals. I believe that banks, and risk departments in general, need to recruit the right mix of individuals with banking and academic backgrounds and relevant experience with emerging technologies and modelling tools. They can partner with leading universities, tech companies and consultancies to reap the benefits of the latest machine learning research and development, techniques and training.
Where the automotive industry has been able to merge antiquated technologies with innovations (e.g., the hybrid engine), so too must banking. Machine learning must co-exist and integrate with legacy processes and systems. Risk management teams should combine well-established technologies (e.g. scorecards) with emerging technologies (e.g. machine learning) to build better predictive risk models.
Maps:
Drivers’ experiences have been enhanced by the move from limited paper maps to interactive, connected, GPS-enabled maps.
However, in banking, the use of machine learning and complex algorithms could result in a lack of transparency due to the ‘black box’ characteristic, leaving the ‘machine operators’ (bank employees), consumers and regulators in the dark. For example, if a bank is challenged about the outcome of the use of machine learning to assign credit scores and make credit decisions, it may find it more difficult to provide consumers, auditors, and supervisors with an explanation of a credit score and resulting credit decision.
Dedicated analysis should be used to understand and document the risk model’s explainability and interpretability. A wide variety of frameworks and techniques should be experimented with, such as Prediction Decomposition, LIME (Local Interpretable Model-agnostic Explanations) and BETA (Black Box Explanations through Transparent Approximations), to help bank employees interpret and defend the results and to minimise consumers’ and regulators’ concerns.
Tools should be tested and trained with unbiased data and feedback mechanisms to ensure applications do what they are intended to do and explanations should be examined to determine whether the model is trustworthy.
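As an illustration of the simplest form of prediction decomposition, the sketch below splits a linear credit score into per-feature contributions relative to a reference applicant. The weights, feature names and baseline are invented for the example and are assumptions, not values from any real scorecard. For a linear score the contributions add up exactly to the gap between the applicant's score and the baseline's; for non-linear models this exact additivity does not hold, which is why techniques such as LIME approximate the model locally with a simpler, interpretable one.

```python
# Hypothetical weights of a linear credit score (log-odds of default).
WEIGHTS = {"debt_to_income": 3.0, "missed_payments": 0.5, "years_at_address": -0.1}
INTERCEPT = -2.0

# Assumed portfolio-average applicant used as the reference point.
BASELINE = {"debt_to_income": 0.30, "missed_payments": 1.0, "years_at_address": 5.0}


def score(applicant):
    """Linear score: intercept plus weighted sum of the features."""
    return INTERCEPT + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def decompose(applicant):
    """Contribution of each feature relative to the baseline applicant."""
    return {name: WEIGHTS[name] * (applicant[name] - BASELINE[name]) for name in WEIGHTS}


applicant = {"debt_to_income": 0.50, "missed_payments": 3.0, "years_at_address": 2.0}
contributions = decompose(applicant)

# Largest drivers first, so a reviewer sees the main reasons for the score.
ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)

for name, value in ranked:
    print(f"{name}: {value:+.2f}")
```

A ranked list of this kind is one way a bank employee could explain to a consumer or supervisor which inputs pushed a score up or down.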
Automation:
Tesla, Google, Uber and Ford are just a handful of the firms developing technology that pushes towards increasing levels of vehicle autonomy (from level 0, no automation, to level 5, full automation).
Likewise, there are various categories of machine learning according to the level of human intervention required in labelling the data to train the algorithm to derive decisions, such as:
- Supervised learning: algorithm is fed a set of ‘training’ data that contains labels on some portion of the observations. The algorithm will ‘learn’ a general rule of classification that it will use to predict the labels for the remaining observations in the data set
- Reinforcement learning: algorithm is fed an unlabelled set of data, chooses an action for each data point, and receives feedback (perhaps from a human) that helps the algorithm learn
- Unsupervised learning: algorithm is fed an unlabelled set of data and is asked to detect patterns in the data by identifying clusters of observations that depend on similar underlying characteristics
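To make the difference in human input concrete, here is a minimal sketch in plain Python using made-up borrower data (debt-to-income ratio, missed payments); all names and numbers are hypothetical. The supervised part learns one centroid per human-supplied label and uses it to label the remaining observations. The unsupervised part runs a basic two-cluster k-means on the same points without any labels. Reinforcement learning is left out for brevity, and the features are deliberately left unscaled to keep the example short.

```python
from math import dist

# Hypothetical toy data: (debt-to-income ratio, missed payments in the last year).
labelled = [
    ((0.10, 0), "good"), ((0.15, 1), "good"), ((0.20, 0), "good"),
    ((0.60, 4), "bad"), ((0.70, 5), "bad"), ((0.65, 3), "bad"),
]
unlabelled = [(0.12, 0), (0.68, 4)]


def mean_point(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def fit_centroids(samples):
    """Supervised step: learn one centroid per label from labelled data."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: mean_point(pts) for label, pts in by_label.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest to the observation."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


def two_means(points, iterations=10):
    """Unsupervised step: basic k-means with k=2, seeded with the first and last point."""
    centres = [points[0], points[-1]]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            nearest = 0 if dist(p, centres[0]) <= dist(p, centres[1]) else 1
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty.
        centres = [mean_point(c) if c else centres[i] for i, c in enumerate(clusters)]
    return clusters


centroids = fit_centroids(labelled)
predictions = [predict(centroids, x) for x in unlabelled]

all_points = [features for features, _ in labelled] + unlabelled
clusters = two_means(all_points)
```

Note that the clustering step only groups similar observations; deciding what each cluster means (for example, higher or lower risk) still falls to a person.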
Machine learning will augment your team’s capabilities rather than replace them: humans must be kept in the loop, as we can consider context and use general knowledge to put machine-learning-driven outputs into perspective. Define the appropriate level of human intervention accepted within your various use cases and implement ‘request to intervene’ controls that notify the machine learning operators that they should promptly assess the outcomes and take corrective actions.
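One possible shape for a ‘request to intervene’ control is sketched below: clear-cut cases are decided automatically, while cases in an uncertain band are routed to a human review queue. The thresholds, the `Decision` class and the `decide` function are illustrative assumptions, not recommended values or an established design.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    applicant_id: str
    action: str  # "approve", "decline" or "refer_to_human"
    probability_of_default: float


def decide(applicant_id, probability_of_default, approve_below=0.05, decline_above=0.30):
    """Automate only the clear-cut cases; refer the uncertain band to a person.

    The two thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= probability_of_default <= 1.0:
        raise ValueError("probability_of_default must be between 0 and 1")
    if probability_of_default < approve_below:
        action = "approve"
    elif probability_of_default > decline_above:
        action = "decline"
    else:
        action = "refer_to_human"
    return Decision(applicant_id, action, probability_of_default)


# Hypothetical model outputs for three applicants.
scores = {"A-001": 0.02, "A-002": 0.12, "A-003": 0.45}
decisions = [decide(applicant, pd_score) for applicant, pd_score in scores.items()]

# Cases the operators are asked to assess.
review_queue = [d.applicant_id for d in decisions if d.action == "refer_to_human"]
```

Widening or narrowing the band between the two thresholds is how the accepted level of human intervention would be tuned per use case.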
Oversight:
Governments and the population will not feel safe using fully autonomous cars without assurances in place (e.g. validated testing results, regulations and laws). Equally, widespread use of machine learning within financial institutions will require banks to demonstrate that the right governance and validations are taking place.
The Basel Committee on Banking Supervision notes that a sound development process should be consistent with the firm’s internal policies, procedures and risk appetite. To support new model choices (including the use of machine learning), firms should be able to demonstrate developmental evidence of theoretical construction; behavioural characteristics and key assumptions; types and use of input data; numerical analysis routines and specified mathematical calculations; and code writing language and protocols (to replicate the model).
With machine learning used increasingly in risk model development, firms must assess how they manage and implement policies and processes to evaluate their exposure to model risk (the risk of loss resulting from using insufficiently accurate models to make decisions).
Governance is, therefore, key. Understand how your team develops, documents, uses and monitors models, how it sets up and maintains model inventories, and how it validates and controls models. Examine the use of emerging technologies, such as network studies, that can optimise the analysis of model inventories to assess whether increased interconnectivity between models also leads to increased model risk.
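A network view of a model inventory can start very simply. The sketch below, built on an invented inventory with hypothetical model names, treats each model as a node and each ‘feeds into’ relationship as an edge, then counts how many downstream models would be affected by an error in each one. It is one rough proxy for interconnectivity, not an established measure of model risk.

```python
from collections import deque

# Hypothetical inventory: each model maps to the models that consume its output.
FEEDS = {
    "macro_forecast": ["pd_model", "lgd_model"],
    "pd_model": ["expected_loss", "pricing"],
    "lgd_model": ["expected_loss"],
    "expected_loss": ["capital_planning"],
    "pricing": [],
    "capital_planning": [],
}


def downstream(model):
    """All models directly or indirectly affected by an error in `model`."""
    seen, queue = set(), deque(FEEDS.get(model, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(FEEDS.get(current, []))
    return seen


# Number of downstream models per model, and the one with the widest reach.
reach = {model: len(downstream(model)) for model in FEEDS}
most_connected = max(reach, key=reach.get)
```

Models with the widest reach are natural candidates for more frequent validation and tighter controls.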
Conclusions
The car industry has taken major steps on the journey toward autonomous vehicles, which will provide significant benefits to consumers, manufacturers and retailers. However, the challenges are not limited to understanding and implementing the technology; they also lie in changing people’s mindsets, overcoming the fear of major change and demonstrating safety and efficacy.
Banks are going to need to tackle similar challenges, albeit somewhat more internal versions of them, in order to reap the benefits of further incorporating machine learning into their risk management approach.