Data science (also referred to as data analytics) has revolutionised many areas. The choices we make about the movies we watch, the products we buy and the foods we eat are all increasingly influenced by marketing initiatives informed by predictive analytics. Our favourite sports teams have transformed their decision-making processes (from recruitment to training methods to the timing of substitutions) by incorporating feedback from data into their practices, whilst politics has become obsessed with polling data and predictive analysis.

Data is set to play an increasingly significant part in the workplace, with drastic changes likely as these methods are more widely deployed. This article by Laurence Mills of Lewis Silkin LLP considers the issues.

What is “Big Data” and data science?

New technologies have improved our ability to track and collate vast amounts of information. We now have masses of data on all sorts of behaviours (from our sleep patterns to running speeds to how long it takes us to write an email) - so much so that we require substantial computing power to process it all. This sheer volume of data is generically referred to as “Big Data”.

Data science is the practice of purposefully collecting and investigating this data to identify patterns and behaviours, rather than relying on non-numerical or subjective impressions of events. Compared with that intuitive mode of investigation, data science is a more scientific and evidence-based form of analysis: it can highlight inaccurate assumptions, uncover previously unnoticed relationships and enable us to predict future behaviours and scenarios more accurately.

Data science in the workplace

Data science in the workplace is being used broadly at three levels:

(1)    Tracking and analysis

To engage in HR analytics, an organisation first needs some data. In a workplace scenario, this is collected by purposefully tracking employee activity, performance and productivity. With these data collated, organisations can more precisely analyse the effectiveness of their practices. Questions such as “Do pay rises increase output?”, “Does the new wellbeing measure improve staff retention?” and “Does productivity actually decrease after a Bank Holiday weekend?” can be answered with reference to data rather than subjective impressions. As an example, a 2015 Ernst & Young study of its own employees’ progress found that university degree classification bore little correlation to future performance in professional qualifications. As a result, EY dropped degree classification as a requirement from its graduate recruitment criteria.
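By way of illustration, a question like “Does productivity actually decrease after a Bank Holiday weekend?” reduces to a simple comparison of averages once the output data have been collected. The sketch below is a minimal, hypothetical example in Python using the pandas library; the records and figures are invented.

```python
import pandas as pd

# Hypothetical daily output records; all figures are invented for illustration.
records = pd.DataFrame({
    "employee": ["A", "A", "B", "B", "C", "C"],
    "day_after_bank_holiday": [True, False, True, False, True, False],
    "units_produced": [38, 45, 41, 44, 35, 47],
})

# Compare average output on the day after a Bank Holiday with ordinary days.
summary = records.groupby("day_after_bank_holiday")["units_produced"].mean()
print(summary)
```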

(2)    Intelligence and predictive analytics

Provided a sufficient body of data has been collated, it may be possible to accurately predict the future behaviours and needs of employees. This enables management to anticipate a number of key points in the employment relationship more accurately - ranging from when employee performance is likely to be optimal to when employees are most likely to be looking to leave the organisation (Google has famously led the way here).
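In practice, such predictions are often produced by fitting a statistical model to historical records. The following is a minimal sketch of an attrition predictor using logistic regression from scikit-learn; the features, training data and figures are all invented for illustration.

```python
from sklearn.linear_model import LogisticRegression

# Invented training data: [tenure_years, recent_pay_rise (0/1), engagement_score]
X = [
    [1, 0, 3], [2, 0, 4], [6, 1, 8], [8, 1, 9],
    [1, 0, 2], [3, 0, 5], [7, 1, 7], [5, 1, 8],
]
# 1 = the employee left within a year, 0 = they stayed
y = [1, 1, 0, 0, 1, 1, 0, 0]

model = LogisticRegression().fit(X, y)

# Estimated probability that an employee with a given profile will leave
print(model.predict_proba([[2, 0, 3]])[0][1])
```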

(3)    Automated decision making

Once an organisation is able to collect enough of the right sort of data, it becomes possible to codify decision making (from whom to recruit, to bonuses, to promotions) into an automated form or algorithm. These algorithms, when fed with the required information, can produce an overall assessment score or outcome (“Person X is most suitable for your organisation”; “Employee Y has satisfied the criteria to be awarded a bonus”; “Employee Z is ready to be promoted to the next level of management”). This is being used with increasing effectiveness and intelligence in recruitment (by services such as untapt), but it has the potential to pervade the whole life cycle of the employment relationship.
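At its simplest, such an algorithm is a weighted scoring rule applied to the collected data. The sketch below shows what a codified bonus decision might look like; the criteria, weights and threshold are entirely hypothetical.

```python
# Hypothetical bonus criteria; the weights and the 0.85 threshold are invented.
def bonus_decision(sales_vs_target: float, attendance_rate: float,
                   peer_review_score: float) -> str:
    score = (0.5 * sales_vs_target
             + 0.3 * attendance_rate
             + 0.2 * peer_review_score)
    if score >= 0.85:
        return "Employee has satisfied the criteria to be awarded a bonus"
    return "Criteria not met"

print(bonus_decision(sales_vs_target=0.95, attendance_rate=0.9,
                     peer_review_score=0.8))
```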

These practices present opportunities for organisations to improve their decision-making processes and collective efficiency, and will be significant tools in the future of organisational management.

Risks and challenges

Notwithstanding this, data science in the workplace is not risk-free. These practices present three main challenges:

(1)    Unlawful bias

The case for deploying data science practices in workplace decision making is often made by reference to eliminating unlawful bias. The human brain, riddled as it is with unconscious (or conscious) prejudices, appears less fair than the strict application of neutral criteria. Correctly implemented, algorithmic decision making should be free from these human biases (both unlawful biases and perhaps more unexpected ones). The reality is, however, slightly more complex. Cathy O’Neil’s “Weapons of Math Destruction” highlights the US case of Kyle Behm, who claimed that the personality tests used for entry-level jobs at several large companies were a breach of the Americans with Disabilities Act 1990. This is an important reminder that outsourcing decision making to an automated process neither renders that decision protected characteristic-neutral, nor inoculates the organisation from possible legal liability.

Importantly, discrimination law pervades all aspects of an employer’s processes, even automated, data-driven ones. Quite clearly, algorithms which screen out individuals because of a protected characteristic (being age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex and sexual orientation) are directly discriminatory. This can be easily avoided by not writing into the algorithm any instruction to take these characteristics into account.

However, there is a greater risk of an organisation’s automated decision making being found to be indirectly discriminatory. An unwitting employer may program a recruitment algorithm to apply a seemingly neutral criterion (for example, requiring the prospective employee to be available to work between 8.00 am and 7.00 pm, Monday to Friday). The algorithm would then filter out those applicants who could not commit to working those hours. However, this would disproportionately filter out women, who are more likely to have childcare responsibilities preventing them from committing to those hours. The criterion therefore puts that group at a particular disadvantage compared with people who do not share that protected characteristic (men, who are less likely to be the primary child carer). If this occurs, it would be unlawful unless the organisation can show that applying the criterion is a proportionate means of achieving a legitimate aim.
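This mechanism is easy to see in code. The sketch below applies the availability criterion to an invented applicant pool and then measures the pass rate by sex; everything in it is hypothetical.

```python
# Invented applicant pool for illustration only.
applicants = [
    {"name": "P1", "sex": "F", "available_8_to_7": False},
    {"name": "P2", "sex": "F", "available_8_to_7": True},
    {"name": "P3", "sex": "M", "available_8_to_7": True},
    {"name": "P4", "sex": "M", "available_8_to_7": True},
]

# The seemingly neutral criterion: full availability 8.00 am to 7.00 pm.
shortlist = [a for a in applicants if a["available_8_to_7"]]
print("shortlisted:", [a["name"] for a in shortlist])

# Measure the pass rate by sex to surface any disparate impact.
for sex in ("F", "M"):
    group = [a for a in applicants if a["sex"] == sex]
    passed = [a for a in group if a["available_8_to_7"]]
    print(sex, f"pass rate: {len(passed) / len(group):.0%}")
```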

It is also possible for an automated recruitment or promotion process to slide inadvertently into an indirectly discriminatory method. An algorithm may develop itself through machine learning techniques and can (if left to develop without proper oversight) move closer to an unlawful decision-making process. As a hypothetical example, data may be automatically fed back to a (badly written) algorithm showing that employees who attended one of three particular universities frequently outperform employees who did not. The algorithm would then refine itself (with a view to producing ever more optimal results for the organisation) by promoting employees who attended one of those institutions, putting everyone else at a disadvantage. However, unbeknownst to the algorithm, those universities were only open to men. Without human intervention, the algorithm has inadvertently rendered itself indirectly discriminatory and exposed the business to liability.
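The feedback loop described above can be reduced to a few lines. In this hypothetical sketch, each update cycle inflates the weight given to the favoured universities whenever high performers happen to cluster there, so the score quietly drifts towards a proxy for sex; all names and numbers are invented.

```python
# Hypothetical self-refining promotion score; all data are invented.
favoured_universities = {"Uni A", "Uni B", "Uni C"}  # historically men-only
weights = {"output": 1.0, "favoured_university": 0.0}

def promotion_score(employee: dict) -> float:
    bonus = (weights["favoured_university"]
             if employee["university"] in favoured_universities else 0.0)
    return weights["output"] * employee["output"] + bonus

def update_weights(high_performers: list) -> None:
    # Feedback step: if high performers cluster in the favoured universities,
    # the unsupervised update quietly inflates that weight each cycle.
    share = (sum(e["university"] in favoured_universities for e in high_performers)
             / len(high_performers))
    if share > 0.5:
        weights["favoured_university"] += 0.1  # drift towards a proxy for sex

high_performers = [{"university": "Uni A", "output": 9},
                   {"university": "Uni B", "output": 8},
                   {"university": "Uni D", "output": 9}]
update_weights(high_performers)
print(weights)
print(promotion_score({"university": "Uni A", "output": 7}))
```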

(2)    Data protection

The tracking and analysis phase of workplace data science must be handled carefully. The capacity of an organisation to monitor and collate intricate and detailed data on its employees’ behaviour is well developed, but the legal grounds on which to do so are limited in the UK by the Data Protection Act 1998. There are, very broadly, two applicable bases on which an organisation can collect data about its employees in this sort of scenario: that the employee has given their consent, or that the processing is necessary for the purposes of the employer’s legitimate business interests. There are also myriad obligations that the employer is under when storing and processing this data.

Further, the General Data Protection Regulation (which will come into effect from May 2018) will provide individuals with some grounds on which to object to automated processing, or to request an explanation of it. The actual legal effectiveness of these provisions has been the subject of much comment, but they are an important consideration when planning workplace data science procedures.

(3)    The evidential requirements for decision making

Organisations which monitor and track their employees’ activities and outputs will be required to build those data into their HR decision making.  This will mean that decisions ranging from whether to promote or give a pay rise to the more serious decisions of dismissals will have to be made with reference to (and supported by) the data that is collected as part of that organisation’s HR analytics practices.  And if challenged, the assessment of the fairness of these decisions will involve significant consideration of these data.  

For example, an organisation which has tracked and monitored its employees’ outputs and efficiency levels will be able to set a near-numerical definition of “good performance”, comprising quantifiable outputs and statistical measures. The effectiveness of a performance improvement plan, or the fairness of an eventual dismissal for capability, will then be assessed by analysing that employee’s output and efficiency data against the quantifiable definition of “good performance”. This contrasts with the largely instinctive and non-numerical methods - the annual performance review and ad hoc subjective feedback - which comprise most assessments of employee performance today.
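Concretely, such a definition might be nothing more than a set of thresholds against which an employee’s recorded figures are checked. The sketch below is hypothetical; the metrics and thresholds are invented for illustration.

```python
# A hypothetical quantifiable definition of "good performance";
# the metrics and thresholds are invented.
GOOD_PERFORMANCE = {
    "units_per_week": 40,   # minimum average output
    "error_rate": 0.02,     # maximum defect rate
    "response_hours": 24,   # maximum average response time
}

def meets_definition(stats: dict) -> bool:
    return (stats["units_per_week"] >= GOOD_PERFORMANCE["units_per_week"]
            and stats["error_rate"] <= GOOD_PERFORMANCE["error_rate"]
            and stats["response_hours"] <= GOOD_PERFORMANCE["response_hours"])

print(meets_definition({"units_per_week": 42, "error_rate": 0.01,
                        "response_hours": 12}))
```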

In some cases this may insulate decisions from challenge by an aggrieved ex-employee (because the outcome is supported by the data); in other cases it may leave organisations more vulnerable (because the data do not support that outcome). Organisations should take account of the data collected in relation to their decision-making and ensure that uncomfortable or inconvenient aspects are not ignored.

This will all have a direct impact on the day-to-day role of HR professionals and employment lawyers. As more and more decision-making is based on collected data, organisations should prepare for more arduous disclosure exercises should litigation loom (and lawyers should prepare to develop new analytical skills so that they can fully understand algorithms and data tracking methods). There are also significant employee relations issues. What differentiates HR analytics from other forms of data science is that it does not measure numbers of widgets produced, or fuel efficiency, or clicks on an ad. The “data” are about real people - employees, team members and colleagues. Even where the technology enables (and the law may allow) an organisation to track certain behaviours, employees who are not accustomed to such interference may well be resistant.

Conclusion

Workplace data science provides organisations with opportunities to make future decision-making better, faster, supported by evidence and, potentially, far more transparent (some algorithms lay bare their methodology). Questions such as “Who made the decision?”, “What was the decision-making process?” and “On what basis was that decision made?” are often at the core of a dispute. For those whose job it is to anticipate, avoid or resolve workplace problems, data science practices may well start dominating their email inboxes in the near future.

By Laurence Mills, Lewis Silkin LLP
