On the future of data science.

Joseph Cooper
Gousto Engineering & Data
7 min read · Apr 14, 2021

Photo by Javier Allegue Barros on Unsplash

So I’ve been working at Gousto for almost a year now, and it has been amazing. The company has grown an unbelievable amount since I joined in April 2020, going from 100 people in tech to around 150 in less than a year. This growth has meant that we have faced all sorts of scaling challenges. For the data science team it has been particularly interesting: we have seen our numbers grow from just 5 people to 15. This rapid growth is necessary because Gousto places huge importance on data science and on using machine learning techniques to improve efficiency and create value. This is somewhat rare among companies, which often have a lot of data but little awareness of how it can be used. This got me wondering how data science will evolve in the coming years, and whether it is still the sexiest job of the 21st century or whether the field was overhyped.

What is a data scientist?

Well, most people would say something along the lines of: a data scientist solves complex problems using data and machine learning techniques. Data science is itself a multidisciplinary field that combines computer science and statistical methods with business expertise. With such loose descriptions, the field is open to many people, as long as they have some experience and skill in the most common areas of data science work: statistical analysis, data visualisation, applying machine learning methods, and understanding and breaking down abstract problems for the business. There’s really no hard rule as to which of these skills you need more or less of, and this list is by no means exhaustive; it all depends on the needs of the business.

If you are reading this article then you are probably well aware that there is a huge amount of hype around data science; it’s often quoted as being the sexiest job of the 21st century. The field has exploded in recent years, with LinkedIn suggesting growth of over 660% since 2012, and many experts believe that even more growth is yet to come. So it’s a great time to be in the industry, and incredibly exciting to be part of this growth. But the real question I want to ask today is: what does the future look like for a newcomer, and where will they be in 10 years’ time?

Perfect Future

To start answering this question, we should first imagine what the perfect future for data science looks like, as this will give us an idea of where we want to be. Ideally we would see steady growth over the next 10 years, with the field slowly expanding, new positions opening up, new techniques being unlocked and businesses placing more importance on the field. A Capgemini survey in 2015 showed that 60% of businesses surveyed agreed that data science was necessary to stay competitive. Since then, separate surveys have shown the share of productionised data science products rising from 13% in 2017 to 20% in 2019, and we would like to see these numbers continue to increase. On top of this, for people fresh to the field, we would like job prospects to be similar to what they are now.

The ‘perfect future’ we have outlined is not unrealistic at all; data science has exploded over recent years. The real trigger for this explosion is the quality and volume of data that companies can now collect. With almost every detail about the customer being recorded, as well as all their behaviours, companies now have colossal datasets at their disposal. Data scientists have become incredibly important as a way of analysing this data, building machine learning models on it and, overall, extracting value from these datasets. So it is realistic to assume that the demand for data science will continue to grow as innovation in analysis and machine learning continues.

It also stands to reason that as the field grows, more positions will open up, as more data scientists are required for analysis work. This will become more important as new techniques become established and proven as revenue streams. For example, recommendation systems are now commonplace in most companies and are proven drivers of revenue growth, so it is likely that companies will always have a recommendations data scientist in the future.

This all sounds very positive, but there are a couple of things that still need to be considered, especially by a newcomer to data science. Two things in particular are, I think, important factors in the future of data science:

  1. The impact of AI on data science
  2. Specialist data science

Will AI change the future?

We can’t really talk about data science without talking about AI. Some people would argue that they are one and the same, and certainly they are similar enough that it is tricky to come up with a clear distinction between the two. This is mostly because machine learning is a form of AI, so you could argue that a data scientist also works on AI. But I would say they differ in focus. A data scientist focuses more on analysis and output, driving value through their analysis, whereas working with AI means building self-sustaining systems that produce an output with little to no interaction from the maker, with the focus more on an evolving system than on analysis.

The value of AI is still to be fully realised, so it could be the deciding factor in the future of data science. If more genuine use cases can be found for advanced AI then the field will continue to grow, and we will see new departments open up and more opportunities for newcomers.

Data Science at Gousto

Gousto has some really great use cases for data science and has positioned itself so that data science and AI are an integral part of operations. Our main data science products have demonstrated to me how data science can be used to produce value and, importantly, that this is done in very specialised circumstances. We have some algorithms that are centred around customer experience and providing a variety of recipes to our customers. We have other algorithms that are more operationally centred and are used to help distribute boxes to our customers across the country. These are obviously very different problems: not only do they have completely different objective functions, but the accuracy of the training data also differs completely between the problem spaces. This has required Gousto to invest in very specialised roles in its data science team. There is still plenty of room for improvement and many more areas that need exploration, and Gousto will continue to innovate and improve upon the data science algorithms we have.

Specialism in Data Science

More and more, within some companies, we are seeing the granularisation of the data scientist’s role: more expertise is required to complete a given work stream or to stay competitive, and so more specialised knowledge is required. This is a very natural continuation of the field, but it does mean that the generic role of ‘data scientist’ is likely to become less and less common. Soon it will be more common to specialise earlier, so maybe you will become a recommendations data scientist or an NLP expert. We already commonly see two areas of a data scientist’s workload split off: data engineering and machine learning engineering. These two roles were once firmly within a data scientist’s remit. Now that a huge amount of investment has gone into the infrastructure that supports machine learning models, an engineer is required to understand this infrastructure in more detail. Similarly, a large part of a data scientist’s remit would once have been to clean and organise data. When I was doing my training courses some years ago, an often-quoted saying was ‘90% of a data scientist’s job is cleaning data’. For most people this is simply no longer true, as this job is mostly taken on by data engineers.

Gousto is already hiring with specialisms in mind. We recently had roles available for a forecasting data scientist and a recommendations data scientist, because we are now well aware that the specialist is going to be important going forward. Of course people can always learn and can always re-specialise, but bringing that ready-made expertise into the team is important for us.

There are a couple of negatives to breaking down the workload of a data scientist, at least in terms of job prospects. It means that certain roles will become more competitive and more dependent on experience. It also means that there will be many more junior roles but not necessarily more senior roles, so there could be significantly more competition for, say, a head of data science position. The point is that we shouldn’t expect the job roles that exist today to be the same in 10 years’ time. For someone who wants to climb the corporate ladder quickly in particular, I think it is worth considering whether data science is the best way of doing that, as senior roles will inevitably become more competitive over the years.

Conclusion

It seems as though data science is still in a great place. A lot of hype surrounded the field in the early 00s, and for the most part the hype was lived up to. Now we are moving into a much more stable phase where certain algorithms and techniques are commonplace and often expected in commercial settings. The next phase will most likely be the further granularisation of roles in the field.

This granularisation of work is a great thing in many ways: it means that there is a lot more opportunity for new data scientists entering the field to make a name for themselves. I personally think it is also really exciting. It means that there is more opportunity to focus on a particular subject and become an expert in it, more potential to create clever algorithms, and more scope to push the boundaries of what data science is capable of.
