Who is a Data Science Engineer, and is there any difference between one and a Data Scientist? Your answer to this question is an indicator of the success or failure of your Data Science project.
Software development is a risky business, but the risks skyrocket when you give in to your client's request “to add some AI to the system”. Your next move is to hire Data Scientists and delegate this task to them.
“What are you doing?” the Scrum masters wonder. “Building models,” the Data Scientists say. After a few months, you start to realize you cannot control this process. Why?
Because the Data Science process IS NOT the Software Development process! Any attempt to ignore this fact quickly brings the project to an epic fail.
The Data Science process has its own set of inputs, outputs, roles, deliverables and process flow. Look at some of the established process models: TDSP, CRISP-DM, KDD, SEMMA. I hope your Project Managers are aware of them. But even if they are, an epic fail is still the most likely outcome.
Why? Because there is a gap between these two processes: Data Science and Software Development.
Your developers may know everything about programming, but they hardly know anything about Statistics, Machine Learning, or Data Science (apart from popular articles). Likewise, Data Scientists know nothing about building Line-of-Business applications and, truth be told, are not very strong in programming. How are you going to cope with this?
Project Manager staring at a Data Science project.
A Data Science Engineer is the answer. You need a person:
- with strong programming skills
- with basic Statistical skills
- with an understanding of the Data Science process and the ability to participate in each stage of it
- with knowledge of specialized DS languages and tools such as R, TensorFlow and so on.
Data Scientists and developers live in different worlds. A Data Science Engineer lives in both. This person is the magic adhesive tape without which your Data Science project will fall apart.