Code review for statisticians, data scientists & modellers

submited by
Style Pass
2024-09-19 11:30:07

I work as a data scientist alongside people who typically have academic backgrounds in statistics, epidemiology, or something of a similar feel. Some of us have job titles relating to data science, other to mathematical and statistical modelling. None of us are software developers by trade.

However, we’re writing software all day. We’re writing mathematical/statistical models, creating data-driven products and doing data engineering. We quality assure all of this work via code review.

The honest answer to Who is this for? is that this is for me. It’s for a previous version of myself who knew the mechanics of reviewing code, but could have exercised a bit more much more nuance.

That being said, I think anyone working in some kind of modelling or analytical role, whose main output is some sort of mathematical or statistical analysis, but writes code to implement their models could find this useful.

It doesn’t really matter what programming language you use, or whether you’re storing your code on GitHub, GitLab, BitBucket or whatever else. However this will be biased towards R and GitHub because that’s what I use. The concepts are important here, not the precise toolkit.

Leave a Comment