So, you want to make your code more readable. Or, someone has told you, “Please make your code more readable!” This post is for you. There are a few categories to consider when making code more readable:
- Name your variables something understandable. You’re not going to impact code performance by giving things longer, more understandable names. So instead of
- Name your functions something understandable. Please don’t do something like fcn1! Names like count_SNPs are much clearer and help people intuit what your code is doing.
- Pick a standard and stick with it. Much digital ink has been spilled over capitalization and word separation, but in the end it only matters that you’re consistent. Decide if you’re going with age.first.cancer.dx and stick with it!
- Functions should do one thing, without side effects. In other words, if you need to do four things to a data set, like interpolate missing data, center and scale the data as z-scores, create a linear model, and output a table, split those into four separate functions. Make sure that your function doesn’t reach outside of itself to change other things in the environment. It should only manipulate things that are passed into it, and then return a result.
- Even if you can do something on one line and it seems cool and efficient and a testament to your coding chops, is it really understandable to the next person working on the code? Try to limit the scope of a single line of code. If you’re programming in a language that allows piping from line to line, that’s a good way to spread out the logic so that it’s more readable.
- In some languages (like Python), whitespace matters. In most, it doesn’t, at least not to the interpreter or compiler. However, whitespace helps humans! Decide ahead of time how you’re going to indent code that wraps over several lines or is nested. Two spaces? Four? Consider lining things up so that they make sense, and keep lines under 80 characters wide (or even narrower, to give you room for commenting).
- While developing, write yourself notes in comments. “I did it this way because it’s an unbalanced design, so standard ANOVA isn’t the right choice.” “I am breaking this up into several chunks because otherwise I run out of memory.” This will help scaffold your later, more polished comments, and it will help you when you walk away for a few days and then come back with imperfect memory of what you did and why.
- As your code product gets more mature, standardize your comments. Decide how you’re going to comment, whether you’re going to use first or third person, if you’re going to comment just before a line of code or alongside the code, etc. Assume you’re going to win the lottery, leave your job, and leave your code to someone with little understanding of what you’ve done.
- At the top of your code, include a comment with at least the following: what the code does, what the code expects (e.g. a certain kind of data in a certain format), the date it was completed, your name, and any dependencies (is this written to run only in Windows? Do you need special libraries installed?).
- Consider a “literate statistical programming” approach, where you can write whole paragraphs, with formatting, to describe and explain your code not only in technical terms but in actual reasoning. Why does it matter that you center and scale the data? That you apply a logistic model?
- Ask someone to look over your code. Have them run it on their own computer with no assistance. Can they do it? Can they understand what you’ve done, and why?
- Ask to see code from other people whose work you appreciate, or who work in your field. What bugs you? What do you find helpful?
- Ah, version control. Use it, and use it often, with good explanations of your changes. This will help someone understand the story of your code and how it developed!
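
To make a few of the tips above concrete (descriptive names, one naming convention, single-purpose functions without side effects, short lines, and a header comment), here is a minimal Python sketch. Everything in it is hypothetical and only for illustration: the function names, the sample data, and the choice of snake_case are assumptions, not a prescribed standard. It splits the "four things" from the functions tip into four small functions.

```python
"""Fill gaps in a series, standardize it, fit a line, and report the fit.

Expects:      a list of numbers, with None marking missing values
              (the first and last values must be present).
Dependencies: Python 3 standard library only; any operating system.
Author, date: placeholders, fill in your own.
"""
from statistics import mean, stdev


def interpolate_missing(values):
    """Return a copy of values with each None linearly interpolated."""
    if values[0] is None or values[-1] is None:
        raise ValueError("first and last values must be present")
    filled = list(values)  # work on a copy so the caller's list is untouched
    for i, value in enumerate(filled):
        if value is None:
            left = i - 1  # always known: earlier gaps are already filled
            right = next(j for j in range(i + 1, len(filled))
                         if filled[j] is not None)
            fraction = (i - left) / (right - left)
            step = filled[right] - filled[left]
            filled[i] = filled[left] + fraction * step
    return filled


def standardize(values):
    """Return z-scores, using the sample standard deviation."""
    center = mean(values)
    spread = stdev(values)
    return [(value - center) / spread for value in values]


def fit_line(x_values, y_values):
    """Return (slope, intercept) of the least-squares line."""
    x_mean, y_mean = mean(x_values), mean(y_values)
    covariance = sum((x - x_mean) * (y - y_mean)
                     for x, y in zip(x_values, y_values))
    x_variance = sum((x - x_mean) ** 2 for x in x_values)
    slope = covariance / x_variance
    return slope, y_mean - slope * x_mean


def format_table(slope, intercept):
    """Return the fit as a plain-text table; printing is left to the caller."""
    rows = [("term", "estimate"),
            ("slope", f"{slope:.3f}"),
            ("intercept", f"{intercept:.3f}")]
    return "\n".join(f"{name:<10}{value:>10}" for name, value in rows)


if __name__ == "__main__":
    raw_measurements = [1.0, None, 3.0, None, None, 6.0]  # made-up data
    filled_measurements = interpolate_missing(raw_measurements)
    z_scores = standardize(filled_measurements)
    time_points = list(range(len(z_scores)))
    slope, intercept = fit_line(time_points, z_scores)
    print(format_table(slope, intercept))
```

Each function takes its inputs as arguments and returns a new value, so raw_measurements is still intact after the script runs, and any one step can be read, tested, or swapped out without touching the others.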