Programming Style Guidelines

submitted by
Style Pass
2021-05-22 14:00:07

In the descriptions that follow, the terms "component", "block", "subroutine", and "module" are used. A block refers to a group of related statements within a subroutine. A subroutine refers to a procedure, function, method, or exception handler. A module refers to a collection of related subroutines, a class, or another data type abstraction. A component refers to a block, subroutine, or module.

A poor programming language should never be used as an excuse for writing poor code
One's ability to follow these (or any other) guidelines is affected by the programming language used to write the program. However, lack of language support should not interfere with attempting to conform to the spirit of the guidelines. Conversely, use of the most supportive language does not guarantee the production of high-quality code.

"There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code." (Flon's Law)

Optimize the programmer, not the program
Do not worry about efficiencies of program size or execution speed. Small efficiencies are almost never needed; major efficiencies are seldom possible. Saving a second of execution time or 1K bytes of memory in a computer with 1GB of memory is hardly worth the extra time or cost in program development and maintenance. Explore major efficiencies only in those rare cases where there is an obvious need.

"More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason - including blind stupidity." (W.A. Wulf)

A program is like a story: how you say it is as important as what you say
Write for the benefit of a reader who is totally unfamiliar with your program. That reader could be you. According to Eagleson's law, "any code of your own that you haven't looked at for six or more months might as well have been written by someone else."
"Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to to, let us concentrate rather on explaining to human beings what we want a computer to do." (Donald Knuth) "The process of preparing programs for a digital computer is especially attractive, not only because it can be economically and scientifically rewarding, but also because it can be an aesthetic experience much like composing poetry or music. (Donald Knuth) Comments & indentation are an important part of the story Use comments to summarize or clarify. Use indentation to convey the program's structure. "If the code and the comments disagree, then both are probably wrong." (attributed to Norm Schryer) Comment the purpose of a module and subroutine A module's or subroutine's declarations and comment should provide the user with all information needed to use the module or subroutine. Good code is self-commenting Do not over comment. Comments should add only information not apparent from reading the code. Common situations where inline comments are added are to summarize the purpose of a group of statements or clarify the purpose of a single complex statement. Comments should be accurate, concise, and assume literacy in the program's language Make comments standout Use visual clues (like surrounding a multiline comment with asterisks) to make comments standout. Depending on the situation, the programmer may need to focus just on the code or just on the comments. Such visual clues make it easier to ignore or find the comments. Use blank lines to separate blocks of code Use consistent indentation Not only should both sections of an IF-THEN-ELSE be indented the same amount, the indentation amount for all control structures should be the same. Use different indentation for continued statements Statements that must be continued onto other lines should be indented but indented a different amount than program blocks. 

Use indentation on comments as well as code
Aligning comment information is as important as aligning code.

Estimates of how long it will take to write a program are usually too small; plan accordingly
Hofstadter's Law offers sound advice on how long it takes to write a program. "It always takes longer than you expect, even when you take into account Hofstadter's Law."

"The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time." (Tom Cargill)

If you do not understand the problem, you cannot design a solution
Always manually solve a problem given a variety of inputs to gain a better understanding of both the problem and its solution.

"The most important single aspect of software development is to be clear about what you are trying to build." (Bjarne Stroustrup)

In prose or in programming, clarity begins with a well-designed structure
Any non-trivial program is too complex to be understood as a whole. How a problem is decomposed into modules, subroutines, and blocks is the most important task in creating a clear program.

"The open secrets of good design practice include the importance of knowing what to keep whole, what to combine, what to separate, and what to throw away." (Kevlin Henney)

"Ugly programs are like ugly suspension bridges: they're much more liable to collapse than pretty ones, because the way humans (especially engineer-humans) perceive beauty is intimately related to our ability to process and understand complexity." (Eric S. Raymond)

"There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies." (C.A.R. Hoare)

Re-use
Check standard and personal libraries for useful modules. There is no need to write what has already been written.
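
As a small illustration of re-use, the sketch below leans on Python's standard `collections.Counter` instead of a hand-written tallying loop. The function name `most_common_word` and the whitespace-based word splitting are assumptions made for the example.

```python
from collections import Counter


def most_common_word(text):
    """Return the most frequent word in text, or None if text has no words."""
    words = text.lower().split()
    result = None
    if words:
        # Counter.most_common(1) returns a one-element list holding the
        # (word, count) pair with the highest count.
        (result, _count) = Counter(words).most_common(1)[0]
    return result
```

The library class already handles the counting and the ranking, so the subroutine only has to state what it wants.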

Think generic
Often, a generic component can be written as easily as a specific one. Generic code provides a more adaptable tool. For example, a subroutine that finds the Nth item in a list may be no more complicated than a subroutine that finds the 3rd item, but the more generic subroutine is far more useful.

High cohesion, low coupling
Design highly cohesive components while minimizing interdependencies between components. (Cohesion is a measure of the extent to which components in a module relate to each other.)

Make coupling visible
Any coupling that does exist between components should be visible to a reader of the program. Not being able to determine whether a method is a mutator or an accessor hides the degree of coupling between the calling code and the method.

Replace repetitive code with calls to a function

Encapsulate
Details that are irrelevant outside a component should be encapsulated within the component. Use local variables, constants, data types, and subroutines whenever possible.

Given a choice between clarity and length, pick clarity
All other things being equal, shorter will usually be clearer. But clarity is the objective, not brevity. Brevity at the expense of clarity is no bargain.

If mid-block declarations are needed, the block is probably too long
If placing a variable declaration at the beginning of a block makes the declaration seem too far removed from where the variable is used, the block is probably too long.

Use recursion for recursive data structures

"Use data arrays to avoid repetitive control sequences." [1]
For example, if a program needs to convert a month value from an integer to a string, defining a 12-element array containing all the month names is preferable to creating a 12-way branch.

If the code is not clear, it probably could be. Make it so
Unclear code is usually the result of either a poor design or trying to be clever. Rethink the program design to resolve the former. Avoid the latter.
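
The month-name conversion mentioned under "Use data arrays to avoid repetitive control sequences" can be sketched as a table lookup. The names `MONTH_NAMES` and `month_name`, and the choice to number months from 1, are assumptions for this example.

```python
MONTH_NAMES = (
    "January", "February", "March", "April",
    "May", "June", "July", "August",
    "September", "October", "November", "December",
)


def month_name(month_number):
    """Return the English name for a month number in the range 1..12."""
    if not (1 <= month_number <= len(MONTH_NAMES)):
        raise ValueError(f"month number must be 1..12, got {month_number!r}")
    # Months are numbered from 1 but the tuple is indexed from 0.
    return MONTH_NAMES[month_number - 1]
```

One range check and one index expression replace what would otherwise be a 12-way branch, and supporting another language would only require a second table.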
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." (Brian W. Kernighan) "Good code is its own best documentation. As you're about to add a comment, ask yourself, `How can I improve the code so that this comment isn't needed?'" (Steve McConnell, Code Complete) Use meaningful identifier names That includes label, variable, constant, type, literal, subroutine, and module names. Use different naming conventions for different types of identifiers Some naming conventions include saddleback case, titlecase, underscore between words, all lowercase, all uppercase, and initial letter uppercase. Initialize variables just before their use but do not initialize just to initialize Do not initialize a variable in its declaration, initialize it just before its first use. If a variable is assigned a value in its first use, that is its initialization, do not otherwise initialize it meaninglessly. Use constants Statements like `totalMonths := totalYears * MONTHS_PER_YEAR' that use constants are more readable that those using literal values. In addition the literal 12 may also serve other purposes (eg, the conversion factor from inches to feet). Searches for a constant yield more accurate results than searches for a literal. Declare all variables and constants at start of a block This creates a variable and constant "glossary" for the reader of the block. (See the guidleine about block size and mid-block declarations under well design structure above.) Eliminate unused local variables or constants Unused local variables or constants mask a program's intent with irrelevant "noise". (However, some modules may have "public" variables or constants not used in a particular program but provided for generic usefulness. 
For example, a module of mathematical functions may define the constant PI to make the module more generically useful even if PI is not used in a particular program.) Use each variable for only one purpose For example, a variable should not be used at one point to store a count of apples, then later used to store a count of pears. Avoid temporary variables, but when used, two-letter names are ok Occasionally, a situation arises where a variable is used only over the span of a few lines and for which no meaningful name exists (other than `temp'). On those occasions, variable names like 'ii' or 'jj' are easy to type, easy to search for, and appropriately convey the temporary purpose of the variable. At most, one statement per line Multiple declarations statements, declarations of multiple variables, multiple assignment statements, and multiple procedure calls should not be placed on a single line. (A procedure is a subroutine that does not return a value, ie., is not used within an expression.) This applies to declarations and executable statements Parenthesize expressions Even if you have memorized the precedence rules for the language in which the program is written, a reader of your program (which may be you in a year) may not have. Resolve all non-trivial ambiguity by parenthesizing expressions with more than a single operator. (For the amount of effort required to follow this guideline, it probably provides the most benefit of any the the guidelines listed here.) Avoid complex expressions Complex expressions often can be made clearer by converting them to a simpler equivalent expression (such as by apply DeMorgan's law). Separating a complex expression into parts and using variables to store sub-totals can also be used to simplify complex expressions. Avoid unnecessary loops and branches All other things being equal, fewer loops and branches means less complexity. 
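
To illustrate "Parenthesize expressions" and "Avoid complex expressions", the sketch below shows one condition written two ways. The boarding scenario and every name in it are hypothetical; the second version applies De Morgan's law, not ((not A) or B) = A and (not B), and stores the intermediate result in a named variable.

```python
def can_board_original(has_ticket, is_banned, gate_open):
    # Fully parenthesized, but the double negation is hard to read.
    return (not ((not has_ticket) or is_banned)) and gate_open


def can_board(has_ticket, is_banned, gate_open):
    # De Morgan's law turns not((not A) or B) into A and (not B).
    passenger_ok = has_ticket and (not is_banned)
    return passenger_ok and gate_open
```

Both functions return the same result for every combination of inputs; the second simply states the rule in the positive.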

If your code has more than five levels of indentation, you probably need to simplify
Either your control structures are needlessly complex or some of the complexity should be transferred into a subroutine.

Avoid multiple exits from loops and subroutines
Avoid 'break', 'continue', 'return' and 'exit' statements (except for 'return' statements at the end of a subroutine).

"Make sure special cases are truly special" [1]
At first, it may seem easier to write separate code to handle any special cases of a problem. But writing a single block of code that handles all cases typically creates a simpler program. For example, when summing the elements of a list, separate code for handling an empty list should not be needed. A generic loop whose body adds one additional element to the sum requires no special case if it is written so that the body executes zero times when the list is empty.

It is not important that a program works with some possible inputs; it is important that it works with any possible inputs
Consider an airplane controlled by a program. Which is more dangerous: the plane that can do nothing because the program does not work at all, or the plane with a program that works 95% of the time?

"That's the thing about people who think they hate computers. What they really hate is lousy programmers." (Larry Niven and Jerry Pournelle, Oath of Fealty)

Floating point values are approximate
Be careful of the effects of integral floating point values. For example, the value (6.3/2.1) when converted to or printed as an integer may yield 2 since the value may be approximated as 2.999999. Also be careful when testing for the equality or inequality of floating point values. For example, a comparison of (6.3/2.1) and 3.0 may evaluate to unequal.
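
A short Python sketch of both floating point pitfalls follows. The helper names `nearly_equal` and `to_nearest_int` and the tolerance values are assumptions chosen for illustration. Whether 6.3/2.1 itself misbehaves depends on the floating point format in use, so the comments rely on 0.1 + 0.2 and (0.1 + 0.7) * 10, which are known to be inexact in IEEE 754 double precision.

```python
import math


def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two floats with a tolerance instead of using ==."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


def to_nearest_int(x):
    """Convert to an integer by rounding rather than truncating."""
    return int(round(x))


# In IEEE 754 double precision:
#   (0.1 + 0.2) == 0.3              is False
#   nearly_equal(0.1 + 0.2, 0.3)    is True
#   int((0.1 + 0.7) * 10)           is 7, because the product is just under 8
#   to_nearest_int((0.1 + 0.7) * 10) is 8
```

Applied to the example in the text, `to_nearest_int(6.3 / 2.1)` yields 3 and `nearly_equal(6.3 / 2.1, 3.0)` is true whether or not the quotient is stored as exactly 3.0.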
"Watch out for off-by-one errors"1 The initial array element may have index zero or one (depending on the language), the size of an array is one greater than the index of the last element, comparing adjacent elements in an array requires only N-1 comparisons for an array of N elements, etc. Test for valid input Assume input may be anything. Check that the input exists (that EOF has not been reached), is in proper format, and specifies values within acceptable limits. Never trust the user will input what your program is requesting. "Any fool can use a computer. Many do." (Ted Nelson) Program defensively Assume your program will have bugs. Program to catch them. Defensive program does not eliminate errors. It decreases the likelihood errors will go undetected, and when detected, it decreases time needed to debug them. Test for valid subroutine argument values Test that the arguments passed into a subroutine's parameters contain acceptable values. For example, a floating point square root function must check for negative input. Use assertions to catch program bugs Consider the most likely places for your code to contain errors and add assertions to detect those errors. Use asserted default branch In situations where a program expects one of N possibilities and has a branch to check each one, add another branch that raises an assertion if ever reached. For example, when expecting a zero or positive value and branching to handle those cases, write a 3-way branch with first handling zero, the second handling positive, and the third (or default) raising an exception. Listen to compiler warnings The objective after receiving a warning is not to eliminate the warning, it is to remove the inconsistency in the code that caused the warning. 

"If you lie to the compiler, it will get its revenge." (Henry Spencer)

The objective of debugging is to find bugs, not to prove correctness

"It's hard enough to find an error in your code when you're looking for it; it's even harder when you've assumed your code is error-free." (Steve McConnell, Code Complete)

Fix the bug, not just the symptom
Eliminating the symptom of a bug does not mean you have eliminated the bug. When correcting bugs, your objective should be to identify why the bug occurs, then correct the code that resulted in the improper behavior. If a change to the code eliminates the symptom of the bug but you did not first identify why the program ran improperly to begin with, you may have just masked an error that will reappear under different circumstances.

All errors are significant
History is filled with seemingly minor errors that have had huge consequences. Assume any error is significant, even errors in output such as a missing blank character, a misspelled label, incorrect precision of a floating point value, or a missing newline character.

Provide useful error messages
Give some thought to the content of error messages. Error messages have two audiences: the user and the programmer. The error message should convey to the user information that allows the user to respond appropriately to the error. If the error is potentially the result of a system error or program deficiency, the error message should convey the information the programmer needs to correct the problem. Serving these two audiences is not a trivial task.

To conclude, I will borrow from the Epilogue of The Elements of Programming Style. "The essence of what we are trying to convey is summed up in the elusive word 'style'. It is not a list of rules so much as an approach and an attitude." [1]

Quotes were obtained from a web page by Bob Archer.

[1] The Elements of Programming Style, Kernighan and Plauger, 1978.
