Issues Involved When Using Variables
A variable is a named location in memory. When using variables in a program, there are a number of factors that the programmer must take into consideration:
Scope is the extent to which a variable is known in a program. A global variable, which is discussed in the final section of this web page, is a variable with unlimited scope: it is known throughout an entire program. As discussed later, the use of global variables is discouraged.
In general, a programmer's goal is to limit the scope of a variable as much as possible. Variables with limited scope make a program easier to read. If you can confine the scope of a variable to a single function, you do not have to worry about the value of that variable in other parts of the program. But, when a program utilizes global variables, or variables with very wide scope, then understanding any part of the program which uses these variables requires an understanding of all of the other parts of the program which use these variables.
In addition, wide scope variables make code harder to debug and correct. When the scope of a variable is localized, we know that we only need to look through a limited number of lines of code where the variable is in-scope to determine the possible source of error. The wider the scope, the more places where the error might be hiding in our code.
Variables should be declared as close as possible to their point of use. For example, if you declare a variable hundreds of lines of code before its actual use, you place a greater strain on your reader to understand the intent of your code.
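As a rough illustration of limited scope and of declaring variables near their point of use, the C++ sketch below keeps each variable inside the smallest block that needs it. The function name averageOfPositives and its data are hypothetical, chosen only for this example.

```cpp
#include <vector>

// Hypothetical helper: average of the positive entries in a list.
double averageOfPositives(const std::vector<double>& values)
{
    double sum = 0.0;   // used by the loop and the final division
    int count = 0;      // likewise, so both are declared just before the loop

    for (double v : values) {       // v is in scope only inside the loop
        if (v > 0.0) {
            sum += v;
            ++count;
        }
    }

    if (count == 0) {
        return 0.0;                 // no positive values: avoid dividing by zero
    }

    double average = sum / count;   // declared and initialized at its point of use
    return average;
}
```

If the result is wrong, the only code that can have changed sum, count, or average is inside this one function.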
Always initialize variables; you should never leave the initial value of a variable to chance. Remember that when you declare a variable, you are asking the compiler to set aside sufficient memory to store it. However, you have no way of knowing what bit pattern already occupies that part of memory or how it will be interpreted as a value. Initialize your variables to avoid any chance of problems.
One particular source of errors for beginners is the use of C-style strings. C-style strings operate on the principle that somewhere in the array of characters there is a null character, \0, marking the end of the string. Objects like cout in C++ look for the null character to determine when to stop printing the string. If a null character is not found, cout will continue reading beyond the end of the memory actually reserved for the string, and a segmentation fault may result. Segmentation faults are run-time errors and, like most run-time errors, are difficult to isolate in a program. To save time and effort, always initialize C-style strings.
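The advice on initialization might look like the following C++ sketch. The helper names copyName and nameLength, the buffer size, and the sample text are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most capacity - 1 characters from source
// and always writes a terminating '\0', so dest is a valid C-style string.
void copyName(char* dest, std::size_t capacity, const char* source)
{
    if (capacity == 0) {
        return;                               // no room even for the terminator
    }
    std::strncpy(dest, source, capacity - 1); // may leave dest unterminated
    dest[capacity - 1] = '\0';                // so terminate it explicitly
}

// Hypothetical demonstration: every variable has a known value from the start.
int nameLength()
{
    int length = 0;       // initialized, never left to chance
    char name[16] = "";   // all 16 elements start as '\0'
    copyName(name, sizeof name, "Regina");
    length = static_cast<int>(std::strlen(name));
    return length;
}
```

Because name is initialized to an empty string, it is a valid C-style string even before copyName runs, so passing it to cout or strlen at any point stays within the array.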
Binding time is the time when a variable assumes its initial value. Some values are fixed at the time the code is written, such as literals. Some take on their value at compile time, such as #defined constants. Others do not take on their value until the user inputs it; these variables bind at run time. Still others do not bind until they are created dynamically. These variables are usually associated with pointers and with the new and delete keywords, as in C++.
In general, it is the programmer's goal to have a variable bind as late as possible. Why? If we can get a variable to bind to its value as late as possible, our program becomes more flexible. Suppose, for example, that a program binds all its values at compile time. Since the values of the program's variables are set before it is run, our program can only work with the data as it is given. Yet, a program that cannot vary its input is not particularly useful to a user who may wish to input a number of variables for the program to process.
There is a relationship between the structure of the data and the control structure used to process it. For example:
Sequential data is a group of data items that exist in a particular order. Sequential data is handled with sequential statements. For example, if you have a Name, Student Id, Class, and Grade, they could be output with a series of output statements.
Selective data is multiple pieces of data that exist at the same time, of which only one piece is to be used at a time. Selective data is handled with if...then...else or case type structures. For example, if you have the data from before (Name, Student Id, Class, and Grade), then you must use a selection statement to check whether or not the student has a high enough grade to pass a test.
Iterative data is a collection of many data items of a particular type, normally stored as an array of records or classes. Iterative data is normally handled with loop type structures. For example, if we placed the data from the previous examples in a structure and then made an array of these structures, we could use a loop to iterate through the array, outputting the data from each structure one by one.
A variable should be used for one and only one purpose. If you are using a variable for more than one purpose, you are doing yourself a disservice. First, you are making your program much harder to understand. The reader of your program may not realize that the variable's purpose has changed, or at what point in the code it changed. Second, the fact that you have an unneeded variable available for another purpose suggests a scope problem. Quite possibly, the variable could have gone out of scope before you used it again for a different purpose.
Instead, declare a separate variable for each and every purpose in your program. While it is true that memory is not an unlimited resource, declaring additional variables makes your program easier to read, easier to test, easier to debug, and easier to maintain.
It is also important to avoid using variables whose values have "hidden" meanings. For example, a program may be written such that the value of -1 in a variable indicates an error condition. This practice is an example of very poor programming. It is not self-evident that -1 means anything other than negative one. The special meaning of this number can only be deduced (if at all) after a very careful examination of the program.
In the example given above, the use of a boolean variable with a descriptive name is far more appropriate. If the variable's value is a logical true, it may indicate that the error has not occurred; that is, the process succeeded. A logical value of false may indicate that the error has occurred; that is, the process failed.
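A minimal C++ sketch of this idea follows. The function parseGrade, its three-digit limit, and the variable names in the usage note are hypothetical. Success or failure is reported through a boolean, and the integer carries nothing but the grade.

```cpp
#include <string>

// Hypothetical helper: converts text such as "87" to a non-negative integer.
// Returns true on success and stores the result in grade.
// Returns false on failure and leaves grade unchanged.
bool parseGrade(const std::string& text, int& grade)
{
    if (text.empty() || text.size() > 3) {
        return false;                     // nothing to parse, or too long
    }
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') {
            return false;                 // not a digit
        }
        value = value * 10 + (c - '0');
    }
    grade = value;
    return true;
}
```

A caller might write bool gradeIsValid = parseGrade(input, grade); and then test gradeIsValid, which reads more clearly than comparing grade against -1.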
Finally, don't declare a variable unless you actually intend to use it. Many compilers will flag such variables as "unreferenced," but some will not, so it is the programmer's responsibility to make sure all variables are used. An unreferenced variable leaves the reader wondering how it is ultimately used in the program. Consequently, unreferenced variables make your program harder to read and possibly harder to debug, test, and maintain.
As mentioned earlier in this web page, the use of global variables is discouraged because of the potential problems inherent with their use. The problems associated with global variables are discussed below:
When two or more functions access the same global variable, serious problems may arise. A particular function may be expecting a certain value to be held in the global variable, but the global variable's value has been changed accidentally in an intervening process.
Aliasing occurs when one variable is referenced by more than one name. For example, a function that receives a global variable through a reference or pointer parameter and also manipulates the global variable directly may unwittingly change the same location in memory through two different identifiers, with unexpected results.
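The aliasing problem, and one possible fix, can be sketched in C++ as follows. The global g_total and both function names are hypothetical, and passing by value is only one of several ways to remove the alias.

```cpp
int g_total = 10;   // global variable, prefixed with g

// Problem: if the caller passes g_total itself, amount is an alias for it.
// The first addition then changes amount as well, so the second addition
// adds the new, larger value.
void addTwiceAliased(int& amount)
{
    g_total += amount;
    g_total += amount;
}

// One fix: take both inputs by value and return the result, so the function
// works on private copies and touches no global data.
int addTwice(int total, int amount)
{
    return total + amount + amount;
}
```

Starting from g_total equal to 10, the call addTwiceAliased(g_total) leaves 40 in g_total rather than the 30 a reader would likely expect, whereas addTwice(10, 10) returns 30.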
Many recent operating systems allow code to be entered by more than one thread of control. This permits more than one instance of a program to run simultaneously. Yet, if each thread has access to the same global variable, one thread may change the global variable to a value that the other threads do not anticipate, with unexpected results.
Ideally, a function, routine, or module should be portable. This means that code should be capable of being taken out of one piece of software and "snapped into" another piece of software. However, a routine that uses global data is dependent on the values of those global variables. Either the routine will need to be remodeled to eliminate this dependence, or the new program will have to be amended to include the necessary global variables.
The purpose of modularization is to separate a complex program into a series of smaller, more manageable problems. Using global data blurs this separation and forces the programmer to keep track mentally of how each global variable is accessed and used throughout the entire program.
Notwithstanding these problems, there are some acceptable uses for global variables:
It is quite acceptable to have a global status variable provided that no other function uses that value in its processing. Instead, the global variable is merely there for the convenience of the user. The user can interrogate the value of the global variable to determine the current status of a process, for example.
Global data can also be used as a substitute for literal constants, and for named constants where the language being used does not support them.
When a variable appears in the parameter list of nearly every routine, it may be easier to make the variable global and remove it from the parameter lists.
Data passed into a routine that is only needed as a parameter for another function called by the original routine is called tramp data. It may be easier to make the variable global and remove it from the parameter lists of both functions.
In short, create each new variable as a local variable and only make it global if absolutely necessary. Try making it a module variable before making it global. To make it obvious that a variable is global, prefix the variable name with the letter g. As well, carefully document your use of global variables.
In most cases, however, global data can be avoided. In particular, the data members of an abstract data type (ADT) can be made private or protected, such as in C++. In such a case, the modular variables are known only by the member functions of that module. This technique is popularly known as "data hiding." Outside of the module, the variables are unknown and only accessible through routines defined within the module. These accessor and mutator routines control how data is accessed from the module and validated before it is sent into the module. In short, the module "fire-walls" our data from the users and enforces the appropriate use of modular data.
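Data hiding of this kind could be sketched in C++ as below. The class name GradeRecord, its members, and the valid range of 0 to 100 are assumptions made for the example.

```cpp
// Hypothetical module: the grade is private, so code outside the class
// can reach it only through the accessor and the mutator.
class GradeRecord {
public:
    // Accessor: read-only view of the hidden data.
    int grade() const { return grade_; }

    // Mutator: validates the new value before storing it.
    // Returns true if the value was accepted, false if it was rejected.
    bool setGrade(int newGrade)
    {
        if (newGrade < 0 || newGrade > 100) {
            return false;    // out of range: hidden data is left unchanged
        }
        grade_ = newGrade;
        return true;
    }

private:
    int grade_ = 0;   // known only to the member functions of GradeRecord
};
```

Because every change passes through setGrade, an invalid value can be stopped in one place instead of being checked wherever the grade is used.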
References used while building this page: McConnell, Steve. Code Complete, A Practical Handbook of Software Construction. Microsoft Press (1993).
These pages were edited for style and content by
Allan Caine, Laura Drever, Gorana Jankovic, Megan King, and Marie Lewis.
The original authors of these pages were Graeme Humphries, Chantal Laplante, Chris Mills,
and Melvin Lenz. Last Modified: Wednesday, June 7th 23:45:30
Copyright 2000 Department of Computer Science, University of Regina.