CS330 Compiling and Debugging


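Highlights of this lab:

  1. Focus on Unix Commands
  2. Focus on Permissions
  3. Review of Compiling Multiple File Projects
  4. Introduction to "make" and Makefile(s)
  5. Using GNU debugger - gdb
  6. Tracing System Calls
  7. How to Access Linux Machines from Home
  8. How to Check for Memory Leaks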


Focus on Unix Commands

Although you can do a lot with a graphical user interface (GUI), it is often necessary to execute commands on the command line. The basic format is:

prompt$ command-name options arguments

For example:

hercules[1]% g++ -c main.cpp 
a037094[7]% ls -l

A summary of a few commands is provided in tabular format here:

Command: ssh
  Example:  ssh smithj@hercules.cs.uregina.ca
  Comments: Secure Shell (ssh) enables you to log in, execute commands, and run applications on a remote system. SSH encrypts all communication between the remote user and the system.

Command: scp
  Comments: With Secure Copy (scp) you can copy files between your local host and a remote host. scp uses ssh to transfer data and employs the same authentication and encryption.
  Example:  scp input.txt smithj@hercules.cs.uregina.ca:CS330lab1/
  Comments: Copies the file input.txt from the user's current directory to the CS330lab1 directory of the user smithj on the hercules host.
  Example:  scp -r CS330 smithj@hercules.cs.uregina.ca:classes/
  Comments: Uses the -r option (which allows whole directories to be copied) to copy the entire CS330 directory to the classes directory of the user smithj.

Command: touch
  Example:  touch myfile
  Comments: Updates the access and modification time of "myfile". If "myfile" does not exist, creates it.

Command: mkdir
  Example:  mkdir reports
  Comments: Creates a directory named "reports".

Command: rmdir
  Example:  rmdir letters
  Comments: Erases a directory named "letters". The directory must be empty.

Command: rm
  Example:  rm myfile
  Comments: Erases a file named "myfile".
  Example:  rm -r mydir
  Comments: Erases the directory "mydir" and any contents/subdirectories in it.

Command: ls
  Example:  ls -F
  Comments: Lists the working directory with trailing characters for file types. The most common are / for a directory and * for an executable.
  Example:  ls -R
  Comments: Lists the working directory as well as all subdirectories.
  Example:  ls -a
  Comments: Lists all files, including "hidden files".
  Example:  ls -l
  Comments: Lists files with permissions, owner, group, and time stamp.
  Example:  ls -i
  Comments: Lists each file's inode number, a unique number used by the system to identify a specific file.
  Example:  ls -t
  Comments: Lists files sorted by time last modified.

Command: cd
  Example:  cd reports
  Comments: Changes to the "reports" directory, making it the working directory.
  Example:  cd
  Comments: Changes back to the home directory.
  Example:  cd ..
  Comments: Moves you up one directory level.

Command: pwd
  Example:  pwd
  Comments: Print Working Directory - prints the full path to the current directory.

Command: cp
  Example:  cp lab1 mylab
  Comments: Copies the file "lab1" to a file named "mylab".
  Example:  cp lab1 mydirectory
  Comments: Copies "lab1" in your working directory to "mydirectory".
  Example:  cp lab1 mydirectory/mylab
  Comments: Copies "lab1" to "mydirectory" and names the copy "mylab".
  Example:  cp -r mydirectory dirname
  Comments: Copies "mydirectory" and all its contents into "dirname".

Command: mv
  Example:  mv lab1 lab2
  Comments: Renames "lab1" to "lab2".
  Example:  mv lab1 labdirectory
  Comments: Moves "lab1" to the "labdirectory".
  Example:  mv lab1 labdirectory/newfile
  Comments: Moves "lab1" to the "labdirectory" and renames it "newfile".
  Example:  mv labdirectory newdirectory
  Comments: Renames a whole directory to a new directory name.

Command: g++ (or CC on Hercules)
  Example:  g++ -o prog_run main.cpp
  Comments: Compiles and links "main.cpp" and names the executable "prog_run".
  Example:  g++ -c main.cpp
  Comments: Compiles "main.cpp" only (creates the object file main.o).
  Example:  g++ -o prog_run main.o part1.o part2.o
  Comments: Links the object files (when you have several source files) and names the executable "prog_run".

For another summary of Unix commands, click here.

For a Unix tutorial, click here.
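
The file commands summarized above can be tried safely in a scratch directory. The following is a minimal sketch: the directory comes from mktemp, and the names reports, notes.txt, copy.txt, and renamed.txt are made up for illustration.

```shell
#!/bin/sh
# Work in a throwaway directory so that nothing important is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

mkdir reports                   # create a directory
touch notes.txt                 # create an empty file
cp notes.txt reports            # copy the file into the directory
cp notes.txt reports/copy.txt   # copy it again under a new name
mv notes.txt renamed.txt        # rename the original
listing=$(ls -F)                # directories get a trailing /
echo "$listing"

rm renamed.txt                  # erase one file
rm -r reports                   # erase the directory and its contents
left=$(ls -A | wc -l)           # count what remains in the scratch directory
echo "entries left: $left"

cd / && rmdir "$scratch"        # rmdir works because the directory is now empty
```

Note that rmdir is only used at the very end, once the directory is empty; rm -r is what removes a directory that still has contents.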


Focus on Permissions

Each file and directory in Unix contains a set of permissions that determine who can access it and how. There are three levels of access to set:

  1. You can restrict access to yourself alone (user)
  2. You can allow users in a predesignated group to have access (group)
  3. You can permit anyone on your system to have access (world)

How do you view permissions?

The ls command with the -l option allows you to view a file's permissions (among other information).

 $ ls -l mydata
 -rw-r--r-- 1 chris weather 207 Feb 20 11:55 mydata
 
The breakdown of this information is as follows:
File Type Permissions Number of Links Owner Name Group Name Size of File in Bytes Date and Time Last Modified File Name
- rw-r--r-- 1 chris weather 207 Feb 20 11:55 mydata

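Right now, the owner of mydata has read and write permissions, and the group and the world have read permission only. How do I know? After the first character, the permissions are organized in three groups of three characters:

  1. Characters 2-4 (rw-) are the permissions for the user (owner)
  2. Characters 5-7 (r--) are the permissions for the group
  3. Characters 8-10 (r--) are the permissions for the world

Within each group, r means read, w means write, x means execute, and - means that the permission is not granted.

In addition, the first character gives the file type: - for a regular file and d for a directory.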

What would the following permissions represent?

  1. -rwxr--r--
  2. drwxr-xr-x
  3. -rwxrw-r--

How does Unix determine who has permissions to access files?

Again, it comes down to the /etc/passwd file. In this file, you have a unique numeric user id and a principal group id (also numeric). When you create a file, your user id and principal group id are assigned to that file. When someone later tries to access the file, the system compares that person's ids with the ids stored on the file: if the user ids match, the user permissions apply; otherwise, if the file's group id matches one of the person's groups, the group permissions apply; otherwise, the world permissions apply.

You have a principal group id, but you may also belong to other groups. To see which groups you belong to, try the following command:

$ groups

This command gets its information from the /etc/group file, together with your principal group id.
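
The numeric ids themselves can be inspected with the id command. The sketch below assumes a standard id and ls; the temporary file created by mktemp is only there to show which owner id a new file receives.

```shell
#!/bin/sh
uid=$(id -u)       # your numeric user id
gid=$(id -g)       # your numeric principal group id
all=$(id -G)       # numeric ids of every group you belong to
echo "user id: $uid  principal group id: $gid  all group ids: $all"

# A new file is stamped with the creator's user id; "ls -n" prints the
# owner and group as numbers instead of names.
f=$(mktemp)
owner=$(ls -n "$f" | awk '{print $3}')
echo "owner id of new file: $owner"
rm -f "$f"
```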

How do I set permissions?

To set permissions, you use chmod. There are two main usages of chmod:

Symbolic Permission Mode:

The general format for using the symbolic permission mode is the following:
chmod 'access class' operator 'access type' filename
For example, this would add executable access for the user:
$ chmod u+x testfile
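The following summarizes the values of "access class", "operator", and "access type" in the above syntax:
  1. Access Class: u (user), g (group), o (other, that is, world), a (all three)
  2. Operator: + (add the permission), - (remove the permission), = (set exactly these permissions)
  3. Access Type (Permissions): r (read), w (write), x (execute)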

Given a base permission of -rw------- for a file called "myfile", what would the resulting permission be after the following chmod calls?

  1. chmod u+x myfile
  2. chmod a+x myfile
  3. chmod g+r myfile
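
To watch symbolic mode at work without giving away the exercise, the sketch below applies a different series of changes to a scratch file. The file comes from mktemp and the helper function show is made up for this example; it prints only the mode string from ls -l.

```shell
#!/bin/sh
f=$(mktemp)                              # hypothetical scratch file
show() { ls -l "$f" | cut -c1-10; }      # print only the mode string

chmod u=rw,g=r,o= "$f"; start=$(show)    # = sets each class exactly
chmod g+w "$f";         step1=$(show)    # + adds write for the group
chmod o+r "$f";         step2=$(show)    # + adds read for the world
chmod a-w "$f";         step3=$(show)    # - removes write from all classes
printf '%s\n' "$start" "$step1" "$step2" "$step3"
rm -f "$f"
```

The four lines printed should be -rw-r-----, -rw-rw----, -rw-rw-r--, and -r--r--r--, since each call changes only the class and permission that it names.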

Absolute Permission Mode

Another way to change permissions is by using a numeric (octal) code. Typically, you will use three octal numbers: one for the user, one for the group and one for other (world).

The syntax for using chmod in absolute permission mode is:

chmod 'octal permissions' filename
For example:
$ chmod 744 myfile
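Each of the three octal digits represents the read, write, and execute permissions for the user, group, and world respectively. A digit is the sum of 4 (read), 2 (write), and 1 (execute), so 744 gives rwx to the user and r-- to both the group and the world.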

The following table summarizes the octal digits and how the permissions are affected:
Octal Binary Permissions
0 000 ---
1 001 --x
2 010 -w-
3 011 -wx
4 100 r--
5 101 r-x
6 110 rw-
7 111 rwx

What would the permissions look like on "myfile" after the following chmod calls?

  1. chmod 755 myfile
  2. chmod 644 myfile
  3. chmod 711 myfile
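
The octal table can be checked against a scratch file. The sketch below uses three codes that are not part of the exercise (640, 750, and 600); the file comes from mktemp and is removed at the end.

```shell
#!/bin/sh
f=$(mktemp)                      # hypothetical scratch file
result=$(for mode in 640 750 600; do
    chmod "$mode" "$f"
    echo "$mode -> $(ls -l "$f" | cut -c1-10)"
done)
echo "$result"
rm -f "$f"
```

Reading each digit from the table gives the expected lines: 640 -> -rw-r-----, 750 -> -rwxr-x---, and 600 -> -rw-------.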

For more on chmod click here


Review of Compiling Multiple File Projects

Before we get into a discussion of make, it would be good to review how to compile projects on Linux. Given three files: main.cpp, greet.cpp, and greet.h, we must undergo two steps to create the executable file "demo".

  1. First create the object files using the -c option (for each .cpp file)
  2. Then link the object files into an executable file
Our commands will look something like this on Linux:
A044872[102]% g++ -c greet.cpp
A044872[103]% g++ -c main.cpp
A044872[104]% g++ -o demo greet.o main.o
A044872[105]% demo
Hello, World

Note that after issuing the g++ -c greet.cpp command, the object file greet.o will be produced.


Introduction to "make" and Makefile(s)

Note: some content on make is copied from the 170 lab

What is make?

The idea behind make is that it simplifies the compilation of projects with multiple files. Consider the above example (with two .cpp files). Compared to a single file project, you must issue two additional commands. Now, think about what happens when your project consists of several more files. You will waste a lot of time typing.

This is where make comes in handy. It automates compilation: it checks which files have been modified and, based on the dependencies (or rules), determines which object files need to be recompiled. Because only files that have changed since the last build are recompiled, it also saves build time.

make is a UNIX command that looks for a file called Makefile or makefile. (Makefile is typically preferred because the capital letter makes it appear near the beginning of the directory listing.) Within the Makefile, there are variables and rules that describe dependencies. A simple Makefile for the project that we discussed above might look like this:

The comment lines (those beginning with #) give the explanations.

# leads comments in a line 
# Build all: default target
all : demo

# Separate compilation to build object files
main.o : main.cpp greet.h 
	g++ -c -ggdb main.cpp

greet.o : greet.cpp greet.h 
	g++ -c -ggdb greet.cpp

# Linking
#demo is a target which depends upon main.o and greet.o 
#"g++ main.o greet.o -o demo" is the command to produce the executable file
#You need to use a TAB before g++ 
demo : main.o greet.o
	g++ main.o greet.o -o demo

# Testing
check : all
	./demo

# Clean up all build targets so that one may get a clean build
clean :
	rm -f *.o demo

Information on how to use the make command:

	main.cpp is the main program
	greet.cpp and greet.h contain the helper function
	Makefile contains the build script
	
		"make all" or simply "make" to build everything
		"make clean" to erase all the files built by make
		"make clean all" to get a clean build
		"make check" to run the "demo"

Adding Variables to Makefile(s)

Makefile(s) may also contain variables. For instance, if you will be adding additional object files or changing the compiler, it will be easier to use variables and make modifications only in one place in the file. Variables are set using the equal sign as in

CXX=g++
Note that by long-standing convention, the variable CC names the C compiler to be used. The variable CXX was added to this convention to name the C++ compiler.

To use a variable, prefix it with a $ and surround its name with parentheses, as in $(CXX). A Makefile with variables might look something like the following:

# this is a comment
# specify the object files ...
OBJ= global.o access.o mem.o rungoal.o sup.o unify.o wexhdr.o


# specify the compiler
CC   = cc   # this is the cross platform standard C compiler
CXX  = g++  # this is the GNU C++ compiler
#CXX = CC   # Solaris C++ compiler on hercules


# specify the compiler options
CFLAGS = -g

# specify compiler preprocessor options
CPPFLAGS = -I/usr/local/include

# specify linker options
LDFLAGS  = -L/usr/local/lib

# specify the name of the ultimate executable file
EXEC = runwex


# create the executable
$(EXEC): $(OBJ)
	$(CXX) $(LDFLAGS) -o $(EXEC) $(OBJ)
	@echo 'runwex has been created'



access.o: access.cpp global.cpp globdefs.h allwexhdr.h
	$(CXX) $(CPPFLAGS) $(CFLAGS) -c access.cpp
global.o: global.cpp global.h globdefs.h
	$(CXX) $(CPPFLAGS) $(CFLAGS) -c global.cpp
mem.o: mem.cpp
	$(CXX) $(CPPFLAGS) $(CFLAGS) -c mem.cpp
rungoal.o:rungoal.cpp rungoal.h global.h sup.h
	$(CXX) $(CPPFLAGS) $(CFLAGS) -c rungoal.cpp
sup.o: sup.cpp sup.h
	$(CXX) $(CPPFLAGS) $(CFLAGS) -c sup.cpp
unify.o: unify.cpp globdefs.h
	$(CXX) $(CPPFLAGS) $(CFLAGS) -c unify.cpp
wexhdr.o: wexhdr.cpp
	$(CXX) $(CPPFLAGS) $(CFLAGS) -c wexhdr.cpp


clean: 
	 -/bin/rm -f $(EXEC) $(OBJ)

Some comments on this Makefile:

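You may have noticed the "@echo" command in the Makefile. You can use echo to direct, for example, informational or status messages to the console. The leading @ stops make from printing the command line itself before running it, so only the message appears.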
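
One payoff of variables is that a value can be swapped without editing every rule. The sketch below is an illustration only: it writes a two-variable Makefile for a placeholder main.cpp and uses make's -n option, which prints the commands instead of running them, so no compiler is needed. The name clang++ is just an example of an alternative compiler.

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1
touch main.cpp                   # an empty placeholder is enough for a dry run

# Single quotes keep $(CXX) literal; printf turns \t into the required TAB.
printf 'CXX = g++\nCFLAGS = -g\n\n'                            >  Makefile
printf 'main.o : main.cpp\n\t$(CXX) $(CFLAGS) -c main.cpp\n'   >> Makefile

default=$(make -n)               # -n prints the commands without running them
override=$(make -n CXX=clang++)  # a value given on the command line wins
echo "$default"
echo "$override"
```

The first line printed uses g++ from the Makefile, while the second uses clang++, because a variable set on the make command line takes precedence over the assignment in the file.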

Some comments about make:

Hercules has two different make programs. It has the older Solaris provided make and the newer GNU version. The GNU make is accessed on Hercules using the gmake command. Some of the comments on this page are specific to the GNU make. The Linux systems only use the GNU version, which can be accessed using either the make command or the gmake command.

Related Programs

*Note: prof and lint are not available on the Linux machines but are available on Hercules. Both prof and lint work only for C programs; neither understands C++ source code.

Command Line Options & Usage for make

Like most UNIX commands, make has a number of command line options. A few of them are as follows:

You can try them out if you like!


Using GNU debugger - gdb

Note: most of the following content on gdb is copied from Guili Liu's 170 lab

What is gdb?

Let's say that your code compiled without any errors (using make, of course), but now you run the program and it either:

  1. core dumps
  2. doesn't produce the results that you expected

What can you do? Well, one option is to put print statements to watch where the code has gone. A more sophisticated method, however, is to use a debugger. This is where gdb enters the picture.

gdb is the GNU debugger, the most popular debugger for UNIX systems. You can use it to control the running of an error-prone program and examine variables when and where problems arise. It has tons of features, but you only need a few to get started.

Basic features of a debugger

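gdb helps you track down run-time errors, as opposed to compile-time errors. Run-time errors are usually logic errors; compile-time errors are usually syntax errors, which the compiler reports before the program ever runs.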

When you execute a program that does not behave as you expected, you need some way to step through the logic other than just looking at your code. Things that you might want to know may be the following:

Using GDB Debugger

Before you can use gdb on a program, you must prepare the program for debugging by compiling it with the -g or -ggdb option. The general syntax is
	g++ -g filename.cpp -o executablefile
With the -g or -ggdb option, g++ generates additional information about the program and stores it in a symbol table. The debugger needs this symbol table to do its work. You start gdb by typing the following command:
	gdb executablefile
where executablefile is the executable version of the program. If you did not use the -o option when you compiled the program, executablefile will be a.out.

gdb will give you a prompt that looks like this: (gdb). From this prompt you can use gdb commands such as run or list. To exit gdb, type quit at the (gdb) prompt and press the Enter key.

GDB Commands Summary

The following is a list of the most useful commands inside the gdb.

help
gdb provides online documentation. Typing help gives you a list of topics.

file
file executable specifies which program you want to debug.

run
run starts the program running under gdb. The program is the one that you have previously selected with the file command, or on the UNIX command line when you started gdb. You can give command line arguments to your program on the gdb command line. You can do this the same way you would on the UNIX command line, except that you are saying run instead of the program name. For example,

run 2048 24 4 

You can even do input/output redirection: run > outfile.txt.

list
list linenumber prints out some lines from the source code around linenumber. If you give it the argument function it will print out lines from the beginning of that function.

Just list without any arguments will print out the lines just after the lines that you printed out with the previous list command.

break
break sets a breakpoint in your program.

A breakpoint is a spot in your program where you would like to temporarily stop execution in order to check the values of variables, or to try to find out where the program is crashing, etc.

break function sets the breakpoint at the beginning of function. If your code is in multiple files, you might need to specify filename:function.

break linenumber or break filename:linenumber sets the breakpoint at the given line number in the source file. Execution will stop before that line has been executed.

delete
delete deletes all breakpoints that you have set.
delete number deletes the breakpoint numbered number. You can find out the number of each breakpoint using the command info breakpoints. (The command info can also be used to find out a lot of other things. Try help info for more information.)

clear
clear function deletes the breakpoint set at that function. Similarly for linenumber, filename:function, and filename:linenumber.

step
step executes the current source line, and then stops execution again before the next source line.

next
next continues until the next source line in the current function (actually, the current innermost stack frame, to be precise). This is similar to step, except that if the line about to be executed is a function call, then that function call will be completely executed before execution stops again, whereas with step execution will stop at the first line of the function that is called.

until
until is like next, except that if you are at the end of a loop, until will continue execution until the loop is exited, whereas next will just take you back up to the beginning of the loop. This is convenient if you want to see what happens after the loop, but don't want to step through every iteration.

print
print expression prints out the value of the expression, which could be just a variable name. For example, to print out the first 25 values in an array called list, you would call
print list[0]@25

quit
quit is used to exit the gdb debugger.

Example of using gdb

Here is a program called sample.cpp.
#include <iostream>
using namespace std;

int DivideInt(int, int);

int main()
{
	int x = 5, y= 2;

	cout << " x / y = " << DivideInt(x, y) << endl;

	x = 3;
	y = 0; 

	cout << " x / y = " << DivideInt(x, y) << endl;

	return 0;
}


int DivideInt(int a, int b)
{
	return a / b;
}
To use gdb on this program, you must use the -g or -ggdb option when you compile the program.
	g++ -g sample.cpp -o sample
This is what a compile and run would look like:
A044876[7]% g++ -g sample.cpp -o sample
A044876[8]% sample
 x / y = 2
Floating exception (core dumped)

A core dump occurs when a program crashes. Core files are usually large (and take up valuable disk quota), so it is a good idea to remove them.

Now let's use gdb to find the bug.

A044876[9]% gdb sample
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) list
1       #include <iostream>
2       using namespace std;
3
4       int DivideInt(int, int);
5
6       int main()
7       {
8               int x = 5, y= 2;
9
10              cout << " x / y = " << DivideInt(x, y) << endl;
(gdb) list
11
12              x = 3;
13              y = 0;
14
15              cout << " x / y = " << DivideInt(x, y) << endl;
16
17              return 0;
18      }
19
20
(gdb) list
21      int DivideInt(int a, int b)
22      {
23              return a / b;
24      }
25
(gdb) run
Starting program: /home/hercules/t/temp1/gdb/sample
 x / y = 2
 
Program received signal SIGFPE, Arithmetic exception.
0x08048702 in DivideInt(int, int) (a=3, b=0) at sample.cpp:23
23              return a / b;
(gdb) print a
$1 = 3
(gdb) print b
$2 = 0
(gdb) quit
The program is running.  Exit anyway? (y or n) y
A044876[10]%

The above is only an example of using gdb. We encourage you to try it when you debug your class assignments. The more you practice, the more you will learn and the more comfortable you will feel with gdb.


Tracing System Calls

Another useful debugging and diagnostic tool is one that traces system calls. Every Unix variant provides such a tool, sometimes several, but the name of the tool differs from variant to variant.

Linux provides several such tools; the easiest to use is strace. The similar tool on Hercules is truss. Each of these tools sends diagnostic information to stderr every time your program makes a system call. Typically the output shows the actual values of the parameters passed to the call and the return value from the call. If the call encountered an error, the errno error code is displayed in symbolic form, e.g. ENOMEM or EPERM.

One of the big advantages of this type of debugging tool is that you don't have to compile (or recompile) your program with diagnostic options set. In addition to being useful diagnostic tools, these programs will also teach you a lot about the various system calls. Remember that even if you don't make system calls explicitly, your program will be making a lot of them through the various C++ objects.

Since these programs send the diagnostic output to stderr, which will be interspersed with your program's output, you may want to use a command line option to direct that output to a file:

strace -o outfile.out executable

Typically these programs are used with the final command line items being the name of your program and any options that it requires. That will invoke your program under the control of the tracing tool. However, it is also possible to attach the trace to a running process (with strace, use the -p option followed by the process id). This means that if something appears to be going wrong after your program has been running for a while, you can attach a trace to it to get a better idea of what is happening internally.

Do check the man page!


How to Access Linux Machines from Home

To get a list of all the Linux machines that you can log onto, use the command:

cs_clients CL115

Then, log onto one using:

ssh a0#####
where the #'s are replaced by the actual digits of a machine name from the list

Reasons to use the Linux machines at home:


How to Check for Memory Leaks

You can use a program installed on Linux called valgrind to check for memory leaks. It can tell you if you are:

  1. Trying to access memory that you shouldn't be
  2. Using values before they are initialized
  3. Not correctly freeing memory
  4. Writing code with memory leaks

Rather than re-inventing the wheel, you can get more information about valgrind from these places:

A sample use of valgrind (with options turned on to provide details about memory leaks) for your executable is:

valgrind --leak-check=yes executable

