Assignment three questions and answers

Here are some questions and answers about assignment three. Suggestions for additions to this list are welcome (via e-mail).

These questions and answers are added in order; new questions appear at the end. This means that they are not organized into topics. I hope that the list isn't long enough to make this too big a problem, and I think it's more important for people to be able to read only the new stuff, by scrolling down to the end.

Some issues are the same as for assignments one and two, or minor variations on those issues. Questions and answers from previous assignments which still apply have been placed into a separate file, with some editing and updating for assignment three. Any further such questions will be added at the bottom here, then placed into the merged file only for assignment four.


There are also specific notes in separate files on:

Also see an extremely literal translation of that Java gas station simulation into C++.


Q: Where do I put the class declarations, in a .cpp or .h file?

A: In the .h file. In assignment two, we put the struct declarations in the .c file because otherwise any module would be able to access the private members. This is because C doesn't have classes. In C++, you can define the class completely in the .h file without inviting the violation of any data-hiding rules.
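
As a rough sketch of what this might look like (the class name "host" and its members are made up for illustration and are not part of the assignment; the two files are shown in one listing, with comments marking where each part would go):

```cpp
// host.h (hypothetical): the complete class declaration lives in the header.
class host {
    int bytes_left;          // private by default: other modules cannot touch it

public:
    host(int n);
    int remaining() const;
    void send(int n);
};

// host.cpp (hypothetical): would #include "host.h" and define the members.
host::host(int n) : bytes_left(n) {}

int host::remaining() const
{
    return bytes_left;
}

void host::send(int n)
{
    bytes_left -= n;
    if (bytes_left < 0)
        bytes_left = 0;      // never report a negative amount remaining
}
```

Any .cpp file which #includes host.h can see that "bytes_left" exists, but the compiler will reject any attempt to use it from outside the class.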


Q: So what should the typedef be?

A: With classes in C++, you don't need a typedef. After the 'class' declaration is seen (by #including the .h file), the class name is a type name.
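
For instance (a minimal sketch; "packet" and "total_size" are hypothetical names):

```cpp
class packet {
public:
    int size;
};

// No "typedef struct packet packet;" is needed here:
// "packet" by itself is already a type name.
int total_size(const packet &a, const packet &b)
{
    return a.size + b.size;
}
```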


Q: Do we capitalize the class name?

A: Normally we write a class name in all-lower-case, just like a struct tag. There is a tradition in Java of capitalizing class names, but this is not a tradition of C++.


Q: How do you compile C++ programs on CDF?

A: Just name them something ending with ".cpp" (a few other extensions are recognized too, but the grading software might only recognize ".cpp"). The 'gcc' command looks at the "extension" (the part of the file name after the last dot) of each file name to decide what to do with it. If it's ".c", it passes it through the C compiler. If it's ".cpp", it passes it through the C++ compiler. If it's ".o", it just passes it on to the linker with the rest of the ".o" files resulting from the above. And so on for various "extensions".

Using "g++" instead of "gcc" links your program with some additional C++ libraries. You probably need this. We will use g++ to compile your submitted files. (We will also include -lm in the compilation line.)


Q: Which random numbers come from which random number streams?

A: We don't need to use separate random number streams for this program. If you have an adequate random number generator, then for this assignment, there's really little advantage in using separate streams. Just call rand() for all random number generation. There is a separate web page about the various randomization tasks involved in this program.


Q: What should go in our Sim class? What should go in our main class? Can one class go inside another?

A: You don't need to nest classes for this assignment. (C++ does let you declare one class inside another, but a nested class is not like a Java inner class: its objects have no implicit connection to an object of the enclosing class.) You do not need a "sim" class (which would normally have a lower-case name, not capitalized), but you might find this a useful way to group variables into a module. (A class with only 'static' members is really only a module, not a class -- objects of that class type make no sense, but that's not what you're doing it for.) You might want to think in terms of a "statistics" class into which you put your various record-keeping functions, perhaps a utility function or two related to record-keeping, and functions which initialize and dump (printf) the pile of relevant variables. (The simulation time would not belong in here, of course; that could go in a different module or could be a global variable.)

The type qualifier "static" works here just as in Java, except that if you declare a class variable to be "static", you must then define the variable as well, in the .cpp file for that class, without the word static where it is defined. See the bottles-of-beer example from tutorial for an example of a class 'static' variable, including initialization (the "top" variable in class "bottle" in bottle3.cpp and bottle3.h).
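
A minimal sketch of the declare-then-define pattern for a class 'static' variable (the "stats" class and its members are hypothetical names, not taken from the tutorial example; both files are shown in one listing):

```cpp
// stats.h (hypothetical): the declaration, inside the class.
class stats {
public:
    static int collisions;          // declaration only; no storage yet
    static void record_collision();
};

// stats.cpp (hypothetical): the definition, without the word "static".
int stats::collisions = 0;

void stats::record_collision()
{
    collisions++;
}
```

If the definition in the .cpp file is left out, the program compiles but the linker complains that stats::collisions is undefined.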


Q: How do I seed the random number generator?

A: Call srand(). This void function takes one argument which is the seed value. You still call rand() to get one random number.

One source of variation is the time of day. If you #include <time.h>, you can call time(NULL) to get the current time in seconds since 00:00 1 Jan 1970 UTC, which makes an adequate seed.

Another source of variation is your process ID number. Each running program in unix has a separate process ID number. If you #include <unistd.h>, you can call getpid() (no arguments) to get your process ID number.

So altogether you might want to call srand(time(NULL) ^ getpid()).
Or simply srand(time(NULL)) (which is then not unix-specific).

You only do this once, often near the beginning of main() (but it can be anywhere, so long as it's only done once). If you do it again with the same seed value (for example, time(NULL) again within the same second), you reset the random number stream to where it was before and you'll get the same "random" numbers again.
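
Putting those pieces together, a sketch might look like this ("seed_once" is a hypothetical helper name; the getpid() part is unix-specific, as noted above):

```cpp
#include <stdlib.h>
#include <time.h>
#include <unistd.h>     /* for getpid(); unix-specific */

// Hypothetical helper: call this exactly once, e.g. near the top of main().
void seed_once()
{
    srand((unsigned)time(NULL) ^ (unsigned)getpid());
}
```

After that one call, every later call to rand() returns the next number in the stream, somewhere from 0 to RAND_MAX.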


Q: Do I have to use the suggestions in http://www.dgp.toronto.edu/~ajr/270/a3/modules.html?

A: No.


Q: What is that "#ifndef _FILE_H" and "#define _FILE_H" stuff I see in C++ .h files all the time?

A: It means that if you #include this file five times, you won't get an error. The first time it defines _FILE_H; and the entire file is enclosed in conditional compilation which says that you should disregard the entire file if _FILE_H is defined. That is, the first time, _FILE_H will not be defined, and the file will "count"; subsequent times, _FILE_H will be defined from the first time and the file will be ignored.

I personally recommend against this practice, both in this course and in "real life". Often, although not always, it is a symptom of a program out of control. I suggest that each .cpp file #include the .h files it needs, once, and that .h files not include each other.

(On the other hand, if you do do this re-inclusion protection thing, don't begin your name with a '_'; the ANSI C standard reserves names beginning with an underscore and a capital letter, or two underscores. There is no need to use a weird or secret name for such a #define.)
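
To see the mechanism at work, here is a sketch in which the text of a hypothetical "stats.h" appears twice in a row, which is what the compiler would see if the file were #included twice (the guard name STATS_H is an arbitrary choice, with no leading underscore):

```cpp
// First copy of the text of a hypothetical "stats.h".
#ifndef STATS_H
#define STATS_H
class stats {
public:
    int packets_sent;
};
#endif

// Second copy, as if the file had been #included again. STATS_H is now
// defined, so the preprocessor skips everything up to the #endif and the
// class is not redefined.
#ifndef STATS_H
#define STATS_H
class stats {
public:
    int packets_sent;
};
#endif
```

Without the #ifndef lines, the second copy would be a "redefinition of class stats" error.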


Q: I'm getting various strange "undefined symbol" errors when compiling my C++ program. I'm indeed compiling all of my .cpp files together, and anyway the undefined symbols it's complaining about are functions I am not calling (not explicitly, anyway).

A: You are probably compiling a C++ program with "gcc" and you need to be specifying "g++". If your filenames end with ".cpp", there is not much difference between using "gcc" and "g++", except that g++ links in some extra C++-related libraries. It doesn't seem to take a very complex C++ program to require a little item or two out of these C++-specific libraries.

The grading software will be using g++ to compile your program. (Specifically, "g++ -Wall -ansi -pedantic *.cpp -lm".)


Q: In a given run, all my "random" numbers are the same.

A: Perhaps you are calling srand() too much. Call it just once. (Not just from one place, but just once.)


Q: What's a "vtable" and why am I getting an error that it is undefined?

A: This is, unfortunately, a somewhat involved issue. Unless you are actually getting this error message you probably won't want to spend the time reading the following long description.

Suppose you have a class with a virtual function:

class base {
    int x;

public:
    base(int y);
    virtual void jumpUpAndDown();
};

Derived classes all override jumpUpAndDown(), and you never intended the base class to be able to jump up and down itself. So you don't even define base::jumpUpAndDown anywhere, because it's never called.

Well, you can't be that sure. When you have virtual functions, the C++ compiler makes a table of all the function pointers for each class. Like global variables, this "vtable" has to be defined in one .o file (the result of compiling a .cpp file) and referenced without re-defining it (like an 'extern' declaration) in applicable other .cpp files.

How should it arrange this? This same class type definition appears (via #include) in many .cpp files.

The compiler resolves this with a rule for choosing one member function of the class to sit next to the vtable. The rule can be anything really, so long as it is deterministic and always yields the same result. When the compiler compiles the .cpp file which defines that particular member function, it emits the vtable too, as a separate global object, before or after the code for that function in your machine-language program (in the .o file, to be specific). Since the rule is deterministic, all the separately-compiled .cpp files agree on which one of them should contain the definition of the vtable when compiled.

The problem occurs when you don't define that class member which this algorithm has arbitrarily selected to be the hand-holder of the vtable. In this case, the C++ compiler has apparently selected base::jumpUpAndDown. And you never defined it in any .cpp file, so you also never implicitly defined the vtable.

 
The preferred method of avoiding this is to make sure that you always specify correctly whether or not you are going to define a function. If you declare it in the class, you define it. There is, however, an exception: a virtual function in a class which is never going to be instantiated, where every derived class which is going to be instantiated overrides that function.

If this is the case, you want to designate it as a "pure virtual" function -- it's a placeholder, there is no actual function by that name in that class. You should write:

    virtual void jumpUpAndDown() = 0;

The "=0" says that base::jumpUpAndDown doesn't actually exist, and thus "base" is an abstract class and cannot be instantiated. Any class derived from base which doesn't override jumpUpAndDown is also an abstract class because it inherits the "pure virtual" function. But probably most or all of them will indeed override jumpUpAndDown, unless you have a quite complex class hierarchy, so they won't be abstract classes, just "base" will be.

This way, a non-existent function will not be chosen by the algorithm for deciding next to which user-defined function the automatically-defined vtable should be emitted, so the "undefined vtable" error will not occur.
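
A small sketch of the whole arrangement (the derived class "frog", its jump counter, and the virtual destructor are additions for illustration; only "base" and jumpUpAndDown come from the example above):

```cpp
class base {
    int x;

public:
    base(int y) : x(y) {}
    virtual ~base() {}
    virtual void jumpUpAndDown() = 0;   // pure virtual: no base:: version exists
    int value() const { return x; }
};

// Hypothetical derived class which overrides the pure virtual function,
// so it is not abstract and can be instantiated.
class frog : public base {
    int jumps;

public:
    frog(int y) : base(y), jumps(0) {}
    virtual void jumpUpAndDown() { jumps++; }
    int jumpCount() const { return jumps; }
};
```

With this, "frog f(3);" is fine and calls made through a "base *" reach frog::jumpUpAndDown, whereas "base b(3);" is a compile-time error because base is abstract.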


Q: If I have a data transmission of size greater than 1518, how can I figure out when the subsequent packets are going to be sent so as to break it up into multiple packets? Should I really be creating a new event for each packet, e.g. 10 "new"s for data of size 15180?

A: You should not create new objects for this. That is the hard way to do it. We recommend having the one object resubmit itself to the event queue and be invoked for each packet.

See modules.html for a description of the algorithm for sending a multi-packet message.
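
To illustrate just the resubmission idea (this is not the algorithm from modules.html; the "transfer" class, the fixed gap between packets, and the use of std::priority_queue are all assumptions made for this sketch, and collisions are ignored entirely):

```cpp
#include <queue>
#include <vector>

const int MAX_PACKET = 1518;    // maximum packet size, from the question above

// Hypothetical simulation object: one message, possibly many packets.
class transfer {
public:
    double when;        // simulation time of the next send attempt
    int bytes_left;     // bytes of the message not yet sent
    int packets_sent;

    transfer(double t, int n) : when(t), bytes_left(n), packets_sent(0) {}

    // Send one packet; returns true if the object must go back in the queue.
    bool send_one(double gap)
    {
        int n = bytes_left < MAX_PACKET ? bytes_left : MAX_PACKET;
        bytes_left -= n;
        packets_sent++;
        when += gap;            // assumed fixed gap between packets
        return bytes_left > 0;
    }
};

// Orders the queue so that the earliest event is on top.
struct later {
    bool operator()(const transfer *a, const transfer *b) const
    {
        return a->when > b->when;
    }
};

typedef std::priority_queue<transfer *, std::vector<transfer *>, later> event_queue;

// Run until the queue is empty; returns the number of events processed.
int run(event_queue &q, double gap)
{
    int events = 0;
    while (!q.empty()) {
        transfer *t = q.top();
        q.pop();
        events++;
        if (t->send_one(gap))
            q.push(t);          // the same object goes back in: no "new" per packet
    }
    return events;
}
```

A 15180-byte message comes to the top of the queue ten times, but only one object is ever created for it.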


We strongly recommend against having a separate simulation object for each of the two parties in a pair. E.g. a pair of two peers is one object, as is a client-server pair.

The class for a host type should keep track of which side is transmitting (if applicable), etc; it should have a member function which says to transmit one packet.

Together with the above Q&A entry, this gives you the ability to determine statistical data which requires connecting a failed transmission to its subsequent successful transmission, etc.


Q: Is a "conflict" the same as a "collision"?

A: I did not intend those terms to be synonymous. I'm not aware of any standard term for the former.

If a host goes to transmit on the network and finds another host already transmitting, it is not a collision; it's only a collision in the case where both hosts transmit simultaneously (because there was less than two µs between when they both checked and thus they both saw that the network was clear).

In the case of a collision, the current transmission is ruined, and both hosts must retry the transmission after their delays.

In the case of this lesser form of conflict, the host which sees that the network is in use will have to retry (and should use the same delay formula as in the case of a collision), but the host which was already transmitting is not interfered with. So the cases are distinct.
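
The three possible outcomes of an attempt could be told apart along these lines (a sketch only; the enum, the function name, and the way the inputs are represented are all assumptions, and treating a gap of exactly 2 µs as "no collision" is a guess based on the wording "less than two µs"):

```cpp
enum outcome { SUCCESS, CONFLICT, COLLISION };

const double WINDOW_US = 2.0;   // the two-microsecond window described above

// Hypothetical classifier for one attempt to transmit.
//   network_busy:  another host was already transmitting when this host checked
//   other_attempt: another host also checked and saw the network clear
//   gap_us:        time in microseconds between the two checks (if other_attempt)
outcome classify(bool network_busy, bool other_attempt, double gap_us)
{
    if (network_busy)
        return CONFLICT;        // this host retries; the other is undisturbed
    if (other_attempt && gap_us < WINDOW_US)
        return COLLISION;       // both transmissions ruined; both hosts retry
    return SUCCESS;
}
```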


Q: How should I keep track of retry information in the case of collisions or conflicts?

A: This information should be kept in the simulation object itself. For example, a "client-server" pair will keep a value saying how many more bytes are to be transmitted in the current server request, rather than trying to transmit multiple packets in a loop, and will also keep information saying whether it's the client or the server's turn to transmit. More in modules.html.


Q: What goes in the event queue?

A: Hosts (in the case of spewers) and host-pairs (in the case of peers and client-server pairs). When an object comes to the top of the event queue, if its transmission is a collision or conflict then it gets resubmitted into the queue to retry the same packet; if its transmission is successful then it gets resubmitted into the queue to send its next packet.


Q: When you say that my report file should be called "report", don't you mean "report.txt"?

A: No. Just like in assignment one. But pay more attention to this in the current situation: since you may name your .cpp and .h files what you wish, the "submit" system imposes no filename restrictions. Please be sure the file is named correctly. Not "report.txt", not "Report", not "myreport", etc.


Q: Must I come up with a better strategy for strategy #5? I can't seem to.

A: You must try; you don't have to succeed so long as you can still write about it in your report. Look at the results; think of some variation; try it; report about it (including an explanation of why you thought it might do better and also whether or not it did, in fact, do better).

Note the maximum report size -- this part of the assignment is not intended to be a big exercise.


Q: Why do you keep talking about "trying" to send and "attempting" to send?

A: The delay formulas only specify the time of the beginning of the next attempt to send a packet. When that time arrives, the network might not be silent, in which case we apply the delay formula to send it later; or, the network might be silent at that time, but not 2 µs later, hence we have a collision and need to delay and re-send. We identify the time at which it starts trying; furthermore, actual transmission starts at the earliest at 2 µs after the start-trying time.


Q: Does my program have to check for the right number of command-line arguments (argv), values greater than zero, etc?

A: In general in C, you should verify that argc is as expected before looking at the argv array, but we will not test or grade this aspect of your assignment. We will also run it only with valid argv values. However, do note that 0 is a valid value for any of the first three parameters, or all of them. (For example, if run as ./a.out 0 0 0 1, the simulation will be boring, but it shouldn't behave incorrectly.)
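
If you do want to check, a sketch might look like this ("parse_args" is a hypothetical helper; the count of four parameters is inferred from the "./a.out 0 0 0 1" example, and treating them all as integers is an assumption):

```cpp
#include <stdio.h>
#include <stdlib.h>

const int NPARAMS = 4;   // assumed from the "./a.out 0 0 0 1" example

// Hypothetical helper: fills values[0..NPARAMS-1] from argv.
// Returns 0 on success, or -1 if the argument count is wrong.
int parse_args(int argc, char **argv, int *values)
{
    int i;

    if (argc != NPARAMS + 1) {
        fprintf(stderr, "usage: %s a b c d\n", argc > 0 ? argv[0] : "a.out");
        return -1;
    }
    for (i = 0; i < NPARAMS; i++)
        values[i] = atoi(argv[i + 1]);   // 0 is a valid value, so it is not rejected
    return 0;
}
```

The important part is that argc is examined before any element of argv beyond argv[0] is touched.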

