CSC 209 exam and solutions

(see the exam as given, without solutions, at exam.pdf)

University of Toronto
Faculty of Arts and Science
August 2022 Examinations
CSC 209H1Y
Duration: 3 hours
Aids allowed: Any books and papers.
No electronic aids allowed: No calculators, cell phones, computers, ...
To pass the course you must achieve at least 35% on this exam.

Do not open this booklet until you are instructed to.
When instructed to begin, please check that you have all 12 pages (including this one).

Please fill in your name and student number above, now.

Turn off all mobile (cell) phones, "smart watches", and other electronic devices, and place them in your bag under your desk. If you possess an electronic device during the exam, you may be charged with an academic offence.

When you are finished with the exam, raise your hand for someone to collect it. Do not access unauthorized aids before your exam is collected.

Answer all questions. Answer questions in the space provided, or write "see page ..." and identify the answer on that other page clearly. You can omit comments unless you need to explain something unclear about the functioning of your code, and in the C programming questions you can omit the #includes.

1. [12 marks]
Write shell commands to output the following. Put some thought into making your answer simple; answers will be graded on quality, not just on whether they "work".

Please be careful to write single quotes and backquotes correctly to avoid misinterpretation.

a) the product of 123 times 234

b) the larger of the files "bloof" in the current directory and "bloof" in the root directory. Output the file path name, e.g. simply "bloof" if the file in the current directory is bigger. (In the event of a tie, outputting either one file name is fine.)

c) whether the value of the variable f, plus 9, is equal to the value of the variable g (output either "yes" or "no")

d) the number of '<' characters in the standard input. (Thus, with certain simplifying assumptions, if the input is HTML, then this counts the number of HTML tags.)

Sample solutions:

	expr 123 '*' 234

Note: The grading will be largely about the quoting of the asterisk, although there will also be marks deductions for making this unnecessarily complicated in any of a variety of possible ways.

	if test `wc -c <bloof` -ge `wc -c </bloof`
	then
	    echo bloof
	else
	    echo /bloof
	fi

Note: How to specify a path name for a file in the root directory is a significant portion of the grading of this question. Some people seem to be confusing the unrelated concepts "root directory" and "home directory".

	if test `expr $f + 9` -eq $g
	then
	    echo yes
	else
	    echo no
	fi

	tr -dc '<' | wc -c

Note: There are a great many possible ways to do this one, and yours didn't have to be as terse as mine here, although it should be about as clear.

2. [7 marks]
There are files named a1, a2, a3, and so on through a20.

Write a loop in sh to rename a1 to a2, a2 to a3, a3 to a4, and so on, through renaming a20 to a21.

Sample solution:

	for i in `seq 20 -1 1`
	do
	    echo mv a$i a`expr $i + 1`
	done

You didn't have to remember exactly how to make seq count backwards (e.g. if you wrote "seq 20 1 -1" meaning the above, that's fine for the exam, although that doesn't actually work). But you did have to notice (to be able to get full marks) that doing "mv a1 a2" followed by "mv a2 a3" isn't going to work out well.

Remembering the name of the command mv was not important in grading. E.g. if you wrote "rename", whatever, no problem.

3. [5 marks]
Consider the following terminal session, where '$' is the shell prompt:

	$ cat s
	echo Blah, blah, blah[footnote].
	echo Blah blah, and blah blah.
	echo
	echo [footnote]Additionally, blah blah.
	$ sh s
	Blah, blah, blah[footnote].
	Blah blah, and blah blah.

	[footnote]Additionally, blah blah.
	$

[code to download]

In this case, the shell script worked as expected; but in fact, it has a serious bug, relating to special characters.
The output will not always be as above.

a) What is the bug?

b) What would trigger the bug? (That is, suppose you wanted to show a sceptical person that the bug really exists — what would you do to demonstrate its existence?)

Sample solution:

The bug is that the use of the glob notation's '[' makes this code inappropriately sensitive to the contents of the current directory.

The bug could be triggered by creating a file named fAdditionally,
or a file named blaho.
(the trailing comma or period, respectively, is essential)
(Try it out! If you are sceptical that this bug exists, here I am demonstrating its existence.)

4. [9 marks]
Recall the difference between "tr x y" and "sed s/x/y/" — this tr command replaces all 'x' characters but this sed command replaces only the first 'x' on each line.

Write a complete C program (except that you can omit the #includes) which takes no command-line arguments (you don't have to declare the argc and argv arguments to main()) and does the same transformation as "sed s/x/y/" — that is, stdin is copied to stdout, with the first 'x' on each line (if any) changed to 'y', but otherwise unmodified.

Sample solution:

	#include <stdio.h>

	int main()
	{
	    int havent_seen_yet = 1;
	    int c;
	    while ((c = getchar()) != EOF) {
		if (c == '\n')
		    havent_seen_yet = 1;
		if (c == 'x' && havent_seen_yet) {
		    putchar('y');
		    havent_seen_yet = 0;
		} else {
		    putchar(c);
		}
	    }
	    return(0);
	}

Your solution did not have to be this compact, but the equally-obvious solution using fgets() is going to have a bug for long lines, in that when the line is split, an 'x' subsequent to the split is going to be considered to be the first on the line even if there was one prior to the split.
How substantially such a bug will be penalized in grading will depend on how long the line has to be to encounter the bug. If your program malfunctions if a line is longer than just a few dozen characters, this bug will manifest all the time and that's pretty bad. Whereas if the program malfunctions only if a line is longer than several hundred characters, in some theoretical sense that's equally bad, but practically, it's not going to malfunction all that often, so in one sense it isn't as bad (although it's still bad, and unnecessarily so).

5. [10 marks]
Write a function in C (not a complete program) which takes an array of integers (passed as a pointer to the zeroth element plus a separate size parameter) which are file descriptors, and an array of char (again a base pointer plus a separate size parameter) which is an area to read() into.

So the arguments are much like those to read(), except we have an array of file descriptors instead of a single file descriptor.

Your function calls select() to find one of the file descriptors which is first ready for reading, and reads from that file descriptor. Other than the select() call, your function behaves similarly to read(): it reads into the char array, and returns the number of bytes read. Also like read(), in the event of error, your function does not produce an error message, but simply returns −1 (which is what read() will return to you).

Reminder: extern int read(int fd, char *buf, int bufsize);

	int read_any(int *fdlist, int fdlistsize, char *buf, int bufsize)
	{

Sample solution:

	#include <unistd.h>
	#include <sys/select.h>

	int read_any(int *fdlist, int fdlistsize, char *buf, int bufsize)
	{
	    int i, maxfd = 0;
	    fd_set fds;
	    FD_ZERO(&fds);
	    for (i = 0; i < fdlistsize; i++) {
		FD_SET(fdlist[i], &fds);
		if (fdlist[i] > maxfd)
		    maxfd = fdlist[i];
	    }
	    if (select(maxfd + 1, &fds, NULL, NULL, NULL) < 0)
		return(-1);
	    for (i = 0; i < fdlistsize; i++) {
		if (FD_ISSET(fdlist[i], &fds))
		    return(read(fdlist[i], buf, bufsize));
	    }
	    return(-1);  /* shouldn't happen, but is certainly an error */
	}

6. [8 marks]
Given these declarations:

	int n;
	int *p;
	int x[4];

and given values as follows:

n is 5
p is a pointer to x[1]
x[0] is 0; x[1] is 1; x[2] is 2; x[3] is 3

specify the type and value of each of the following expressions. (For pointer values you can write things like "pointer to x[1]".) If the expression is invalid, write "invalid".

	type	value
`p+2`
`p+n`
`*p`
`*(p+2)`
`&p`
`p[1]`
`&p[1]`
`'a'+n`

Solution:

	type	value
`p+2`	pointer to int	pointer to x[3]
`p+n`	invalid (exceeds array bounds)
`*p`	int	1
`*(p+2)`	int	3
`&p`	pointer to pointer to int	pointer to p
`p[1]`	int	2
`&p[1]`	pointer to int	pointer to x[2]
`'a'+n`	int	'f' (or 102)

7. [20 marks]
There is an existing function called "sthg" (short for "do something") which takes one integer argument and returns an integer. It does some sort of computation which we want to perform in parallel for various arguments.

Write a function in C (not a complete program) which takes an array of integers (as the usual two parameters: base address of the array and size of the array). It forks once per array element and, in parallel, calls sthg(i) for each array element. The results of all of the computations are added together, and this is the return value of your function.

To convey the results of the computations back to the original process, you will need to use pipes. Each child process does the computation and then writes the result to the pipe.

	int parallel_something(int *a, int size)
	{

Weakness in question: I should have said what to do in the case of error from fork or pipe. I apologize for the omission. I also should have provided for there to be something reasonable to do in the case of error! Because any int return value is a possible valid value so can't be used for an error signal.

If I were setting this exam question again, I'd say that sthg()'s return value is guaranteed to be non-negative, and parallel_something() should return −1 for error — a bit weak, but good enough for an exam question. That's how I wrote my solution below. If you made other assumptions, that's fine. (In "real life", I'd probably add an additional parameter to convey error information, separate to the return value conveying the sum in the case of success.)

Sample solution:

	#include <stdlib.h>
	#include <unistd.h>
	#include <fcntl.h>
	#include <sys/types.h>
	#include <sys/wait.h>

	int parallel_something(int *a, int size)
	{
	    int pipefd[size * 2];
	    int i, sum, x;
	    extern int sthg(int x);

	    for (i = 0; i < size; i++) {
		if (pipe(pipefd + i * 2))
		    return(-1);
		switch (fork()) {
		case -1:
		    return(-1);
		case 0:
		    close(pipefd[i*2]);
		    x = sthg(a[i]);
		    if (write(pipefd[i*2+1], &x, sizeof x) != sizeof x)
			perror("write");
		    exit(0);
		default:
		    close(pipefd[i*2+1]);
		}
	    }
	    for (sum = i = 0; i < size; i++) {
		if (read(pipefd[i*2], &x, sizeof x) != sizeof x)
		    return(-1);
		close(pipefd[i*2]);
		sum += x;
	    }
	    for (i = 0; i < size; i++)
		wait(NULL);
	    return(sum);
	}

8. [20 marks]
The file /dev/urandom is a special device file which is a source of random bytes, as we used early in the course in the "how much is that doggie in the window?" example, and again in lab 12. (You don't need to remember those examples to do this question.)

The file /usr/share/dict/words is a list of English words for use in spell-checking, with one word per line. It is more than 65536 lines long. Each line is fewer than 500 characters wide.

Write a complete C program (except that you can omit the #includes) which reads two bytes from /dev/urandom to get a 16-bit number from 0 to 65535 (2¹⁶−1), and then reads /usr/share/dict/words to get a random line from the first 65536 lines of the file (from line 1 to 65536). Output that line.

(Note: You still need to check for all errors. For example, if /usr/share/dict/words can't be opened, you would need to output a correct error message. If it is fewer lines long than the random number you generate, you can simply exit without output, but your program mustn't misbehave or violate the C language rules. If a line is longer than 500 characters, you can end up splitting it in two, or losing some of it, or anything like that, but you can't exceed array bounds.)

Sample solution:

	#include <stdio.h>
	#include <unistd.h>
	#include <fcntl.h>

	int main()
	{
	    int fd;
	    unsigned char nrand[2];
	    int n;
	    FILE *fp;
	    char line[500];

	    if ((fd = open("/dev/urandom", O_RDONLY)) < 0) {
		perror("/dev/urandom");
		return(1);
	    }
	    if (read(fd, nrand, 2) != 2) {
		perror("/dev/urandom");
		return(1);
	    }
	    n = nrand[0] << 8 | nrand[1];
	    if ((fp = fopen("/usr/share/dict/words", "r")) == NULL) {
		perror("/usr/share/dict/words");
		return(1);
	    }
	    while (fgets(line, sizeof line, fp)) {
		if (n-- == 0) {
		    printf("%s", line);
		    return(0);
		}
	    }
	    return(1);
	}

Bit-wise operators were not the only way to combine the two bytes into one 16-bit value. "nrand[0] * 256 + nrand[1]" will also be accepted for full marks, and in fact in a modern optimizing compiler might compile that to the same machine code (although it also might not, because the result could be different for negative numbers, but the use of unsigned char rules this out, but that's getting pretty sophisticated).

9. [9 marks]

The following program is meant to be a server which listens on port 2000, and for each client which connects, it reads one character (byte) from the client, sends it back to that client, and drops the connection.

This program doesn't work.

(I suggest reading through the code first, rather than skipping ahead to the questions.)

	 1  #include <stdio.h>
	 2  #include <unistd.h>
	 3  #include <string.h>
	 4  #include <sys/types.h>
	 5  #include <sys/socket.h>
	 6  #include <netinet/in.h>
	 7  
	 8  int main()
	 9  {
	10      int fd;
	11      socklen_t len;
	12      struct sockaddr_in r, q;
	13      int c;
	14  
	15      if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
	16          perror("socket");
	17          return(1);
	18      }
	19  
	20      memset(&r, '\0', sizeof r);
	21      r.sin_family = AF_INET;
	22      r.sin_addr.s_addr = INADDR_ANY;
	23      r.sin_port = 2000;
	24  
	25      if (bind(fd, (struct sockaddr *)&r, sizeof r) < 0) {
	26          perror("bind");
	27          return(1);
	28      }
	29      if (listen(fd, 5)) {
	30          perror("listen");
	31          return(1);
	32      }
	33  
	34      while (1) {
	35          len = sizeof q;
	36          if ((fd = accept(fd, (struct sockaddr *)&q, &len)) >= 0) {
	37              switch (read(fd, &c, 1)) {
	38              case -1:
	39                  perror("read");
	40                  return(1);
	41              case 1:
	42                  write(fd, &c, sizeof c);
	43              }
	44          }
	45      }
	46  }

[the code as a separate file, if you want to try it]

The following bugs are observed. For each bug, state the fix, by writing the actual replacement code and saying what should be replaced or where the new code should be inserted, using the line numbers on the original code for reference (e.g. "insert this after line 12").

a) It listens on the wrong port number. For some reason it is listening on port 53255 (determined by running a tool which shows what ports have programs listening on them), rather than on port 2000.

b) It indeed echoes back the first character sent, but then also sends some seemingly-random garbage.

c) It doesn't drop the connection after doing that. (The introductory description said that this program is supposed to drop the connection after sending the character back to the client.)

d) It only works for the first client. Subsequent connections are not apparently answered. In starting to debug this, you notice that you are not handling errors from accept() (line 36 does check accept's return value, but does not do anything in the negative case). You add an 'else' after line 44 to call perror() and exit. For the second client, this outputs the error "accept: Invalid argument". The man page says that this means that "socket is not listening for connections".

Sample solutions:

a) This is because it didn't call htons() on the port number. 2000 as a 16-bit binary number is 0000011111010000, and if you reverse those two halves to get 1101000000000111, that's 53255. So on a little-endian machine, it will listen on port number 53255 instead of 2000. (Try it!)

(The fix is therefore, obviously, simply to say htons(2000) instead of 2000.)

b) This is because line 42 writes "sizeof c" bytes instead of just one, where c is an int. The clearest fix is to change the type of c (line 13) to char, because that's what we're really meaning here. Changing the third argument to the write() call on line 42 to 1 will also work, although for a slightly odd reason: The read() on line 37 will just read into whatever byte of the integer c is first, leaving the other bytes of c undisturbed; and then, the write on line 42 will write only that same first byte. This is not a recommended practice, because it's confusing.

c) There is no code to drop the connection. After line 42, add close(fd).

d) The variable fd is used for two different purposes. As is, we're losing the listening file descriptor altogether, so then when we loop around, fd is the old client, not the listening file descriptor, so it's not going to work in the accept().

To fix this, the result of the accept() call needs to be stored in a new int variable, and this variable used for the rest of lines of the file, leaving fd's value intact for the next iteration of the loop. That is to say, add "int clientfd;" somewhere, and change line 36 to say "if ((clientfd = accept(fd, ...", and change fd in lines 37 and 42 to clientfd.

End of exam. Total marks: 100. Total pages: 12.