Beyond "hello world!"

The traditional first program displays "hello, world!". There are plenty of tutorials on the web that show how to do this, so we won't bother (but feel free to search Google to find examples).

Instead, we'll write a slightly more complicated program that reads text from standard input, counts how often each character occurs, and then displays those frequencies in four columns. For example, here's what our program (which we'll dub countch) does on a version of Alice in Wonderland:

% ./countch < alice.txt
\n      44      \r      44      Space   2155    !       28
"       77      '       30      (       11      )       11
,       175     -       29      .       45      :       22
;       16      ?       15      A       40      B       2
C       2       D       14      E       7       F       3
G       1       H       5       I       39      K       1
L       7       M       5       N       6       O       6
P       2       R       8       S       9       T       10
W       10      Z       1       a       640     b       139
c       167     d       376     e       1075    f       183
g       201     h       588     i       522     j       6
k       96      l       410     m       142     n       551
o       690     p       127     q       6       r       439
s       506     t       859     u       249     v       68
w       244     x       6       y       163     z       3

Note that the way we wrote the program, upper case letters are distinct from their lower case equivalents and we print out some special characters (like the 'newline' character and the Space character) to make them easier to read.

Getting with the program

Here's the program listing, which I named countch.cpp:
#include <map>
#include <iostream>
using namespace std;

int main() 
  {
  map<char, int> freqs; 
  char ch;
  
  while (cin.get(ch))
    freqs[ch]++;
 
  int i;
  map<char,int>::iterator it;

  for (i=1, it = freqs.begin(); it != freqs.end(); ++it,++i) 
    {
    switch (it->first) 
      {
      case '\r': cout << "\\r"; break;
      case '\t': cout << "\\t"; break;
      case '\n': cout << "\\n"; break;
      case ' ' : cout  << "Space"; break;
      default: cout << it->first;
      }
    cout << "\t" <<  it->second << ((i%4) ? "\t" : "\n");
    }
  // cout << freqs;
  }

The first thing to note is that this program has no classes in it, just a function, main. Unlike Java, C++ doesn't require the use of classes. However, as with Java, the main function is special: it's the function that gets called when the program is run.

The second thing to note is that the program isn't very long. C++ has an odd reputation for being both overly verbose and overly concise. But you can write reasonably clear programs in C++, and this program is an example. Most of the code handles the special cases of hard-to-read characters and putting the results in columns.

Heads up

Let's examine this code bit by bit. The first section contains header inclusions:

#include <map>
#include <iostream>
using namespace std;
What is the Standard Template Library? It is a library with a wide variety of useful algorithms and data structures. See the section in the appendix about the STL.

Most C++ programs contain one or more #include directives. An #include directive tells the compiler where to find the declarations of data types and functions from other C++ modules. This program has an #include <map> directive because we want to use a module from the Standard Template Library that allows us to define a map: a dictionary of keys to values that maintains the keys in sorted order. We'll use this to map characters to their frequencies.
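To see the "sorted order" property in isolation, here is a minimal sketch. The helper name sorted_keys is made up for this illustration and is not part of countch.cpp; it also uses an iterator, which is explained further down.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (not part of countch.cpp): returns the distinct
// characters of `text` in the order a std::map stores them.
std::string sorted_keys(const std::string& text) {
    std::map<char, int> seen;
    for (std::size_t k = 0; k < text.size(); ++k)
        seen[text[k]] = 1;      // keys arrive in the order they appear in text

    std::string keys;
    for (std::map<char, int>::const_iterator it = seen.begin();
         it != seen.end(); ++it)
        keys += it->first;      // keys are visited in ascending order
    return keys;
}
```

Calling sorted_keys("banana") gives back "abn": each key appears once, and the keys come out sorted even though they went in as b, a, n.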

We have #include <iostream> because this module will allow us to read and write to input and output streams, including "standard in" and "standard out." In this module, the standard input stream is called cin, and the standard output stream is called cout (the c stands for 'character,' so the names read as 'character in' and 'character out').

What is a namespace? A namespace is a named scope that keeps the names of variables, functions, classes, etc., organized and prevents them from clashing.

The statement using namespace std; allows us to use names from the std namespace without prefixing them with the namespace name. Without this statement, we would have to refer to cout as std::cout, for example. Different programmers make different choices about whether to use the standard namespace in this way.
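Here is a small sketch of the choices involved. The namespace demo and the function names are invented for this illustration; the middle road shown in the last function, a using-declaration for a single name, is one option among several rather than a recommendation.

```cpp
#include <sstream>
#include <string>

// Hypothetical namespace, used only for this illustration.
namespace demo {
    std::string greeting() { return "hello from demo"; }
}

// Fully qualified: every name from std carries its std:: prefix.
std::string qualified() {
    std::ostringstream out;
    out << demo::greeting();
    return out.str();
}

// A using-declaration brings in one name, and only inside this function.
std::string with_using_declaration() {
    using std::ostringstream;
    ostringstream out;
    out << demo::greeting();
    return out.str();
}
```

Both functions return the same string; the only difference is how much typing the names require and how far the shortcut reaches.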

Remember the main

The next section of our program contains the code for our one function, main. As in Java and C, a function is declared by first describing its return type (in this case int), then its name (in this case main), its list of arguments (in this case, (), because we don't care about arguments that are passed in), and then a block of code that defines the function. Again, as in Java and C, blocks of code are defined with matching braces.

In other words, if we were writing the "hello, world!" program, it would be just:

#include <iostream>
int main() 
  {
  std::cout << "hello world!\n";
  }

Ours is slightly more complicated, but not really by much. It has two parts. First, the frequency dictionary, freqs, is declared, and the input stream is read to find the frequencies. Then, we iterate over the frequency dictionary, writing out the frequencies in four columns. Again, here is the code:

int main() 
  {
  map<char, int> freqs; 
  char ch;
  
  while (cin.get(ch))
    freqs[ch]++;
 
  int i;
  map<char,int>::iterator it;

  for (i=1, it = freqs.begin(); it != freqs.end(); ++it,++i) 
    {
    switch (it->first) 
      {
      case '\r': cout << "\\r"; break;
      case '\t': cout << "\\t"; break;
      case '\n': cout << "\\n"; break;
      case ' ' : cout  << "Space"; break;
      default: cout << it->first;
      }
    cout << "\t" <<  it->second << ((i%4) ? "\t" : "\n");
    }
  }
Common datatypes in C++ include bool, char, int, float, long, and double. Character literals are enclosed in single quotes (e.g., 'a'), string literals in double quotes (e.g. "test").

map<char, int> freqs; declares that freqs is a map whose keys are of type char and whose values are of type int. We also declare a char variable, ch, to use as the key to freqs. C++ allows us to get a particular value in this map with freqs[ch]. If the key is not in the map yet, freqs[ch] first inserts it with a value of zero; hence, freqs['a'] is zero the first time we ask for it. This is a property of map, not of C++ variables in general: an ordinary local int is not initialized automatically.
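The insert-on-lookup behavior is easy to check for yourself. The following is a sketch under the assumption that the text is already in a string; count_chars is a made-up helper name, not something from countch.cpp.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: counts how often each character occurs in `text`.
std::map<char, int> count_chars(const std::string& text) {
    std::map<char, int> freqs;
    for (std::size_t k = 0; k < text.size(); ++k)
        freqs[text[k]]++;   // a missing key is inserted with value 0, then incremented
    return freqs;
}
```

For "hello" the map ends up with four entries, and the entry for 'l' is 2. Note that merely reading freqs['z'] afterwards adds a fifth entry with value zero; use freqs.find('z') if you want to look without inserting.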

Cin City

The while loop repeatedly calls the get method on cin, the standard input stream. The get method takes one parameter, a character variable. If there is a character on the standard input stream, the variable gets that character as its value. If there is not (at the end of the file, for example), the read fails and the stream is put into a failed state. The get method returns the stream itself, and a stream used as a condition counts as true while it is in a good state and as false once a read has failed, so we can write the very concise while loop:

  while (cin.get(ch))
    freqs[ch]++;
In general, x+=2 is the same as x=x+2, and ++x is the same as x+=1. For most purposes, x++ is the same as x+=1, too, except that the value of this expression is the old value of x, not the new value. It's considered by some (not me) to be bad C++ style to use i++ instead of ++i in for loops and the like.
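The difference between the two increment forms only shows up when the value of the expression is used. A tiny sketch, with helper names invented for the purpose:

```cpp
// Hypothetical helpers: each increments x and returns the value
// of the increment expression itself.
int value_of_post_increment(int& x) { return x++; }  // yields the old value
int value_of_pre_increment(int& x)  { return ++x; }  // yields the new value
```

Starting from x equal to 5, the first function returns 5 and the second returns 6; in both cases x is 6 afterwards.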

Note that C++ conditions inside while loops (as well as if conditions) must be enclosed in parentheses.
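The same loop works on any input stream, not just cin, which makes it easy to try out. This sketch assumes we pass the stream in as a parameter; count_stream and count_text are hypothetical names, and count_text wraps a string in a std::istringstream so no console is needed.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: the countch reading loop, applied to any input stream.
std::map<char, int> count_stream(std::istream& in) {
    std::map<char, int> freqs;
    char ch;
    while (in.get(ch))      // the stream converts to false once a read fails
        freqs[ch]++;
    return freqs;
}

// Convenience wrapper that treats a string as the input.
std::map<char, int> count_text(const std::string& text) {
    std::istringstream in(text);
    return count_stream(in);
}
```

Unlike the >> operator, get does not skip whitespace, so count_text("a b\na") reports one space and one newline alongside two a's.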

For: he's a jolly good fellow

Because we have a map containing the character/frequency pairs, we need to iterate over all the keys to print out the frequencies. C++ provides iterators for all of the collections in the Standard Template Library. We declare such an iterator in our code with: map<char,int>::iterator it;

Notice that it's just the type definition we used for freqs--that is, map<char,int>--with "::iterator" appended to it. We also declare int i, which we'll use to keep track of the column.

Typically, an iterator is initialized by calling the begin() method on an object, and is completed when it is equal to the result of calling the end() method on an object. The ++ incrementor is used to move the iterator to the next "place" in the object. In our case, the object is the freqs map. Thus we see in the for loop:

 for (i=1, it = freqs.begin(); it != freqs.end(); ++it,++i) 
   { 
     ...code elided... 
   }

Again, we are counting the columns with i, so that is initialized and incremented as well. The general syntax for for loops is:

 for (initialization; test; increment) 
   { 
     code 
   }

Pay attention to the punctuation: the initialization, test and increment are separated by semicolons, and surrounded by parentheses. Individual initialization or increment expressions (as in our example) are separated by commas. It's all boring syntax, but you'll get used to it eventually.
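As another illustration of a for loop that steers two variables at once, here is a sketch that walks a string from both ends. The function is_palindrome is an invented example and has nothing to do with countch.cpp.

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: i moves forward while j moves backward.
bool is_palindrome(const std::string& s) {
    if (s.empty())
        return true;        // avoids computing size() - 1 on an empty string
    // Two variables are declared in the initialization; the increment
    // part uses a comma to update both of them on every pass.
    for (std::size_t i = 0, j = s.size() - 1; i < j; ++i, --j) {
        if (s[i] != s[j])
            return false;
    }
    return true;
}
```

With this, is_palindrome("level") is true and is_palindrome("alice") is false.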

What's a pointer? See the discussion on Memory Management.

Essentially, we want our program to display to the standard output the key and value from each entry in the freqs map. Reading the Standard Template Library documentation, we learn that each entry is a pair object, and a pair object has two publicly accessible fields (called "data members" in C++): first (which, for map objects, holds the key) and second (which holds the value). An iterator behaves like a pointer to the current entry, so to access the first and second fields through the iterator, we use it->first and it->second. If we didn't care about columns or printing out special characters, the code would look like this:

  for (it = freqs.begin(); it != freqs.end(); it++) 
    {
    cout << it->first << "\t" <<  it->second << "\n";
    }

In fact, this is a very common programming idiom when using iterators.
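To experiment with the idiom away from the console, the sketch below writes the entries into a string instead of to cout. The name format_entries is made up; it uses const_iterator because the map is passed in as a const reference.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: one "key<TAB>value" line per map entry.
std::string format_entries(const std::map<char, int>& freqs) {
    std::ostringstream out;
    for (std::map<char, int>::const_iterator it = freqs.begin();
         it != freqs.end(); ++it) {
        out << it->first << '\t' << it->second << '\n';
    }
    return out.str();
}
```

Compilers that support C++11 or later also accept a range-based for loop over the map, which hides the iterator, but the explicit form above is what you will meet in older code.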

Cout and about

Small exercise: compile and run this program: cat.cpp.

We'll ignore the details of the << and >> operators here, but notice that we send (the printed representation of) data to cout with <<, and we can "chain" data to send to cout with multiple uses of <<. Data can be sent from cin to variables using the >> operator, which skips whitespace before each item; if we wanted to read whitespace-separated words from cin into a string variable until we reached the end of file, we could use this while loop:

  string str;
  while (cin >> str)
    {
    // do something with str ...
    }
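As a concrete "something," here is a sketch that counts words rather than characters. It assumes the stream is passed in as a parameter (so it can be cin or a string stream), and count_words is an invented name.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: frequency of each whitespace-separated word.
std::map<std::string, int> count_words(std::istream& in) {
    std::map<std::string, int> freqs;
    std::string str;
    while (in >> str)       // >> skips whitespace, then reads one word
        freqs[str]++;
    return freqs;
}

// Convenience wrapper that treats a string as the input.
std::map<std::string, int> count_words_in(const std::string& text) {
    std::istringstream in(text);
    return count_words(in);
}
```

Given "the cat  saw\nthe dog", the word "the" is counted twice, and the extra space and the newline are simply skipped.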

Switch hitting

Some special characters: \n is newline; \r is carriage return; \t is tab.

When we send a newline character to cout, naturally a new line is displayed on the console, which would break up our columns. To keep each character/frequency pair in its own cell, we need to display some of the special characters in some other form. Of course, spaces are not visible, so I've chosen to print "Space" instead of a space character. The switch construct in C++ allows control choices based on the value of some variable or expression. The basic form is:

  switch (variable) 
    {
    case first_value: do some things; break;
    case second_value: do some things; break;
    ...
    case nth_value: do some things; break;
    default: do some things;
    }

Of special note is the use of break at the end of each case. Without this, control will pass to the next case statement instead of "breaking out" of the switch block of code. The break statement can also be used to break out of other code blocks--from within a for or while loop, as common examples.
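Here is the switch from countch.cpp recast as a sketch that returns the label instead of printing it, followed by a second function that leaves out a break on purpose. Both function names are invented for this illustration.

```cpp
#include <string>

// Hypothetical helper: the printable label for a character.
std::string label_for(char c) {
    std::string label;
    switch (c) {
        case '\r': label = "\\r";   break;
        case '\t': label = "\\t";   break;
        case '\n': label = "\\n";   break;
        case ' ' : label = "Space"; break;
        default:   label = std::string(1, c);   // the character itself
    }
    return label;
}

// Hypothetical helper: deliberate fall-through from one case to the next.
bool is_line_ending(char c) {
    switch (c) {
        case '\r':              // no break here, so control falls through
        case '\n': return true;
        default:   return false;
    }
}
```

Falling through is handy when several values share the same code, as in is_line_ending, but a break that is missing by accident is a classic bug.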

Smooth operators

Most operators in C++ are either unary or binary--that is, they take one argument (like ++ in ++i) or two (like = in x=7). C++ (like Java and JavaScript) has one ternary operation:
     test ? if_true : if_false

The test is evaluated; if it results in a true (or non-zero) value, the if_true expression is evaluated and becomes the value of the whole expression; otherwise the if_false expression does. This is much like if (test) if_true; else if_false; but it is more concise and, because it is an expression, it can be embedded inside a larger one. Some people dislike it for its terseness, but using it when wanting to embed simple "if/then/else" logic seems perfectly reasonable, as we've done in our code:

    cout << "\t" <<  it->second << ((i%4) ? "\t" : "\n");

Whenever (i%4) is not zero, we want to output a tab character ("\t"); otherwise we want to output a newline character ("\n"). We initialize i to 1, and using "the ternary operator" allows us to concisely output values in four columns.
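The column trick can be tried on its own with a list of numbers. In this sketch the number of columns is a parameter rather than the fixed 4, and in_columns is a made-up name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: lays out values in rows of `columns` entries each.
// Assumes columns is greater than zero.
std::string in_columns(const std::vector<int>& values, int columns) {
    std::ostringstream out;
    for (std::size_t k = 0; k < values.size(); ++k) {
        int i = static_cast<int>(k) + 1;    // count from 1, as countch.cpp does
        // A non-zero remainder means the row is not full yet: emit a tab.
        out << values[k] << ((i % columns) ? "\t" : "\n");
    }
    return out.str();
}
```

For the values 1 through 5 in four columns, the result is the first four numbers separated by tabs and ended by a newline, then a 5 followed by a trailing tab. The original program leaves the same trailing tab when the last row is not full.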

Runs, comments, and errors

Finally, our code has one other line, and it's the only line I included just for pedagogical reasons:

  // cout << freqs;
Everything after // on a line is a comment. Everything between /* and */ is a comment, too.

The // characters introduce a comment: the rest of the line is ignored by the compiler. It's possible to imagine that the uncommented line would display all of the key/value pairs in freqs to the console (this is what happens in JavaScript, for example). But this isn't what happens in C++. Instead, I got the following error messages (using the g++ compiler):

countch.cpp: In function `int main()':
countch.cpp:27: no match for `std::ostream& << std::map<char, int, 
   std::less<char>, std::allocator<std::pair<const char, int> > >&' operator
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:55: candidates are: 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(std::basic_ostream<_CharT, 
   _Traits>&(*)(std::basic_ostream<_CharT, _Traits>&)) [with _CharT = char, 
   _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:77:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(std::basic_ios<_CharT, 
   _Traits>&(*)(std::basic_ios<_CharT, _Traits>&)) [with _CharT = char, _Traits 
   = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:99:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(std::ios_base&(*)(std::ios_base&)) [with _CharT = char, 
   _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:177:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(long int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:214:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(long unsigned int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:152:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(bool) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:104:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(short int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:115:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(short unsigned int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:119:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:130:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(unsigned int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:240:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(long long int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:278:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(long long unsigned int) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:304:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(double) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:145:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(float) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:329:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(long double) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:354:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(const void*) [with _CharT = char, _Traits = 
   std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:120:                 
   std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, 
   _Traits>::operator<<(std::basic_streambuf<_CharT, _Traits>*) [with _CharT = 
   char, _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:211:                 
   std::basic_ostream<_CharT, _Traits>& 
   std::operator<<(std::basic_ostream<_CharT, _Traits>&, char) [with _CharT = 
   char, _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:505:                 
   std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, 
   _Traits>&, char) [with _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:222:                 
   std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, 
   _Traits>&, signed char) [with _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:227:                 
   std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, 
   _Traits>&, unsigned char) [with _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:573:                 
   std::basic_ostream<_CharT, _Traits>& 
   std::operator<<(std::basic_ostream<_CharT, _Traits>&, const char*) [with 
   _CharT = char, _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/bits/ostream.tcc:620:                 
   std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, 
   _Traits>&, const char*) [with _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:246:                 
   std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, 
   _Traits>&, const signed char*) [with _Traits = std::char_traits<char>]
/usr/include/gcc/darwin/3.1/g++-v3/ostream:251:                 
   std::basic_ostream<char, _Traits>& std::operator<<(std::basic_ostream<char, 
   _Traits>&, const unsigned char*) [with _Traits = std::char_traits<char>]

Oh my!

It is not unusual to get so much information back from the C++ compiler, and it takes some familiarity with how C++ works to understand what is going on. But the key piece of information--that the error occurred in line 27--told me where to look, and it's easy to guess that one just can't print a map to the console.

Briefly, though, look at the start of the error message again:

countch.cpp: In function `int main()':
countch.cpp:27: no match for `std::ostream& << std::map<char, int, 
   std::less<char>, std::allocator<std::pair<const char, int> > >&' operator

We can see something that looks like our map declaration, and the pair that a map contains. This is wrapped inside no match for `std::ostream& << ... >&' operator, and this is understandable. It's telling us that no << operator is defined that takes a "std::ostream" on the left and a map of this type on the right; std::ostream, as you might guess, is the class for output streams (in the "std" namespace). The long list of "candidates" that follows is simply every << operator the compiler considered and rejected.

I include this example because it's not uncommon, in learning C++, to see reams and reams of error messages like this. It doesn't necessarily mean that there are lots and lots of errors in your code though--as we've seen, one error here resulted in lines and lines of messages.
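For the curious: the compiler's complaint is that no matching << operator exists, so one way to make a line like cout << freqs compile is to supply that operator ourselves. The following is only a sketch of the idea, not something countch.cpp does; the output format (one tab-separated pair per line) and the helper name to_text are my own choices.

```cpp
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// A user-supplied << for map<char,int>; the standard library has none.
std::ostream& operator<<(std::ostream& out, const std::map<char, int>& freqs) {
    for (std::map<char, int>::const_iterator it = freqs.begin();
         it != freqs.end(); ++it) {
        out << it->first << '\t' << it->second << '\n';
    }
    return out;     // returning the stream is what allows chaining
}

// Hypothetical helper: uses the operator above to build a string.
std::string to_text(const std::map<char, int>& freqs) {
    std::ostringstream out;
    out << freqs;
    return out.str();
}
```

With this operator in scope, the compiler finds a match and the wall of candidates disappears.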