C++ iostreams: Unexpected but legal multithreaded behaviour
In previous articles, I’ve waxed rhapsodic about how great C++ is. I also noted there, however, that every language, C++ included, has its dark sides. Some languages have an unavoidable, pervasive dark side, like being slow or hard to multithread; for C++, that dark side is mostly its complexity. In this post I want to zoom in on a specific ‘gotcha’ that recently took me several hours to resolve. I wrote this piece so that anyone running into the same issue can find it by searching the web.
You may end up at this page if your multi-threaded C++ programs suffer from duplicate or unexpectedly corrupted output, specifically after you’ve enabled the optimization std::ios_base::sync_with_stdio(false). It turns out that sync_with_stdio determines a lot more than whether C stdio is synced with C++ iostreams.
I want to thank Stackoverflow user bames53 for the insights found in his comment here. Thanks are also due to Stefan Bühler who first pointed out my bug was likely entirely legal behaviour.
iostreams
When any new language is released, it needs a ‘Hello, world’ program. And in fact, whole languages may be graded on how easy or hard it is to output text to screen or a file.
C (or POSIX) offers two ways of doing i/o: either the lowest-level form of straight-up system calls like write(2), or the slightly more advanced stdio, which offers buffering, formatting and parsing.
As was already noted in ‘The C++ Programming Language’, in theory, C++ should be able to do better than printf("Hello %s world", "new"). This printf needs to parse its format string at runtime, every time it is invoked. And, without special compiler help, you won’t know if the %s conversion is correctly matched up with a pointer to a string.
Enter the C++ iostreams, which were one of the first things everyone encountered in C++. In theory, cout << "Hello " << "new" << " world" could be faster than the printf above, since it could figure out what it needed to do at compile time, and thence do it faster at runtime.
In practice this was not the case, and despite valiant efforts, the C++ iostreams remained slower and more cumbersome to use than the existing C stdio. Use of printf is still rife and not frowned upon.
Lately however, and I’m not sure when this happened, a lot of work has gone into speeding up iostreams, at least in G++. It is now entirely feasible to use iostreams for bulk text processing.
The nitty gritty
To make coexistence between C stdio and C++ iostreams possible, by default, anything written to cout ends up in the same buffer as when writing to stdout using stdio. So this will do the right thing:
printf("Hello, ");
cout << "new world" << endl;
This synchronization of course comes at a performance penalty, or at least, we tend to assume so. Most C++ programmers will decide that if they can get away with it, disabling any form of synchronization is a good idea. Much advice online therefore suggests calling std::ios_base::sync_with_stdio(false), frequently without noting that this needs to happen before any i/o has occurred.
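As a minimal sketch of where the call belongs: sync_with_stdio returns the previous setting, so a tiny wrapper can report what state the streams were in. The helper name disable_stdio_sync is my own invention for illustration, not something from the standard library.

```cpp
#include <iostream>

// Hypothetical helper: switch off synchronization with C stdio and report
// whether the streams were synchronized before the call.
// Intended to be the first thing main() does; if any i/o on the standard
// streams happened earlier, the effect is implementation-defined.
bool disable_stdio_sync()
{
    return std::ios_base::sync_with_stdio(false);
}
```

On a freshly started program the first call reports true (streams are synchronized by default), and a second call reports false.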
The name sync_with_stdio certainly suggests this is only about interoperability with C stdio. It turns out this is not the case: disabling this synchronization fundamentally alters how cin, cout, cerr, clog and their variants function.
Multi-threading
One reason for using C++ is that it supports multi-threading (or more broadly, multi-processing) very well. The original C++ standard had no words on it because back in the day, officially, there were no threads. Later versions of C++ (starting with C++ 2011) dusted off the iostreams specification and added words on thread safety.
This starts off with the following:
Concurrent access to a stream object (30.8, 30.9), stream buffer object (30.6), or C Library stream (30.12) by multiple threads may result in a data race (6.8.2) unless otherwise specified (30.4). [ Note: Data races result in undefined behavior (6.8.2). — end note ] – [iostreams.threadsafety]
This is a blanket statement that bad things may happen if we do stuff to iostreams from several threads at the same time, unless there is a specific statement that says doing so is safe.
Luckily, there is the following paragraph too:
Concurrent access to a synchronized (27.5.3.4) standard iostream object’s formatted and unformatted input (27.7.2.1) and output (27.7.3.1) functions or a standard C stream by multiple threads shall not result in a data race (1.10). [Note: Users must still synchronize concurrent use of these objects and streams by multiple threads if they wish to avoid interleaved characters. — end note] – [iostream.objects.overview]
No disasters will happen on concurrent use of synchronized standard iostreams, although if you print out two log lines to cerr at the same time, you may find them interleaved in your output. This is certainly not pretty, and it is hard to parse, but at least it is not illegal.
Note however that this paragraph talks only about ‘synchronized’ streams. Once we call the much recommended sync_with_stdio(false), our streams are no longer synchronized: not only not with stdio, but not at all. This means every operation on cin, cout etc. that may run concurrently with another thread's use of the same stream must now be protected by a mutex.
This by itself is likely reason enough to never call sync_with_stdio(false) in any multi-threaded program using cout to print things.
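For those who disable synchronization anyway, here is a rough sketch of what ‘protected by a mutex’ could look like. All names here (g_out_mutex, locked_write, count_intact_lines) are hypothetical, and an ostringstream stands in for cout so that the result can be inspected afterwards.

```cpp
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical global lock guarding one shared output stream.
std::mutex g_out_mutex;

// Every write to the shared stream goes through this function, so no two
// threads ever operate on the stream object at the same time.
void locked_write(std::ostream& out, const std::string& text)
{
    std::lock_guard<std::mutex> lock(g_out_mutex);
    out << text;
}

// Demo: several threads write complete lines to a single stream.
// Returns the number of lines that arrived intact.
int count_intact_lines(int threads, int lines_per_thread)
{
    std::ostringstream sink;  // stands in for cout
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&sink, lines_per_thread] {
            for (int i = 0; i < lines_per_thread; ++i)
                locked_write(sink, "Hi from a worker\n");
        });
    }
    for (auto& w : workers)
        w.join();

    // All writers are done; count the undamaged lines.
    std::istringstream in(sink.str());
    std::string line;
    int intact = 0;
    while (std::getline(in, line))
        if (line == "Hi from a worker")
            ++intact;
    return intact;
}
```

The lock only helps if every thread that touches the stream takes it, which is harder to guarantee than it sounds.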
Ha, but I never do output from two threads at once
We now end up at my mysterious bug, which can be reproduced with the following tiny program:
#include <iostream>
#include <thread>
#include <string>
#include <unistd.h>
using namespace std;

// The only code in this program that writes to cout.
void theThread()
{
  for(int counter = 0 ;; ++counter) {
    usleep(250000);
    cout << "Hi " << counter << endl;
  }
}

int main()
{
  std::ios_base::sync_with_stdio(false);
  string line;
  thread t(theThread);
  // main only ever reads from cin and never touches cout directly.
  while(getline(cin, line))
    ;
}
If this is invoked as yes | ./repro, we may get the following output:
HHi 0
Hi Hi 1
Hi Hi 2
Where we would expect to see Hi 0, Hi 1, Hi 2 etc. We only ever operate on cout from theThread(), and never from main. It feels like this should be safe, but it still fails. What is going on?
tied iostreams
In its wisdom, the C++ standards committee decided that some iostreams should be tied together. This guarantees that the following works:
std::ios_base::sync_with_stdio(false);
cout << "Enter your name\n";
cin >> name;
The initial “Enter your name\n” string is buffered and not immediately emitted to the terminal. Without tying, the user would be asked for input before the program has printed its prompt. Because of the tie, any read operation on cin will trigger a flush on cout. Most helpful.
However, this flush is a write operation! So in our sample program above, we do in fact have two threads operating on cout at the same time. Every time we read a line from yes, cout gets flushed. It is therefore entirely legal (if unexpected) for our compiler and standard library to emit odd output.
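The hidden flush can be made visible with a stream buffer that counts calls to sync(). This is only a sketch: CountingBuf and flushes_caused_by_reading are names I made up, and an istringstream tied to a private ostream stands in for the cin/cout pair.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: a string buffer that counts how often it is
// asked to flush.
struct CountingBuf : std::stringbuf {
    int syncs = 0;
    int sync() override
    {
        ++syncs;
        return std::stringbuf::sync();
    }
};

// Read every line of `text` through an input stream that is either tied
// to an output stream or not, and return how many flushes the output
// stream's buffer saw as a result.
int flushes_caused_by_reading(const std::string& text, bool tied)
{
    CountingBuf buf;
    std::ostream out(&buf);
    std::istringstream in(text);
    if (tied)
        in.tie(&out);  // the relationship cin has with cout by default

    std::string line;
    while (std::getline(in, line))
        ;              // note: we never write to `out` ourselves
    return buf.syncs;
}
```

With the tie in place the counter goes up although the code only ever reads; without it, the counter stays at zero. How many flushes happen per read may differ between standard library implementations.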
The solution to this problem is simple. Insert the following:
cin.tie(nullptr);
This breaks the tie between cin and any other iostream.
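A small sketch of the consequences, using two hypothetical helpers (untie_cin and prompt are not library functions): tie(nullptr) hands back the stream cin used to be tied to, and once the tie is gone, prompts have to be flushed by hand.

```cpp
#include <iostream>
#include <string>

// Hypothetical helper: break the tie and return the stream cin was
// previously tied to (cout, unless something changed it earlier).
std::ostream* untie_cin()
{
    return std::cin.tie(nullptr);
}

// Hypothetical helper: with the tie gone, a read on cin no longer
// flushes pending output, so a prompt has to flush explicitly.
void prompt(std::ostream& out, const std::string& text)
{
    out << text << std::flush;
}
```

In the sample program, only the thread that owns cout would ever call something like prompt; main keeps reading from cin without touching cout behind the scenes.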
Note: in practice, printing a \n to the terminal usually flushes the stream, because of the synchronization with stdio. This is why, in the example above, we first had to disable that synchronization to exhibit the problem of ‘output only appearing after being asked for input’.
Is that the whole solution?
Reading up on various bugs filed on iostreams operating in unsynchronized mode, it appears that wise users will stick to synchronized streams for their multithreaded programs. It is far too easy to stumble when you forgo the promise from [iostream.objects.overview].
Note however that even synchronized streams may deliver interleaved output, either on a character by character basis or by emitting whole chunks of lines mixed together. There is no guarantee at all that:
cout << "Hello user #" << userno << ", welcome to " << host << endl;
... will actually be emitted without interleaving with other concurrent invocations of this line. You may easily see “Hello user #Hello user #123987,,” as valid output.
Using clog will also not help: there is no way for it to emit the various components of a log line atomically. You will need a lock.
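One common way to arrange such a lock, sketched here under my own assumptions (LogLine and greet_concurrently are hypothetical names, and example.org is a placeholder host), is to build each line in a private buffer and hand it to the shared stream in a single locked step.

```cpp
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: collects one log line in a private buffer and
// writes it to the target stream under a lock when it is destroyed.
class LogLine {
public:
    explicit LogLine(std::ostream& out) : out_(out) {}
    LogLine(const LogLine&) = delete;
    LogLine& operator=(const LogLine&) = delete;

    ~LogLine()
    {
        std::lock_guard<std::mutex> lock(mutex());
        out_ << buf_.str() << '\n';
        out_.flush();
    }

    template <typename T>
    LogLine& operator<<(const T& value)
    {
        buf_ << value;  // no lock needed: buf_ belongs to this object only
        return *this;
    }

private:
    static std::mutex& mutex()
    {
        static std::mutex m;  // one lock shared by all LogLine objects
        return m;
    }

    std::ostream& out_;
    std::ostringstream buf_;
};

// Demo: `threads` workers each log `per_thread` greetings to `out`.
void greet_concurrently(std::ostream& out, int threads, int per_thread)
{
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&out, t, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                LogLine(out) << "Hello user #" << t << ", welcome to " << "example.org";
        });
    }
    for (auto& w : workers)
        w.join();
}
```

The whole line reaches the stream while the lock is held, so lines from different threads can no longer be mixed. This only holds as long as nothing else writes to the same stream without going through LogLine.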
Summarising
Be very careful when using std::ios_base::sync_with_stdio(false), and if you do, also issue cin.tie(nullptr). Make sure sync_with_stdio is called before doing any i/o.
In general, be very wary of doing output operations on a single iostream from multiple threads: it may not do what you want.
Some further reading:
- The libstdc++ bug I filed about this, where it will likely be concluded this is (unfortunately) not a bug, but intended behaviour
- The {fmt} library is a simpler alternative for rapidly outputting text. Typically faster than printf.