Why is reading lines from stdin much slower in C++ than Python?
So, you compared reading lines from standard input (stdin) in Python and in C++, and you were surprised to see your C++ code run much slower than the equivalent Python code. Don't worry, you're not alone! Many developers have hit this issue and wondered why C++ lags behind Python here. In this blog post, we'll dive into the reasons behind the performance difference, show how to fix it, and walk through a benchmark that showcases the improvement.
Understanding the Problem
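The root cause of this performance difference is a default setting of the C++ standard streams, controlled by std::ios_base::sync_with_stdio. By default, C++ synchronizes its standard streams (cin, cout, cerr, clog) with the C standard IO streams (stdin, stdout, stderr). This synchronization lets you mix C++ and C IO calls, such as std::cout and printf, on the same stream and still get the output in the order you wrote it. However, it comes at a cost: while synchronized, standard library implementations typically leave cin unbuffered and forward every operation to the underlying C stream, so input is consumed in tiny pieces instead of large blocks.
Python, on the other hand, has no such coupling to maintain. sys.stdin reads through a buffered layer, so it pulls input from the operating system in large chunks and splits it into lines in memory, which is why the straightforward Python loop ends up faster than the straightforward C++ one.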
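The default described above can be observed directly, because sync_with_stdio returns the setting that was in effect before the call. The sketch below is only an illustration: the helper names disable_stdio_sync and print_mixed are made up for this post and are not part of any library.

```cpp
#include <cstdio>
#include <iostream>

// Hypothetical helper: turns synchronization off and reports whether it was
// on beforehand. sync_with_stdio returns the previous setting, and that
// setting starts out as true.
bool disable_stdio_sync() {
    return std::ios_base::sync_with_stdio(false);
}

// Hypothetical helper: mixes C and C++ output on stdout. While the streams
// are still synchronized, the three pieces appear in the order written.
// After synchronization is disabled, that ordering is no longer guaranteed.
void print_mixed() {
    std::printf("a");
    std::cout << "b";
    std::printf("c\n");
}
```

The first call to disable_stdio_sync() in a program returns true, which confirms that synchronization is on unless you switch it off yourself.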
Easy Solutions
Now that we know the cause, let's explore some easy solutions to make C++ reading from stdin faster:
Solution 1: Disabling Synchronization
The simplest solution is to disable the synchronization between C++ and C IO operations. You can achieve this by adding the following line at the start of main, before your program performs any input or output:
std::ios_base::sync_with_stdio(false);
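This line turns synchronization off, which allows cin to use its own buffer and read input in large blocks. This change alone can significantly improve the performance of reading lines from stdin in C++. The trade-off is that you should no longer mix C++ streams and C stdio functions on the same stream, because their buffers are now independent and the results can arrive out of order.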
Solution 2: Using fgets
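Alternatively, you can use the C function fgets instead of std::getline to read lines from stdin. fgets reads straight from the buffered C stream, so it avoids the synchronized C++ stream layer entirely and is typically faster than std::getline in the default configuration. Note that fgets fills a fixed-size buffer: a line longer than the buffer comes back in several pieces, so the code below only counts a line when a piece ends with a newline. A final line without a trailing newline is therefore not counted. Here's an example of how you can modify your C++ code to use fgets: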
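#include <iostream>
#include <ctime>
#include <cstdio>
#include <cstring>

int main() {
    char input_line[256];
    long line_count = 0;
    time_t start = time(NULL);
    int sec;
    int lps;

    // fgets stores at most sizeof(input_line) - 1 characters per call,
    // so a long line arrives in several chunks.
    while (fgets(input_line, sizeof(input_line), stdin)) {
        size_t len = strlen(input_line);
        // Count a line only when the chunk ends with its newline.
        // The len > 0 check avoids reading input_line[-1] on odd input.
        if (len > 0 && input_line[len - 1] == '\n')
            line_count++;
    }

    // Subtract first, then cast, so the cast applies to the elapsed time.
    sec = (int) (time(NULL) - start);
    std::cerr << "Read " << line_count << " lines in " << sec << " seconds.";
    if (sec > 0) {
        lps = line_count / sec;
        std::cerr << " LPS: " << lps << std::endl;
    } else
        std::cerr << std::endl;
    return 0;
}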
By using fgets, you bypass the overhead of C++ IO streams and achieve better performance.
The Results 💨
Let's showcase the impact of these solutions with a benchmark. We'll compare the performance of the default C++ code, the C++ code with synchronization disabled, and the Python code. The benchmark was performed on a MacBook Pro running Mac OS X v10.6.8 (Snow Leopard). Here are the results:
Default C++ code: Read 5,570,001 lines in 9 seconds. LPS: 618,889
C++ code with synchronization disabled: Read 5,570,001 lines in 1 second. LPS: 5,570,000
Python code: Read 5,570,000 lines in 1 second. LPS: 5,570,000
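As you can see, disabling synchronization brings the C++ run down from 9 seconds to about 1 second, which puts it on par with Python. The fgets variant was not timed in this benchmark. Keep in mind that the times are reported in whole seconds, so these figures show the scale of the improvement rather than a precise ratio.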
Recap and Call-to-Action ✍️
Reading lines from stdin in C++ is slower than Python by default because cin is synchronized with C stdio.
To fix the performance issue, disable synchronization using std::ios_base::sync_with_stdio(false) before any IO takes place.
Alternatively, you can use the C function fgets to read lines without going through C++ streams.
Benchmark results show that with synchronization disabled, C++ matches Python's performance.
Now that you know how to improve the performance of reading lines from stdin in C++, give it a try in your own projects. Share your experience in the comments below and let's empower fellow developers to write faster and more efficient code! 💪🚀