Read My Buffer

EDITED BY: PROF. GARTH SANTOR

Back in the day, villages were built around a water source such as a river or a pond, and people would go straight to that water source for their necessities. Later on, however, with expansion and population increase, villages needed to be built farther away from any water. The solution to that was using buckets and pails to transport the water from its source to the homes.

Similarly, programs have two methods of retrieving information. Either the source is local to the program and can be accessed directly, or it resides in a remote location and will need to be transported using virtual buckets. And that’s exactly what buffers are. A buffer is a memory region in which data is temporarily stored, until it is handled. Whether that source is an input stream or a socket connection, a buffer is required to temporarily store the incoming data until it is extracted by the program.

At a lower level, a microcontroller’s buffer is a register attached to an I/O (input/output) pin. Depending on the microcontroller family, that register can hold anywhere from one character (8 bits, or 1 byte) to several (up to 64 bits, or 8 bytes). If the program does not extract this data before new data comes in, the old data will be lost, or the communication interface may stop receiving altogether because of a buffer overflow.

Modern computers offer much larger buffers, situated in RAM where our programs can access them. Programming languages also offer a multitude of functions that allow easy access to and manipulation of buffers, using what’s called a stream. Going back to the water retrieval example, streams can be considered the straws or tubes that transport water from the bucket to your cup.

The simplest and most well-known buffer interaction programmers have is with the keyboard, whose input is displayed on the console. Learning any programming language requires some knowledge of console I/O, both to interact with the program and to get feedback on its execution. We’ll apply the same water source analogy here. The keyboard is the source, which is linked to a space in RAM that acts as its buffer. Our program then reads from that buffer using an input stream. Similarly, an output stream is hooked to a buffer that gets printed to the console by the operating system.

In the above demonstration, user input is stored in the keyboard buffer. scanf_s waits on that buffer, through a stream, until a newline character arrives (ASCII 0x0A; 0x0D is the carriage return), and then extracts the buffer contents. The stream that scanf_s reads from, stdin, is a pointer (a FILE * in C) that is connected to the keyboard buffer by default. The evidence of this is the ability to run that same code with redirected input, using the command-line redirection operator, "<". Follow the animation below:

The same concept can be applied to output: the default output stream is tied to the console buffer, and can be redirected from the command line with a “>”.

This is beyond the scope of this article, but input can also be redirected with pipes, which are one of the ways to achieve inter-process communication.

With those animations in mind, we can see the difference in execution between scanf_s and getchar. scanf_s retrieves data according to a format specifier and stops reading once that format has been satisfied, which typically happens at whitespace such as a newline character, or when it reaches end of file (EOF). getchar, on the other hand (and as the name suggests), retrieves one character from the buffer at a time. That character is removed from the buffer and exists only within your program. The remaining characters stay in the buffer, where they can either be read by one of the stdio functions or be discarded by flushing the buffer. To put a character back into the buffer, ungetc can be called (putc, despite its name, writes a character to an output stream).