Friday, November 4, 2011

A Low Level File-copy Program in C language

Earlier we had written a program to copy the contents of one file to another. In that program we read the file character by character using fgetc( ), and wrote each character into the target file using fputc( ). Instead of performing the I/O one character at a time, we can read a chunk of bytes from the source file into a buffer and then write this chunk from the buffer into the target file. While doing so we manage the buffer ourselves, rather than relying on the library functions to do so. This is what is low-level about this program. Here is a program which shows how this can be done.
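/* File-copy program which copies text, .com and .exe files */

#include <stdio.h>   /* printf( ), puts( ), fgets( ) */
#include <stdlib.h>  /* exit( ) */
#include <string.h>  /* strcspn( ) */
#include <io.h>      /* open( ), read( ), write( ), close( ) under Turbo C */
#include "fcntl.h"
#include "types.h" /* if present in sys directory use
"c:\\tc\\include\\sys\\types.h" */
#include "stat.h" /* if present in sys directory use
"c:\\tc\\include\\sys\\stat.h" */

int main ( int argc, char *argv[ ] )
{
    char buffer[ 512 ], source [ 128 ], target [ 128 ] ;
    int inhandle, outhandle, bytes ;

    printf ( "\nEnter source file name" ) ;
    /* fgets( ) limits the input length, which gets( ) cannot do */
    fgets ( source, sizeof ( source ), stdin ) ;
    source[ strcspn ( source, "\n" ) ] = '\0' ;  /* strip the newline */
    inhandle = open ( source, O_RDONLY | O_BINARY ) ;
    if ( inhandle == -1 )
    {
        puts ( "Cannot open file" ) ;
        exit ( 1 ) ;
    }

    printf ( "\nEnter target file name" ) ;
    fgets ( target, sizeof ( target ), stdin ) ;
    target[ strcspn ( target, "\n" ) ] = '\0' ;
    outhandle = open ( target, O_CREAT | O_BINARY | O_WRONLY, S_IWRITE ) ;
    if ( outhandle == -1 )  /* check the target handle, not the source */
    {
        puts ( "Cannot open file" ) ;
        close ( inhandle ) ;
        exit ( 1 ) ;
    }

    /* copy one chunk at a time until read( ) reports end of file */
    while ( 1 )
    {
        bytes = read ( inhandle, buffer, 512 ) ;
        if ( bytes > 0 )
            write ( outhandle, buffer, bytes ) ;
        else
            break ;
    }

    close ( inhandle ) ;
    close ( outhandle ) ;
    return 0 ;
}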

Declaring the Buffer

The first difference that you will notice in this program is that we declare a character buffer,
char buffer[512] ;
This is the buffer in which the data read from the disk will be placed. The size of this buffer is important for efficient operation. Depending on the operating system, buffers of certain sizes are handled more efficiently than others. Sizes that are a multiple of the disk sector size, such as the 512 bytes used here, are a common choice.
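To get a feel for why the buffer size matters, one can count how many read( ) calls a copy needs. The following is only a rough sketch: the names reads_needed( ) and show_reads( ) are hypothetical, and it assumes every read( ) except the last one fills the buffer completely.

```c
#include <stdio.h>

/* Hypothetical helper: number of read( ) calls that return data when a
   file of file_size bytes is read in chunks of buffer_size bytes.
   This is file_size / buffer_size, rounded up. */
long reads_needed ( long file_size, long buffer_size )
{
    return ( file_size + buffer_size - 1 ) / buffer_size ;
}

/* Prints the call count for a few buffer sizes. */
void show_reads ( long file_size )
{
    long sizes[ ] = { 1, 512, 4096 } ;
    int i ;

    for ( i = 0 ; i < 3 ; i++ )
        printf ( "buffer of %ld bytes: %ld reads\n",
                 sizes[ i ], reads_needed ( file_size, sizes[ i ] ) ) ;
}
```

For a 1,000,000 byte file, reads_needed( ) gives 1954 calls with a 512 byte buffer and 245 calls with a 4096 byte buffer, against one million calls when reading a byte at a time.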

Opening a File

We have opened two files in our program, one is the source file from which we read the information, and the other is the target file into which we write the information read from the source file.
As in high level disk I/O, the file must be opened before we can access it. This is done using the statement,
inhandle = open ( source, O_RDONLY | O_BINARY ) ;
We open the file for the same reason as we did earlier, to establish communication with the operating system about the file. As usual, we have to supply open( ) with the filename and the mode in which we want to open the file. The possible file opening modes are given below:

O_APPEND  - Opens a file for appending
O_CREAT   - Creates a new file for writing (has no effect
            if file already exists)
O_RDONLY  - Opens a file for reading only
O_RDWR    - Opens a file for both reading and writing
O_WRONLY  - Opens a file for writing only
O_BINARY  - Opens a file in binary mode
O_TEXT    - Opens a file in text mode

These ‘O-flags’ are defined in the file “fcntl.h”, so this file must be included in any program that uses low level disk I/O. The file “stdio.h” is not needed for the low level calls themselves, though a program that also calls printf( ) or puts( ), as ours does, should include it. When two or more O-flags are used together, they are combined using the bitwise OR operator ( | ). Chapter 14 discusses bitwise operators in detail.
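Combining flags with | works because each flag occupies its own bit or bits in an integer. A small sketch of the idea follows; has_flag( ) and target_mode( ) are hypothetical names used only for illustration, not library functions.

```c
#include <fcntl.h>

/* Hypothetical helper: returns 1 if every bit of 'flag' is set in
   'mode', 0 otherwise. Note that on some systems O_RDONLY is defined
   as 0, so this test is only meaningful for non-zero flags. */
int has_flag ( int mode, int flag )
{
    return ( mode & flag ) == flag ;
}

/* Builds a combined mode of the kind the copy program passes to open( ). */
int target_mode ( void )
{
    return O_CREAT | O_WRONLY ;
}
```

Here has_flag ( target_mode( ), O_CREAT ) is true, while has_flag ( target_mode( ), O_APPEND ) is false, since O_APPEND was never ORed in.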
The other statement used in our program to open the file is,
outhandle = open ( target, O_CREAT | O_BINARY | O_WRONLY, S_IWRITE ) ;
Since the target file does not exist when it is opened, we have used the O_CREAT flag. Since we want to write to the file and not read from it, we have used O_WRONLY. Finally, since we want to open the file in binary mode, we have used O_BINARY. If the target might already exist, the O_TRUNC flag can be added to discard its old contents.
Whenever the O_CREAT flag is used, another argument must be added to the open( ) function to indicate the read/write status of the file to be created. This argument is called the ‘permission argument’. Permission arguments could be any of the following:

S_IWRITE - Writing to the file permitted
S_IREAD - Reading from the file permitted
To use these permissions, both the files “types.h” and “stat.h” must be #included in the program along with “fcntl.h”.
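The permission argument can be sketched as below. This is an illustration under stated assumptions, not the program's own code: create_target( ) is a hypothetical helper, the #ifndef lines are a portability shim for compilers that lack O_BINARY, S_IREAD or S_IWRITE, and on POSIX systems close( ) comes from unistd.h where Turbo C would use io.h.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>   /* close( ); Turbo C declares it in io.h instead */

/* Assumption: O_BINARY exists only on DOS-style compilers, so it is
   defined as 0 (no effect) where the headers do not provide it. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Assumption: fall back to the POSIX names where the older ones are missing. */
#ifndef S_IREAD
#define S_IREAD S_IRUSR
#endif
#ifndef S_IWRITE
#define S_IWRITE S_IWUSR
#endif

/* Hypothetical helper: creates 'name' if needed and opens it for
   writing, permitting both reading and writing of the new file.
   Returns the file handle, or -1 on failure. */
int create_target ( const char *name )
{
    return open ( name, O_CREAT | O_WRONLY | O_BINARY,
                  S_IREAD | S_IWRITE ) ;
}
```

The two permissions are combined with | in the same way as the O-flags.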

File Handles

Instead of returning a FILE pointer as fopen( ) did, in low level disk I/O, open( ) returns an integer value called ‘file handle’. This is a number assigned to a particular file, which is used thereafter to refer to the file. If open( ) returns a value of -1, it means that the file couldn’t be successfully opened.
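A check of the handle can also report why the open failed. The sketch below assumes that open( ) sets errno on failure, as POSIX specifies; open_source( ) is a hypothetical helper, and O_BINARY is defined as 0 where the compiler does not provide it.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>

/* Assumption: O_BINARY is a DOS-style flag; use 0 where it is missing. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: opens 'name' for reading and returns its file
   handle. On failure it prints the reason and returns -1. */
int open_source ( const char *name )
{
    int handle = open ( name, O_RDONLY | O_BINARY ) ;

    if ( handle == -1 )
        fprintf ( stderr, "Cannot open %s: %s\n", name, strerror ( errno ) ) ;
    return handle ;
}
```

The caller still compares the returned handle with -1 before using it, exactly as the program above does.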

Interaction between Buffer and File

The following statement reads the file or as much of it as will fit into the buffer:
bytes = read ( inhandle, buffer, 512 ) ;
The read( ) function takes three arguments. The first argument is the file handle, the second is the address of the buffer and the third is the maximum number of bytes we want to read.
The read( ) function returns the number of bytes actually read. This is an important number, since it may very well be less than the buffer size (512 bytes), and we will need to know just how full the buffer is before we can do anything with its contents. In our program we have assigned this number to the variable bytes.
To copy the file, we use both the read( ) and the write( ) functions in a while loop. On each pass, bytes will be equal to the buffer size (512 bytes) until the last chunk of the file, when the buffer will only be partially full. The variable bytes is therefore passed to write( ) to tell it how many bytes to write from the buffer to the target file. Once the end of file is reached, read( ) returns 0 and the loop ends.
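The same loop can be written a little more defensively. The sketch below is one possible variant, not the program's own code: copy_handles( ) is a hypothetical helper that also checks the value returned by write( ), which may be smaller than the count requested, and it assumes the POSIX header unistd.h where Turbo C would use io.h.

```c
#include <fcntl.h>
#include <unistd.h>   /* read( ), write( ); Turbo C uses io.h instead */

/* Hypothetical helper: copies everything from inhandle to outhandle.
   Returns the total number of bytes copied, or -1 on error. */
long copy_handles ( int inhandle, int outhandle )
{
    char buffer[ 512 ] ;
    long total = 0 ;
    int bytes, written, done ;

    while ( ( bytes = read ( inhandle, buffer, sizeof buffer ) ) > 0 )
    {
        /* write( ) may accept fewer bytes than asked for, so keep
           writing until the whole chunk has gone out */
        done = 0 ;
        while ( done < bytes )
        {
            written = write ( outhandle, buffer + done, bytes - done ) ;
            if ( written <= 0 )
                return -1 ;
            done += written ;
        }
        total += bytes ;
    }

    /* read( ) returns 0 at end of file and -1 on error */
    return ( bytes < 0 ) ? -1 : total ;
}
```

The caller opens and closes both handles, so the helper can be used with any pair of files.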

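Note that a large buffer should be declared as a global variable rather than inside a function. A local array is allocated on the stack, and a buffer larger than the available stack space causes a stack overflow.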
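Keeping a large buffer off the stack might look like the sketch below. The size BIG_SIZE and both helper names are hypothetical. The second helper uses malloc( ), an alternative that is not discussed in this article but also avoids the stack.

```c
#include <stdlib.h>

#define BIG_SIZE 16384   /* hypothetical size, chosen for illustration */

/* Option 1: a buffer declared outside any function lives in static
   storage, not on the stack. */
static char big_buffer[ BIG_SIZE ] ;

/* Returns the address of the static buffer and stores its size. */
char *get_static_buffer ( unsigned *size )
{
    *size = sizeof big_buffer ;
    return big_buffer ;
}

/* Option 2: a buffer taken from the heap. The caller must free( ) it.
   Returns NULL if the allocation fails. */
char *get_heap_buffer ( unsigned size )
{
    return malloc ( size ) ;
}
```

Either buffer can then be passed to read( ) and write( ) in place of the local array.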
