Linux file Searching, Replacing, Copying

         File searching, replacing, copying...

A. Third Edition
This is comp444 lab1 and it is concentrated on file operations such as searching, replacing, copying...
I want to keep this simple version for possible future modification. I have to backup my work because it is 
very dangerous in debugging such that there are two occasions that I almost destroyed my source code by running
the program in my source code directory.
B.The problem

COMP 444

System Software Design

Programming Assignment One

Due Date: midnight on Thursday, February 3

In this assignment, you¡¯ll get a chance to utilize a number of the calls that we¡¯ve looked at so far.

In addition, you¡¯ll have an opportunity to practice your C programming¡­since a number of you

are no doubt a little rusty at this point.

Objective: You work for a company that is named after the company founder. Unfortunately, the

company founder is named Stanley G. Moneyloser. Consultants have been contacted and after

great expense have determined that the company¡¯s lagging sales may be associated with the fact

that buying products from the Moneyloser Corporation may not be very appealing to potential

customers. Consequently, Stanley has decided to change the name to the Moneymaker

Corporation.

Unfortunately, there are thousands of files on the company¡¯s UNIX server that have the name

Moneyloser in them. Rather than trying to track all of them down and change them manually,

Stanley has asked you to write some kind of application that can do this automatically (Yes, I

realize that, in practice, this would probably be done with Perl). Based upon some serious

thought, the following requirements have been specified:

1. The application will take one command line argument: the name of a directory in which to

start the search.

2. The search will be recursive. As you find new directories inside the current directory, you will

have to visit them as well.

3. Symbolic links may exist within any directory. You should be able to handle the symbolic links

properly.

4. In terms of processing, your job will be to open all regular files and read their contents.

Remember, Unix treats all flies the same way ¨C text or binary.

5. You will be looking for the following sequence of 10 characters: ¡°Moneyloser¡±. You don¡¯t want

to just look for ¡°Money¡± since this may occur in places that have nothing to do with the

company name. Note: there are many ways to search for patterns. Some techniques ¨C like

regex matching ¨C can be quite powerful. However, in the current case, the matching can be

carried out simply by taking each consecutive sequence of 10 bytes and comparing it to the

Moneyloser ¡°test¡± string.

6. If a match is found in one or more files in the current directory, you will carry out the following

steps:

A. Create a new directory called currentdirector_oldfiles. For example, if the current

directory is called ¡°foo¡±, the new directory will be called foo_oldfiles. It should be created

as a sub-directory of foo (e.g., foo/foo_oldfiles). The new directory must be given

permissions equivalent to ¡°rwxrwx---¡°.

B. Create a file called mod.log in the oldfiles directory. It should be given permissions of ¡°rw-

------¡°. In fact, all files stored to the oldfiles directory should have these same permissions.

C. For each occurrence of ¡°Moneyloser¡± found in a given file (there can be many), write a

line to mod.log that looks like ¡°filename changed at #¡±, where # is the byte offset of the

text to be changed. As a concrete example, you might have something like ¡°abc.txt

changed at 167¡±.

¡¡

C.The idea of program
¡¡

From point of view of compiler project, it is a toy scanner with much simplified functionality. But it is

difficult for me to debug c in Linux. Another reason is the awkward C grammar. There is no "bool" type, there

is no function overloading, there is no ...

name: Qingzhe Huang
ID: 5037735

file list:			functionality summary                
	main.c			the main program  
	myhead.h		my own head file including all necessary header files
	errHandle.c		a simple error handle module can be complicated in future
	fileBuf.h		header of a FILE-like structure, I prefer to implementing my own buffered I/O
	fileBuf.c		my own buffered I/O, mimic C-Library like getc, fopen, fclose etc.
	searchString.h		major implementation is here, like traverse directory, handle log etc.
	searchString.c		handle log, creating directory, handle repeating directory, etc.
	dynaArray.h		my own version of self-incremental array, basically can be unlimited, any type of node
	dynaArray.c		user must supply both compare and eqaul function pointer to achieve dynamic array
	makefile		make file backup souce code to upper directory when finished compiling

major features:
	a) I implement my own version of fgetc, fputc, fopen, fseek, fclose, fflush, itoa etc, so that they can be used as independent
	library in future. That is one reason why this lab takes me more time than expected.

	b) I implement my own version of dynamic array which can self-increment when more elements are added. Also, it maintains order
	when new elements are inserted. So, I think my way has the highest efficiency compared any sorting methods because elements are
	sorted when they are copied into array instead of sorting after all are copied into array. That is why I didn't use any sorting
	library in this assignment. It is very handy to bookkeep what directory has been searched, and it is very easy to sort file 
	records and avoid overwrite file searching records into "mod.log" file. The basic idea is that whenever you insert a new node
	(It is passed as void*, so, it can be a pointer to anything, integer, char*, struct etc.) into array, you have also pass two
	function pointer as parameter. These function pointer takes two paramenters, both void*. One function is to compare two 
	elements and return "TRUE" when first is smaller than second element. The other function pointer returns "TRUE" when two 
	elements are equal. Therefore, when inserting the new element, the dynamic array will search and insert it into appropriate 
	place so that array is in ascending order. Then you don't need to sort them any more.

	c) Generally speaking, I assume serious mistakes happen when any system call fails. And I handle them very simply by exiting 
	program. One only exception is that "creating cur_oldfiles directory" which maybe created before. Another trivial one is that 
	soft link might point to some invalid file or directory and I simply ignore them when this happens.

	d) Originally I designed one highest-efficiency-algorithm by searching words, replacing words, copying old files during one 
	pass. Later I realized there is some logical difficulty handling these three jobs during one pass as they are more clear-cut in
	logic. So, I sacrifice efficiency by doing them in three passes. As a result, my program can search any capital or small case 
	combinations of "MoNeyLosER", but I can only replace them with a fixed string "Moneymaker" since my program only records the 
	file offset when spotting them. 

	e) My searching key words algorithm is based on idea of Deterministic Finite Automaton(DFA) without using any fancy library.
	Because I think this has the highest efficiency and reliability. Any algorithm must read character one by one internally. 
	Therefore instead of copying overhead, my program recognize "moneyloser" by repeating calling "mygetc".

	f) Generally speaking, my program should be able to handle arbitrarily depth of directory since my storage array are all self-
	incremental. However, Linux system has certain limits on length of path, I also follow this. However, for number of files or
	directories under one directory, there is no limit.

	g) Instead book-keeping the actual path of directory searched, I record its inode number in my dynamic array which is space-
	saving and fast and reliable when soft link is involved in the name of path.

	h) There is one known limit for soft link. It is malicious circular link. i.e. softlink1 pointing to softlink2 and softlink2 
	pointing to softlink1. There is no system call can get around this problem.

	i) The main program only accept one parameter as directory name. When it is invalid, program gives error messange and exits.
	
	j) If file name contains "hereisMonEyLOsErname", then my program will replace the string with "hereisMoneymakername" and if 
	in the file there exists string "MoneYLosER", it will also replace with "Moneymaker".
	
	k) If the "curdirectory_oldfiles" doesn't contain a file named "mod.log", I will assume the directory is created by user. 
	Otherwise I will not search the directory. 

major self-defined data structures and data types:
	a) defined in "fileBuf.h" and it is a "FILE-like" structure which records all buffered-I/O information.
	struct FileBuf
	{
        	int fd;
        	int size;
        	int offset;
        	int cur;
        	char buf[BuffSize];
	};
	typedef struct FileBuf* FilePtr;
	b) defined in "dynaArray.h" and it is structure which stores all necessary information of self-incremental array. 
	
	//each is a pointer
	struct DynaArray
	{
        	void** ptr;
        	int len;//max size, the index of last node
        	int cur;//current position, relative index of last node
        	int fold;
	};

	typedef struct DynaArray* DynaArrayPtr;

	c) defined in "searchString.h" and it is used to store the searching result. 
	struct FileLog
	{
        	char fileName[MaxNameLength];
        	int offset;
	};

	typedef struct FileLog Log;

	

	

D.The major functions
major self-defined data structures and data types:
        a) defined in "fileBuf.h" and it is a "FILE-like" structure which records all buffered-I/O information.
        struct FileBuf
        {       
                int fd;
                int size;
                int offset;
                int cur;
                char buf[BuffSize];
        };
        typedef struct FileBuf* FilePtr;
        b) defined in "dynaArray.h" and it is structure which stores all necessary information of self-incremental array.
        
        //each is a pointer
        struct DynaArray
        {       
                void** ptr;
                int len;//max size, the index of last node
                int cur;//current position, relative index of last node
                int fold;
        };
        
        typedef struct DynaArray* DynaArrayPtr;
        
        c) defined in "searchString.h" and it is used to store the searching result.
        struct FileLog
        {       
                char fileName[MaxNameLength];
                int offset;
        };
 	typedef struct FileLog Log;
E.Further improvement
¡¡
F.File listing
1. myhead.h
2. searchString.h
3. searchString.c
4. main.c
5. errHandle.c
6. dynaArray.h
7. dynaArray.c
8. fileBuf.h
9. fileBuf.c
10. makefile
¡¡
¡¡
file name: myhead.h
#ifndef MYHEAD_H
#define MYHEAD_H

#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <dirent.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>


//define three mode and it is purely like
// DFA=Deterministic Finite Automaton
#define FindNext        0
#define Comparing       1

typedef int bool;



#define TRUE  1
#define FALSE 0

//for a particular file NAME, NOT path
//const int MaxNameLength=20;
#define MaxPathLength 128

#define MaxNameLength  20 

#define MaxIntLength   20

#define BuffSize   128

void errHandle(char* msg);
	
#endif


¡¡
file name: searchString.h
#ifndef SEARCHSTRING_H
#define SEARCHSTRING_H

#include "myhead.h"
#include "fileBuf.h"
#include "dynaArray.h"
#include <utime.h>

#define MaxPathLength 128

struct FileLog
{
	char fileName[MaxNameLength];
	int offset;
};

typedef struct FileLog Log;

bool findString(char* fileName, char* str, DynaArrayPtr arrayPtr);


//search in a file with my own file information struct FileBuf*, 
//target is the string for searching
//repeating call this function and return each location, 
//return -1 indicating end of file
int searchString(FilePtr stream, char* target);

bool compLog(void* first, void* second); 
bool equalLog(void* first, void* second);


Log* createLog(char* theName, int pos);

//I have to admit that my code is a little bit too fancy to show off
//however, I do try to write a general directory traverse function,
//but I later found out I have to make it specific for this particular assignment
//a general dir traverse function
void dirTraverse(char* path, DynaArrayPtr dirList);

void fileHandler(char* name, DynaArrayPtr arrayPtr);




#endif


file name: searchString.c
#include "searchString.h"

//funny, isn't it?
char* moneyLoser="Moneyloser";
char* moneyMaker="Moneymaker";

static bool renameFile(char* path, char* oldName);


void displayLog(DynaArrayPtr arrayPtr)
{
	int index;
	Log* log;
	for (index=0; index<arrayPtr->cur; index++)
	{
		log=(Log*)(arrayPtr->ptr[index]);
		//printf("file name: %s ; offset: %d\n", log->fileName, log->offset);
	}
}
		

char* getFileName(char* path)
{
	char* ptr=path;
	while (*ptr!='\0')
	{
		ptr++;
	}
	while (*ptr!='/')
	{
		ptr--;
	}
	ptr++;//pass over /
	return ptr;
}

bool findString(char* fileName, char* str, DynaArrayPtr arrayPtr)
{
	int fd, n=0;
	FilePtr stream;
	Log* log;
	bool result=FALSE;
	char* chPtr;
	if ((fd=open(fileName, O_RDONLY))<0)
	{
		printf("should be permission denied?\n");
		errHandle("cannot open searching files\n");
	}
	if ((stream=myfopen(fd))==NULL)
	{
		errHandle("openfile error\n");
	}
	chPtr=getFileName(fileName);
	while ((n=searchString(stream, str))!=-1)
	{
		result=TRUE;

		log=createLog(chPtr, n);

		addOne(arrayPtr, (void*)log, compLog, equalLog);
		//printf("test: find target at %d\n", n);
	}
	return result;
}



//starting a particular file offset and search the target string
//if found return the file offset of target, else return -1
int searchString(FilePtr stream, char* target)
{
	char ch;
	int result=-1;//this is the starting offset of word
	//the length of word compared
	int index, length;
	int mode;
	index=0;
	length=strlen(target);
	mode=FindNext;//initial status
	//-1 means EOF
	while ((ch=mygetc(stream))!=-1)
	{
		switch (mode)
		{
                case Comparing:
                        if (issamechar(ch, target[index]))
                        {
                                index++;
                                if (index==length)
	                        {
					return result;
                                }
                        }
                        else
                        {
				if (issamechar(ch, target[0]))
				{
					index=1;
					result=myftell(stream);
				}
				else
				{
                                	index=0;
					result=-1;//not found
                                	mode=FindNext;
				}
			}
			break;
		case FindNext:
			if (issamechar(ch, target[0]))
			{
				index=1;
				result=myftell(stream)-1;
				mode=Comparing;
			}
			break;
		}//switch
	}
	return result;
}


bool compLog(void* first, void* second)
{
	int result;
	Log* log1, *log2;
	log1=(Log*)first;
	log2=(Log*)second;
	if ((result=strcmp(log1->fileName, log2->fileName))==0)
	{
		//strictly less
		return log1->offset < log2->offset;
	}	
	else
	{
		return result < 0;
	}
}


/*		
//searching and copying, replacing for one single pass, it is the highest efficiency
bool searchReplace(char* sourceFile, char* targetFile, char* strSource, char* strTarget, DynaArrayPtr arradyPtr)
{
	int fdsource, int fdtarget;
	FilePtr source, target;
	char ch;
	Log* log;
	int mode, index, length;
	bool result=FALSE;
	length=strlen(strSource);
	if ((fdSource=open(sourceFile, O_RDWR))<0)
        {               
                errHandle("cannot open source file\n");
        }
        if ((fdTarget=open(targetFile, O_WRONLY))<0)
        {
                errHandle("cannot open target file\n");
        }

	source=myfopen(fdsource);
	target=myfopen(fdtarget);
	mode=FindNext;
	index=0;
	while ((ch=mygetc(source))!=-1)
	{
		myputc(target, ch);//copy always first
		switch (mode)
		{
		case Comparing:
			if (issamechar(ch, strSource[index]))
			{
				index++;
				if (index==length)
				{
					//find match, first rewind source, ready to write back
					myfseek(source, -length);
					//first write down the offset into pointer of dynamic array
					log=createLog(sourceFile, myftell(source));
					addOne(arrayPtr, (void*)log, compLog, equalLog);
					result=TRUE;					
					index=0;
					while (index<length)
					{
						myputc(source, strTarget[index++]);
					}
					//reset state
					index=0;
					mode=FindNext;
				}
			}
			else
			{
				index=0;
				mode=FindNext;
			}
			break;
		case FindNext:
			if (issamechar(ch, strSource[0]))
			{
				index=1;//this is explicit
				mode=Comparing;
			}
			//else remain unchanged
			break;
		}//switch
	}
	return result;
}		
*/


Log* createLog(char* theName, int pos)
{
	Log* result;
	if ((result =(Log*)malloc(sizeof(Log)))==NULL)
	{
		errHandle("create log fails\n");
	}
	strcpy(result->fileName, theName);
	result->offset=pos;
	return result;
}

/*
void writeLog(int fdLog, Log* log)
{
	char offsetBuf[MaxPathLength];//not necessarily so large
	myputs(fdLog, log->fileName);
	itoa(log->offset, offsetBuf, MaxPathLength);
	//myfseek(fdLog, 1);
	myputc(fdLog, ' ');
	myputs(fdLog, offsetBuf);
}
*/	
	
void logHandler(char* logName, DynaArrayPtr arrayPtr)
{
	FilePtr stream;
	int fdLog;
	char pathBuf[MaxPathLength];
	char offsetBuf[MaxPathLength];
	char logBuf[MaxPathLength+MaxIntLength+1];
	DynaArrayPtr fileRec;
	Log* log;
	int index=0, len, newSize=0; 
	if ((fdLog=open(logName, O_RDWR|O_CREAT, S_IRUSR|S_IWUSR))<0)
	{
		errHandle("error when open log files\n");
	}
	stream=myfopen(fdLog);
	fileRec=createArray();
	while (mygets(stream, pathBuf, MaxPathLength)!=-1)
	{
		//myfseek(fdLog, 1);
		if (mygets(stream, offsetBuf, MaxPathLength)==-1)
		{
			errHandle("corrupted mod.log record\n");
		}
		log=createLog(pathBuf, atoi(offsetBuf));
		addOne(fileRec, (void*)log, compLog, equalLog);
	}

	for (index=0; index<arrayPtr->cur; index++)
	{
		addOne(fileRec, arrayPtr->ptr[index], compLog, equalLog);
	}
	if (lseek(stream->fd, 0, SEEK_SET)<0)
	{
		errHandle("lseek error when begin writing log\n");
	}
	for (index=0; index<fileRec->cur; index++)
	{
		log=(Log*)fileRec->ptr[index];
		strcpy(logBuf,log->fileName);
		strcat(logBuf, "  ");
		len=strlen(logBuf);
		
		//starting from '\0'
		if (itoa(log->offset, logBuf+len, MaxIntLength)<0)
		{
			errHandle("itoa error when retrieve file offset\n");
		}
		strcat(logBuf, "\n");
		len=strlen(logBuf);//include the '\0'
		if (len!=write(stream->fd, logBuf, len))
		{
			errHandle("write error when writing file offset\n");
		}
		newSize+=len;
	}
	if (ftruncate(stream->fd, newSize)<0)
	{
		errHandle("ftruncate error when writing log files\n");
	}
	
}
			

bool equalLog(void* first, void* second)
{
        Log* log1, *log2;
        log1=(Log*)first;
        log2=(Log*)second;
        return strcmp(log1->fileName, log2->fileName)==0&&log1->offset==log2->offset;
}

bool renameFile(char* path, char* oldName)
{
	//char tempBuf[MaxPathLength];
        char oldNameBuf[MaxPathLength];
        char newNameBuf[MaxPathLength];
	struct stat statBuf;
	bool result=FALSE;
        int index=0, pos=-1, len=0, targetLen=strlen(moneyLoser);
	while (oldName[index]!='\0')
	{
		if (issamechar(oldName[index], moneyLoser[len]))
		{
			len++;
			if (len==targetLen)
			{
				result=TRUE;//repeating
				break;
			}
			if (!result)
			{
				pos=index;
				result=TRUE;
			}
		}
		else
		{
			//restart
			if (issamechar(oldName[index], moneyLoser[0]))
			{
				len=1;
				result=TRUE;
				pos=index;
			}
			else
			{
				len=0;
				result=FALSE;
			}
		}
		index++;
	}
        if (!result)
        {
                return result;
        }

	//search if there is a file name is moneyloser
        strcpy(oldNameBuf, path);
        strcat(oldNameBuf, "/");
        strcpy(newNameBuf, oldNameBuf);
        strcat(oldNameBuf, oldName);

	/*
	//copy first "pos" part
	strncpy(tempBuf, oldName, pos);
	strcat(tempBuf, moneyMaker);
	strcat(newNameBuf, tempBuf);
	strcpy(tempBuf, oldName+pos+targetLen);
	strcat(newNameBuf, tempBuf);
	*/
	len=strlen(newNameBuf);
	strcat(newNameBuf, oldName);
	len+=pos;
	newNameBuf[len]='\0';
	strcat(newNameBuf, moneyMaker);
	strcat(newNameBuf, oldName+pos+targetLen);
        if (lstat(newNameBuf, &statBuf)==0)
        {
                printf("probably there is already moneyMaker file in directory, so ignore renaming\n");
        }
        else
        {
                if (rename(oldNameBuf, newNameBuf)<0)
                {
        	        printf("rename %s to %s failed,reason unknown and ignore it\n", oldName, newNameBuf);                 
		}
                else
                {
                        //printf("rename %s to moneymaker successful\n", oldName);
			result=TRUE;
                }
        }   
	return result;
}	
	

//this is purely because Mokhov suggest a fancy functionality such that
//multiple directory name can be executed by passing them as parameter
//is it logical?
//I have to admit that my code is a little bit too fancy to show off
//however, I do try to write a general directory traverse function,
//but I later found out I have to make it specific for this particular assignment
//a general dir traverse function
void dirTraverse(char* path, DynaArrayPtr dirList)
{
	int pathLength;
	int index=0;
	char dirName[MaxPathLength];
	DIR* dp;
        int* pDirNum=(int*)malloc(sizeof(int));//I have to make a dynamic variable
	DynaArrayPtr arrayPtr;
	char fileName[MaxPathLength];
	char logName[MaxPathLength];
	struct stat statBuf;
	struct dirent* dirNode;
	pathLength=strlen(path);
	strcpy(fileName, path);
	arrayPtr=createArray();//create a dynamic array for Log of file

	index=pathLength;
	while (path[index]!='/')
	{
		index--;
	}
	index++;
	strcpy(dirName, path);
	strcat(dirName, "/");
	strcat(dirName, path+index);
	strcat(dirName, "_oldfiles");
        strcpy(logName, dirName);
        strcat(logName, "/");  
        strcat(logName, "mod.log");  

        if (stat(path, &statBuf)<0)
        {
                errHandle("stat error for dir\n");
        }
        *pDirNum=statBuf.st_ino;
        addOne(dirList, (void*)pDirNum, compInt, equalInt);

	if ((dp=opendir(path))==NULL)
	{
		errHandle("open dir error\n");
	}
	while ((dirNode=readdir(dp))!=NULL)
	{
		if (strcmp(dirNode->d_name, ".")==0||strcmp(dirNode->d_name, "..")==0)
		{
			continue;
		}
		/*
		//a test
		if (strcasecmp(dirNode->d_name, moneyLoser)==0)
		{
			printf("before \n");
			//strcpy(dirNode->d_name, moneyMaker);
			
			printf("after %s\n", dirNode->d_name);
		}
		*/

		if (renameFile(path, dirNode->d_name))
		{
			rewinddir(dp);
			continue;
		}
		fileName[pathLength]='/';
		fileName[pathLength+1]='\0';
		strcat(fileName, dirNode->d_name);
		//it is safe to open it first with lstat if the softlink point to invalid file
		if (lstat(fileName, &statBuf)<0)
		{
			errHandle("lstat error here for  \n");
		}
		if (S_ISLNK(statBuf.st_mode))
		{
	        	if (stat(fileName, &statBuf)<0)
                	{
                        	printf("<%s> is an invalid softlink, ignore it\n");
				continue;//important!!
                	}
		}
		else
		{
			//reopen
			if (stat(fileName, &statBuf)<0)
			{
				errHandle("stat error for reopen\n");
			}
		}
		switch(statBuf.st_mode & S_IFMT)
		{
		case S_IFREG:
			//handle regular files
			fileHandler(fileName, arrayPtr);
			break;
		case S_IFLNK:
			if (stat(fileName, &statBuf)<0)
			{
				printf("<%s> is an invalid softlink, ignore it\n");
			}
			//softlink handler
			break;
		case S_IFDIR:
                       //recursive call, and check if it is already in our list
                        //inode is the most reliable one to check
		        if (!findOne(dirList, (void*)&statBuf.st_ino, equalInt))
                        {
	          		//printf("here is it and ready for dir %s\n", dirName);
				//printf("comp filename %s vs. olddirname %s\n\n", fileName, dirName);
				if (strcmp(fileName, dirName)==0)//not the cur_oldfiles
				{
					//CHECK IF mod.log can be open
					if (open(logName, O_RDONLY)>=0)//IT IS INDEED CREATED BY ME
					{
						continue;
					}
				}
                             	dirTraverse(fileName, dirList);

                        }
			else
			{
				printf("<%s> is a directory already in list, ignore it\n", fileName);
			}
                        break;
		default:
			//just ignore
			break;
		}//switch
		
	}//while
	//here we need to copy all files
	//printf("ready to handle log %s\n", logName);
	logHandler(logName, arrayPtr);	
}


void fileHandler(char* name, DynaArrayPtr arrayPtr)
{
	char fullPath[MaxPathLength];
	char dirName[MaxPathLength];
	char fileName[MaxPathLength];
	char tempBuf[MaxPathLength];
	char backupName[MaxPathLength];
	int index, dirIndex=0;
	int fdCopy;
	struct stat statBuf;
	struct utimbuf timeBuf;
	if (stat(name, &statBuf)<0)
	{
		errHandle("stat error during file handling\n");
	}
	//printf("before search file %s\n", name);
	//displayLog(arrayPtr);
	if (findString(name, moneyLoser, arrayPtr))
	{
		//test
		//printf("target found in file %s\n", name);
		index=0;
		//copy file name
		while (name[index]!='\0')
		{
			fullPath[index]=name[index];
			index++;
		}
		
		//remove file name
		while (name[index]!='/')
		{			
			index--;
		}
		strcpy(fileName, name+index+1);
		fullPath[index+1]='\0';//include the last '/'
		index--;//pass over the last '/'
		//find current directory name
		while (name[index]!='/')
		{			
			index--;
		}
		index++;//pass over the second '/'
		while (name[index]!='/')
		{
			dirName[dirIndex]=name[index];
			index++;
			dirIndex++;
		}
		dirName[dirIndex]='\0';
		strcat(dirName, "_oldfiles");
		strcpy(tempBuf, fullPath);
		strcat(tempBuf, dirName);

		if (mkdir(tempBuf, S_IRUSR|S_IWUSR|S_IXUSR|S_IXGRP|S_IRGRP|S_IWGRP)<0)
		{
			//this indicate the directory is already there or error?
		}
		strcpy(backupName, tempBuf);
		strcat(backupName, "/");
		strcat(backupName, fileName);//create backup file name
		//printf("to copy %s to %s\n", name, backupName);
		copyFile(name, backupName);
		timeBuf.actime=statBuf.st_atime;
		timeBuf.modtime=statBuf.st_mtime;
		if (utime(backupName, &timeBuf)<0)
		{
			errHandle("utime error when file handling\n");
		}
		index=0;
		//open current file for replace
		if ((fdCopy=open(name, O_WRONLY))<0)
		{
			//printf("file name to be replaced by opening %s\n", name);
			errHandle("open error when replacing\n");
		}
		while (index<arrayPtr->cur)
		{
			if (strcmp(((Log*)(arrayPtr->ptr[index]))->fileName, fileName)==0)
			{
				replace(fdCopy, ((Log*)(arrayPtr->ptr[index]))->offset, moneyMaker);
			}
			index++;
		}
	}
	else
	{
		//printf("nothing found to replace\n");        
		//printf("no target found in filename=%s \n\n", name);        
	}	        
	//printf("after search file %s\n", name);
	//displayLog(arrayPtr);
}

file name: main.c
#include "myhead.h"
#include "searchString.h"
#include "dynaArray.h"
#include "fileBuf.h"

const int ArrayLength=100;
void showInt(void* ptr);

int main(int argc, char* argv[])
{
        DynaArrayPtr dynaPtr;
        if (argc!=2)
        {
                errHandle("usage:<main.o> <directoryname>\n");
        }
        dynaPtr=createArray();
        
        dirTraverse(argv[1], dynaPtr);
        
        return 0;
}
 

file name: errHandle.c
#include "myhead.h"

extern int errno;

void errHandle(char* msg)
{
	if (msg!=NULL)
	{
		perror(msg);
	}
	else
	{
		strerror(errno);
	}		
	if (errno==EACCES)
	{
		printf("cannot open file because permission denied, so ignore it\n");
	}
	else
	{
		exit(errno);
	}
}
	
file name: dynaArray.h
#ifndef DYNAARRAY_H
#define DYNAARRAY_H

#include "myhead.h"

#define DefaultLength  20

//each is a pointer
struct DynaArray
{
	void** ptr;
	int len;//max size, the index of last node
	int cur;//current position, relative index of last node
	int fold;
};

typedef struct DynaArray* DynaArrayPtr;

//define a function pointer for comparing in sorting
typedef bool CompFunc(void* first, void* second);

//define a show func type
typedef void ShowFunc(void* ptr); 

typedef bool EqualFunc(void* ptr1, void* ptr2);

//bool compLog(void* first, void* second);

//create with default length and fold is 1
DynaArrayPtr createArray();

//this can be only called internally after a node is added
void expand(DynaArrayPtr arrayPtr);


//this is typical C++ member function, 
void addEnd(DynaArrayPtr arrayPtr, void* ptr);//without sorting

//user need to supply with comp function, return true for first '<' second
//it must be strictly smaller, as equal used to indicate it is a copy
//the addOne function should return the index of node inserted, -1 if already there
int addOne(DynaArrayPtr arrayPtr, void* ptr, CompFunc* compFunc, EqualFunc* equalFunc);


//int addDistinct(DynaArrayPtr arrayPtr, void* ptr, CompFunc* compFunc, EqualFunc* equalFunc);

void displayArray(DynaArrayPtr arrayPtr, ShowFunc* showFunc); 

//bool equalLog(void* first, void* second);

bool compInt(void* first, void* second);

bool equalInt(void*first, void*second);
         
int findOne(DynaArrayPtr arrayPtr, void* ptr, EqualFunc* equalFunc);
		
#endif	
file name: dynaArray.c
#include "dynaArray.h"


bool findOne(DynaArrayPtr arrayPtr, void* ptr, EqualFunc* equalFunc)
{
	int index=0;
	while (index<arrayPtr->cur)
	{
		if (equalFunc(ptr, arrayPtr->ptr[index]))
		{
			return TRUE;
		}
		index++;
	}
	return FALSE;
}

/*
bool compLog(void* first, void* second)
{
        int result;
        Log* log1, *log2;
        log1=(Log*)first;
        log2=(Log*)second;
        if ((result=strcmp(log1->fileName, log2->fileName))==0)
        {
                //strictly less
                return log1->offset < log2->offset;
        }
        else
        {
                return result < 0;
        }
}
*/

bool compInt(void* first, void* second)
{
        return *((int*)first) < *((int*)second);
}

bool equalInt(void* first, void*second)
{
        return *((int*)first) == *((int*)second);
}


//it is a pity that in C, there is no function overloading and default parameter, so
//I have to copy paste all code in "addOne"
//int addDistinct(DynaArrayPtr arrayPtr, void* ptr, CompFunc* compFunc, EqualFunc* equalFunc)

/*
bool equalLog(void* first, void* second)
{
	Log* log1, *log2;
	log1=(Log*)first;
	log2=(Log*)second;
	return strcmp(log1->fileName, log2->fileName)==0&&log1->offset==log2->offset;
}
*/

//user need to supply with comp function, return true for first '<' second
//it must be strictly smaller, as equal used to indicate it is a copy
//the addOne function should return the index of node inserted, -1 if already there
int addOne(DynaArrayPtr arrayPtr, void* ptr, CompFunc* compFunc , EqualFunc* equalFunc)
{
	int pos=0, i;
	//1st case, there is no node in array
	while (pos<arrayPtr->len)
	{
		if (!compFunc(ptr, arrayPtr->ptr[pos]))
		{
			pos++;
		}
		else
		{
			if (ptr==arrayPtr->ptr[pos])
			{
				//2nd case, it is already there
				return -1;//it is a copy
			}
			if (equalFunc(ptr, arrayPtr->ptr[pos]))
			{
				return -1;
			}
			//should insert here, and there must have space, shift first
			for (i=arrayPtr->len; i>pos; i--)
			{
				arrayPtr->ptr[i]=arrayPtr->ptr[i-1];
			}
			//3rd case, it is shifted first
			break;
		}
	}	
	arrayPtr->ptr[pos]=ptr;//insert
	arrayPtr->len++;
	arrayPtr->cur++;
	if (arrayPtr->cur==DefaultLength)
	{
		expand(arrayPtr);
	}		
}



//delete dynamically allocated memory
void closeArray(DynaArrayPtr arrayPtr)
{
	free(arrayPtr->ptr);
	free(arrayPtr);
}
		
//create with default length and fold is 1
DynaArrayPtr createArray()
{
	DynaArrayPtr result;
	if ((result=(DynaArrayPtr)malloc(DefaultLength*sizeof(void*)))==NULL)
	{
		errHandle("cannot alloc memory for dyna array creation\n");
	}
	result->len=0;//the actual number of pointers in array
	result->cur=0;//internal use only
	result->fold=1;
	
	if ((result->ptr=(void**)malloc(DefaultLength*sizeof(void*)))==NULL)
	{
		errHandle("cannot alloc memory for dyna array\n");
	}
	return result;
}

//user must pass a user-defined function pointer for displaying struct
void displayArray(DynaArrayPtr arrayPtr, ShowFunc* showFunc)
{
	int i;
	for (i=0; i<arrayPtr->len; i++)
	{
		showFunc(arrayPtr->ptr[i]);
	}
}


//this is typical C++ member function, 
void addEnd(DynaArrayPtr arrayPtr, void* ptr)
{
	arrayPtr->ptr[arrayPtr->len++]=ptr;
	arrayPtr->cur++;
	if (arrayPtr->cur==DefaultLength)
        {
                expand(arrayPtr);
        }

}

void expand(DynaArrayPtr arrayPtr)
{
	arrayPtr->fold++;
	if ((arrayPtr->ptr=realloc(arrayPtr->ptr, arrayPtr->fold*DefaultLength*sizeof(void*)))==NULL)
	{
		errHandle("expand error for dyna array\n");
	}
	arrayPtr->cur=0;//it should be
}	


file name: fileBuf.h
#ifndef FILEBUF_H
#define FILEBUF_H

#include "myhead.h"

//the max path string length
#define MaxPathLength 128

struct FileBuf
{
	int fd;
	int size;
	int offset;
	int cur;
	char buf[BuffSize];
};

typedef struct FileBuf* FilePtr;

//my version of fgets()
int mygets(FilePtr stream, char* buffer, int bufSize);

//my version of int to char* or itoa
int itoa(int num, char* buf, int bufSize);

//if you are using myputc before you have to myfflush by yourself
//ok, I do it for you.
int myputs(FilePtr stream, char* str);


//specify if it is white space char
//notice here, in C, there is no boolean type and I define it
//as alias of int in "myhead.h"
bool isBreak(char ch);

void copyFile(char* source, char* target);

int myftell(FilePtr stream);

void replace(int fd, int offset, char* str);

//just return the file buffer, regardless read of write
FilePtr myfopen(int fd);

//a-z, A-Z
bool myisalpha(char ch);

//my edition of getc
char mygetc(struct FileBuf* stream);

void myputc(FilePtr stream, char ch);

//my version of fclose
void myfclose(FilePtr stream);

//fflush, user of writing must call this at end of file
void myfflush(FilePtr stream);

//typedef void FileHandler(struct dirent* dirNode, struct stat* statPtr, char* fullPath);
typedef void FileHandler(void* param1, void* param2, void* param3, void* param4);

//void dirTraverse(char* path, FileHandler* funcPtr);

#endif
		
	
	
¡¡
¡¡
file name: fileBuf.c
#include "fileBuf.h"

//the reason I don't want to combine both copy and replace together is
//that in the possible sorting like quick sort which is not stable sorting method
//the offset to be replaced in one file maybe sorted in different order,
//i.e. foo.txt 169 foo.txt 140 foo.txt 249
//in such situation, it is very unsafe to copy a chunk of file and then 
//replace. So, divide the job into TWO pass is a safe way.

//if you are using myputc before you have to myfflush by yourself
//ok, I do it for you.
int myputs(FilePtr stream, char* str)
{
	int len=strlen(str);
	int result;
	myfflush(stream);
	result= write(stream->fd, (void*)str, len);
	if (result>=0)
	{
		stream->offset+=result;
		stream->cur=0;		
	}
	return result;
}

/*
int myrewind(FilePtr stream)
{
	if (lseek(stream->fd, 0, SEEK_SET)<0)
	{
		errHandler("lseek error when rewind file ptr\n");
	}
	stream
*///useless and inconsistency	

//my version of int to char* or itoa
int itoa(int num, char* buf, int bufSize)
{
	int len=bufSize-1;
	int temp, start=0;
	div_t result;
	bool negative=FALSE;
	buf[len]='\0';
	//at least 2
	if (bufSize<2)
	{
		return -1;
	}
	if (num<0)
	{
		negative=TRUE;
		temp=-1*num;
	}
	else
	{
		temp=num;
	}
	len--;
	do
	{
		result=div(temp, 10);
		buf[len]='0'+result.rem;		
		temp=result.quot;
		len--;
	}
	while (temp!=0&&len>=0);
	if (temp==0)//ok
	{
		if (negative)
		{
			if (len>0)
			{
				buf[len]='-';
			}
			else
			{
				return -1;//unable to add '-' sign
			}		
		}
		else
		{
			len++;
		}
		//shift array
			
		while (len<bufSize)
		{
			buf[start]=buf[len];
			start++;
			len++;			
		}
		return start;//the count
	}
	else
	{
		return -1;
	}
}

//always from current, writing must flush by user first
int myfseek(FilePtr stream, int offset)
{
	if ((stream->offset=lseek(stream->fd, offset, SEEK_CUR))<0)
	{
		errHandle("lseek error\n");
	}
	stream->cur=0;//need to restart to get char
	return stream->offset;
}



//my version of fgets()
//return length of string read, -1 means EOF or error param
int mygets(FilePtr stream, char* buffer, int bufSize)
{
	int length=0, mode;
	char ch;
	//it must be bigger than 1
	if (bufSize<2)
	{
		return -1;
	}	
	mode=FindNext;
	while ((ch=mygetc(stream))!=-1)
	{
		switch (mode)
		{
		case FindNext:
			if (!isBreak(ch))
			{
				mode=Comparing;
				buffer[0]=ch;
				length=1;
			}
			break;
		case Comparing:
                        if (length==bufSize)
                        {
                                buffer[length]='\0';
                                return length;
                        }
			if (isBreak(ch))
			{
				buffer[length]='\0';
				return length;
			}
			else
			{
				buffer[length]=ch;
				length++;
			}
		}//switch
	}//while
	if (mode==Comparing)
	{
		buffer[length]='\0';
		return length;
	}
	else
	{
		//also means EOF
		return -1;
	}
}
				

//always case insensitive and ch2 is gurantee to be alphabet
bool issamechar(char ch1, char ch2)
{
	if (ch1==ch2)
	{
		return TRUE;
	}
	if (ch2<='Z')
	{
		return ch1-ch2=='a'-'A';
	}
	else
	{
		return ch2-ch1=='a'-'A';
	}
}

//it is of course low efficiency to first copy and replace, but 
//it is easy to code
void copyFile(char* source, char* target)
{
	int fdOld, fdNew, n;
	char buf[BuffSize];
	if ((fdOld=open(source, O_RDONLY))<0)
	{
		errHandle("open error when copying file\n");
	}
	if ((fdNew=open(target, O_WRONLY|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR))<0)
	{
		errHandle("open error when copying file\n");
	}
	while ((n=read(fdOld, buf, BuffSize))!=0)
	{
		if (n!=write(fdNew, buf, n))
		{
			errHandle("write error when copying file\n");
		}
	}
}
	
void replace(int fd, int offset, char* str)
{
	int size;
	if (str==NULL)
	{
		printf("cannot replace empty string\n");
		return;
	}
	//strlen is just a function and it will be used more than once
	size=strlen(str);
	if (lseek(fd, offset, SEEK_SET)<0)
	{
		errHandle("lseek error when replacing\n");
	}
	if (size!=write(fd, str, size))
	{
		errHandle("write error when replace\n");
	}
	
}

//my version of fclose
void myfclose(FilePtr ptr)
{
	free(ptr);
}


//actually it only create file buff struct for both read and write purpose
FilePtr myfopen(int fd)
{
	FilePtr result=(FilePtr)malloc(sizeof(struct FileBuf));
	if (result==NULL)
	{
		errHandle("cannot allocate memory for FileBuf\n");
	}
	result->fd=fd;
	result->offset=0;
	result->cur=0;
	/*
	if ((result->size=read(fd, result->buf, BuffSize))==-1)
	{
		errHandle("unable to start reading file for myfopen\n");
	}
	*/
	result->size=0;//first setup as 0
	return result;
}

//fflush
void myfflush(FilePtr stream)
{
	if (stream->cur>0)
	{
		if (stream->cur!=write(stream->fd, stream->buf, stream->cur))
		{
			errHandle("fflush error\n");
		}
		stream->cur=0;//ready for new write
	}
}
	

//putc
void myputc(FilePtr stream, char ch)
{
	if (stream->cur>= stream->size)
	{
		if (BuffSize !=write(stream->fd, stream->buf, BuffSize))
		{
			errHandle("write error of file buffer\n");
		}
		stream->cur=0;
	}
	stream->offset++;
	stream->buf[stream->cur]=ch;
	stream->cur++;
}


//my verstion of fgetc
char mygetc(FilePtr stream)
{
	//if buff is fully read
	if (stream->cur >= stream->size)
	{
		/*
		if (stream->size==0)//must be end of file
		{
			return -1;
			//return EOF;
		}
		*/
		//if erro happened during reading
		if ((stream->size=read(stream->fd, stream->buf, BuffSize))<0)
		{
			errHandle("read error of file buf\n");
		}
	        if (stream->size==0)//must be end of file
                {
                        return -1;
                        //return EOF;
                }
		//else reset cur
		stream->cur=0;
	}
	//offset is always inc for each call
	stream->offset++;
	return stream->buf[stream->cur++];
}

	
//what is white space
bool isBreak(char ch)
{
	return (ch==' '||ch=='\t'||ch=='\n'||ch=='\0');
}

int myftell(FilePtr stream)
{
	return stream->offset;
}

bool myisalpha(char ch)
{
	return ((ch>='A'&&ch<='Z')||(ch>='a'&&ch<='z'));
}


file name: makefile
all:    main.o 
	cp -f ./main.o ./dirtest/
	@echo "make complete for main.o "
	cp -f *.c ../labbackup/
	cp -f *.h ../labbackup/
fileBuf.o : fileBuf.c fileBuf.h myhead.h
	@echo "compiling fileBuf.o..."
	gcc -g -c fileBuf.c -o fileBuf.o

errHandle.o : errHandle.c myhead.h
	@echo "compiling errHanle module..."
	gcc -g -c errHandle.c -o errHandle.o

main.o : searchString.o main.c errHandle.o myhead.h fileBuf.o dynaArray.o
	@echo "compiling main.o..." 
	gcc -g main.c searchString.o errHandle.o fileBuf.o dynaArray.o -o main.o

searchString.o : searchString.h searchString.c myhead.h
	@echo "compiling searchString.o..."
	gcc -g -c searchString.c -o searchString.o

dynaArray.o : dynaArray.h dynaArray.c myhead.h
	@echo "compiling dynaArray.o..."
	gcc -g -c dynaArray.c -o dynaArray.o

clear :
	@echo "clear up...\n"
	rm *.o







How to run?
1. You need the file name as argument in command line
[qingz_hu@alamanni ~/lab1] % cd dirtest
[qingz_hu@alamanni dirtest] % ./main.o ./dirtest1
<./dirtest1/dirtest2/softlink2> is a repeat directory which is already in search list 
[qingz_hu@alamanni dirtest] % ../make
../make: Command not found.
[qingz_hu@alamanni dirtest] % cd ..
[qingz_hu@alamanni ~/lab1] % make
compiling searchString.o...
gcc -g -c searchString.c -o searchString.o
compiling main.o...
gcc -g main.c searchString.o errHandle.o fileBuf.o dynaArray.o -o main.o
cp -f ./main.o ./dirtest/
make complete for main.o 
cp -f *.c ../labbackup/
cp -f *.h ../labbackup/
[qingz_hu@alamanni ~/lab1] % cd dirtest
[qingz_hu@alamanni dirtest] % ./main.o ./dirtest1
<./dirtest1/dirtest2/softlink2> is a directory already in list, ignore it
[qingz_hu@alamanni dirtest] % 





                                 back.gif (341 bytes)       up.gif (335 bytes)         next.gif (337 bytes)