Tue Aug 4 15:16:25 PDT 2015

A Bash Script to Merge Directories: DirMerge.sh

When working on multiple machines, using external drives, and being constrained for disk space, it is all too easy to end up with cloned directory trees that are similar but not identical to one another. Looking at the various cloned directories on my hard drive, I decided to write a simple script for merging directories, which is appended below. Be warned: it is very simple, comes with no guarantees expressed or implied, and has minimal error checking. However, I find it useful and thought I would post it in case anyone else is interested.

Here is how it works...

1. Two arguments, the source directory and the target directory, are passed to the awk program.

2. The awk program changes into the source and target directories in turn and, for each, builds associative arrays keyed on file name that hold the file's timestamp and type (file or directory).

3. Each entry in the source directory is looked up in the target directory.

4. If the same file name exists in the target, the checksums of the two files are compared. If the files are identical, a command to delete the source file is stored.

5. If the files are not identical, a warning is emitted and the file is left in place in the source directory for further investigation.

6. If the source file or directory does not exist in the target, it is moved to the target directory (or, for a directory, created there), again by storing the appropriate command.

7. The user is shown the list of commands that the script has decided are required and is asked whether they should be executed.

8. If requested, the merge commands are executed.

The effect is that identical files are deleted in the source (you already have them in the target, after all). Files that are unique are moved to the target. Any files that are in conflict are left in place to be reconciled by hand.
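
The per-entry decision in steps 3 to 6 can be sketched in plain shell. This is only an illustration under stated assumptions, not the script itself: crc_of and plan_entry are hypothetical helper names, and the action labels they print are made up for this example.

```shell
#!/bin/sh
# Sketch of the decision made for each entry in the source directory.
# crc_of and plan_entry are hypothetical helpers, not part of DirMerge.sh.

# Print the CRC field (first column) of the cksum output for one file.
crc_of() {
    cksum "$1" | awk '{ print $1 }'
}

# Print the action planned for one path, given relative to both roots.
plan_entry() {
    src=$1 tgt=$2 rel=$3
    if [ ! -e "$tgt/$rel" ]; then
        # Missing in the target: create the directory or move the file.
        if [ -d "$src/$rel" ]; then echo "mkdir"; else echo "move"; fi
    elif [ -d "$src/$rel" ]; then
        # Directory present on both sides: nothing to do for the entry itself.
        echo "skip"
    elif [ "$(crc_of "$src/$rel")" = "$(crc_of "$tgt/$rel")" ]; then
        # Same checksum: the source copy is redundant.
        echo "delete-source"
    else
        # Same name, different content: leave both for manual review.
        echo "conflict"
    fi
}

# Usage (paths are placeholders):
#   plan_entry /data/old_copy /data/master notes.txt
```

Like the script, the sketch only reports what it would do, so the planned actions can be reviewed before anything is changed.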

As mentioned above - the script is crude and contains minimal error checking - use at your own risk!
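
Given that warning, it may be worth trying the script on throwaway directories first. The following is a minimal sketch that builds such a pair with one identical file, one conflicting file, and one file that exists only in the source. make_scratch and all file names in it are hypothetical, and it assumes mktemp -d is available.

```shell
#!/bin/sh
# Build a throwaway source/target pair for a trial run.
# make_scratch is a hypothetical helper; the file names are made up.
make_scratch() {
    root=$(mktemp -d) || return 1
    mkdir -p "$root/source/only_in_source" "$root/target"
    # Same content on both sides: the source copy should be deleted.
    echo "same" > "$root/source/identical.txt"
    echo "same" > "$root/target/identical.txt"
    # Same name, different content: should be reported and left alone.
    echo "old" > "$root/source/conflict.txt"
    echo "new" > "$root/target/conflict.txt"
    # Only in the source: should be moved to the target.
    echo "mine" > "$root/source/only_in_source/unique.txt"
    echo "$root"
}

# Usage (commented out so that nothing runs when this file is loaded):
#   scratch=$(make_scratch)
#   sh DirMerge.sh "$scratch/source" "$scratch/target"
#   rm -rf "$scratch"
```

Going by the description above, a run against this tree should plan to delete identical.txt from the source, warn about conflict.txt, create only_in_source in the target, and move unique.txt into it.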

#!/bin/sh
awk '
BEGIN{
  dir1="\"" ARGV[1] "\"/"
  dir2="\"" ARGV[2] "\"/"
  readdir(dir1, lista, typea)
  readdir(dir2, listb, typeb)
  for(filea in lista){
    if(filea in listb){
      if(typea[filea] == "f"){
        if(ckfile(dir1 "\"" filea "\"") != ckfile(dir2 "\"" filea "\"")){
          print "# " dir1 "\"" filea "\"" " " dir2 "\"" filea "\""
          print "# WARNING FILES DIFFER - CONTINUING - YOU NEED TO CHECK WHY!"
        } else {
           com[++ncom]="# files match " dir1 "\"" filea "\"" " " \
                        dir2 "\"" filea "\""
           com[++ncom]="/usr/bin/rm " dir1 "\"" filea "\""
        }
      }
    }else{
      if(typea[filea] == "d" ){
        dcom[++ndcom]="# directory needs to be created in the target"
        dcom[++ndcom]="mkdir -p " substr(dir2,1,length(dir2)-2) \
                       substr(filea,2) "\""
      } else {
        com[++ncom]="# file needs to be moved to the target"
        com[++ncom]="mv " substr(dir1,1,length(dir1)-2) substr(filea,2) \
                 "\"" " " substr(dir2,1,length(dir2)-2) substr(filea,2) "\""
      }
    }
  }
  if(!ncom && !ndcom){
    print "No updates required"
    exit
  }
  print "The following commands are needed to merge directories:"
  for(i=1;i<=ndcom;i++){
    print dcom[i]
  }
  for(i=1;i<=ncom;i++){
    print com[i]
  }
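  print "Do you want to execute these commands? (y/n)"
  getline ans < "/dev/tty"
  if( ans == "y" || ans == "Y"){
    for(i=1;i<=ndcom;i++){
      print "Executing: " dcom[i]
      system(escapefilename(dcom[i]))
    }
    for(i=1;i<=ncom;i++){
      print "Executing: " com[i]
      system(escapefilename(com[i]))
    }
  }
}
# Return the CRC reported by cksum for a (quoted) file name, caching results.
function ckfile(filename,   cmd, ckout, array)
{
    if (!(filename in ck)){
        cmd="cksum " escapefilename(filename)
        if ((cmd | getline ckout) > 0){
            split(ckout, array," ")
            ck[filename]=array[1]
        } else {
            # cksum failed: store a value that can never match another file
            ck[filename]="unreadable:" filename
        }
        close(cmd)
    }
    return ck[filename]
}
# Return name with the characters that stay special inside double quotes
# escaped. awk passes scalars by value, so the result has to be returned
# rather than modified in place. Names containing double quotes are still
# not handled.
function escapefilename(name){
  gsub("\\$", "\\$", name)     # dollars would otherwise be expanded by the shell
  gsub("`", "\\`", name)       # as would backticks
  return name                  # parentheses are already safe inside double quotes
}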
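function readdir(dir, list, type,        cmd, parts, name){
  # Note: -printf is a GNU find extension
  cmd="cd " dir ";find . -printf \"%T@\\t%y\\t%p\\n\""
  print "Building list of files in: " dir
  while ((cmd | getline) > 0){
    # Fields are tab separated: timestamp, type, path. Take the path from the
    # raw line so that runs of spaces in file names are preserved.
    split($0, parts, "\t")
    name=substr($0, length(parts[1]) + length(parts[2]) + 3)
    list[name]=int(parts[1])
    type[name]=parts[2]
  }
  close(cmd)
}' "$1" "$2"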
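
The script wraps every file name in double quotes before handing the command to the shell, which is why characters such as the dollar sign need attention while parentheses do not. The following is a minimal sketch of that quoting rule in plain shell. dq_escape is a hypothetical helper, not part of the script, and it escapes the four characters that remain special inside double quotes: backslash, double quote, dollar, and backtick.

```shell
#!/bin/sh
# Escape a string so that it can be placed between double quotes in a
# shell command line. dq_escape is a hypothetical helper for illustration.
dq_escape() {
    # Inside a bracket expression these characters are literal;
    # each match is prefixed with a backslash.
    printf '%s\n' "$1" | sed 's/[\\"$`]/\\&/g'
}

# Example: build a quoted word and let the shell parse it again, much as
# the system() call in the script does. The name is made up.
name='price $5 (final).txt'
quoted="\"$(dq_escape "$name")\""
eval "printf '%s\n' $quoted"    # prints the original name unchanged
```

Without the escaping step, the shell would expand $5 inside the double quotes and the command would act on a different file name.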

Posted by ZFS | File under: bash