Monday, May 27, 2013

Java file organization - part 1

Location: Salt Lake City, UT, USA
In my car I keep all my music on a collection of USB thumb drives, currently eight 64GB drives.  It's great!  I can fit years' worth of music into a case that would otherwise hold only 5-10 CDs or DVDs.  I have a directory for each artist, which contains a separate folder for each album.  I can easily print out this list of artists and albums, laminate it, and keep it in my car to find what I am looking for.

This arrangement has only two real drawbacks.  First, I have to split my music library into chunks smaller than 59.6GB, since disk manufacturers define a GB as 10^9 bytes while your computer defines it as 2^30 bytes (64 x 10^9 / 2^30 is about 59.6).  Second, my car stereo only supports mp3, wma, m4a, and wav files.  Files like videos, album covers, and playlist (.m3u) files are useless to transfer to my car.  Splitting my collection efficiently and weeding out the unneeded files by hand would take me ages, and my collection continues to expand.  I would be fumbling through menus, drag & drop, and shortcut keys forever.


What are computers but the most versatile tools created by man?  Why spend my days doing something that a computer can do so much faster?  To my rescue comes Java, a handy language with a short development time for small utilities like the ones I needed, but with the power and speed to manage a task as large as this one.

The other day I took about five hours and wrote a set of utilities to handle the vast majority of my music file management tasks, including building and copying my disk lists.  I like little utilities like this; they help me keep my skills limber.

I figured this might be a good topic for a little how-to on some of what I did.  I will assume you know some basic Java, including the LinkedList class.

The heart of any file organization task is finding all the children of a given directory.  To this end I wrote a helper method and two basic filters:

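 // Imports needed by the snippets in this post (top of the file, outside the class)
 import java.io.File;
 import java.io.FileFilter;
 import java.util.LinkedList;

 // Matches lower-cased names ending in .mp3, .wma, .m4a, or .wav
 public static String validExtRegex = "^.*?(\\.mp3|\\.wma|\\.m4a|\\.wav)$";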

 public static LinkedList<File> getImmediateChildren(File root,
   boolean getDirs, boolean getFiles, boolean onlyValidFiles,
   boolean skipEmptyDirs) {

  LinkedList<File> list = new LinkedList<File>();

  if (!root.isDirectory())
   return list;

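  if (getDirs) {
   File[] children = root.listFiles(isDirectory);
   // listFiles() returns null if the directory cannot be read
   if (children == null)
    children = new File[0];
   for (int i = 0; i < children.length; i++) {
    if (skipEmptyDirs) {
     // Inspect the contents of this child directory, not its siblings
     File[] grandchildren = children[i].listFiles();
     if (grandchildren == null || grandchildren.length < 1)
      continue;
     boolean skip = true;
     for (File f : grandchildren)
      if (f.isDirectory()
        || (f.isFile() && (!onlyValidFiles || f.getName()
          .toLowerCase().matches(validExtRegex)))) {
       skip = false;
       break;
      }
     if (skip)
      continue;
    }
    list.add(children[i]);
   }
  }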

  if (getFiles) {
   File[] children = root.listFiles(isFile);
   // listFiles() returns null if the directory cannot be read
   if (children == null)
    children = new File[0];
   for (int i = 0; i < children.length; i++) {
    if (onlyValidFiles
      && !children[i].getName().toLowerCase()
        .matches(validExtRegex))
     continue;
    list.add(children[i]);
   }
  }

  return list;
 }

 public static FileFilter isDirectory = new FileFilter() {
  @Override
  public boolean accept(File pathname) {
   return pathname.isDirectory();
  }
 };

 public static FileFilter isFile = new FileFilter() {
  @Override
  public boolean accept(File pathname) {
   return pathname.isFile();
  }
 };

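We start by defining a regular expression (regex) to match known good file types.  A discussion of regex is beyond the scope of this post, but this one basically matches any name which ends in .mp3, .wma, .m4a, or .wav.  The file names are converted to lower case before matching, so the check is case-insensitive.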
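To see what the pattern accepts and rejects, here is a minimal sketch that runs it against a few sample names.  The class name ExtensionCheck, the method isValidMusicFile, and the sample file names are made up for illustration; only the pattern itself comes from the code above.  Precompiling the pattern is my own choice here, not something the original method does.

```java
import java.util.regex.Pattern;

class ExtensionCheck {
    // Same pattern as validExtRegex above
    static final String VALID_EXT_REGEX = "^.*?(\\.mp3|\\.wma|\\.m4a|\\.wav)$";

    // Compiling once avoids re-parsing the pattern for every file name
    static final Pattern VALID_EXT = Pattern.compile(VALID_EXT_REGEX);

    // Hypothetical helper: true if the lower-cased name ends in a supported extension
    static boolean isValidMusicFile(String fileName) {
        return VALID_EXT.matcher(fileName.toLowerCase()).matches();
    }

    public static void main(String[] args) {
        // Illustrative names only
        String[] names = { "Track01.MP3", "cover.jpg", "playlist.m3u", "song.wav" };
        for (String name : names)
            System.out.println(name + " -> " + isValidMusicFile(name));
    }
}
```

Note that String.matches() in the original code compiles the regex on every call, which is fine for a small utility but adds up over many thousands of files.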
The method takes a File object, which should be the directory whose children we want to retrieve.  The rest of the parameters are booleans that control whether we get directories, whether we get files, whether we include only valid files, and whether we skip empty directories.
The real meat of this method is the root.listFiles(isDirectory) and root.listFiles(isFile) calls.  They use the FileFilter objects we create below the method, which are instances of anonymous inner classes, so that listFiles() returns only directories or only files.  If you are unfamiliar with the syntax I use to create these objects, search Google for "anonymous inner class".  The only real complexity in this method is in excluding empty directories and filtering out unmatched file types, but the logic involved should be self-evident if you have done much coding.
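If the anonymous inner class syntax is new to you, the following sketch shows the same idea in isolation.  It is not part of my utilities: the class FilterDemo, the helper hasExtension, and the temporary directory contents are all hypothetical, chosen only to show a FileFilter being created inline and handed to listFiles().

```java
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.nio.file.Files;

class FilterDemo {
    // Hypothetical helper: builds a filter that accepts regular files whose
    // lower-cased name ends with the given extension (for example ".mp3")
    static FileFilter hasExtension(final String ext) {
        // The class body is declared and instantiated in one expression
        return new FileFilter() {
            @Override
            public boolean accept(File pathname) {
                return pathname.isFile()
                        && pathname.getName().toLowerCase().endsWith(ext);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway directory to filter
        File dir = Files.createTempDirectory("filter-demo").toFile();
        new File(dir, "a.mp3").createNewFile();
        new File(dir, "b.txt").createNewFile();
        new File(dir, "sub").mkdir();

        File[] matches = dir.listFiles(hasExtension(".mp3"));
        // listFiles() returns null if the directory cannot be read
        if (matches != null)
            for (File f : matches)
                System.out.println(f.getName());
    }
}
```

The filter is just an object, so it can be stored in a static field (as isDirectory and isFile are above) or built on demand as it is here.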

This method is great, but what we really need is the ability to list files and directories to any depth.  There are two ways to go about this sort of task: recursive and iterative.  Though many people are tempted to go for the elegance of a recursive solution, for arbitrarily large (and deep) directory trees the stack overhead makes it a less than ideal candidate, so I have instead written an iterative solution:

 public static LinkedList<File> getAllChildren(File root, boolean getDirs,
   boolean getFiles, boolean onlyValidFiles, boolean skipEmptyDirs) {

  LinkedList<File> list = new LinkedList<File>(), dirsToCheck = new LinkedList<File>();

  if (root.isDirectory())
   dirsToCheck.add(root);

  while (!dirsToCheck.isEmpty()) {
   File file = dirsToCheck.pop();
   LinkedList<File> childDirs = getImmediateChildren(file, true,
     false, onlyValidFiles, skipEmptyDirs);
   LinkedList<File> childFiles = getImmediateChildren(file, false,
     true, onlyValidFiles, skipEmptyDirs);
   if (getFiles)
    list.addAll(childFiles);
   if (getDirs)
    list.addAll(childDirs);
   dirsToCheck.addAll(childDirs);
  }

  return list;
 }

This method should be fairly straightforward.  We simply use a list named dirsToCheck to collect each directory we find, and we keep removing and checking directories until the list is empty.  Because pop() takes a directory from the front of the list and addAll() appends new ones to the end, the list acts as a queue.  This is known as a breadth-first search (as opposed to a depth-first search) because it examines each complete level in the hierarchy before moving to the next.  This is a fairly simple method; most of the real work is done in the first method we created.
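To make the breadth-first versus depth-first distinction concrete, here is a small sketch that walks the same directory tree both ways.  It is a simplified stand-in, not my getAllChildren method: the class TraversalOrder, the method listDirs, and the directory names are invented for the example, it only lists directories, and it sorts each directory's children because listFiles() does not promise any particular order.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class TraversalOrder {
    // Returns the names of all directories below root in visiting order.
    // breadthFirst = true treats the pending directories as a FIFO queue,
    // false treats them as a LIFO stack (depth first).
    static List<String> listDirs(File root, boolean breadthFirst) {
        List<String> visited = new ArrayList<String>();
        Deque<File> pending = new ArrayDeque<File>();
        if (root.isDirectory())
            pending.add(root);

        while (!pending.isEmpty()) {
            File dir = pending.removeFirst();
            if (!dir.equals(root))
                visited.add(dir.getName());
            File[] children = dir.listFiles();
            if (children == null)
                continue; // unreadable directory
            Arrays.sort(children); // make the order deterministic
            if (breadthFirst) {
                // Queue: new directories wait behind everything already pending
                for (File child : children)
                    if (child.isDirectory())
                        pending.addLast(child);
            } else {
                // Stack: push in reverse so the first child is visited next
                for (int i = children.length - 1; i >= 0; i--)
                    if (children[i].isDirectory())
                        pending.addFirst(children[i]);
            }
        }
        return visited;
    }

    public static void main(String[] args) throws IOException {
        // Throwaway tree: root/a/a1 and root/b
        File root = Files.createTempDirectory("traversal-demo").toFile();
        new File(root, "a/a1").mkdirs();
        new File(root, "b").mkdir();

        System.out.println(listDirs(root, true));  // prints [a, b, a1]
        System.out.println(listDirs(root, false)); // prints [a, a1, b]
    }
}
```

The only difference between the two traversals is which end of the pending list new directories are added to, which is why the choice of pop() plus addAll() in getAllChildren gives breadth-first order.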

In my next post I will show how I build my file lists to be written to thumb drives.  This will include deleting and copying files!