New Java Features

I decided to refresh my Java knowledge (the last version I used was Java 1.6), and because I learn by coding (much better than I learn by reading), here are the code samples I prepared.

Note

This might be updated.

Java 1.7

Note

All examples were actually run on a Java 1.8 JVM.

I used this as a reference list of changes (the Oracle comparison is way too comprehensive).

java.nio.file package

The Java File class is awful, but a very nice API was introduced in this version of Java.

The java.nio.file package has a very nice tutorial by Oracle, so I won’t describe it here.

Here is my (very simple) example. It works similarly to Disk Usage Analyzer or the du command on Linux; that is, it summarizes disk usage for a folder.

To do this I just needed to implement a FileVisitor instance, and then pass it to Files.walkFileTree.
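Before the full visitor, here is a minimal sketch of the same mechanism. It only counts regular files. FileCounter and its count method are hypothetical names made up for this illustration; they are not part of the example below.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration only: counts regular files under a root.
class FileCounter extends SimpleFileVisitor<Path> {

    long regularFiles = 0;

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        // attrs was already read by the walker, so no extra file system call is needed
        if (attrs.isRegularFile()) {
            regularFiles++;
        }
        return FileVisitResult.CONTINUE;
    }

    static long count(Path root) throws IOException {
        FileCounter counter = new FileCounter();
        // walkFileTree calls back into the visitor for every entry below root
        Files.walkFileTree(root, counter);
        return counter.regularFiles;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(count(root) + " regular files under " + root);
    }
}
```

SimpleFileVisitor already returns CONTINUE from every callback, so only the methods that need custom behaviour have to be overridden.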

Code Highlights

Most of the logic is in Visitor, which extends SimpleFileVisitor (a convenience implementation of the FileVisitor interface); this class is used to traverse the whole directory tree. Inside this instance we keep track of where we are in the directory tree by using a stack.

Queue<Long> fileSizes = Collections.asLifoQueue(new ArrayDeque<>());

Each entry in the stack corresponds to a parent directory of the currently processed path, and holds the total size of that directory accumulated so far.
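To make the bookkeeping concrete, here is a small sketch that replays it by hand for a made-up tree (root/ holds a 10 byte file and a directory sub/, which holds a 5 byte file). LifoQueueSketch is a hypothetical name, and the sequence is simplified compared to the real visitor.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Queue;

// Hypothetical illustration of the stack bookkeeping; the tree is invented.
class LifoQueueSketch {

    static long simulate() {
        // asLifoQueue makes add() push and poll() pop, i.e. a stack
        Queue<Long> sizes = Collections.asLifoQueue(new ArrayDeque<>());
        sizes.add(0L);                  // enter root, nothing counted yet
        sizes.add(sizes.poll() + 10L);  // file in root: root total is 10
        sizes.add(0L);                  // enter sub
        sizes.add(sizes.poll() + 5L);   // file in sub: sub total is 5
        long sub = sizes.poll();        // leave sub, take its total off the stack
        sizes.add(sizes.poll() + sub);  // add it to the parent: root total is 15
        return sizes.peek();
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints 15
    }
}
```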

To add the size of the current object to the size of its parent, the following code is used:

private void pushSize(long size){
   long lastSize = fileSizes.poll();
   lastSize+=size;
   fileSizes.add(lastSize);
}

@Override
public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
    fileSizes.add(0L);
    return FileVisitResult.CONTINUE;
}

@Override
public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
    long dirSize = fileSizes.poll();
    pushSize(dirSize);
    if (maxDepthToDisplay<0 || maxDepthToDisplay >= fileSizes.size()) {
        System.out.println(level(fileSizes.size()) + dir + " " + humanReadableByteCount(dirSize, true));
    }
    return FileVisitResult.CONTINUE;
}

@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
    if (Files.isRegularFile(file)) {
        pushSize(Files.size(file));
    }
    return FileVisitResult.CONTINUE;
}

@Override
public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
    return FileVisitResult.CONTINUE;
}

Complete example

import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Queue;

public class PathExamples {

    private static class Visitor extends SimpleFileVisitor<Path>{

        long maxDepthToDisplay;

        Queue<Long> fileSizes = Collections.asLifoQueue(new ArrayDeque<>());

        protected Visitor(long maxDepthToDisplay) {
            super();
            this.maxDepthToDisplay = maxDepthToDisplay;
            fileSizes.add(0L);
        }

        private void pushSize(long size){
            long lastSize = fileSizes.poll();
            lastSize+=size;
            fileSizes.add(lastSize);
        }

        private String level(int depth){
            StringBuilder sbr = new StringBuilder();
            for (int ii = 0; ii < depth; ii++) {
                sbr.append(" ");
            }
            return sbr.toString();
        }

        /**
         * http://stackoverflow.com/a/3758880
         */
        public static String humanReadableByteCount(long bytes, boolean si) {
            int unit = si ? 1000 : 1024;
            if (bytes < unit) return bytes + " B";
            int exp = (int) (Math.log(bytes) / Math.log(unit));
            String pre = (si ? "kMGTPE" : "KMGTPE").charAt(exp-1) + (si ? "" : "i");
            return String.format("%.1f %sB", bytes / Math.pow(unit, exp), pre);
        }

        @Override
        public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
            fileSizes.add(0L);
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
            long dirSize = fileSizes.poll();
            pushSize(dirSize);
            if (maxDepthToDisplay<0 || maxDepthToDisplay >= fileSizes.size()) {
                System.out.println(level(fileSizes.size()) + dir + " " + humanReadableByteCount(dirSize, true));
            }
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
            if (Files.isRegularFile(file)) {
                pushSize(Files.size(file));
            }
            return FileVisitResult.CONTINUE;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args[0]);
        Files.walkFileTree(path, new Visitor(3));

    }
}

Fork Join Framework

Java 1.7 has a very nice Fork-Join Framework that allows one to dynamically split work between cores. It is aimed at the tricky case where we don’t know the amount of work needed upfront.

I have decided to try this framework to (once again) summarize the size of a directory tree.

This framework is nicely explained in the tutorials.
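As a warm-up, here is a minimal sketch of the fork/join pattern on a simpler problem: summing an array. RangeSum and the THRESHOLD value are assumptions made up for this illustration and have nothing to do with the directory example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: sums data[from, to) by splitting the range in halves.
class RangeSum extends RecursiveTask<Long> {

    private static final int THRESHOLD = 1_000; // arbitrary cut-off for sequential work

    private final long[] data;
    private final int from;
    private final int to;

    RangeSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(data, from, mid);
        RangeSum right = new RangeSum(data, mid, to);
        left.fork();                      // schedule the left half asynchronously
        long rightSum = right.compute();  // work on the right half in this thread
        return left.join() + rightSum;    // wait for the left half and combine
    }

    static long sum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new RangeSum(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // prints 50005000
    }
}
```

The directory version follows the same shape, except that the split is given by the children of a directory instead of by an index range.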

Overall I’m surprised by the performance of both the naive and the parallel implementation: the naive version takes 6 seconds (when run on my 150GB home directory), while the parallel one takes 3 seconds.

Code Highlights

The task result is a POJO containing the path, its size, whether this path is a directory, and its subdirectories (if any). Here is the definition:

private static class WalkFileResult{
      public final Path dirPath;
      public final boolean isDir;
      public final long dirSize;
      public final List<WalkFileResult> subdirs;

      public WalkFileResult(Path dirPath, long dirSize) {
          this(dirPath, dirSize, Collections.emptyList());
      }

      public WalkFileResult(Path dirPath, long dirSize, List<WalkFileResult> subdirs) {
          super();
          this.dirPath=dirPath;
          this.dirSize=dirSize;
          this.isDir=Files.isDirectory(dirPath);
          this.subdirs=subdirs;
      }
  }

A single task has the following logic:

  1. If we are looking at a file, calculate the file size and return it.
  2. If we are looking at a directory, create a task for each child of the directory, execute these tasks in parallel, and then sum their sizes.

In Java it is:

@Override
protected WalkFileResult compute() {
    try {
        if (Files.isRegularFile(currentPath)) {
            return new WalkFileResult(currentPath, Files.size(currentPath));
        } else if(Files.isDirectory(currentPath)) {
            List<WalkFileTask> subTasks = getSubtasks();
            return joinOnSubtasks(subTasks);
        }
    }catch (IOException | InterruptedException e){
        throw  new RuntimeException(e);
    }catch (ExecutionException e){
        throw new RuntimeException(e.getCause());
    }
    return new WalkFileResult(currentPath, 0L);
}

private List<WalkFileTask> getSubtasks() throws IOException {
    // This visitor just returns immediate children of current path
    Visitor v = new Visitor(currentPath);
    Files.walkFileTree(currentPath, v);
    return v.subtasks;
}

private WalkFileResult joinOnSubtasks(List<WalkFileTask> subTasks) throws ExecutionException, InterruptedException {
    long size = 0;
    List<WalkFileResult> subDirs = new ArrayList<>();
    for (WalkFileTask res: invokeAll(subTasks)){
        WalkFileResult wfr = res.get();
        size+=wfr.dirSize;
        if (wfr.isDir){
            subDirs.add(wfr);
        }
    }
    return new WalkFileResult(currentPath, size, subDirs);
}

Complete example

package examples;

import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;



public class ForkJoinPath{

    private static class WalkFileResult{
        public final Path dirPath;
        public final boolean isDir;
        public final long dirSize;
        public final List<WalkFileResult> subdirs;

        public WalkFileResult(Path dirPath, long dirSize) {
            this(dirPath, dirSize, Collections.emptyList());
        }

        public WalkFileResult(Path dirPath, long dirSize, List<WalkFileResult> subdirs) {
            super();
            this.dirPath=dirPath;
            this.dirSize=dirSize;
            this.isDir=Files.isDirectory(dirPath);
            this.subdirs=subdirs;
        }
    }

    private static class WalkFileTask extends RecursiveTask<WalkFileResult>{

        private static class Visitor extends SimpleFileVisitor<Path>{

            public final Path root;

            public List<WalkFileTask> subtasks = new ArrayList<>();

            protected Visitor(Path root) {
                super();
                this.root = root;
            }

            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
                if(Files.isSameFile(dir, root)){
                    return FileVisitResult.CONTINUE;
                }
                if (Files.isReadable(dir)) {
                    subtasks.add(new WalkFileTask(dir));
                }
                return FileVisitResult.SKIP_SUBTREE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                if (Files.isReadable(file) && Files.isRegularFile(file)) {
                    subtasks.add(new WalkFileTask(file));
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
                return FileVisitResult.CONTINUE;
            }
        }

        private final Path currentPath;

        public WalkFileTask(Path currentPath) {
            super();
            this.currentPath=currentPath;
        }

        private List<WalkFileTask> getSubtasks() throws IOException{
            // This visitor just returns immediate children of current path
            Visitor v = new Visitor(currentPath);
            Files.walkFileTree(currentPath, v);
            return v.subtasks;
        }

        private WalkFileResult joinOnSubtasks(List<WalkFileTask> subTasks) throws ExecutionException, InterruptedException {
            long size = 0;
            List<WalkFileResult> subDirs = new ArrayList<>();
            for (WalkFileTask res: invokeAll(subTasks)){
                WalkFileResult wfr = res.get();
                size+=wfr.dirSize;
                if (wfr.isDir){
                    subDirs.add(wfr);
                }
            }
            return new WalkFileResult(currentPath, size, subDirs);
        }

        @Override
        protected WalkFileResult compute() {
            try {
                if (Files.isRegularFile(currentPath)) {
                    return new WalkFileResult(currentPath, Files.size(currentPath));
                } else if(Files.isDirectory(currentPath)) {
                    List<WalkFileTask> subTasks = getSubtasks();
                    return joinOnSubtasks(subTasks);
                }
            }catch (IOException | InterruptedException e){
                throw  new RuntimeException(e);
            }catch (ExecutionException e){
                throw new RuntimeException(e.getCause());
            }
            return new WalkFileResult(currentPath, 0L);
        }
    }

    private static String level(int depth){
        StringBuilder sbr = new StringBuilder();
        for (int ii = 0; ii < depth; ii++) {
            sbr.append(" ");
        }
        return sbr.toString();
    }

    /**
     * http://stackoverflow.com/a/3758880
     */
    public static String humanReadableByteCount(long bytes, boolean si) {
        int unit = si ? 1000 : 1024;
        if (bytes < unit) return bytes + " B";
        int exp = (int) (Math.log(bytes) / Math.log(unit));
        String pre = (si ? "kMGTPE" : "KMGTPE").charAt(exp-1) + (si ? "" : "i");
        return String.format("%.1f %sB", bytes / Math.pow(unit, exp), pre);
    }


    private static void printResult(WalkFileResult wfr, int depth){
        if (depth >= 3){
            return;
        }

        System.out.println(level(depth) + wfr.dirPath + " " + humanReadableByteCount(wfr.dirSize, false));
        for (WalkFileResult child: wfr.subdirs){
            printResult(child, depth+1);
        }


    }

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        long start = System.nanoTime();
        Path path = Paths.get(args[0]);
        ForkJoinPool pool = new ForkJoinPool();
        WalkFileTask task = new WalkFileTask(path);
        pool.execute(task);

        printResult(task.get(), 0);
        double duration = (System.nanoTime() - start) * 1E-9;
        System.out.println(duration);



    }
}

Notable mentions

There is also a very nice WatchService, which allows one to monitor the filesystem for file changes.
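I did not write a full example for it, but a minimal sketch could look like the one below. WatchSketch and awaitCreation are hypothetical names, and the timeout values are arbitrary. How quickly events arrive depends on the platform, so treat this as an illustration rather than a robust implementation.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: waits for the first entry created in a watched directory.
class WatchSketch {

    // Returns the name of the created entry (relative to the watched directory),
    // or null if nothing was reported within the timeout.
    static Path awaitCreation(WatchService watcher, long timeoutSeconds) throws InterruptedException {
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return null;
        }
        try {
            for (WatchEvent<?> event : key.pollEvents()) {
                if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                    return (Path) event.context();
                }
            }
            return null;
        } finally {
            key.reset(); // re-arm the key so that later events are still queued
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            System.out.println("Created: " + awaitCreation(watcher, 60));
        }
    }
}
```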

Java 1.8

Streams and Lambdas

Third attempt at the same task: summarizing the size of a directory tree.

This time using Streams and Lambdas. The solution is the most concise, but the least readable IMO. Also, while the other solutions transparently handle unreadable directories, this one explodes with an AccessDeniedException.

Code Highlights

Result POJO:
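In the complete example further down, the result type is a small immutable holder for a path and its size. It is sketched here as a top-level class; the main method and the values in it are only an illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Immutable pair of a path and the size computed for it.
class DirSize {
    public final Path path;
    public final long size;

    DirSize(Path path, long size) {
        this.path = path;
        this.size = size;
    }

    public static void main(String[] args) {
        // Made-up values, just to show how the fields are read
        DirSize d = new DirSize(Paths.get("some-dir"), 42L);
        System.out.println(d.path + " " + d.size);
    }
}
```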

A function that can throw an exception:

@FunctionalInterface
public interface CheckedFunction<T, R> {
    R apply(T t) throws IOException;
}
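A java.util.function.Function cannot throw a checked exception, which is why this extra interface is needed. One possible way to plug it into a stream is a small adapter that rethrows IOException as UncheckedIOException. The unchecked helper and parseLength below are hypothetical and are not used in my example, which catches the exception inline instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class CheckedFunctions {

    @FunctionalInterface
    interface CheckedFunction<T, R> {
        R apply(T t) throws IOException;
    }

    // Hypothetical adapter: turns a CheckedFunction into a plain Function
    static <T, R> Function<T, R> unchecked(CheckedFunction<T, R> f) {
        return t -> {
            try {
                return f.apply(t);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // Stand-in for an operation that may fail with IOException
    static int parseLength(String s) throws IOException {
        if (s.isEmpty()) {
            throw new IOException("empty input");
        }
        return s.length();
    }

    static List<Integer> lengths(List<String> input) {
        Function<String, Integer> length = unchecked(CheckedFunctions::parseLength);
        return input.stream().map(length).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("a", "bcd"))); // prints [1, 3]
    }
}
```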

A lambda that calculates file size:

CheckedFunction<Path, DirSize> mapper = (Path p) -> new DirSize(p,
    Files.walk(p).parallel()
    .filter(Files::isReadable)
    .mapToLong(StreamExamples::safeSize).sum());
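One caveat: the stream returned by Files.walk keeps directories open until it is closed, and its documentation recommends using it within try-with-resources. Here is a sketch of a variant that does so. WalkSum and totalSize are hypothetical names, and unlike the lambda above this version only counts regular files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical variant of the size calculation that closes the stream.
class WalkSum {

    static long safeSize(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            return 0L; // treat files we cannot stat as empty
        }
    }

    static long totalSize(Path root) throws IOException {
        // try-with-resources closes the directory handles held by the stream
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(WalkSum::safeSize)
                        .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(totalSize(root) + " bytes under " + root);
    }
}
```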

A stream that walks over FS calculating size of each directory:

Files.walk(path, 3)
  .parallel().filter(Files::isDirectory).filter(Files::isReadable).map(
      (Path p) -> {
          try {
              return mapper.apply(p);
          } catch (IOException e) {
              return new DirSize(p, -1);
          }
      }).forEach(
        (DirSize d) ->
          System.out.println(d.path + " " + humanReadableByteCount(d.size, false)));

Complete example

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Created by jb on 12/3/15.
 */
public class StreamExamples {

    private static class DirSize{
        public final Path path;
        public final long size;

        public DirSize(Path path, long size) {
            this.path = path;
            this.size = size;
        }
    }

    @FunctionalInterface
    public interface CheckedFunction<T, R> {
        R apply(T t) throws IOException;
    }

    public static long safeSize(Path p){
        try {
            return Files.size(p);
        } catch (IOException e) {
            return 0;
        }

    }

    public static String humanReadableByteCount(long bytes, boolean si) {
        int unit = si ? 1000 : 1024;
        if (bytes < unit) return bytes + " B";
        int exp = (int) (Math.log(bytes) / Math.log(unit));
        String pre = (si ? "kMGTPE" : "KMGTPE").charAt(exp-1) + (si ? "" : "i");
        return String.format("%.1f %sB", bytes / Math.pow(unit, exp), pre);
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        Path path = Paths.get(args[0]);

        CheckedFunction<Path, DirSize> mapper = (Path p) -> new DirSize(p,
                Files.walk(p, Integer.MAX_VALUE).parallel().filter(Files::isReadable).mapToLong(StreamExamples::safeSize).sum());

        Files.walk(path, 3)
            .parallel().filter(Files::isDirectory).filter(Files::isReadable).map(
                (Path p) -> {
                    try {
                        return mapper.apply(p);
                    } catch (IOException e) {
                        return new DirSize(p, -1);
                    }
                }).forEach((DirSize d) -> System.out.println(d.path + " " + humanReadableByteCount(d.size, false)));

        double duration = (System.nanoTime() - start) * 1E-9;
        System.out.println(duration);
    }


}