Not long ago I went through the “Concurrent Programming in Java” course on Coursera. Though the contents involved are very fundamental, I felt it’s necessary to give them a revisit since they are building blocks of more advanced contents in concurrent world.

Overview

Overall, I figured this course followed a bottom-up approach. It first introduces low-level Java concurrent primitives, and then go through some of the higher-level approaches that solve some of the problems when programming with the low-level concepts. From that end, it discussed some of the concurrent data structures, and formally introduced linearizability, which is the correctness requirement of any concurrent data structure.

I drew this simple diagram to illustrate the relationships between some basic concurrent concepts/approaches:

alt text

In this post I want to first revisit the lowest level – threads and locks.

Thread in Java

In Java, Thread is simply a class that implements the Runnable interface. Typically we pass in an object that implements Runnable to instantiate a Thread object.

When separated from application context, there isn’t too much to be discussed. But as Java 8 brings in functional programming and lambda expression, understanding some related language-specific details (though most of them might be regarded as syntactic sugar), in my opinion, could be necessary for starters.

One of these things is that Java imposes making local variable of a class final if they are to be accessed by lambdas in this class. For example, the snippet below won’t compile. There are many online resources for finding the reason for this (like this article), and in general it’s primarily because the ways Java manages closures.

public class Test {
    public void createThreadIncorrect(int num) {
        boolean run = true;
        Runnable runnable = () -> {
            System.out.println(num);
            while (run) {
                // Some operations.
            }
        }
        (new Thread(runnable)).start();
        run = false;
    }
}

In the above code, we are creating an anonymous class that implements Runnable interface. Java will use the autogenerated constructor to create the object, and this constructor will copy the value of num and run instead of using the num and run variables themselves. There are two major reasons for this behavior:

The compiler doesn’t need to generate extra information to hold the state of the variable, which might be to achieve. In contrast, some other languages, like Golang and C#, actually captures the variables themselves instead of their values.
In concurrent situations, things could get messy if final is not imposed. For example, in the while() loop above, the new thread will try to detect the change of local variable run, but since local variables are stored on stack (and different threads will have their own stacks), it’s not quite possible to observe the value change of run from another thread. Even if we might use volatile to make the value change visible to all threads, synchronization will be needed to prevent inconsistency, which brings in more complexity.

Therefore, to fix gthe above snippet, we need to make both the input argument num and variable run final.

Locks in Java

Locking mechanism in Java can be categorized into two classes: structured locks and unstructured locks. Structured locks basically use the intrinsic locks (which all objects have), and unstructured locks use the lock classes (e.g. ReentrantLock) within the java.util.concurrent.locks package.

Structured Lock

Intrinsic locks are used with synchronized key word, and their use cases can be divided into two types – synchronized methods and synchronized statement. Synchronized method is probably the most coarse-grained synchronization:

public class SyncTest {
    private static int counter = 0;
    public synchronized void syncPrint() {
        System.out.println("In syncPrint()");
    }

    public static synchronized void syncIncrementCounter() {
        counter++;
    }
}

In the code snippet above, the syncPrint() method will lock access to the SyncTest object on which the method is invoked. And in syncIncrementCounter(), since the method is static, the synchronized method can lock access to all the static variables for SyncTest class.

It’s also possible to use synchronized key word in finer granularity (but with cautious interleaving of accesses) with synchronized statement. For example, the syncIncrementCounter() metho above can be re-written as:

public void syncIncrementCounter() {
    synchronized(Test.class) {
        counter++;
    }
}

With synchronized, we can also use the wait-notify mechanism (i.e. guarded blocks). For example:

public class GuardedBlockTest {
    private boolean run = true;
    public synchronized void guardedWait() {
        while (run) {
            try {
                // Do some stuff.
                wait();
            } catch (InterruptedException e) {}
        }
    }

    public synchronized void notifyGuarded() {
        run = false;
        notifyAll();
    }
}

In the code snippet above, when guardedWait() method is invoked from an object of GuardedBlockTest block, the working thread first acquires the lock to get executed. Then when it enters the while loop, it will give up the lock since at wait() statement. This synchronization construct that allows a thread to have both mutual exlusion and the ability to wait for conditions is called a monitor, and it’s apparent that a monitor consists of a mutex object and condition variables.

Then, later in the program, another thread acquires the lock of this GuardedBlockTest object, resets the run variable and calls notifyAll(). This notifies all threads that are waiting for the same lock the change of run, and these threads will resume their execution at the wait() statement. Here we might also use notify() instead of notifyAll(), and the difference is that notify() picks up an arbitrary waiting thread to wake up. Although in both cases, it’s most likely that only one awakened thread can have the lock (one selected by system thread scheduler and the other selected by VM), we typically perfer notifyAll() over notify() unless all massive number of threads (workers in thread pool) are doing the same work.

Unstructured Lock

Unstructured locks basically assign lock object for certain shared resources. This approach has three major advantages over structured locks: first, since it can bind locks to objects with more flexibility, it can handle more complex synchronization scenarios; second, it provides a “back-off” behavior using tryLock() method; last but not least, it is of lighter weight than intrinsic locks (i.e. faster). Of course, when using unstructured locks, programmers are responsible for releasing the locks (typically in finally statement). Let’s look at an example on official javadoc:

public class SafeLockTest {
  static class Friend {
    private final String name;
    private Lock lock;
    public Friend(String _name) {
      name = _name;
      lock = new ReentrantLock();
    }

    public String getName() {
      return name;
    }

    public boolean impendingBow(Friend bower) {
      boolean myLock = false, yourLock = false;
      try {
        myLock = lock.tryLock();
        yourLock = bower.lock.tryLock();
      } finally {
        if (!myLock || !yourLock) {
          if (myLock) {
            lock.unlock();
          }
          if (yourLock) {
            bower.lock.unlock();
          }
        }
      }
      return myLock && yourLock;
    }

    public void bow(Friend bower) {
      if (impendingBow(bower)) {
        try {
          System.out.printf("%s: %s has bowed to me!\n", name, bower.getName());
          bower.bowBack(this);
        } finally {
          lock.unlock();
          bower.lock.unlock();
        }
      } else {
        System.out.printf("%s: %s intends to bow to me, but saw I've started bowing to him.\n", name, bower.getName());
      }
    }

    public void bowBack(Friend bower) {
      System.out.printf("%s: %s has bowed back to me!\n", name, bower.getName());
    }
  }

  static class BowLoop implements Runnable {
    private Friend bower, bowee;
    public BowLoop(Friend a, Friend b) {
      bower = a;
      bowee = b;
    }

    public void run() {
      Random rand = new Random();
      while (true) {
        try {
          Thread.sleep(rand.nextInt(10));
        } catch (InterruptedException e) {}
        bowee.bow(bower);
      }
    }
  }

  public static void main(String[] args) {
    final Friend a = new Friend("Alice");
    final Friend b = new Friend("Bob");
    (new Thread(new BowLoop(a, b))).start();
    (new Thread(new BowLoop(b, a))).start();
  }
}

In the above snippet, this.bow(Friend bower) means bower bows to this, and this.bowBack(Friend bower) means bower bows back to this. impendingBow() basicallly tries to acquires the locks of both bower and bowee, and backs off if the two locks can not be acquired – this is the most important step to avoid deadlock (deadlock can be easily produced if we write bow() and bowBack() methods above as two simple synchronized methods). The result of above code is presented below:

alt text

A Comparison

From a random number generator metric, we can easily observe that the intrinsic locks are way more heavy weighted. Another interesting observation is that, by assigning a booelan fair variable in the constructor of ReentrantLock, we will have a fair lock (meaning the first thread calling for the lock obtains the lock first), and this type of lock costs a lot more. Though heavy-weighted, the intrinsic locks don’t provide any fairness.

alt comparison

Conclusion

In this article we revisited the basic thread and lock primitives in Java, categorized the locks, and made some simple comparison. Hopefully the higher-level concurrent concepts/approaches can be covered in later posts.