Category Archives: Guava

Groovy + GPars: Handling Concurrency

PROBLEM

Let’s assume given a list of user IDs (ex: 1, 2, 3, 4, and 5), we need to query 2 data sources to get the names and the addresses before returning a list of Employee objects. The Employee class looks like this:-

@ToString(includePackage = false)
class Employee {
    Long id
    String name
    String address
}

To simplify the example, let’s assume the name lookup takes 2 seconds to run and the address lookup takes 4 seconds to run.

And now, we have the following code:-

class App {
    String nameLookup(Integer id) {
        log("Name lookup: ${id}")
        Thread.sleep(2000)
        "User ${id}"
    }

    String addressLookup(Integer id) {
        log("Addr lookup: ${id}")
        Thread.sleep(4000)
        "Address ${id}"
    }

    void log(String message) {
        println "${Thread.currentThread()} - ${new Date().format('HH:mm:ss')} - ${message}"
    }

    List<Employee> lookup(List<Integer> userIds) {
        ???
    }

    void run() {
        def start = System.currentTimeMillis()

        def employees = lookup([1, 2, 3, 4, 5])

        def end = System.currentTimeMillis()

        println '---------------------------------'
        println "Total time in seconds: ${(end - start) / 1000}"
        println '---------------------------------'

        employees.each {
            println it
        }
    }

    static void main(String[] args) {
        new App().run()
    }
}

We are going to explore multiple solutions to implement lookup(..) to run as fast as possible.

ATTEMPT 1

The simplest and the most straightforward approach is to perform the task synchronously.

Code:-

List<Employee> lookup(List<Integer> userIds) {
    userIds.collect { id ->
        new Employee(
                id: id,
                name: nameLookup(id),
                address: addressLookup(id)
        )
    }
}

Output:-

Thread[main,5,main] - 20:54:09 - Name lookup: 1
Thread[main,5,main] - 20:54:11 - Addr lookup: 1
Thread[main,5,main] - 20:54:15 - Name lookup: 2
Thread[main,5,main] - 20:54:17 - Addr lookup: 2
Thread[main,5,main] - 20:54:21 - Name lookup: 3
Thread[main,5,main] - 20:54:23 - Addr lookup: 3
Thread[main,5,main] - 20:54:27 - Name lookup: 4
Thread[main,5,main] - 20:54:29 - Addr lookup: 4
Thread[main,5,main] - 20:54:33 - Name lookup: 5
Thread[main,5,main] - 20:54:35 - Addr lookup: 5
---------------------------------
Total time in seconds: 30.129
---------------------------------
Employee(1, User 1, Address 1)
Employee(2, User 2, Address 2)
Employee(3, User 3, Address 3)
Employee(4, User 4, Address 4)
Employee(5, User 5, Address 5)

Since everything runs synchronously in one thread, it takes about 30 seconds to complete.

ATTEMPT 2

In this attempt and the rest of the attempts, we are going to use GPars, which is a concurrency and parallelism library.

<dependency>
    <groupId>org.codehaus.gpars</groupId>
    <artifactId>gpars</artifactId>
    <version>1.2.1</version>
</dependency>

In this attempt, we are going to use GParsPool.executeAsyncAndWait(..).

Code:-

List<Employee> lookup(List<Integer> userIds) {
    (List<Employee>) GParsPool.withPool {
        userIds.collect { id ->
            def results = GParsPool.executeAsyncAndWait(
                    this.&nameLookup.curry(id),
                    this.&addressLookup.curry(id)
            )

            new Employee(
                    id: id,
                    name: results[0],
                    address: results[1]
            )
        }
    }
}

Output:-

Thread[ForkJoinPool-1-worker-2,5,main] - 20:54:40 - Addr lookup: 1
Thread[ForkJoinPool-1-worker-1,5,main] - 20:54:40 - Name lookup: 1
Thread[ForkJoinPool-1-worker-2,5,main] - 20:54:44 - Name lookup: 2
Thread[ForkJoinPool-1-worker-1,5,main] - 20:54:44 - Addr lookup: 2
Thread[ForkJoinPool-1-worker-1,5,main] - 20:54:48 - Name lookup: 3
Thread[ForkJoinPool-1-worker-2,5,main] - 20:54:48 - Addr lookup: 3
Thread[ForkJoinPool-1-worker-1,5,main] - 20:54:52 - Addr lookup: 4
Thread[ForkJoinPool-1-worker-2,5,main] - 20:54:52 - Name lookup: 4
Thread[ForkJoinPool-1-worker-2,5,main] - 20:54:56 - Addr lookup: 5
Thread[ForkJoinPool-1-worker-1,5,main] - 20:54:56 - Name lookup: 5
---------------------------------
Total time in seconds: 20.184
---------------------------------
Employee(1, User 1, Address 1)
Employee(2, User 2, Address 2)
Employee(3, User 3, Address 3)
Employee(4, User 4, Address 4)
Employee(5, User 5, Address 5)

This allows us to run both the name lookup and the address lookup concurrently for each user ID.

Speed gain compared to first attempt: 1.5x

ATTEMPT 3

How about collectParallel(..)?

Code:-

List<Employee> lookup(List<Integer> userIds) {
    (List<Employee>) GParsPool.withPool {
        userIds.collectParallel { Integer id ->
            new Employee(
                    id: id,
                    name: nameLookup(id),
                    address: addressLookup(id)
            )
        }
    }
}

Output:-

Thread[ForkJoinPool-2-worker-1,5,main] - 20:55:00 - Name lookup: 1
Thread[ForkJoinPool-2-worker-4,5,main] - 20:55:00 - Name lookup: 4
Thread[ForkJoinPool-2-worker-3,5,main] - 20:55:00 - Name lookup: 2
Thread[ForkJoinPool-2-worker-2,5,main] - 20:55:00 - Name lookup: 3
Thread[ForkJoinPool-2-worker-5,5,main] - 20:55:00 - Name lookup: 5
Thread[ForkJoinPool-2-worker-4,5,main] - 20:55:02 - Addr lookup: 4
Thread[ForkJoinPool-2-worker-3,5,main] - 20:55:02 - Addr lookup: 2
Thread[ForkJoinPool-2-worker-1,5,main] - 20:55:02 - Addr lookup: 1
Thread[ForkJoinPool-2-worker-2,5,main] - 20:55:02 - Addr lookup: 3
Thread[ForkJoinPool-2-worker-5,5,main] - 20:55:02 - Addr lookup: 5
---------------------------------
Total time in seconds: 6.156
---------------------------------
Employee(1, User 1, Address 1)
Employee(2, User 2, Address 2)
Employee(3, User 3, Address 3)
Employee(4, User 4, Address 4)
Employee(5, User 5, Address 5)

This allows us to run all name lookups concurrently, followed by all address lookup concurrently.

Speed gain compared to first attempt: 4.9x

ATTEMPT 4

Perhaps, another approach is to kick start all the lookups asynchronously and hold on to the returned Future objects:-

Code:-

List<Employee> lookup(List<Integer> userIds) {
    (List<Employee>) GParsPool.withPool {
        List<Future> nameFutures = userIds.collect {
            this.&nameLookup.callAsync(it)
        }

        List<Future> addressFutures = userIds.collect {
            this.&addressLookup.callAsync(it)
        }

        userIds.withIndex().collect { Integer id, Integer i ->
            new Employee(
                    id: id,
                    name: nameFutures[i].get(),
                    address: addressFutures[i].get()
            )
        }
    }
}

Output:-

Thread[ForkJoinPool-3-worker-5,5,main] - 20:55:06 - Name lookup: 5
Thread[ForkJoinPool-3-worker-4,5,main] - 20:55:06 - Name lookup: 4
Thread[ForkJoinPool-3-worker-2,5,main] - 20:55:06 - Name lookup: 2
Thread[ForkJoinPool-3-worker-3,5,main] - 20:55:06 - Name lookup: 3
Thread[ForkJoinPool-3-worker-1,5,main] - 20:55:06 - Name lookup: 1
Thread[ForkJoinPool-3-worker-3,5,main] - 20:55:08 - Addr lookup: 4
Thread[ForkJoinPool-3-worker-2,5,main] - 20:55:08 - Addr lookup: 3
Thread[ForkJoinPool-3-worker-4,5,main] - 20:55:08 - Addr lookup: 2
Thread[ForkJoinPool-3-worker-5,5,main] - 20:55:08 - Addr lookup: 1
Thread[ForkJoinPool-3-worker-1,5,main] - 20:55:08 - Addr lookup: 5
---------------------------------
Total time in seconds: 6.032
---------------------------------
Employee(1, User 1, Address 1)
Employee(2, User 2, Address 2)
Employee(3, User 3, Address 3)
Employee(4, User 4, Address 4)
Employee(5, User 5, Address 5)

It’s slightly better than the previous attempt, but the performance gain is rather negligible.

Speed gain compared to first attempt: 5.0x

ATTEMPT 5

What if we take the previous attempt and increase the number of threads to 10?

Code:-

List<Employee> lookup(List<Integer> userIds) {
    GParsPool.withPool 10, {
        List<Future> nameFutures = userIds.collect {
            this.&nameLookup.callAsync(it)
        }

        List<Future> addressFutures = userIds.collect {
            this.&addressLookup.callAsync(it)
        }

        userIds.withIndex().collect { Integer id, Integer i ->
            new Employee(
                    id: id,
                    name: nameFutures[i].get(),
                    address: addressFutures[i].get()
            )
        }
    }
}

Output:-

Thread[ForkJoinPool-4-worker-1,5,main] - 20:55:12 - Name lookup: 1
Thread[ForkJoinPool-4-worker-2,5,main] - 20:55:12 - Name lookup: 2
Thread[ForkJoinPool-4-worker-3,5,main] - 20:55:12 - Name lookup: 3
Thread[ForkJoinPool-4-worker-4,5,main] - 20:55:12 - Name lookup: 4
Thread[ForkJoinPool-4-worker-5,5,main] - 20:55:12 - Name lookup: 5
Thread[ForkJoinPool-4-worker-6,5,main] - 20:55:12 - Addr lookup: 1
Thread[ForkJoinPool-4-worker-8,5,main] - 20:55:12 - Addr lookup: 3
Thread[ForkJoinPool-4-worker-7,5,main] - 20:55:12 - Addr lookup: 2
Thread[ForkJoinPool-4-worker-9,5,main] - 20:55:12 - Addr lookup: 4
Thread[ForkJoinPool-4-worker-10,5,main] - 20:55:12 - Addr lookup: 5
---------------------------------
Total time in seconds: 4.016
---------------------------------
Employee(1, User 1, Address 1)
Employee(2, User 2, Address 2)
Employee(3, User 3, Address 3)
Employee(4, User 4, Address 4)
Employee(5, User 5, Address 5)

Speed gain compared to first attempt: 7.5x

And now, we have successfully improve our implementation performance from 30 seconds to 4 seconds.

Advertisements

IntelliJ IDEA 14.1: Better equals(), hashCode() and toString()

PROBLEM

Let’s assume we want to create the default equals(), hashCode() and toString() with the following bean:-

public final class Person {
    private final String name;
    private final Collection<Car> cars;

    public Person(final String name, final Collection<Car> cars) {
        this.name = name;
        this.cars = cars;
    }

    public String getName() {
        return name;
    }

    public Collection<Car> getCars() {
        return cars;
    }
}

Most IDEs, including older version of IntelliJ, have code generation features that would create something similar to this:-

public final class Person {
    ...

    @Override
    public boolean equals(final Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }

        final Person person = (Person) o;

        if (name != null ? !name.equals(person.name) : person.name != null) {
            return false;
        }
        return !(cars != null ? !cars.equals(person.cars) : person.cars != null);
    }

    @Override
    public int hashCode() {
        int result = name != null ? name.hashCode() : 0;
        result = 31 * result + (cars != null ? cars.hashCode() : 0);
        return result;
    }

    @Override
    public String toString() {
        return "Person{" +
               "name='" + name + '\'' +
               ", cars=" + cars +
               '}';
    }
}

While it works, the generated code is usually crazy horrendous.

SOLUTION

With IntelliJ 14.x, it allows us to select templates from several proven libraries.

To generate equals() and hashCode(), select equals() and hashCode() option from the Generate pop-up dialog:-

There are several templates to choose from:-

Here’s an example of equals() and hashCode() using Guava and getter methods:-

public final class Person {
    ...

    @Override
    public boolean equals(final Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        final Person person = (Person) o;
        return Objects.equal(getName(), person.getName()) &&
               Objects.equal(getCars(), person.getCars());
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(getName(), getCars());
    }
}

To generate toString(), select toString() option from the Generate pop-up dialog:-

Again, there are several templates to choose from:-

Here’s an example of toString() using Guava:-

public final class Person {
    ...

    @Override
    public String toString() {
        return Objects.toStringHelper(this)
                .add("name", name)
                .add("cars", cars)
                .toString();
    }
}

Guava: Testing equals(..) and hashcode(..)

PROBLEM

Let’s assume we want to test the following equals(..):-

public class Person {
    private String name;
    private int age;

    @Override
    public boolean equals(Object o) {
        if (o == null || getClass() != o.getClass()) {
            return false;
        }

        Person person = (Person) o;
        return Objects.equal(name, person.name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

A correctly implemented equals(..) must be reflexive, symmetric, transitive, consistent and handles null comparison.

In another word, you have to write test cases to pass at least these 5 rules. Anything less is pure bullshit.

SOLUTION

You can write these tests yourself… or you can leverage Guava’s EqualsTester. This library will test these 5 rules and ensure the generated hashcode matches too.

First, include the needed dependency:-

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava-testlib</artifactId>
    <version>18.0</version>
    <scope>test</scope>
</dependency>

Instead of writing JUnit tests, I’ll be writing Spock specs, which is essentially built on top of Groovy, because it allows me to write very clear and clean tests.

class PersonSpec extends Specification {

    def person = new Person(name: 'Mike', age: 10)

    def "equals - equal"() {
        when:
        new EqualsTester().
                addEqualityGroup(person,
                                 new Person(name: 'Mike', age: 10),
                                 new Person(name: 'Mike', age: 20)).
                testEquals()

        then:
        notThrown(AssertionFailedError.class)
    }

    def "equals - not equal"() {
        when:
        new EqualsTester().
                addEqualityGroup(person,
                                 new Person(name: 'Kurt', age: 10)).
                testEquals()

        then:
        thrown(AssertionFailedError.class)
    }
}

Oh… wait for it… BOOM!

Guava: Reducing Cyclomatic Complexity with Objects.firstNonNull(…)

PROBLEM

Ever written code like this?

Employee employee = employeeService.getByName(employeeName);

if (employee == null) {
    employee = new Employee();
}

While it works, it has several minor problems:-

  • The above code has cyclomatic complexity of 2 due to the if logic.
  • We cannot define final on employee object.
  • Too many lines of code, and seriously, it looks very 2003.

SOLUTION

Guava provides a much cleaner solution to address these problems:-

final Employee employee = Objects.firstNonNull(employeeService.getByName(employeeName),
                                               new Employee());

Guava: FluentIterable vs Collections2

PROBLEM

Guava’s Collections2 is great when dealing with collections, but it quickly becomes rather clumsy and messy when we try to combine multiple actions together.

For example, say we have userIds, which is a collection of user IDs. For each user ID, we want to retrieve the Employee object and add it into an immutable set if the lookup is successful. So, we essentially want to perform a “tranform” and “filter” here.

By using strictly Collections2, we ended up with the following code:-

final ImmutableSet<Employee> employees = ImmutableSet.copyOf(
        Collections2.filter(
                Collections2.transform(userIds,
                                       new Function<String, Employee>() {
                                           @Override
                                           public Employee apply(String userId) {
                                               return employeeService.getEmployee(userId);
                                           }
                                       }),
                new Predicate<Employee>() {
                    @Override
                    public boolean apply(Employee employee) {
                        return employee != null;
                    }
                }));

While it works, it may be confusing to those reading the code due to the nested functions.

SOLUTION

A better approach is to use FluentIterable that allows us to write more natural top-to-bottom code:-

final ImmutableSet<Employee> employees = FluentIterable
        .from(userIds)
        .transform(new Function<String, Employee>() {
            @Override
            public Employee apply(String userId) {
                return employeeService.getEmployee(userId);
            }
        })
        .filter(new Predicate<Employee>() {
            @Override
            public boolean apply(Employee employee) {
                return employee != null;
            }
        })
        .toImmutableSet();

Guava: Elegant Caching Mechanisms

There are many different ways to implement a caching mechanism. Google’s Guava provides very simple and elegant solutions to do so.

OPTION 1: Caching using Supplier

If you want to cache just one thing, Supplier should satisfy that requirement.

private Supplier<Collection<Person>> allEmployeesCache = Suppliers.memoizeWithExpiration(
        new Supplier<Collection<Person>>() {
            public Collection<Person> get() {
                return makeExpensiveCallToGetAllEmployees();
            }
        }, 1, TimeUnit.DAYS);

public Collection<Person> getAllEmployees() {
    return allEmployeesCache.get();
}

In the above example, we cache all employees for a day.

OPTION 2: Caching using LoadingCache

If you want to cache multiple things (ex: think Hashmap-like), LoadingCache should satisfy that requirement.

private LoadingCache<String, Collection<Person>> employeeRoleLoadingCache = CacheBuilder.newBuilder()
        .expireAfterWrite(1, TimeUnit.DAYS)
        .maximumSize(1000)
        .build(new CacheLoader<String, Collection<Person>>() {
            @Override
            public Collection<Person> load(String roleName) throws Exception {
                return makeExpensiveCallToGetAllEmployeesByRoleName(roleName);
            }
        });

public Collection<Person> getAllEmployeesByRole(String roleName) {
    return employeeRoleLoadingCache.getUnchecked(roleName);
}

In the above example, we cache all employees for each given role for a day. Further, we configure the cache to store up to 1000 elements before evicting the older cached element.

Guava: Elegant Collection Ordering

PROBLEM

Java provides a built-in API to sort a list, but it is flat out ugly and prone to error especially when dealing with Comparator. Here’s an example:-

List<Request> allRequests = ...;

Collections.sort(allRequests, new Comparator<Request>() {
    @Override
    public int compare(Request lhs, Request rhs) {
        if (lhs.getSubmissionDatetime() == null) {
            return -1;
        }
        else if (rhs.getSubmissionDatetime() == null) {
            return 1;
        }

        return lhs.getSubmissionDatetime().compareTo(rhs.getSubmissionDatetime());
    }
});

Number of puppies killed = 3 … from the if-else statement.

SOLUTION

A better way to do this is to use Ordering from Guava:-

List<Request> allRequests = ...;

List<Request> sortedRequests = Ordering.natural()
        .nullsFirst()
        .onResultOf(new Function<Request, LocalDateTime>() {
            @Override
            public LocalDateTime apply(Request request) {
                return request.getSubmissionDatetime();
            }
        })
        .immutableSortedCopy(allRequests);

In this example, we want all the null elements to appear on top of the list. We can also use nullsLast() if we want them to be at the bottom of the list.

Number of puppies killed = 0.