Efficient data structures: Sets & Maps

Optimizing Code in Java

Pavlos Kosmetatos

Lead Engineer @Wealthyhood

Applying space/time complexity

We now know about space and time complexity!

How do we apply this understanding to write more efficient code?

By choosing the right data structure!

Optimizing Code in Java

A user management system

  • We are building a user management system

  • We need to check, for a given username, if a user exists

  • Using a list: complexity $O(n)$

public boolean usernameExists(ArrayList<String> users, String newUsername) {
    for (String username : users) {
        if (username.equals(newUsername)) {
            return true;
        }
    }
    return false;
}
Optimizing Code in Java

Sets

  • A collection of unique elements with fast lookup times
  • $O(1)$ average time complexity for adding, removing, and checking if an element exists

Improved user management system solution:

public class UserRegistry {
    private HashSet<String> users = new HashSet<>();

    public boolean userExists(String username) {
        return users.contains(username); // O(1) average time
    }
}
Optimizing Code in Java

Maps

  • Like a dictionary, word (key) -> definition (value)
  • HashMap: $O(1)$ average time complexity for operations
public class UserCache {
    private HashMap<String, UserProfile> userProfiles = new HashMap<>();

    public UserProfile getUser(String username) {
        return userProfiles.get(username);  // O(1) average time
    }
}
Optimizing Code in Java

Hashcode method

  • What makes those fast -> hashCode()
  • In our ArrayList example, we did not know the index of the username to find
  • Using the hashcode() method, we can convert an object into a index we can look for
  • For example pavlos.2020 -> 35189

pavlos.2020 processed through hashcode, resulting in 35189

Optimizing Code in Java

Indexing elements

  • HashMap and HashSet have an underlying array
  • When we add an element:
    • Java calls hashCode() on your element to get an integer
    • It calls modulo on the above to get the bucket number

Example:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

"optimizingCodeInJava" -> 1406313774 // Implemented by Java
1406313774 % 16 = 14 <- that's our bucket!
Optimizing Code in Java

Collisions

  • When more than one items end up in the same bucket
    • Called a collision
  • LinkedList for the bucket
  • That's why we say $O(1)$ on average
Optimizing Code in Java

A note to remember

Data structure selection is like choosing the right tool for a job - a hammer (ArrayList) is great for nails but terrible for screws (where a Set might work better).

Tools

Optimizing Code in Java

Let's practice!

Optimizing Code in Java

Preparing Video For Download...