Collections, internals.
The Collections Framework is the most-used Java API and the most-misunderstood. Today you stop memorizing Big-O and start understanding why each class behaves the way it does — so picking the right one becomes obvious, not lucky.
The big picture — one diagram
The Collections Framework is organized around two top-level interfaces: Collection (lists, sets, queues) and Map (key→value pairs).
List · ordered, indexable, allows duplicates
ArrayList — the default
An ArrayList is backed by an actual array. When the array fills up, it allocates a bigger array (typically 1.5× the old size) and copies elements over.
ArrayList strengths
- O(1) get by index — pointer arithmetic
- Cache-friendly — elements next to each other in memory
- Compact memory footprint
- Default choice for most lists
ArrayList weaknesses
- O(n) insert/delete in the middle — must shift everything
- Resize cost when array fills (rare but real)
- Can't shrink automatically when emptied
LinkedList — a doubly-linked list
A LinkedList is a chain of nodes. Each node holds the data + pointers to the previous and next node. There is no underlying array.
LinkedList strengths
- O(1) insert/delete at known node — just rewire pointers
- No resize cost
LinkedList weaknesses
- O(n) get by index — must walk from head/tail
- Cache-unfriendly — nodes scattered across heap
- Higher memory overhead per element (two pointers + node object)
"When would you use LinkedList over ArrayList?"
Honest answer: almost never in modern Java. Even when you do many inserts in the middle, ArrayList's cache-friendly contiguous memory often outperforms LinkedList in practice on modern CPUs. The only real use case for LinkedList is when you need it to also be a Deque (double-ended queue), and even then ArrayDeque is usually better.
Default to ArrayList. Reach for LinkedList only with measurement.
HashMap · the workhorse you must understand
HashMap is the most-used Map. It's also the most-asked in interviews. Picture how it works:
Step by step — what put() actually does
- Compute
key.hashCode()— an int - Apply internal mixing to spread out clustered hashes
- Modulo by the bucket array length → bucket index
- Look in that bucket. Empty? Place the new entry.
- Already has entries? Walk them, comparing with
key.equals(). Match? Overwrite. No match? Append. - If the array gets too full (load factor > 0.75 by default) → resize (double the array, rehash everything).
If you put a custom object as a HashMap key, you must override both hashCode() and equals(). Skip this and HashMap silently breaks: identical-looking keys go to different buckets, lookups miss, your code mysteriously loses data.
Rule: if two objects are .equals(), their .hashCode() must be the same. The reverse isn't required.
Java 8 improvement — bucket-as-tree
Pre-Java 8, all collisions in a bucket formed a linked list. If many keys hashed to the same bucket (bad hash function or attacker), lookups degraded to O(n). Since Java 8, when a bucket has 8+ entries, it converts to a balanced tree → O(log n) worst case.
Sets — built on Maps, basically
Few people realize: HashSet is literally a HashMap internally where every value is a placeholder. Set's "no duplicates" rule comes from Map's "unique keys" rule.
- HashSet — unordered, O(1) add/contains
- LinkedHashSet — preserves insertion order (linked list overlay)
- TreeSet — sorted (red-black tree), O(log n)
Decision table — when to use what
| If you need… | Use | Why |
|---|---|---|
| Ordered, index-accessed list | ArrayList | Default. Fast random access. |
| Unique items, fast contains-check | HashSet | O(1) lookup |
| Unique items in insertion order | LinkedHashSet | Iteration order = insertion order |
| Sorted unique items | TreeSet | Red-black tree |
| Key→value lookup | HashMap | O(1) average |
| Key→value with insertion order | LinkedHashMap | LRU cache implementations use this |
| Sorted key→value | TreeMap | O(log n) |
| Concurrent map (multi-thread) | ConcurrentHashMap | Lock-striped, scalable |
| FIFO queue / stack | ArrayDeque | Faster than LinkedList |
They're not testing if you memorized Big-O. They're testing if you understand why each one has its tradeoffs. The "right" answer is: ArrayList by default; switch only with measurement; understand cache locality.
One subtle gotcha — fail-fast iterators
If you iterate a collection (with a for-each or iterator) and modify it during iteration through anything other than the iterator itself, Java throws ConcurrentModificationException. This is fail-fast behavior — better to crash early than corrupt data silently.
List<String> list = new ArrayList<>(List.of("a", "b", "c"));
for (String s : list) {
if (s.equals("b")) list.remove(s); // 💥 ConcurrentModificationException
}
// Fix: use Iterator.remove() or removeIf()
list.removeIf(s -> s.equals("b"));Lock in today's learning
If any of these feel uncertain, that section is a re-read.
- Why is
ArrayListget-by-index O(1) butLinkedListO(n)? - What two methods must a class override to be a safe HashMap key, and why?
- What changed about HashMap collisions in Java 8?
- Why is
HashSet"basically a HashMap"? - Give a real reason to choose
LinkedHashMapoverHashMap. - Why do iterators throw
ConcurrentModificationException, and what is the safe way to remove during iteration?
End of Day 4. Tomorrow: Generics — what <T> actually does and why type erasure trips up everyone.