Day 04 · Week 1 · Foundations

Collections, internals.

The Collections Framework is the most-used Java API and the most-misunderstood. Today you stop memorizing Big-O and start understanding why each class behaves the way it does — so picking the right one becomes obvious, not lucky.

~28 min readinterview golddecision tree

You don't need to memorize tables. You need to picture what each collection looks like in memory. Once you can picture it, the performance and behavior become obvious.

The big picture — one diagram

The Collections Framework is organized around two top-level interfaces: Collection (lists, sets, queues) and Map (key→value pairs).

The interfaces define contracts. The classes are different implementations of those contracts.

List · ordered, indexable, allows duplicates

ArrayList — the default

An ArrayList is backed by an actual array. When the array fills up, it allocates a bigger array (typically 1.5× the old size) and copies elements over.

ArrayList in memory. Accessing index 7 is O(1) — just arithmetic.

ArrayList strengths

O(1) get by index — pointer arithmetic
Cache-friendly — elements next to each other in memory
Compact memory footprint
Default choice for most lists

ArrayList weaknesses

O(n) insert/delete in the middle — must shift everything
Resize cost when array fills (rare but real)
Can't shrink automatically when emptied

LinkedList — a doubly-linked list

A LinkedList is a chain of nodes. Each node holds the data + pointers to the previous and next node. There is no underlying array.

LinkedList in memory. Each node is somewhere different on the heap.

LinkedList strengths

O(1) insert/delete at known node — just rewire pointers
No resize cost

LinkedList weaknesses

O(n) get by index — must walk from head/tail
Cache-unfriendly — nodes scattered across heap
Higher memory overhead per element (two pointers + node object)

🧠 The interview question — answered correctly

"When would you use LinkedList over ArrayList?"

Honest answer: almost never in modern Java. Even when you do many inserts in the middle, ArrayList's cache-friendly contiguous memory often outperforms LinkedList in practice on modern CPUs. The only real use case for LinkedList is when you need it to also be a Deque (double-ended queue), and even then ArrayDeque is usually better.

Default to ArrayList. Reach for LinkedList only with measurement.

HashMap · the workhorse you must understand

HashMap is the most-used Map. It's also the most-asked in interviews. Picture how it works:

HashMap = an array of buckets. Hash modulo array-length picks the bucket. Collisions chain into a linked list (or tree).

Step by step — what put() actually does

Compute key.hashCode() — an int
Apply internal mixing to spread out clustered hashes
Modulo by the bucket array length → bucket index
Look in that bucket. Empty? Place the new entry.
Already has entries? Walk them, comparing with key.equals(). Match? Overwrite. No match? Append.
If the array gets too full (load factor > 0.75 by default) → resize (double the array, rehash everything).

🔑 The contract you must respect

If you put a custom object as a HashMap key, you must override both hashCode() and equals(). Skip this and HashMap silently breaks: identical-looking keys go to different buckets, lookups miss, your code mysteriously loses data.

Rule: if two objects are .equals(), their .hashCode() must be the same. The reverse isn't required.

Java 8 improvement — bucket-as-tree

Pre-Java 8, all collisions in a bucket formed a linked list. If many keys hashed to the same bucket (bad hash function or attacker), lookups degraded to O(n). Since Java 8, when a bucket has 8+ entries, it converts to a balanced tree → O(log n) worst case.

Sets — built on Maps, basically

Few people realize: HashSet is literally a HashMap internally where every value is a placeholder. Set's "no duplicates" rule comes from Map's "unique keys" rule.

HashSet — unordered, O(1) add/contains
LinkedHashSet — preserves insertion order (linked list overlay)
TreeSet — sorted (red-black tree), O(log n)

Decision table — when to use what

If you need…	Use	Why
Ordered, index-accessed list	`ArrayList`	Default. Fast random access.
Unique items, fast contains-check	`HashSet`	O(1) lookup
Unique items in insertion order	`LinkedHashSet`	Iteration order = insertion order
Sorted unique items	`TreeSet`	Red-black tree
Key→value lookup	`HashMap`	O(1) average
Key→value with insertion order	`LinkedHashMap`	LRU cache implementations use this
Sorted key→value	`TreeMap`	O(log n)
Concurrent map (multi-thread)	`ConcurrentHashMap`	Lock-striped, scalable
FIFO queue / stack	`ArrayDeque`	Faster than LinkedList

🧠 Why your interviewer asks 'ArrayList vs LinkedList'

They're not testing if you memorized Big-O. They're testing if you understand why each one has its tradeoffs. The "right" answer is: ArrayList by default; switch only with measurement; understand cache locality.

One subtle gotcha — fail-fast iterators

If you iterate a collection (with a for-each or iterator) and modify it during iteration through anything other than the iterator itself, Java throws ConcurrentModificationException. This is fail-fast behavior — better to crash early than corrupt data silently.

List<String> list = new ArrayList<>(List.of("a", "b", "c"));
for (String s : list) {
  if (s.equals("b")) list.remove(s); // 💥 ConcurrentModificationException
}

// Fix: use Iterator.remove() or removeIf()
list.removeIf(s -> s.equals("b"));

⏸ Pause & reflect

Lock in today's learning

If any of these feel uncertain, that section is a re-read.

Why is ArrayList get-by-index O(1) but LinkedList O(n)?
What two methods must a class override to be a safe HashMap key, and why?
What changed about HashMap collisions in Java 8?
Why is HashSet "basically a HashMap"?
Give a real reason to choose LinkedHashMap over HashMap.
Why do iterators throw ConcurrentModificationException, and what is the safe way to remove during iteration?

End of Day 4. Tomorrow: Generics — what <T> actually does and why type erasure trips up everyone.

← previousDay 3 · OOP re-examined tomorrow →Day 5 · Generics & type safety