Day 04 · Week 1 · Foundations

Collections, internals.

The Collections Framework is the most-used Java API and the most-misunderstood. Today you stop memorizing Big-O and start understanding why each class behaves the way it does — so picking the right one becomes obvious, not lucky.

~28 min readinterview golddecision tree
You don't need to memorize tables. You need to picture what each collection looks like in memory. Once you can picture it, the performance and behavior become obvious.

The big picture — one diagram

The Collections Framework is organized around two top-level interfaces: Collection (lists, sets, queues) and Map (key→value pairs).

CollectionListSetQueueArrayListLinkedListVectorHashSetLinkedHashSetTreeSetArrayDequePriorityQueueMapMap typesHashMapLinkedHashMapTreeMapConcurrentHashMap"a sequence/group of items""key → value lookups"
The interfaces define contracts. The classes are different implementations of those contracts.

List · ordered, indexable, allows duplicates

ArrayList — the default

An ArrayList is backed by an actual array. When the array fills up, it allocates a bigger array (typically 1.5× the old size) and copies elements over.

A[0]B[1]C[2]D[3]E[4]F[5]G[6]H[7]contiguous memory · index-addressable
ArrayList in memory. Accessing index 7 is O(1) — just arithmetic.

ArrayList strengths

  • O(1) get by index — pointer arithmetic
  • Cache-friendly — elements next to each other in memory
  • Compact memory footprint
  • Default choice for most lists

ArrayList weaknesses

  • O(n) insert/delete in the middle — must shift everything
  • Resize cost when array fills (rare but real)
  • Can't shrink automatically when emptied

LinkedList — a doubly-linked list

A LinkedList is a chain of nodes. Each node holds the data + pointers to the previous and next node. There is no underlying array.

Aprev / nextBprev / nextCprev / nextDprev / nextscattered memory · pointer-traversed
LinkedList in memory. Each node is somewhere different on the heap.

LinkedList strengths

  • O(1) insert/delete at known node — just rewire pointers
  • No resize cost

LinkedList weaknesses

  • O(n) get by index — must walk from head/tail
  • Cache-unfriendly — nodes scattered across heap
  • Higher memory overhead per element (two pointers + node object)
🧠 The interview question — answered correctly

"When would you use LinkedList over ArrayList?"

Honest answer: almost never in modern Java. Even when you do many inserts in the middle, ArrayList's cache-friendly contiguous memory often outperforms LinkedList in practice on modern CPUs. The only real use case for LinkedList is when you need it to also be a Deque (double-ended queue), and even then ArrayDeque is usually better.

Default to ArrayList. Reach for LinkedList only with measurement.

HashMap · the workhorse you must understand

HashMap is the most-used Map. It's also the most-asked in interviews. Picture how it works:

bucket 0bucket 1bucket 2bucket 3bucket 4bucket 5bucket 6bucket 7key→valkey→valkey→valcollision!key→valput(k,v): bucket = hash(k) % size · then store in that bucketget(k): same bucket · scan its list for matching keyaverage O(1), worst-case O(log n) (after Java 8 tree-bucket fix)
HashMap = an array of buckets. Hash modulo array-length picks the bucket. Collisions chain into a linked list (or tree).

Step by step — what put() actually does

  1. Compute key.hashCode() — an int
  2. Apply internal mixing to spread out clustered hashes
  3. Modulo by the bucket array length → bucket index
  4. Look in that bucket. Empty? Place the new entry.
  5. Already has entries? Walk them, comparing with key.equals(). Match? Overwrite. No match? Append.
  6. If the array gets too full (load factor > 0.75 by default) → resize (double the array, rehash everything).
🔑 The contract you must respect

If you put a custom object as a HashMap key, you must override both hashCode() and equals(). Skip this and HashMap silently breaks: identical-looking keys go to different buckets, lookups miss, your code mysteriously loses data.

Rule: if two objects are .equals(), their .hashCode() must be the same. The reverse isn't required.

Java 8 improvement — bucket-as-tree

Pre-Java 8, all collisions in a bucket formed a linked list. If many keys hashed to the same bucket (bad hash function or attacker), lookups degraded to O(n). Since Java 8, when a bucket has 8+ entries, it converts to a balanced tree → O(log n) worst case.

Sets — built on Maps, basically

Few people realize: HashSet is literally a HashMap internally where every value is a placeholder. Set's "no duplicates" rule comes from Map's "unique keys" rule.

Decision table — when to use what

If you need…UseWhy
Ordered, index-accessed listArrayListDefault. Fast random access.
Unique items, fast contains-checkHashSetO(1) lookup
Unique items in insertion orderLinkedHashSetIteration order = insertion order
Sorted unique itemsTreeSetRed-black tree
Key→value lookupHashMapO(1) average
Key→value with insertion orderLinkedHashMapLRU cache implementations use this
Sorted key→valueTreeMapO(log n)
Concurrent map (multi-thread)ConcurrentHashMapLock-striped, scalable
FIFO queue / stackArrayDequeFaster than LinkedList
🧠 Why your interviewer asks 'ArrayList vs LinkedList'

They're not testing if you memorized Big-O. They're testing if you understand why each one has its tradeoffs. The "right" answer is: ArrayList by default; switch only with measurement; understand cache locality.

One subtle gotcha — fail-fast iterators

If you iterate a collection (with a for-each or iterator) and modify it during iteration through anything other than the iterator itself, Java throws ConcurrentModificationException. This is fail-fast behavior — better to crash early than corrupt data silently.

List<String> list = new ArrayList<>(List.of("a", "b", "c"));
for (String s : list) {
  if (s.equals("b")) list.remove(s); // 💥 ConcurrentModificationException
}

// Fix: use Iterator.remove() or removeIf()
list.removeIf(s -> s.equals("b"));
Pause & reflect

Lock in today's learning

If any of these feel uncertain, that section is a re-read.

  1. Why is ArrayList get-by-index O(1) but LinkedList O(n)?
  2. What two methods must a class override to be a safe HashMap key, and why?
  3. What changed about HashMap collisions in Java 8?
  4. Why is HashSet "basically a HashMap"?
  5. Give a real reason to choose LinkedHashMap over HashMap.
  6. Why do iterators throw ConcurrentModificationException, and what is the safe way to remove during iteration?

End of Day 4. Tomorrow: Generics — what <T> actually does and why type erasure trips up everyone.