Archives
Everything I've written so far, organized by date.
-
Why saveAll() Becomes 10K INSERTs — IDENTITY and Hibernate's Structural Batch Disablement
hibernate.jdbc.batch_size=50 is set, yet saveAll() of 10,000 rows fires 10,000 INSERTs. GenerationType.IDENTITY needs LAST_INSERT_ID() after every single INSERT, which Statement.RETURN_GENERATED_KEYS cannot deliver for a batch — so Hibernate disables batching structurally. Application-managed IDs (a TABLE-strategy simulation) restore batching (~200 SQL statements). Raw JDBC batchUpdate with rewriteBatchedStatements=true rewrites the batch as multi-value INSERTs (~10 statements) — the fastest path. The DZone "IDENTITY → SEQUENCE 100x" post is PostgreSQL-specific: MySQL has no native SEQUENCE (it falls back to TABLE), so the real options on MySQL are UUID, TableGenerator pooled-lo, Snowflake/TSID, or raw JDBC batching.
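The statement counts above are just ceiling division. A minimal sketch, assuming batch_size=50 and (my illustrative assumption, not a driver guarantee) 1,000 rows packed per rewritten multi-value INSERT:

```java
// Back-of-the-envelope SQL statement counts for the three paths:
// IDENTITY (batching disabled), app-managed IDs with batch_size=50,
// and rewriteBatchedStatements multi-value packing (1,000 rows assumed).
public class BatchMath {
    static int statements(int rows, int rowsPerStatement) {
        return (rows + rowsPerStatement - 1) / rowsPerStatement; // ceiling division
    }

    public static void main(String[] args) {
        int rows = 10_000;
        System.out.println(statements(rows, 1));     // IDENTITY: 10000 statements
        System.out.println(statements(rows, 50));    // batch_size=50: 200 statements
        System.out.println(statements(rows, 1_000)); // multi-value rewrite: 10 statements
    }
}
```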
-
JPA N+1 and the Four JOIN FETCH Traps — MultipleBagFetchException, Pagination OOM, OneToOne LAZY
In a four-level domain (owner→merchant→rule→history), findAll plus child traversal yields 21 SQL statements. JOIN FETCH collapses them to 1. But fetching two collections at once raises MultipleBagFetchException — Hibernate refuses the cartesian product of two Bags. JOIN FETCH + setMaxResults emits HHH000104 and applies pagination *in memory* — load 10K rows, keep 20, OOM. A non-owning @OneToOne LAZY is *always fetched* because the proxy cannot tell whether the value is null. @BatchSize tames N+1 down to N/K+1, the standard mitigation. The fetch traps in JPA come from Bag/List/Set semantics, proxy limitations, and how Hibernate handles cartesian products — never from JOIN FETCH alone.
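Two arithmetic facts sit behind these traps. A sketch with hypothetical sizes (the 20/10/30 figures are my illustration, not the article's dataset):

```java
// Why two Bag fetches explode, and what @BatchSize buys you.
public class FetchMath {
    // JOIN FETCH on two collections multiplies rows: the cartesian
    // product Hibernate refuses with MultipleBagFetchException.
    static int cartesianRows(int parents, int childrenA, int childrenB) {
        return parents * childrenA * childrenB;
    }

    // @BatchSize(size = K) turns N+1 queries into ceil(N/K) + 1.
    static int batchedQueries(int n, int k) {
        return (n + k - 1) / k + 1;
    }

    public static void main(String[] args) {
        System.out.println(cartesianRows(20, 10, 30)); // 6000 result rows for 20 parents
        System.out.println(batchedQueries(20, 10));    // 3 queries instead of 21
    }
}
```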
-
The Real Cost of JPA Dirty Checking — readOnly, @DynamicUpdate, and Query Plan Cache Leaks
Hibernate's dirty checking copies an entity snapshot at load time and compares it against the current state at flush. With 10,000 rows, readOnly=true skips the snapshot copy (memory savings). @DynamicUpdate emits SQL with only the changed columns — but generates a fresh SQL string per update pattern, inflating the Query Plan Cache (a steady heap leak if hibernate.query.plan_cache_max_size is left untuned). @Modifying bulk JPQL is fastest but leaves the persistence context inconsistent — clearAutomatically=true is the standard remedy. The clear() pattern (flush + clear every 50 inserts) keeps memory bounded for large insert batches. The trap in JPA is never one feature; it is the interaction of flush, cache, and snapshot lifecycle.
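The heart of dirty checking fits in one method: keep the loaded state, diff it at flush. A minimal sketch (my simplification; Hibernate diffs typed property arrays, not maps):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Dirty checking stripped to its essence: snapshot at load, diff at flush.
// readOnly=true skips taking the snapshot; @DynamicUpdate emits UPDATE SQL
// containing only the columns this diff flags.
public class DirtyCheck {
    static List<String> changedColumns(Map<String, Object> snapshot,
                                       Map<String, Object> current) {
        List<String> changed = new ArrayList<>();
        for (var e : current.entrySet())
            if (!Objects.equals(snapshot.get(e.getKey()), e.getValue()))
                changed.add(e.getKey());
        return changed;
    }
}
```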
-
JPA Optimistic Lock and the Retry Stampede Trap — 6 Scenarios @Version Cannot Cover Alone
100 workers each increment the same rule's priority by +1. Without @Version, the final priority lands below 100 (Lost Update). With @Version, all you get is an OptimisticLockException — handling is the caller's responsibility, so only some increments succeed. @Retryable(3) with backoff=0 produces a *retry stampede* — retries pile up at the same instant and collide again. Exponential backoff with full jitter spreads retries out and reaches priority=100. Plus the *self Lost Update* trap discovered along the way: within a single transaction, two SELECTs return different objects in raw JDBC but the same instance under JPA's first-level cache (`==`). A different category from the distributed Lost Update. The piece also covers @Transactional + @Retryable AOP ordering and the AWS Architecture Blog rationale for full jitter.
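Full jitter is a one-liner once you see it: sleep a uniformly random duration between 0 and the capped exponential bound. A sketch following the AWS Architecture Blog formula (the base/cap values in the test are illustrative):

```java
import java.util.concurrent.ThreadLocalRandom;

// Full-jitter backoff: sleep = random(0, min(cap, base * 2^attempt)).
// Randomizing over the whole window is what breaks up the retry stampede;
// plain exponential backoff still lands every worker on the same instants.
public class FullJitter {
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long bound = Math.min(capMillis, baseMillis * (1L << attempt));
        return ThreadLocalRandom.current().nextLong(bound + 1); // 0..bound inclusive
    }
}
```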
-
[JPA + Spring Mastery 01] L1 Cache · flush · Transaction Lifecycle — what readOnly really shaves off, dirty checking's true cost
How PersistenceContext sits on Fowler's Identity Map (PoEAA, 2002), how the four ActionQueue lists decide SQL emission order, what AutoFlush actually inspects right before a query, and how dirty checking's cost differs sharply between reflection and bytecode enhancement — decomposed line-by-line from Hibernate 6's DefaultFlushEventListener through Kakao Pay's readOnly + set_option (QPS +58%) report. The +0.4 ms baseline measured between raw JDBC (0.74 ms) and JPA variants (0.99–1.95 ms) is unpacked, and `readOnly`'s three-layer effect (Hibernate flush mode + Spring tx marker + MySQL Com_set_option round-trips) is taken apart layer by layer.
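The Identity Map idea reduces to one data structure. A toy sketch (my simplification; Hibernate's StatefulPersistenceContext does far more, but the `==` guarantee and the single SELECT come from exactly this shape):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// A first-level cache reduced to its PoEAA skeleton: a map keyed by id.
// Repeated finds within one "persistence context" hit the map, return the
// same instance (so == holds), and the loader (the SELECT) runs only once.
public class IdentityMapDemo {
    static class PersistenceContext<K, V> {
        private final Map<K, V> firstLevel = new HashMap<>();
        V find(K id, Function<K, V> loader) {
            return firstLevel.computeIfAbsent(id, loader); // cache hit: no SQL
        }
    }

    public static void main(String[] args) {
        AtomicInteger selects = new AtomicInteger();
        PersistenceContext<Long, Object> ctx = new PersistenceContext<>();
        Object a = ctx.find(1L, id -> { selects.incrementAndGet(); return new Object(); });
        Object b = ctx.find(1L, id -> { selects.incrementAndGet(); return new Object(); });
        System.out.println(a == b);        // true: same instance both times
        System.out.println(selects.get()); // 1: the second find never loaded
    }
}
```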
-
[JPA + Spring Mastery 08] Transaction Split Patterns — Saga / Outbox / REQUIRES_NEW, from academic origins to a 9-scenario EXP-09b measurement
The maxim *don't call external APIs inside a transaction* is well known; the *how* is rarely treated honestly. This article goes from PROPAGATION's seven semantics to 2PC (XA)'s limits, to Garcia-Molina's 1987 Sagas paper, Pat Helland's CIDR 2005 Data on the Outside, and Vogels's ACM Queue 2008 Eventually Consistent — then through Toss SLASH24's SAGA and 29CM and Ridi's Outbox in production — and lands on the EXP-09b 9-scenario measurement matrix (patterns A/B/C × OFF/DB_FAIL/EXT_FAIL). Payments go to Saga, notifications to Outbox, cache-only work to a plain split — academic + production + measurement, in three layers.
-
[JPA + Spring Mastery 07] Spring AOP self-invocation — the real reason @Transactional doesn't work, decomposed down to TransactionInterceptor.invoke's 6 stages
In an optimistic-lock measurement, successes=100 but balance stays at 100. The code logic was fine — the cause was a same-class call bypassing Spring AOP's proxy, so @Transactional never fired and flush never happened. This article decomposes the 6 stages of TransactionInterceptor.invoke, the line in MethodInvocation.proceed() that calls the raw target, the 6 annotations sharing the same trap (@Async / @Cacheable / @Validated / @Retryable / @PreAuthorize), and 4 workarounds (separate bean / getBean(self) / AopContext.currentProxy / AspectJ weaving), citing Spring 6 / Hibernate 6 source line-by-line.
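The bypass itself can be shown with nothing but a JDK dynamic proxy, a minimal stand-in (my analogy, not Spring code) for the proxy Spring wraps around a @Transactional bean:

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

// An interceptor counts "advised" calls, like TransactionInterceptor would.
// outer() calling inner() goes through `this`, never through the proxy,
// so the advice fires once even though two interface methods executed.
public class SelfInvocationDemo {
    interface Service { void outer(); void inner(); }

    static class ServiceImpl implements Service {
        public void outer() { inner(); } // self-invocation: bypasses the proxy
        public void inner() { }
    }

    static int countAdvisedCalls() {
        AtomicInteger advised = new AtomicInteger();
        Service target = new ServiceImpl();
        Service proxy = (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[]{Service.class},
                (p, m, a) -> {
                    advised.incrementAndGet();  // stand-in for transaction advice
                    return m.invoke(target, a); // delegate to the raw target
                });
        proxy.outer(); // outer() is advised; its internal inner() call is not
        return advised.get();
    }

    public static void main(String[] args) {
        System.out.println(countAdvisedCalls()); // 1, not 2
    }
}
```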
-
MySQL Credit Deduction — 4 Locks Compared, Pessimistic at 180ms / 100% accurate, plus the self-invocation trap I hit during measurement
An ordinary scenario — 100 workers concurrently subtracting 1 from an account with balance 100. Four lock strategies (optimistic / pessimistic / MySQL GET_LOCK / Redisson) produce four different results — pessimistic 180ms / 100% / balance 0, optimistic 549ms (retry storm under contention), GET_LOCK 5015ms (advisory lock cost), Redisson 53/100 (single-instance limitation). And during measurement I hit the self-invocation trap — successes=100 but the balance never moved. The real Spring/JPA pitfall is not logic; it is AOP proxy bypass. A walkthrough that includes hands-on demos of GET_LOCK's connection-bound traps across 4 scenarios.
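Why the pessimistic path alone lands exactly on 0 is easiest to see in a JVM-local analogue, with a ReentrantLock standing in for SELECT ... FOR UPDATE (my analogy; no database involved):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// 100 workers each subtract 1 from a balance of 100 while holding an
// exclusive lock. Serializing the read-modify-write is what guarantees 0;
// without the lock, interleaved reads would lose updates.
public class PessimisticDemo {
    static int balance;
    static final ReentrantLock rowLock = new ReentrantLock();

    static int run() throws InterruptedException {
        balance = 100;
        ExecutorService pool = Executors.newFixedThreadPool(16);
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> {
                rowLock.lock();               // acquire the "row lock"
                try { balance -= 1; }         // read-modify-write, now atomic
                finally { rowLock.unlock(); }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return balance;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // 0: no lost updates
    }
}
```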
-
RDB Mastery #3 — Mastering EXPLAIN ANALYZE: Push-Down Traps and the Real Mechanics of Index Selection
Once you can read a single line of an EXPLAIN ANALYZE operator tree, you can directly verify the optimizer's decisions. `Filter:` vs `Index range scan over` — a one-line difference in the operator tree that splits push-down success from failure. The ANSI SQL standard row constructor (a,b)<(?,?) doesn't match the MySQL optimizer's whitelist patterns and fails to push down — Bug #16247, filed in 2006, is a long-standing known limitation (currently marked duplicate in the tracker). Index selection is likewise a cost-based judgment by the optimizer — the Q2 paradox, where with a small LIMIT 5 the optimizer picks the wrong index, so adding an index makes the query slower. The optimizer is not right 100 percent of the time. We unpack the push-down mechanism and the internals of cost-based index selection by reading 5 EXPLAIN ANALYZE outputs line by line, measured on 10 million rows.
-
RDB Mastery #2 — MySQL Index Types: B-tree / Hash / Covering / Composite / Multi-valued / Functional, and When to Pick What
Not every index in InnoDB is a B-tree. Hash (Memory engine), Spatial (R-tree), Full-text (inverted index), Multi-valued (8.0+, JSON arrays), Functional (8.0.13+, expressions). And even within the B-tree family, clustered vs secondary, covering or not, the leftmost-prefix rule for composites, and cardinality / selectivity become the decision axes. I built five real indexes on a 10M-row table and decided when to pick what by measuring cardinality plus Q1~Q5 latency. Q3 covering 2,476x / Q5 composite 577x / the Q2 paradox where adding an index made a query slower (0.66ms → 13.5ms). Indexes are not free — write cost 5~6x plus 1.3GB of storage. Worked through to the end with 9 diagrams.
-
RDB Mastery #1 — InnoDB Index Internals: From No-Index to Multi-Index, the Real Picture B-trees Draw
Even when you don't define an index, InnoDB already stores rows inside a B-tree. PK = clustered index = the table itself. Secondary index = a separate B-tree that points to PK. Covering index = an index where the answer lives in the leaf, no PK lookup needed. Reverse scan = walking the leaf doubly-linked list backward. OFFSET cannot skip because B-trees do not maintain row counters. A cursor is fast because its WHERE clause triggers the B-tree's binary-search primitive. Multi-index means N B-trees on the same table. In a 10M-row environment, [measured] Q3 covering 2,476x / Q5 composite 577x / OFFSET 1M = 171ms / cursor = 0.30ms — worked through to the end with 10 diagrams.
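The seek-vs-skip distinction can be felt with a TreeMap standing in for a B-tree leaf chain (my analogy; a red-black tree, not a real B-tree, but the same point: ordered structures seek by key, they do not count rows):

```java
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// tailMap(key, false) is the O(log n) seek a cursor's WHERE clause triggers.
// An OFFSET, by contrast, must walk and discard entries one by one, because
// the tree keeps no positional counters to skip with.
public class SeekVsOffset {
    static List<Integer> cursorPage(NavigableMap<Integer, String> index,
                                    int lastSeenId, int size) {
        return index.tailMap(lastSeenId, false).keySet()
                    .stream().limit(size).toList();
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> index = new TreeMap<>();
        for (int id = 1; id <= 1_000_000; id++) index.put(id, "row-" + id);
        System.out.println(cursorPage(index, 999_000, 3)); // [999001, 999002, 999003]
    }
}
```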
-
Decoding HikariCP Pool Exhaustion via JVM Thread Dump — What TIMED_WAITING (parked) Really Means
When the pool exhaustion alert fires, staring at application code yields nothing. The thread dump from jstack is the real evidence — every worker thread is frozen in HikariCP at TIMED_WAITING (parked). I walk through the JVM Thread State machine, LockSupport.parkNanos, the ConcurrentBag and SynchronousQueue mechanics, and how the transaction-with-external-call pool-exhaustion measurement [measured] (timeout 5s = 100% pass / 1s = 16.7%) maps line-by-line to the dump — diagnosing pool exhaustion from a single dump in production.
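What "TIMED_WAITING (parking)" in a jstack dump literally is can be reproduced in a few lines: a thread sitting in LockSupport.parkNanos, which is where HikariCP waiters block while waiting for a connection with a timeout. A minimal sketch, no pool involved:

```java
import java.util.concurrent.locks.LockSupport;

// A thread parked with a deadline shows up as TIMED_WAITING, exactly the
// state every frozen worker reports in the dump described above.
public class ParkedState {
    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() ->
                LockSupport.parkNanos(10_000_000_000L)); // park for up to 10s
        waiter.start();
        Thread.sleep(500);                 // give it time to actually park
        System.out.println(waiter.getState()); // expected: TIMED_WAITING
        LockSupport.unpark(waiter);        // release it promptly
        waiter.join();
    }
}
```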
-
MySQL No-Offset Cursor Pagination — At 10M rows, OFFSET 1M takes 171ms / Cursor 0.30ms, and the 500x trap between them, traced down to a single line
On a 10M-row table, OFFSET 1M takes 171ms while a No-Offset cursor takes 0.30ms — about 570x faster, reproduced by direct measurement. But how you write the No-Offset code splits another 500x. The ANSI SQL row constructor `(a,b)<(?,?)` is logically equivalent to the OR-split form, yet the MySQL optimizer cannot push it down to an index range (154ms — about the same as OFFSET). The single line in EXPLAIN ANALYZE — `Filter:` vs `Covering index range scan over` — is the root cause. A production retrospective combined with a reproducible learning environment.
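The equivalence the optimizer fails to exploit is easy to state and to check exhaustively. A sketch of both predicates as plain boolean functions:

```java
// (a,b) < (x,y) two ways: lexicographic row comparison as ANSI defines it,
// and the OR-split rewrite that MySQL can push down to an index range.
// Logically identical; only the second matches the optimizer's patterns.
public class RowConstructor {
    static boolean rowLess(int a, int b, int x, int y) {
        return a != x ? a < x : b < y;            // (a, b) < (x, y)
    }

    static boolean orSplit(int a, int b, int x, int y) {
        return a < x || (a == x && b < y);        // a < x OR (a = x AND b < y)
    }
}
```

Exhaustively comparing both forms over a small grid confirms they agree everywhere, which is exactly why the 500x gap is an optimizer limitation and not a semantic difference.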
-
MySQL InnoDB Isolation Levels — Measuring phantom reads across all 4 levels and decomposing why InnoDB RR is stronger than the ANSI standard
The ANSI SQL standard does not guarantee that REPEATABLE READ blocks phantom reads. Yet MySQL InnoDB's RR does. I nailed down this commonly-cited claim with direct measurements — RU/RC: phantom occurs (A1=0 → INSERT → A2=1), RR: blocked (A2=0), SERIALIZABLE: the INSERT itself waits 1.56s. Then I decomposed why InnoDB RR is stronger than the ANSI standard via three mechanisms — consistent read snapshot, gap lock, and the MVCC undo log — and confirmed by measurement that for payment domains, RR alone is sufficient.
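The consistent-read half of the story fits in a toy model: RR reads from the snapshot taken at its first SELECT, so a concurrently committed INSERT never appears. A sketch (my simplification; real MVCC reconstructs versions from the undo log rather than copying the table):

```java
import java.util.HashMap;
import java.util.Map;

// A REPEATABLE READ transaction keeps re-reading its own snapshot, while a
// READ COMMITTED re-read sees the concurrently inserted row as a phantom.
public class SnapshotRead {
    public static void main(String[] args) {
        Map<Integer, String> table = new HashMap<>();
        table.put(1, "row-1");

        Map<Integer, String> rrSnapshot = new HashMap<>(table); // first read: A1 = 1 row
        table.put(2, "row-2");                                  // concurrent INSERT commits

        System.out.println(table.size());      // 2: RC re-read sees the phantom
        System.out.println(rrSnapshot.size()); // 1: RR still sees its snapshot
    }
}
```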
-
External API Calls Inside Transactions — Reproducing Pool Exhaustion and Comparing Simple Split, Saga, and Outbox by Measurement
I reproduced HikariCP pool exhaustion caused by external API calls inside transactions in a Spring + raw JDBC environment, then compared three remedies — Simple Split, Saga, and Outbox — across 60 workers × 9 chaos scenarios. I caught the moment Simple Split breaks consistency as 60 mismatched records, watched Saga's three-tier safety net trigger in sequence, and saw how Outbox's 72ms ACK and 93-second average completion split the same dataset into opposite conclusions depending on which metric you read.
-
Proper Connection Pool Configuration in TypeORM & NestJS
A deep dive into connection pool configuration in TypeORM and mysql2, inspired by Naver D2's Commons DBCP guide. Learn how to calculate required connections using TPS formulas, compare Before/After production code, and understand each configuration option in depth.
-
When equals/hashCode Goes Wrong: A Duplicate Payment Incident Post-Mortem
A deep dive into how forgetting to override hashCode() while implementing equals() caused duplicate payments. Includes Kafka TopicPartition analysis, HashMap internals, and code review checklists.
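The failure mode reduces to a few lines: equals() overridden, hashCode() not, so two "equal" keys land in different HashMap buckets and deduplication silently misses. A sketch with a hypothetical key type (not the incident code, and not Kafka's real TopicPartition):

```java
import java.util.HashSet;
import java.util.Set;

// equals() says the two keys are the same; the inherited identity hashCode()
// says they are not, so HashSet almost never finds the "duplicate" and the
// same payment gets processed twice.
public class BrokenKey {
    static final class TopicPartitionLike {
        final String topic; final int partition;
        TopicPartitionLike(String t, int p) { topic = t; partition = p; }

        @Override public boolean equals(Object o) {
            return o instanceof TopicPartitionLike tp
                    && tp.topic.equals(topic) && tp.partition == partition;
        }
        // hashCode() deliberately NOT overridden: inherits identity hash
    }

    public static void main(String[] args) {
        Set<TopicPartitionLike> seen = new HashSet<>();
        seen.add(new TopicPartitionLike("payments", 0));
        // Logically a duplicate, but the lookup hashes to a different bucket:
        System.out.println(seen.contains(new TopicPartitionLike("payments", 0)));
    }
}
```

Adding `@Override public int hashCode() { return java.util.Objects.hash(topic, partition); }` restores the equals/hashCode contract and makes the lookup succeed.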
-
Debugging a Memory Leak in Browser Automation: The Perfect Storm of Three Cleanup Paths
A deep dive into debugging a memory leak in a production system managing 50 concurrent Firefox browsers. The story of how Promise.race and finally blocks created a double-cleanup bug, and the journey to fix it.
-
Multi-Platform Database Design: Building Enterprise-Grade Logging Systems
From specialized table design for new platform integration to AI-driven design validation, index optimization, and partitioning strategies: a complete guide to enterprise-grade database design.
-
Dissecting Kotlin's toSet(): Engineering is About Explaining Choices
A deep dive into Kotlin's toSet() method from JVM memory model to production environments. Analyzing standard library design decisions, memory overhead, GC impact, and practical guidelines for high-traffic systems.