"Make it faster" is not a plan. The plan is: measure, find the p95, fix the biggest offender, repeat.

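To make "measure, find the p95" concrete, here is a minimal sketch of ranking endpoints by p95 latency. The route names and sample values are invented for illustration, and nearest-rank is only one of several percentile definitions; a real setup would more likely read these numbers from an APM tool or a metrics store.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical per-endpoint latency samples, in milliseconds.
const latenciesMs = {
  "/orders": [40, 42, 45, 48, 50, 52, 55, 60, 300, 900],
  "/health": [2, 2, 3, 3, 3, 4, 4, 5, 5, 6],
};

// Rank endpoints by p95, worst first: the top entry is the next thing to fix.
const ranked = Object.entries(latenciesMs)
  .map(([route, samples]) => ({ route, p95: percentile(samples, 95) }))
  .sort((a, b) => b.p95 - a.p95);
```

The median of "/orders" looks healthy here; only the tail gives the problem away, which is why the loop is driven by p95 rather than the average.
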
Where the time actually went

Profiling showed most latency was database round-trips, not CPU. Two patterns dominated:

  • N+1 queries hiding behind ORM lazy-loading
  • Endpoints re-computing the same aggregate on every request
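
One way to surface the first pattern is to log every database round-trip made while serving a single request and look for the same kind of query repeating. The sketch below assumes such a log already exists; the log format, the function name findRepeatedQueries, and the threshold of 3 are all hypothetical, not something taken from a particular ORM.

```javascript
// Hypothetical query log for one request: one entry per database round-trip.
const queryLog = [
  { collection: "orders", op: "find" },
  { collection: "items", op: "find" },
  { collection: "items", op: "find" },
  { collection: "items", op: "find" },
];

// Count round-trips per (collection, op) pair and return the pairs that
// occur at least `threshold` times within the request.
function findRepeatedQueries(log, threshold = 3) {
  const counts = new Map();
  for (const { collection, op } of log) {
    const key = `${collection}.${op}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts]
    .filter(([, n]) => n >= threshold)
    .map(([key, n]) => ({ key, n }));
}

const suspects = findRepeatedQueries(queryLog);
```

A count that grows with the number of rows on the page is the usual N+1 signature; a fixed handful of repeats is more likely just a chatty endpoint.
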

The fixes

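// before: one query per order (N+1)
for (const o of orders) o.items = await Item.find({ orderId: o.id });
// after: one query for all orders, grouped in memory
const ids = orders.map((o) => o.id);
const items = await Item.find({ orderId: { $in: ids } });
const byOrder = new Map(orders.map((o) => [String(o.id), (o.items = [])]));
for (const i of items) byOrder.get(String(i.orderId))?.push(i); // String() so object-valued ids match by value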

Adding a short-lived cache on hot aggregates and collapsing N+1s into batched lookups took p95 down by roughly a third, with zero change to the framework.
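
For the cache half, a short-lived cache can be as small as a Map with an expiry timestamp per entry. This is a sketch under assumptions rather than the code that shipped: createTtlCache, the "revenue:today" key, the 5-second TTL, and the stand-in aggregate are all made up, and the clock is injectable only so that expiry can be shown without waiting.

```javascript
// Minimal TTL cache sketch. `now` defaults to Date.now and can be replaced
// with a fake clock in tests.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, expiresAt }
  return {
    getOrCompute(key, compute) {
      const hit = entries.get(key);
      if (hit && hit.expiresAt > now()) return hit.value; // still fresh
      const value = compute(); // miss or expired: recompute and store
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },
  };
}

// Usage with a fake clock and a counter standing in for the aggregate query.
let fakeTime = 0;
let computeCalls = 0;
const cache = createTtlCache(5000, () => fakeTime);
const computeRevenue = () => {
  computeCalls += 1; // each call represents one trip to the database
  return 1234; // hypothetical aggregate value
};

cache.getOrCompute("revenue:today", computeRevenue); // miss, computes
cache.getOrCompute("revenue:today", computeRevenue); // hit, within TTL
fakeTime = 6000; // advance past the 5000 ms TTL
cache.getOrCompute("revenue:today", computeRevenue); // expired, recomputes
```

If compute returns a promise, the promise itself is what gets cached. That also collapses concurrent requests into one query, but a rejected promise would then be served until the entry expires, so a production version would want to evict on failure.
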

> The fastest code is the query you don't send.