"Make it faster" is not a plan. The plan is: measure, find the p95, fix the biggest offender, repeat.
Where the time actually went
Profiling showed most latency was database round-trips, not CPU. Two patterns dominated:
- N+1 queries hiding behind ORM lazy-loading
- Endpoints re-computing the same aggregate on every request
The fixes
// before: one query per item
for (const o of orders) o.items = await Item.find({ orderId: o.id });
// after: one query, grouped in memory
const items = await Item.find({ orderId: { $in: ids } });
Adding a short-lived cache on hot aggregates and collapsing N+1s into batched lookups took p95 down by roughly a third — with zero change to the framework.
> The fastest code is the query you don't send.