
Reducing MongoDB Costs by 79 % with Schema-First Improvements

Protect your database from future fires before they burn through your Series A capital.




1 · The Day the Bill Went Nuclear

The call came in at 02:17.

Atlas had auto-scaled a misbehaving production cluster up to an M60, a $15 k/month machine, and the board wanted to know why our monthly burn had jumped 20 %.

I opened the profiler:

db.system.profile.aggregate([
  { $match: { millis: { $gt: 100 } } },
  { $group: {
      _id: { op: "$op", ns: "$ns", query: "$command.filter" },
      n: { $sum: 1 },
      avgMs: { $avg: "$millis" }
  }},
  { $sort: { n: -1 } }, { $limit: 10 }
]).pretty();
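
The aggregation above assumes the profiler is already switched on. If it isn't, one line in mongosh enables it (level 1 logs anything slower than slowms):

// Enable level-1 profiling: log operations slower than 100 ms
db.setProfilingLevel(1, { slowms: 100 });

// Verify the current setting
db.getProfilingStatus();   // { was: 1, slowms: 100, ... }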

The culprit: every user-dashboard request was pulling a combined 1.7 GB per minute before serving a single panel. The memory-usage chart peaked like Everest.

The same cluster now runs on an M30, and the fix required no extra shards. Three common defects, the "crimes" of this story, were sitting in the codebase awaiting trial.


2 · Crime Scene Investigation

2.1 The N + 1 Query Tsunami

You will recognise the pattern: the app requests one set of orders, then runs a separate query per order to fetch its lines.

// Bad: 1 query for the orders + 1,000 extra queries for their lines
const orders = await db.orders.find({ userId }).toArray();
for (const o of orders) {
  o.lines = await db.orderLines.find({ orderId: o._id }).toArray();
}

Hidden taxes:

| Tax | Why it adds up |
| --- | --- |
| CPU | 1,000 queries = 1,000 context switches |
| Storage I/O | 1,000 index walks + 1,000 document deserialisations |
| Network | Every round trip eats ~1 ms of RTT plus TLS handshakes |

Refactor (4 lines):

// Good: single round-trip, one read unit per order
db.orders.aggregate([
  { $match: { userId } },
  { $lookup: {
      from: "orderLines",
      localField: "_id",
      foreignField: "orderId",
      as: "lines"
  }},
  { $project: { lines: 1, total: 1, ts: 1 } }
]);

P95 latency dropped from 2,300 ms to 160 ms.

Atlas read ops: 101 → 1. That is a 99 % discount, no voucher code required.


2.2 Unbounded Queries

“But we have to show the full click history!”

Sure. Just not in a single query.

// Bad: streams 30 months of data through the API gateway
db.events.find({ userId }).toArray();

The fix: hard-cap the batch and project only the fields you actually serve.

db.events.find(
  { userId, ts: { $gte: new Date(Date.now() - 1000*60*60*24*30) } },  // last 30 days only
  { _id: 0, ts: 1, page: 1, ref: 1 }     // projection: just the fields we render
).sort({ ts: -1 }).limit(1000);

Then let Mongo clean up after you:

// 90‑day sliding window
db.events.createIndex({ ts: 1 }, { expireAfterSeconds: 60*60*24*90 });

A fintech client cut its storage bill by 72 % overnight simply by adding TTL indexes.


2.3 The Jumbo-Document Money Pit

Mongo caps documents at 16 MB, but anything over 256 kB is already a red flag.

{
  "_id": "...",
  "type": "invoice",
  "customer": { /* 700 kB */ },
  "pdf": BinData(0,"..."),        // 4 MB binary
  "history": [ /* 1 200 delta rows */ ],
  "ts": ISODate()
}

Why it hurts

  1. The entire document is pulled into cache even if you read a single field.

  2. WiredTiger fits fewer documents per page → lower cache hit rate.

  3. Huge index entries → bloom-filter misses → more disk seeks.

The solution: design around the access pattern:

graph TD
  Invoice[(invoices <2 kB)] -->|ref| Hist[history <1 kB × N]
  Invoice -->|ref| Bin[pdf-store (S3/GridFS)]

The small invoice stubs stay hot in cache. The PDFs sit in S3 at $0.023/GB instead of on Atlas's NAND SSDs.
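
A sketch of what the split might look like; the field names and the S3 key scheme here (customerId, pdfKey) are illustrative assumptions, not the client's actual schema:

// Hot stub: ~2 kB, stays in cache
{
  _id: "inv_8871",
  type: "invoice",
  customerId: "cus_442",                   // reference instead of a 700 kB embed
  total: 129.90,
  pdfKey: "s3://invoices/inv_8871.pdf",    // 4 MB blob lives in S3/GridFS
  ts: ISODate()
}

// Cold history: one small doc per delta, TTL-able
{ invoiceId: "inv_8871", seq: 17, delta: { status: "paid" }, ts: ISODate() }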


3 · Four More Crimes You May Be Guilty Of

  1. Low-cardinality index prefix ({ type: 1, ts: -1 }) – reorder it to { userId: 1, ts: -1 }.
  2. $regex without a ^ anchor on an unindexed field – a collection scan from hell.
  3. findOneAndUpdate job queues – a document-level locking bottleneck; use Redis/Kafka instead.
  4. skip with large offsets for pagination – Mongo must walk every skipped document; switch to range cursors on (ts, _id), as sketched below.
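
A minimal sketch of the fix for crime #4, assuming the client remembers the last (ts, _id) pair it saw; lastTs and lastId are illustrative names:

// Bad: skip(100000) walks and discards 100,000 documents first
db.events.find({ userId }).sort({ ts: -1, _id: -1 }).skip(100000).limit(50);

// Good: range cursor resumes exactly where the previous page ended
db.events.find({
  userId,
  $or: [
    { ts: { $lt: lastTs } },
    { ts: lastTs, _id: { $lt: lastId } }
  ]
}).sort({ ts: -1, _id: -1 }).limit(50);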

4 · Bill Autopsy 101

“But Atlas says reads are cheap!”

Let’s do the math.

| Metric | Value | Unit cost | Monthly cost |
| --- | --- | --- | --- |
| Reads (3 k/s) | 7.8 B | $0.09 / M | $702 |
| Writes (150/s) | 380 M | $0.225 / M | $86 |
| Data transfer | 1.5 TB | $0.25 / GB | $375 |
| Storage | 2 TB | $0.24 / GB | $480 |

Total: $1,643.

Apply the fixes:

  • Reads drop 70 % → $210
  • Transfer drops 80 % → $75
  • Storage drops 60 % → $192

New bill: $564. The savings are a mid-level engineer or the runway to Q4 – take your pick.


5 · The 48-Hour Rescue Sprint (Battle Timeline)

| Hour | Action | Tool | Win |
| --- | --- | --- | --- |
| 0–2 | Run the profiler (slowms = 50). | mongo shell | Surfaces the top-10 slow ops. |
| 2–6 | Convert N + 1 loops into $lookup. | Code + Jest tests | 90 % fewer reads. |
| 6–10 | Add projections & limits to unbounded finds. | API layer | RAM stabilises; API 4× faster. |
| 10–16 | Split jumbo docs → metadata + GridFS/S3. | ETL scripts | Working set fits in RAM. |
| 16–22 | Drop/reorder low-cardinality indexes. | Compass | Disk shrinks; write throughput ↑. |
| 22–30 | Create TTLs, archive cold data monthly, enable Online Archive. | Atlas UI | 60 % less storage. |
| 30–36 | Add Grafana panels: cache hit %, scanned:returned ratio, eviction rate. | Prometheus | Early visual warnings. |
| 36–48 | Load-test with k6. | k6 + Atlas metrics | Confirms P95 < 150 ms @ 2× load. |
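
A minimal k6 sketch for the final load-test step; the endpoint URL, VU count, and think-time below are placeholder assumptions, only the P95 threshold comes from the table:

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 200,                             // assumed ~2x normal concurrency
  duration: '10m',
  thresholds: {
    http_req_duration: ['p(95)<150'],   // fail the run if P95 >= 150 ms
  },
};

export default function () {
  http.get('https://api.example.com/dashboard?userId=42');  // placeholder endpoint
  sleep(1);
}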


6 · Self-Audit Checklist (Pin It Above Your Desk)

  • Largest document more than 10× the median? → Refactor.

  • Any query returning > 1,000 documents? → Cap it.

  • TTL on every event/log collection? (yes/no)

  • Any index with cardinality < 10 %? → Drop or reorder.

  • Profiler slow ops > 1 % of total operations? → Optimise or cache.

If the cache hit ratio still sits below 90 % after deploying these fixes, only then is it sensible to shard the cluster or add more RAM.
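
You can spot-check that ratio from mongosh via serverStatus(); a rough sketch using WiredTiger's page counters (worth verifying the counter names against your server version):

// Approximate cache hit ratio from WiredTiger statistics
const c = db.serverStatus().wiredTiger.cache;
const requested = c["pages requested from the cache"];
const readIn = c["pages read into cache"];
print(`cache hit ratio: ${((1 - readIn / requested) * 100).toFixed(1)} %`);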

Print the checklist and tape it to your laptop.


7 · Why Indexes Alone Won't Save You

The MongoDB query planner runs a cost-based search over candidate plans. The cost vector includes:

workUnits = ixScans + fetches + sorts + #docs returned

Indexes only reduce ixScans. A bad schema inflates fetches and sorts, and those often dominate. Example:

db.logs.find(
  { ts: { $gte: start, $lt: end }, level: "error" }
).sort({ level: 1, ts: -1 });

The index { level: 1, ts: -1 } doesn't save the planner from fetching every matching document once your projection includes a field the index doesn't carry. Net result: 20 k fetches for 200 hits. Schema shape has to come before index tuning in day-to-day operations.
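
One way to verify and fix this; extending the index so the projection is covered is an illustrative option (msg is a hypothetical field), not the only remedy:

// Check how many documents the plan actually fetches
db.logs.find({ ts: { $gte: start, $lt: end }, level: "error" })
  .sort({ level: 1, ts: -1 })
  .explain("executionStats").executionStats.totalDocsExamined;

// If the app only needs (level, ts, msg), make the query covered:
db.logs.createIndex({ level: 1, ts: -1, msg: 1 });
db.logs.find(
  { level: "error", ts: { $gte: start, $lt: end } },
  { _id: 0, level: 1, ts: 1, msg: 1 }   // projection served entirely from the index
).sort({ level: 1, ts: -1 });           // totalDocsExamined drops to 0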


Live standards you should see (Grafana Promql)

# WiredTiger cache miss ratio
(rate(wiredtiger_blockmanager_blocks_read[1m]) /
 (rate(wiredtiger_blockmanager_blocks_read[1m]) +
  rate(wiredtiger_blockmanager_blocks_read_from_cache[1m]))
) > 0.10

Alert if more than 10 % of reads miss the cache over a 5 m window.

# Docs scanned vs returned
rate(mongodb_ssm_metrics_documents{state="scanned"}[1m]) /
rate(mongodb_ssm_metrics_documents{state="returned"}[1m]) > 100

If you scan 100× more documents than you return, you are burning money.


9 · Hands-On: Hot Migration Script

Need to split a 1-terabyte events collection into clicks, views, and logins without downtime? Use the dual-write / backfill pattern.

// 1. Dual-write: tail the change stream and mirror every new event
//    into its per-type collection (clicks, views, logins)
const changeStream = db.collection('events').watch([], { fullDocument: 'updateLookup' });
changeStream.on('change', ev => {
  const doc = ev.fullDocument;
  db.collection(`${doc.type}s`).insertOne(doc);
});

// 2. Backfill historical data in _id-ordered chunks of 10,000
let lastId = ObjectId("000000000000000000000000");
while (true) {
  const docs = await db.collection('events')
    .find({ _id: { $gt: lastId } })
    .sort({ _id: 1 })
    .limit(10000)
    .toArray();
  if (docs.length === 0) break;
  for (const d of docs) await db.collection(`${d.type}s`).insertOne(d);
  lastId = docs[docs.length - 1]._id;
}

Zero downtime, minimal extra storage (thanks to TTL), and everyone gets to sleep.


10 · When Sharding Is the Answer

As a rule of thumb, shard only if at least one of these conditions still holds after you have optimised your schema:

  1. The working set exceeds 80 % of RAM no matter how well the cache behaves.

  2. Peak write load exceeds 15,000 operations per second against a single primary.

  3. You must keep multi-region access latency below 70 ms, and the extra AWS cost is not your binding constraint.
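
When one of them does hold, the mechanics are a one-liner in mongosh; the namespace and shard key below are placeholders, and a sharded cluster is assumed to be provisioned already:

// Pick a shard key with the same prefix your hottest queries use
sh.shardCollection("app.events", { userId: 1, ts: 1 });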

If none of these apply, the decision is simple: don't shard.


11 · Case-Study Wrap-Up

| Metric | Before | After | Δ |
| --- | --- | --- | --- |
| RAM | 120 GB | 36 GB | −70 % |
| Reads/s | 6,700 | 900 | −86 % |
| Storage (hot) | 2.1 TB | 600 GB | −71 % |
| P95 latency | 1.9 s | 140 ms | −92 % |
| Atlas cost / mo | $15,284 | $3,210 | −79 % |

No shards, no major code freeze – just ruthless schema surgery.


12 · Takeaway: Schema Debt vs. the Death Spiral

Shipping features fast is mandatory, but leaving a sloppy schema in place is taking on debt voluntarily. The cloud provider is the lender, and the interest compounds at something like 1,000 % a year on your unpaid balance. The schema crimes we examined are MongoDB's highest-interest credit cards. Paying them off in the current sprint yields returns on both the technical and the financial dashboards.

So open the profiler, squash the N + 1s with $lookup, sprinkle on the TTL indexes, then ship your leaner schema. Your devs, and whoever is on call at 02:17, will sleep better.

Go refactor your code before the next auto-scaling incident does it for you.
