
Large MBOX performance tips.

Strategies for handling multi-gigabyte MBOX archives on Mac — streaming, splitting, memory management, disk space, and batch workflow.

Published April 22, 2026 · Updated April 22, 2026

The headline

Large MBOX files — multi-gigabyte Gmail Takeouts, decade-long Thunderbird archives, corporate mailbox exports — are common. The problem isn't the size on its own; it's that many tools load the whole file into memory before doing anything. That breaks on large archives.

The fix is a streaming architecture: read one message at a time, process it, write output, move on. MBOX to PDF uses this approach, which is why it can convert archives your Mac's RAM can't hold.

Before you start: understand your archive

Run these checks so you know what you're dealing with:

Size on disk

Use Finder's Get Info panel, or run du -h archive.mbox in Terminal.

Approximate message count

In Terminal, count From-line delimiters:

grep -c '^From ' archive.mbox

This is not perfectly accurate (body lines that begin with "From " and were not escaped by the exporting tool are counted too), but it is close enough for capacity planning.
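As a slightly stricter alternative, the sketch below counts a From line only when it is the first line of the file or follows a blank line, which is how mbox writers conventionally separate messages. The function name count_mbox_messages is made up for illustration, and the blank-line rule is an assumption about how your archive was written, so treat the result as an estimate as well.

```shell
# Count "From " separator lines that start the file or follow a blank line.
# Assumption: the archive puts a blank line before each separator.
# Lines ending in CR (\r\n files) are still recognized as blank.
count_mbox_messages() {
  awk '
    BEGIN { prev_blank = 1 }            # treat start of file like "after a blank line"
    /^From / && prev_blank { n++ }      # separator in a valid position
    { prev_blank = ($0 ~ /^\r?$/) }     # remember whether this line was blank
    END { print n + 0 }                 # n + 0 prints 0 for an empty file
  ' "$1"
}

# Usage: count_mbox_messages archive.mbox
```

If this number and the grep count differ noticeably, the archive probably contains unescaped body lines, and the lower figure is likely the better estimate.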

Attachment density

Count attachment markers:

grep -ci 'Content-Disposition: attachment' archive.mbox

Divide by the message count to get the average number of attachments per email. A high ratio (above 1.0) means your PDF output will balloon if you embed attachments, so extract them separately instead.
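The two counts and the division can be done in a single pass. This is a rough sketch: attachment_density is a hypothetical helper name, and it only matches the header when it starts a line, so it may report slightly fewer attachments than the grep above.

```shell
# Print the average number of attachment headers per message.
# Header matching is case-insensitive; both counts are estimates.
attachment_density() {
  awk '
    /^From / { messages++ }
    tolower($0) ~ /^content-disposition:[ \t]*attachment/ { attachments++ }
    END {
      if (messages == 0) { print "no messages found"; exit 1 }
      printf "%.2f\n", attachments / messages
    }
  ' "$1"
}

# Usage: attachment_density archive.mbox
```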

Tip 1 — Use a streaming converter

Non-streaming tools load the full archive into memory to build a message index before processing. With a 10 GB MBOX on a Mac with 16 GB of RAM, that can exhaust memory and crash the tool. A streaming converter reads one message at a time, so its memory usage stays flat regardless of archive size.
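The one-message-at-a-time idea can be illustrated with awk, which reads its input line by line and never holds the whole file. The sketch below finds the largest message in an archive while keeping only a few counters in memory. It illustrates the principle only and says nothing about how MBOX to PDF or any other converter is implemented; largest_message is a made-up name.

```shell
# Report the largest message in an mbox while holding only a few counters in memory.
# LC_ALL=C makes length() count bytes rather than characters.
largest_message() {
  LC_ALL=C awk '
    function finish() { if (n > 0 && size > max) { max = size; which = n } }
    /^From / { finish(); n++; size = 0 }    # a new message starts: close out the previous one
    { size += length($0) + 1 }              # +1 for the newline awk stripped
    END { finish(); printf "message %d: %d bytes\n", which, max }
  ' "$1"
}

# Usage: largest_message archive.mbox
```

Memory use is the same for a 10 MB file and a 10 GB file, because nothing from earlier messages is retained apart from the running maximum.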

MBOX to PDF is designed around streaming. BitRecover, SysTools, and Aid4Mail also stream in modern builds. Old-generation or script-based conversions may not.

Tip 2 — Use an external drive for I/O-heavy conversions

Reading a 10 GB MBOX and writing 15 GB of PDFs involves moving 25+ GB of data. On internal SSDs this is usually fine. On spinning disks, or when your internal drive is near capacity, moving the archive to a fast external SSD can double or triple throughput.

Keep input and output on the same drive if possible — cross-drive copies add overhead.

Tip 3 — Plan disk space for output

Common pattern: a 5 GB MBOX produces roughly 5–10 GB of PDFs depending on HTML formatting, embedded images, and attachment handling. HTML emails with large images embedded in the PDF can push output larger than the source.
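A quick pre-flight check can compare the free space at the output location with a rough estimate of the output size. In this sketch, check_output_space is a hypothetical helper, and the default factor of 2 is an assumption drawn from the 5 GB to 5–10 GB pattern described above; raise it if your mail is image-heavy.

```shell
# Compare free space at the output location with a rough output estimate.
# The default factor of 2 is an assumption, not a measured value.
check_output_space() {
  src=$1; dest=$2; factor=${3:-2}
  src_kb=$(du -k "$src" | awk '{ print $1 }')               # source size in KiB
  free_kb=$(df -Pk "$dest" | awk 'NR == 2 { print $4 }')    # available KiB on the output volume
  need_kb=$((src_kb * factor))
  echo "need about ${need_kb} KiB, ${free_kb} KiB free"
  [ "$free_kb" -ge "$need_kb" ]                             # exit status 0 means enough room
}

# Usage: check_output_space archive.mbox /Volumes/Archive || echo "free up space first"
```

The /Volumes/Archive path in the usage line is only a placeholder for wherever your PDFs will be written.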

Mitigation: extract attachments separately instead of embedding them, which keeps the PDFs small.

Tip 4 — Split only when you have a reason

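Splitting a large MBOX into smaller chunks is sometimes recommended online. With a streaming converter you usually don't need to. Splitting helps only when you are archiving incrementally, sharing a subset of the mail, or working around a non-streaming tool.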

How to split an MBOX

In Terminal, using awk to split by message count:

awk -v size=5000 'BEGIN { part = 1; out = "part_1.mbox" }
     # Every size messages, close the current part and start the next one.
     /^From / { if (count > 0 && count % size == 0) { close(out); out = "part_" (++part) ".mbox" }; count++ }
     { print > out }' archive.mbox

This produces chunks of 5,000 messages each (part_1.mbox, part_2.mbox, and so on). Adjust the number for your needs.
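After splitting, it is worth confirming that nothing was lost before you move or delete anything. The sketch below checks that the parts add up to the original in total bytes and in separator count; verify_split is a hypothetical name, and the check does not prove the parts are in the right order.

```shell
# Check that the split parts add up to the original:
# same total byte count and same number of "From " separator lines.
verify_split() {
  orig=$1; shift                                   # remaining arguments are the part files
  orig_bytes=$(( $(wc -c < "$orig") ))
  part_bytes=$(( $(cat "$@" | wc -c) ))
  orig_msgs=$(grep -c '^From ' "$orig")
  part_msgs=$(cat "$@" | grep -c '^From ')
  echo "bytes: $orig_bytes vs $part_bytes, separators: $orig_msgs vs $part_msgs"
  [ "$orig_bytes" -eq "$part_bytes" ] && [ "$orig_msgs" -eq "$part_msgs" ]
}

# Usage: verify_split archive.mbox part_*.mbox
```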

Tip 5 — Use batch mode carefully

If you have multiple MBOX files (e.g. from different Thunderbird folders or multiple Gmail Takeout pieces), you can drag them all in at once. MBOX to PDF treats them as a combined dataset.

The tradeoff: combining produces a single unified chronological output. Folder boundaries are lost. If you need folder-per-folder PDF organization, run separate conversions and assemble the output structure yourself.

Tip 6 — Run conversions overnight

A 20 GB MBOX with tens of thousands of messages can take an hour or more end-to-end — not because the tool is slow, but because there's genuinely a lot of work per message (HTML parsing, image embedding, layout, PDF encoding).

For very large archives, start the conversion before leaving for the day. Prevent your Mac from sleeping for the duration (System Settings › Battery or Energy Saver), and check back in the morning.
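If you would rather not change system settings, macOS ships a caffeinate command that holds off idle sleep for a fixed time. The wrapper below is a sketch: keep_awake is a made-up name and the 8-hour default is arbitrary. Note that -i prevents idle sleep only; closing a laptop lid can still put the machine to sleep.

```shell
# Keep the Mac awake for a fixed number of hours while a conversion runs.
keep_awake() {
  hours=${1:-8}
  if ! command -v caffeinate >/dev/null 2>&1; then
    echo "caffeinate not found (it is a macOS tool)" >&2
    return 1
  fi
  # -i prevents idle sleep; -t sets how long the assertion lasts, in seconds.
  caffeinate -i -t "$((hours * 3600))" &
  echo "caffeinate running as PID $! for $hours hour(s)"
}

# Usage: keep_awake 10
```

To end it early, kill the printed PID; normal sleep behavior resumes as soon as caffeinate exits.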

Tip 7 — Test with a sample first

Before committing to a multi-hour conversion, run the same settings on a small subset (50–100 emails). Verify that the output looks right: pagination, watermarks, attachment extraction. Catching a margin mistake on a sample beats catching it after a 2-hour full run.

MBOX to PDF's real-time preview handles most of this without even running the full conversion.
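If you do want a separate sample file for a trial run, the first N messages can be copied out without reading the whole archive. A minimal sketch, assuming messages are delimited by From lines as in the counting command earlier; mbox_head is a hypothetical name.

```shell
# Write the first N messages of an mbox to standard output (default 100).
# awk exits as soon as message N+1 starts, so the rest of the archive is never read.
mbox_head() {
  awk -v limit="${2:-100}" '
    /^From / { n++ }
    n > limit { exit }
    { print }
  ' "$1"
}

# Usage: mbox_head archive.mbox 100 > sample.mbox
```

The first messages in an archive are not necessarily representative, so skim the sample to make sure it includes HTML mail and attachments.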

Tip 8 — Monitor system resources if unsure

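If you're concerned about memory or disk pressure, open Activity Monitor during the conversion.

Healthy signs: the converter's memory stays roughly flat as the conversion progresses, which is what a streaming tool should show.

Red flags: memory that keeps climbing with progress, or free disk space running low.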
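As a command-line complement to Activity Monitor, the sketch below prints a process's resident memory at fixed intervals so you can see whether it stays flat or keeps growing. sample_rss is a made-up helper, and the process name in the usage line is an assumption; check Activity Monitor for the real one.

```shell
# Print the resident memory (in KiB) of a process at fixed intervals.
sample_rss() {
  pid=$1; interval=${2:-5}; samples=${3:-12}
  i=0
  while [ "$i" -lt "$samples" ]; do
    rss=$(ps -o rss= -p "$pid") || return 1    # stop if the process has exited
    echo "$(date +%H:%M:%S) $((rss)) KiB"
    i=$((i + 1))
    sleep "$interval"
  done
}

# Usage (the process name is an assumption):
#   sample_rss "$(pgrep -x 'MBOX to PDF')" 5 12
```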

Tip 9 — Keep the source MBOX after conversion

Don't delete the original after converting to PDF. If you ever want to re-run with different settings (different watermark, attachment handling, page size), you'll need the source. Store the MBOX alongside the PDFs in a read-only archive folder.

Tip 10 — For very large corporate archives, consider server-side conversion

If you're dealing with 50+ GB mailboxes or multi-custodian collections in the hundreds of GB, a local Mac conversion isn't the right tool. Server-side processing via a dedicated eDiscovery platform handles that scale better. See the legal discovery guide for when to escalate.

Frequently asked questions

How large can MBOX files get?

No hard limit. Multi-gigabyte files are common. Practical limit is available disk space.

Does MBOX to PDF handle large archives?

Yes — streaming engine reads one message at a time, so RAM isn't the constraint.

Should I split my MBOX?

Usually not necessary with a streaming converter. Split only for incremental archiving, sharing subsets, or when forced by a non-streaming tool.

How much disk space do I need for output?

Plan for PDF output comparable to or larger than the source, depending on HTML and image density. Extract attachments separately to keep PDFs small.

Why does converting feel slow on huge archives?

Per-message rendering adds up: tens of thousands of messages take tens of minutes. Streaming gives steady progress, not instant results.

Can I pause and resume?

MBOX to PDF runs in one pass. If you cancel and restart, it re-processes from the beginning. Partial output already written stays on disk.
