HOWTO Get a load of hard-disk space back

A handy tip -- if you use Thunderbird to get your email, don't forget to occasionally run File -> Compact Folders. I did so yesterday and reclaimed nearly 20GB of hard drive space! Comparing my mail folder to my backup, I discovered that every single email that I'd "deleted" for over a year (by putting it in the Trash and then emptying it) was still lurking on my disk.

Discussion

Take a look at this

There is a setting to compact space automatically.

Look under prefs->advanced->network & disk space

Then click the 'compact folders when it will save over _____ KB' checkbox.

Adjust the numerical value as desired.

Take a look at this

While on the topic of finding free disk space, I highly recommend WinDirStat http://windirstat.info/ for Windows (open sauce!)

Take a look at this

And you've been using computers/Thunderbird for how many years? :)

Take a look at this

And you've been using computers/Thunderbird for how many years? :)

LOL This!

Take a look at this

Next tip: Empty the trash/recycle bin. It'll BLOW YOUR MIND.

Take a look at this

Discovered exactly the same thing this week after reinstalling my OS and copying over my old mail folders. It seems odd to me that Thunderbird would retain messages that are explicitly deleted, but I was finding stuff that was more than three years old in there.

On the other hand, I only reclaimed 40MB so I obviously get nowhere near the amount of mail you do!

Thanks to Frumious for the tip, settings duly changed.

Take a look at this

Outlook just asks if I want to compact my stuff to make more room. Isn't that nice of the 'ware to do? And I always say yes.

Take a look at this

I discovered this about 2 years ago. I noticed that Thunderbird was becoming incredibly slow and found that the inbox file was around 200MB. It was so bad for me, TB even refused to launch sometimes. After finding out about compacting folders, TB suddenly became much more responsive and usable.

Take a look at this

You'd be amazed at how much HDD space I save using Gmail.

Take a look at this

I leave all my mail on the server. Lets me get to it anywhere, and saves space on the HD.

Take a look at this
#11 posted by Anonymous, January 31, 2008 5:54 AM

Not kidding... I just did "Compact Folders" for the first time. I went from having 10.4G to 9.92G. I lost disk space! WTF?

Take a look at this
#12 posted by Anonymous, January 31, 2008 6:21 AM

Please tell me that the White House uses Thunderbird too. If so, we'll be able to read all those 'deleted by mistake' emails, right?

Take a look at this

Not only does it keep those deleted messages around, every once in a while if it crashes it'll just throw them back into your inbox when it restarts! Twice now I've had hundreds of already-read emails get dumped back in. Not fun.

Take a look at this
#14 posted by afo, January 31, 2008 6:43 AM

die POP3, die!

Take a look at this

Ok, so I am going to speculate about why I think this is the case, and I would love more technically inclined readers to correct this so I can put this in my own knowledge base for work.

From what I can tell, TB keeps every email as an individual file, so they are easy to index and very easy to move from computer to computer. I would think the price for that is that when each file is stored on the drive, the OS rounds up how much space it allocates to email X, because of the block size of the file system a person might be using. It's just like when you throw a file around between Windows, Mac, and Unix and the file size "changes" in how much space the OS thinks it should take up. If you get a lot of emails, you get a lot of rounding up for each individual email, which means a larger overall amount of your drive's space gets allocated to all of these little files. The huge upside is that all of your emails can't get nuked because a single database file was damaged, as can happen with Outlook or Entourage. Outlook and Entourage, though, have their own way of dealing with email internal to their application's storage system, and individual emails won't be individual files on your hard drive, just part of a database file (that can be easily damaged - I've repaired both an Outlook file and an Entourage file at work this week).
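
For what it's worth, the rounding-up arithmetic in that paragraph can be sketched in a few lines of Python. This assumes one file per message and a 4096-byte allocation block; both are assumptions for illustration only (the real block size depends on the file system), and `allocated_size` and `total_slack` are made-up helper names.

```python
def allocated_size(file_size: int, block_size: int = 4096) -> int:
    """Bytes reserved on disk for a file, rounded up to whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def total_slack(file_sizes, block_size: int = 4096) -> int:
    """Total bytes lost to rounding across many small files."""
    return sum(allocated_size(size, block_size) - size for size in file_sizes)


if __name__ == "__main__":
    # Hypothetical message sizes in bytes, not measurements from a real mailbox.
    sizes = [1500, 4096, 5000, 12000]
    print("bytes lost to rounding:", total_slack(sizes))
```

The waste per file is always less than one block, so it only adds up when there are a great many small files.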

Does Thunderbird "compact" into a database when it does the compacting?

So, again, that's a speculation about why you end up with a lot of space being taken by TB files. I would love a more technical explanation than that, even. And a corrected explanation for the places I have clearly speculated myself into fiction. :)

Take a look at this

@CSBMONKEY:

That's incorrect. Thunderbird uses the "mbox" format which stores each email folder as a flat-text "database". What you're thinking of is called the Maildir format.

When Thunderbird "deletes" an email message, all it does is set a flag at the top of that email's entry in the flat-text file. No data is actually removed, not even when TB exits.
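
If you want to see how much of a mail file is taken up by flagged messages, here is a rough sketch using Python's standard `mailbox` module. It assumes the flag is stored in an `X-Mozilla-Status` header and that bit `0x0008` marks a message as deleted but not yet compacted; treat both as assumptions to check against your own files. The helper names are hypothetical, and it is safest to point this at a copy of a mail file rather than the live one.

```python
import mailbox

# Assumption: this bit in the X-Mozilla-Status header means "deleted, not yet compacted".
EXPUNGED_BIT = 0x0008


def is_flagged_deleted(msg) -> bool:
    """True if the message's status header has the assumed deleted bit set."""
    raw = str(msg.get("X-Mozilla-Status", "0000")).strip()
    try:
        return bool(int(raw, 16) & EXPUNGED_BIT)
    except ValueError:
        return False  # unreadable header: treat the message as live


def deleted_stats(path):
    """Return (total messages, flagged messages, approx. bytes held by flagged ones)."""
    box = mailbox.mbox(path, create=False)
    try:
        total = flagged = flagged_bytes = 0
        for msg in box:
            total += 1
            if is_flagged_deleted(msg):
                flagged += 1
                flagged_bytes += len(msg.as_bytes())
        return total, flagged, flagged_bytes
    finally:
        box.close()


def build_sample(path):
    """Write a tiny made-up mbox file with one flagged message, for trying this out."""
    box = mailbox.mbox(path)
    for subject, status in [("keep me", "0001"), ("deleted", "0009"), ("also kept", "0000")]:
        msg = mailbox.mboxMessage()
        msg["Subject"] = subject
        msg["X-Mozilla-Status"] = status
        msg.set_payload("hello\n")
        box.add(msg)
    box.close()
```

Calling `build_sample(path)` and then `deleted_stats(path)` on a scratch file shows the idea without touching real mail.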

This causes the files to balloon when "moving" messages between folders - especially ones with large attachments. You end up with hidden copies in each folder that the message has touched.

And, as madsci noted, occasionally it loses track of the flags, and you get an inbox full of already-deleted messages.

It's an all-round epic fail when it comes to modern file formats (mbox, I mean, not just Thunderbird - although TB does handle it more poorly than other clients), and I don't know if anything is on the TB project's horizon to replace it.

I would think that, at the very least, sqlite would be one thing to consider - if not simply adding support for the Maildir format, where each email is a physical file on the disk.

Take a look at this

Danke BERYLLIUM.

Take a look at this

Thunderbird actually concatenates e-mails in the same folder into a single (text) file. (Try poking around inside your mail folder and viewing the files.) That makes each folder a lot like a cassette tape: if you want to remove an e-mail, you have to read in the whole file (starting from the e-mail's position) and write it out again with the e-mail's spot removed. Presumably this is what "compacting" does. "Deleting an e-mail" without compacting would just mark the e-mail as deleted, or perhaps overwrite that chunk of the file with zeros or something.
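
That tape-splicing step can be sketched in Python roughly like this. It is not Thunderbird's actual code, just the general idea: copy every message that isn't marked deleted into a fresh file, then swap the fresh file in. It assumes the deleted mark is bit `0x0008` of an `X-Mozilla-Status` header, and `compact` and `demo` are hypothetical names.

```python
import mailbox
import os
import tempfile

# Assumption: this bit in the X-Mozilla-Status header means "deleted, not yet compacted".
EXPUNGED_BIT = 0x0008


def is_flagged_deleted(msg) -> bool:
    raw = str(msg.get("X-Mozilla-Status", "0000")).strip()
    try:
        return bool(int(raw, 16) & EXPUNGED_BIT)
    except ValueError:
        return False


def compact(path: str) -> int:
    """Rewrite the mbox at `path` without flagged messages; return bytes saved."""
    before = os.path.getsize(path)
    tmp_path = path + ".compacting"
    if os.path.exists(tmp_path):
        os.remove(tmp_path)  # leftover from an earlier interrupted run
    src = mailbox.mbox(path, create=False)
    dst = mailbox.mbox(tmp_path)
    try:
        for msg in src:
            if not is_flagged_deleted(msg):
                dst.add(msg)  # live messages are copied; flagged ones are skipped
    finally:
        dst.close()
        src.close()
    os.replace(tmp_path, path)  # swap the compacted file into place
    return before - os.path.getsize(path)


def demo():
    """Build a made-up three-message folder, compact it, report what is left."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "Inbox")
        box = mailbox.mbox(path)
        samples = [
            ("keep me", "0001", "short note\n"),
            ("deleted", "0009", ("x" * 70 + "\n") * 70),
            ("also kept", "0000", "another note\n"),
        ]
        for subject, status, body in samples:
            msg = mailbox.mboxMessage()
            msg["Subject"] = subject
            msg["X-Mozilla-Status"] = status
            msg.set_payload(body)
            box.add(msg)
        box.close()
        saved = compact(path)
        after = mailbox.mbox(path, create=False)
        try:
            remaining = [m["Subject"] for m in after]
        finally:
            after.close()
        return remaining, saved
```

Writing to a second file and swapping it in at the end means a crash halfway through leaves the original folder untouched, at the cost of needing temporary disk space for the copy.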

Presumably this classic "mbox" format (concatenated e-mails into a single file) was designed to solve three problems:

1. When you get new e-mail, how does the e-mail client avoid giving it a filename that takes an existing e-mail's filename? (Imagine you get as many e-mails as Cory.)

2. Some file systems can't have more than a certain number of files (in a directory? on a partition of a disk?).

3. If you have one file per e-mail, you still need to read all the files in order to build an index, sort them by date received, etc.
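
On problem 1: the Maildir layout mentioned elsewhere in this thread sidesteps name clashes by building each filename out of things like the delivery time, the process ID, and the host name. A small sketch using Python's standard `mailbox.Maildir` class, which picks unique names for you; the message contents and the `deliver_many` and `demo` helpers are made up for illustration.

```python
import mailbox
import os
import tempfile


def deliver_many(maildir_path: str, count: int):
    """Store `count` tiny messages, one file each, and return the keys chosen."""
    md = mailbox.Maildir(maildir_path, create=True)
    keys = []
    for i in range(count):
        msg = mailbox.MaildirMessage()
        msg["Subject"] = f"message {i}"
        msg.set_payload("hello\n")
        keys.append(md.add(msg))  # add() returns the unique key for the new file
    md.close()
    return keys


def demo(count: int = 200):
    """Return (number of distinct keys, number of messages found on disk)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "Maildir")
        keys = deliver_many(path, count)
        stored = len(mailbox.Maildir(path, create=False))
        return len(set(keys)), stored
```

Problems 2 and 3 are untouched by this; one file per message still means lots of files and a separate index if you want fast sorting.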

Take a look at this

awww shoot, I knew someone would beat me to it ;-)

Take a look at this

Seems like compacting should be done automatically when the program closes. Didn't Netscape Mail do that?
I tried using the automatic setting mentioned by Frumious, but it conflicted with the "Empty Trash on Exit" setting in Tools/Account Settings/server settings.
T-Bird would empty the trash just fine, but on the next restart, the auto-compact routine would run immediately, futzing up my incoming mail filters. Mail couldn't be filtered to folders that were being compacted.
Grrrr...

Take a look at this

Uncompacted Mailboxes and Anti-virus Software -

I've been using Eudora for email for over a decade. It keeps mail in mbox format for each folder, and if the amount of deleted material hits some threshold it'll compact that folder when you close it, or you can tell it to compact, which I typically do after emptying trash. I've recently been restructuring backups after a disk upgrade, and I'd updated the virus tables on Kaspersky Anti-Virus.

So Kaspersky suddenly started complaining about all my email backup files :-) Some of the complaints were about the Junk and Trash folders (fine, those are expendable) or about attachments, which get saved as individual files. But it was also complaining about the Inbox files, AFAICT because there are deleted-but-uncompacted messages in them that contain virus signatures. And unfortunately, the error messages don't tell me what byte in the file has the signature, just some text around it, which hasn't been reliably easy to find. In most cases I've been able to either use the current version of Eudora to compact the files or dredge up an older version to compact them with (the mbox format hasn't changed, but the index files that keep track of messages might have), but in some cases that hasn't done it. I don't know if there's some message I've missed deleting, or if there's a string of randomness in some MIME boundary that looks like a virus signature, but either way, Kaspersky thinks my old mailboxes look a lot like Mos Eisley.

Take a look at this

Yikes, epic, epic fail for Thunderbird's default setting, and for the mbox format.

When I empty the trash, I assume that what was in there is no longer within easy reach of the narcs, the tax collector, the church police, my girlfriend, and the BDA (take that, Lemming!).

As someone who has archived mail for a decade and a half, I wonder how many people out there are unable to use a MacBook Air solely because of Thunderbird's bad behaviour.

Take a look at this

I went through a similar experience with the "Norton protected recycle bin". In at least some versions of Norton's antivirus/antispyware products, Norton actually moves whatever you delete (such as files too big for the recycle bin, or the contents of the recycle bin when you empty it) into a hidden "protected recycle bin". This was never too much of a problem, since I have loads of HD space and don't delete lots of stuff (files either sit on my computer and are updated once in a while, or are saved someplace external to begin with). But when I started ripping DVDs (and saving the ISOs to my hard drive before moving them elsewhere), I noticed that my HD space started getting lower and lower, even though I (as far as I knew) deleted each ISO after moving it somewhere else. Some searches for things like "emptying recycle bin does not free up hard drive space" led me to the answer, and all of a sudden I had dozens and dozens of gigs, more free space than I ever had before (since first installing Norton, the same day the computer came out of the box). Now THERE's an annoying default.

(Another annoying default: the Windows display setting on my computer was set to a lower DPI than the monitor's, so while all letters and window edges looked sharp, all icons and images looked like crap. Even after making sure that the screen resolution matched the LCD's resolution, for a long time (minutes) I could not figure out why all images looked like they had been increased in size through crappy interpolation. I eventually clicked on Display Properties -> Settings -> Advanced and found the stupid DPI setting, and sighed with relief when changing it caused everything to look nice and smooth again. Why on earth would they set the default DPI to a setting where all images - even buttons in programs - look like crap?)

Take a look at this

BTW, here's the Bugzilla entry regarding Maildir support.

This argument has been raging for seven and a half years. No apparent progress, although there is a claim, posted a year ago, that the feature is planned for Thunderbird 3. Sigh.

You can create an account and vote for the feature, if you don't mind your email address being exposed to spammers. See above re: sigh.

Take a look at this

Every now and then, I open Disk Inventory (OS X only) for a graphical representation of what the big files on my computer are. One of the two biggest offenders is the set of .msf files that Thunderbird seems to make. Weird, because I don't even use Thunderbird! The other worth mentioning is filesystem_blobs.MYD, which is an automatically generated file created by Adobe Bridge. Deleting these manually saved me about 15GB.

Take a look at this

Yeah, the lack of automatic compacting and the use of mbox instead of something sensible are the two main reasons why I don't use TB anymore. I was using mutt for a while and keep meaning to try a Ruby client I was reading about (don't remember the name, but it uses filters and a search engine to organize), but mainly I use a local client just to keep a Gmail backup.

I'd love for an open-source, multi-platform, Maildir-capable program to come about. And as far as keeping messages indexed goes, c'mon, Apple already provides a good example of a good use of sqlite.

Take a look at this

I use the Xpunge (https://addons.mozilla.org/en-US/thunderbird/addon/1279) add-on which gives my OCD a button to push (often) and both empties the trash and compacts all folders at the same time.
