4 Responses to MSG is Bad For You! (Expansion of a Prior Blog on Email Metadata)

  1. Craig Ball says:

    Dear Ralph:

    Thanks for the kind comments. It’s always heartening to know someone is reading what I write. It’s even better when they see things the same way!

    The points you make about the advantages of .PST formatted production over production in .MSG format are excellent. I couldn’t agree more that preserving and producing the folder structure is unquestionably helpful and oftentimes necessary. Like you, I’d rather get e-mail formatted as a PST.

    But I don’t think that necessarily militates against .MSG when the folder structure is preserved, which can be done externally by something as simple as producing the .MSGs within an identical folder tree or by furnishing the path data for each message in a load file.

    Why am I stubbornly defending .MSG when .PST is superior?

    It’s because there are more off-the-shelf applications that can deal with the .MSG format than the .PST. Given that the PST is by far the dominant standard for corporate e-mail, you’d think every tool would cut through a PST like a hot knife through butter. Instead, I find that the compressed and encrypted form of PSTs, along with their largely undocumented internal file structure, is something that trips up many tools. Moreover, until they are compacted, PSTs have a nasty propensity (or a wondrous propensity, depending on one’s point of view) to carry double-deleted items invisibly within their structure. That’s probably not a big issue in the context of reconstituted production formats, but it’s the sort of insidious risk that keeps us lawyers up at night.

    One further comment: you indicate that a local PST (here, I mean one on the user’s PC hard drive) isn’t the default in an Exchange environment. That’s true, but it’s almost universally supplanted by a file with the .OST extension that holds synchronized Exchange e-mail in order to support offline access within Outlook. Accordingly, you almost always run into some sort of local e-mail storage file potentially at variance with what you’ll find on the server. Additionally, as you well appreciate, the user may create any number of .PST backup files at intervals and, to boot, there is often a need to look for auto-archive container files, stored locally on the machine or within the user’s network storage areas or “shares.”

    Processing just the server-stored e-mail may be sufficient, but it’s often not complete.

    Looking forward to your book!

    Craig Ball

  2. […] So that this blog won’t be all fluff, I want to end with some discussion and analysis of a few interesting points in D’Onofrio. This is a sex discrimination case where the plaintiff moved to compel discovery and impose sanctions against her former employer. Most of the discovery disputes relate to electronically stored information (“ESI”). Plaintiff asked the court to compel production of a business plan document, and certain emails, in their “original electronic format, with accompanying metadata.” D’Onofrio pg. 5. Plaintiff claimed that “defendants have deliberately caused the spoliation of electronic records and have purposely failed to produce many e-mails and documents.” Apparently, defendants produced the records requested, but they produced the business plan in paper form, and the emails in MSG format. (For more information on the deficiencies of MSG production, and why only PST production includes full metadata, see my prior blog: MSG is Bad for You.) […]

  3. Venkat Rangan says:

    I agree with your assessment that an MSG file is not ideally suited as a native file representation in Microsoft Outlook and Exchange environments. However, I do want to mention that the alternatives to produce native files are also beset with problems.

    Technically, the most “native” of files in a Microsoft Exchange messaging environment is the EDB file that Exchange stores on its mail servers. This is the file that contains all the mail for all the mailboxes on the Exchange Mailstore, and the place where all emails are actually maintained during the course of conducting business. However, an EDB file contains multiple users’ mailboxes, and would contain confidential and privileged matter that should not be produced. To isolate an EDB to only the custodians that are relevant to a case would require their PSTs to be produced. Although a PST is a snapshot of a mailbox within an EDB file, it is technically not a native file that is used during the normal course of business. Still, given that standard business processes and Microsoft’s proven technologies are used for creating PST files from EDB files, there is a certain accepted belief that it is as close to native as we can get.

    On the content of PST files themselves, I agree that they do preserve the complete meta-data of messages, including the folder location and the chronological position within that folder. However, producing an entire PST as a responsive document is also not viable, since that would expose other emails that are potentially confidential and/or privileged. When we remove these messages from the original PST, that would alter the PST, therefore exposing yourself for spoliation. An alternative is to create another PST designed for production and deliver that to the requesting party. The question then is whether this production step preserves the meta-data of the original message, including the original folder structure that is so vital to establishing certain aspects such as user intent. Also, as one transfers an MSG from the original source PST to a target PST, the MSG’s internal content changes. This is because an MSG’s insertion into a new PST creates a new EntryID property. One could argue that the essential components of an MSG are preserved, but that cannot be proven by a hash of the MSG.

    An alternative production approach could be to extract the responsive MSGs from the original PST, store each MSG in a separate file, and compute its hash value. To preserve meta-data such as the original folder location, produce a second wrapper file in XML that includes the file location of the MSG, its hash, the name of the original PST file it was extracted from, and the folder location within that PST. In this way, only the responsive MSG files need be produced; as they are copied and transferred, their internal contents do not change, and the meta-data are preserved external to the MSG file in a corresponding XML file. The EDRM XML export mechanism provides a framework for producing multiple MSG files while preserving the original meta-data. This would seem to address the needs of native production, while maintaining the integrity of each MSG as it existed in the original PST file.
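
    The hash-plus-wrapper idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the responsive MSGs have already been extracted (by some other tool) into a directory tree that mirrors the PST’s folder structure, SHA-256 is used as the hash, and the element and attribute names (Production, Message, SourcePst, File, Folder, Sha256) are made up for the example. It is not the EDRM XML schema.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large MSGs need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(msg_root: Path, source_pst: str) -> ET.Element:
    """Build a wrapper document describing every .msg file under msg_root.

    Assumption: the directory layout under msg_root mirrors the folder
    structure of the PST the messages were extracted from.
    All element and attribute names here are hypothetical.
    """
    root = ET.Element("Production", {"SourcePst": source_pst})
    # Note: the "*.msg" pattern is case-sensitive on most non-Windows systems.
    for msg in sorted(msg_root.rglob("*.msg")):
        rel = msg.relative_to(msg_root)
        ET.SubElement(root, "Message", {
            "File": rel.as_posix(),            # where the produced file lives
            "Folder": rel.parent.as_posix(),   # original folder location
            "Sha256": sha256_of(msg),          # fixed once the file is written
        })
    return root
```

    Writing the result out would be something like ET.ElementTree(build_manifest(Path("production"), "custodian.pst")).write("manifest.xml", encoding="utf-8", xml_declaration=True), where the directory and PST names are placeholders. The receiving party can then re-hash each MSG and compare it against the wrapper file, which is the check that re-importing into a new PST would not allow.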

  4. Michael La Porte says:

    @Venkat: “When we remove these messages from the original PST, that would alter the PST, therefore exposing yourself for spoliation.”

    Is this really any different from the intentional destruction of metadata by extracting MSG files from the PST container file?

    So long as the producing and receiving parties are aware of what is going on, I’m not sure that this is a persuasive reason for preferring MSG over PST.
