My Hack of the NSA and Discovery of a Heretofore Unknown Plan to Use Teams of AI-Enhanced Lawyers and Search Experts to Find Critical Evidence

March 1, 2015

Now that my blog has changed from weekly to monthly, I have more time for my hobbies, like trying to hack into NSA computers. I made a breakthrough with that recently, thanks primarily to exuberant disclosures by Snowden after the Oscars. I was able to get into one of the NSA’s top-secret systems. Not only that, my hack led to the discovery of a covert operation that will blow your mind. (Hey, if the NSA can brag about their exploits, then so can I.) And if that were not enough, I was able to get away with downloading two documents from their system. I will share what I borrowed with you here (and, of course, on Wikileaks). The documents are:

  • A previously unknown Plan to use sophisticated e-Discovery Teams with AI-enhancements to find evidence for use in investigations and courtrooms around the world.
  • A slide show in movie and PDF form that tells you how these teams operate.

I can disclose my findings and stolen documents here without fear of becoming Citizen Five because what I found out is so incredible that the NSA will disavow all knowledge. They will be forced to claim that I made up the whole story. Besides, I am not going to explain how I hacked the NSA. Moreover, unlike some weasels, I will never knowingly give aid and comfort to foreign governments. This is something many Hollywood types and script kiddies fail to grasp. All I will say is that I discovered a critical zero-day type error in two lines of code, out of billions, in a software program used by the NSA. In accord with standard white hat protocol, if the NSA admits my story here is true, I will tell them the error. Otherwise, I am keeping this code mistake secret.

The hack allowed me to access a Top Secret project code-named Gibson. It is a Cyberspace Time Machine. This heretofore secret device allows you to travel in time, but, here’s the catch, only on the Internet. Since it is an Internet-based device, the NSA has to keep it plugged in. That is why I was not faced with the nearly insoluble air gap defense protecting the NSA’s other computer systems.

From what I have been able to figure out, the time travel takes place on a subatomic cyber-level and requires access to the Hadron Collider. The Gibson somehow uses entangled electrons, Higgs bosons, and quantum flux probability. The new technology is based on Hawking’s latest theories, the speed of light, gravity, quantum computers, and, can you believe it, imaginary numbers, you know, the square root of negative numbers. It all seems so obvious after you read the NSA executive summary, that other groups with Hadron Collider access and quantum computers are likely to come up with the same invention soon. But for now the NSA has a huge advantage and head start. Maybe someday they will even share some of that info with POTUS.


The NSA Internet Time Machine allows you to peer into the past content of the Internet, which, I know, is not all that new or exciting. But here is the really cool part that makes this invention truly disruptive: you can also look into the future. With the Gibson and special web browsers you can travel to and capture future webpages and content that have not been created yet, at least not in our time. You can Google the future! Just think of the possibilities. No wonder the NSA never has any funding problems.

This kind of breakthrough invention is so huge, and so incredible, that the NSA must deny all knowledge. If people discover this is even possible, other groups will race to catch up and build their own Internet Time Machines. That is probably why Apple is hoarding so much cash. Will there be a secret collider built off the books under their new headquarters? It kind of looks like it. Google is probably working on this too. The government cannot risk anyone else knowing about this discovery. That would encourage a dangerous time machine race that would make the nuclear race look like child’s play. Can you imagine what Iran would do with information from the future? The government simply cannot allow that to happen.

For that reason alone my hack and disclosures are untouchable. The NSA cannot admit this is true, or even might be true. Besides, having seen the future, I already know that I will not be prosecuted for these intrusions. In fact, no one but a few hard-core e-Discovery Team players will even believe this story. I can also share the information I have stolen from the future without fear of CFAA prosecution. Technically speaking, my unauthorized access of web pages in the future has not happened yet. Despite my PreCrime-like proposals, you cannot (yet) be prosecuted for future crimes. You can probably be fired for what you may do, but that is another story.

Still, the hack itself is not really what is important here, not even the existence of the NSA’s Time Machine, as great as that is. The two documents that I brought back from the future are what really matter. That is the real point of this blog, just in case you were wondering. I have been able to locate and download from the future Internet a detailed outline of a Plan for AI-Enhanced search and review.

The Plan is apparently in common use by future lawyers. I am not sure of the document’s exact date, but it looks like circa 2025. It is obviously from the future, as nobody has any plans like this now. I also found a video and PDF of a PowerPoint of some kind. It shows how lawyers and other investigators in the future use artificial intelligence to enhance all kinds of ESI search projects, including overt litigation and covert investigations. It appears to be a detailed presentation of how to use what is still called Predictive Coding. (Well, at least they do not call it TAR anymore.) Nobody in our time has seen this presentation yet. I am sure of that. You will have the first glimpse now.

The Plan for AI-Enhanced search and review is in the form of a detailed 1,500-word outline. It looks like this Plan is commonly used in the future to obtain client and insurer approval of e-discovery review projects. I think that this review Plan of the future is part of a standardized approval process that is eventually set up for client protection. Obviously we have nothing like that now. The Plan might even be shared with opposing counsel and the courts, but I cannot be sure of that. I had to make a quick exit from the NSA system before my intrusion was detected.

I include a full copy of this Plan below, and the PowerPoint slides in video form. See if these documents are comprehensible to you. If my blog is brought down by denial of service attacks, you can also find it on Wikileaks servers around the world. The Plan can also be found here as a standalone document, and the PDF of the slides can be found here. I hope that this disclosure is not too disruptive to existing time lines, but, from what I have seen of the future of law, temporal paradox be damned, some disruption is needed!

Although I had to make a quick exit, I did leave a back door. I can seize root of the NSA Gibson Cyberspace Time Machine anytime I want. I may share more of what I find in upcoming monthly blogs. It is futuristic, but as part of the remaining elite who still follow this blog, I’m sure you will be able to understand. I may even start incorporating this information into my legal practice, consults, and training. You’ll read about it in the future. I know. I’ve been there.

If you have any suggestions on this hacking endeavor, or the below Plan, send me an encrypted email. But please only use this secure email address: Otherwise the NSA is likely to read it, and you may not enjoy the same level of journalistic sci-fi protection that I do.


Outline of 12-Step Plan for Predictive Coding Review

1. Basic Numerics of the Project

a. Number and type of documents to be reviewed

b. Time to complete review

c. Software to be used for review

(1) Active Machine Learning features

(A) General description

(B) Document ranking system (ie- Kroll ranks documents by percentage probability, .01% – 99.9%)

(2) Vendor expert assistance to be provided

d. Budget Range (supported by separate document with detailed estimates and projections)

2. Basic Goals of the Project, including analysis of impact of Proportionality Doctrine and Document Ranking. Here are some possible examples:

a. High recall and production of responsive documents within budget proportionality constraints and time limits.

b. Top 25% probable relevant, and all probable (50%+) highly relevant is a metric goal proportional and reasonable in this particular case for this kind of ESI. (Note – these numbers are often used in high-end, large scale projects where there is a premium on quality.)

c. All probable relevant and highly relevant within a specified range or set of ranges.

d. Zero Errors in document review screening for attorney client privileged communications.

e. Evaluation of large production received by client.

f. Time sensitive preparations for specific hearings, mediation, depositions, or 3rd party subpoenas.

g. Private internal corporate investigations as part of quality control, business information, compliance and dispute avoidance.

h. Compliance with government requests for information, state criminal investigations and private civil litigation.

3. General Cooperation Strategy

a. Disclosures planned

(1) Transparent

(2) Translucent

(3) Brick Wall

b. Treatment of Irrelevant Documents

c. Relevancy Discussions

d. Sedona Principle Six

4. Team Members for Project

a. Predictive Coding Chief. Experienced searcher in charge of the Predictive Coding aspects of the document review

(1) Experienced ESI Searcher

(2) Same person in charge of non-PC aspects, if not, explain

(3) Authority and Responsibilities

(4) List qualifications and experience

b. Subject Matter Experts (SME)

(1) Senior SME

A. Final Decision Maker – usually partner in charge of case

B. Determines what is relevant or responsive

(i) Based on experience with the type of case at issue

(ii) Predicts how judge will rule on relevance and production issues

C. Formulates specific rules when faced with particular document types

D. Controls communications with requesting party’s senior counsel (usually)

E. List qualifications and experience

(2) Junior SME(s)

A. Lead Document Review expert(s)

B. Usually Sr. Associate working directly with partner in charge

C. Seeks input from final decision maker on grey area documents (Undetermined Category)

D. Responsible for Relevancy Rule articulations and communications

E. List qualifications and experience

(3) Amount of estimated time in budget for the work by Sr and Jr SMEs.

A. Assurances of adequate time commitments, availability

B. Reference time estimates in budget

C. Time should exclude training

(4) Guaranteed response times for questions and requests from Predictive Coding Chief

c. Vendor Personnel

(1) Anticipated roles

(2) List qualifications and experience

d. Power Users of particular software and predictive coding features to be used

(1) Law Firm and Vendor

(2) List qualifications and experience

e. Outside Consultants or other experts

(1) Anticipated roles

(2) List qualifications and experience

f. Contract Lawyers

(1) Price list for reviewers and reviewer management

A. $500-$750 per hr is typical (Editor’s Note: Is this widespread inflation, or new respect?)

B. Competing bids requested? Why or why not.

(2) Conflict check procedures

(3) Licensed attorneys only or paralegals also

(4) Size of team planned

A. Rationale for more than 5 contract reviewers

B. “Less is More” plan

(5) Contract Reviewer Selection criteria

g. Plan to properly train and supervise contract lawyers

5. One or Two-Pass Review

a. Two pass is standard, with first pass selecting relevance and privilege using Predictive Coding, and second pass by reviewers with eyes-on review to confirm relevance prediction and code for confidentiality, and create priv log.

b. If one pass proposed (aka Quick Peek), has client approved risks of inadvertent disclosures after written notice of these risks?

6. Clawback and Confidentiality agreements and orders

a. Rule 502(d) Order

b. Confidentiality Agreement: Confidential, AEO, Redactions

c. Privilege and Logging

(1) Contract lawyers

(2) Automated prep

7. Categories for Review Coding and Training

a. Irrelevant – this should be a training category

b. Relevant – this should be a training category

(1) Relevance Manual for contract lawyers (see form)

(2) Email family relevance rules

A. Parents automatically relevant if child (attachment) relevant

B. Attachments automatically relevant if email is?

C. All attachments automatically relevant if one attachment is?

c. Highly Relevant – this should be a training category

d. Undetermined – temporary until final adjudication

e. No or Very Few Sub-Issues of Relevant, usually just Highly Relevant

f. Privilege – this should be a training category

g. Confidential

(1) AEO

(2) Redaction Required

(3) Redaction Completed

h. Second Pass Completed
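
The three email family questions in item 7.b.(2) are policy choices, and two of them are left open in the outline. As a rough illustration only, the sketch below turns each question into an explicit switch. It assumes a hypothetical mapping from parent email IDs to attachment IDs; the function name, parameter names, and IDs are invented for this example and do not come from any review platform.

```python
def propagate_family_relevance(families, coded_relevant,
                               parent_from_child=True,
                               children_from_parent=False,
                               siblings_from_child=False):
    """Apply email family relevance rules (hypothetical helper).

    families: dict mapping a parent email ID to a list of attachment IDs.
    coded_relevant: set of document IDs that reviewers coded Relevant.
    Returns the set of IDs treated as relevant once the rules are applied.
    Each rule looks only at the reviewers' original coding, so the rules
    do not cascade into one another.
    """
    result = set(coded_relevant)
    for parent, children in families.items():
        relevant_children = [c for c in children if c in coded_relevant]
        # Rule A: a parent is relevant if any attachment is relevant.
        if parent_from_child and relevant_children:
            result.add(parent)
        # Rule B (open question): attachments follow a relevant parent.
        if children_from_parent and parent in coded_relevant:
            result.update(children)
        # Rule C (open question): all attachments follow one relevant attachment.
        if siblings_from_child and relevant_children:
            result.update(children)
    return result


if __name__ == "__main__":
    families = {"email-1": ["att-1", "att-2"], "email-2": ["att-3"]}
    coded = {"att-1", "email-2"}
    print(sorted(propagate_family_relevance(families, coded)))
```

Keeping the rules as named switches means the choice made for a given project can be written into the Relevance Manual and applied the same way by every reviewer.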

8. Search Methods to find documents for training and production

a. ID persons responsible and qualifications

b. Methods to cull out documents before Predictive Coding training begins to avoid selection of inappropriate documents for training and to improve efficiency

(1) Eg – any non-text document; overly long documents

(2) Plan to review by alternate methods

(3) ID general methods for this first stage culling; both legal and technical

c. ID general methods for Predictive Coding, ie – Machine selected only, or multimodal

d. Describe machine selection methods.

(1) Random – should be used sparingly, and never as sole method

(2) Uncertainty – documents that machine is currently unsure of ranking, usually in 40%-60% range

(3) High Probability – documents as yet un-coded that machine considers likely relevant

(4) All or some of the above in combination

e. Describe other human-based multimodal methods

(1) Expert manual

(2) Parametric Boolean Keyword

(3) Similarity and Near Duplication

(4) Concept Search (passive machine learning, such as latent semantic indexing)

(5) Various Ranking methods based on probability strata selected by expert in charge

f. Describe whether a Continuous Active Learning (CAL) process for review will be used, or a two-stage process (train, then review), and if the latter, the rationale
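
The three machine selection methods in item 8.d can be combined in a single training round. The following is a minimal sketch, assuming the software exposes a relevance probability between 0 and 1 for each un-coded document. The function name, batch sizes, and document IDs are hypothetical; only the 40%-60% uncertainty band comes from the outline.

```python
import random


def select_for_training(scores, n_random=2, n_uncertain=5, n_high=5,
                        low=0.40, high=0.60, seed=0):
    """Pick un-coded documents for the next training round (illustrative).

    scores: dict mapping doc ID to predicted probability of relevance (0..1).
    Returns a dict with one list of doc IDs per selection method.
    """
    rng = random.Random(seed)
    ids = sorted(scores)

    # Uncertainty: documents inside the band, closest to 50% first.
    uncertain = sorted(
        (d for d in ids if low <= scores[d] <= high),
        key=lambda d: abs(scores[d] - 0.5),
    )[:n_uncertain]

    # High probability: top-ranked documents not already selected.
    picked = set(uncertain)
    high_prob = [d for d in sorted(ids, key=lambda d: -scores[d])
                 if d not in picked][:n_high]
    picked.update(high_prob)

    # Random: a small supplement drawn from everything else, never the sole method.
    remaining = [d for d in ids if d not in picked]
    rand = rng.sample(remaining, min(n_random, len(remaining)))

    return {"uncertainty": uncertain, "high_probability": high_prob, "random": rand}


if __name__ == "__main__":
    demo = {"a": 0.95, "b": 0.50, "c": 0.45, "d": 0.10, "e": 0.70, "f": 0.05}
    print(select_for_training(demo, n_random=1, n_uncertain=2, n_high=2))
```

In a multimodal project these machine-selected documents would sit alongside the human-selected ones from item 8.e rather than replace them.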

9. Describe Quality Control procedures, including, where applicable, any features built into the software, to accomplish the following QC goals

a. Three areas of focus to maximize quality of predictive coding

(1) Quality of the AI trainer’s work to select documents for instruction in the active machine learning process

(2) Quality of the SME work to properly classify documents, especially Highly Relevant and grey area documents, in accord with true probative value and court opinions

(3) Quality of the software algorithms that apply the training input to create a mathematical model that accurately separates the document cloud into probability polar groupings

b. Supervise all reviewers, including contract reviewers who usually do the bulk of the document review work.

(1) ID persons responsible

(2) ID general methods

c. Avoid incorrect conceptions and understanding of relevance and responsiveness, ie – what are you searching for and what will you produce.

(1) Target matches legal obligations

(2) Relevance scope dialogues with requesting party

(3) 26(f) conferences and 16(b) hearings

(4) Motion practice with Court for early resolution of disputes

(5) ID persons responsible

d. Minimize human errors in document coding. Zero Error Numerics.

(1) Mistakes in relevance rule applications to particular documents

(2) Physical mistakes in clicking wrong code buttons

(3) Inconsistencies in coding of same or similar documents

(4) Inconsistencies in coding of same or similar document types

(5) ID persons responsible

e. Facilitate horizontal and vertical communications in team

(1) ID persons responsible

(2) ID general methods

f. Corrections for Concept Drift inherent in any large review project where understanding of relevance changes over time

(1) ID persons responsible

(2) ID general methods

g. Detection of inconsistencies between predictive document ranking and coding

(1) ID persons responsible

(2) ID general methods

h. Avoid incomplete, inadequate selection of documents for training

(1) ID persons responsible

(2) ID general methods

i. Avoid premature termination of training

(1) ID persons responsible

(2) ID general methods

j. Avoid omission of any Highly Relevant documents, or new types of strong relevant documents

(1) ID persons responsible

(2) ID general methods

k. Avoid inadvertent production of privileged documents

(1) List of attorneys’ names and email domains

(2) Active multimodal search supplement to predictive coding

(3) Dual pass review

(4) ID persons responsible

(5) ID general methods

l. Avoid inadvertent production of confidential documents without proper labeling and redactions

(1) ID persons responsible

(2) ID general methods

m. Avoid incomplete, inaccurate privilege logs

(1) ID persons responsible

(2) ID general methods

n. Avoid errors in final media production to requesting party

(1) ID persons responsible

(2) ID general methods
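
One general method for item 9.g, detecting inconsistencies between predictive ranking and human coding, is to list the documents where the two disagree most strongly and send them back for a second look. The sketch below is only an illustration under stated assumptions: the thresholds of 75% and 25%, the function name, and the code labels are all hypothetical.

```python
def find_ranking_conflicts(docs, high=0.75, low=0.25):
    """List documents whose human code contradicts the machine ranking.

    docs: iterable of (doc_id, probability, human_code) tuples, where
          probability is the predicted relevance (0..1) and human_code is
          "relevant" or "irrelevant".
    Returns the conflicting tuples, largest disagreement first.
    """
    conflicts = []
    for doc_id, prob, code in docs:
        if code == "irrelevant" and prob >= high:
            conflicts.append((doc_id, prob, code))
        elif code == "relevant" and prob <= low:
            conflicts.append((doc_id, prob, code))

    def disagreement(item):
        # Distance between the prediction and the value the human code implies.
        _, prob, code = item
        target = 1.0 if code == "relevant" else 0.0
        return abs(prob - target)

    conflicts.sort(key=disagreement, reverse=True)
    return conflicts


if __name__ == "__main__":
    sample = [
        ("d1", 0.92, "irrelevant"),
        ("d2", 0.10, "relevant"),
        ("d3", 0.80, "relevant"),
        ("d4", 0.05, "irrelevant"),
    ]
    for row in find_ranking_conflicts(sample):
        print(row)
```

A conflict does not say who is wrong. The reviewer may have clicked the wrong button, or the model may need more training on that document type, so each hit still needs a human decision.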

10. Decision to Stop Training for Predictive Coding

a. ID persons responsible

b. Criteria to make the decision

(1) Probability distribution

(2) Separation of documents into two poles

(3) Ideal of upside down champagne glass visualization

(4) Few new relevant documents found in last rounds of training

(5) Few new strong relevant types found

(6) No new Highly Relevant documents found
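
The stop criteria in item 10.b can be read together as a checklist. A minimal sketch of such a checklist follows. It is not a substitute for the judgment of the persons responsible, and every number in it is an assumption made for illustration: the three-round window, the limit of two new relevant documents per round, and the 25%-75% middle band used to measure separation into two poles.

```python
def should_stop_training(rounds, scores, window=3,
                         max_new_relevant=2, max_middle_fraction=0.05):
    """Check illustrative stop criteria for predictive coding training.

    rounds: list of dicts, oldest first, each with the integer keys
            "new_relevant" and "new_highly_relevant" for one training round.
    scores: list of current relevance probabilities (0..1) for all documents.
    Returns True only when every criterion below is met.
    """
    if len(rounds) < window or not scores:
        return False

    recent = rounds[-window:]

    # Few new relevant documents found in the last rounds of training.
    few_new = all(r["new_relevant"] <= max_new_relevant for r in recent)

    # No new Highly Relevant documents found in the last rounds.
    no_new_hot = all(r["new_highly_relevant"] == 0 for r in recent)

    # Separation into two poles: very few documents left in the middle band.
    middle = sum(1 for s in scores if 0.25 < s < 0.75)
    polarized = middle / len(scores) <= max_middle_fraction

    return few_new and no_new_hot and polarized


if __name__ == "__main__":
    history = [
        {"new_relevant": 40, "new_highly_relevant": 3},
        {"new_relevant": 2, "new_highly_relevant": 0},
        {"new_relevant": 1, "new_highly_relevant": 0},
        {"new_relevant": 0, "new_highly_relevant": 0},
    ]
    ranking = [0.01] * 90 + [0.99] * 8 + [0.50] * 2
    print(should_stop_training(history, ranking))
```

Passing this checklist would only support a decision to stop. The decision itself is then tested by the quality assurance sampling described in step 11.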

11. Quality Assurance Procedures to Validate Reasonability of Decision to Stop

a. Random Sample Tests to validate the decision

(1) ei-Recall method used, if not, describe

(2) Accept on zero error for any Highly Relevant found in elusion test, or new strong relevant type

(3) Recall and Precision goals

b. Judgmental sampling
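
The random sample test in item 11.a rests on simple arithmetic: draw a random sample from the documents that will not be produced (the null set), count the relevant documents that eluded review, project a low and a high count of false negatives, and turn those into a recall range. The sketch below follows that general elusion-based approach. It should not be taken as the exact ei-Recall procedure: it swaps in a Wilson score interval for the binomial calculation, and the function names and example figures are assumptions made for illustration.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Pin the bounds exactly when the sample is all misses or all hits.
    lower = 0.0 if hits == 0 else max(0.0, center - half)
    upper = 1.0 if hits == n else min(1.0, center + half)
    return lower, upper


def elusion_recall_interval(true_positives, null_set_size,
                            sample_size, sample_false_negatives, z=1.96):
    """Estimate a recall range from an elusion sample of the null set.

    true_positives: relevant documents found and verified by review.
    null_set_size: documents classified irrelevant and not to be produced.
    sample_size: random documents drawn from the null set.
    sample_false_negatives: relevant documents found in that sample.
    """
    if true_positives <= 0:
        raise ValueError("need at least one true positive")
    low_rate, high_rate = wilson_interval(sample_false_negatives, sample_size, z)
    fn_low = low_rate * null_set_size    # fewest relevant documents likely missed
    fn_high = high_rate * null_set_size  # most relevant documents likely missed
    recall_low = true_positives / (true_positives + fn_high)
    recall_high = true_positives / (true_positives + fn_low)
    return recall_low, recall_high


if __name__ == "__main__":
    low, high = elusion_recall_interval(9000, 900000, 1500, 3)
    print(f"recall between {low:.1%} and {high:.1%}")
```

The accept-on-zero-error rule in item 11.a.(2) is a separate check: if any Highly Relevant document turns up in the elusion sample, the test fails no matter what the recall range looks like.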

12. Procedures to Document the Work Performed and Reasonability of Efforts

a. Clear identification of efforts on the review platform itself with screen shots before project closure

b. Memorandums to file or opposing counsel

(1) Basic metrics for possible disclosure

(2) Detail for internal use only and possible testimony

c. Availability of expert testimony if court challenges arise



What follows is another file I stole from the NSA, a video of PowerPoint slides (no voiceover) for a future presentation called:

Predictive Coding: An Introduction and Real World Example.

The PDF of the slides can be found here.


