The basic elements of the biggest eDiscovery money saver
Your company was just hit with an eDiscovery request from the SEC. You have to produce all documents relevant to your case. You now have millions of documents to review in two weeks.
What do you do?
If this were 2003, you would have to hire 100 attorneys, lock them in a room, give them lots of coffee and pizza, and get them reviewing. Thousands of billable attorney hours later, you would have a production to satisfy the SEC’s request.
Today, technology-assisted review (TAR) dramatically reduces the volume of documents that must be reviewed. TAR doesn’t eliminate the need for expert attorneys, but it uses algorithms to identify the most relevant documents--all while improving accuracy over human review. The result: you can take a set of millions of documents and knock it down to thousands, at a tiny fraction of the price of manual attorney review. Corporate clients are becoming more aware of the financial benefits and are requesting TAR more and more.
TAR is a complicated process, and extensive materials have been written on the topic. In this paper, we boil it down to the straightforward essentials. If you want to read more, we have provided a list of references at the end of this article.
The foundations of TAR: eDiscovery basics and data culling.
Intro to eDiscovery. Before getting into TAR, a look at the steps of eDiscovery is helpful. eDiscovery is the process of (1) collecting, (2) processing, (3) reviewing, and (4) producing all documents relevant to a particular case. Here’s a quick breakdown of each of these elements:
The collection process involves gathering all relevant data--emails, files on computers, social media posts, text messages, and just about any other data produced by the relevant individuals (known as custodians).
Processing makes the text and metadata of the collected data searchable, and weeds out corrupt or unusable files along the way.
Reviewing documents is the most time-intensive part of the process: sifting through the data to find what’s most relevant.
All the relevant information is then produced to the party requesting information.
Culling excess. Processing structures the data and makes it searchable on a spreadsheet-like platform. A more involved step, called culling, removes a large amount of unnecessary data before review begins. This can eliminate millions of irrelevant documents and save millions of dollars in attorney time. Here are some common culling practices:
Deduplication. When you collect emails from ten different people, all of whom received the same email, you end up with ten copies of the same email, each to be read separately--a lot of wasted billable attorney hours. Deduplication looks at the contents of the data to determine what is duplicative, and removes those duplicate items from the set of documents attorneys will review.
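The mechanics of deduplication can be sketched in a few lines of Python. This is a simplified model, not any vendor’s implementation: documents are plain strings, and duplicates are detected by hashing normalized text.

```python
import hashlib

def deduplicate(documents):
    """Keep one copy of each document, comparing by content hash."""
    seen = set()
    unique = []
    for doc in documents:
        # Hash normalized text so trivial formatting differences don't defeat the match
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

# Ten custodians received the same email; one received something else
inbox = ["Q3 numbers attached."] * 10 + ["Board meeting moved to Friday."]
print(len(deduplicate(inbox)))  # prints 2: ten copies collapse to one
```

In practice, eDiscovery platforms hash metadata fields as well as body text, but the principle is the same: ten identical emails become one document to review.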
Email threading. Rather than leaving different parts of an email conversation spread across multiple custodians’ mailboxes, email threading keeps only the most inclusive email in that conversation--the one showing the entire exchange. In an email conversation with ten replies, that cuts the number of documents to be reviewed from ten to one.
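A toy sketch of the idea in Python, under a strong simplifying assumption that real threading tools do not rely on (they compare message headers and IDs): each reply quotes the full text of the earlier conversation, so any email contained verbatim inside a later one is redundant.

```python
def most_inclusive(thread):
    """Drop any email whose full text is quoted inside another email in the thread."""
    return [msg for msg in thread
            if not any(msg in other and msg != other for other in thread)]

m1 = "Can you send the Q3 report?"
m2 = "Sure, attached.\n> " + m1   # reply quoting m1
m3 = "Thanks!\n> " + m2           # reply quoting the whole exchange
thread = [m1, m2, m3]

print(most_inclusive(thread) == [m3])  # prints True: only the full conversation survives
```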
File-type filtering. If you’re only interested in communications via email, why look at Excel sheets? This culling technique focuses the review process on only the relevant file types.
Keyword searches. Let’s say your litigation only deals with blue tiles, and not red tiles. Applying the keyword “blue” after the data has been made searchable can cut down on the number of documents pulled into a case. Note, however, that keyword searching can both exclude relevant information and pull in irrelevant information.
Date ranges. Focusing on the relevant date range for a particular case will significantly reduce the number of documents to be reviewed. If your case only relates to transactions starting in 2011, it may make sense to review only data from 2011 onwards, rather than all of a custodian’s data.
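The culling filters above can be combined into a single pass. The sketch below assumes a simplified document record (a dict with text, date, and type fields); the keyword, date, and file-type values are the hypothetical blue-tiles example from above.

```python
from datetime import date

def cull(documents, keyword, start_date, file_types):
    """Apply keyword, date-range, and file-type filters in one pass."""
    return [d for d in documents
            if keyword.lower() in d["text"].lower()   # keyword search
            and d["date"] >= start_date               # date range
            and d["type"] in file_types]              # file-type filter

docs = [
    {"text": "Invoice for blue tiles", "date": date(2012, 3, 1), "type": "email"},
    {"text": "Invoice for red tiles",  "date": date(2012, 3, 1), "type": "email"},
    {"text": "Blue tile spec sheet",   "date": date(2009, 6, 5), "type": "xlsx"},
]
survivors = cull(docs, "blue", date(2011, 1, 1), {"email"})
print(len(survivors))  # prints 1: only the 2012 blue-tiles email survives
```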
How technology-assisted review works.
TAR uses technology to amplify the power of human reasoning, and is the next generation of data culling. TAR comes in many forms, one of the most popular being relevancy ranking. Relevancy ranking uses a subset of reviewed and coded documents in order to predict the likelihood of documents being relevant. A study published by the Richmond Journal of Law and Technology found TAR to be more accurate than human review at finding relevant documents. Here are the steps involved in using relevancy ranking to review documents:
Train the system. One attorney reviewer takes a representative sample of all of the documents to be reviewed and codes them according to the criteria important for that specific review (for example, whether a document is relevant). Using a single attorney reviewer, rather than multiple ones, improves consistency and accuracy.
Input the attorney’s work. The sample of reviewed documents is then fed into a relevancy ranking program like Equivio or Relativity Analytics. After processing this data, the program gives an initial ranking of how confident it is that each document is relevant--for example, “This document is 90 percent likely to be relevant.”
Refine. The attorney reviewer then continues to code smaller samples of documents, helping the TAR system learn to be more accurate. This refinement is repeated until the system stabilizes.
Stabilize. The system stabilizes when further refinement no longer improves its relevancy rankings and there are no significant differences in coding patterns. At this point, relevant documents can be produced at the relevancy ranking level requested by outside counsel (say, 70 percent), with a quality-control check on a sample of documents below that threshold.
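The train-rank-refine loop above can be illustrated with a deliberately crude scorer. Commercial relevancy ranking engines such as Equivio or Relativity Analytics use far more sophisticated machine-learning models; this sketch only shows the shape of the process: learn word weights from an attorney-coded sample, then score uncoded documents between 0 and 1.

```python
from collections import Counter

def train_ranker(coded_sample):
    """Learn word counts from attorney-coded documents: (text, is_relevant) pairs."""
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in coded_sample:
        (relevant if is_relevant else irrelevant).update(text.lower().split())
    return relevant, irrelevant

def relevancy_score(text, model):
    """Fraction of a document's words seen more often in relevant coded examples."""
    relevant, irrelevant = model
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if relevant[w] > irrelevant[w])
    return hits / len(words)

# Attorney-coded training sample (hypothetical blue-tiles matter)
coded = [
    ("blue tile contract signed", True),
    ("blue tile shipment invoice", True),
    ("office party friday", False),
]
model = train_ranker(coded)
print(relevancy_score("blue tile purchase order", model))  # prints 0.5
print(relevancy_score("office party memo", model))         # prints 0.0
```

In this toy model, refinement corresponds to coding more samples and retraining; stabilization is the point where retraining stops moving the scores.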
Get rich, quick. A major part of making a TAR system more effective is providing a richer data set. Richness is defined as the percentage of relevant documents found in the greater set of documents. If a large percentage of the entire set of documents to be reviewed is relevant, then the data set will be richer, the TAR process will be more effective, and less time will be spent training the TAR system. Much of making a data set rich is done at the data culling stage, but there are some steps that in-house and outside counsel can take to make data richer:
Custodian filtering. Make sure you identify beforehand who is relevant for this review. You can do this by talking to individuals whose data will be collected to see if they are relevant to the case.
Map out your data. Another way to find the most important data is to speak with custodians to figure out how they organize information so you can use only the most important items. For example, maybe someone has a folder organization system that you can use to identify the richer documents more easily.
Attorney tested, court approved.
Two federal court opinions have recently approved the use of TAR, recognizing it as a legitimate way to conduct eDiscovery. In 2012, in Moore v. Publicis Groupe, Magistrate Judge Peck approved the use of TAR and explained the technology at length. On September 17, 2014, in Dynamo Holdings v. Commissioner of Internal Revenue, U.S. Tax Court Judge Ronald L. Buch approved the use of TAR. In dismissing the IRS’s argument that TAR is an unproven technology, Judge Buch noted that while regular eDiscovery would have cost over $500,000 in this case, the estimated cost with TAR amounted to around $80,000.
What are you waiting for?
As courts have accepted the role of TAR in the modern era of eDiscovery, it is imperative that attorneys save resources (time and money) by applying the latest review technology. There are better things to do with $420,000 than make attorneys manually review documents.
Dynamo Holdings Limited Partnership v. Commissioner of Internal Revenue, 143 T.C. No. 9 (Sep. 17, 2014), available at http://www.catalystsecure.com/components/com_wordpress/wp/wp-content/uploads/2014/09/Dynamo_Holdings_v_Commissioner_of_Revenue.pdf
Moore v. Publicis Groupe, 287 F.R.D. 182 (S.D.N.Y. 2012), adopted sub nom. Moore v. Publicis Groupe SA, No. 11 Civ. 1279 (ALC)(AJP), 2012 WL 1446534 (S.D.N.Y. Apr. 26, 2012), available at http://www.catalystsecure.com/components/com_wordpress/wp/wp-content/uploads/DaSilva_Moore_11_civ_1279_Opinion_20120224.pdf
Maura R. Grossman and Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review,” Richmond Journal of Law and Technology, Spring 2011, available at http://jolt.richmond.edu/v17i3/article11.pdf