Summary
Companies are finding that a new cost-cutting solution for eDiscovery is a tool known as predictive coding, or more generally, Technology Assisted Review. There are a variety of tools on the market, each with its own methods, but the software is designed to perform a search and return like documents that can be filtered while minimizing human input.
However, predictive coding tends to work best when the search is simplified to a binary response: is a given document responsive or non-responsive to a given set of criteria? This goes beyond a basic keyword search, but the current software struggles with more nuanced issues. It also works best on large volumes of documents, where using it to eliminate irrelevant documents can drastically reduce attorney time.
A producing party can use predictive coding software without the opposing party’s agreement on the final set of documents. The risk remains, however, that if the party fails to produce relevant documents, its process can be deemed unreasonable, which could lead to sanctions, while too many false positives drive up review costs.
Predictive coding remains one tool in an attorney’s arsenal during document review, and it is not to the point where it can operate without human input. However, attorneys rely on it as an accurate tool capable of saving their client time and money.
Sanjog: Welcome listeners, this is Sanjog Aul, your host, and the topic for today’s conversation is “Complex Case Law eDiscovery: Is Predictive Coding a Panacea?” I have with me Shannon Capone Kirk. She is the eDiscovery counsel at Ropes & Gray. Thank you for joining us.
So I just wanted to get your wisdom and insight when it comes to predictive coding. What we have learned all along is that reviewing huge volumes of electronic documents is a painstaking process, and predictive coding has been provided as a solution; how well does this actually work?
Shannon: Well, it can work well, but it depends on a few things: it depends on the volume of data you have, it depends on the tool that you’re using and it depends on what your goal is. So let me break that down a little bit. When I talk about volume I mean of course the amount of data that you have to review. So predictive coding is, I have found at least, more effective, more cost cutting and more useful if you have a large amount of data. Then the tool is also going to seriously impact how well it actually works. Keep in mind that predictive coding is actually just a very broad term that covers a number of different products, and technically “predictive coding” really applies to only one tool out there, and I believe that they either have or are trying to get a trademark on that phrase.
However, we all use it to refer to what’s also known as, more generally, “Technology Assisted Review,” right? And so what we’re really talking about here is where a review platform will learn from the attorneys, who teach it what they are looking for, and the computer assists them in bringing back and calling back like documents, so to speak. So that’s a very simple way to look at it. So whether or not predictive coding is a useful solution is of course going to depend on the tool, and all of the tools are different. They all operate very differently, so it’s not like a Coke versus Pepsi kind of comparison; they all do functionally very different things.
And then finally, what is your goal? I have found that predictive coding is best when you’re dealing with a larger population of documents and you really just want to use the tool for a binary decision, that being responsive or not responsive. So far in my experience, a particular quote-unquote predictive coding tool was very useful in cutting costs when we were able to just slash off a large amount of non-responsive documents, and we were really truly making a binary responsive/not responsive call.
I have not had good experience in using the tool for more nuanced issues or complicated issue finding. I have found that it has not been effective in at least the cases when I have used it for that. Of course I’m not speaking beyond what I’ve actually done, and there could be other attorneys out there who have been able to use a different tool in a different way.
Sanjog: But the final set produced still has to be agreed upon by human beings who are representing both parties. Can predictive coding only go so far? How do organizations justify investing in it and actually using the software?
Shannon: There’s a misunderstanding that the final set of documents when using predictive coding has to be agreed upon by both parties. That’s not the case in my opinion. The producing party has a duty, no doubt, to conduct a reasonable search for relevant or potentially relevant documents, review them and produce them.
Now there could be this misunderstanding about predictive coding because there have been two cases in the last year in which the parties agreed to a predictive coding review methodology in which they would share the seed sets of the documents and the iterations of those documents, also ensuring that the producing party shares with the opposing party documents that were coded not responsive.
And the reason for that was the producing party agreed to allow the receiving party to sort of cross-check their work, or in other words, agree upon essentially the methodology and ultimately the final set. Now not everybody agrees that that is the correct approach, and in fact in the Kleen Products case in the Northern District of Illinois before Judge Nan Nolan, the plaintiffs in a large antitrust matter tried to compel the defendants to use predictive coding. In any event, the process that the plaintiffs proposed would have required that sharing of the seed sets and the involvement of both parties in the review process.
Ultimately, through a series of hearings, the judge determined that no, it is actually up to the producing party how best to evaluate their review methodology, and she cited Sedona Principle Six. Now, Sedona Principle Six comes out of the Sedona Conference, which is a group of lawyers, industry professionals and judges who have devoted their time to creating principles for eDiscovery and other issues as well, and they are highly regarded and cited in numerous cases. And Sedona Principle Six states that “Responding parties are best situated to evaluate the procedures, methodologies and technologies appropriate for preserving and producing their own electronically stored information.”
So you see that gets us away from this notion that in all cases, this final set or the methodology to get to the final set must be agreed upon. Now of course if the receiving party does not think that the producing party produced everything or did a reasonable search, they can make that argument.
Sanjog: Is predictive coding technology actually going to lower costs and make the process better and more efficient? Is this primarily used by general counsels and attorneys?
Shannon: No, not yet. It is not “primarily” advocated by general counsel or attorneys yet. However, it is certainly a topic of serious consideration, and it is something that I always evaluate in every case now: whether it would be appropriate in that particular case. Will it save money? As I mentioned, if used correctly on the right case with the right tool, yes. And for example, I had a case where we estimated that if we had done the review in the traditional method, it would have taken 12 weeks and cost a certain amount of money.
We used predictive coding, however, to cut that review down to three weeks, thereby cutting the cost of attorney time by more than $300,000. So in that case, you can see that we had serious cost savings, which is why people are paying attention to this.
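The figures in that example can be worked through as simple arithmetic. The sketch below uses only the numbers quoted above (12 weeks, three weeks, more than $300,000); the per-week figure is a rough lower bound that assumes the savings were spread evenly across the weeks avoided, which the interview does not state.

```python
# Figures quoted in the interview.
traditional_weeks = 12
predictive_weeks = 3
minimum_savings_usd = 300_000  # "more than $300,000", so this is a floor

weeks_saved = traditional_weeks - predictive_weeks
time_reduction = weeks_saved / traditional_weeks

# Assumption (not from the interview): savings spread evenly over the weeks avoided.
implied_savings_per_week = minimum_savings_usd / weeks_saved

print(f"Weeks saved: {weeks_saved}")
print(f"Review time reduced by: {time_reduction:.0%}")
print(f"Implied attorney cost avoided per week: at least ${implied_savings_per_week:,.0f}")
```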
Sanjog: Is this “Star Trek” stuff that predictive coding is actually self-learning and improves the more you use it in trying to find relevant topics?
Shannon: You know, it’s not “Star Trek,” and I wish that there was such a thing that I could just plug in and tell it what I was looking for and that the magical theory would come back with all of my responsive documents. But you know what? We are just not there yet. However, it does seem that the better the human teachers are to the tool, and the more time and iterations those attorneys spend with the tool, the better the tool is and the better your results will be. Now keep in mind though that predictive coding is just one tool. It is one tool in the whole arsenal of what lawyers have before them nowadays to conduct efficient and cost-effective document review.
So it should not be the only tool to be used. There are other technologies out there that live in tandem with predictive coding. You still have traditional search terms, and I’m not convinced that those must die a quick death. They are still used to some degree for some level of search. There are tools that analyze conversational patterns. They give visual depictions of who’s talking to who and when, which helps you to identify relevant conversation threads, relevant folks who perhaps you haven’t thought of, time frames, and spikes in conversations that are helpful to zero in on quickly, i.e. cutting down review time.
There are also more advanced de-duplication tools out there, which allow you to apply coding from one document to a number of documents, whether they reside as attachments to an email or within a user’s file repository. All of these things together should live with predictive coding to make for smarter reviews.
Sanjog: So Shannon, to what degree do you think a seasoned attorney like you can actually rely on this predictive coding technology? In your experience, what aspects of eDiscovery cannot be programmed into this software? Further, are there risks of missing documents and false positives?
Shannon: Well, seasoned attorneys should rely on predictive coding only to the extent that the resources they have assigned to teach the tool are actually teaching it in a manner that is appropriate and reasonable and will result in the best use of the tool. So part of it is definitely dependent on human input, and it is certainly not the case that we can just simply rely on the tool. A large part of it is who is doing the teaching with the seed sets, and the seed sets of course are a statistical sample of the entire corpus of documents that you code as a human reviewer, and then that is what is teaching the whole system.
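The seed-set idea described here can be illustrated with a toy classifier. This is a minimal sketch, not how any commercial tool works: it assumes a simple naive Bayes model over word counts, and the sample documents and the function names (`train`, `score`, `predict`) are hypothetical. It also assumes the seed set contains at least one responsive and one non-responsive document.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; real tools do far more than this."""
    return text.lower().split()


def train(seed_set):
    """Learn word counts from (text, is_responsive) pairs coded by human reviewers."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_responsive in seed_set:
        doc_counts[is_responsive] += 1
        word_counts[is_responsive].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def score(model, text):
    """Return the log-odds that a document is responsive (positive means responsive)."""
    word_counts, doc_counts, vocab = model
    log_odds = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        # Add-one smoothing so a word unseen in one class does not zero out the score.
        p_resp = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_non = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        log_odds += math.log(p_resp / p_non)
    return log_odds


def predict(model, text):
    """Binary call: responsive (True) or not responsive (False)."""
    return score(model, text) > 0


# Hypothetical seed set: a small sample of the corpus coded by attorneys.
seed_set = [
    ("pricing agreement with competitor on supply", True),
    ("meeting to discuss pricing and supply reduction", True),
    ("lunch menu for the office party", False),
    ("reminder to submit your parking pass form", False),
]
model = train(seed_set)

print(predict(model, "email about competitor pricing"))
print(predict(model, "office party parking"))
```

In practice the attorneys would review a batch of the tool's calls, correct the mistakes, add those corrected documents to the seed set and retrain, which is the iteration the interview refers to.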
So it is certainly something that relies on human input, and therefore, the degree to which those folks are knowledgeable is the degree to which you can rely on the tool. Also, it depends on the tool itself because all these tools are different. Attorneys should do their homework and test their tool’s reliability. To my knowledge, no specific tool has actually been tested in terms of technology by a court and found to be adequate or inadequate. What the courts have said so far is, “Based on what I’ve heard so far from you, party plaintiff or defendant, this is a reliable tool and I have no reason to disagree with that, and certain studies show that technology assisted review may be more effective than human review, so I will allow it.”
The next part of your question is what cannot be relied upon within these tools? I have found so far that predictive coding or technology assisted review is not very effective or precise when trying to find attorney work product or attorney-client communications. To a large degree, finding those is still very human dependent, and I’ve not found a tool yet that is very good at being accurate with respect to those calls. And those are very important calls, by the way: a large part of why we do document review is to make sure that we are not producing privileged communications or work product.
Lastly, what are the risks of missing documents with technology assisted review, and what are the risks of finding false positives? Of course, the risk of not finding otherwise relevant documents is that you as counsel will not be compliant with the requirements of the Federal Rules of Civil Procedure to conduct a reasonable search. What it will all come down to is an evaluation of the process that you’ve put in place: what was the procedure, what was your QC, who did the review and truly how well did you run and supervise that review? So your risk is that your process was not good, and therefore you will be found to have not complied with your duties as counsel to conduct a reasonable search.
That, of course, could lead to sanctions. The risk of finding false positives again speaks to not having a good process. If you are finding and continuing to find too many false positives, it should be telling you one of two things: either A) your tool is not good or B) your process is not good. And of course the risk of having too many false positives is that you have now run up your review costs and you are not really saving the client any money.
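Missed documents and false positives are commonly measured as recall and precision on a quality-control sample that attorneys have checked by hand. The sketch below shows one way to compute them; the sample data and the function name `review_metrics` are hypothetical, and the interview does not say which measures any particular tool or court relies on.

```python
def review_metrics(predicted, actual):
    """Compare the tool's calls with attorney QC calls on the same sample.

    predicted: list of bools, True where the tool coded a document responsive.
    actual:    list of bools, True where the QC reviewer found it responsive.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)   # reviewed for nothing
    false_neg = sum(1 for p, a in pairs if not p and a)   # responsive but missed

    # Precision: share of documents flagged responsive that really are.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of truly responsive documents that the tool found.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return {
        "false_positives": false_pos,
        "missed_documents": false_neg,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical six-document QC sample.
tool_calls = [True, True, True, False, False, True]
qc_calls = [True, False, True, True, False, True]

metrics = review_metrics(tool_calls, qc_calls)
print(metrics)
```

Low precision points to the cost problem described above, since attorneys spend time on documents that turn out not to be responsive. Low recall points to the reasonable-search problem, since responsive documents are being left out.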
Sanjog: Finally, what do you really want from predictive coding, and how is that any different from what corporate general counsel expect from it?
Shannon: Of course, what I want is 100 percent aligned with what the corporate general counsel wants. I want to save time and I want to save money. I want to save time because I want to meet my deadlines. And I also want to save time because it reduces the amount of attorney hours spent on review, which therefore saves the client money. But at the same time, I also want a tool that is accurate because I have a duty as an attorney, and my client has their duty, to be compliant with the Federal Rules of Civil Procedure or the state court rules, whatever court I’m in, to conduct a reasonable search and to produce documents that are responsive or will lead to relevant, admissible information. And so I think that those goals of what I want out of predictive coding or technology assisted review should be consistent with what corporate general counsel would want as well.