Predictive coding for e-discovery is a form of technology-assisted review. Predictive coding software takes attorneys' decisions about the relevance of sample documents and applies them to larger document sets. The underlying algorithm uses machine learning: it learns from labeled training examples and refines its predictions over iterative cycles. This way, lawyers can quickly and accurately locate documents relevant to a specific case. Predictive coding has greatly expedited the review process since it was judicially endorsed in 2012. When predictive coding for e-discovery first emerged, there was a great deal of anticipation among legal practitioners, but today the technology has a well-established place in e-discovery. The process encourages parties to be transparent, and it also saves attorney time and judicial resources.
How Predictive Coding Works
Predictive coding is controlled by lawyers, who specify the relevance criteria for the review. The computers then carry out an expedited review that is becoming increasingly automated, objective, and scientific. In short, predictive coding is a lawyer-driven, computer-expedited document review applied in e-discovery for government and other civil litigants. Electronically stored information (ESI) is electronically coded and prioritized according to discovery responsiveness, privilege, and designated issues before and during the legal discovery process.
A lawyer who is closely familiar with the case under review specifies the sets of data that define the essence of the case. The review process involves an iterative search that uses an algorithm to produce much smaller sets of documents from the input data.
If the reviewing attorney decides that some results are not probative, he or she can initiate a deeper iteration to be run by the predictive coding algorithms. The artificial intelligence behind predictive coding uses machine learning to learn how to distinguish what is relevant. As a result, each iteration produces a much smaller subset of data that is more relevant and accurate than the preceding subsets.
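One way to picture a single iteration of this loop is the sketch below. It is only an illustration under stated assumptions: the tiny naive Bayes scorer stands in for whatever model a commercial tool actually uses, and the names train, score, review_round, and attorney_label are hypothetical, not taken from any product.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(labeled):
    """Fit a tiny naive Bayes model from (text, is_relevant) pairs."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, label in labeled:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab

def score(model, text):
    """Log-odds that a document is relevant; above zero leans relevant."""
    counts, docs, vocab = model
    # Laplace smoothing on both the class priors and the word likelihoods
    log_odds = math.log((docs[True] + 1) / (docs[False] + 1))
    totals = {c: sum(counts[c].values()) + len(vocab) for c in (True, False)}
    for word in tokenize(text):
        if word in vocab:  # words never seen in training carry no evidence
            p_rel = (counts[True][word] + 1) / totals[True]
            p_not = (counts[False][word] + 1) / totals[False]
            log_odds += math.log(p_rel / p_not)
    return log_odds

def review_round(labeled, unlabeled, attorney_label, batch_size=2):
    """One iteration: rank unreviewed documents, then have the attorney
    code the highest-scoring batch and add those decisions to the training set."""
    model = train(labeled)
    ranked = sorted(unlabeled, key=lambda d: score(model, d), reverse=True)
    batch, rest = ranked[:batch_size], ranked[batch_size:]
    labeled = labeled + [(d, attorney_label(d)) for d in batch]
    return labeled, rest

# Hypothetical seed set and document pool, for illustration only
seed = [("merger pricing memo", True), ("lunch menu friday", False)]
pool = ["pricing strategy for merger", "friday parking notice",
        "merger timeline draft", "holiday lunch signup"]
# Stand-in for the reviewing attorney's judgment
attorney = lambda d: "merger" in d or "pricing" in d
labeled, pool = review_round(seed, pool, attorney)
```

Calling review_round repeatedly mirrors the cycle described above: each pass retrains on everything the attorney has coded so far, so the ranking of the remaining documents should sharpen as the labeled set grows.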
At the end of the iteration process, there remains a larger set of documents classified as irrelevant, which can be used to verify the integrity of the probative subsets. This is done by confirming the absence of any probative material in that larger set. Once the review is done, the end use of the probative material depends entirely on the risk threshold and comfort level of the attorney and the client. In summary, predictive coding completes the following tasks:
• Leveraging small samples of data to find other relevant documents
• Reducing the number of non-relevant documents that attorneys must review
• Statistically validating the results of the review using the larger set of irrelevant documents
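The validation step in the last bullet could be sketched as follows. This is a minimal illustration, assuming the check is done by drawing a random sample from the set classified as irrelevant and having an attorney code it. The function name elusion_estimate, the is_relevant callback, and the sample size are hypothetical, and the confidence bound is a rough approximation rather than a method any court or vendor prescribes.

```python
import math
import random

def elusion_estimate(discard_pile, is_relevant, sample_size, seed=0):
    """Estimate the share of relevant documents left in the discard pile.

    Returns (observed_rate, approximate_95_percent_upper_bound).
    """
    if not discard_pile or sample_size <= 0:
        raise ValueError("need a non-empty discard pile and a positive sample size")
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(discard_pile, min(sample_size, len(discard_pile)))
    n = len(sample)
    hits = sum(1 for doc in sample if is_relevant(doc))
    rate = hits / n
    if hits == 0:
        # "Rule of three": with zero hits in n draws, about 3/n is a 95% upper bound
        upper = min(1.0, 3 / n)
    else:
        # Normal approximation to the binomial; crude for very small samples
        upper = min(1.0, rate + 1.96 * math.sqrt(rate * (1 - rate) / n))
    return rate, upper

# Hypothetical discard pile in which the attorney finds nothing relevant
discard = [f"doc{i}" for i in range(1000)]
rate, upper = elusion_estimate(discard, lambda doc: False, sample_size=300)
```

With no relevant documents found in a sample of 300, the observed rate is 0 and the approximate upper bound is 3/300, or 1 percent. Whether that is acceptable depends on the risk threshold of the attorney and the client.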
Legal Issues and Other Concerns That Have Arisen With Predictive Coding
Two important legal issues arise with the use of predictive coding for e-discovery:
• The first issue is whether counsel can leverage the efficiencies of predictive coding for e-discovery and still meet their legal obligations, i.e., be in a position to carry out a reasonable inquiry for responsive documents under the federal discovery rules.
• The second concern is whether counsel can safeguard attorney-client privilege, given that predictive coding for e-discovery creates room for privileged information to be disclosed. The risk is real, since privileged information has been disclosed even when attorneys used the most traditional discovery methods.
These two questions need to be addressed by federal case law in light of Federal Rule of Civil Procedure 26(g)(1), under which an attorney who signs a discovery response certifies that it was made after a reasonable inquiry. Salient questions also arise about what constitutes a reasonable inquiry and whether counsel can still satisfy the requirements of Federal Rule of Evidence 502, which concerns the unintended disclosure of information otherwise protected by the attorney-client privilege. The ultimate solution is for the holder of the privilege to take reasonable steps to prevent disclosure, inadvertent or otherwise.
• Human Review Myth
The human review myth is the pervasive misconception that human review is the most thorough and accurate way of identifying relevant documents. Studies have, however, shown this belief to be wrong by confirming the fallibility of purely manual review. The widely held perception has lost ground over time, and more lawyers are gradually adopting predictive coding for e-discovery.
• Technical Unfamiliarity
The introduction of a disruptive technology into the legal industry, which has historically been resistant to new technology, was met with a fair amount of skepticism. This was due especially to the advanced elements of the procedure, which made predictive coding seem quite complex. The technology has, however, proved its worth over the years, and more lawyers are on board with its use. Besides, even though the process is complex, the complexities are largely hidden from the end user.
Justification of Predictive Coding for e-Discovery
Under the standard of reasonableness set by the Federal Rules, the use of predictive coding is considered reasonable. Even though significant issues arise when artificial intelligence is used to review documents, some of the features that at first blush make it appear unreasonable are in fact the ones that allow it to pass as satisfactory. For instance, document review is made more efficient because the reviewer is exposed only to subsets identified on the basis of the producing party's specifications.
Predictive coding saves both parties time and resources that would have otherwise gone into narrowing down documents to find the most relevant ones.
The iterative nature of predictive coding gives it the ability to refine relevant subsets that can also be validated statistically. The key to making the most of predictive coding is establishing mandatory metrics for results that might have gone unscrutinized under past e-discovery paradigms.
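Two metrics commonly used to judge a review are precision (the share of documents flagged as relevant that really are) and recall (the share of truly relevant documents that were flagged). The sketch below shows how they could be computed from a validation sample that an attorney has coded. The function name and the sample data are hypothetical, and the source does not say which metrics or thresholds a given matter would require.

```python
def review_metrics(predicted, actual):
    """Precision and recall of the tool's flags against attorney coding.

    predicted and actual are parallel lists of booleans, one pair per
    document in the validation sample.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Guard against empty denominators when nothing was flagged or nothing is relevant
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical six-document validation sample
tool_flags = [True, True, False, True, False, False]
attorney_coding = [True, False, False, True, True, False]
precision, recall = review_metrics(tool_flags, attorney_coding)
```

In this made-up sample the tool flags three documents, two of them correctly, and misses one relevant document, so both precision and recall come out to two thirds.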
Moving Beyond Predictive Coding
As noted above, predictive coding is the principal type of artificial intelligence used in e-discovery to expedite the review process. New technologies are emerging thanks to continued research into artificial intelligence. Studies have found that deep learning techniques can surpass the capabilities of previous generations of AI. Such technologies are bound to reshape AI-augmented document review. Building upon the most advanced deep learning technologies may eventually eliminate the need for seed sets.
Predictive coding has positively impacted e-discovery since its introduction in the legal industry. The technology not only reduces the number of non-relevant documents lawyers have to go through but also produces results whose credibility can be statistically demonstrated.