Applying AI & Data Science to MDL Case Processes

Chia Jeng Yang

5 mins. read

Oct 15, 2025

Motivation 

Identifying potential multidistrict litigation (MDL) cases as they emerge is a non-trivial task for law firms. Doing so quickly gives a firm a competitive edge over others, and we believe that automating the process with a data-driven approach is a logical next step for us to explore.

This article presents our findings from scraping court docket data and from training a machine learning model to predict case outcomes.

Scope Disclaimer

MDL represents the highest-volume end of the U.S. mass-tort spectrum, where the Judicial Panel on Multidistrict Litigation (JPML) consolidates large numbers of related federal cases for coordinated pre-trial management. By design, this report focuses on these large, court-managed proceedings, as their size and reach allow rich, publicly accessible docket data to be extracted and analysed at scale. Yet many profitable mass-tort campaigns never reach the MDL threshold, and consequently the figures presented here should not be interpreted as representative of the entire potential mass-tort market.

Data Mining

We used a multi-step process to obtain our dataset. This involved scraping the JPML website for terminated cases, utilising EXA Research to identify court dockets, and several data engineering steps to clean and process the data. Any missing docket metadata was scraped and filled in using in-house extraction agents. 
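As a rough illustration of the cleaning step, the sketch below merges duplicate docket records and lists the metadata fields that would still need a follow-up extraction pass. The field names (`mdl_number`, `court`, `date_filed`, and so on) and the "first non-empty value wins" merge rule are assumptions made for this example, not a description of our production pipeline.

```python
from typing import Iterable

# Hypothetical metadata fields; the real schema is not described in this article.
REQUIRED_FIELDS = ("mdl_number", "court", "date_filed", "date_terminated", "nature_of_suit")


def clean_dockets(records: Iterable[dict]) -> tuple[list[dict], dict[str, list[str]]]:
    """Merge records that share an MDL number and report fields still missing."""
    merged: dict[str, dict] = {}
    for rec in records:
        key = str(rec.get("mdl_number", "")).strip()
        if not key:
            continue  # a record without an identifier cannot be merged
        target = merged.setdefault(key, {"mdl_number": key})
        for field, value in rec.items():
            if field == "mdl_number":
                continue
            if isinstance(value, str):
                value = value.strip()
            # Keep the first non-empty value seen for each field.
            if value not in (None, "") and not target.get(field):
                target[field] = value

    # Only dockets with gaps need a follow-up extraction pass.
    missing = {}
    for key, rec in merged.items():
        gaps = [f for f in REQUIRED_FIELDS if not rec.get(f)]
        if gaps:
            missing[key] = gaps
    return list(merged.values()), missing
```

The `missing` mapping is the kind of work list that could be handed to an extraction agent, one entry per docket with incomplete metadata.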

Schema

Based on the data, we propose modelling an MDL case as a progression through the following stages:

| #  | Stage | What it covers |
|----|-------|----------------|
| 1  | Initial transfer & coordination | JPML petition, transfer order, docket consolidation, case-tagging |
| 2  | Pre-trial proceedings | Master/short-form pleadings, Rule 12 motions, common discovery, Daubert |
| 3  | Bellwether trials | Representative MDL trials to gauge value and legal issues |
| 4  | Settlement negotiation | Global frameworks, court-supervised mediations, individual deals |
| 5  | Remand / individual resolution | Unsettled cases returned to original courts (or state courts) for further action |
| 6  | Appeals & post-MDL management | Interlocutory or final appeals, mandate returns, coordination after remand |
| 7  | Termination outcomes | |
| 7a | Trial verdict entered | Jury/bench verdict → final judgment (record verdict direction: plaintiff or defendant) |
| 7b | Settlement finalised | Court-approved global settlement or stipulated dismissal with consideration; often follows bellwethers or mediation |
| 7c | Non-trial judgment | Merits resolved without trial – summary judgment (Rule 56) or default judgment when defendant fails to appear |
| 7d | Procedural dismissal / closure | Case ends without merits decision – Rule 12 dismissal, voluntary Rule 41 drop, administrative closure, or bankruptcy-based closure |
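One way to carry this schema into code is a small enumeration keyed by the stage codes above. This is a minimal sketch: the member names and the `is_terminal` helper are illustrative choices, not part of the schema itself.

```python
from enum import Enum


class Stage(Enum):
    """Stage codes from the schema; member names are illustrative."""
    INITIAL_TRANSFER = "1"
    PRETRIAL = "2"
    BELLWETHER = "3"
    SETTLEMENT_NEGOTIATION = "4"
    REMAND = "5"
    APPEALS = "6"
    TRIAL_VERDICT = "7a"
    SETTLEMENT_FINALISED = "7b"
    NON_TRIAL_JUDGMENT = "7c"
    PROCEDURAL_DISMISSAL = "7d"

    @property
    def is_terminal(self) -> bool:
        # All termination outcomes share the "7" prefix in the schema.
        return self.value.startswith("7")


terminal_codes = [s.value for s in Stage if s.is_terminal]
```

Looking a stage up by its code, for example `Stage("7b")`, returns the matching member, which keeps classifier labels and downstream analysis on the same vocabulary.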

Classification Performance

We used GPT-4.1 to classify the final state of each MDL case, then trained a CatBoost model to predict one of three outcomes: Settlement (7b), Dismissal (7d), or Other. On the test set, the model achieved 73% accuracy and correctly identified 100% of Settlement cases and 76% of Dismissal cases. There were no false positives (incorrect win predictions), which suggests that the model is risk-averse about overestimating success, a favourable property for legal predictions.

| Metric | Value |
|--------|-------|
| CV macro‑F1 (best) | 0.585 |
| Test accuracy | 0.729 |
| Test macro‑Precision | 0.651 |
| Test macro‑Recall | 0.649 |
| Test macro‑F1 | 0.630 |
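To make the metrics above concrete, the sketch below collapses stage codes into the three target classes and computes accuracy and macro-averaged precision, recall, and F1 in plain Python. Two assumptions are worth flagging: every code other than 7b and 7d is mapped to Other, and macro-F1 is taken as the mean of per-class F1 scores (the common convention, and the reason macro-F1 can sit below both macro-precision and macro-recall). The function names are illustrative, and any labels passed in are the caller's own data.

```python
from collections import Counter

CLASSES = ("Settlement", "Dismissal", "Other")


def to_target(stage_code: str) -> str:
    """Collapse a schema stage code into one of the three target classes."""
    code = stage_code.strip().lower()
    if code == "7b":
        return "Settlement"
    if code == "7d":
        return "Dismissal"
    return "Other"  # assumption: every remaining stage is treated as Other


def macro_scores(y_true: list[str], y_pred: list[str]) -> dict[str, float]:
    """Accuracy plus unweighted (macro) means of per-class precision, recall, F1."""
    pairs = Counter(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for cls in CLASSES:
        tp = pairs[(cls, cls)]
        fp = sum(n for (t, p), n in pairs.items() if p == cls and t != cls)
        fn = sum(n for (t, p), n in pairs.items() if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(CLASSES)
    return {
        "accuracy": sum(pairs[(c, c)] for c in CLASSES) / len(y_true),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }
```

Because macro averaging weights each class equally, a weak minority class pulls the macro scores down even when overall accuracy looks healthy, which is one way accuracy can exceed macro-F1.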

Takeaways from the Data

Distribution of MDL Outcomes and Durations

Based on the scraped data spanning 2005 to 2023, most MDLs terminate with either 7b: Settlement or 7d: Dismissal.

MDLs whose last recorded stage was an interim one were likely missing discoverable docket data. We have classified them as 'Other' for brevity.

Every MDL in the dataset terminated within 20 years. 2014 had the longest median time to termination (7.7 years), with 2011 a close second (7.4 years).

  • The time from motion filing to the court hearing is just over 36 days.

  • The wait from the court hearing to leadership setup is about half that (18 days).

  • The average discovery duration is close to a year.

  • The time from the court hearing to a bellwether trial is about 1.3 years.
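Durations like these can be derived from pairs of docket dates. The sketch below is one possible approach, assuming each case is a dictionary with hypothetical `filed` and `terminated` dates and that medians are grouped by filing year; the grouping key behind our own figures is not implied by this example.

```python
from datetime import date
from statistics import median


def days_between(start: date, end: date) -> int:
    """Whole days between two docket milestones."""
    return (end - start).days


def years_between(start: date, end: date) -> float:
    """Elapsed time in years, using an average year length of 365.25 days."""
    return days_between(start, end) / 365.25


def median_duration_by_year(cases: list[dict]) -> dict[int, float]:
    """Median years from filing to termination, grouped by filing year (assumed)."""
    by_year: dict[int, list[float]] = {}
    for case in cases:
        filed, terminated = case["filed"], case["terminated"]
        by_year.setdefault(filed.year, []).append(years_between(filed, terminated))
    return {year: round(median(vals), 1) for year, vals in sorted(by_year.items())}
```

The same `days_between` helper applies to any milestone pair, such as motion filing to hearing or hearing to leadership setup.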

Bellwether Trials by Year and Nature of Suit (NOS)

  • 2013 saw the biggest spike with 8 bellwether trials held that year.

  • 2007 and 2017 were joint second, with 6 bellwether trials each.

  • Most bellwether trials related to Personal Injury - Product Liability and Antitrust, followed by Other.

Types of MDLs by Settlement/Dismissal

  • Interestingly, most Antitrust and Personal Injury - Product Liability MDLs ended in settlement, whereas Patent and Retirement Security MDLs were more likely to be dismissed.

  • MDLs based on 28:1331 (Federal Question) and 15:1 (Sherman Antitrust Act, Restraint of Trade) were more likely to be settled.

  • MDLs under 28:1332 (Diversity Jurisdiction) show a ~1:1 settlement-to-dismissal ratio, consistent with stronger claims settling, either voluntarily or under bellwether leverage, while weaker claims are dismissed.
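A settlement-to-dismissal ratio per statutory basis can be computed with a simple tally. The sketch below assumes each case is a `(cause, outcome)` pair, where `cause` is a docket-style code such as `28:1332`; the function name and input shape are hypothetical.

```python
from collections import defaultdict


def outcome_ratio_by_cause(cases: list[tuple[str, str]]) -> dict[str, float]:
    """Settlement-to-dismissal ratio for each cause code."""
    counts: dict[str, dict[str, int]] = defaultdict(
        lambda: {"Settlement": 0, "Dismissal": 0}
    )
    for cause, outcome in cases:
        # Outcomes other than Settlement/Dismissal do not enter the ratio.
        if outcome in ("Settlement", "Dismissal"):
            counts[cause][outcome] += 1

    ratios = {}
    for cause, c in counts.items():
        # A cause with no dismissals has an unbounded ratio.
        ratios[cause] = c["Settlement"] / c["Dismissal"] if c["Dismissal"] else float("inf")
    return ratios
```

A ratio near 1.0 corresponds to the roughly even split described for diversity cases, while values well above 1.0 indicate a settlement-heavy basis.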

MDL Phase Transitions

The diagram below shows the proportions in which MDLs in the dataset move between stages over time.

  • Note that even after MDL settlements are finalised, a significant portion of them (~30%) are re-opened for negotiations, individual resolutions, and further appeals.

  • A small proportion (~15%) of cases are abruptly dismissed at the initial transfer stage.
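Transition proportions of this kind can be estimated by counting consecutive stage pairs in each case history. The sketch below is a minimal version, assuming each history is an ordered list of stage codes; the `END` sentinel is an addition for this example, so that cases which stay terminated count toward the denominator.

```python
from collections import Counter, defaultdict


def transition_proportions(histories: list[list[str]]) -> dict[str, dict[str, float]]:
    """Share of cases moving from each stage to each next stage."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for stages in histories:
        path = list(stages) + ["END"]  # sentinel marks "no further activity"
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }
```

With this layout, the share of finalised settlements that are later re-opened is one minus the `END` proportion under `7b`, and early dismissals show up as the `7d` entry under stage `1`.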

Top 10 Case Types by Nature of Suit

The most common case types were Antitrust (39), Other (37), and Personal Injury - Product Liability (36).

Conclusion

We have shown that a systematic approach can reveal patterns in consolidated MDL docket data, including patterns in case progression and in how jurisdictional basis relates to case resolution. While MDL proceedings represent only a small proportion of the wider mass-tort landscape, their scale and regular structure make them an ideal ground for predictive analytics in litigation. Our findings offer possible insights for proactive case selection and resource allocation within law firms, which will become increasingly valuable for high-volume litigation.

Build a Smarter, Faster Litigation Practice

Partner with us to detect cases earlier, acquire plaintiffs faster, and scale your legal impact with purpose-built AI.