SemEval 2024 Task 5: The Legal Argument Reasoning Task in Civil Procedure

2024-01-29: ❗ The end of the evaluation phase is approaching. We have extended the deadline by 1 day to 2024-02-01 00:00:00 UTC. You can check the server time on the codalab competition page!

2024-01-09: The test partition has been sent out! If you have not received it, please contact lena.held@tu-darmstadt.de.

2023-11-28: Thank you for your patience, the CodaLab competition is now open! Make sure to check the participation rules below 🎉

We present a new NLP task and dataset from the domain of U.S. civil procedure. Each instance of the dataset consists of a general introduction to the case, a particular question, and a possible solution argument, accompanied by a detailed analysis of why the argument applies in that case. Since the dataset is based on a book aimed at law students, we believe that it represents a truly complex task for benchmarking modern legal language models.

Task

The task can be formulated as follows: Given an introduction to the topic, a question, and an answer candidate, classify whether the given answer candidate is correct (True) or incorrect (False).
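
The formulation above amounts to binary classification over (introduction, question, answer candidate) triples. The following is a minimal sketch of that interface in Python, not the organizers' code: the class name `Instance`, its field names, and the `always_false` placeholder classifier are all hypothetical, and the 0/1 encoding is inferred from the Label fields of the examples on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One task instance. Field names are illustrative assumptions,
    not the column names of the actual dataset files."""
    introduction: str
    question: str
    answer_candidate: str


def to_label(is_correct: bool) -> int:
    """Map the True/False decision to the 1/0 labels shown in the examples."""
    return 1 if is_correct else 0


def always_false(instance: Instance) -> bool:
    """Hypothetical placeholder classifier: rejects every answer candidate.
    A real system would read all three fields before deciding."""
    return False


example = Instance(
    introduction="My students always get confused about [...]",
    question="Boyle's objection to personal jurisdiction is",
    answer_candidate="not waived by removal, but will be denied because [...]",
)
predicted_label = to_label(always_false(example))
print(predicted_label)
```

Any classifier with the same signature (an `Instance` in, a boolean out) can be swapped in for the placeholder.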

Example 1

Introduction

My students always get confused about the relationship between removal to federal court and personal jurisdiction. Suppose that a defendant is sued in Arizona and believes that she is not subject to personal jurisdiction there. Naturally, she should object to personal jurisdiction. [...] But generally the scope of personal jurisdiction in the federal court will be the same as that of the state court, because the Federal Rules require the federal court in most cases to conform to state limits on personal jurisdiction. Fed. R. Civ. P. 4(k)(1)(A). I’ve stumped a multitude of students on this point. Consider the following two cases to clarify the point.

Question

7. A switch in time. Yasuda, from Oregon, sues Boyle, from Idaho, on a state law unfair competition claim, seeking $250,000 in damages. He sues in state court in Oregon. Ten days later (before an answer is due in state court), Boyle files a notice of removal in federal court. Five days after removing, Boyle answers the complaint, including in her answer an objection to personal jurisdiction. Boyle’s objection to personal jurisdiction is

Answer Candidate

not waived by removal, but will be denied because the federal courts have power to exercise broader personal jurisdiction than the state courts.

Label

0

Example 2

Introduction

My students always get confused about the relationship between removal to federal court and personal jurisdiction. Suppose that a defendant is sued in Arizona and believes that she is not subject to personal jurisdiction there. Naturally, she should object to personal jurisdiction. [...] But generally the scope of personal jurisdiction in the federal court will be the same as that of the state court, because the Federal Rules require the federal court in most cases to conform to state limits on personal jurisdiction. Fed. R. Civ. P. 4(k)(1)(A). I’ve stumped a multitude of students on this point. Consider the following two cases to clarify the point.

Question

7. A switch in time. Yasuda, from Oregon, sues Boyle, from Idaho, on a state law unfair competition claim, seeking $250,000 in damages. He sues in state court in Oregon. Ten days later (before an answer is due in state court), Boyle files a notice of removal in federal court. Five days after removing, Boyle answers the complaint, including in her answer an objection to personal jurisdiction. Boyle’s objection to personal jurisdiction is

Answer Candidate

not waived by removal. The court should dismiss if there is no personal jurisdiction over Boyle in Oregon, even though the case was properly removed.

Label

1

Obtaining the training/dev dataset

Please refer to these instructions.

CodaLab

To evaluate the submissions we are using the CodaLab competition platform. Because of the licensing restrictions of the dataset, you will have to sign up for the competition and wait for our approval. The CodaLab competition can be found here.

Schedule

We adhere to the schedule for SemEval24, which means the following dates:

Competition Rules

Mailing list

Subscribe to our mailing list.

Leaderboard

CodaLab Leaderboard (last submission)

| # | User | Entries | Date of Last Entry | Team Name | f1-score | accuracy |
|---|------|---------|--------------------|-----------|----------|----------|
| 1 | zhaoxf4 | 3 | 31.1.2024 | HW-TSC | 0.8231 | 0.8673 |
| 2 | irene.benedetto | 7 | 26.1.2024 | MAINDZ | 0.7747 | 0.8265 |
| 3 | kbkrumov | 7 | 31.1.2024 | SU-FMI | 0.7728 | 0.8367 |
| 4 | qiaoxiaosong | 9 | 30.1.2024 | | 0.7644 | 0.8163 |
| 5 | arios0 | 4 | 30.1.2024 | UTSA-NLP | 0.7315 | 0.7959 |
| 6 | kubapok | 10 | 13.1.2024 | | 0.6971 | 0.7857 |
| 7 | samyak | 4 | 31.1.2024 | LegalSense | 0.6599 | 0.7449 |
| 8 | hrandria | 7 | 29.1.2024 | | 0.6327 | 0.6939 |
| 9 | Yuan_Lu | 8 | 31.1.2024 | | 0.6000 | 0.6327 |
| 10 | PengShi | 14 | 30.1.2024 | | 0.5910 | 0.6735 |
| 11 | msiino | 1 | 31.1.2024 | Mistral | 0.5597 | 0.5714 |
| 12 | Hwan_Chang | 4 | 30.1.2024 | | 0.5556 | 0.5918 |
| 13 | kriti7 | 15 | 31.1.2024 | | 0.5511 | 0.6020 |
| 14 | woody | 22 | 30.1.2024 | | 0.5510 | 0.6633 |
| 15 | odysseas_aueb | 11 | 1.2.2024 | | 0.5143 | 0.6122 |
| 16 | Manvith_Prabhu | 11 | 31.1.2024 | SCaLAR Group, NITK Surathkal | 0.4966 | 0.6224 |
| 17 | lhoorie | 1 | 30.1.2024 | | 0.4957 | 0.5000 |
| 18 | yms | 8 | 30.1.2024 | | 0.4827 | 0.7245 |
| 19 | U_201060 | 1 | 29.1.2024 | | 0.4503 | 0.6633 |
| 20 | langml | 2 | 30.1.2024 | langml | 0.4375 | 0.4490 |
| 21 | lena.held | 2 | 7.8.2023 | | 0.4269 | 0.7449 |

Custom Leaderboard (best submission)

| # | User | Entries | Date of Entry | Team Name | f1-score | accuracy |
|---|------|---------|---------------|-----------|----------|----------|
| 1 | zhaoxf4 | 3 | 31.1.2024 | HW-TSC | 0.8231 | 0.8673 |
| 2 | irene.benedetto | 7 | 26.1.2024 | MAINDZ | 0.7747 | 0.8265 |
| 3 | kbkrumov | 7 | 31.1.2024 | SU-FMI | 0.7728 | 0.8367 |
| 4 | qiaoxiaosong | 9 | 30.1.2024 | | 0.7644 | 0.8163 |
| 5 | arios0 | 3 | 12.1.2024 | UTSA-NLP | 0.7341 | 0.8061 |
| 6 | kubapok | 10 | 13.1.2024 | | 0.6971 | 0.7857 |
| 7 | samyak | 4 | 31.1.2024 | LegalSense | 0.6599 | 0.7449 |
| 8 | hrandria | 7 | 29.1.2024 | | 0.6327 | 0.6939 |
| 9 | PengShi | 13 | 30.1.2024 | | 0.6166 | 0.6837 |
| 10 | Yuan_Lu | 8 | 31.1.2024 | | 0.6000 | 0.6327 |
| 11 | Hwan_Chang | 3 | 29.1.2024 | | 0.6000 | 0.6735 |
| 12 | msiino | 1 | 31.1.2024 | Mistral | 0.5597 | 0.5714 |
| 13 | kriti7 | 15 | 31.1.2024 | | 0.5511 | 0.6020 |
| 14 | woody | 22 | 30.1.2024 | | 0.5510 | 0.6633 |
| 15 | Manvith_Prabhu | 3 | 25.1.2024 | SCaLAR Group, NITK Surathkal | 0.5238 | 0.6429 |
| 16 | odysseas_aueb | 11 | 1.2.2024 | | 0.5143 | 0.6122 |
| 17 | lhoorie | 1 | 30.1.2024 | | 0.4957 | 0.5000 |
| 18 | yms | 8 | 30.1.2024 | | 0.4827 | 0.7245 |
| 19 | U_201060 | 1 | 29.1.2024 | | 0.4503 | 0.6633 |
| 20 | langml | 2 | 30.1.2024 | langml | 0.4375 | 0.4490 |
| 21 | lena.held | 2 | 7.8.2023 | | 0.4269 | 0.7449 |
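
The two leaderboard metrics can diverge sharply, since a system that rarely predicts "correct" may still score a high accuracy. The sketch below shows how both numbers could be computed from 0/1 labels. It assumes the f1-score column is macro-averaged over the two classes, which this page does not state, and the label counts (25 correct and 73 incorrect candidates) are hypothetical values chosen for illustration, not figures taken from the dataset.

```python
from typing import Sequence


def accuracy(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of instances whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_for_class(gold: Sequence[int], pred: Sequence[int], cls: int) -> float:
    """F1 score treating `cls` as the positive class."""
    tp = sum(g == cls and p == cls for g, p in zip(gold, pred))
    fp = sum(g != cls and p == cls for g, p in zip(gold, pred))
    fn = sum(g == cls and p != cls for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores (assumed averaging mode)."""
    return (f1_for_class(gold, pred, 0) + f1_for_class(gold, pred, 1)) / 2


# Hypothetical gold labels: 25 correct (1) and 73 incorrect (0) candidates.
gold = [1] * 25 + [0] * 73
# A system that labels every candidate as incorrect.
pred = [0] * 98

print(round(accuracy(gold, pred), 4))  # 0.7449
print(round(macro_f1(gold, pred), 4))  # 0.4269
```

Under these assumptions, always answering "incorrect" reaches roughly 74% accuracy but a macro F1 below 0.43, which is the kind of gap between the two columns visible in some rows of the tables above.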