legal-segmenter

About

What is it? legal-segmenter is a simple library for segmenting legal texts into sentences, grouped by paragraph.

How does it work? A bunch of heuristic rules and a manually assembled list of common Bluebook abbreviations. (Yes, it's that dumb.)
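To give a feel for the approach, here is a minimal sketch of abbreviation-aware sentence splitting. It is not the library's actual code: the `ABBREVIATIONS` set and the `naive_segment` function are hypothetical names made up for this illustration, and the real rules and abbreviation list are more extensive.

```python
import re

# Hypothetical, tiny abbreviation list for illustration only.
ABBREVIATIONS = {"U.", "S.", "F.", "Inc.", "id.", "e.g.", "v.", "J."}

# A candidate boundary: terminal punctuation, optional closing quotes or
# parentheses, then whitespace.
BOUNDARY = re.compile(r'[.?!]["”’)]*\s+')


def naive_segment(text):
    """Split text at candidate boundaries, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        # Look at the last whitespace-delimited token up to the punctuation mark.
        tokens = text[start:match.start() + 1].split()
        last_token = tokens[-1].lstrip('("“‘[') if tokens else ""
        if last_token in ABBREVIATIONS:
            continue  # e.g. the period in "v." does not end a sentence
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


example = "See Smith v. Jones, 437 U. S. 463 (1978). That cannot be helped."
for sentence in naive_segment(example):
    print(sentence)
```

A rule this simple has obvious blind spots: a sentence that genuinely ends in a listed abbreviation will not be split. Handling cases like that is where additional heuristics come in.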

Can I add to it? Absolutely! If you notice a place where the code can be improved, perhaps by addressing an edge case or adding another common abbreviation, please create a PR in the repository.

Installation

> pip3 install legal-segmenter

Use

from legal_segmenter.segmenter import Segmenter

# Input data
text = """Rule 23 does not set forth a mere pleading standard. A party seeking class certification must affirmatively demonstrate his compliance with the Rule—that is, he must be prepared to prove that there are in fact sufficiently numerous parties, common questions of law or fact, etc. We recognized in Falcon that “sometimes it may be necessary for the court to probe behind the pleadings before coming to rest on the certification question,” 457 U. S., at 160, and that certification is proper only if “the trial court is satisfied, after a rigorous analysis, that the prerequisites of Rule 23(a) have been satisfied,” id., at 161; see id., at 160 (“[A]ctual, not presumed, conformance with Rule 23(a) remains … indispensable”). Frequently that “rigorous analysis” will entail some overlap with the merits of the plaintiff ’s underlying claim. That cannot be helped. “ ‘[T]he class determination generally involves considerations that are enmeshed in the factual and legal issues comprising the plaintiff ’s cause of action.’ ” Falcon, supra, at 160 (quoting Coopers & Lybrand v. Livesay, 437 U. S. 463, 469 (1978); some internal quotation marks omitted). Nor is there anything unusual about that consequence: The necessity of touching aspects of the merits in order to resolve preliminary matters, e.g., jurisdiction and venue, is a familiar feature of litigation. See Szabo v. Bridgeport Machines, Inc., 249 F. 3d 672, 676–677 (CA7 2001) (Easterbrook, J.)."""

# Print out each sentence extracted
seg = Segmenter()
paragraphs = seg.segment(text)
for paragraph in paragraphs:
    for sentence in paragraph:
        print(sentence.strip())
        print()

The output of the above code is:

Rule 23 does not set forth a mere pleading standard.

A party seeking class certification must affirmatively demonstrate his compliance with the Rule—that is, he must be prepared to prove that there are in fact sufficiently numerous parties, common questions of law or fact, etc.

We recognized in Falcon that “sometimes it may be necessary for the court to probe behind the pleadings before coming to rest on the certification question,” 457 U. S., at 160, and that certification is proper only if “the trial court is satisfied, after a rigorous analysis, that the prerequisites of Rule 23(a) have been satisfied,” id., at 161; see id., at 160 (“[A]ctual, not presumed, conformance with Rule 23(a) remains … indispensable”).

Frequently that “rigorous analysis” will entail some overlap with the merits of the plaintiff ’s underlying claim.

That cannot be helped.

“ ‘[T]he class determination generally involves considerations that are enmeshed in the factual and legal issues comprising the plaintiff ’s cause of action.’ ” Falcon, supra, at 160 (quoting Coopers & Lybrand v. Livesay, 437 U. S. 463, 469 (1978); some internal quotation marks omitted).

Nor is there anything unusual about that consequence: The necessity of touching aspects of the merits in order to resolve preliminary matters, e.g., jurisdiction and venue, is a familiar feature of litigation.

See Szabo v. Bridgeport Machines, Inc., 249 F. 3d 672, 676–677 (CA7 2001) (Easterbrook, J.).

which is not terrible!
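If you would rather work with a flat list than nested loops, a small helper along these lines may be handy. It is only a sketch: `flatten_with_index` is a hypothetical name, not part of legal-segmenter, and it assumes the result of `segment` is a list of paragraphs, each a list of sentence strings, as the loop above suggests.

```python
def flatten_with_index(paragraphs):
    """Return (paragraph_index, sentence_index, sentence) tuples.

    Sentences are stripped of surrounding whitespace, and sentences that
    are empty after stripping are skipped.
    """
    rows = []
    for p_idx, paragraph in enumerate(paragraphs):
        for s_idx, sentence in enumerate(paragraph):
            cleaned = sentence.strip()
            if cleaned:
                rows.append((p_idx, s_idx, cleaned))
    return rows


# Hand-built input in the assumed shape, so this runs without the library.
sample = [["Rule 23 does not set forth a mere pleading standard. ", " That cannot be helped."]]
for p_idx, s_idx, sentence in flatten_with_index(sample):
    print(p_idx, s_idx, sentence)
```

With the real library, you would pass `seg.segment(text)` in place of `sample`.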

Contact

If you have a question or any other issue, please reach out to Neel Guha ([email protected]).

Contributors

neelguha, probablyfaiz
