Giter VIP home page Giter VIP logo

demystify's Introduction

Demystify
A Magic: The Gathering Parser

1) ABOUT

Demystify is an attempt to make it possible for a computer to understand what
general Magic: The Gathering cards do. It is currently written in a combination
of ANTLR 3.5 and Python 3.2.

2) INSTALLATION

Demystify depends on the following libraries and/or programs, which you will
need to have installed in order to build the parser and run Demystify.
See INSTALL for more specific instructions.

- Java 1.6.0 (or later)
- ANTLR 3.5 (or later 3.5 release)
- ANTLR 3 Python3 runtime [available at github.com/antlr/antlr3]
- jpackage-utils 1.7.5 (or later)
- Python 3.2
- python-progressbar2

3) BUILDING

To build the parser generator, you need to have the macro.g and Words.g grammar
files up-to-date. If they don't exist, or demystify/keywords.py has changed,
run
    $ python3 demystify/keywords.py
to regenerate them.

Then, you need to run antlr3 on the grammar. If you've followed the instructions
in INSTALL and gotten yourself an antlr3 script, all you need to do is:
    $ cd demystify/grammar/
    $ antlr3 Demystify.g
If not, your commandline will look something like this (supposing your classpath
is properly set):
    $ cd demystify/grammar/
    $ java org.antlr.Tool Demystify.g
(Note that cd instruction; if you "antlr3 demystify/grammar/Demystify.g"
 instead, you'll get the .py output files where they belong, but the .tokens
 files are put in your working directory.)

This takes a little while.

If it worked (and it worked if all the output you received were of the form
"warning(138): ... no start rule ...", and not "error(12345)" etc.),
demystify/grammar/ should now contain a series of Demystify*.py
files. These will be used by the main demystify.py script.

If it doesn't work, and you're sure you installed antlr3 and java correctly,
check that you followed instructions in INSTALL again. If antlr3 is actually
giving grammar errors, please check the Issues tab in github for the issue
or file a new bug.

4) RUNNING

The main entry point is the demystify.py script in the demystify/ folder.
It currently offers two modes of running:

load

Loads the card data from the Scryfall data file in demystify/data/cache/
(may download it from Scryfall if necessary), and performs preprocessing steps
necessary before any lexing and parsing is done. With -i, opens an interactive
prompt afterward, at which functions in demystify.py and card.py can be called.
This is useful for actually invoking the parser, as well as doing special card
searches using the utility functions in card.py.

test

Runs the tests (or a specific test) in demystify/tests/.

These can be run from within demystify with:
    $ python3 demystify.py load -i

Add -h or --help for more information:
    $ python3 demystify.py -h
    $ python3 demystify.py test -h

5) LEXING AND PARSING

The intention of this project is to eventually generate full parse trees for
every line of text on every card. However, as this is a low priority personal
project, Demystify is far from this goal. At the moment, all that you can do is
test the lexer or parser.

First, load the environment.
    $ python3 demystify.py load -i
    Processing cards for card names...
    Innocent Blood   [##########################] 14715 of 14715 Time: 0:00:01
    >>> cards = card.get_cards()

get_cards() returns all the cards loaded. If you like, you can select a smaller
set of cards via card.get_card_set(), perform a search (note that searches
return matching text rather than matching cards), or get an individual card.
    >>> karn = card.get_card('Karn, Silver Golem')

The lexer is essentially done (modulo any new words or symbols in sets not yet
added to the project), but it can still be tested, with test_lex or lex_card.
lex_card will print the preprocessed text followed by a table of lex symbols.
(You can also get at the preprocessed text by simply printing the card object.)
    >>> lex_card(karn)
    whenever SELF blocks or becomes blocked, it gets -4/+4 until end of turn.
    {1}: target non-creature artifact becomes an artifact creature with...
     1    0   0 whenever  WHEN
     1    9   2 SELF      SELF
     1   14   4 blocks    BLOCK
     1   21   6 or        OR
     1   24   8 becomes   BECOME
     1   32  10 blocked   BLOCKED
     1   39  11 ,         COMMA
    ...
     1   72  30 .         PERIOD
     2    0  32 1         MANA_SYM
     2    3  33 :         COLON
    ...

Note that test_lex is very slow, but very effective at finding new vocabulary.

The parse_all function is meant to eventually parse rules text, but at the
moment it parses only mana cost and type lines. It takes a list of cards
to parse, and adds the parse tree to each card individually.
    >>> parse_all(cards)
    Innocent Blood   [##########################] 14715 of 14715 Time: 0:00:25
    0 total errors.
    >>> karn.parsed_cost
    <antlr3.tree.CommonTree instance (COST (MANA 5))>
    >>> karn.parsed_typeline
    <antlr3.tree.CommonTree instance (TYPELINE (SUPERTYPES legendary)
     (TYPES artifact creature) (SUBTYPES golem))>

The other parse functions use a regex to narrow down the text they attempt
to parse. You may still pass all the cards.
    >>> parse_triggers(cards)
    Whippoorwill     [############################] 3702 of 3702 Time: 0:00:02
    181 total errors.
    68 unique cases missing.
    >>> parse_keyword_lines(cards)
    Wasp Lancer      [############################] 5085 of 5085 Time: 0:00:02
    1 total errors.
    1 unique cases missing.
    >>> parse_ability_costs(cards)
    Helm of Obedienc [############################] 4369 of 4369 Time: 0:00:02
    2 total errors.
    2 unique cases missing.

Each parse function reports how many times it couldn't parse a line of text,
and (attempts to) group them together into cases. These are all logged to the
LOG file, which is useful for development.
The parse results themselves are again attached to the cards, but as lists
of strings rather than antlr tree objects.
    >>> karn.parsed_costs
    ['(COST (MANA 1))']
    >>> karn.parsed_triggers
    ['(TRIGGER (EVENT (SUBSET SELF) (OR (BECOME BLOCKING) (BECOME blocked))))']
    >>> akroma = card.get_card('Akroma, Angel of Wrath')
    >>> akroma.parsed_keywords
    ['(KEYWORDS flying FIRST_STRIKE vigilance trample haste
     (protection (PROPERTIES black) (PROPERTIES red)))']

You can test individual rules with arbitrary text by calling test_parse, or by
creating a parsing unittest in test/.

6) MISCELLANEOUS

Dependency visualization:

The deps/deps.py script reads the grammar files and outputs to stdout a
dot format graph file, which describes how parser rules in the grammar call
other parser rules, as well as how the grammar files reference other grammar
files. You'll want to use graphviz's dot program to visualize this, e.g. with:
    $ python3 deps/deps.py | dot -Tpng -o gdeps.png

graphviz can be installed via your package management tool, or downloaded from
http://www.graphviz.org.

7) CONTRIBUTIONS

Are welcome! Please use github forks and pull requests. Bugs without fixes can
be reported as well.

http://github.com/Zannick/demystify/

8) LICENSE AND DISCLAIMER

The files in this repository fall into four categories:
    A) antlr3/antlr3
    B) Python code in demystify/ and ANTLRv3 code in demystify/grammar/
    C) deps/deps.py
    D) Card data in demystify/data/ and demystify/tests/

(A) is based on the antlr3 script that was installed on one of my machines
when I installed ANTLRv3 via yum. It is therefore (to the best of my
knowledge) covered until ANTLRv3's license (BSD),
which is included as antlr3/LICENSE.

(B) is the Demystify program itself, and the core of this repository. It
is licensed under the Lesser GNU Public License version 3. See COPYING
for the GPL v3 and COPYING.LESSER for the LGPL v3.

(C) is also licensed under the LGPL v3 but it is not a part of Demystify
(though it runs by reading parts of Demystify) which is why I've set it apart.

(D) consists of Magic: The Gathering card data, in the form of full card
records, slightly modified to fix errors (data/), or in the form of test cases,
which may include actual card rules or possible card rules. All card
information is copyrighted by Wizards of the Coast.

Demystify is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of 
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

Demystify is not produced, endorsed, supported,
or affiliated with Wizards of the Coast.

9) THANKS

Thanks go to Terence Parr and the ANTLR project (without which this project
would be much more difficult), and to Wizards of the Coast (for making
Magic: The Gathering, without which this project would not exist).

demystify's People

Contributors

flyingmutant avatar zannick avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

demystify's Issues

Scryfall API breaking changes

Hello,

I'm trying to study this project. Currently, I've cloned the repo, installed ANTLR 3, and tried to run it. Unfortunately, it looks like the Scryfall API has changed over the years so some of your codes don't run anymore. Can you help me look into this issue?

Regards,

Switch to using Scryfall API for db updates

Yawgatog no longer provides a text Oracle collection. Viable options seem to be MTGJSON4 and Scryfall, which the former partially depends upon. Scryfall provides a link to a bulk json file in its API docs while MTGJSONv4 provides its own bulk json files on its main page.

Either way, it's also worth considering storing the database in json format, and locally cached (instead of checked-in). This will require rewriting the Card constructor and ditching much of the data.py update logic.

Preview output of a parsed card?

Not really an issue per se; I was looking through the docs and was hoping to find an example of a parsed card, especially the rules text. I'm looking into building a parser for rules text and found your project, I just can't tell if it'll produce the sort of output I'm looking for. For example, what does the parsed version of Liliana of the Veil look like?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.