pgn_parser's Introduction

PGN Parser

A Python library for parsing PGN files into a Python-friendly format.

The parser is built using canopy; the rest is Python.

The PGN spec is based on (and thanks to) the spec at saremba.de.

Setup

Installing

Make sure you have Python 3 installed.

pip install pgn_parser

Then import like so:

from pgn_parser import pgn, parser

Testing

The tests are written using pytest and behave, which must be installed first:

pip install pytest behave

For running unit tests:

pytest

For running behavioural tests:

behave

Building pip distributables

make build

Using

Parsing a pgn file

To parse a PGN string, pass it to parser.parse along with an Actions() instance, which the parser uses to build Python structures.

>>> from pgn_parser import parser, pgn

>>> game = parser.parse("1. e4 e5", actions=pgn.Actions())
>>> print(game.move(1))
1. e4 e5
>>> print(game.move(1).black.san)
e5

Games

After parsing, a game is structured into the following classes, which are nested in each other:

Game: Container for the whole game.

To get a specific move (5 here) from a game

game.move(5)

To retrieve the Movetext

game.movetext

To access the TagPairs

game.tag_pairs

To access the final score

game.score

Movetext: The container of all the moves, e.g. "1. c4 c5 2. e4 e5". It is just a list, so it can be iterated over to retrieve the moves. Be warned: Movetext[0] is the first move parsed, whether that is 1. or 31., so use Game.move() if you want a specific move number.

Move: A move number with, optionally, a white Ply and/or a black Ply.

Ply: The unit of moving, in standard algebraic notation (SAN). For example, the black ply from "1.e4 e5" is e5.

TagPairs: An ordered dictionary of all TagPair objects. It preserves the order in which the tags were read, but is rearranged into Seven Tag Roster order when printed/stringified.
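The Movetext warning above can be illustrated with a small stand-in model. The Move tuple and find_move helper below are hypothetical and are not the library's own classes; they only show why list position and move number differ when a fragment starts mid-game.

```python
from typing import NamedTuple, Optional, Sequence


class Move(NamedTuple):
    """Hypothetical stand-in for a parsed move (not pgn_parser's Move class)."""
    number: int
    white: str
    black: str


def find_move(movetext: Sequence[Move], number: int) -> Optional[Move]:
    """Look a move up by its move number rather than by its list position."""
    for move in movetext:
        if move.number == number:
            return move
    return None


# A fragment that starts at move 31 rather than move 1.
fragment = [Move(31, "Kg2", "Rd8"), Move(32, "h4", "Kf7")]

first_parsed = fragment[0]           # list position 0 holds move number 31
by_number = find_move(fragment, 32)  # lookup by move number instead
```

With the real library, Game.move() plays the role of find_move here.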

TagPairs

Metadata about a game is stored in TagPairs.

A tag pair in the header of a PGN file

[Site "github.com"]

is represented like so in Python

game.tag_pairs["Site"] == "github.com"
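As a rough illustration of the Seven Tag Roster ordering mentioned for TagPairs, here is a minimal sketch over a plain dict. The roster_order and format_tags helpers are hypothetical and not part of pgn_parser, and placing the non-roster tags afterwards in read order is an assumption; the library may order them differently.

```python
# The seven mandatory tags of the PGN Seven Tag Roster, in export order.
SEVEN_TAG_ROSTER = ("Event", "Site", "Date", "Round", "White", "Black", "Result")


def roster_order(tags: dict) -> list:
    """Return (name, value) pairs: roster tags first, then the rest in read order."""
    roster = [(name, tags[name]) for name in SEVEN_TAG_ROSTER if name in tags]
    # Assumption: non-roster tags keep the order in which they were read.
    extras = [(name, value) for name, value in tags.items()
              if name not in SEVEN_TAG_ROSTER]
    return roster + extras


def format_tags(tags: dict) -> str:
    """Render the tags as PGN tag-pair lines in roster order."""
    return "\n".join(f'[{name} "{value}"]' for name, value in roster_order(tags))


tags = {"Site": "github.com", "ECO": "C20", "Event": "Example"}
print(format_tags(tags))
```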

Moves

Each move has a move number and two plies, white and black. Each ply can be anything from empty to having comments, variations and NAGs.

moves = "1. e4 $1 {a comment} (1.d5)"

Is represented like so:

m1 = game.move(1)

assert m1.white.san == "e4"
assert m1.white.comment == "a comment"
assert m1.white.nags[0] == "$1"
assert m1.white.variations[0].move(1).white.san == "d5"

If a ply is empty, its san is the empty string "".

Limitations

  • No support for RAV style variations.
  • No support for multiple games in one parse; the input must be a single game.
  • Doesn't attempt to parse turn times, as this is not in the original spec and I am not sure what to support.
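Because only one game can be parsed at a time, a file with several games has to be split first. The following is a minimal sketch, not part of pgn_parser: split_games is a hypothetical helper, and it assumes every game begins with an [Event ...] tag at the start of a line, as in PGN export format. Any text before the first such tag is dropped.

```python
import re


def split_games(pgn_text: str) -> list:
    """Split a multi-game PGN string into one string per game.

    Assumes each game starts with an [Event ...] tag at the start of a line;
    games without an Event tag are not detected as separate games.
    """
    starts = [m.start()
              for m in re.finditer(r"^\[Event\s", pgn_text, flags=re.MULTILINE)]
    if not starts:
        # No Event tag found: treat the input as a single game (or nothing).
        stripped = pgn_text.strip()
        return [stripped] if stripped else []
    bounds = starts + [len(pgn_text)]
    return [pgn_text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
```

Each resulting string can then be passed to parser.parse(game_text, actions=pgn.Actions()) on its own.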

Authors

  • Brett Bates - Initial work - github

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

pgn_parser's Issues

Allow for comments on moves as well as plies

Need to allow clk data to exist even if we don't attempt to parse it

raise ParseError(format_error(self._input, self._failure, self._expected))

pgn_parser.parser.ParseError: Line 21: expected "(", [\s], [0-9], "1-0", "0-1", "1/2-1/2", "*", [\s]

  1. d4 { [%clk 0:03:00] } Nc6 { [%clk 0:03:00] } { A40 Mikenas Defense } 2. Nf3 { [%clk 0:02:56] } e5 { [%clk 0:02:58] } 3. d5 { [%clk 0:02:54] } Nd4 { [%clk 0:02:55]
    } 4. e3 { [%clk 0:02:52] } Nxf3+ { [%clk 0:02:53] } 5. Qxf3 { [%clk 0:02:50] } Nf6 { [%clk 0:02:51] } 6. Nc3 { [%clk 0:02:47] } Bb4 { [%clk 0:02:46] } 7. e4 { [%clk 0:02:44] } d6 { [%clk 0:02:38] } 8. Bg5 { [%clk 0:02:43] } h6 { [%clk 0:02:33] } 9. Bxf6 { [%clk 0:02:38] } gxf6 { [%clk 0:02:33] } 10. Bb5+ { [%clk 0:02:35] } c6 { [%clk 0:02:31] } 11. dxc6 { [%clk 0:02:34] } Bxc3+ { [%clk 0:02:24] } 12. Qxc3 { [%clk 0:02:31] } bxc6 { [%clk 0:02:22] } 13. Bxc6+ { [%clk 0:02:25] } Bd7 { [%clk 0:02:20] } 14. Bxa8 { [%clk 0:02:17] } Qxa8 { [%clk 0:02:19] } 15. O-O { [%clk 0:02:13] } Qxe4 { [%clk 0:02:17] } 16. Rfe1 { [%clk 0:02:11] } Qg6 { [%clk 0:02:09] } 17. Rad1 { [%clk 0:02:06] } Ke7 { [%clk 0:02:03] } 18. Qb4 { [%clk 0:02:01] } Bh3 { [%clk 0:01:59] } 19. g3 { [%clk 0:01:59] } Qxc2 { [%clk 0:01:53] } 20. Qxd6+ { [%clk 0:01:54] } Ke8 { [%clk 0:01:51] } 21. Qd8# { [%clk 0:01:52] } { White wins by checkmate. } 1-0

Verify tag pairs

Not sure if this is something the parser should support, but could be good to validate the seven tag roster, for example FEN being correct.
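To make the idea concrete, here is a minimal sketch of a structural FEN check that could be applied to a FEN tag value. The is_plausible_fen helper is hypothetical and not part of pgn_parser. It checks only the shape of the string (six fields, eight ranks of eight squares, plausible side, castling and en passant fields), not whether the position is legal.

```python
def is_plausible_fen(fen: str) -> bool:
    """Structural check only: does not verify that the position is legal."""
    fields = fen.split()
    if len(fields) != 6:
        return False
    placement, side, castling, en_passant, halfmove, fullmove = fields

    ranks = placement.split("/")
    if len(ranks) != 8:
        return False
    for rank in ranks:
        squares = 0
        for ch in rank:
            if ch in "12345678":
                squares += int(ch)   # a digit is a run of empty squares
            elif ch in "pnbrqkPNBRQK":
                squares += 1         # a letter is one piece
            else:
                return False
        if squares != 8:
            return False

    if side not in ("w", "b"):
        return False
    if castling != "-" and not set(castling) <= set("KQkq"):
        return False
    if en_passant != "-" and not (
        len(en_passant) == 2
        and en_passant[0] in "abcdefgh"
        and en_passant[1] in "36"
    ):
        return False
    return halfmove.isdigit() and fullmove.isdigit()
```

Note that FEN is an optional tag rather than one of the seven roster tags, so a roster check and a FEN check would be separate steps.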

black.san missing when exporting from chess.com

To reproduce:

See colab notebook

or

!pip install getjson
!pip install chess.com
!pip install pgn-parser
from chessdotcom import get_player_games_by_month
from pgn_parser import parser, pgn
a_player = 'PinIsMightier'
# Fetch the player's games for January 2021 and parse the first one.
games = get_player_games_by_month(username=a_player, year=2021, month=1).json['games']
game = parser.parse(games[0]['pgn'], actions=pgn.Actions())
# Collect the black SAN of the first five moves; these are the values reported missing.
missing = [game.move(j + 1).black.san for j in range(5)]

Support Turn Time

  1. a4 {12 that took 12 secs}

The PGN spec has no official notion of clock time, so different sites have implemented it in different ways.

E.g.

{12 This took 12 secs}
{preamble [%clk 0:0:12] took 12 secs}

For now, timing is left to the user to implement, as it is not part of the PGN spec. Support may be added later if the need arises.
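Since timing is left to the user, one option is to pull the clock value out of a ply's comment string after parsing. The following is a minimal sketch that handles only the [%clk h:mm:ss] form shown above. CLK_PATTERN and clock_seconds are hypothetical names and not part of pgn_parser, and whether the value means time remaining or time spent depends on the site that produced the file.

```python
import re
from typing import Optional

# Matches e.g. "[%clk 0:02:56]" or "[%clk 0:0:12]", with optional fractional seconds.
CLK_PATTERN = re.compile(r"\[%clk\s+(\d+):(\d+):(\d+(?:\.\d+)?)\]")


def clock_seconds(comment: str) -> Optional[float]:
    """Return the first [%clk h:mm:ss] value in a comment as seconds, or None."""
    match = CLK_PATTERN.search(comment)
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```

The comment string would typically come from an attribute such as game.move(1).white.comment.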

Black ply not read when the move comes after a comment

I'm trying to parse a 36-move game from the Gibraltar Battle of the Sexes 2022 (Round 1 - Game 1). I downloaded the PGN file from lichess. I'm using python 3.10 on Ubuntu 22.04 (fresh install).

Here's a small extract from the game I'm trying to parse

1. e4 { [%eval 0.25] [%clk 1:30:56] } 1... e5 { [%eval 0.12] [%clk 1:30:37] } 2. Nf3 { [%eval 0.28] [%clk 1:31:20] } 2... Nc6 { [%eval 0.25] [%clk 1:30:48] } 1/2-1/2

I'm reading the game as explained in the "tutorial"

game = parser.parse(game_str, actions = pgn.Actions())

and I'm trying to query information about the first move as explained

m1 = game.move(1)

As expected, the object m1 has the field white, that is, I can do

print(m1)
print(m1.white.san)
print(m1.white.comment)
print(m1.white.nags)

and I get the right information. But it does not seem to have the field black, that is, if I do

print(m1.black)
print(m1.black.san)
print(m1.black.comment)
print(m1.black.nags)

I only get blank lines. If I continue with move 2, the information I get for white is correct, and I still get blank lines for black.

Now, I think this is a bug because, if I remove the string "1..." and leave all the comments

1. e4 { [%eval 0.25] [%clk 1:30:56] } e5 { [%eval 0.12] [%clk 1:30:37] } 2. Nf3 { [%eval 0.28] [%clk 1:31:20] } 2... Nc6 { [%eval 0.25] [%clk 1:30:48] }  1/2-1/2

the output is correct for move 1 (for both black and white) but not for the second move. If I remove both strings "1..." and "2..."

1. e4 e5 { [%eval 0.12] [%clk 1:30:37] } 2. Nf3 Nc6 { [%eval 0.25] [%clk 1:30:48] } 1/2-1/2

then the output is correct for both moves (and both white and black).

Problem

I think the strings "1...", "2...", ... are causing severe interferences here.

Expected behavior

The strings "1...", "2..." are part of pgn files (I think) and should not cause the game to be parsed incorrectly. Parsing the game

1. e4 1... e5 2. Nf3 2... Nc6 1/2-1/2

should produce the same result as parsing

1. e4 e5 2. Nf3 Nc6 1/2-1/2
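Until the parser handles these tokens, a possible workaround based on the observations in this report is to strip the "1...", "2..." markers before parsing. This is a minimal sketch, not part of pgn_parser: strip_black_move_numbers is a hypothetical helper, it leaves text inside {brace comments} untouched, and it does not handle semicolon comments or variations that begin with a black move.

```python
import re

COMMENT = re.compile(r"(\{[^}]*\})")
BLACK_MOVE_NUMBER = re.compile(r"\b\d+\.\.\.\s*")


def strip_black_move_numbers(movetext: str) -> str:
    """Remove "1...", "2..." tokens that appear outside {comments}."""
    parts = COMMENT.split(movetext)
    # With one capturing group, even indices are movetext and odd indices
    # are the comments themselves, which are left as they are.
    for i in range(0, len(parts), 2):
        parts[i] = BLACK_MOVE_NUMBER.sub("", parts[i])
    return "".join(parts)
```

The cleaned string can then be passed to parser.parse as usual.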

Export board state as FEN

Please add a feature to export the board state as FEN. This would simplify using the project with systems that render the board from a FEN string.
