
xd's Issues

Refetch recent missing crosswords

Due to issue #37, many crosswords are missing from 2016. A simple script should generate a list of 'missing' xdids, and an option should be added to 11-download-puzzles to take this list of xdids and redownload them (storing in .zip on s3 and adding to receipts and inserting into the pipeline as usual).
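A minimal sketch of the 'missing' list, assuming an xdid has the form `<pubid><YYYY-MM-DD>` (for example `nyt2016-01-01`) and that the publication runs one puzzle per day. The function name and its arguments are hypothetical, not part of the existing scripts.

```python
from datetime import date, timedelta


def missing_xdids(pubid, start, end, known_xdids):
    """List the xdids expected from start to end (inclusive) that are not known.

    Assumption: one puzzle per day, xdid = pubid + ISO date.
    """
    known = set(known_xdids)
    missing = []
    day = start
    while day <= end:
        xdid = f"{pubid}{day.isoformat()}"
        if xdid not in known:
            missing.append(xdid)
        day += timedelta(days=1)
    return missing
```

The resulting list could be written one xdid per line and handed to the download step.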

Generate deepclues pages for recent puzzles

  • Add 36-mkwww-deepclues to 30-mkwww for puzzles received this run (or, most recent 10 puzzles)
  • Generate /deep/ index with list of these puzzles, including stats like number of original clues

100% grid match should be white on pubyear map

To fully match the colors on the individual pubyear pages, those with 100% grid match with same author should be white on the pubyear map also (instead of yellow like other 50%+ with same author).

The color choosing code should be isolated into one place for easy modification.
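One way to isolate the choice is a single function that every page calls. The sketch below encodes only the two rules stated in this issue; the function name is hypothetical, and any other case returns `None` so the caller can fall back to whatever rules already exist.

```python
def pubyear_color(match_pct, same_author):
    """Pick a cell color from the grid match percentage and the author match.

    Only the rules from this issue are encoded; other cases return None.
    """
    if same_author:
        if match_pct >= 100:
            return "white"
        if match_pct >= 50:
            return "yellow"
    return None
```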

grid search

Is there a way to see instances of a specific grid, irrespective of fill? That is, I provide a blank grid and see how many puzzles in the database have that precise block pattern.
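A rough sketch of how such a lookup could work, assuming each grid is available as a list of row strings with '#' marking blocks. The function names and the `puzzles` mapping are hypothetical.

```python
def block_pattern(rows):
    """Reduce a grid to its block pattern: '#' for blocks, '.' for any other cell."""
    return tuple("".join("#" if c == "#" else "." for c in row) for row in rows)


def find_same_pattern(blank_rows, puzzles):
    """Return the ids of puzzles whose block pattern equals that of blank_rows.

    puzzles: mapping of puzzle id -> list of row strings.
    """
    wanted = block_pattern(blank_rows)
    return [pid for pid, rows in puzzles.items() if block_pattern(rows) == wanted]
```

Because the pattern is a hashable tuple, it could also serve as a dictionary key to index the whole database once instead of scanning it per query.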

Improve author diff function

  • Move diff_author into common location, use for all puzzle coloring
  • Use cleaned metadata (in puzzles.tsv) for author diff, instead of raw puzzle data
  • Make configurable threshold for author match instead of 100%
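A sketch of what a threshold-based author comparison might look like, using `difflib.SequenceMatcher` from the standard library. This is not the existing diff_author; the names, the normalization, and the default threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def author_match_pct(a, b):
    """Return a 0-100 similarity score between two author strings."""
    # Normalize case and whitespace before comparing.
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    if not a_norm or not b_norm:
        return 0.0  # a missing author never counts as a match
    return 100.0 * SequenceMatcher(None, a_norm, b_norm).ratio()


def same_author(a, b, threshold=100.0):
    """True if the author strings match at or above the threshold (in percent)."""
    return author_match_pct(a, b) >= threshold
```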

Create AWS cost page for site

Generate AWS cost page for current month/year (or last 30/365 days):

  • S3 storage/bandwidth used
  • all AWS costs incurred
  • runtime required for this run

[log cleanup] Log enumeration script generates error

Log enumeration script generates error:

    raise ClientError(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (PermanentRedirect) when calling the ListObjects operation: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.

Though /logs still seems to be working.

i18n: Please accept '■' (U+25A0) as a block symbol

Hello, I'm a Japanese programmer, Katayama Hirofumi MZ.
East Asian languages use double-width characters (kanji, hiragana, katakana, etc.).
For example: あいうえおアイウエオ漢字亜井宇.

'#' (U+0023) is a single-width character, so the file contents look ugly when it is mixed with double-width characters.
I propose that your program accept '■' (U+25A0) as a block symbol for internationalization.
U+25A0 is displayed as a double-width character in East Asian environments. Thank you!
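A minimal sketch of how this could be accepted on input: map '■' to '#' when a grid row is read, so the rest of the code keeps working with a single block symbol. The names below are hypothetical and do not refer to existing parser code.

```python
# Characters treated as a block on input: '#' (U+0023) and '■' (U+25A0).
BLOCK_CHARS = {"#", "\u25a0"}


def normalize_blocks(row):
    """Replace every accepted block character in a grid row with '#'."""
    return "".join("#" if c in BLOCK_CHARS else c for c in row)
```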

restructure similar.tsv to have one row per match

similar.tsv should be changed to have the following columns: xdid match_xdid match_pct

match_xdid should always be the earlier puzzle.

The metadatabase.py:xd_similar() function does the split already.

There should not be any duplicate rows in the final similar.tsv.

An xdid that has been checked should create a single row with match_pct of 0.

xd_similar and xd_similar_all and users of similar.tsv (25-analyze, 35-mkwww-diffs, xdfile/pubyear.py) will need to be changed to keep existing functionality.
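A sketch of the row layout described above, under stated assumptions: the input is a mapping from each checked xdid to its list of matches, publication dates are available in a separate mapping, and a checked xdid with no matches gets one row with an empty match_xdid and a match_pct of 0. The function name and both input shapes are hypothetical.

```python
def similar_rows(checked, pubdate):
    """Build unique (xdid, match_xdid, match_pct) rows.

    checked: mapping xdid -> list of (other_xdid, match_pct)
    pubdate: mapping xdid -> sortable publication date
    match_xdid is always the earlier puzzle of the pair.
    """
    rows = set()  # a set drops duplicate rows, e.g. A~B and B~A
    for xdid, matches in checked.items():
        if not matches:
            rows.add((xdid, "", 0))
            continue
        for other, pct in matches:
            earlier, later = sorted((xdid, other), key=lambda x: pubdate[x])
            rows.add((later, earlier, pct))
    return sorted(rows)
```

Each row can then be written as one tab-separated line of similar.tsv.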

test_xdfile references functions that don't exist anymore

I couldn't pinpoint the exact commit, but I assume these got refactored away at some point and the tests in xdfile/tests/test_xdfile.py weren't updated.

=================================== FAILURES ===================================
________________________________ test_filename _________________________________

    def test_filename():
>       fname = xdfile.get_base_filename(test_selection)
E       AttributeError: module 'xdfile' has no attribute 'get_base_filename'

xdfile/tests/test_xdfile.py:11: AttributeError
_______________________________ test_parse_date ________________________________

    def test_parse_date():
    
>       date = xdfile.parse_date_from_filename(test_selection)
E       AttributeError: module 'xdfile' has no attribute 'parse_date_from_filename'

xdfile/tests/test_xdfile.py:18: AttributeError

Similarity on diagonally flipped crosswords?

Have you considered also calculating the similarity with one of the crosswords flipped across the diagonal (that is, transposed)? Looking at the similarity measure code I did not see that option, but maybe I missed it.

For example, this:

TAEBO#MANN#JAPE
ADIEU#AREA#ETAS
JONAS#LEAS#TENS
###STEMSTHETIDE
SWAP#ROOF#XANAX
PILOTS##RAHS###
INTRO#GEENA#AVE
ROOTSFORANUPSET
ESS#CLINK#SUTRA
###CAAN##ATTEST
STOOL#GUAM#ORES
LEAVESASTAIN###
ARTE#OWES#MINOR
BRER#LORE#ACUTE
SANS#ELSA#METOO

would be identical to that:

TAJ#SPIRE#SLABS
ADO#WINOS#TERRA
EIN#ALTOS#OATEN
BEASPORT#COVERS
OUST#TOSCALE###
###ERS#FLA#SOLE
MALMO#GOINGAWOL
ARESO#ERN#USERS
NEATFREAK#ATSEA
NASH#ANN#AMA###
###EXHAUST#IMAM
JETTAS#PUTONICE
ATEIN#ASTER#NUT
PANDA#VERSE#OTO
ESSEX#ETATS#REO
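A small sketch of the idea, assuming rectangular grids given as lists of row strings. The cell-by-cell percentage here is only a stand-in, not the project's actual similarity measure, and all function names are hypothetical.

```python
def transpose(rows):
    """Flip a grid across its main diagonal: columns become rows."""
    return ["".join(col) for col in zip(*rows)]


def cell_match_pct(a_rows, b_rows):
    """Stand-in similarity: percentage of identical cells (0 if the shapes differ)."""
    if len(a_rows) != len(b_rows) or any(len(a) != len(b) for a, b in zip(a_rows, b_rows)):
        return 0.0
    total = sum(len(r) for r in a_rows)
    same = sum(ca == cb for a, b in zip(a_rows, b_rows) for ca, cb in zip(a, b))
    return 100.0 * same / total if total else 0.0


def best_match_pct(a_rows, b_rows):
    """Compare b against a as given and against a transposed; keep the higher score."""
    return max(cell_match_pct(a_rows, b_rows), cell_match_pct(transpose(a_rows), b_rows))
```

With this, two grids that are exact diagonal flips of each other score 100 even though a direct comparison scores lower.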

wapost: only on sundays

The other 6 days will fail, so don't bother with those requests.

Also theglobeandmail_canadian.
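A sketch of a per-source schedule check, assuming both sources named above publish on Sundays only and every other source is daily. The set and function names are hypothetical.

```python
from datetime import date

# Assumption: these sources only have a puzzle on Sundays.
SUNDAY_ONLY = {"wapost", "theglobeandmail_canadian"}


def should_request(source, day):
    """Return True if a download request for this source is worth making on this day."""
    if source in SUNDAY_ONLY:
        return day.weekday() == 6  # date.weekday(): Monday is 0, Sunday is 6
    return True
```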

Bug: only fetches valid crossword every other day

This has to do with the server using GMT while the crossword sites are on US time; the puzzles aren't ready when the request is made. An unfulfilled request should be okay and simply be retried the next day. However, the system currently records the request rather than the success. This should be fixed.
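A sketch of the intended behavior: a receipt is recorded only when a fetch returns data, so anything that failed stays pending for the next run. The `fetch` callable, the `receipts` mapping, and the function name are hypothetical stand-ins for the real download and receipt code.

```python
def fetch_pending(pending, fetch, receipts):
    """Try each pending xdid; record only successes and return what is still pending.

    fetch(xdid) is assumed to return the puzzle data, or None if it is not ready yet.
    """
    still_pending = []
    for xdid in pending:
        data = fetch(xdid)
        if data is None:
            still_pending.append(xdid)  # no receipt, so it is retried next run
        else:
            receipts[xdid] = data
    return still_pending
```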
