
t2wml's Issues

Import Error ItemTable

I am getting an ImportError while running with the latest staging branch. @devowit

$ sh run_t2wml_food_prices.sh
Traceback (most recent call last):
  File "../generate.py", line 20, in <module>
    from driver import run_t2wml
  File "/home/kyao/dev/t2wml/driver.py", line 2, in <module>
    from backend_code.item_table import ItemTable
  File "/home/kyao/dev/t2wml/backend_code/item_table.py", line 3, in <module>
    from backend_code.utility_functions import query_wikidata_for_label_and_description
  File "/home/kyao/dev/t2wml/backend_code/utility_functions.py", line 12, in <module>
    from backend_code.wikidata_property import get_property_type as gp
  File "/home/kyao/dev/t2wml/backend_code/wikidata_property.py", line 1, in <module>
    from app_config import db
  File "/home/kyao/dev/t2wml/app_config.py", line 41, in <module>
    from backend_code.models import *
  File "/home/kyao/dev/t2wml/backend_code/models.py", line 7, in <module>
    from backend_code.item_table import ItemTable
ImportError: cannot import name 'ItemTable'
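
The traceback shows an import cycle: backend_code.item_table imports utility_functions, which leads through wikidata_property and app_config to backend_code.models, which imports backend_code.item_table again before ItemTable has been defined. One common way to break such a cycle is to defer the import into the function that needs it. The sketch below reproduces the pattern with two throwaway modules (demo_item_table and demo_models are hypothetical names written to a temporary directory, not t2wml code) and is not necessarily how the project resolved the issue.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for item_table.py: imports from the other module at load time.
ITEM_TABLE = """\
from demo_models import table_name

class ItemTable:
    pass
"""

# Eager variant: importing ItemTable at module level closes the cycle.
EAGER_MODELS = """\
from demo_item_table import ItemTable

def table_name():
    return ItemTable.__name__
"""

# Lazy variant: the import runs only when the function is called.
LAZY_MODELS = """\
def table_name():
    from demo_item_table import ItemTable
    return ItemTable.__name__
"""


def try_import(models_source):
    """Import demo_item_table against the given demo_models source."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_item_table.py").write_text(ITEM_TABLE)
        Path(tmp, "demo_models.py").write_text(models_source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_item_table")
            return module.table_name()
        except ImportError:
            return "ImportError"
        finally:
            # Clean up so the next call starts from a fresh import state.
            sys.path.remove(tmp)
            sys.modules.pop("demo_item_table", None)
            sys.modules.pop("demo_models", None)


print(try_import(EAGER_MODELS))  # module-level import inside the cycle
print(try_import(LAZY_MODELS))   # import deferred until first call
```

The eager variant fails the same way as the traceback, while the lazy variant imports cleanly because ItemTable is looked up only after both modules have finished loading.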

Error when switching projects

After I'm done with a project (a pair of an Excel file and a Wikifier file), I want to open another project. The /upload_excel request fails if I open the new Excel file first, since the previous Wikifier file may not be applicable to the new Excel file.

It also happens when I upload an inapplicable Wikifier file first and then the Excel file.

qualifiers for factbook not part of output

YAML:

# irrigated_land
statementMapping:
  region:
    - left: CB
      right: CB
      top: 9
      bottom: 26
  template:
    item: item[B, $row]
    property: P1082
    value: value[$col, $row] 
    unit: Q712226 # sq km 
    qualifier:
      - property: P585
        value: value[CD, 9]
        calendar: Q1985727
        precision: year
        time_zone: 0
        format: "%Y"
    #reference:
    #  - property: P246 # stated in
    #    value: Q11191 # The World Factbook

Exception in download

on file: SL.EMP.TOTL.SP.NE.ZS.xls

[2019-07-08 19:40:38,370] ERROR in app: Exception on /download [POST]
Traceback (most recent call last):
  File "/Users/pedroszekely/Documents/GitHub/t2wml/Code/handler.py", line 175, in generate_download_file
    stat = yaml_parser.get_template()
  File "/Users/pedroszekely/Documents/GitHub/t2wml/Code/YamlParser.py", line 68, in get_template
    self.resolve_template(template)
  File "/Users/pedroszekely/Documents/GitHub/t2wml/Code/YamlParser.py", line 59, in resolve_template
    result = parse_evaluate_and_get_cell(qualifier_value)
  File "/Users/pedroszekely/Documents/GitHub/t2wml/Code/t2wml_parser.py", line 144, in parse_evaluate_and_get_cell
    root = generate_tree(text_to_parse)
  File "/Users/pedroszekely/Documents/GitHub/t2wml/Code/t2wml_parser.py", line 22, in generate_tree
    parse_tree = parser.parse(program)
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/lark/lark.py", line 292, in parse
    return self.parser.parse(text)
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/lark/parser_frontends.py", line 170, in parse
    return self.parser.parse(text)
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/lark/parsers/earley.py", line 307, in parse
    return self.forest_tree_visitor.visit(solutions[0])
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/lark/parsers/earley_forest.py", line 281, in visit
    return super(ForestToTreeVisitor, self).visit(root)
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/lark/parsers/earley_forest.py", line 204, in visit
    vtn(current)
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/lark/parsers/earley_forest.py", line 284, in visit_token_node
    self.output_stack[-1].append(node)
IndexError: deque index out of range

Integration of pathlib in older versions of Python

I am working on a system with Python 3.5.5, and I keep getting the error TypeError: invalid file: WindowsPath("/somepath")

pathlib integrates seamlessly with open() only in Python 3.6 and later:

The built-in open() function has been updated to accept os.PathLike objects, as have all relevant functions in the os and os.path modules, and most other functions and classes in the standard library.

A standard fix for this would be to convert the object to a string before opening the files.
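
A minimal sketch of that conversion is shown here. The helper name open_compat is hypothetical and not part of t2wml; it only assumes that the failing call is a plain open() on a pathlib.Path object.

```python
import tempfile
from pathlib import Path


def open_compat(path, mode="r", **kwargs):
    """Open a file given either a str or a pathlib.Path.

    Converting with str() keeps the call working on Python 3.5,
    where the built-in open() does not accept Path objects.
    """
    return open(str(path), mode, **kwargs)


if __name__ == "__main__":
    # Round-trip a small file through a Path to show the helper in use.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        with open_compat(sample, "w") as handle:
            handle.write("hello")
        with open_compat(sample) as handle:
            print(handle.read())
```

Because str() is a no-op on plain strings, the same helper works for callers that already pass string paths.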

Malformed query sent to sparql endpoint

One example of a query sent to the SPARQL endpoint is

SELECT ?qnode (MIN(?label) AS ?label) (MIN(?desc) AS ?desc) WHERE {
  VALUES ?qnode { wd:Q30271987}
  ?qnode rdfs:label ?label; <http://schema.org/description> ?desc.
  FILTER (langMatches(lang(?label),"EN"))
  FILTER (langMatches(lang(?desc),"EN"))
}
GROUP BY ?qnode

which throws an error. The correct query should be

SELECT ?qnode (MIN(?label) AS ?label_1) (MIN(?desc) AS ?desc_1) WHERE {
  VALUES ?qnode { wd:Q30271987}
  ?qnode rdfs:label ?label; <http://schema.org/description> ?desc.
  FILTER (langMatches(lang(?label),"EN"))
  FILTER (langMatches(lang(?desc),"EN"))
}
GROUP BY ?qnode

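Notice the change: (MIN(?label) AS ?label_1) (MIN(?desc) AS ?desc_1). The aliases introduced with AS no longer reuse the variable names ?label and ?desc that are already bound in the WHERE clause.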

Please fix this
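
SPARQL 1.1 does not allow a SELECT expression to assign (with AS) to a variable that is already in scope, which is why reusing ?label and ?desc fails. A minimal sketch of a query builder that always uses distinct aliases follows; the function name build_label_query and its parameters are assumptions for illustration, not the project's actual code.

```python
def build_label_query(qnode, lang="EN"):
    """Build the label/description query with aliases distinct from bound variables."""
    return (
        "SELECT ?qnode (MIN(?label) AS ?label_1) (MIN(?desc) AS ?desc_1) WHERE {\n"
        f"  VALUES ?qnode {{ wd:{qnode} }}\n"
        "  ?qnode rdfs:label ?label; <http://schema.org/description> ?desc.\n"
        f'  FILTER (langMatches(lang(?label),"{lang}"))\n'
        f'  FILTER (langMatches(lang(?desc),"{lang}"))\n'
        "}\n"
        "GROUP BY ?qnode"
    )


print(build_label_query("Q30271987"))
```

Any code that reads the results would then need to look up label_1 and desc_1 instead of label and desc.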

Ghost highlighted region

After I applied a YAML file to one sheet, switched to another sheet, and switched back, the highlighted regions disappeared, which was OK. But when I click cells in the data region, the /resolve_cell request still returns something, which affects the table viewer and the output.

I suppose deleting the YAML file on the backend whenever an /upload_excel request is fired would solve this issue.


t2wml specification file

Could it be linked from the project's README? It's hard to know which functions are supported (e.g., to skip rows) if the spec is not available.

Apply YAML for indicator files takes a really long time

Applying the YAML file to this CSV takes a really long time:
DT.ODA.ODAT.GI.ZS.csv.zip

Is it taking a long time in the server or in the browser?

statementMapping:
  region:
    - left: D
      right: BL
      top: 5
      bottom: 269
  template:
    item: item(A/$row)
    property: item(D/$row)
    value: value($col/$row)
    #unit: # need to define the units
    qualifier:
      - property: P585
        value: value($col/4)
        calendar: Q1985727
        precision: year
        time_zone: 0

references not part of the system yet?

YAML:

# irrigated_land
statementMapping:
  region:
    - left: CB
      right: CB
      top: 9
      bottom: 26
  template:
    item: item[B, $row]
    property: P1082
    value: value[$col, $row] 
    unit: Q712226 # sq km 
    qualifier:
      - property: P585
        value: value[CD, 9]
        calendar: Q1985727
        precision: year
        time_zone: 0
        format: "%Y"
    reference:
      - property: P246 # stated in
        value: Q11191 # The World Factbook

Cannot load Excel files with dates

Exception:

 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
[2019-07-04 16:10:37,387] ERROR in app: Exception on /upload_excel [POST]
Traceback (most recent call last):
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/flask_cors/extension.py", line 161, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/pedroszekely/.virtualenvs/t2wml/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "application.py", line 70, in upload_excel
    return upload_file(user_id, sheet_name)
  File "application.py", line 40, in upload_file
    data = excel_to_json(file_path, sheet_name)
  File "/Users/pedroszekely/Documents/GitHub/t2wml/Code/utility_functions.py", line 100, in excel_to_json
    return json.dumps(result)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type date is not JSON serializable
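
The final frame shows json.dumps rejecting a datetime.date value read from the spreadsheet. One common workaround is to pass a default handler that converts dates to ISO 8601 strings. This is a minimal sketch: json_default and the sample rows are illustrative assumptions, and the real excel_to_json may need a different date format.

```python
import datetime
import json


def json_default(value):
    """Fallback for json.dumps: render dates and datetimes as ISO 8601 strings."""
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Illustrative sheet content with one date cell.
rows = [["Country", "Reported"], ["Belgium", datetime.date(2019, 7, 4)]]
print(json.dumps(rows, default=json_default))
# prints [["Country", "Reported"], ["Belgium", "2019-07-04"]]
```

The handler is only called for values the encoder cannot serialize on its own, so strings and numbers are unaffected.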

500 Internal Server Error while downloading file

There are 2 versions of this error:

  1. Uncaught Error 500: This occurs when I try to download the file and the file is already present in the cache.
  2. Caught Error 500: This occurs when cache is not used and a new file is generated to download and none of the cells raise an error while generating the statement.

I have fixed the error in Branch_json_tests. Please refer to this commit for details.
84dc237

Statements are overwritten for cells which are evaluated first with the values of the statements which are generated later

While downloading a JSON/TTL file, statements are overwritten with the values of statements that are generated after that particular cell.
Refer to the log below, which shows the value of the variable data during two different iterations of the generate_download_file function defined in t2wml_handling. Check the value of the P585 qualifier for cell D4. The first log is from when cell D9 is processed and the second is from when cell E4 is processed. The issue isn't restricted to qualifiers; it affects the whole statement.

[
  {
    "cell": "D4",
    "statement": {
      "item": "Q977",
      "property": "P100024",
      "value": 1,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2000-01-01T00:00:00",
          "cell": "D3"
        },
        {
          "property": "P6001",
          "value": "Q6581097",
          "cell": "C9"
        },
        {
          "property": "P123",
          "value": "Q6039400",
          "cell": "B9"
        }
      ],
      "cell": "A9"
    }
  },
  {
    "cell": "D5",
    "statement": {
      "item": "Q977",
      "property": "P100024",
      "value": 1,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2000-01-01T00:00:00",
          "cell": "D3"
        },
        {
          "property": "P6001",
          "value": "Q6581097",
          "cell": "C9"
        },
        {
          "property": "P123",
          "value": "Q6039400",
          "cell": "B9"
        }
      ],
      "cell": "A9"
    }
  },
  {
    "cell": "D6",
    "statement": {
      "item": "Q977",
      "property": "P100024",
      "value": 1,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2000-01-01T00:00:00",
          "cell": "D3"
        },
        {
          "property": "P6001",
          "value": "Q6581097",
          "cell": "C9"
        },
        {
          "property": "P123",
          "value": "Q6039400",
          "cell": "B9"
        }
      ],
      "cell": "A9"
    }
  },
  {
    "cell": "D7",
    "statement": {
      "item": "Q977",
      "property": "P100024",
      "value": 1,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2000-01-01T00:00:00",
          "cell": "D3"
        },
        {
          "property": "P6001",
          "value": "Q6581097",
          "cell": "C9"
        },
        {
          "property": "P123",
          "value": "Q6039400",
          "cell": "B9"
        }
      ],
      "cell": "A9"
    }
  },
  {
    "cell": "D8",
    "statement": {
      "item": "Q977",
      "property": "P100024",
      "value": 1,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2000-01-01T00:00:00",
          "cell": "D3"
        },
        {
          "property": "P6001",
          "value": "Q6581097",
          "cell": "C9"
        },
        {
          "property": "P123",
          "value": "Q6039400",
          "cell": "B9"
        }
      ],
      "cell": "A9"
    }
  },
  {
    "cell": "D9",
    "statement": {
      "item": "Q977",
      "property": "P100024",
      "value": 1,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2000-01-01T00:00:00",
          "cell": "D3"
        },
        {
          "property": "P6001",
          "value": "Q6581097",
          "cell": "C9"
        },
        {
          "property": "P123",
          "value": "Q6039400",
          "cell": "B9"
        }
      ],
      "cell": "A9"
    }
  }
]

********************

[
  {
    "cell": "D4",
    "statement": {
      "item": "Q967",
      "property": "P100024",
      "value": 2,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2001-01-01T00:00:00",
          "cell": "E3"
        },
        {
          "property": "P6001",
          "value": "Q6581072",
          "cell": "C4"
        },
        {
          "property": "P123",
          "value": "Q7649586",
          "cell": "B4"
        }
      ],
      "cell": "A4"
    }
  },
  {
    "cell": "D5",
    "statement": {
      "item": "Q967",
      "property": "P100024",
      "value": 2,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2001-01-01T00:00:00",
          "cell": "E3"
        },
        {
          "property": "P6001",
          "value": "Q6581072",
          "cell": "C4"
        },
        {
          "property": "P123",
          "value": "Q7649586",
          "cell": "B4"
        }
      ],
      "cell": "A4"
    }
  },
  {
    "cell": "D6",
    "statement": {
      "item": "Q967",
      "property": "P100024",
      "value": 2,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2001-01-01T00:00:00",
          "cell": "E3"
        },
        {
          "property": "P6001",
          "value": "Q6581072",
          "cell": "C4"
        },
        {
          "property": "P123",
          "value": "Q7649586",
          "cell": "B4"
        }
      ],
      "cell": "A4"
    }
  },
  {
    "cell": "D7",
    "statement": {
      "item": "Q967",
      "property": "P100024",
      "value": 2,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2001-01-01T00:00:00",
          "cell": "E3"
        },
        {
          "property": "P6001",
          "value": "Q6581072",
          "cell": "C4"
        },
        {
          "property": "P123",
          "value": "Q7649586",
          "cell": "B4"
        }
      ],
      "cell": "A4"
    }
  },
  {
    "cell": "D8",
    "statement": {
      "item": "Q967",
      "property": "P100024",
      "value": 2,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2001-01-01T00:00:00",
          "cell": "E3"
        },
        {
          "property": "P6001",
          "value": "Q6581072",
          "cell": "C4"
        },
        {
          "property": "P123",
          "value": "Q7649586",
          "cell": "B4"
        }
      ],
      "cell": "A4"
    }
  },
  {
    "cell": "D9",
    "statement": {
      "item": "Q967",
      "property": "P100024",
      "value": 2,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2001-01-01T00:00:00",
          "cell": "E3"
        },
        {
          "property": "P6001",
          "value": "Q6581072",
          "cell": "C4"
        },
        {
          "property": "P123",
          "value": "Q7649586",
          "cell": "B4"
        }
      ],
      "cell": "A4"
    }
  },
  {
    "cell": "E4",
    "statement": {
      "item": "Q967",
      "property": "P100024",
      "value": 2,
      "qualifier": [
        {
          "property": "P585",
          "calendar": "Q1985727",
          "precision": 9,
          "time_zone": 0,
          "format": "%Y",
          "value": "2001-01-01T00:00:00",
          "cell": "E3"
        },
        {
          "property": "P6001",
          "value": "Q6581072",
          "cell": "C4"
        },
        {
          "property": "P123",
          "value": "Q7649586",
          "cell": "B4"
        }
      ],
      "cell": "A4"
    }
  }
]
********************
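
The logs are consistent with every list entry referring to the same mutable statement object, although the report does not confirm that this is the cause. A minimal sketch of that failure mode and of a copy-based fix, using hypothetical helper names rather than the real generate_download_file code:

```python
import copy


def collect_shared(cells):
    """Buggy pattern: one dict is mutated and appended on every iteration."""
    statement = {}
    data = []
    for cell, value in cells:
        statement["value"] = value
        # Every entry points at the same dict, so later writes show up everywhere.
        data.append({"cell": cell, "statement": statement})
    return data


def collect_copied(cells):
    """Fixed pattern: each entry stores its own deep copy of the statement."""
    statement = {}
    data = []
    for cell, value in cells:
        statement["value"] = value
        data.append({"cell": cell, "statement": copy.deepcopy(statement)})
    return data


cells = [("D4", 1), ("E4", 2)]
print([entry["statement"]["value"] for entry in collect_shared(cells)])  # [2, 2]
print([entry["statement"]["value"] for entry in collect_copied(cells)])  # [1, 2]
```

A deep copy matters here because statements contain nested qualifier lists; a shallow copy would still share those.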

Change default endpoint in the GUI

Currently the queries go to the standard Wikidata endpoint. By default, they should go to our endpoint:

http://sitaware.isi.edu:8080/bigdata/namespace/wdq/sparql

Not able to use GUI in Chrome browser on Windows

Using a Chrome browser on Windows, I get the following error:

idpiframe

If I ignore the error and click on Log in with Google, I get this error:
error

Also, I have allowed all cookies and reset the browser cache.

Cannot put constants in value

The value attribute should accept an arbitrary expression, including a constant. For example, value: 2012 in the following should be valid, but it causes an error:

    qualifier:
      - property: P585
        value: 2012
        calendar: Q1985727
        precision: year
        time_zone: 0 

Static value for `value` attribute raises an exception

When I give a static value to the value attribute instead of a T2WML value expression, the system raises this exception:

"error": {"errorCode": 500, "errorTitle": "Undefined Backend Error", "errorDescription": "not enough values to unpack (expected 3, got 1)"}

Here is an example T2WML spec (based on homicide data table-1a):

statementMapping:
  region:
    - left: D
      right: F
      top: 4
      bottom: 9
  template:
    item: item[A, $row]
    property: P100024 # murder
    value: Q6030821
    #unit: D1002
    qualifier:
      - property: P585
        value: value[$col, 3]
        calendar: Q1985727
        precision: year
        time_zone: 0
        format: "%Y"
      - property: P6001 # applies to people
        value: item[C, $row]
      - property: P123 #source
        value: item[B, $row]
      - property: P1640 # curator
        value: Q6030821 # ISI

Both template->value and template->qualifier->value raise exceptions.

value[A+n, 3] doesn't work

I want to define a YAML file as follows so that the left is the first column that has the value irrigated_land:

# irrigated_land
statementMapping:
  region:
    - left: value[A+n, 3] == "irrigated_land" -> A+n
      right: CB
      top: 9
      bottom: $end
  template:
    item: item[B, $row]
    property: P1082
    value: value[$col, $row]

Using the +n expression with $left, as in value[$left+n,3], does seem to work.

Source and wikifier files attached:

2007-10-01_factbook-small.xlsx

wikifier.zip

References should be lists of lists

References are currently implemented the same way as qualifiers. However, references should be a list of lists, as in:

# OECD mapping
statementMapping:
  region: 
    - range: D6:K12
      skip_cell:
        - =value[$col, $row] == " .."
  template:
    # The next line should wikify the extracted country
    #item: '=get_item(regex(value[B, 2], "profile: (.*) \d{4}", 1))' 
    item: Q31 # Belgium
    property: =item[B, $row]
    value: =replace(value[$col, $row], "[^\d.-]", "")
    reference: 
      - - property: P246 # stated in
          value: Q41550 # OECD
        - property: P2006010001 # Datamart dataset id
          value: Q2006050001 # OECD dataset
      - - property: P246 # stated in
          value: Q123456 
        - property: PP1234567 # Datamart dataset id
          value: "hi there"
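
Once loaded, the nested YAML above becomes a list of lists of dictionaries, one inner list per reference. The sketch below shows one way a backend could accept both the current flat form and the proposed nested form. The function name reference_groups is hypothetical, and treating a flat list as a single reference is an assumption, not something the spec states.

```python
def reference_groups(reference):
    """Normalise a reference spec into a list of groups (lists of dicts).

    Accepts the flat, qualifier-style list as well as the nested form.
    """
    if not reference:
        return []
    if all(isinstance(entry, dict) for entry in reference):
        return [list(reference)]  # flat form: treat as a single reference
    return [list(group) for group in reference]


# Structures as a YAML loader would produce them for the two layouts.
flat = [
    {"property": "P246", "value": "Q41550"},
    {"property": "P2006010001", "value": "Q2006050001"},
]
nested = [
    flat,
    [
        {"property": "P246", "value": "Q123456"},
        {"property": "PP1234567", "value": "hi there"},
    ],
]

print(len(reference_groups(flat)), len(reference_groups(nested)))  # 1 2
```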

etk library installation

After following the installation instructions from the readme and running application.py, I got a ModuleNotFoundError for 'etk.wikidata' on the line 'from etk.wikidata.entity import WDItem' (in the triple_generator.py file).
I think the etk library (https://github.com/usc-isi-i2/etk) that gets installed via requirements.txt doesn't contain the WikiData modules. I tried installing https://github.com/fatestigma/etk/tree/wikidata instead, which does contain the WikiData modules, but its file structure doesn't match the imports in the code.

unit not part of output or download

YAML:

# irrigated_land
statementMapping:
  region:
    - left: CB
      right: CB
      top: 9
      bottom: 26
  template:
    item: item[B, $row]
    property: P1082
    value: value[$col, $row]
    unit: Q712226 # sq km

The main problem is that the units are not part of the JSON, so they will not appear in the final output.

A secondary problem is that the units are not shown in the output on the screen. This is less important to fix, but would be nice: it should get the label of the Qnode for the units and put it next to the value.

data and wikifier same as in issue #89


TTL for fertilizer data is incomplete

There are two problems. In interactive mode, no file is produced. In batch mode, most of the data is missing. The files are:

For some strange reason, the interactive and batch behaviors differ, and in the TTL most of the data is missing.

Conflicting "lark" namespace

In the requirements.txt file, both lark==0.0.4 and lark_parser==0.7.1 are included, but both use the namespace "lark" for imports, so the wrong library can get imported. The only place where the "lark" namespace is used is t2wml_parser.py, where the reference is to lark_parser, so it appears that lark==0.0.4 is not needed at all. This is a minor inconvenience, as one can simply remove the package from requirements.txt or manually uninstall it, but it's a source of confusion for new users.

Download as KGTK produces duplicate ids for qualifier edges

For example:

oecd;OECD-Latvia g2g9e7f8-en..csv;D6	Q211	P1082	19782	number	True	0	19782													
oecd;OECD-Latvia g2g9e7f8-en...csv;D6;D4	oecd;OECD-Latvia g2g9e7f8-en..csv;D6	P585	^2011-01-01T00:00:00/9	date_and_times	True	0											"2011-01-01T00:00:00"	9		
oecd;OECD-Latvia g2g9e7f8-en...csv;D6;	oecd;OECD-Latvia g2g9e7f8-en..csv;D6	P248	Q41550	symbol	True	0														Q41550
oecd;OECD-Latvia g2g9e7f8-en...csv;D6;	oecd;OECD-Latvia g2g9e7f8-en..csv;D6	P2006010001	Q2006050001	symbol	True	0														Q2006050001

This seems to happen for qualifiers with fixed values, where no cell is used to supply the value.

A simple solution is to not generate ids for qualifier edges as KGTK can easily add them later.
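
A quick check like the following can confirm whether an exported file still contains repeated ids. It is a sketch that assumes the id is the first tab-separated column, as in the rows above; the function name duplicate_ids and the shortened sample ids are illustrative only.

```python
from collections import Counter


def duplicate_ids(lines):
    """Return ids (first tab-separated column) that appear more than once."""
    counts = Counter(line.split("\t", 1)[0] for line in lines if line.strip())
    return sorted(edge_id for edge_id, n in counts.items() if n > 1)


# Shortened, illustrative rows in the same shape as the export above.
sample = [
    "file.csv;D6\tQ211\tP1082\t19782",
    "file.csv;D6;D4\tfile.csv;D6\tP585\t^2011-01-01T00:00:00/9",
    "file.csv;D6;\tfile.csv;D6\tP248\tQ41550",
    "file.csv;D6;\tfile.csv;D6\tP2006010001\tQ2006050001",
]
print(duplicate_ids(sample))  # ['file.csv;D6;']
```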

skip-row isn't skipping all rows that satisfy the conditions

Please check this example: skip_row isn't skipping row 9.
But if I skip rows with value[D, $row] == 2, it works as expected. I think the issue might be with trimming the cell values.
Here is the sample YAML file based on Homicide data Table-1a:

statementMapping:
  region:
    - left: D
      right: F
      top: 4
      bottom: 9
      skip_row:
        - value[D, $row] == 1
  template:
    item: item[A, $row]
    property: P100024 # murder
    value: value[$col, $row]
    #unit: D1002
    qualifier:
      - property: P585
        value: value[$col, 3]
        calendar: Q1985727
        precision: year
        time_zone: 0
        format: "%Y"
      - property: P6001 # applies to people
        value: item[C, $row]
      - property: P123 #source
        value: item[B, $row]
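
If trimming is indeed the cause (the report only suspects it), the comparison could normalise both sides before testing equality. The sketch below shows one way to do that; normalise and cell_equals are hypothetical helpers, not functions from the t2wml codebase.

```python
def normalise(value):
    """Strip surrounding whitespace and convert numeric text to a number."""
    if isinstance(value, str):
        text = value.strip()
        try:
            number = float(text)
        except ValueError:
            return text  # not numeric: compare as trimmed text
        return int(number) if number.is_integer() else number
    if isinstance(value, float) and value.is_integer():
        return int(value)  # Excel often stores whole numbers as floats
    return value


def cell_equals(cell_value, expected):
    """Compare a cell value with an expected constant after normalising both."""
    return normalise(cell_value) == normalise(expected)


print(cell_equals(" 1 ", 1), cell_equals(1.0, 1), cell_equals("2", 1))
# prints True True False
```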

Support use of formulas in property, units, precision, etc.

In general, formulas can be used to compute the value of any attribute in a YAML file. For example:

# OECD mapping
statementMapping:
  region: 
    - range: D6:K12
      skip_cell:
        - =value[$col, $row] == " .."
  template:
    # The next line should wikify the extracted country
    #item: '=get_item(regex(value[B, 2], "profile: (.*) \d{4}", 1))' 
    item: Q31 # Belgium
    property: =item[B, $row]
    value: =replace(value[$col, $row], "[^\d.-]", "")
    unit: =item[C, $row]
    qualifier:
      - property: P585 #point in time
        value: =value[$col, 4]
        calendar: Q1985727
        precision: year
        time_zone: 0
        format: "%Y"
    reference: 
      - property: P246 # stated in
        value: Q41550 # OECD
      - property: P2006010001 # Datamart dataset id
        value: Q2006050001 # OECD dataset

In this example, formulas are used in property and unit, but in general formulas could be used anywhere.

Wikifier GUI does not show label and description

Support uploading properties in KGTK format

Currently, properties must be uploaded in JSON. We need support for uploading properties in KGTK TSV format, as in the attached file. The idea is that the T2WML backend will scan the uploaded file and select rows of the form:

	P2006050001	data_type	quantity
	P2006050002	data_type	quantity

The set of property types is the following, using the terminology of the KGTK command that generates triples. We may need to change this list or add aliases to it:

item
time
globe-coordinate
quantity
monolingualtext
string
external-identifier
url
property
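
The scan described above could be sketched as follows. It assumes each relevant row contains the literal label data_type with the property id immediately before it and the type immediately after it, and it ignores rows whose type is not in the list above. The function name parse_property_types is hypothetical.

```python
PROPERTY_TYPES = {
    "item", "time", "globe-coordinate", "quantity", "monolingualtext",
    "string", "external-identifier", "url", "property",
}


def parse_property_types(lines):
    """Map property ids to data types from KGTK-style TSV lines."""
    properties = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if "data_type" not in fields:
            continue
        position = fields.index("data_type")
        if position == 0 or position + 1 >= len(fields):
            continue  # no subject or no value around the label
        node1 = fields[position - 1].strip()
        node2 = fields[position + 1].strip()
        if node1 and node2 in PROPERTY_TYPES:
            properties[node1] = node2
    return properties


sample = [
    "id\tnode1\tlabel\tnode2\n",
    "\tP2006050001\tdata_type\tquantity\n",
    "\tP2006050002\tdata_type\tquantity\n",
]
print(parse_property_types(sample))
# prints {'P2006050001': 'quantity', 'P2006050002': 'quantity'}
```

Locating the label by value rather than by column index keeps the sketch independent of whether the file has a leading id column.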

Same Location Value in Multiple Columns

In the attached screenshot, "United Kingdom" in X79 is mapped correctly according to the Qnode in the Wikifier file, but the same value is not mapped in Y87.
None of the values in column Y are mapped, even though their values are defined with Qnodes in the Wikifier file. (This is not a problem if the values are numeric; they map across multiple columns without any issue.)
(To reproduce: the file, Wikifier file, and YAML can be found at https://github.com/akankshadiwedy/Wikidata-UCDP/tree/master/location)

Screen Shot 2019-11-07 at 3 29 01 PM
