thomhurks / dblp-to-csv
Convert a DBLP XML file to CSV format.
Hi Thom,
Thank you very much for your code; it is very useful and practical. I managed to run it and import the data successfully. Now I would like to know how to open the resulting database in Neo4j Desktop/Browser and run Cypher queries against it.
Thank you very much in advance.
Start!
Reading elements from DTD file...
Finding unique attributes for all elements...
Opening output files...
Parsing XML and writing to CSV files...
Traceback (most recent call last):
File "myscript.py", line 400, in <module>
main()
File "myscript.py", line 378, in main
relations, unique_id = parse_xml(xml_file, elements, output_files, args.relations)
File "myscript.py", line 167, in parse_xml
set_relation_values(relations, data, relation_attributes, unique_id)
File "myscript.py", line 195, in set_relation_values
if column_name in relation_attributes:
TypeError: argument of type 'NoneType' is not iterable
Kindly resolve this issue.
C:\Users\Mooh\pychpr\dblp>python dblp.py dblp.xml dblp.dtd output.csv
Start!
Reading elements from DTD file...
Finding unique attributes for all elements...
Opening output files...
Parsing XML and writing to CSV files...
Traceback (most recent call last):
File "dblp.py", line 400, in <module>
main()
File "dblp.py", line 378, in main
relations, unique_id = parse_xml(xml_file, elements, output_files, args.relations)
File "dblp.py", line 167, in parse_xml
set_relation_values(relations, data, relation_attributes, unique_id)
File "dblp.py", line 195, in set_relation_values
if column_name in relation_attributes:
TypeError: argument of type 'NoneType' is not iterable
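Both tracebacks above fail on the membership test `column_name in relation_attributes`, which raises this `TypeError` whenever `relation_attributes` is `None`. The second command line was run without `--relations`, so one plausible cause (an assumption, not confirmed against the script) is that the option's default of `None` reaches `set_relation_values` unchanged. A minimal sketch of a guard, using the hypothetical helpers `normalize_relation_attributes` and `columns_with_relations`:

```python
def normalize_relation_attributes(relation_attributes):
    """Return an empty set when no relations were requested (hypothetical helper)."""
    return set() if relation_attributes is None else set(relation_attributes)


def columns_with_relations(columns, relation_attributes):
    """Select the columns that should become relations.

    Mimics the failing membership test, but never evaluates `name in None`.
    """
    relation_attributes = normalize_relation_attributes(relation_attributes)
    return [name for name in columns if name in relation_attributes]


# No --relations given: nothing is selected and nothing crashes.
print(columns_with_relations(["author", "title"], None))
# A relation was requested for the author attribute.
print(columns_with_relations(["author", "title"], {"author"}))
```

If the guess about the cause is right, passing at least one `--relations` entry on the command line may also avoid the error without touching the code.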
Hello, I frequently hit this error when running the script:
Python38-32/python.exe -X utf8 XMLToCSV.py --annotate --neo4j dblp.xml dblp.dtd import/output.csv --relations author:authored_by journal:published_in publisher:published_by school:submitted_at editor:edited_by cite:has_citation series:is_part_of415\exam\src\dblp-to-csv-master>
Will create relations for attribute(s): author, cite, editor, journal, publisher, school, series
Start!
Reading elements from DTD file...
Finding unique attributes for all elements...
Opening output files...
Parsing XML and writing to CSV files...
Traceback (most recent call last):
File "c:/Users/wassi/Desktop/INFO-H-415/exam/src/dblp-to-csv-master/XMLToCSV.py", line 433, in <module>
main()
File "c:/Users/wassi/Desktop/INFO-H-415/exam/src/dblp-to-csv-master/XMLToCSV.py", line 408, in main
(relations, unique_id, array_elements, element_types) = parse_xml(xml_file, elements, output_files,
File "c:/Users/wassi/Desktop/INFO-H-415/exam/src/dblp-to-csv-master/XMLToCSV.py", line 199, in parse_xml
output_files[current_tag].writerow(data)
File "C:\Users\wassi\AppData\Local\Programs\Python\Python38-32\lib\csv.py", line 154, in writerow
return self.writer.writerow(self._dict_to_list(rowdict))
File "C:\Users\wassi\AppData\Local\Programs\Python\Python38-32\lib\csv.py", line 149, in _dict_to_list
raise ValueError("dict contains fields not in fieldnames: "
ValueError: dict contains fields not in fieldnames: 'publtype'
Any idea?
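The `ValueError` comes from `csv.DictWriter`, which by default rejects any key in the row that is not listed in `fieldnames`. Here the row contains `publtype` while the header does not, possibly because the DTD used to build the header is older than the XML dump (an assumption). The sketch below reproduces the behaviour with made-up field names and row values, and shows the standard `extrasaction="ignore"` option that makes the writer skip unknown keys instead of raising:

```python
import csv
import io

# Assumed header and row, for illustration only.
fieldnames = ["key", "title"]
row = {"key": "journals/x/1", "title": "Example", "publtype": "informal"}

# Default behaviour (extrasaction="raise"): unknown keys cause a ValueError.
strict_buffer = io.StringIO()
strict_writer = csv.DictWriter(strict_buffer, fieldnames=fieldnames)
try:
    strict_writer.writerow(row)
    strict_failed = False
except ValueError:
    strict_failed = True

# extrasaction="ignore": unknown keys are silently dropped from the output.
lenient_buffer = io.StringIO()
lenient_writer = csv.DictWriter(
    lenient_buffer, fieldnames=fieldnames, extrasaction="ignore"
)
lenient_writer.writerow(row)
lenient_output = lenient_buffer.getvalue()
```

Ignoring extra keys loses the `publtype` values. If that column matters, using a DTD that declares the attribute is probably the better route.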
Traceback (most recent call last):
File "./XMLToCSV.py", line 412, in <module>
main()
File "./XMLToCSV.py", line 387, in main
relation_attributes, annotate=True)
File "./XMLToCSV.py", line 172, in parse_xml
set_type_information(element_types, current_tag, key, value)
File "./XMLToCSV.py", line 242, in set_type_information
types.add(get_type(value))
File "./XMLToCSV.py", line 263, in get_type
date.fromisoformat(string_value)
AttributeError: type object 'datetime.date' has no attribute 'fromisoformat'
I tried with dblp-2019-01-01.xml and dblp-2017-08-29.dtd
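`date.fromisoformat` was added in Python 3.7, so this `AttributeError` points to the script being run on an older interpreter. Upgrading Python is the simplest fix. Where that is not possible, a fallback along these lines could work; `parse_iso_date` is a hypothetical helper, not part of the script:

```python
from datetime import date, datetime


def parse_iso_date(string_value):
    """Parse a YYYY-MM-DD string into a date.

    Uses date.fromisoformat when the interpreter provides it (Python 3.7+),
    and falls back to strptime otherwise. Raises ValueError on bad input.
    """
    if hasattr(date, "fromisoformat"):
        return date.fromisoformat(string_value)
    return datetime.strptime(string_value, "%Y-%m-%d").date()


parsed = parse_iso_date("2019-01-01")
```

Both branches raise `ValueError` for strings that are not dates, so a type-detection routine that relies on catching that exception should behave the same either way.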
Hi, please help with this error.
xmltocsv.py: error: the following arguments are required: xml_filename, dtd_filename, outputfile
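This message is produced by `argparse` when the three positional arguments are missing, so the script was most likely started without any file names. The sketch below is not the script's actual parser; it only mirrors the argument names from the error message and the flags visible in the other reports on this page, to show what a complete invocation looks like:

```python
import argparse


def build_parser():
    """Build a parser that mimics the reported command line (illustrative only)."""
    parser = argparse.ArgumentParser(prog="XMLToCSV.py")
    # The three required positionals named in the error message.
    parser.add_argument("xml_filename")
    parser.add_argument("dtd_filename")
    parser.add_argument("outputfile")
    # Optional flags seen in other reports.
    parser.add_argument("--annotate", action="store_true")
    parser.add_argument("--neo4j", action="store_true")
    parser.add_argument("--relations", nargs="*")
    return parser


# Equivalent to: python XMLToCSV.py dblp.xml dblp.dtd output.csv
args = build_parser().parse_args(["dblp.xml", "dblp.dtd", "output.csv"])
```

In this sketch `args.relations` is `None` when `--relations` is omitted, which is the default `argparse` behaviour for an optional argument with no explicit default.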
I get through the following stages of the script:
Start!
Reading elements from DTD file...
Finding unique attributes for all elements...
Opening output files...
Parsing XML and writing to CSV files...
Then, it throws the following error:
Python\Python37\Lib\encodings\cp1252.py, line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u221e' in position 186: character maps to <undefined>
The three optional 8-bit hex characters for this infinity sign are already taken up by special (accent) characters in the de/encoding table. Should I change the table in that (Python 3.7) system file or what should I do?
Cheers,
Jochen
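Editing the `cp1252.py` table should not be necessary. The error means the output file was opened with the Windows default encoding (cp1252), which has no slot for the infinity sign. Opening the output files as UTF-8 avoids the problem; Python 3.7 and later can also be started with `-X utf8`, as one of the other command lines on this page does. A minimal sketch with a made-up row and a temporary output path:

```python
import csv
import os
import tempfile

# Made-up row containing the infinity sign (U+221E).
row = ["conf/example/1", "Bounds for n \u2192 \u221e"]

# cp1252 cannot represent U+221E, which is what the traceback reports.
try:
    "\u221e".encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Writing with an explicit UTF-8 encoding works regardless of the locale.
path = os.path.join(tempfile.mkdtemp(), "output.csv")
with open(path, "w", encoding="utf-8", newline="") as handle:
    csv.writer(handle).writerow(row)

with open(path, encoding="utf-8", newline="") as handle:
    read_back = next(csv.reader(handle))
```

Whether the script exposes an option for the output encoding is not stated here, so the `open(..., encoding="utf-8")` call would have to be applied wherever the script opens its CSV files.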