ewancook / transcriber
Rapidly convert FactoryTalk SE DAT files to useful CSV files
License: MIT License
Bug:
`v0.1.5` gives incorrect output when trends are added or removed in the middle of `(Float)` files.
Detail:
The parser wrongly assumes that trends are only added at the start of a file, or that missing values are backfilled with placeholders (e.g. 0). Trends can be added at any point, and the only indicators of this are in the `(Float)` file:
- `Status` of `U` (update)
- `Marker`s of `B` and `E` (beginning and end)

The markers are usually shown once for each tag at the beginning and end of each file. If a trend is added or removed, `E` is assigned to each tag before the trend change, then `B` is assigned to each tag after the change (see the example below). I.e. the current record ends and a new record begins, all in the same file.
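To make the failure concrete, here is a minimal sketch. It assumes the parser maps row `k` to tag `k % n` with a fixed `n`; the issue only mentions "modular arithmetic parsing", so the exact mapping is an assumption, and the `TagIndex` sequence is made up for illustration.

```python
# TagIndex column of a hypothetical (Float) file: three tags are logged,
# then tag 1 is removed part-way through the file.
tag_indices = [0, 1, 2, 0, 1, 2, 0, 2, 0, 2]

# The tag count is taken once and assumed to hold for the whole file.
initial_tag_count = 3

# Assumed position-based mapping: row k is treated as tag k % n.
guessed = [k % initial_tag_count for k in range(len(tag_indices))]

# Rows where the position-based guess disagrees with the real TagIndex.
mismatches = [
    k for k, (real, guess) in enumerate(zip(tag_indices, guessed)) if real != guess
]
print(mismatches)
```

Every row after the trend is removed is attributed to the wrong tag, which is why the tag count has to be recalculated whenever the markers signal a new record.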
Proposed Fix:
`SubclassedDBF` will read `n` rows until `B` is no longer seen, and `yield` the required tag indices. This allows the initial number of tags to be calculated. The subsequent required rows will be read (using row skipping) as before, without breaking the modular arithmetic parsing.
If trends are added or removed (or the file ends), `E` will be present in every row of the set. Therefore only the first required row in each set of `n` rows needs to be checked, as below.
If more rows follow (trends were added or removed), the number of tags will be recalculated with the method above, and parsing will continue.
This fix causes minimal speed reduction!
# checking one row in every set of required rows:
rows = itertools.islice(rows_since_first_set, num_required_rows)
first_row = next(rows)
yield first_row
yield from rows
# check if first_row["Marker"] == b'E' and recalculate tags if so
Beginning and end marker example:
TagIndex | Status | Marker |
---|---|---|
0 | | B |
1 | | B |
2 | | B |
0 | | |
1 | | |
2 | | |
0 | | E |
1 | | E |
2 | | E |
0 | U | B |
2 | U | B |
0 | | |
2 | | |
0 | | E |
2 | | E |
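The proposed fix can be sketched on the example rows as follows. This is not the `SubclassedDBF` implementation: `split_records` and `make_row` are hypothetical names, rows are plain dicts, `B` and `E` are taken as `Marker` values as the description states, and only unmarked rows are kept as data, because the issue does not say whether marker rows carry usable values.

```python
import itertools


def split_records(rows):
    """Yield (tag_indices, data_rows) for each B...E record in one file."""
    rows = iter(rows)
    row = next(rows, None)
    while row is not None:
        # Step 1: read rows while Marker == b"B" to learn which tags are present.
        tags = []
        while row is not None and row["Marker"] == b"B":
            tags.append(row["TagIndex"])
            row = next(rows, None)
        if not tags:
            raise ValueError("expected a set of rows marked b'B'")
        n = len(tags)
        # Step 2: read sets of n rows; only the first row of each set is checked.
        data = []
        while row is not None:
            current = [row] + list(itertools.islice(rows, n - 1))
            row = next(rows, None)
            if current[0]["Marker"] == b"E":
                break  # record ended; the tag count is recalculated next
            data.extend(current)
        yield tags, data


def make_row(tag, status=b"", marker=b""):
    return {"TagIndex": tag, "Status": status, "Marker": marker}


# The rows of the marker example: tag 1 is removed in the middle of the file.
example = (
    [make_row(t, marker=b"B") for t in (0, 1, 2)]
    + [make_row(t) for t in (0, 1, 2)]
    + [make_row(t, marker=b"E") for t in (0, 1, 2)]
    + [make_row(t, status=b"U", marker=b"B") for t in (0, 2)]
    + [make_row(t) for t in (0, 2)]
    + [make_row(t, marker=b"E") for t in (0, 2)]
)

records = [
    (tags, [r["TagIndex"] for r in data]) for tags, data in split_records(example)
]
print(records)
```

The example yields two records, the first with tags 0, 1, 2 and the second with tags 0 and 2, so position-based parsing can continue within each record using that record's own tag count.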
Proposed Fix:
For example:
etc.
This would break the fast, maths-based transcription: a dict of tags would be needed again...
Given the enormous amount of work needed to refactor this repository following #3, the following command-line conversion tool may be useful while `v0.1.6` is still under development.
The tool will convert a single float file, provided that a tag file and an output file destination are supplied. Options for row averaging and for altering the precision (decimal places) are available. All tags are present in the converted file.
The tool should be added to the root of the transcriber directory (i.e. where `run.py` is located) or downloaded as a binary.
Edit: It's worth noting that, due to the version (`v0.1.5`), issue #3 still applies: this tool can only be relied upon when the tags do not change in the middle of the float file.
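For readers unsure what the row averaging and precision options do to the output, the sketch below shows one plausible interpretation. It is an assumption, not the transcriber code: `average_rows` and `format_row` are hypothetical helpers, rows are averaged in consecutive groups column by column, a short final group is averaged on its own, and precision is applied as fixed decimal places.

```python
def average_rows(rows, n):
    """Average each consecutive group of n rows, column by column."""
    for start in range(0, len(rows), n):
        group = rows[start:start + n]
        yield [sum(column) / len(group) for column in zip(*group)]


def format_row(values, precision):
    """Join values as CSV fields with a fixed number of decimal places."""
    return ",".join(f"{value:.{precision}f}" for value in values)


# Four rows of two tags each, averaged three rows at a time, 2 decimal places.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
lines = [format_row(row, 2) for row in average_rows(rows, 3)]
print(lines)
```

Under these assumptions, four input rows become two output rows, the second built from the single leftover row.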
Platform | File | Build OS | Python Version |
---|---|---|---|
Linux | transcribe_utility_v0.1_linux.tar.gz | Ubuntu 18.04.5 LTS | 3.6.3 |
Windows | transcribe_utility_v0.1_windows.zip | Windows 10 (19041) | 3.8.2 |
Python:
./transcribe.py -f "2020 12 14 0000 (Float).DAT" -t "2020 12 14 0000 (Tagname).DAT" -o converted.csv
./transcribe.py --float-file "2020 12 14 0000 (Float).DAT" --tag-file "2020 12 14 0000 (Tagname).DAT" --output-file converted.csv --precision 5 --average-rows 3
Windows:
.\transcribe.exe -f "2020 12 14 0000 (Float).DAT" -t "2020 12 14 0000 (Tagname).DAT" -o converted.csv
.\transcribe.exe --float-file "2020 12 14 0000 (Float).DAT" --tag-file "2020 12 14 0000 (Tagname).DAT" --output-file converted.csv --precision 5 --average-rows 3
Linux:
./transcribe -f "2020 12 14 0000 (Float).DAT" -t "2020 12 14 0000 (Tagname).DAT" -o converted.csv
./transcribe --float-file "2020 12 14 0000 (Float).DAT" --tag-file "2020 12 14 0000 (Tagname).DAT" --output-file converted.csv --precision 5 --average-rows 3
usage: transcribe.py [-h] -f FLOAT_FILE -t TAG_FILE -o OUTPUT_FILE
                     [-p PRECISION] [-a AVERAGE_ROWS]

Convert FactoryTalk DAT Files

optional arguments:
  -h, --help            show this help message and exit
  -f FLOAT_FILE, --float-file FLOAT_FILE
                        Float File [(Float).DAT]
  -t TAG_FILE, --tag-file TAG_FILE
                        Tag File [(Tagname).DAT]
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Output File
  -p PRECISION, --precision PRECISION
                        Precision
  -a AVERAGE_ROWS, --average-rows AVERAGE_ROWS
                        Rows to Average
Please see the code below. It is also linked above (gist.github.com).
#!/usr/bin/env python3
import argparse
from multiprocessing import freeze_support

from transcriber.converter.dbfworker import utils
from transcriber.converter.dbfworker.worker import DBFWorker
from transcriber.dbf.parser import Parser

TAG_LABEL = "Tagname"


def retrieve_tags(tag_file):
    # Read every tag name from the (Tagname).DAT file, in file order.
    parser = Parser(required_fields=[TAG_LABEL])
    return [r[TAG_LABEL].decode().strip() for r in parser.parse_all(tag_file)]


def run(float_file, tag_file, precision, average_rows):
    try:
        tags = retrieve_tags(tag_file)
        # All tags are requested, so every tag appears in the converted file.
        worker = DBFWorker(
            tag_lookup=tags,
            tags=set(tags),
            decimal_places=precision,
            filename=float_file,
            rows_to_average=average_rows,
        )
        print(f"Successfully converted {worker.work()}")
    except Exception as e:
        print(f"Conversion failed ({e})")


if __name__ == "__main__":
    # Needed when the script is frozen into a Windows executable.
    freeze_support()
    parser = argparse.ArgumentParser(
        description="Convert FactoryTalk DAT Files"
    )
    parser.add_argument(
        "-f", "--float-file", required=True, help="Float File [(Float).DAT]"
    )
    parser.add_argument(
        "-t", "--tag-file", required=True, help="Tag File [(Tagname).DAT]"
    )
    parser.add_argument(
        "-o", "--output-file", required=True, help="Output File"
    )
    parser.add_argument(
        "-p", "--precision", help="Precision", default=8, type=int
    )
    parser.add_argument(
        "-a", "--average-rows", help="Rows to Average", default=1, type=int
    )
    args = parser.parse_args()
    # Replace the filename helper so that the output goes to --output-file.
    utils.transcribed_filename = lambda x: args.output_file
    # Values below 2 are passed on as None (no averaging requested).
    average_rows = None if args.average_rows < 2 else args.average_rows
    run(args.float_file, args.tag_file, args.precision, average_rows)