arpa-simc / dballe
Fast on-disk database for meteorological observed and forecast data.
License: Other
At the moment, for a quantesono/elencamele the number of resulting stations varies depending on the database:
mem: there is one station for each combination of (report, lat, lon, ident).
The ordering of the stations in the result is currently undefined: the SQL-based databases order by ana_id, while for the mem: database it is not clear to me what it currently orders by; it seems to depend on the query being made.
In future SQL-based database implementations I would like to move towards one station for each combination of (report, lat, lon, ident), as in the mem: database.
For ordering, I would like to document that the results are not ordered.
Can you confirm that this matches how DB-All.e is actually used at the moment?
(rpmbuild defaults include -Werror)
/bin/sh ../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I.. -DTABLE_DIR=\"/usr/share/wreport\" -I.. -I.. -I/usr/include/mysql -Werror -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -std=gnu++11 -c -o libdballe_la-types.lo `test -f 'types.cc' || echo './'`types.cc
libtool: compile: g++ -DHAVE_CONFIG_H -I. -I.. -DTABLE_DIR=\"/usr/share/wreport\" -I.. -I.. -I/usr/include/mysql -Werror -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -std=gnu++11 -c types.cc -fPIC -DPIC -o .libs/libdballe_la-types.o
types.cc: In function 'std::string dballe::{anonymous}::fmtf(const char*, ...)':
types.cc:812:27: error: ignoring return value of 'int vasprintf(char**, const char*, __va_list_tag*)', declared with attribute warn_unused_result [-Werror=unused-result]
vasprintf( &c, f, ap );
^
cc1plus: all warnings being treated as errors
make[3]: *** [libdballe_la-types.lo] Error 1
make[3]: Leaving directory `/home_local/makerpm/rpmbuild/BUILD/dballe-7.9-1/dballe'
make[2]: *** [all] Error 2
With bufr2json --format=dballe, the "[" at the beginning, the "]" at the end and the "," separators in the middle are missing.
dbamsg dump --json synop.bufr
{"version":"0.1","network":"synop","ident":null,"lon":1070000,"lat":4420000,"date":"2013-06-01T00:00:00Z","data":[{"timerange":[1,0,21600],"level":[1,null,null,null],"vars":{"B13011":{"v":0.0,"a":{"B07032":2.00}}}},{"timerange":[1,0,86400],"level":[1,null,null,null],"vars":{"B13011":{"v":0.0,"a":{"B07032":2.00}}}},{"timerange":[4,0,10800],"level":[1,null,null,null],"vars":{"B10060":{"v":-20,"a":{"B07031":2173.0}}}},{"timerange":[4,0,86400],"level":[1,null,null,null],"vars":{"B10060":{"v":100,"a":{"B07031":2173.0}}}},{"timerange":[205,0,10800],"level":[1,null,null,null],"vars":{"B10063":{"v":8,"a":{"B07031":2173.0}}}},{"timerange":[254,0,0],"level":[1,null,null,null],"vars":{"B10004":{"v":77230,"a":{"B07031":2173.0}},"B20001":{"v":30000,"a":{"B07032":2.00}}}},{"timerange":[254,0,0],"level":[100,85000,null,null],"vars":{"B10008":{"v":13651,"a":{}}}},{"timerange":[254,0,0],"level":[103,2000,null,null],"vars":{"B12101":{"v":276.15,"a":{}},"B12103":{"v":271.02,"a":{}},"B13003":{"v":69,"a":{}}}},{"timerange":[254,0,0],"level":[103,10000,null,null],"vars":{"B11001":{"v":350,"a":{"B07032":2.00}},"B11002":{"v":9.3,"a":{"B07032":2.00}}}},{"timerange":[254,0,0],"level":[256,null,258,0],"vars":{"B08002":{"v":8,"a":{}},"B20011":{"v":6,"a":{}},"B20013":{"v":1000,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,258,1],"vars":{"B20012":{"v":30,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,258,2],"vars":{"B20012":{"v":21,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,258,3],"vars":{"B20012":{"v":10,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,259,1],"vars":{"B08002":{"v":1,"a":{}},"B20011":{"v":6,"a":{}},"B20012":{"v":4,"a":{}},"B20013":{"v":1000,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,null,null],"vars":{"B20010":{"v":75,"a":{}}}},{"vars":{"B01001":{"v":16,"a":{}},"B01002":{"v":134,"a":{}},"B01019":{"v":"MONTE_CIMONE","a":{}},"B02001":{"v":2,"a":{}},"B02002":{"v":12,"a":{}},"B05001":{"v":44.20000,"a":{}},"B06001":{"v":10.70000,"a":{}},"B07030":{"v":2165.0,"a":{}},"B07031":{"v":2173.0,"a":{}}}}]}
{"version":"0.1","network":"synop","ident":null,"lon":1185000,"lat":4346667,"date":"2013-06-01T00:00:00Z","data":[{"timerange":[1,0,21600],"level":[1,null,null,null],"vars":{"B13011":{"v":0.0,"a":{"B07032":2.00}}}},{"timerange":[1,0,86400],"level":[1,null,null,null],"vars":{"B13011":{"v":9.0,"a":{"B07032":2.00}}}},{"timerange":[4,0,10800],"level":[1,null,null,null],"vars":{"B10060":{"v":0,"a":{"B07031":249.0}}}},{"timerange":[4,0,86400],"level":[1,null,null,null],"vars":{"B10060":{"v":-10,"a":{"B07031":249.0}}}},{"timerange":[205,0,10800],"level":[1,null,null,null],"vars":{"B10063":{"v":4,"a":{"B07031":249.0}}}},{"timerange":[254,0,0],"level":[1,null,null,null],"vars":{"B10004":{"v":97600,"a":{"B07031":249.0}},"B20001":{"v":14000,"a":{"B07032":2.00}}}},{"timerange":[254,0,0],"level":[101,null,null,null],"vars":{"B10051":{"v":100540,"a":{"B07031":249.0}}}},{"timerange":[254,0,0],"level":[103,2000,null,null],"vars":{"B12101":{"v":285.35,"a":{}},"B12103":{"v":282.74,"a":{}},"B13003":{"v":84,"a":{}}}},{"timerange":[254,0,0],"level":[103,10000,null,null],"vars":{"B11001":{"v":0,"a":{"B07032":2.00}},"B11002":{"v":0.0,"a":{"B07032":2.00}}}},{"timerange":[254,0,0],"level":[256,null,258,0],"vars":{"B08002":{"v":7,"a":{}},"B20011":{"v":7,"a":{}},"B20013":{"v":840,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,258,1],"vars":{"B20012":{"v":35,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,258,2],"vars":{"B20012":{"v":20,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,258,3],"vars":{"B20012":{"v":10,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,259,1],"vars":{"B08002":{"v":1,"a":{}},"B20011":{"v":7,"a":{}},"B20012":{"v":6,"a":{}},"B20013":{"v":840,"a":{}}}},{"timerange":[254,0,0],"level":[256,null,null,null],"vars":{"B20010":{"v":88,"a":{}}}},{"vars":{"B01001":{"v":16,"a":{}},"B01002":{"v":172,"a":{}},"B01019":{"v":"AREZZO","a":{}},"B02001":{"v":2,"a":{}},"B02002":{"v":12,"a":{}},"B05001":{"v":43.46667,"a":{}},"B06001":{"v":11.85000,"a":{}},"B07030":{"v":248.0,"a":{}},"B07031":{"v":249.0,"a":{}}}}]}
One of the initial requirements was that the DB-All.e documentation be in LaTeX. However, this means that building DB-All.e requires having all of LaTeX installed (and the LaTeX distribution needed to build what doxygen produces is rather large), and at the moment we are not using formulas or anything else LaTeX excels at. On top of that, for HTML generation we are currently using latex2html, which is not free software.
Is it possible to relax this requirement and move the documentation to a lighter format such as Markdown or reStructuredText?
With commit 9fd88f8 I finished implementing logging to stderr of the analysis of the queries made to the database.
If you run a program that uses DB-All.e with the environment variable DBA_EXPLAIN=1 set, stderr will show all the queries made to the database, the corresponding DB-All.e query parameters, and the database's own analysis of each query.
Would it be possible to run a few significant procedures actually used in production with DBA_EXPLAIN=1, and send me their standard error?
I would like to use this information to check whether the database (and its indexes) have been structured in a way that actually matches real usage needs.
dbamsg dump --interpreted
#0[0] generic message with 1 contexts:
Level -,-,-,- tr -,-,- 5 vars:
001011 SHIP OR MOBILE LAND STATION IDENTIFIER(CCITTIA5): dancast78
001194 [SIM] Report mnemonic(CCITTIA5): rmap
001213 AIRBASE AIR QUALITY OBSERVING STATION CODE(CCITTIA5): conn
005001 LATITUDE (HIGH ACCURACY)(DEGREE): 44.69950
006001 LONGITUDE (HIGH ACCURACY)(DEGREE): 10.64555
mqtt2bufr -t rmap/dancast78/#|dbamsg dump
#0 BUFR message: 112 bytes, origin 200:0, category 0 255:255:0, bufr edition 4, tables 14:1, subsets 1, values: 11/11:
Subset 0:
001194 [SIM] Report mnemonic(CCITTIA5): rmap
004001 YEAR(YEAR): 2016
004002 MONTH(MONTH): 1
004003 DAY(DAY): 29
004004 HOUR(HOUR): 12
004005 MINUTE(MINUTE): 14
004006 SECOND(SECOND): 55
001011 SHIP OR MOBILE LAND STATION IDENTIFIER(CCITTIA5): dancast78
001213 AIRBASE AIR QUALITY OBSERVING STATION CODE(CCITTIA5): conn
005001 LATITUDE (HIGH ACCURACY)(DEGREE): 44.69950
006001 LONGITUDE (HIGH ACCURACY)(DEGREE): 10.64555
Starting a web server from top src dir:
$ python -m SimpleHTTPServer 8000
The code sometimes raises KeyError: 'could not detect the encoding of ' or OSError: reading a 5505024-bytes record from : Illegal seek.
# test.py
import dballe
from urllib2 import urlopen
from glob import glob

db = dballe.DB.connect_from_file("/tmp/buttami.db")
db.reset()
for f in glob("extra/bufr/*.bufr"):
    r = urlopen("http://localhost:8000/{}".format(f))
    db.load(r)
Inspecting with gdb, it seems that sometimes the encoding is not detected: the line int c = getc(stream); returns 255 (encoding not detected) or 0 (which creates an AOF file):
$ gdb python
(gdb) l dballe/file.cc:98
93 if (c == EOF)
94 return create(BUFR, st.release(), close_on_exit, name);
95
96 if (ungetc(c, stream) == EOF)
97 error_system::throwf("cannot put the first byte of %s back into the input stream", name.c_str());
98
99 switch (c)
100 {
101 case 'B': return create(BUFR, st.release(), close_on_exit, name);
102 case 'C': return create(CREX, st.release(), close_on_exit, name);
(gdb) b dballe/file.cc:98
(gdb) r test.py
Starting program: /usr/bin/python test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, dballe::File::create (stream=0x7fbc60, close_on_exit=close_on_exit@entry=true, name="") at file.cc:99
99 switch (c)
(gdb) p c
$1 = 0
How to reproduce:
dbadb wipe --dsn=sqlite:/tmp/test.sqlite
[pat1@asus-pat1 ~]$ dbadb export rep_memo=mnw --dsn=sqlite:/tmp/test.sqlite
looking for repinfo corresponding to 'mnw'
[pat1@asus-pat1 ~]$ echo $?
1
In applications like Borinud this stops processing, whereas the expected behaviour is simply that no data is returned.
In dballe.txt we have two entries for sea temperature, B22042 and B22043, with different precision, historically used in different templates. Since B22043 is the reference variable internally used by dballe when interpreting BUFR, can we drop B22042 from dballe.txt to avoid confusion and leave it only in the wreport tables?
Since I opened a branch for modifying dballe.txt, I can add this modification to the branch together with the others before making a PR.
I attach 2 ECMWF BUFR files using variable B22042, if needed for testing.
ship_bufr.zip
Importing and re-exporting extra/bufr/synop-rad2.bufr does not currently seem to give the same file.
When queries are made (voglioquesto, dimenticami), is context_id ever used?
If it is never used, I would like to declare the feature officially unsupported.
If I try to delete with "dbadb delete", I get this error:
dbadb delete attr_filter="B33007<50"
--dsn='mysql://localhost:3306/soglie?user=vpavan&password=80qwfwq'
cannot execute 'DELETE FROM data WHERE id IN ()':You have an error in
your SQL syntax; check the manual that corresponds to your MariaDB
server version for the right syntax to use near ')' at line 1
What can I say?
Thanks and bye.
Andrea
In branch devel:
dbadb import -t json --dsn=$MYDSN < in.json
doesn't import the data.
following ARPA-SIMC/libsim#15
I suggest removing username and password from all APIs.
I can start doing this in libsim; when ready, it should be done in dballe too.
While reviewing the code I noticed inconsistent behaviour in the ordering of voglioquesto/dammelo results, and I added tests to verify it. The current state is:
memdb database:
SQL databases:
Ordering by ana_id in the SQL databases only means that the data are grouped by coordinates and ident, but the order of the groups is not defined.
In theory, the ordering in the SQL database case should be fixed by swapping report and varcode.
In practice, though, since maintaining these orderings is costly in terms of performance, and everything is currently working with sloppier orderings, I would like to redefine the ordering requirements as something that lets the software keep working while leaving DB-All.e implementations as much freedom as possible.
For example, we could distinguish grouping from ordering, and say that the data are grouped by (coordinates, ident) rather than ordered by (coordinates, ident).
The same goes for level and timerange: is an ordering needed, or is grouping enough? If grouping, do we need to group by level and then, within the same level, group by timerange, or is it enough to group by unique combinations of level and timerange?
As for ordering/grouping by report, how is it used? The original idea was to have, for each variable at a given point and instant, all its values report by report; but since the SQL implementation currently breaks this constraint, I wonder: is the constraint needed, or could I instead, for example, group everything by (report, coordinates, ident) rather than by (coordinates, ident)?
How far can all this be relaxed? Can we decide that the data are returned in no particular order? What is the minimum grouping/ordering requirement needed for the programs that use DB-All.e to work?
Remove core::Query::data_id and, in cascade, everything that uses it, plus the related tests: they are not used (#17)
From libsim examples I get this trace:
// ** Execution begins **
auto_ptr<DB> db0(DB::connect_from_url("sqlite:/tmp/dballe.sqlite"));
DbAPI dbapi0(*db0, "write", "write", "write");
dbapi0.scopa();
MsgAPI msgapi1("/dev/null", "w", BUFR);
// msgapi1 not used anymore
dbapi0.unsetall();
dbapi0.seti("lat", 4500000);
dbapi0.seti("lon", 1000000);
dbapi0.unset("ident");
dbapi0.unset("mobile");
dbapi0.setc("rep_memo", "generic");
dbapi0.setdate(2014, 1, 6, 18, 0, 0);
dbapi0.setlevel(105, 2000, 2147483647, 2147483647);
dbapi0.settimerange(4, 3600, 7200);
dbapi0.seti("B13003", 85);
dbapi0.prendilo();
dbapi0.setd("*B33192", 30.000000);
dbapi0.seti("*B33193", 50);
dbapi0.setd("*B33194", 70.000000);
dbapi0.critica();
dbapi0.seti("B12101", 27315);
dbapi0.prendilo();
dbapi0.setd("*B33192", 30.000000);
dbapi0.seti("*B33193", 50);
dbapi0.critica();
// error: cannot insert attributes for variable 000000: no data id given or found from last prendilo()
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff74d89c1 in dballe::fortran::Handler<HSimple, 50>::get (this=0x7ffff76dc860 <hsimp>, id=2147483647) at handles.h:109
109 assert(records[id].used);
(gdb) where
#0 0x00007ffff74d89c1 in dballe::fortran::Handler<HSimple, 50>::get (this=0x7ffff76dc860 <hsimp>, id=2147483647) at handles.h:109
#1 0x00007ffff74d7aa7 in idba_messages_open_input_ (handle=0x7fffffffaba0, filename=0x426c77 "example_dballe.bufr",
mode=0x7fffffffa640 "r", ' ' <repeats 39 times>, "\330\344p\367\377\177",
format=0x7fffffffa670 "BUFR", ' ' <repeats 36 times>, "\220\256\377\377\377\177", simplified=0x7fffffffa63c, filename_length=19, mode_length=40,
format_length=40) at binding.cc:1778
#2 0x00007ffff7a19c6e in dballe_class::dbasession_messages_open_input (session=..., filename='example_dballe.bufr', mode='r',
format='BUFR', ' ' <repeats 36 times>, simplified=.TRUE., _filename=19, _mode=1, _format=40) at dballe_class.F03:1132
#3 0x00007ffff79f7aa5 in dballe_class::dbasession_init (connection=..., anaflag="", dataflag="", attrflag="", filename='example_dballe.bufr',
mode='r', format='BUFR', template="", write=<error reading variable: Cannot access memory at address 0x0>, wipe=.TRUE., repinfo="",
simplified=<error reading variable: Cannot access memory at address 0x0>, memdb=.TRUE., loadfile=.FALSE., categoryappend="", _anaflag=0,
_dataflag=0, _attrflag=0, _filename=19, _mode=1, _format=4, _template=0, _repinfo=0, _categoryappend=0) at dballe_class.F03:3893
#4 0x000000000040ea2e in readmem () at example_dballe.F03:824
#5 0x00000000004069ab in example_dballe () at example_dballe.F03:177
#6 0x000000000042657c in main (argc=1, argv=0x7fffffffdc7c) at example_dballe.F03:3
#7 0x0000003859a21d65 in __libc_start_main (main=0x426548 <main>, argc=1, argv=0x7fffffffd838, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7fffffffd828) at libc-start.c:285
#8 0x0000000000403659 in _start ()
(gdb)
export DBALLE_TRACE_FORTRAN=tmp.log
produces an empty file
In branch devel, when dbadb import
or dbadb export
are executed, the report name is forced to an empty string.
The bug was introduced in commit 9aa1e53 (see removed function dballe::cmdline::dbadb::parse_op_report).
I changed the structure of the SQL transactions in DB-All.e, and the performance of dbadb import should now be significantly better; please let me know if you notice anything and, if it has improved, by how much.
If it has improved significantly, I would like to make everything between a preparati and a fatto run inside a single transaction. This would have a couple of effects worth discussing, but I'd rather do that only after measuring how much dbadb changes.
Importing and re-exporting extra/bufr/gts-amdar2.bufr does not currently seem to give the same file.
In msg.cc, lon and lat are written to CSV files (and possibly other text formats) using setprecision(5), thus with 5 significant digits overall, e.g. 11.572, while 5 significant decimal digits are needed, e.g. 11.57293, the equivalent of the %.5f C format; I don't know how to do that with C++ iostream.
Reference:
http://www.cplusplus.com/reference/iomanip/setprecision/
I can't find any documentation
dbamsg convert -t csv -d bufr
does not take the CSV header into account when reading positional CSV fields
In the devel branch, the labels (stdin) and (stdout) were removed, but from the Fortran bindings it is not possible to read from stdin or write to stdout (see idba_messages_open_input and idba_messages_open_output).
// ** Execution begins **
MsgAPI msgapi0("SMRERSOUNDLMM.2010012712.bufr", "w", BUFR);
msgapi0.unsetall();
msgapi0.setcontextana();
msgapi0.setc("rep_memo", "temp");
msgapi0.setd("lat", 45.027700);
msgapi0.setd("lon", 9.666700);
msgapi0.seti("mobile", 0);
msgapi0.seti("block", 0);
msgapi0.seti("station", 101);
msgapi0.prendilo();
// error: no year information found in message to import
Importing and re-exporting extra/bufr/gts-amdar1.bufr does not currently seem to give the same file.
Importing and re-exporting extra/bufr/pilot-gts1.bufr does not currently seem to give the same file.
Importing and re-exporting temp-timesig18.bufr does not currently seem to give the same file.
$ ls
archive2013.csv
$ cat archive2013.csv |dbamsg convert -t csv -d bufr
$ ls -lrt
-rw-rw-r-- 1 ppatruno ppatruno 116630106 7 feb 10.25 archive2013.csv
-rw-rw-r-- 1 ppatruno ppatruno 74155868 9 feb 12.06 (stdout)
so it writes to a file that it names "(stdout)"
rpm -q dballe
dballe-7.7-2.x86_64
dbadb import -t json --wipe-first --dsn=sqlite:/dev/shm/tmp.sqlite tmp.json
Invalid JSON value
cat tmp.json
[{"ident": null, "network": "arpav", "lon": 1187637, "lat": 4649926, "date": "2016-01-01T01:00:00Z", "data": [{"vars": {"B01019": {"v": "3 Arabba" },"B07030": {"v": 1645.0},"B07031": {"v": 1645.0} } }, {"timerange": [1,0,3600],"vars": {"B13011": { "a": { }, "v": 0.0 } },"level": [ 1,null,null,null]} ] },
{"ident": null, "network": "arpav", "lon": 1187637, "lat": 4649926, "date": "2016-01-01T02:00:00Z", "data": [{"vars": {"B01019": {"v": "3 Arabba" },"B07030": {"v": 1645.0},"B07031": {"v": 1645.0} } }, {"timerange": [1,0,3600],"vars": {"B13011": { "a": { }, "v": 0.0 } },"level": [ 1,null,null,null]} ] }]
dbadb import -t json --wipe-first --dsn=sqlite:/dev/shm/tmp.sqlite tmp.json
unexpected character
cat tmp.json
{"ident": null, "network": "arpav", "lon": 1187637, "lat": 4649926, "date": "2016-01-01T01:00:00Z", "data": [{"vars": {"B01019": {"v": "3 Arabba" },"B07030": {"v": 1645.0},"B07031": {"v": 1645.0} } }, {"timerange": [1,0,3600],"vars": {"B13011": { "a": { }, "v": 0.0 } },"level": [ 1,null,null,null]} ] },
{"ident": null, "network": "arpav", "lon": 1187637, "lat": 4649926, "date": "2016-01-01T02:00:00Z", "data": [{"vars": {"B01019": {"v": "3 Arabba" },"B07030": {"v": 1645.0},"B07031": {"v": 1645.0} } }, {"timerange": [1,0,3600],"vars": {"B13011": { "a": { }, "v": 0.0 } },"level": [ 1,null,null,null]} ] }
I created a new experimental database format, so far implemented only for sqlite and postgresql, which should speed things up a bit at least for PostgreSQL, since it tries to do everything with fewer queries.
By default dballe always uses the stable format. To try the new one, export DBA_DB_FORMAT=V7 before creating a new database.
The V7 database behaves like memdb, so one ana_id corresponds to a single combination of (lat, lon, ident, rep_memo) instead of (lat, lon, ident) as in the current format.
I'm opening this issue to track the tests that get done.
dbadb export --dsn=rmap lat=44.65305 lon=11.62301 month=04 year=2016 >richardson_aprile.bufr
write to temporary file was interrupted
dbadb import -t json --wipe-first --dsn=sqlite:/dev/shm/tmp.sqlite tmp.json
looking for repinfo corresponding to 'ARPAV'
with tmp.json:
{
"ident": null,
"network": "ARPAV",
"lon": 1187637,
"lat": 4649926,
"date": "2016-01-01T01:00:00Z",
"data": [
{
"vars": {
"B01019": {
"v": "3 Arabba"
},
"B07030": {
"v": 1645.0
},
"B07031": {
"v": 1645.0
}
}
},
{
"timerange": [
1,
0,
3600
],
"vars": {
"B13011": {
"a": {
},
"v": 0.0
}
},
"level": [
1,
null,
null,
null
]
}
]
}
the problem is the same with a DB initialized with this repinfo.csv:
1,synop,synop , 1,oss, 255
50,ARPAV,ARPAV , 50,oss, 255
255,generic,export generici da db meteo,1000,?,255
with "network": "arpav" everything works well
Would it be hard work to implement full interpretation of a BUFR message like extra/bufr/test-soil1.bufr? I mean interpreting context B07061 as depth below land surface (leveltype 106 according to fapi_ltypes.md) and placing the temperature at that level. At the moment interpretation is limited to station data:
$ dbamsg dump --interpreted ../../../extra/bufr/test-soil1.bufr
#0[0] synop message with 1 contexts:
Level -,-,-,- tr -,-,- 6 vars:
001001 WMO BLOCK NUMBER(NUMERIC): 11
001002 WMO STATION NUMBER(NUMERIC): 406
002001 TYPE OF STATION(CODE TABLE): 0
005001 LATITUDE (HIGH ACCURACY)(DEGREE): 50.06972
006001 LONGITUDE (HIGH ACCURACY)(DEGREE): 12.39306
007030 HEIGHT OF STATION GROUND ABOVE MEAN SEA LEVEL (SEE NOTE 3)(M): 483.0
I would like to remove ODBC support. It is slow and buggier than direct connection methods, and currently only useful to use Oracle as a SQL database, which is untested and, as far as I understand, unneeded.
I'm opening this issue to track what is still using ODBC support in DB-All.e. If by the 18th of April 2016 there is no notice of anything using it, I will remove the relevant code.
Data can be imported (dbadb import) and inserted (prendilo).
I recall there was talk of what should happen to already-existing attributes in those two cases, but I don't recall anything ever being defined.
The current behaviour is:
I vaguely remember a request along these lines:
Before moving on to study more efficient attribute-management strategies, I would like to make this specification of their behaviour final.
It's already broken and the namespace conflicts with the working https://github.com/ARPA-SIMC/provami
The definition of PM1 in btable was made for Chimere output (as implemented at SIMC to date):
015203 [SIM] PM1 Concentration (tot. aerosol < 1.25 ug)
Beware, it is not exact. If it is to be used for observations (which already exist for some monitoring stations), a different one will have to be introduced:
xxxxxx [SIM] PM1 Concentration (tot. aerosol < 1 ug)
and perhaps the current one redefined more accurately as:
015203 [SIM] PM1.25 Concentration (tot. aerosol < 1.25 ug)
We noticed that the import performance of memdb, used in libsim to import BUFR files, degrades severely and non-linearly as the BUFR file size grows: it gets to 17 minutes for a file of ~1.5MB and ~7K messages. Importing into a "real" database, e.g. sqlite, does not show the problem. Example to reproduce:
wget ftp://ftp.smr.arpa.emr.it/incoming/dav/arpapiemonte/common20160229
for n in 10 50 100 500 1000 5000; do dbamsg cat --index=1-$n common20160229 > testbufr_$n; done
for file in testbufr_* ; do echo $file; time dbadb import --dsn=mem: $file; done
A similar result in terms of performance is obtained when importing with libsim's v7d_transform; a rough on-the-fly profiling with perf top shows:
Samples: 136K of event 'cycles', Event count (approx.): 57059249544
Overhead Shared Object Symbol
61,01% libstdc++.so.6.0.19 [.] std::_Rb_tree_increment
7,93% libdballe.so.7.0.3 [.] dballe::stl::stlutils::Itersection<unsigned long>::sync_iters
3,80% libdballe.so.7.0.3 [.] dballe::stl::stlutils::SequenceIters<std::_Rb_tree_const_iterator<unsigned long> >::next
2,43% libdballe.so.7.0.3 [.] dballe::stl::stlutils::SequenceIters<std::_Rb_tree_const_iterator<unsigned long> >::valid
0,99% [vdso] [.] __vdso_gettimeofday
0,90% libdballe.so.7.0.3 [.] dballe::stl::stlutils::SequenceIters<std::_Rb_tree_const_iterator<unsigned long> >::get
while export DBA_PROFILE=Y adds nothing. Are we making some procedural mistake, or is this just how things are?
If I run: ./run-check -C dballe TEST_WHITELIST="core_json*"
then several tests have now started to fail:
core_json: .xxxx..
core_json.bool: value 'truefalse' is different than the expected 'false'
core/json-test.cc:34:actual(out.str()) == "false"
core_json.int: value '1-1234567' is different than the expected '-1234567'
core/json-test.cc:46:actual(out.str()) == "-1234567"
core_json.double: value '1.100000-1.100000' is different than the expected '-1.100000'
core/json-test.cc:57:actual(out.str()) == "-1.100000"
core_json.string: value '"""antani"' is different than the expected '"antani"'
core/json-test.cc:76:actual(out.str()) == "\"antani\""
4/7 tests failed
$ python3 --version
Python 3.3.2
$ python3 -c "import dballe; print(dir(dballe))"
['__doc__', '__initializing__', '__loader__', '__name__', '__package__', '__path__']
$ python3 --version
Python 3.4.3+
$ python3 -c "import dballe; print(dir(dballe))"
['Cursor', 'DB', 'Record', 'Var', 'Varinfo', 'Vartable', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '_dballe', 'absolute_import', 'describe_level', 'describe_trange', 'var', 'varinfo', 'wreport']
Reading BUFRs from the attached file
bufr.zip
returns:
messages_read_next();
// error: date/time informations not found (or incomplete) in message to insert
with this trace:
dbapi5.messages_open_input("example_dballe.bufr", "r", BUFR, true);
ires = dbapi5.messages_read_next();
wassert(actual(ires) == 1);
dbapi5.unsetall();
dbapi5.unset("lat");
dbapi5.unset("lon");
dbapi5.unset("ident");
dbapi5.unset("mobile");
dbapi5.unset("rep_memo");
dbapi5.setc("var", "B12101");
dbapi5.unset("limit");
dbapi5.unset("priority");
dbapi5.unset("priomin");
dbapi5.unset("priomax");
dbapi5.unset("latmin");
dbapi5.unset("lonmin");
dbapi5.unset("latmax");
dbapi5.unset("lonmax");
dbapi5.unset("ana_filter");
dbapi5.unset("data_filter");
dbapi5.unset("attr_filter");
dbapi5.unset("query");
dbapi5.unset("yearmin");
dbapi5.unset("monthmin");
dbapi5.unset("daymin");
dbapi5.unset("hourmin");
dbapi5.unset("minumin");
dbapi5.unset("secmin");
dbapi5.unset("yearmax");
dbapi5.unset("monthmax");
dbapi5.unset("daymax");
dbapi5.unset("hourmax");
dbapi5.unset("minumax");
dbapi5.unset("secmax");
dbapi5.unset("varlist");
dbapi5.unset("*varlist");
ires = dbapi5.voglioquesto();
wassert(actual(ires) == 1);
sres = dbapi5.dammelo();
wassert(actual(sres) == "B12101");
ires = dbapi5.voglioquesto();
wassert(actual(ires) == 1);
sres = dbapi5.dammelo();
wassert(actual(sres) == "B12101");
MsgAPI msgapi6("/dev/null", "w", BUFR);
// msgapi6 not used anymore
dbapi5.remove_all();
ires = dbapi5.messages_read_next();
// error: date/time informations not found (or incomplete) in message to insert
https://github.com/ARPA-SIMC/dballe/blob/master/dballe/core/record-test.cc#L187-L189
core_record.get_set_level: value '1,0,-,-' is different than the expected '1,-,-,-'
core/record-test.cc:189:actual(rec.get_level()) == Level(1)
start with this command:
dbadb import -t json --wipe-first --dsn=sqlite:/dev/shm/arpav.sqlite Dati_ARPAV_json_rmap_20151201-20151231.txt
executing COMMIT:database is locked
on another terminal:
provami-qt sqlite:///dev/shm/arpav.sqlite
setNativeLocks failed: Risorsa temporaneamente non disponibile
setNativeLocks failed: Risorsa temporaneamente non disponibile
progress "data" : new task: "Loading data..."
progress "summary" : new task: "Loading summary..."
progress "summary" : task update: "Loading summary from db..."
progress "data" : task update: "Processing data..."
progress "data" : task ends
progress "summary" : task update: "Processing summary..."
Refresh summary results arrived
"undefined:0: TypeError: undefined is not a function"
Summary collation started
process_summary
update stations
"undefined:1: ReferenceError: Can't find variable: set_stations"
"undefined:0: TypeError: undefined is not a function"
progress "summary" : task ends
Is this expected with sqlite?
Add an input/output JSON format, e.g.:
{
"ident": null,
"network": "rer",
"lon": 915454,
"lat": 4451485,
"date": "2015-07-30T15:30:00Z",
"data": [
{
"vars": {
"B01019": {
"v": "Torriglia"
},
"B07030": {
"v": 769.0
},
"B07031": {
"v": 769.0
}
}
},
{
"timerange": [
1,
0,
3600
],
"vars": {
"B13011": {
"a": {
},
"v": 0.0
}
},
"level": [
1,
null,
null,
null
]
},
{
"timerange": [
254,
0,
0
],
"vars": {
"B12101": {
"a": {
},
"v": 297.15
},
"B13003": {
"a": {
},
"v": 50
}
},
"level": [
103,
2000,
null,
null
]
}
]
}
Issue reported by Massimo Bider.
The output of dbamsg dump --csv changed in 7.2:
edition is now master_table_number
"representative_time","2015-5-27 9:0:0" was date,2015-05-27 09:00:00