demiangomez / parallel.gamit
Python wrapper to parallelize GAMIT executions
License: GNU General Public License v3.0
Also allow selecting a date interval. The frame should be specified by hand, given that some orbit files have the wrong frame in their headers.
When invoking a job server for parallel python, a "check node" object should handle the job creation and verification that each node in the cluster has all the necessary dependencies to run. If the node doesn't have all the necessary programs/dependencies, remove the node from the cluster and continue execution without it.
Whenever a modification is performed to a station information record, it would be good to create an event that shows network, station, old and new station info record.
Mechanical offsets between measured height-of-instrument (HI) and the antenna reference point (ARP) for SEPPOLANT_X_MF
When it gets a new station that has a "problem" (too close to more than one station, etc.), and ends up in the data_rejected directory, it would be nice if the error message had the psql command to add the station to the database, or at least printed out the info one needed (xyz, lat/lon/ht) to add it. (the problem may also be a rename, so that command might also be useful in the error message).
It would be nice if the rejected folder had some subfolders based on why the file was rejected: at least one subfolder for "failed to find location after 6 tries", one for "confused with another station(s)", and possibly one for "other" (for now; this could be broken down further as other specific problems arise).
Put a newline or carriage return at the end of the log files so the prompt shows up on a new line.
PPP reports that no station info was found when the database says there is station info. The problem arises when the station info start/end is given to the second, or falls within a second of the RINEX start/end.
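One possible way to make such a comparison robust is to allow a small tolerance around the record boundaries. This is only a sketch: the function name, its arguments, and the one-second default are assumptions for illustration, not the project's actual station info lookup.

```python
from datetime import datetime, timedelta


def record_covers(record_start, record_end, obs_start, obs_end,
                  tolerance=timedelta(seconds=1)):
    """Return True if the observation window falls inside the record window.

    record_end may be None for an open-ended record. The tolerance is an
    assumed value (not a project setting) that absorbs differences of up
    to one second at either boundary.
    """
    starts_ok = record_start - tolerance <= obs_start
    ends_ok = record_end is None or obs_end <= record_end + tolerance
    return starts_ok and ends_ok
```

With a zero tolerance, a RINEX file that starts half a second before the record would be reported as having no station info; with the default it is accepted.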
Change in the "NetworkCode" to make it more useful for multiple applications of database.
Make "NetworkCode" meaningless to the end user. There could be a base network code, say n00, where all new sites with unique names go. Duplicate names go into networks n01, n02, etc. as such sites come into the database.
Second network code where user can organize groups of sites based on need or preference.
Example - IGN and CAP use same underlying database but can organize the interface how they want/need (IGN can have IGS, SIR, RMS, etc. networks, while CAP can have ARG, ARS, CHI, CHS, PIF, MAU, PIS, ...)
So any given site could be in multiple "networks". Need to be careful when deleting sites: if a site is in multiple application networks, it does not get touched at the nXX level. If it is a unique site, send a message. Or always send a message about the site's affinities.
This is an expansion of the missing new line at the end of log files issue (labeled as bug) from Dec.
Newlines are missing at the end of many of the error messages from IntegrityCheck (and probably ScanArchive and ArchiveService). It makes it hard to read and process automatically.
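A minimal sketch of one way to guarantee the trailing newline, assuming messages are appended to a log file through a small helper. Both function names are hypothetical and not part of the project.

```python
def ensure_trailing_newline(message):
    """Return message with exactly one newline at the end."""
    return message.rstrip("\r\n") + "\n"


def append_log(path, message):
    # Hypothetical helper: append one message so that the next message
    # (or the shell prompt after `cat`) starts on a new line.
    with open(path, "a") as handle:
        handle.write(ensure_trailing_newline(message))
```

Normalizing at the point of writing means each caller does not have to remember to add the newline itself.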
Run pep8 on everything.
When checking for gaps, report the total number of RINEX files and the total number of missing files (between the reported start and stop dates).
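The requested totals could be computed along these lines. This is a sketch under the assumption of one RINEX file per station per day; the function name and the returned dictionary keys are illustrative only.

```python
from datetime import date


def gap_summary(observed_days):
    """Summarize daily coverage between the first and last observed day.

    observed_days: iterable of datetime.date, one per RINEX file found.
    Assumes one file per day (an assumption, not a project rule).
    """
    days = sorted(set(observed_days))
    if not days:
        return {"start": None, "stop": None, "total": 0, "missing": 0}
    start, stop = days[0], days[-1]
    expected = (stop - start).days + 1  # inclusive span in days
    return {"start": start, "stop": stop,
            "total": len(days), "missing": expected - len(days)}
```

For example, files on 2010-01-01, 2010-01-02 and 2010-01-05 give a total of 3 files and 2 missing days between the reported start and stop dates.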
Started with ~1100 files missing from an IGS continuous station (it got our attention because there was an entry in one of the ? directories, but no RINEX in the repository tree and no locks). Got the 1100 files by looking at data in the OSU archive and moving the ones not in the PG archive into data_in.
Ran pyArchiveService.py on these 1100 files. About half were moved to the archive (or at least disappeared from the repository directory tree). A handful ended up in data_retry_in and data_rejected, and the remaining ones are still in data_in (the number of files in the archive grew by 500).
There were no locked files and probably a single error in the error message file (errors_pyArchiveService.log in /Volumes/UsersDrive/Users/smalley/Working.Parallel.GAMIT/run_dir on capybara). I think each run of pyArchiveService.py generates one error.
Re-running pyArchiveService.sh started by reporting 120 files (ls | wc shows ~575, and the number of files in the archive is constant), and on each iteration the number of files reported drops by 5-7 (a few times it dropped by more). I think it added one line to the error file on each run (see screen output).
So: files in data_in are not going anywhere (from ls), but are "disappearing" from processing by pyArchiveService.py, with no locks and no errors.
When the problem is related to no PPP solutions due to outliers.
Instead of writing the relevant information about network-station, year, doy, etc in the EventDescription field, add independent fields (that can be set to NULL) to facilitate searching events. Also, add another event description (besides type = warn, info and error).
-Fetches and installs dependencies
-Add a dev flag which also installs a DB and sets the program up for testing.
When starting a program, stop execution if there are no nodes found in dispy
Also, check what happens with multihomed computers when creating a cluster. Seems like PG is not finding nodes connected to secondary ethernet port.
Some sp3 files have the wrong frame in their headers. This produces coordinates in the ppp_soln with a frame that does not correspond to the actual frame of the orbits. The frame should be declared in the gnss_data.cfg rather than being read from the sp3 files.
When looking at error statistics it would be handy if the errors had a number. Each error is "unique" in the sense that the file name, and other details are unique, so it is hard to find out the kinds of errors and how many of each kind there are. If the errors had numbers and there was an error number table one could quickly find specific types of errors.
Either in the events table or maybe add a new field to the ppp_soln table. We should save when the PPP coordinate was generated to make sure that the coordinate is updated after a metadata change.
Add a switch to run only on the local machine; it would still be parallel, but would not try to go over the network to other machines.
There are 3 sites named corr in Argentina (1 cap and 1 saga from 1993, and 1 from a LISN project [it may be closed]).
I put RINEX files that I thought were all from the cap corr site (they were in the cap folder of the osu database) into data_in and ran ArchiveService.
Two of the files were actually from the saga site. They stayed in data_in and got associated with the ??? network, but there were no messages anywhere saying there was a duplicate name for a new station. They show up with a very different lat/lon in the database, but otherwise no information.
Error message from an ArchiveService run; there should be a PDF attached with the NRCAN solution.
It says the problem is with the sampling interval.
RINEX sampling interval could not be determined. The output from RinSum was:
+++++++++++++ RinSum summary of Rinex obs file production/rinex/6e271835-0489-4a0f-9e9a-dde6b5be02d3/lo101310.00o +++++++++++++
Warning : Failed to read header: text 0:Unidentified label: >ANTENNA: DEL<
text 1:In record 0
text 2:In file production/rinex/6e271835-0489-4a0f-9e9a-dde6b5be02d3/lo101310.00o
text 3:Near file line 12
location 0:/Volumes/UsersDrive/moved_user/smalley/Downloads/gpstk-2.5.src/dev/ext/lib/FileHandling/RINEX3/Rinex3ObsHeader.cpp:1425
location 1:/Volumes/UsersDrive/moved_user/smalley/Downloads/gpstk-2.5.src/dev/ext/lib/FileHandling/RINEX3/Rinex3ObsHeader.cpp:1471
location 2:/Volumes/UsersDrive/moved_user/smalley/Downloads/gpstk-2.5.src/dev/ext/lib/FileHandling/FFStream.cpp:150
location 3:/Volumes/UsersDrive/moved_user/smalley/Downloads/gpstk-2.5.src/dev/ext/lib/FileHandling/FFStream.hpp:184
location 4:/Volumes/UsersDrive/moved_user/smalley/Downloads/gpstk-2.5.src/dev/ext/lib/FileHandling/FFStream.hpp:184
Header dump follows.
---------------------------------- REQUIRED ----------------------------------
Rinex Version 2.00, File type O BSERVATION DATA, System G (GPS).
Prgm: ASHTORIN, Run: 31 - MAY - 00 16:20, By:
Marker type: .
Observer : AO_, Agency:
Rec#: GN-1331, Type: TOPCON GP-R1DY, Vers:
Antenna # : BX-3191, Type :
Position (XYZ,m) : (1911998.2600, -4237323.8800, -4352388.4900).
Antenna Delta (HEN,m) : (0.0000, 0.0000, 0.0000).
Time of first obs -002/12/01 00:00:00.000 UNK
(This header is VALID)
---------------------------------- OPTIONAL ----------------------------------
Marker number :
Comments (3) :
Concatenated from 2 rinex files by RNXCAT on 20 Jul 00
-------------------------------- END OF HEADER --------------------------------
RinSum timing: processing 0.007 sec, wallclock: 0 sec.
/lo101310.00d.Z: (file moved to /Volumes/UsersDrive/repository/data_rejected/bad_rinex/2000/131)
lo101310.pdf
When a multiday RINEX file was already in the database (entered through an old version of pyScanArchive) PPP fails to process it because pyRinex by default bins it into the multiple days. Therefore, an IOError: [Errno 2] No such file or directory exception is thrown when normalizing the header.
See example:
IOError: [Errno 2] No such file or directory: 'production/rinex/8f77b22c-f0ee-4bcc-ab6d-3a4e8593c7d5/tuc12242.10o' processing: rms tuc1 2010 224 using node elvira END OF ERROR ===================
Should deal with these files correctly
This error does not seem to prevent getting an ETM solution or a plot. It occurs once (there are 506 stations in the database, and 500 stations [not counting duplicate names] in the ppp_soln, I've not yet figured out the postgresql command to do the "distinct" using both StationCode and NetworkCode)
/Volumes/Sierra750GB/usr/local/pyconda/lib/python2.7/site-packages/numpy/linalg/linalg.py:1487: RuntimeWarning: divide by zero encountered in true_divide
return s[..., 0]/s[..., -1]
Successfully plotted bra.bomj
There are a number of sites similar to bomj (a sirgas campaign measurement followed years later by continuous operations) but the other ones do not report any errors.
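For the "distinct" count over both StationCode and NetworkCode mentioned above, a query of the following shape should work. The sketch runs it against an in-memory SQLite table with made-up rows so that it is self-contained; only the table name and the two column names come from the report, everything else is an assumption. With a PostgreSQL driver, the same statement would be run through a cursor.

```python
import sqlite3

# Double quotes preserve the mixed-case column names.
QUERY = ('SELECT COUNT(*) FROM '
         '(SELECT DISTINCT "NetworkCode", "StationCode" FROM ppp_soln) AS pairs')


def count_station_pairs(connection):
    """Count distinct (NetworkCode, StationCode) pairs in ppp_soln."""
    return connection.execute(QUERY).fetchone()[0]


# Demonstration table with invented rows (not real database content).
demo = sqlite3.connect(":memory:")
demo.execute('CREATE TABLE ppp_soln ("NetworkCode" TEXT, "StationCode" TEXT)')
demo.executemany("INSERT INTO ppp_soln VALUES (?, ?)",
                 [("bra", "bomj"), ("bra", "bomj"),
                  ("cer", "ccrn"), ("rms", "bomj")])
print(count_station_pairs(demo))
```

Counting pairs rather than StationCode alone keeps stations with duplicate names in different networks separate.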
There are a number of stations with less than 4 occupations and they are captured by an if statement and not processed. It would be more informative if they produced a message saying this and did not receive further processing. Here is the message all but one of them produce.
Traceback (most recent call last):
File "../classes/pyPlotETM.py", line 58, in main
json.dump(etm.todictionary(True), f, indent=4, sort_keys=False)
File "/Volumes/UsersDrive/Users/smalley/Working.Parallel.GAMIT/classes/pyPPPETM.py", line 752, in todictionary
etm['Linear'] = {'tref': self.Linear.tref, 'params': self.Linear.values.tolist()}
AttributeError: ETM instance has no attribute 'Linear'
One station with 3 "occupations"/5 days of data: 98[2],03[2],10[1], produces this error message
Error during processing of cer.ccrn
Traceback (most recent call last):
File "../classes/pyPlotETM.py", line 52, in main
etm = pyPPPETM.ETM(cnn, stn['NetworkCode'], stn['StationCode'], False)
File "/Volumes/UsersDrive/Users/smalley/Working.Parallel.GAMIT/classes/pyPPPETM.py", line 624, in init
self.Jumps = JumpsTable(cnn, NetworkCode, StationCode, ppp.t, add_antenna_jumps=self.Periodic.params)
File "/Volumes/UsersDrive/Users/smalley/Working.Parallel.GAMIT/classes/pyPPPETM.py", line 294, in init
DOP = np.diag(np.linalg.inv(np.dot(self.A.transpose(), self.A)))
File "/Volumes/Sierra750GB/usr/local/pyconda/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 513, in inv
ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
File "/Volumes/Sierra750GB/usr/local/pyconda/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 90, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
LinAlgError: Singular matrix
Campaign data does not have equipment jumps, but may have earthquake jumps with only one occupation/file after the earthquake, and that could make it fail.
But it seems that once the test for the number of occupations fails, it should exit cleanly.
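A clean exit could look like the sketch below: check the count first and return an explanatory message instead of continuing into the fit. The function name and message wording are hypothetical; the threshold of 4 is the one quoted in the report.

```python
MIN_OBSERVATIONS = 4  # threshold quoted in the report above


def etm_fit_status(n_observations, minimum=MIN_OBSERVATIONS):
    """Return (ok, message) describing whether a fit should be attempted."""
    if n_observations < minimum:
        message = ("no ETM computed: %d observation(s), at least %d required"
                   % (n_observations, minimum))
        return False, message
    return True, "enough observations to attempt an ETM fit"
```

A caller would print the message and skip the station when the first element is False, rather than reaching code that expects a fitted model.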
When pyETM does not generate a model, print out a message (in the console or plot) saying why a model could not be computed.
PPP is having trouble with RINEX files in which the sample time is not lined up on whole seconds (for 1 sec and slower sampling).
Example RINEX file:
2.11 OBSERVATION DATA G (GPS) RINEX VERSION / TYPE
....
UBAT - RBMC Ubatuba MARKER NAME
...
15.0000 INTERVAL
SNR is mapped to RINEX snr flag value [1-9] COMMENT
L1: 3 -> 1; 8 -> 5; 40 -> 9 COMMENT
L2: 1 -> 1; 5 -> 5; 60 -> 9 COMMENT
2006 1 2 0 0 15.1880000 GPS TIME OF FIRST OBS
END OF HEADER
06 1 2 0 0 15.1880000 0 9G02G04G26G08G29G24G09G17G07
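To detect files like this one before they reach PPP, the offset of an epoch from the sampling grid can be computed from the time fields in the header. This is a sketch only; the function name is hypothetical and it says nothing about how PPP itself handles such epochs.

```python
def grid_offset(hour, minute, second, interval):
    """Offset in seconds of an epoch from the nearest multiple of the interval."""
    seconds_of_day = hour * 3600 + minute * 60 + second
    offset = seconds_of_day % interval
    # Report the signed distance to the nearest grid point.
    if offset > interval / 2.0:
        offset -= interval
    return offset
```

For the header above (first observation at 0 0 15.1880000 with a 15 s interval) the offset is about 0.188 s, while a file aligned on the grid gives 0.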
When pyArchiveService starts it doesn't pull the max cpus info from gnss_data.cfg
Whenever there's a version change, the ETMs table should be auto-purged before running the etms generation script.
Found an error due to low disk space. Check disk space before starting PG.
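A pre-flight check of this kind can be sketched with the standard library. The function name and the threshold parameter are assumptions; `shutil.disk_usage` requires Python 3.3 or later, so under Python 2.7 (seen in the tracebacks on this page) `os.statvfs` would be the alternative on Unix.

```python
import shutil


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding path has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes
```

A program could call this on its working directory at start-up and stop with a clear message when it returns False.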
Report to screen (and monitor.log) if a site ended up not having data in the H file. Read log from monitor.log:
----------------------------------------------------------------
Processing file 1 h-file ../133/hsirga.18133
Atm models: DryZen UFL WetZen GP25 DryMap VMF1 WetMap VMF1 IonSrc NONE MagFld
No data for site MGV1
No data for site UYRO
No data for site UYSO
There are 41 sites in ../133/hsirga.18133
Name Full name
1 BATF BATF_GPS 60664 TRM 0.0 0.0100
2 BAVC BAVC_GPS 29198 TRM 0.0 0.0080
3 BOGT BOGT_GPS 21936 JAV 0.0 0.0610
4 CEFE CEFE_GPS 20514 TRM 0.0 0.0000
Whenever a station match is found during pyArchiveService but the name of the RINEX files does not agree with the StationCode, add an optional command line to rename the RINEX in the log file created in the retry folder.
Format docstrings for sphinx
Add a switch to plot a portion of the time series rather than the whole thing. This helps to view the last part of the TS to identify missing jumps, metadata problems, etc.
Had ~80 newly added stations that needed station information added. It did about 30, then complained about the other 50 and did not load their information. Similar to pyArchiveService.py, it is stuck: the number of files does not change when it is run again.
See screen and files in run_dir.
The check
if file.endswith("d.Z"):
needs to be improved to avoid problems with files named old.Z, since the condition currently lets this type of file pass. A regular expression should be used to guarantee that the filename has the form stnmddd.yyd.Z
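A sketch of such a check follows. The pattern is an assumption based on the file names that appear in this list (for example morr0800.94d.Z): four station characters, a three-digit day of year, one session character, a two-digit year, the letter d, and the .Z suffix. The function name is hypothetical.

```python
import re

# 4-char station, 3-digit day of year, 1 session char, ".", 2-digit year, "d.Z"
COMPRESSED_RINEX = re.compile(r"^[A-Za-z0-9_]{4}\d{3}[A-Za-z0-9]\.\d{2}d\.Z$")


def is_compressed_rinex(filename):
    """Return True if filename looks like a Hatanaka-compressed daily RINEX file."""
    return COMPRESSED_RINEX.match(filename) is not None
```

Unlike `endswith("d.Z")`, this rejects old.Z while still accepting names such as lo101310.00d.Z.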
Peter and I were looking at the ETM and ts/etm plotting today.
We found out why most of the ones not plotting were "failing": they had fewer than 4 observations (some had between 1 and 3, and at least one had 0 observations, no RINEX file in the archive, and no return from PostgreSQL in the station list).
Here is our request. Separate the ETM calculations and plotting. The ETM class should calculate the ETM and provide an object to pass around in the program or write to disk (with the time series optional, as it is now). Put all zeros in the ETM parameters result to signify there was no fit. This will allow the object/file to be used to pass just the time series.
In the PlotETM class, plot the time series (from an object within the program or from a file) as the basic result, and plot the ETM if the parameters indicate an ETM was found (at least one amplitude not zero), or no ETM if it was not found. It should be possible to plot just the time series even if an ETM exists. It should also be possible to plot the residuals after removing the ETM (to see if there is any structure in the residuals).
There may be some complications if one wants to do the jumps (know where they are), but for just the time series it is probably best to plot the raw time series.
It is very hard to read the log files when printing out a series of them as the first line of the (N+1)th log file continues without a newline as a continuation of the last line of the Nth log file
capybara:bad_rinex smalley$ cat ????/???/*log
During decimation or remove_systems (to run auto_coord), teqc returned: %sCould not find a first observation in RINEX file. Truncated file? Header follows:
2.11 OBSERVATION DATA G (GPS) RINEX VERSION / TYPE
.......
SAN JOSE DE MORRO MARKER NAME
MORR MARKER NUMBER
..........
Forced Modulo Decimation to 30 seconds COMMENT
/morr0800.94d.Z: (file moved to /Volumes/UsersDrive/repository/data_rejected/bad_rinex/1994/080)During decimation or remove_systems (to run auto_coord), teqc returned: %sCould not find a first observation in RINEX file. Truncated file? Header follows:
2.11 OBSERVATION DATA G (GPS) RINEX VERSION / TYPE
.........
PALO MARKER NAME
0200 MARKER NUMBER
..........
Forced Modulo Decimation to 30 seconds COMMENT
/palo0800.94d.Z: (file moved to /Volumes/UsersDrive/repository/data_rejected/bad_rinex/1994/080)During decimation or remove_systems (to run auto_coord), teqc returned: %sCould not find a first observation in RINEX file. Truncated file? Header follows:
2.11 OBSERVATION DATA G (GPS) RINEX VERSION / TYPE
teqc 2017Jul3 20171230 06:45:19UTCPGM / RUN BY / DATE
I'm bringing in the CAP campaign data - the 1993 data is in rinex v1 and rnx2crx only works with v2 and above.
The current solution is to use teqc to convert it all to rinex v2 (I'm testing the results of both now)
If there are stations with 4-letter codes "allx", where x is the 4th letter in the code, these are interpreted as "all" by ScanArchive, so cer.allo behaves as cer.all.
It could be helpful to allow regular expression wildcards in the python commands for station and network codes, e.g. cer.at0[1-9] instead of having to write cer.at01 cer.at02 ... cer.at09
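One lightweight way to support such wildcards is shell-style matching from the standard library, which handles character sets like [1-9] (it is not full regular expression syntax). The sketch below assumes station identifiers are available as "net.stn" strings; the function name and the sample list are made up for illustration.

```python
from fnmatch import fnmatchcase


def select_stations(pattern, known_stations):
    """Return the 'net.stn' entries that match a shell-style pattern."""
    return [name for name in known_stations if fnmatchcase(name, pattern)]


# Invented station list for demonstration only.
known = ["cer.at01", "cer.at02", "cer.at10", "cer.allo", "rms.tuc1"]
print(select_stations("cer.at0[1-9]", known))
```

Because the whole name has to match, a literal argument such as cer.allo selects only that station and is not confused with an "all" request.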