Comments (3)
class PODAACHarvester(Harvester):
    """Harvester class for PODAAC data (NASA)"""
    # Subclasses override these two attributes to plug in other classes
    ingester = ingesters.DDXIngester
    crawler = crawlers.OpenDAPCrawler

    def _create_crawlers(self):
        return [
            self.crawler(url, time_range=self.get_time_range())
            for url in self.config['urls']
        ]

    def _create_ingester(self):
        # Only forward the optional parameters which are present in the configuration
        parameters = {}
        for parameter_name in ['max_fetcher_threads', 'max_db_threads']:
            if parameter_name in self.config:
                parameters[parameter_name] = self.config[parameter_name]
        return self.ingester(**parameters)


class OSISAFHarvester(PODAACHarvester):
    ingester = ingesters.DDXOsisafIngester
    crawler = crawlers.OSISAFCrawler
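The pattern proposed above can be tried in isolation. The following is a minimal sketch, not the actual geospaas_harvesting code: `StubIngester`, `StubOsisafIngester`, `BaseHarvester` and `OsisafLikeHarvester` are hypothetical stand-ins, and the default thread counts are arbitrary. It only shows that a subclass can swap the ingester through a class attribute while inheriting the creation logic.

```python
class StubIngester:
    """Hypothetical stand-in for an ingester class (not the real DDXIngester)."""

    def __init__(self, max_fetcher_threads=1, max_db_threads=1):
        self.max_fetcher_threads = max_fetcher_threads
        self.max_db_threads = max_db_threads


class StubOsisafIngester(StubIngester):
    """Hypothetical stand-in for a provider-specific ingester."""


class BaseHarvester:
    """Simplified harvester: the ingester class is a class attribute."""
    ingester = StubIngester

    def __init__(self, config):
        self.config = config

    def create_ingester(self):
        # Forward only the optional parameters present in the configuration
        parameters = {
            name: self.config[name]
            for name in ('max_fetcher_threads', 'max_db_threads')
            if name in self.config
        }
        return self.ingester(**parameters)


class OsisafLikeHarvester(BaseHarvester):
    # Only the class attribute changes; create_ingester() is inherited
    ingester = StubOsisafIngester


harvester = OsisafLikeHarvester({'urls': [], 'max_db_threads': 4})
created = harvester.create_ingester()
print(type(created).__name__, created.max_fetcher_threads, created.max_db_threads)
```

With this layout, supporting another provider only requires a new subclass with different class attributes.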
from django-geo-spaas-harvesting.
class DDXIngester(MetanormIngester):
    def prepare_url(self, url):
        """Hook for subclasses which need to rewrite the URL; returns it unchanged by default"""
        return url
    ....
    def _get_normalized_attributes(self, url, *args, **kwargs):
        """Get normalized metadata from the DDX info of the dataset located at the provided URL"""
        prepared_url = url if url.endswith('.ddx') else url + '.ddx'
        prepared_url = self.prepare_url(prepared_url)
        ....


class DDXOsisafIngester(DDXIngester):
    def prepare_url(self, prepared_url):
        # Replace everything from "catalog/" through "?dataset=" (9 characters) with "dodsC/"
        return prepared_url.replace(
            prepared_url[prepared_url.find("catalog/"):prepared_url.find("?dataset=") + 9],
            "dodsC/")
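To make the effect of the URL rewrite concrete, here is a minimal standalone sketch. The function name `catalog_to_dodsc_url` and the `thredds.example.org` URL are hypothetical, chosen only for illustration. Unlike the one-liner above, this variant returns the URL unchanged when either marker is missing, which is an added assumption about the desired behavior.

```python
def catalog_to_dodsc_url(url):
    """Replace the span from "catalog/" through "?dataset=" with "dodsC/".

    Hypothetical helper mirroring the proposed prepare_url() override.
    If either marker is missing, the URL is returned unchanged.
    """
    marker = "?dataset="
    start = url.find("catalog/")
    end = url.find(marker)
    if start == -1 or end == -1 or end < start:
        return url
    return url[:start] + "dodsC/" + url[end + len(marker):]


# Made-up catalog URL, with ".ddx" already appended as in _get_normalized_attributes()
catalog_url = (
    "https://thredds.example.org/thredds/catalog/osisaf/ice/conc/2020/01/"
    "catalog.html?dataset=osisaf/ice/conc/2020/01/ice_conc.nc.ddx"
)
dodsc_url = catalog_to_dodsc_url(catalog_url)
print(dodsc_url)
```

Slicing around the two markers avoids the edge cases of `str.find()` returning -1, which would otherwise produce an unintended slice.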
from django-geo-spaas-harvesting.
The new tests for this are not expected to pass yet: they need a new OSISAF normalizer, which requires a new version of metanorm to work correctly. So please merge that one first once it becomes available.
@akorosov @aperrin66
from django-geo-spaas-harvesting.
Related Issues (20)
- Put all dependencies in setup.py
- test update for newer version of metanorm that deals with colhub
- WebDirectoryCrawler: add time filtering for yyyymmdd date folders
- `verify_urls.find_provider()` does not work when the provider name contains a "_"
- Make the `verify_urls` module work with pure FTP repositories
- needed modifications in harvesters and ingestors for aviso and cpom SLA
- The LocalDirectoryCrawler should be able to accept a file path
- PO.DAAC OpenDAP does not allow direct access to files anymore
- Adapt ingesters to metanorm refactoring
- Implement retry mechanism in case of provider unavailability
- New Creodias API URL
- Update the README with the new harvesters
- integrating the logging action with nansat
- Fix time condition in Copernicus crawler
- Copernicus apihub URL changed
- Make it possible to update existing datasets
- Move TrustDomainSession to django-geo-spaas repo
- NansatIngester: support the case where no "dataset_parameters" exist
- Harvest new PO.DAAC data for TOPVOYS
- Pin Pythesint vocabularies versions