Processes

Download

class malleefowl.processes.wps_download.Download

download Download files (v0.9)

Downloads files and provides the file list as a JSON document.

Parameters: resource (string) – URL of the resource to download.
Returns: output – JSON document listing the downloaded files with their file URLs.
Return type: application/json
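
As a rough illustration, the process could be invoked through a WPS 1.0.0 Execute request using key-value-pair encoding. The sketch below only builds the request URL and sends nothing. The endpoint http://localhost:8080/wps, the data URL and the helper name build_execute_url are assumptions; the identifier download and the input name resource are taken from the description above.

```python
from urllib.parse import urlencode


def build_execute_url(endpoint, identifier, inputs):
    """Build a WPS 1.0.0 Execute URL using key-value-pair encoding."""
    # DataInputs is a semicolon-separated list of name=value pairs.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": identifier,
        "DataInputs": data_inputs,  # percent-encoded by urlencode
    })
    return f"{endpoint}?{query}"


url = build_execute_url(
    "http://localhost:8080/wps",  # assumed endpoint, adjust to your server
    "download",
    {"resource": "http://example.org/data/tas.nc"},  # placeholder resource
)
print(url)
```

The resulting URL can be opened with any HTTP client; the response format depends on the WPS server configuration.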

References

The downloader first checks whether the file is available in the local ESGF archive or in the local cache. If it is not, the file is downloaded and stored in the local cache. The result is a list of local file:// paths to the requested files.

The downloader does not download files if they are already in the ESGF archive or in the local cache.
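
The lookup order described above (archive, then cache, then download) can be sketched as follows. This is not the malleefowl implementation: the function resolve, its parameters, and the assumption that the URL path mirrors the directory layout below the archive root are all hypothetical, and the download itself is replaced by a stand-in so the example runs offline.

```python
import os
import tempfile
from urllib.parse import urlsplit


def resolve(url, archive_root, cache_dir, fetch):
    """Return a file:// URL for `url`, downloading only when needed."""
    # Assumption: the URL path mirrors the layout below the archive root.
    rel_path = urlsplit(url).path.lstrip("/")
    archived = os.path.join(archive_root, rel_path)
    if os.path.isfile(archived):
        return "file://" + archived

    cached = os.path.join(cache_dir, os.path.basename(rel_path))
    if not os.path.isfile(cached):
        fetch(url, cached)  # only reached when neither copy exists
    return "file://" + cached


calls = []


def fake_fetch(url, target):
    """Stand-in for a real download: records the call, writes a dummy file."""
    calls.append(url)
    with open(target, "wb") as handle:
        handle.write(b"dummy")


with tempfile.TemporaryDirectory() as archive, tempfile.TemporaryDirectory() as cache:
    url = "http://example.org/cordex/tas.nc"
    first = resolve(url, archive, cache, fake_fetch)
    second = resolve(url, archive, cache, fake_fetch)  # served from the cache

print(first == second, len(calls))
```

The second call returns the cached path without invoking the fetch function again, which is the behaviour the description asks for.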

ESGSearch

class malleefowl.processes.wps_esgsearch.ESGSearchProcess

esgsearch ESGF Search (v0.6)

Search ESGF datasets, files and aggregations.

Parameters:
  • url (string, optional) – URL of ESGF Search Index which is used for search queries. Example: http://esgf-data.dkrz.de/esg-search
  • distrib (boolean, optional) – If set, a distributed search is run.
  • replica (boolean, optional) – If set, the search includes replicated datasets.
  • latest (boolean, optional) – If set, the search includes only the latest datasets.
  • temporal (boolean, optional) – If set, the search uses a temporal filter.
  • search_type ({'Dataset', 'File', 'Aggregation'}, optional) – Search on Datasets, Files or Aggregations.
  • constraints (string, optional) – Constraints as a list of key/value pairs. Example: project:CORDEX, time_frequency:mon, variable:tas
  • query (string, optional) – Free-text query. Example: temperature
  • start (dateTime, optional) – Start time: 2000-01-11T12:00:00Z
  • end (dateTime, optional) – End time: 2005-12-31T12:00:00Z
  • limit ({'0', '1', '2', '5', '10', '20', '50', '100', '200'}, optional) – Maximum number of datasets in search result
  • offset (integer, optional) – Start search of datasets at offset.
Returns:

  • output (application/json) – JSON document with search result, a list of URLs to files on ESGF archive nodes.
  • summary (application/json) – JSON document with search result summary
  • facet_counts (application/json) – JSON document with facet counts for constraints.
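
The constraints input is a single string of comma-separated key:value pairs. A minimal sketch of how such a string might be split into pairs is shown below; the helper parse_constraints is hypothetical and the way the process itself parses this input is not documented here.

```python
def parse_constraints(text):
    """Split 'key:value, key:value' into a list of (key, value) tuples."""
    pairs = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and empty segments
        key, sep, value = item.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected key:value, got {item!r}")
        pairs.append((key.strip(), value.strip()))
    return pairs


constraints = parse_constraints("project:CORDEX, time_frequency:mon, variable:tas")
print(constraints)
```

A list of tuples is used rather than a dictionary so that a key can appear more than once, for example two variable constraints in one query.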

References

The process runs an ESGF search to get a list of matching files on ESGF data nodes. It uses esgf-pyclient, a Python client for the ESGF search API.

In addition to the search through esgf-pyclient, the process checks whether local replicas are available and returns the replica files instead of the originals.

The result is a JSON document with a list of http:// URLs to files on ESGF data nodes.

TODO: bbox constraint for datasets

ThreddsDownload

class malleefowl.processes.wps_thredds.ThreddsDownload

thredds_download Download files from Thredds Catalog (v0.5)

Downloads files from a Thredds catalog and provides the file list as a JSON document.

Parameters: url (string) – URL of the catalog.
Returns: output – JSON document listing the downloaded files with their file URLs.
Return type: application/json

Workflow

class malleefowl.processes.wps_workflow.DispelWorkflow

workflow Workflow (v0.7)

Runs a workflow with dispel4py.

Parameters: workflow (text/yaml) – Workflow description in YAML.
Returns:
  • output (text/yaml) – Workflow result document in YAML.
  • logfile (text/plain) – Workflow log file.

References

Runs a WPS process for climate data (such as cfchecker or climate indices with ocgis) on a given selection of input data (currently NetCDF files from ESGF data nodes).

The dispel4py workflow engine is currently used.

The workflow for ESGF input data is as follows:

Search ESGF files -> Download ESGF files -> Run chosen process on local (downloaded) ESGF files.
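
The three stages can be pictured as a simple chain of functions. The sketch below uses plain Python rather than dispel4py, and every name in it (run_workflow and the fake_* stand-ins) is hypothetical; it only illustrates how the output of each stage feeds the next.

```python
def run_workflow(search, download, process, constraints):
    """Chain the three stages: search -> download -> run process."""
    urls = search(constraints)                    # http:// URLs on data nodes
    local_files = [download(u) for u in urls]     # file:// paths after download
    return process(local_files)                   # result of the chosen process


# Stand-in stages so the sketch runs without network access.
def fake_search(constraints):
    return [f"http://example.org/{constraints['variable']}_{i}.nc" for i in range(2)]


def fake_download(url):
    return "file:///tmp/cache/" + url.rsplit("/", 1)[-1]


def fake_process(files):
    return {"inputs": files, "status": "ok"}


result = run_workflow(fake_search, fake_download, fake_process, {"variable": "tas"})
print(result)
```

Passing the stages in as functions keeps the chain independent of any particular search, download or processing backend.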