
Version 7 (modified by sommeria, 6 years ago)

--

SoftWare / ProjectMeta - Meta project for open data management

Aim

Project-Meta is a set of software tools that helps you manage your open data using the OPeNDAP protocol. The initiative is supported by the European Commission as part of the Hydralab+ project of the Horizon 2020 programme. This programme requires research data to be open access, that is, available online, free of charge to the end user, and reusable. Furthermore, access must include the right to copy, distribute, search, link, crawl and mine the data. In addition to these general requirements, we aim to achieve the following goals: 1) allow the end user to scan and visualise the data without downloading them; 2) integrate the process into the data analysis procedure, with minimal additional work.

OPeNDAP

OPeNDAP (Open-source Project for a Network Data Access Protocol) is a data access protocol. It includes standards for encapsulating structured data, annotating the data with attributes and adding semantics that describe the data. OPeNDAP is widely used by governmental agencies such as NASA and NOAA to serve satellite, weather and other observed earth science data.

The protocol is based on HTTP, so data can be scanned with an ordinary web browser. Data visualization, however, is provided by graphics programs (such as Matlab, GrADS, Ferret or ncBrowse). Compared to ordinary file transfer protocols (e.g. FTP), a major advantage of OPeNDAP is the ability to retrieve subsets of files, so it is possible to work remotely without downloading whole data files. Although any file format can be used, data are often in HDF or NetCDF format. The older NetCDF format is limited to arrays of numbers, while HDF allows a wider range of data structures (and contains NetCDF as a particular case). We chose the NetCDF format, which is sufficient for our applications and can be read more easily with a variety of software.
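The subset retrieval described above can be sketched in shell. In the DAP2 flavour of OPeNDAP, a constraint expression appended to the dataset URL selects a variable and an index range of the form [start:stride:stop], and a suffix such as .ascii asks the server for a text rendering. The server URL, file name and variable name below are hypothetical placeholders, and the helper opendap_subset_url is introduced only for illustration; it builds the URL and does not contact any server.

```shell
#!/bin/sh
# Build a DAP2 request URL for a slice of one variable.
# Arguments: dataset URL, variable name, start index, stride, stop index.
# The values used in the example call are placeholders, not a real dataset.
opendap_subset_url() {
    dataset=$1 var=$2 start=$3 stride=$4 stop=$5
    printf '%s.ascii?%s[%s:%s:%s]\n' "$dataset" "$var" "$start" "$stride" "$stop"
}

# Example: the first ten values of a hypothetical variable "temperature".
url=$(opendap_subset_url 'http://example.org/opendap/run01.nc' temperature 0 1 9)
echo "$url"

# To actually fetch the slice from a real server, something like the
# following could be used (--globoff stops curl from interpreting the brackets):
#   curl --globoff "$url"
```

Only the requested slice would travel over the network, which is the point of working remotely without downloading the whole file.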

Description

The main script is project-meta. Before using it, you need a metadata file named PROJECT-META.yml in your current folder. This file is in YAML format. An example, PROJECT-META.sample.yml, can be found in the Project-Meta repository or online.

Project-Meta is at an early stage of development. Many aspects of it will be improved in the future.

PROJECT-META.yml meta file

This file is the core of the project. One difficult task is listing all your open data. Sometimes it is very easy; sometimes it can be painful and time-consuming, especially if there is a lot of data.

Here is an example that appends all folders matching '*.mproj*' (a Coriolis example for some open data) to the end of the PROJECT-META.yml file.

find . -name '*.mproj*' -a -type d | sed 's/^/    - /;' >> PROJECT-META.yml

The find command searches only for matching folders under the current one (.), and the sed command adds four spaces and a dash at the beginning of each line to respect the YAML format.
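To see what the pipeline above produces before appending anything to PROJECT-META.yml, it can be tried in a scratch directory. The sketch below wraps the same find and sed commands in a helper, list_dirs_as_yaml, which is a name introduced here for illustration; the folder names are made up, and the sort step is an addition that only makes the output order reproducible.

```shell
#!/bin/sh
# List every directory whose name matches a pattern, one YAML list item per line.
# Arguments: root folder to search, shell pattern for the folder name.
list_dirs_as_yaml() {
    find "$1" -name "$2" -type d | sort | sed 's/^/    - /'
}

# Demonstration in a scratch directory (folder names are invented).
demo=$(mktemp -d)
mkdir -p "$demo/exp1/run.mproj" "$demo/exp2/test.mproj.old" "$demo/exp2/notes"
touch "$demo/exp1/file.mproj.txt"   # a regular file: excluded by -type d
(cd "$demo" && list_dirs_as_yaml . '*.mproj*')
rm -rf "$demo"
```

In this demonstration only the two matching folders, ./exp1/run.mproj and ./exp2/test.mproj.old, are listed, each prefixed by four spaces and a dash. Once the output looks right, it can be redirected with >> PROJECT-META.yml as in the original command.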

Repository

All code is under a free license. Bash scripts are under the GPL version 3 or later (http://www.gnu.org/licenses/gpl.html), C++ sources are under the GPL version 2 or later, and the Perl scripts are under the same license as Perl itself, i.e. the dual license GPL and Artistic License (http://dev.perl.org/licenses/artistic.html).

All sources are available on the LEGI forge: http://servforge.legi.grenoble-inp.fr/svn/soft-trokata/trunk/project-meta

The sources are managed with Subversion (http://subversion.tigris.org/). It is very easy to stay synchronized with these sources:

  • initial checkout
    svn checkout http://servforge.legi.grenoble-inp.fr/svn/soft-trokata/trunk/project-meta soft-project-meta
    
  • subsequent updates
    svn update
    

Write access to the forge can be granted on reasoned request to Gabriel Moreau. For reasons of administration time and security, the forge is not writable without permission. For the sake of web decentralization, autonomy and non-allegiance to the prevailing (and North American) centralism, we use our own forge.

You can propose a patch to a particular file by email, using the diff command. Note that svn diff defaults to the unified format (-u). Two examples:

diff -u project-meta.org project-meta.new > project-meta.patch
svn diff project-meta > project-meta.patch

We apply the patch (after reading it carefully, twice) with the command

patch -p0 < project-meta.patch
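The diff and patch commands above can be tried end to end in a scratch directory before sending anything. The sketch below is a minimal round trip under stated assumptions: the file contents are invented, patch_round_trip is a name introduced here for illustration, and the patch utility is assumed to be installed.

```shell
#!/bin/sh
# Round trip in a scratch directory: create a unified diff between two
# versions of a file, apply it to the original, and compare the result.
# File contents are made up for the demonstration.
patch_round_trip() (
    work=$(mktemp -d) || exit 1
    cd "$work" || exit 1

    printf 'line one\nline two\n' > project-meta.org
    printf 'line one\nline 2\n'   > project-meta.new

    # diff exits with status 1 when the files differ, which is expected here.
    diff -u project-meta.org project-meta.new > project-meta.patch

    # Keep the target version under another name so that only the file
    # to be patched matches a name in the patch header.
    mv project-meta.new expected

    # -p0 uses the file names from the patch header without stripping
    # any leading path components.
    patch -p0 < project-meta.patch > /dev/null || exit 1

    if cmp -s project-meta.org expected; then
        echo "patch applied cleanly"
    else
        echo "patched file differs from expected"
    fi
    cd / && rm -rf "$work"
)

patch_round_trip
```

The function body runs in a subshell, so the change of directory does not affect the calling shell. Checking that the patched file matches the intended version is a cheap safeguard before mailing the patch.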
