
Project Name

Infrastructure | CNRS_Coriolis
Project title (long) | ..........
Project key name - name of the data folder | project name
Lead Author name | ...............
Contributor names | ............
Date Campaign Start | .........
Date Campaign End | ...........

0 - Publications, reports from the project

To attach a file, leave the editor and press the 'Attach file' button at the bottom of the edit window. Then use the format of the example: OpenDAP_GM.pdf.

1 - Objectives

This is a template for the documentation of an experimental project.

  • It first serves as a common notebook for the research team.
  • It is eventually intended as documentation for the published data.
  • The wiki format is convenient for collaborative projects: it can be easily accessed from the web, and it keeps track of the successive modifications made by several authors (see the "Timeline" button above). The wiki editor relies on a few simple conventions to provide basic formatting, including tables, figures and hyperlinks. Mathematical formulas can be entered with LaTeX conventions. To start a new project, copy and paste this page as a template (in "textarea" mode, button above). The "wysiwyg" mode can also be useful to see the end result. Once completed, the document can be exported in PDF format (see the button at the bottom).

2 - Experimental setup

2.1 General description

Include sketches, photos, ...

To attach figures or files, leave the editor and press the 'Attach file' button at the bottom of the edit window. To insert the image in the text, follow this example:

2.2 Definition of the coordinate system

Define the coordinate system $(x,y,z)$ used to identify points on images and probe locations. Choose from the outset the most rational option for the final publication of the data, so that no later change is needed, which could lead to ambiguities in the data description. If several coordinate systems are needed, define precisely how they are related.

2.3 Relevant fixed parameters:

Define here the parameters which will remain constant during the project, example:

Notation | Definition | Value | Remarks
H | total water height | H=80 cm | .....
W | channel width | W=150 cm | .....
$\alpha$ | slope angle | $\alpha$=20 degrees | .....

2.4 Definition of the variable control parameters

The set of these parameter values characterises each experiment.

Notation | Definition | Unit | Remarks
Param1 | ...
Param2 | ...
Param3 | ...

2.5 Definition of the relevant derived parameters and non-dimensional numbers

Define here the derived parameters of interest, in particular non-dimensional numbers. For instance, in Reynolds' pipe flow experiment, the pipe diameter $d$ and length, and the fluid viscosity $\nu$ may be the fixed parameters, and the pressure drop driving the flow the control parameter, while the flow rate $Q$ and the velocity $U=Q/(\pi r^2)$, with $r=d/2$ the pipe radius, are the derived parameters, and the Reynolds number $Re=Ud/\nu$ is the relevant non-dimensional parameter.
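As an illustration, the derived parameters of the pipe flow example can be computed from the fixed and measured quantities. This is only a minimal sketch: the function names and the numerical values are assumptions chosen for the example, not data from any experiment.

```python
import math

def mean_velocity(flow_rate, diameter):
    """Mean velocity U = Q / (pi r^2), with r = d/2 the pipe radius."""
    radius = diameter / 2.0
    return flow_rate / (math.pi * radius ** 2)

def reynolds_number(flow_rate, diameter, viscosity):
    """Reynolds number Re = U d / nu."""
    return mean_velocity(flow_rate, diameter) * diameter / viscosity

# Illustrative values only (assumptions, not measurements):
Q = 1.0e-4    # flow rate, m^3/s
d = 0.02      # pipe diameter, m
nu = 1.0e-6   # kinematic viscosity, m^2/s (order of magnitude for water)

U = mean_velocity(Q, d)
Re = reynolds_number(Q, d, nu)
print(f"U = {U:.3f} m/s, Re = {Re:.0f}")
```

Keeping such formulas in one small function per derived parameter makes it easy to fill the table of experiments consistently.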

3 - Instrumentation and data acquisition

3.1 Instruments used

List the instruments, with the positions of probes, cameras... For the Coriolis platform, you may refer to the description CoriolisInstrumentation.

3.2 Definition of time origin and instrument synchronisation

It is important to properly define the time $t=0$ used to describe phenomena in an experiment. All instruments must be synchronised in time, or at least provide information on the respective measurement timing. Check the consistency of dates and times for all the computers and instruments involved.
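One possible way to bring timestamps from several instruments onto the common time $t=0$ is sketched below. The time origin, the instrument names and the clock offsets are hypothetical values; in practice the offsets would come from the consistency check of the computer clocks.

```python
from datetime import datetime, timedelta

# Hypothetical time origin t=0 of the experiment, read on the reference clock.
T0 = datetime(2018, 6, 17, 10, 0, 0)

# Hypothetical clock offsets (instrument clock minus reference clock), in seconds,
# as would be measured when checking the computers before the experiment.
CLOCK_OFFSETS = {"CamA": 0.0, "Probe1": 2.5}

def experiment_time(instrument, timestamp):
    """Return the time in seconds since T0 for a timestamp read on an instrument clock."""
    corrected = timestamp - timedelta(seconds=CLOCK_OFFSETS[instrument])
    return (corrected - T0).total_seconds()

# The same physical event seen by two instruments with different clocks
t_cam = experiment_time("CamA", datetime(2018, 6, 17, 10, 0, 12))
t_probe = experiment_time("Probe1", datetime(2018, 6, 17, 10, 0, 14, 500000))
print(t_cam, t_probe)
```

Both calls return the same experiment time once the clock offset of the probe computer is removed.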

3.3 Requested final output and statistics

Define here the kind of statistics eventually needed.

4 - Methods of calibration and data processing

5 - Organisation of data files

All data related to a project are stored in a given folder, accessible to all team members. For the Coriolis platform, it is named "/.fsnet/project/coriolis/year/project key name". Data must be properly stored and documented, taking into account the future constraints of data processing and publication beyond the research team (see section 8, Data management). Although the details of data organisation should be adapted to each project, some general rules must be followed:

  • A key requirement is to properly define a set of experiments, to be described in the table of section 6. The data from each instrument should properly refer to its experiment name.
  • Use simple names for files and folders, excluding special characters: blank, /, \, =. These often lead to severe problems in data processing, depending on the operating system and software used. The experiment date and parameters should be given in the table of section 6, not encoded in file names.
  • Avoid complex hierarchical structures which lead to difficulties for data processing and management.
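The naming rule above can be checked automatically before data are copied to the project folder. The sketch below tests only the four characters listed (blank, /, \, =); the function names are hypothetical and the list of forbidden characters may need to be extended for a given system.

```python
# Characters excluded by the naming rule: blank, /, \ and =
FORBIDDEN = set(" /\\=")

def forbidden_characters(name):
    """Return the sorted list of forbidden characters found in a file or folder name."""
    return sorted(set(name) & FORBIDDEN)

def is_simple_name(name):
    """True if the name is non-empty and contains none of the forbidden characters."""
    return name != "" and not forbidden_characters(name)

for candidate in ["EXP01", "CamA.civ", "run H=80", "exp 01"]:
    print(candidate, is_simple_name(candidate), forbidden_characters(candidate))
```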

The basic folder structure

EXP01/InstrumentName01 
     /InstrumentName02
EXP02/InstrumentName01 
     /InstrumentName02

is convenient when the data from the different instruments must be combined for processing, for instance two cameras used in stereoscopic view. Results from successive processing steps can then be marked by a specific extension. This is useful to quickly identify how far data processing has progressed and to support the strategy for data backup and publication: because of space limitations and lack of interest for final users, the raw data are generally not published. For example, the image processing software uvmat (http://servforge.legi.grenoble-inp.fr/projects/soft-uvmat) produces a folder CamA.civ for the velocity fields obtained from the image folder CamA. The three velocity components obtained by the stereoscopic combination of the velocity fields in CamA.civ and CamB.civ are stored in CamA.civ-CamB.civ.vel3C.
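The extension convention can be mimicked with a few string helpers, as in the sketch below. It only reproduces the folder names quoted above; the helper functions are hypothetical and are not part of uvmat.

```python
def processed_folder(source_folder, step):
    """Name of the folder holding the result of one processing step."""
    return f"{source_folder}.{step}"

def combined_folder(folder_a, folder_b, step):
    """Name of the folder holding a result that combines two input folders."""
    return f"{folder_a}-{folder_b}.{step}"

def last_step(folder):
    """Extension of the last processing step, or '' for a raw data folder."""
    return folder.rsplit(".", 1)[-1] if "." in folder else ""

civ_a = processed_folder("CamA", "civ")
civ_b = processed_folder("CamB", "civ")
vel3c = combined_folder(civ_a, civ_b, "vel3C")
print(civ_a, civ_b, vel3c)
```

A backup or publication script could then select folders by their last extension, for instance skipping raw folders for which `last_step` returns an empty string.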

However the alternative hierarchy

InstrumentName01/EXP01
                /EXP02
InstrumentName02/EXP01
                /EXP02

may be more appropriate if the different instruments are managed by different computers, and sometimes by different researchers. A clear and simple nomenclature of the experiment names is then essential for matching experiments across the different instrument folders.
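With this alternative hierarchy, a short script can report experiments that are missing from some instrument folders. The following is a sketch under stated assumptions: the function names are hypothetical, top-level folders starting with "0_" are assumed to be ancillary and are skipped, and the demonstration runs on a temporary folder tree rather than on real data.

```python
import os
import tempfile

def experiments_by_instrument(root):
    """Map each instrument folder under root to the set of experiment folders it contains."""
    result = {}
    for instrument in sorted(os.listdir(root)):
        path = os.path.join(root, instrument)
        # Skip plain files and ancillary folders (assumed to start with "0_")
        if os.path.isdir(path) and not instrument.startswith("0_"):
            result[instrument] = {e for e in os.listdir(path)
                                  if os.path.isdir(os.path.join(path, e))}
    return result

def missing_experiments(root):
    """For each instrument, list experiments present elsewhere but absent here."""
    found = experiments_by_instrument(root)
    all_names = set().union(*found.values()) if found else set()
    return {inst: sorted(all_names - names) for inst, names in found.items()}

# Demonstration on a temporary folder tree
with tempfile.TemporaryDirectory() as root:
    layout = {"InstrumentName01": ["EXP01", "EXP02"],
              "InstrumentName02": ["EXP01"]}
    for inst, exps in layout.items():
        for exp in exps:
            os.makedirs(os.path.join(root, inst, exp))
    report = missing_experiments(root)
print(report)
```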

In addition to the data sets themselves, a few folders are needed for the experiment description and various ancillary files, for instance:

  • 0_DOC: miscellaneous documentation and reports related to the project
  • 0_MATLAB_FCT: specific Matlab processing functions used in the project.
  • 0_PHOTOS: photos of set-up
  • 0_REF_FILES: files of general use (calibration data, grids ...)

6 - Table of experiments

List of parameters, Param1..., denoted by the names defined in section 2.4.

Name | Date | Param1 | Param2 | Param3 | Remarks
#EXP01
#EXP02
#EXP03

Note that #EXP01 provides an internal link to the diary.

7 - Diary:

EXP01

EXP02

EXP03

8 - Data management

We can distinguish three steps:

  • Raw data: as given by the different instruments. They are generally stored on local disks, in instrument units (volts, pixels for images...). The available formats are often proprietary and limited by the constraints of fast disk writing. For long data sets, avoid text formats, which are slower to read and occupy more disk space than binary formats.
  • Processed data: these are the physical quantities of interest, obtained after calibration of the raw data and instrument-specific processing, for instance velocity fields obtained from the images by Particle Image Velocimetry. They should be understandable by researchers who did not participate in the project, so physical units and standard formats are needed. They are stored in storage bays with a backup system, as local instrument disks fill up quickly and are less safe.
  • Published data: these contain a selection of the most "interesting" data and involve various analyses of the processed data, such as statistics and plots, which depend strongly on the project. We consider two levels of published data.

9 - Data format

Images

We use different imaging systems, each providing its own proprietary image format. Successive frames, and even frames from different cameras, can be packed together in a single file for faster disk writing. An extraction step is therefore often performed to provide a set of properly indexed images in a standard format, png (Portable Network Graphics). It is a binary image format with lossless (reversible) compression (like .zip), recommended by the W3C (http://www.w3.org/Graphics/PNG). It can be read directly by all standard image visualisation and processing programs. Compressing a raw binary image to its png form typically saves disk storage by a factor of 3.
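The meaning of lossless (reversible) compression can be illustrated with Python's standard zlib module, which implements the DEFLATE algorithm also used inside png files. This sketch is not a png encoder, and the synthetic image is an assumption: the compression ratio it prints depends entirely on the image content and says nothing about real camera images.

```python
import zlib

# Synthetic 8-bit "image": 256 x 256 pixels with a smooth horizontal gradient
width, height = 256, 256
raw = bytes((x // 4) % 256 for y in range(height) for x in range(width))

compressed = zlib.compress(raw, 9)      # level 9 = strongest compression
restored = zlib.decompress(compressed)  # exact inverse operation

# The restored bytes are identical to the original: no information is lost
print(len(raw), len(compressed), restored == raw)
```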

Data from instruments

Like imaging systems, instruments provide various proprietary formats. Text formats are often used for interoperability and human reading. However, automatic reading of text files is slow and is often impaired by peculiarities of text headers and data separators. Furthermore, text transcription of numbers is very inefficient in terms of disk storage.
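The storage cost of text transcription can be checked on synthetic data, as in the sketch below. It compares a raw double-precision binary layout with a one-value-per-line text layout written at full precision; the data are random numbers generated for the example, and the exact sizes depend on the number format chosen.

```python
import array
import random

random.seed(0)  # reproducible synthetic data
values = [random.uniform(-1.0, 1.0) for _ in range(10000)]

# Binary storage: 8 bytes per value in double precision
binary = array.array("d", values).tobytes()

# Text storage: one value per line, written with full double precision
text = "\n".join(repr(v) for v in values).encode("ascii")

# Both representations recover the values exactly
from_binary = array.array("d")
from_binary.frombytes(binary)
from_text = [float(s) for s in text.decode("ascii").split("\n")]

print(len(binary), len(text))
```

Writing fewer digits shrinks the text file, but the values read back are then no longer identical to the original ones.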

The NetCDF binary format is well suited to data arrays, time series and multidimensional fields. It benefits from long experience in atmospheric science, but it is now used in many fields, and most programming languages and software packages provide convenient reading and writing tools. A more recent alternative is the HDF format, which can be viewed as an extension of NetCDF offering a wider range of data structures. However, the added complexity is not justified for most types of experimental data.

Therefore the NetCDF format is recommended, at least for processed data and for published data. Tools are provided in NetCdf to write and read NetCDF files in Matlab and Python.

Metadata

Those are "data which provide information on other data". They are needed at different levels.

  • At the level of the overall project, this is the data report, as given for instance by the wiki.
  • For raw data, it corresponds to the instrument settings, dates and times of the measurement series, and calibration parameters. In modern instrumentation, this information is generally stored in ancillary text files or encapsulated with the data in proprietary formats. Some information may also need to be recorded manually in log books. In uvmat, we use the xml format to record the key parameters (timing, calibration parameters...) in a standard way, independent of the instrument brand.
  • For processed data, the NetCDF format allows us to store metadata as "attributes". In uvmat, xml files are produced and stored beside the result folder to keep track of all the processing parameters.
  • For published data, the technical documentation of the previous levels should be accessible, but a standardised metadata set of 15 items, the Dublin Core (https://en.wikipedia.org/wiki/Dublin_Core), has been established to characterise publications independently of their content.
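As an illustration of the last item, a Dublin Core record can be written as a small xml file with Python's standard library. This is a simplified sketch: the root element name "metadata", the function name and the field values are assumptions for the example, and a real Dublin Core record would normally declare the namespace http://purl.org/dc/elements/1.1/ for its elements.

```python
import xml.etree.ElementTree as ET

# The 15 elements of the Dublin Core Metadata Element Set
DUBLIN_CORE = ["title", "creator", "subject", "description", "publisher",
               "contributor", "date", "type", "format", "identifier",
               "source", "language", "relation", "coverage", "rights"]

def dublin_core_record(**fields):
    """Build an xml record, rejecting names outside the 15 Dublin Core elements."""
    unknown = sorted(set(fields) - set(DUBLIN_CORE))
    if unknown:
        raise ValueError(f"not Dublin Core elements: {unknown}")
    root = ET.Element("metadata")  # hypothetical root element name
    for name in DUBLIN_CORE:       # keep a fixed element order
        if name in fields:
            ET.SubElement(root, name).text = str(fields[name])
    return root

# Placeholder values, to be replaced by the actual project information
record = dublin_core_record(title="Project title (long)",
                            creator="Lead Author name",
                            date="2018-06-17",
                            format="NetCDF")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```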
Last modified on Jun 17, 2018, 10:55:44 AM
