ngs file format reference

This is documentation on the data exchange format for the 'ngs' system.

To ease data exchange, this system comes with a simple 'tab separated values' file format. In such text files the data is formatted in tables with the columns separated using tabs, colons, or semi-colons. The advantage is that these files can be easily created and parsed using common spreadsheet tools like Excel. An example of such a tab-delimited file is shown below:

name	description	date
Experiment1	This is my first experiment	2010-01-19
Experiment2	This is my second experiment	2010-01-20

This document describes what file types and columns are defined for the 'ngs' system. Data in this format can be uploaded to the database via the user interface (using the 'File' menu). Alternatively, a whole directory of such files can be loaded in batch using the CsvImport program. The file types currently recognized by this program are described below, grouped by topic. For each file type, the columns are detailed and example data is shown (if available).
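
As an illustration of how such a tab-delimited file can be read outside a spreadsheet tool, the following minimal sketch parses the example table with Python's standard csv module. The helper name read_table is hypothetical and is not part of the 'ngs' system or of CsvImport.

```python
import csv
import io


def read_table(text, delimiter="\t"):
    """Parse delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# The example table from this document, with real tab characters.
example = (
    "name\tdescription\tdate\n"
    "Experiment1\tThis is my first experiment\t2010-01-19\n"
    "Experiment2\tThis is my second experiment\t2010-01-20\n"
)

rows = read_table(example)
print(rows[0]["name"], rows[0]["date"])
```

Each row becomes a dictionary keyed by the column names in the header line, which is the shape assumed by the other sketches in this document.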

NGS module file types

Extension to Observ-OM to support Next Gen Sequencing (NGS) projects using the Illumina platform. Each project consists of NGS experiments having one NGS sample. One or more samples are placed on a FlowcellLane, optionally with a barcode. An experiment can be performed one time or many times on the same sample. Additionally, an experiment can contain a merged collection of experiments done on the same sample.

File: project.txt

Contents:
A Project bundles information about one project.

Structure:
column name type required? auto/default description
projectname string YES   The name of a project.
projectcomment string     The comment of a project.
projectcustomer_UserName xref     The user who ordered the project. This xref uses {projectcustomer_username} to find related elements in file ngsUser.txt based on unique column {username}.
projectanalist_UserName mref     The user who will be responsible for performing the wet-lab activities. This mref uses {projectanalist_username} to find related elements in file ngsUser.txt based on unique column {username}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
projectplannedfinishdate datetime     The date a project is expected to be finished.
laneamount int     Number of lanes planned in this project.
sampleamount int     Number of samples planned in this project.
sampletype enum   DNA Type of sample.
seqtype enum   PE The type of sequencing.
prepkit_PrepKitName xref     Preparation kit used on the sample for this experiment. This xref uses {prepkit_prepkitname} to find related elements in file prepKit.txt based on unique column {prepkitname}.
declarationnr string     The Declaration Nr of a project.
gccanalysis bool     Indication if variant calling has to be performed on the experiment.
gccinstructions string     Specific instructions for GCC members for the analysis.
resultfilesdir string     File location of the resulting files after analysis occurred (for file download purpose).
resultshippeduser_UserName xref     The user who shipped the data. This xref uses {resultshippeduser_username} to find related elements in file ngsUser.txt based on unique column {username}.
resultshippedto_UserName xref     Name of the person to whom the data has been shipped. This xref uses {resultshippedto_username} to find related elements in file ngsUser.txt based on unique column {username}.
resultshippeddate date     The date the result of the experiment is shipped to the customer.
Constraint: values in column projectname should be unique.
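
The mref column above holds several references in one cell, separated by '|'. A minimal sketch of splitting such a cell is shown below. The function name split_mref is hypothetical, and stripping surrounding whitespace and skipping empty parts are assumptions that this document does not specify.

```python
def split_mref(cell):
    """Split an mref cell such as 'ref1|ref2|ref3' into a list of references."""
    if not cell:
        return []
    # Assumption: whitespace around a reference is not significant,
    # and empty parts (e.g. from a trailing '|') are ignored.
    return [part.strip() for part in cell.split("|") if part.strip()]


print(split_mref("ref1|ref2|ref3"))
```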

File: ngsuser.txt

Contents:
An NgsUser bundles information about a user.

Structure:
column name type required? auto/default description
username string YES   The name of a user.
useremail string     The email of a user.
userrole enum YES   The role the user has.
usergroup enum     The group a user belongs to.
Constraint: values in column username should be unique.
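
A uniqueness constraint like the one above can be checked before upload. The sketch below reports values that occur more than once in a column. The function name duplicate_values and the user names are hypothetical, and the rows are assumed to be dictionaries keyed by column name.

```python
from collections import Counter


def duplicate_values(rows, column):
    """Return the sorted values that appear more than once in the given column."""
    counts = Counter(row[column] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical rows from ngsuser.txt, reduced to the username column.
users = [{"username": "alice"}, {"username": "bob"}, {"username": "alice"}]
duplicates = duplicate_values(users, "username")
print(duplicates)
```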

File: sample.txt

Contents:
A Sample bundles information about a sample. Samples can be pooled. A pooled sample is just another sample, but with an additional mref (many to many) linking to the samples inside the pool.

Structure:
column name type required? auto/default description
internalid string YES   The number of a sample as used in-house.
externalid string     The name of a sample as known by the customer.
samplecomment text     Comments about the sample.
projectid_ProjectName xref YES   The project that this sample is part of. This xref uses {projectid_projectname} to find related elements in file project.txt based on unique column {projectname}.
arrayfile string     Location of arrayfile for the sample in this lane-barcode.
arrayid string     ID of the sample on the arrayFile.
capturingkit_CapturingKitName xref     Capturing kit used. This xref uses {capturingkit_capturingkitname} to find related elements in file capturingKit.txt based on unique column {capturingkitname}.
samplebarcode_SampleBarcodeName xref     Multiple samples can be on one flowcell using barcodes. This xref uses {samplebarcode_samplebarcodename} to find related elements in file sampleBarcode.txt based on unique column {samplebarcodename}.
sampleinpool_InternalId mref     List of samples inside the pool. This mref uses {sampleinpool_internalid} to find related elements in file sample.txt based on unique column {internalid}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
labstatus enum   Not Started Lab status phase of this sample.
Constraint: values in column internalid should be unique.
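
An xref value must match a value in the unique column of the referenced file. The following sketch lists xref values in sample.txt that have no matching projectname in project.txt. The function name unresolved_xrefs and all identifiers in the sample data are hypothetical. Column names are used exactly as written in the tables above, and how the import handles letter case in headers is not covered here.

```python
def unresolved_xrefs(rows, xref_column, target_rows, target_column):
    """Return xref values in rows that do not occur in target_rows[target_column]."""
    known = {row[target_column] for row in target_rows}
    return [
        row[xref_column]
        for row in rows
        if row.get(xref_column) and row[xref_column] not in known
    ]


# Hypothetical content of project.txt and sample.txt.
projects = [{"projectname": "ProjectA"}]
samples = [
    {"internalid": "S1", "projectid_ProjectName": "ProjectA"},
    {"internalid": "S2", "projectid_ProjectName": "ProjectB"},
]

missing = unresolved_xrefs(samples, "projectid_ProjectName", projects, "projectname")
print(missing)
```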

File: flowcelllane.txt

Contents:
A FlowcellLaneSampleBarcode bundles information about a sample which is added to a lane on a flowcell.

Structure:
column name type required? auto/default description
flowcell_FlowcellName xref     The flowcell to which this lane belongs. This xref uses {flowcell_flowcellname} to find related elements in file flowcell.txt based on unique column {flowcellname}.
lane enum     The lane number that is unique within a flow cell with a range of 1-8.
sample_InternalId xref YES   Sample. This xref uses {sample_internalid} to find related elements in file sample.txt based on unique column {internalid}.
flowcelllanecomment text     Comment about the experiment.
qcwetmet enum   Not determined Quality criteria of the wet lab met.
qcwetuser_UserName xref     The user who checked if the quality score on the wet-lab was met. This xref uses {qcwetuser_username} to find related elements in file ngsUser.txt based on unique column {username}.
qcwetdate datetime     The date the QcWetUser selected the QcWetMet value.
qcdrymet enum   Not determined Variant calling quality criteria.
qcdryuser_UserName xref     The user who checked if the quality score of the analysis was met. This xref uses {qcdryuser_username} to find related elements in file ngsUser.txt based on unique column {username}.
qcdrydate datetime     The date the QcDryUser selected the QcDryMet value.
Constraint: values in the combined columns (flowcell, lane, sample) should be unique.
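
The lane range and the combined-column constraint above can be checked together. The sketch below is one possible pre-upload check, not the validation performed by the system itself. The function name check_flowcell_lanes and the sample identifiers are hypothetical, and lane values are assumed to be stored as the text '1' to '8'.

```python
VALID_LANES = {str(n) for n in range(1, 9)}  # lanes 1-8, as stated in the table


def check_flowcell_lanes(rows):
    """Return a list of problem descriptions for flowcelllane rows."""
    problems = []
    seen = set()
    for index, row in enumerate(rows, start=1):
        lane = row.get("lane", "")
        # The lane column is optional, so an empty value is accepted.
        if lane and lane not in VALID_LANES:
            problems.append(f"row {index}: lane {lane!r} is outside 1-8")
        key = (
            row.get("flowcell_FlowcellName", ""),
            lane,
            row.get("sample_InternalId", ""),
        )
        if key in seen:
            problems.append(f"row {index}: duplicate (flowcell, lane, sample) {key}")
        seen.add(key)
    return problems


# Hypothetical rows: row 2 has an invalid lane, row 3 repeats row 1.
lanes = [
    {"flowcell_FlowcellName": "FC001", "lane": "1", "sample_InternalId": "S1"},
    {"flowcell_FlowcellName": "FC001", "lane": "9", "sample_InternalId": "S2"},
    {"flowcell_FlowcellName": "FC001", "lane": "1", "sample_InternalId": "S1"},
]
problems = check_flowcell_lanes(lanes)
for problem in problems:
    print(problem)
```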

File: flowcell.txt

Structure:
column name type required? auto/default description
flowcelldirection enum     The direction used to read sequences.
flowcellname string YES   The unique name of a flowcell.
machine_MachineName xref     The machine used for sequencing. This xref uses {machine_machinename} to find related elements in file machine.txt based on unique column {machinename}.
run string     The run number; always use 4 digits (with leading zeros).
rundate datetime     The date the flowcell was run on the machine.
Constraint: values in column flowcellname should be unique.
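
The run column expects four digits with leading zeros. A minimal sketch of producing such a value from an integer is shown below. The function name format_run is hypothetical, and rejecting numbers outside 0-9999 is an assumption made for this example.

```python
def format_run(run_number):
    """Format a run number as a 4-digit string with leading zeros."""
    # Assumption: run numbers that do not fit in 4 digits are rejected.
    if not 0 <= run_number <= 9999:
        raise ValueError(f"run number {run_number} does not fit in 4 digits")
    return f"{run_number:04d}"


print(format_run(7))
```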

File: samplebarcode.txt

Structure:
column name type required? auto/default description
samplebarcodenr string     The identifier of a barcode given by the supplier.
samplebarcodesequence string YES   The nucleotide sequence which forms the barcode.
samplebarcodetype_SampleBarcodeTypeName xref     Type. This xref uses {samplebarcodetype_samplebarcodetypename} to find related elements in file sampleBarcodeType.txt based on unique column {samplebarcodetypename}.
samplebarcodename string     A concatenated value of Type, Nr and Sequence.
Constraint: values in column samplebarcodename should be unique.
Constraint: values in the combined columns (samplebarcodetype, samplebarcodenr, samplebarcodesequence) should be unique.

File: samplebarcodetype.txt

Structure:
column name type required? auto/default description
samplebarcodetypename string YES   The name of a barcode type.
Constraint: values in column samplebarcodetypename should be unique.

File: capturingkit.txt

Structure:
column name type required? auto/default description
capturingkitname string YES   The name of a capturing kit.
Constraint: values in column capturingkitname should be unique.

File: prepkit.txt

Structure:
column name type required? auto/default description
prepkitname string YES   The name of a prep kit.
Constraint: values in column prepkitname should be unique.

File: machine.txt

Structure:
column name type required? auto/default description
machinename string YES   The name of a sequence machine.
Constraint: values in column machinename should be unique.

org.molgenis.omx.core file types

File: molgenisentity.txt

Contents:
Referenceable catalog of entity names, menus, forms and plugins.

Structure:
column name type required? auto/default description
name string YES   Name of the entity.
type_ string YES   Type of the entity.
classname string YES   Full name of the entity.
Constraint: values in column classname should be unique.
Constraint: values in the combined columns (name, type_) should be unique.

File: molgenisfile.txt

Contents:
Helper entity to deal with files. Has a decorator to regulate storage and coupling to an Entity. Do not make abstract because of subtyping. This means the names of the subclasses will be used to distinguish MolgenisFiles and place them in the correct folders.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
Constraint: values in column identifier should be unique.
Constraint: values in column name should be unique.
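
The extension column is mapped to a MIME type at runtime. How the system performs this mapping is not described here. As a stand-in, the sketch below uses Python's standard mimetypes module to show the idea. The function name mime_for_extension and the fallback value are assumptions.

```python
import mimetypes


def mime_for_extension(extension):
    """Look up a MIME type for a file extension such as 'png'."""
    mime, _encoding = mimetypes.guess_type("file." + extension.lstrip("."))
    # Assumption: unknown extensions fall back to a generic binary type.
    return mime or "application/octet-stream"


print(mime_for_extension("png"))
```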

File: runtimeproperty.txt

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
value text YES   Value.
Constraint: values in column identifier should be unique.
Constraint: values in column name should be unique.

org.molgenis.omx.observ file types

Observ-OM is a model to uniformly describe any phenotypic, genotypic or molecular observation. The four core concepts are:

File: characteristic.txt

Contents:
Characteristics are yes-no statements about things in the world. These can be used as part of an observation, as parameter of ObservableFeature ('measuredCharacteristic'). For example: 'What is allele of [Marker]', here the [Marker] is a characteristic. Also, Characteristics can be used as target of observation. Typical examples are 'Individual' or 'Panel'. But 'Marker' can also be a Target when asked the question 'QTL p-value for [phenotype]': here both target and feature are characteristics, for example 'leaf count' (phenotype characteristic) and 'PVV4' (marker characteristic).

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
Constraint: values in column identifier should be unique.

File: observationtarget.txt

Contents:
ObservationTarget defines subjects of observation, such as Individual, Panel, Sample, etc. For instance: 'target 1' IS A 'Individual'.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
Constraint: values in column identifier should be unique.

File: observablefeature.txt

Contents:
ObservableFeature defines anything that can be observed.

In other words, ObservableFeature are the questions asked, e.g. 'What is Height?', 'What is Systolic blood pressure?', or 'Has blue eyes?'.

Some questions may be repeated for multiple characteristics. For example 'What is [MarkerAllele] observed?' can be applied to all elements of a MarkerSet, and 'What is [medicine codes] uses' can be applied to a set of Medicine codes. This can be specified using the measuredCharacteristic field.

The identifier of ObservableFeature is globally unique. It is recommended that each ObservableFeature is named according to a well-defined ontology term or database accession.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
unit_Identifier xref     (Optional) Reference to the well-defined measurement unit used to observe this feature (if feature is that concrete). E.g. mmHg. This xref uses {unit_identifier} to find related elements in file ontologyTerm.txt based on unique column {identifier}.
definition_Identifier xref     The concept that is being measured in a specific way. This xref uses {definition_identifier} to find related elements in file ontologyTerm.txt based on unique column {identifier}.
datatype enum   string (Optional) Reference to the technical data type. E.g. 'int'.
temporal bool   false Whether this feature is time dependent and can have different values when measured on different times (e.g. weight, temporal=true) or generally only measured once (e.g. birth date, temporal=false).
Constraint: values in column identifier should be unique.

File: category.txt

Contents:
Category is partOf ObservableFeature to define categories for an ObservableFeature, such as the categorical answer codes that are often used in Questionnaires. For example the ObservableFeature 'sex' has {code_string = 1, label=male} and {code_string = 2, label=female}. Category can be linked to well-defined ontology terms via the ontologyReference. Category extends ObservationElement such that it can be referenced by ObservedValue.value. The Category class maps to METABASE::Category.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
observablefeature_Identifier xref YES   The Measurement these permitted values are part of. This xref uses {observablefeature_identifier} to find related elements in file observableFeature.txt based on unique column {identifier}.
valuecode string     The value used to store this category in ObservedValue. For example '1', '2'.
definition_Identifier xref     The category that is being measured in a specific way. This xref uses {definition_identifier} to find related elements in file ontologyTerm.txt based on unique column {identifier}.
ismissing bool   false Whether this value should be treated as a missing value.
Constraint: values in column identifier should be unique.
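
To illustrate how category codes relate to their labels, the sketch below builds a lookup from the 'sex' example in the Contents paragraph above. The function name build_code_lookup and the identifiers sex_1 and sex_2 are hypothetical, and using the name column as the label is an assumption.

```python
def build_code_lookup(category_rows):
    """Map (feature identifier, valuecode) to the category name."""
    return {
        (row["observablefeature_Identifier"], row["valuecode"]): row["name"]
        for row in category_rows
    }


# Hypothetical rows from category.txt for the ObservableFeature 'sex'.
categories = [
    {"identifier": "sex_1", "name": "male",
     "observablefeature_Identifier": "sex", "valuecode": "1"},
    {"identifier": "sex_2", "name": "female",
     "observablefeature_Identifier": "sex", "valuecode": "2"},
]

lookup = build_code_lookup(categories)
print(lookup[("sex", "2")])
```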

File: protocol.txt

Contents:
The Protocol class defines parameterizable descriptions of (analysis) methods. Examples of protocols are: Questionnaires, SOPs, Assay platforms, Statistical analyses, etc. Each protocol has a unique identifier. Protocol has an association to OntologyTerm to represent the type of protocol.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
protocoltype_Identifier xref     Classification of protocol. This xref uses {protocoltype_identifier} to find related elements in file ontologyTerm.txt based on unique column {identifier}.
subprotocols_Identifier mref     Subprotocols of this protocol. This mref uses {subprotocols_identifier} to find related elements in file protocol.txt based on unique column {identifier}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
features_Identifier mref     Parameters (in/out) that are used or produced by this protocol. This mref uses {features_identifier} to find related elements in file observableFeature.txt based on unique column {identifier}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
Constraint: values in column identifier should be unique.

File: dataset.txt

Contents:
Container for one or more observations that are measured using the same protocol and by the same performer(s). The dataset may be a file (having the same identifier) but in most cases it is a data table consisting of rows (Observation). This entity replaces ProtocolApplication.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
protocolused_Identifier xref YES   Reference to the protocol that is being used (if available). This xref uses {protocolused_identifier} to find related elements in file protocol.txt based on unique column {identifier}.
starttime datetime   today Time when the protocol started.
endtime datetime   today (Optional) Time when the protocol ended.
Constraint: values in column identifier should be unique.

File: observationset.txt

Contents:
In practice: Observation is one row within a DataSet.

Structure:
column name type required? auto/default description
partofdataset_Identifier xref YES   DataSet this ValueSet is part of. This xref uses {partofdataset_identifier} to find related elements in file dataSet.txt based on unique column {identifier}.
time datetime     Time of this observationSet.
Constraint: values in the combined columns (partofdataset, time) should be unique.

File: observedvalue.txt

Contents:
Generic storage of values as part of one observation event. Values are atomic observations, e.g., length (feature) of individual 1 (valueset.target) = 179cm (value). Values can also be qualified by some characteristic, e.g., QTL p-value (feature) between phenotype 'leaf count' (characteristic) and marker 'PVV4' (valueset.target) = 0.1^10+3 (value).

Structure:
column name type required? auto/default description
observationset_id xref YES   Reference to the observation. For example a particular patient visit or the application of a microarray or the calculation of a QTL model. This xref uses {observationset_id} to find related elements in file observationSet.txt based on unique column {id}.
feature_Identifier xref YES   References the ObservableFeature that this observation was made on. For example 'probe123'. This xref uses {feature_identifier} to find related elements in file observableFeature.txt based on unique column {identifier}.
value_id xref     The value observed. This xref uses {value_id} to find related elements in file value.txt based on unique column {id}.

org.molgenis.omx.observ.target file types

File: species.txt

Contents:
Ontology terms for species. E.g. Arabidopsis thaliana. DISCUSSION: should we avoid subclasses of OntologyTerm and instead make a 'tag' filter on terms so we can make pulldowns context dependent (e.g. to only show particular subqueries of ontologies).

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
ontology_Identifier xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_identifier} to find related elements in file ontology.txt based on unique column {identifier}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
Constraint: values in column identifier should be unique.
Constraint: values in the combined columns (ontology, termaccession) should be unique.

File: individual.txt

Contents:
The Individuals class defines the subjects that are used as observation target. The Individual class maps to XGAP:Individual and PaGE:Individual. Groups of individuals can be defined via Panel.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
mother_Identifier xref     Refers to the mother of the individual. This xref uses {mother_identifier} to find related elements in file individual.txt based on unique column {identifier}.
father_Identifier xref     Refers to the father of the individual. This xref uses {father_identifier} to find related elements in file individual.txt based on unique column {identifier}.
Constraint: values in column identifier should be unique.

File: panel.txt

Contents:
The Panel class defines groups of individuals based on cohort design, case/controls, families, etc. For instance: 'LifeLines cohort', 'middle aged man', 'recombinant mouse inbred Line dba x b6' or 'Smith family'. A Panel can act as a single ObservationTarget. For example: average height (Measurement) in the LifeLines cohort (Panel) is 174cm (ObservedValue). The Panel class maps to XGAP:Strain and PaGE:Panel classes. In METABASE it is assumed that there is one panel per study.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
paneltype_Identifier xref     Indicate the type of Panel (example: Sample panel, AssayedPanel, Natural=wild type, Parental=parents of a cross, F1=First generation of cross, RCC=Recombinant congenic, CSS=chromosome substitution). This xref uses {paneltype_identifier} to find related elements in file ontologyTerm.txt based on unique column {identifier}.
numberofindividuals int YES   NumberOfIndividuals.
species_Identifier xref     The species this panel is an instance of/part of/extracted from. This xref uses {species_identifier} to find related elements in file species.txt based on unique column {identifier}.
individuals_Identifier mref     The list of individuals in this panel. This mref uses {individuals_identifier} to find related elements in file individual.txt based on unique column {identifier}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
Constraint: values in column identifier should be unique.

File: panelsource.txt

Contents:
PanelSource is partOf Panel to define how panels are related to their source panels (founder panels), for example by overlap, selection criteria, getting an assayed panel from a sample panel, etc.

Structure:
column name type required? auto/default description
currentpanel_Identifier xref YES   Panel for which these sources are defined. This xref uses {currentpanel_identifier} to find related elements in file panel.txt based on unique column {identifier}.
sourcepanel_Identifier xref YES   Source that contributed individuals to current panel. This xref uses {sourcepanel_identifier} to find related elements in file panel.txt based on unique column {identifier}.
numberofindividuals int     Number of individuals lifted over from this source.
selectioncriteria text YES   Inclusion/exclusion criteria used to select these individuals from source into current panel.

File: ontology.txt

Contents:
Ontology defines a reference to an ontology or controlled vocabulary from which well-defined and stable (ontology) terms can be obtained. Each Ontology should have a unique identifier, for instance: Gene Ontology, Mammalian Phenotype, Human Phenotype Ontology, Unified Medical Language System, Medical Subject Headings, etc. Also an abbreviation is required, for instance: GO, MP, HPO, UMLS, MeSH, etc. Use of existing ontologies/vocabularies is recommended to harmonize phenotypic feature and value descriptions. But one can also create a 'local' Ontology. The Ontology class maps to FuGE::Ontology, MAGE-TAB::TermSourceREF.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
ontologyaccession string     An accession that uniquely identifies the ontology (typically an acronym). E.g. GO, MeSH, HPO.
ontologyuri hyperlink     (Optional) A URI that references the location of the ontology.
Constraint: values in column identifier should be unique.

File: ontologyterm.txt

Contents:
OntologyTerm defines a single entry (term) from an ontology or a controlled vocabulary (defined by Ontology). The identifier of the ontology term is unique. E.g. 'NCI:Antigen Gene'. Other data entities can refer to this OntologyTerm to harmonize naming of concepts. If no suitable ontology term exists then one can define new terms locally (in which case there is no formal accession for the term, limiting its use for cross-Investigation queries).

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
ontology_Identifier xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_identifier} to find related elements in file ontology.txt based on unique column {identifier}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
Constraint: values in column identifier should be unique.
Constraint: values in the combined columns (ontology, termaccession) should be unique.

File: accession.txt

Contents:
An external identifier for an annotation. For example: name='R13H8.1', ontology='ensembl' or name='WBgene00000912', ontology='wormbase'.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
ontology_Identifier xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_identifier} to find related elements in file ontology.txt based on unique column {identifier}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
Constraint: values in column identifier should be unique.
Constraint: values in the combined columns (ontology, termaccession) should be unique.

org.molgenis.omx.observ.value file types

File: value.txt

Structure:
column name type required? auto/default description

File: boolvalue.txt

Structure:
column name type required? auto/default description
value bool YES   Value.

File: categoricalvalue.txt

Structure:
column name type required? auto/default description
value_Identifier xref YES   Value. This xref uses {value_identifier} to find related elements in file category.txt based on unique column {identifier}.

File: datevalue.txt

Structure:
column name type required? auto/default description
value date YES   Value.

File: datetimevalue.txt

Structure:
column name type required? auto/default description
value datetime YES   Value.

File: decimalvalue.txt

Structure:
column name type required? auto/default description
value decimal YES   Value.

File: emailvalue.txt

Structure:
column name type required? auto/default description
value email YES   Value.

File: htmlvalue.txt

Structure:
column name type required? auto/default description
value text YES   Value.

File: hyperlinkvalue.txt

Structure:
column name type required? auto/default description
value hyperlink YES   Value.

File: intvalue.txt

Structure:
column name type required? auto/default description
value int YES   Value.

File: longvalue.txt

Structure:
column name type required? auto/default description
value long YES   Value.

File: mrefvalue.txt

Structure:
column name type required? auto/default description
value_Identifier mref YES   Value. This mref uses {value_identifier} to find related elements in file characteristic.txt based on unique column {identifier}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.

File: stringvalue.txt

Structure:
column name type required? auto/default description
value string YES   Value.

File: textvalue.txt

Structure:
column name type required? auto/default description
value text YES   Value.

File: xrefvalue.txt

Structure:
column name type required? auto/default description
value_Identifier xref YES   Value. This xref uses {value_identifier} to find related elements in file characteristic.txt based on unique column {identifier}.

org.molgenis.omx.auth file types

File: molgenisrole.txt

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   Name.
Constraint: values in column identifier should be unique.
Constraint: values in column name should be unique.
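
A uniqueness constraint on a single column can be verified before upload. The sketch below assumes the rows have already been read into dictionaries keyed by column name; the function name and the sample roles are hypothetical.

```python
from collections import Counter


def duplicate_values(rows, column):
    """Return the values that occur more than once in `column`, sorted."""
    counts = Counter(row[column] for row in rows)
    return sorted(value for value, count in counts.items() if count > 1)


# Hypothetical rows as they might be read from molgenisrole.txt.
roles = [
    {"identifier": "R1", "name": "admin"},
    {"identifier": "R2", "name": "viewer"},
    {"identifier": "R3", "name": "admin"},
]

duplicate_identifiers = duplicate_values(roles, "identifier")
duplicate_names = duplicate_values(roles, "name")
```

In this sample the identifiers are unique, while the name 'admin' is reported as a duplicate.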

File: molgenisgroup.txt

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   Name.
Constraint: values in column identifier should be unique.
Constraint: values in column name should be unique.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
group__Name
xref YES   group_. This xref uses {group__name} to find related elements in file molgenisGroup.txt based on unique column {name}.
role__Name
xref YES   role_. This xref uses {role__name} to find related elements in file molgenisRole.txt based on unique column {name}.
Constraint: values in column identifier should be unique.
Constraint: values in the combined columns (group_, role_) should be unique.
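
For a constraint on combined columns, it is the tuple of values that must not repeat, even if each individual value occurs several times. A minimal sketch follows; the function name and sample rows are hypothetical, and the lower-case keys group__name and role__name are assumed to correspond to the columns group__Name and role__Name.

```python
def duplicate_combinations(rows, columns):
    """Return each value combination of `columns` that occurs more than once."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(row[column] for column in columns)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Hypothetical rows: 'staff' and 'admin' may each repeat,
# but the pair ('staff', 'admin') may occur only once.
links = [
    {"identifier": "L1", "group__name": "staff", "role__name": "admin"},
    {"identifier": "L2", "group__name": "staff", "role__name": "viewer"},
    {"identifier": "L3", "group__name": "staff", "role__name": "admin"},
]

repeated = duplicate_combinations(links, ("group__name", "role__name"))
```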

File: person.txt

Contents:
Person represents one or more people involved with an Investigation. This may include authors on a paper, lab personnel or PIs. A Person has a last name, first name, mid initials, address, contact details and email. A Person role is included to represent how a Person is involved with an investigation. For submission to a repository, an allowed value is 'submitter', a term that is present in the MGED Ontology; an alternative use could be to represent a job title. An example from ArrayExpress is E-MTAB-506 (ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/TABM/E-TABM-506/E-TABM-506.idf.txt).
The FuGE equivalent to Person is FuGE::Person.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
firstname string     First Name.
midinitials string     Mid Initials.
lastname string     Last Name.
title string     An academic title, e.g. Prof.dr, PhD.
affiliation_Name
xref     Affiliation. This xref uses {affiliation_name} to find related elements in file institute.txt based on unique column {name}.
department string     Added from the old definition of MolgenisUser. Department of this contact.
roles_Identifier
xref     Indicates the role of the contact, e.g. lab worker or PI. Changed from mref to xref in Oct 2011. This xref uses {roles_identifier} to find related elements in file personRole.txt based on unique column {identifier}.
Constraint: values in column identifier should be unique.
Constraint: values in column email should be unique.

File: personrole.txt

Contents:
Separate type of ontologyTerm to administer roles.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
ontology_Identifier
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_identifier} to find related elements in file ontology.txt based on unique column {identifier}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
Constraint: values in column identifier should be unique.
Constraint: values in the combined columns (ontology, termaccession) should be unique.

File: institute.txt

Contents:
A contact is either a person or an organization. Copied from FuGE::Contact.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   Name.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
Constraint: values in column identifier should be unique.
Constraint: values in column name should be unique.

File: molgenisuser.txt

Contents:
Anyone who can log in.

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   Name.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
firstname string     First Name.
midinitials string     Mid Initials.
lastname string     Last Name.
title string     An academic title, e.g. Prof.dr, PhD.
affiliation_Name
xref     Affiliation. This xref uses {affiliation_name} to find related elements in file institute.txt based on unique column {name}.
department string     Added from the old definition of MolgenisUser. Department of this contact.
roles_Identifier
xref     Indicates the role of the contact, e.g. lab worker or PI. Changed from mref to xref in Oct 2011. This xref uses {roles_identifier} to find related elements in file personRole.txt based on unique column {identifier}.
password_ string   secret big fixme: password type.
activationcode string     Used as alternative authentication mechanism to verify user email and/or if user has lost password.
active bool   false Boolean to indicate whether this account can be used to log in.
superuser bool   false superuser.
Constraint: values in column identifier should be unique.
Constraint: values in column name should be unique.
Constraint: values in column email should be unique.

File: molgenispermission.txt

Structure:
column name type required? auto/default description
identifier string YES   user supplied or automatically assigned (using a decorator) unique and short identifier, e.g. MA1234.
name string YES   human-readable name, not necessarily unique.
role__Name
xref YES   role_. This xref uses {role__name} to find related elements in file molgenisRole.txt based on unique column {name}.
entity_className
xref YES   entity. This xref uses {entity_classname} to find related elements in file molgenisEntity.txt based on unique column {classname}.
permission enum YES   permission.
Constraint: values in column identifier should be unique.
Constraint: values in the combined columns (role_, entity, permission) should be unique.

Appendix: documentation of the mref tables

ngs file types

File: project_projectanalist.txt

Contents:
Link table for many-to-many relationship 'Project.ProjectAnalist'.

Structure:
column name type required? auto/default description
projectanalist_UserName
xref YES   This xref uses {projectanalist_username} to find related elements in file ngsUser.txt based on unique column {username}.
project_ProjectName
xref YES   This xref uses {project_projectname} to find related elements in file project.txt based on unique column {projectname}.
Constraint: values in the combined columns (projectanalist, project) should be unique.
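
To illustrate how this link table relates to the mref column projectanalist_UserName in project.txt, the sketch below expands '|'-separated references into one link row per (user, project) pair. It is an illustration only, not the import program's implementation; the function name, the sample project and the user names are hypothetical.

```python
def project_analist_links(project_rows):
    """Expand the projectanalist_UserName mref into one link row per pair."""
    links = []
    seen = set()
    for row in project_rows:
        for user in row["projectanalist_UserName"].split("|"):
            user = user.strip()
            pair = (user, row["projectname"])
            # Skip empty entries and repeated pairs, so that the combined
            # columns stay unique as the constraint above requires.
            if user and pair not in seen:
                seen.add(pair)
                links.append(
                    {
                        "projectanalist_UserName": user,
                        "project_ProjectName": row["projectname"],
                    }
                )
    return links


# Hypothetical project row with two analysts, one listed twice.
projects = [{"projectname": "P1", "projectanalist_UserName": "alice|bob|alice"}]
link_rows = project_analist_links(projects)
```

Each resulting dictionary corresponds to one line of project_projectanalist.txt.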

File: sample_sampleinpool.txt

Contents:
Link table for many-to-many relationship 'Sample.SampleInPool'.

Structure:
column name type required? auto/default description
sampleinpool_InternalId
xref YES   This xref uses {sampleinpool_internalid} to find related elements in file sample.txt based on unique column {internalid}.
sample_InternalId
xref YES   This xref uses {sample_internalid} to find related elements in file sample.txt based on unique column {internalid}.
Constraint: values in the combined columns (sampleinpool, sample) should be unique.