Revision as of 09:26, 27 May 2016
Data Documentation
All data included in the database has to be documented!
All abbreviations have to be documented in the Glossary!
Naming of Data
Database Name
The name of the database is oedb
Database Schema
The structure of the database is realised via the naming of the schema!
- always lower case
- no points, no commas
- no spaces
- no dates
- use underscores
- name starts with the type of schema
  - orig for original data
  - calc for processed data
- name includes distinct subject area or source
Example: orig_vg250
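The schema naming rules above can be checked mechanically. The following is a minimal sketch, assuming Python; the function name `is_valid_schema_name` and the regular expression are illustrative and not part of the oedb conventions. Digits are allowed because the example `orig_vg250` contains them, so the "no dates" rule still has to be checked by eye.

```python
import re

# Hypothetical pattern: lower case, underscores only,
# and a leading schema type of either "orig" or "calc".
SCHEMA_NAME = re.compile(r"(orig|calc)_[a-z0-9]+(_[a-z0-9]+)*")


def is_valid_schema_name(name):
    """Return True if the name follows the schema naming rules listed above."""
    return SCHEMA_NAME.fullmatch(name) is not None


print(is_valid_schema_name("orig_vg250"))
print(is_valid_schema_name("Orig vg250"))
```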
Database Table
- always lower case
- no points, no commas
- no spaces
- no dates
- use underscores
- name starts with the source (e.g. zensus)
- followed by the main value (e.g. population)
- followed by the attribute the data is separated by, if any [attribute] (e.g. by_gender)
- followed by the resolution [tuple] (e.g. per_mun)
Example: zensus_population_by_gender_per_mun
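As an illustration of how the parts of a table name fit together, here is a small sketch in Python. The helper `build_table_name` and its parameter names are assumptions made for this example only; the rules themselves are the ones listed above.

```python
import re


def build_table_name(source, main_value, attribute=None, resolution=None):
    """Assemble a table name as source, main value, by_<attribute>, per_<resolution>."""
    parts = [source, main_value]
    if attribute:
        parts.append("by_" + attribute)
    if resolution:
        parts.append("per_" + resolution)
    name = "_".join(parts)
    # Enforce lower case, no points, commas or spaces, underscores as separators.
    if not re.fullmatch(r"[a-z0-9]+(_[a-z0-9]+)*", name):
        raise ValueError("table name violates the naming rules: " + repr(name))
    return name


print(build_table_name("zensus", "population", "gender", "mun"))
```

The call reproduces the example name `zensus_population_by_gender_per_mun`; a part containing upper case letters, spaces or points raises a `ValueError`.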
Data Integrity
General
- Primary Key [PK]
- Grants to oeuser
Geographic Data
- WGS84 - EPSG: 4326 (http://spatialreference.org/ref/epsg/4326/)
- ETRS89 / ETRS-LAEA - EPSG: 3035 (http://spatialreference.org/ref/epsg/3035/)
- Spatial index GIST
Data Referencing
Original Data (orig)
Tables are annotated with a comment in the form of a JSON string:
{"Name": "Wirtschaftsdaten pro Region",
"Source": ["Regionaldatenbank Deutschland", "www.regionalstatistik.de / registration required"],
"Reference date": ["2013"],
"Retrieved": ["04.06.2014"],
"Date of collection": ["01.08.2013"],
"Original file": ["346-22-5.xls"],
"Spatial resolution": ["Germany"],
"Description": ["Financial key figures of German municipalities (annual totals)", "Regional level: municipalities, association of municipalities"],
"Table fields": [
{"name":"id",
"description":"Unique identifier",
"description_german":"",
"unit":"" },
{"name":"year",
"description":"Reference Year",
"description_german":"",
"unit":"" },
{"name":"example_value",
"description":"Some important value",
"description_german":"",
"unit":"EUR" }],
"Changes":[
{ "name":"Wolf-Dieter Bunke",
"mail":"wd.bunke@gmail.com",
"date":"16.06.2014",
"comment":"Created table" },
{ "name":"Cord Kaldemeyer",
"mail":"cord.kaldemeyer@fh-flensburg.de",
"date":"17.07.2014",
"comment":"Translated field names"}],
"ToDo": ["Some datasets are odd -> Check numbers against another data"],
"Licence": ["Datenlizenz Deutschland – Namensnennung – Version 2.0 (dl-de/by-2-0; http://www.govdata.de/dl-de/by-2-0)"],
"Instructions for proper use": ["Always state licence"]}
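One way to attach such a JSON string to a table is to generate the comment statement from a dictionary, so that the JSON is always well-formed. The sketch below assumes Python and a PostgreSQL database (`COMMENT ON TABLE` is PostgreSQL syntax); the function `table_comment_sql` and the schema and table names in the usage line are hypothetical.

```python
import json


def table_comment_sql(schema, table, metadata):
    """Build a COMMENT ON TABLE statement carrying the metadata as JSON."""
    # json.dumps guarantees valid JSON (quotes, colons, commas).
    payload = json.dumps(metadata, ensure_ascii=False)
    # Single quotes must be doubled inside an SQL string literal.
    escaped = payload.replace("'", "''")
    return "COMMENT ON TABLE {}.{} IS '{}';".format(schema, table, escaped)


metadata = {
    "Name": "Wirtschaftsdaten pro Region",
    "Reference date": ["2013"],
    "Instructions for proper use": ["Always state licence"],
}
print(table_comment_sql("orig_example", "example_table", metadata))
```

Schema and table names are interpolated as-is here, so they should come from trusted input that already follows the naming rules.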
Processed Data (calc)
{"Name": "Results",
"Date of collection": ["01.08.2013"],
"Spatial resolution": ["Germany"],
"Description": ["Financial key figures of German municipalities (annual totals)", "Regional level: municipalities, association of municipalities"],
"Table fields": [
{"name":"id",
"description":"Unique identifier",
"description_german":"",
"unit":"" },
{"name":"year",
"description":"Reference Year",
"description_german":"",
"unit":"" },
{"name":"example_value",
"description":"Some important value",
"description_german":"",
"unit":"EUR" }],
"Changes":[
{ "name":"Autor1",
"mail":"Autor1@e-mail.com",
"date":"16.06.2014",
"comment":"Created table" },
{ "name":"Autor2",
"mail":"Autor2@e-mail.com",
"date":"17.07.2014",
"comment":"Translated field names"}],
"ToDo": ["Some datasets are odd -> Check numbers against another data"],
"Licence": ["Datenlizenz Deutschland – Namensnennung – Version 2.0 (dl-de/by-2-0; http://www.govdata.de/dl-de/by-2-0)"],
"Instructions for proper use": ["Always state licence"]}
Processed Data (calc) - Row Annotation
Each row has to be annotated with a JSON dictionary that must contain the following fields:
- origin: Link to or textual description of the data set this row originates from.
- method: Method used to calculate this row from the above origin (e.g. link to a Python script)
- assumptions: A list of dictionaries. Each dictionary describes an assumption and annotates the affected rows.
  - begin: First column affected by the assumption
  - end: Last column affected by the assumption
  - type: Type of the problem that had to be solved. Each type requires one or more additional keys in this dictionary. Possible types and their required additional keys are:
    - gap: Not all fields could be calculated and/or filled
      - solution: Method that was used to generate data to fill this gap (e.g. linear interpolation)
    - multiplicity: A field could be filled by several values
      - values: Possible values that could have been used
      - solution: Method that was used to select one value (e.g. minimum)
An exemplary dictionary:
{
  "origin": "https://data.openmod-initiative.org/data/oedb/orig_db/table",
  "method": "https://github.com/openego/data_processing/blob/master/calc_ego_substation/Voronoi_ehv.sql",
  "assumptions":
  [
    {
      "type": "gap",
      "begin": "step_15",
      "end": "step_34",
      "solution": "linear_interpolation"
    }
  ]
}
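The required fields of a row annotation can be checked before the row is written. This is a minimal sketch, assuming Python; `check_row_annotation` is a hypothetical helper, and the key is spelled `assumptions` as in the example dictionary.

```python
REQUIRED_KEYS = {"origin", "method", "assumptions"}
# Additional keys each assumption type requires, as listed above.
TYPE_KEYS = {
    "gap": {"solution"},
    "multiplicity": {"values", "solution"},
}


def check_row_annotation(annotation):
    """Return a list of problems; an empty list means the annotation is complete."""
    problems = []
    for key in sorted(REQUIRED_KEYS - annotation.keys()):
        problems.append("missing key: " + key)
    for i, assumption in enumerate(annotation.get("assumptions", [])):
        for key in ("begin", "end", "type"):
            if key not in assumption:
                problems.append("assumption {}: missing key: {}".format(i, key))
        kind = assumption.get("type")
        if kind is not None and kind not in TYPE_KEYS:
            problems.append("assumption {}: unknown type: {}".format(i, kind))
        for key in sorted(TYPE_KEYS.get(kind, set()) - assumption.keys()):
            problems.append("assumption {}: missing key: {}".format(i, key))
    return problems


annotation = {
    "origin": "https://data.openmod-initiative.org/data/oedb/orig_db/table",
    "method": "https://github.com/openego/data_processing/blob/master/calc_ego_substation/Voronoi_ehv.sql",
    "assumptions": [
        {"type": "gap", "begin": "step_15", "end": "step_34",
         "solution": "linear_interpolation"}
    ],
}
print(check_row_annotation(annotation))
```

Returning a list of problems instead of raising on the first one makes it possible to report every missing key of a row at once.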