Context
Germany (BMWi, BNetzA) plans to develop a register of power plant data that covers all plants (“starting from 0 kW”) and supersedes the current “EEG Stammdatenbank” and the “BNetzA Kraftwerksliste”.
- geographic scope: Germany
- content: capacity, technical characteristics, detailed information such as hub height of wind turbines (“master data”)
- detail: individual blocks
- go-live: early 2017
- not included: time series of any sort; yearly generation data
- license: BMWi and BNetzA have not thought about a license yet
http://www.bundesnetzagentur.de/DE/Sachgebiete/ElektrizitaetundGas/Unternehmen_Institutionen/DatenaustauschundMonitoring/MaStR/Datendefinitionen/mastr_datendefinitionen-node.html
There is a three-step consultation process with three deadlines ("milestones").
Milestone 1 (deadline: 31 August 2015)
General points
- We support the idea of a register
- Should be published under an open license (suggesting ODbL? [1])
- The register should “think” European and be compatible with existing registers elsewhere. Do we have specific examples? (ORISPL - Office of Regulatory Information System Plant Location (US))
Specific points
- the coverage of the data should become clear (e.g., "X% of all capacity is included; most missing capacity is small-scale gas-fired plants")
- There are > 800k PV installations - will these be included?
- Does Stromerzeugungsanlagen (EAS) distinguish between generators and boilers?
- Form EIA-923 & Form EIA-860 are excellent examples of how to manage data on the power sector
- Are market roles and functions linked? Or would we have to deduplicate things by hand? Also, what about companies that may have a joint partnership in a power plant? Or subsidiaries? Relations that change over time? Mergers & splits?
- Ability to track changes on a power plant site? A wind farm may be built in stages, generators may be decommissioned, upgraded, etc. The primary fuel used may change as well.
- What format will the data be published in? Will it be machine-readable? The BNetzA xls database is mostly ok in this regard, but columns like Spezifizierung "Mehrere Energieträger" und "Sonstige Energieträger" - Hauptbrennstoff are difficult to parse consistently.
- Ideally, the data would be provided in several formats: a georeferenced database table (accessible online and as a downloadable dump) and a spreadsheet. To keep costs low, the spreadsheet should be derived from the database table.
=> Relational DB (related tables for plants, operators etc.) to model structure including change history
- Plants could be linked to the power network (at least by IDs?) to prepare for a possible future power grid database (i.e. grid connection point to the medium-voltage power grid)
=> General question: Should we consider interfaces to other potential DBs?
- e.g. "be able to sum up all CO2 emissions for all facilities owned by company X and all its subsidiaries" - for this to be answered, the data structures need to provide links both between organizations and from organizations to their facilities. In other words, through this exercise we're mapping out which types of data should ideally be linked to which other types of data. This isn't just about putting data in tables, it's about whether we can link data spread across multiple tables.
- Include spatial data
- Ideally longitude and latitude; address data would be acceptable
- Anonymisation of very small (private) EAS => assignment to (unique) grid connection point / local grid?
- Why not include time-series or annual generation data?
- ...
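To make the "relational DB including change history" point more concrete, here is a minimal sketch of what such a structure could look like. All table and column names (operator, plant, unit_version, valid_from, valid_to) are our own assumptions for illustration and are not taken from the MaStR data definitions. The idea is that a change to a unit closes the old row and opens a new one, so that questions like "what was the capacity of this site on date X?" remain answerable.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not taken from the MaStR data definitions.
SCHEMA = """
CREATE TABLE operator (
    operator_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE plant (
    plant_id  INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    latitude  REAL,
    longitude REAL
);
-- One row per unit per validity period. A change closes the old row
-- (valid_to is set) and opens a new one, so history is never overwritten.
CREATE TABLE unit_version (
    unit_id      INTEGER NOT NULL,
    plant_id     INTEGER NOT NULL REFERENCES plant(plant_id),
    operator_id  INTEGER NOT NULL REFERENCES operator(operator_id),
    capacity_kw  REAL NOT NULL,
    primary_fuel TEXT NOT NULL,
    valid_from   TEXT NOT NULL,  -- ISO date, inclusive
    valid_to     TEXT,           -- ISO date, exclusive; NULL = still valid
    PRIMARY KEY (unit_id, valid_from)
);
"""


def capacity_on(conn, plant_id, date):
    """Return the total capacity (kW) of a plant on a given ISO date."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(capacity_kw), 0)
        FROM unit_version
        WHERE plant_id = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR valid_to > ?)
        """,
        (plant_id, date, date),
    ).fetchone()
    return row[0]


def build_example():
    """Create an in-memory database with made-up example data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO operator VALUES (1, 'Example Operator')")
    conn.execute("INSERT INTO plant VALUES (1, 'Example wind farm', 53.5, 8.1)")
    conn.executemany(
        "INSERT INTO unit_version VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 1, 2000.0, "wind", "2010-01-01", "2015-01-01"),  # before upgrade
            (1, 1, 1, 3000.0, "wind", "2015-01-01", None),          # after upgrade
            (2, 1, 1, 3000.0, "wind", "2012-06-01", None),          # second build stage
        ],
    )
    return conn
```

With the made-up data above, capacity_on(conn, 1, "2013-01-01") sums the first unit before its upgrade and the second build stage. A spreadsheet export could then be derived from a query on the currently valid rows (valid_to IS NULL).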
It might be useful to suggest collecting use cases for the data to help ensure that what is being gathered can help to answer real research problems.
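As one example of such a use case, the CO2 question mentioned above ("all facilities owned by company X and all its subsidiaries") could be sketched as follows. This is only an illustration under our own assumptions: the three mappings (subsidiaries, facilities_of, emissions_t) and all company and plant names are made up, and the register as currently planned would not contain emissions data itself.

```python
def total_emissions(company, subsidiaries, facilities_of, emissions_t):
    """Sum emissions over a company and all its direct and indirect subsidiaries.

    subsidiaries:  dict mapping an organization to its direct subsidiaries
    facilities_of: dict mapping an organization to the facilities it owns
    emissions_t:   dict mapping a facility to its emissions (tonnes CO2)
    """
    seen = set()       # guards against cycles in the ownership data
    stack = [company]  # organizations still to visit
    total = 0.0
    while stack:
        org = stack.pop()
        if org in seen:
            continue
        seen.add(org)
        for facility in facilities_of.get(org, []):
            total += emissions_t.get(facility, 0.0)
        stack.extend(subsidiaries.get(org, []))
    return total


# Made-up example data (hypothetical companies and plants).
subsidiaries = {"X": ["X-North", "X-South"], "X-South": ["X-South-Gas"]}
facilities_of = {
    "X": ["plant-a"],
    "X-North": ["plant-b"],
    "X-South-Gas": ["plant-c"],
    "Y": ["plant-d"],
}
emissions_t = {"plant-a": 100.0, "plant-b": 250.0, "plant-c": 50.0, "plant-d": 999.0}

print(total_emissions("X", subsidiaries, facilities_of, emissions_t))
```

The sketch only works if the register provides both kinds of links (organization to organization, organization to facility). It deliberately ignores joint ownership shares and relations that change over time; handling those would need ownership percentages and validity dates on the links.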
[1] http://opendatacommons.org/licenses/odbl/ (also used by OpenStreetMap)
Milestone 2 (deadline: 31 October 2015)
General points
- Repeat the general points of milestone 1 (support idea, open license, ...)
Specific points
Milestone 3 (deadline: 31 May 2016)
General points
Specific points