Fix architecture  #1

@saroele

With opengrid-data I want to create a user-friendly solution to get (large sets of) building monitoring data. This repo should never be a dependency of opengrid, but will probably be a dependency of opengrid-demo. I also propose to keep the limited set of demo data in opengrid for elementary testing.

We should discuss the best way to serve these large datasets. Here's a first set of desired features:

  • python api: should be as simple as
    import opengrid_data
    df = opengrid_data.load('electricity_households_all')
    
  • we have a frozen and an evolving dataset: the first one never changes (useful for tests, demos, etc.), the second one can be extended as more data becomes available over time
  • it should be easy to get an overview of available data
  • missing/updated data is fetched automatically
  • the data is cached on the local hard drive
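
To make the listing, caching and automatic fetching features more concrete, here is a minimal sketch, not a proposed implementation. The names `CATALOGUE`, `available`, `cached_path` and the `fetch` callable are all hypothetical; `fetch` stands in for whatever download mechanism we end up choosing, so the sketch itself needs no network access.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical catalogue mapping dataset names to file names (illustrative only).
CATALOGUE: Dict[str, str] = {
    "electricity_households_all": "electricity_households_all.csv",
}


def available() -> List[str]:
    """Return the names of all datasets in the catalogue, sorted."""
    return sorted(CATALOGUE)


def cached_path(name: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return the local path of a dataset, fetching it only if it is not cached yet.

    `fetch` is an assumed callable that takes a file name and returns its bytes.
    """
    if name not in CATALOGUE:
        raise KeyError(f"unknown dataset {name!r}; available: {available()}")
    target = Path(cache_dir) / CATALOGUE[name]
    if not target.exists():
        # Cache miss: create the cache directory and store the fetched bytes.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(CATALOGUE[name]))
    return target
```

A `load` function could then be a thin wrapper that passes the returned path to `pandas.read_csv`, assuming the files are stored as CSV.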

I currently see two solutions, there may be others:

  1. we simply put all data in a python package and host it on PyPI. Similar to the datasets.py module in opengrid, we would have some code to list and load dataframes.
  2. we host the datafiles on a public file hosting solution and write code to sync these files with the local hard drive and check for updates.
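
For the second solution, the "check for updates" step could be sketched as follows. This assumes the hosting side publishes a manifest mapping each file name to its SHA-256 checksum; the manifest format and the function names `sha256_of` and `files_to_fetch` are hypothetical and only meant to feed the discussion.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_fetch(manifest: Dict[str, str], cache_dir: Path) -> List[str]:
    """Return the file names that are missing locally or differ from the manifest.

    `manifest` is an assumed mapping of file name -> expected SHA-256 hex digest.
    """
    stale = []
    for filename, expected in sorted(manifest.items()):
        local = Path(cache_dir) / filename
        if not local.exists() or sha256_of(local) != expected:
            stale.append(filename)
    return stale
```

Only the files returned by `files_to_fetch` would need to be downloaded, so an unchanged frozen dataset costs no bandwidth after the first sync.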

There's also git-lfs (https://git-lfs.github.com/), but that would require git and the additional git-lfs package to be installed, which is not currently required for opengrid users who just want to do pip installs.

Feel free to add requirements and solutions; we'll put this topic on the agenda for our next dev meeting.
