diff --git a/tutorials/Identification_python model of an opaque wall.ipynb b/tutorials/Identification_python model of an opaque wall.ipynb new file mode 100644 index 0000000..ed5af01 --- /dev/null +++ b/tutorials/Identification_python model of an opaque wall.ipynb @@ -0,0 +1,1419 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "fe658bc78054a398", + "metadata": {}, + "source": [ + "This tutorial can be found and ran in the GITHUB libray corrAI: https://github.com/BuildingEnergySimulationTools/corrai" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "---\n", + "# Characterisation method tutorial\n", + "\n", + "The aim of this tutorial is to provide tools for identification and optimisation of parameters of a model, using **CorrAI** modules. It contains three parts:\n", + "\n", + "1. **Model definition**: short definition of a mathematical model of an opaque wall, using a resistance/capacity approach.\n", + "\n", + "2. **Measurement import**: Loading of measurements performed on a test benchto serve as reference data for comparison with the model.\n", + "\n", + "3. **Model identification**: Identificaiton of the parameters that need to be adjusted to improve the model's accuracy. This involves fitting the model to the data and refining its parameters. A sensitivity analysis is performed in tutorial \"Sensitivity analysis of an opaque wall model\" to rank the parameters by order of influence on the model error.\n", + "---" + ], + "id": "6fdc965b87565807" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "# 1. Model definition", + "id": "4b04987b85f74388" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "Measurement of temperature evolution in layers of an opaque wall installed on a real-scale test bench were performed. The aim is to identify the thermal conductivity of the wall, based on a simplified model.\n", + "\n", + "For more detailed on the model description and measurement, see description in tutorial \"Sensitivity analysis of an opaque wall model\".\n", + "\n", + "
\n", + "
\n", + " Figure : RC model\n", + "
\n" + ], + "id": "1c0a276f7b0d2c67" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "As a reminder, it should follow the `Model` interface define in corrai, which guarantees that it can be used in calibration, sensitivity analysis, and optimization workflows.\n", + "\n", + "A valid simulator must implement the following concepts:\n", + "\n", + "- **`simulate(property_dict, simulation_options, simulation_kwargs)`**\n", + " Main entry point. Runs the simulation and returns results as a `pandas.DataFrame` indexed by time.\n", + "\n", + "- **`property_dict`**\n", + " Dictionary of model parameters `{property_name: value}`.\n", + " These overwrite default values to perform calibration, parameter sweeps, or optimization.\n", + "Here, the wall is described by default physical properties, which can be overridden at runtime :" + ], + "id": "74c93be386487cda" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "from corrai.base.model import Model\n", + "\n", + "class OpaqueWallSimple(Model):\n", + " def simulate(\n", + " self,\n", + " property_dict: dict,\n", + " simulation_options: dict,\n", + " simulation_kwargs: dict | None = None,\n", + " ) -> pd.DataFrame:\n", + "\n", + " default_parameters = {\n", + " \"R_ext\": 0.005,\n", + " \"R_int\": 0.01,\n", + " \"R_concrete\": 0.10,\n", + " \"R_ins\": 0.32,\n", + " \"C_concrete\": 2.95e6,\n", + " \"C_ins\": 3.64e4,\n", + " \"alpha\": 0.5,\n", + " \"S_wall\": 7,\n", + " \"epsilon\": 0.4,\n", + " \"fview\": 0.5,\n", + " }\n", + " parameters = {**default_parameters, **property_dict}\n", + "\n", + " R_ext = parameters[\"R_ext\"]\n", + " R_int = parameters[\"R_int\"]\n", + " R_concrete = parameters[\"R_concrete\"]\n", + " R_ins = parameters[\"R_ins\"]\n", + " C_concrete = parameters[\"C_concrete\"]\n", + " C_ins = parameters[\"C_ins\"]\n", + " alpha = parameters[\"alpha\"]\n", + " S_wall = parameters[\"S_wall\"]\n", + " epsilon = parameters[\"epsilon\"]\n", + " fview = parameters[\"fview\"]\n", + "\n", + " sigma = 5.67e-8 # W/m^2/K^4\n", + "\n", + " df = simulation_options[\"dataframe\"]\n", + " time = df[\"time_sec\"].values\n", + " T_ext = df[\"T_ext\"].values\n", + " T_int = df[\"T_int\"].values\n", + " Q_rad = df[\"Pyr\"].values\n", + "\n", + " startTime = simulation_options.get(\"startTime\", time[0])\n", + " stopTime = simulation_options.get(\"stopTime\", time[-1])\n", + "\n", + " mask = (time >= startTime) & (time <= stopTime)\n", + " time = time[mask]\n", + " T_ext = T_ext[mask]\n", + " T_int = T_int[mask]\n", + " Q_rad = Q_rad[mask]\n", + "\n", + " # init\n", + " T_se = np.zeros(len(time))\n", + " T_concrete = np.zeros(len(time))\n", + " T_ins = np.zeros(len(time))\n", + " T_interface = np.zeros(len(time))\n", + " T_si = np.zeros(len(time))\n", + " T_sky = np.zeros(len(time))\n", + "\n", + " T_se[0] = T_ext[0]\n", + " T_concrete[0] = 299 - 273.15\n", + " T_ins[0] = T_int[0]\n", + " T_interface[0] = (T_ins[0] + T_concrete[0]) / 2\n", + " T_si[0] = T_int[0]\n", + " T_sky[0] = T_int[0]\n", + "\n", + " for t in range(1, len(time)):\n", + " dt = time[t] - time[t - 1]\n", + "\n", + " T_sky[t] = 0.0552 * (T_ext[t] ** 1.5)\n", + "\n", + " Q_rad_sky = epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_sky[t] ** 4) * S_wall\n", + " Q_rad_amb = epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_ext[t - 1] ** 4) * S_wall\n", + " Q_rad_dir = Q_rad[t - 1] * alpha * S_wall\n", + "\n", + " T_se[t] = (\n", + " T_ext[t - 1] / R_ext\n", + " + T_ins[t - 1] / (R_ins / 2)\n", + " + Q_rad_dir - Q_rad_sky - Q_rad_amb\n", + " ) / (1 / R_ext + 1 / (R_ins / 2))\n", + "\n", + " T_interface[t] = (\n", + " T_ins[t - 1] / (R_ins / 2) + T_concrete[t - 1] / (R_concrete / 2)\n", + " ) / (1 / (R_concrete / 2) + 1 / (R_ins / 2))\n", + "\n", + " T_si[t] = (\n", + " T_int[t - 1] / R_int + T_concrete[t - 1] / (R_concrete / 2)\n", + " ) / (1 / R_int + 1 / (R_concrete / 2))\n", + "\n", + " T_ins[t] = T_ins[t - 1] + dt / C_ins * (\n", + " (T_se[t] - T_ins[t - 1]) / (R_ins / 2)\n", + " + (T_interface[t] - T_ins[t - 1]) / (R_ins / 2)\n", + " )\n", + "\n", + " T_concrete[t] = T_concrete[t - 1] + dt / C_concrete * (\n", + " (T_interface[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", + " + (T_si[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", + " )\n", + "\n", + " # output\n", + " df_out = pd.DataFrame(\n", + " {\n", + " \"T_concrete\": T_concrete,\n", + " \"T_interface\": T_interface,\n", + " \"T_ins\": T_ins,\n", + " },\n", + " index=df.index[mask],\n", + " )\n", + " self.simulation_options = simulation_options\n", + " return df_out" + ], + "id": "e900f998f4ea30ea", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# 2. Measurement import\n", + "Let's now load the reference cell measurement data on python that will be used as boundary conditions. Note that although the data loaded here should have be cleaned beforhand, it is always important to check it before using it for analyses." + ], + "id": "5eb5ae2297f037b2" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Loading the dataframe", + "id": "b39ed86d8b604a9d" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "import pandas as pd\n", + "import numpy as np\n", + "import os\n", + "import plotly.express as px\n", + "from pathlib import Path\n", + "pd.options.mode.chained_assignment = None # default='warn'\n", + "\n", + "TUTORIAL_DIR = Path(os.getcwd()).as_posix()" + ], + "id": "1a5a801c534c3d2f", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "reference_df = pd.read_csv(\n", + " Path(TUTORIAL_DIR) / \"resources/tuto_data_SA.csv\",\n", + " index_col=0,\n", + " sep=\";\",\n", + " decimal=\".\",\n", + " parse_dates=True\n", + ")" + ], + "id": "e5b967f57e7c698f", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "code", + "source": "reference_df.head()", + "id": "5f6d357ab16d1421", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "reference_df.loc[:,\"T_ins\"] = reference_df['T_ins_2'] # middle of insulation temperature\n", + "reference_df.loc[:,\"T_int\"] =reference_df[['Tint_1', 'Tint_2']].mean(axis=1) # indoor temperature\n", + "reference_df.loc[:,\"T_interface\"] = reference_df[['T_interface_1', 'T_interface_2']].mean(axis=1) # interface temperature, between insulation and concrete panels" + ], + "id": "2eb47f766e09efa0", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "## Reference simulation", + "id": "4aff4c7d1aafbe37" + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "Let's run our first simulation with default parameters, on a period with both sunny and cloudy days.\n", + "To create a \"time_sec\" column used by the model, we can use the following function:" + ], + "id": "a1fcbdb4fc0494c0" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "import datetime as dt\n", + "\n", + "def datetime_to_seconds(index_datetime):\n", + " time_start = dt.datetime(index_datetime[0].year, 1, 1, tzinfo=dt.timezone.utc)\n", + " new_index = index_datetime.to_frame().diff().squeeze()\n", + " new_index.iloc[0] = dt.timedelta(\n", + " seconds=index_datetime[0].timestamp() - time_start.timestamp()\n", + " )\n", + " sec_dt = [elmt.total_seconds() for elmt in new_index]\n", + " return list(pd.Series(sec_dt).cumsum())" + ], + "id": "dad5b30f6e0669e4", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "feat_train = reference_df.loc[\"2024-09-04 00:00\":\"2024-09-07 00:00\"]\n", + "feat_train[\"time_sec\"] = datetime_to_seconds(feat_train.index)\n", + "\n", + "sim_opt ={\n", + " \"dataframe\":feat_train,\n", + " \"startTime\":feat_train[\"time_sec\"].iloc[0],\n", + " \"endTime\": feat_train[\"time_sec\"].iloc[-1],\n", + "}" + ], + "id": "a0d55c030a0592a0", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "model = OpaqueWallSimple()\n", + "init_res = model.simulate(\n", + " property_dict ={},\n", + " simulation_options=sim_opt\n", + ")" + ], + "id": "4481da307308deb7", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Let's compare results of simulation to measurement:", + "id": "25980a9648429cd7" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "init_res_renamed = init_res.copy()\n", + "init_res_renamed.index = init_res_renamed.index.tz_localize(None)\n", + "init_res_renamed = init_res_renamed.rename(columns={\n", + " \"T_concrete\": \"T_concrete_PYTHON\",\n", + " \"T_interface\": \"T_interface_PYTHON\",\n", + " \"T_ins\": \"T_insulation_PYTHON\",\n", + "})\n", + "measure_comp = pd.concat([\n", + " feat_train[[\"T_interface\", \"T_ins\"]],\n", + " init_res_renamed[[\"T_interface_PYTHON\", \"T_insulation_PYTHON\" ]]], axis = 1)\n", + "\n", + "color_map = {\n", + " \"T_interface\": \"darkblue\",\n", + " \"T_interface_PYTHON\": \"blue\",\n", + " \"T_ins\": \"darkgreen\",\n", + " \"T_insulation_PYTHON\": \"green\"\n", + "}\n", + "\n", + "import plotly.graph_objects as go\n", + "from plotly.subplots import make_subplots\n", + "\n", + "fig = make_subplots(rows=1, cols=2, subplot_titles=(\"Interface layer\", \"Insulation layer\"))\n", + "\n", + "color_map = {\n", + " \"T_interface\": \"darkblue\",\n", + " \"T_interface_PYTHON\": \"blue\",\n", + " \"T_ins\": \"darkgreen\",\n", + " \"T_insulation_PYTHON\": \"green\"\n", + "}\n", + "for col in [\"T_interface\", \"T_interface_PYTHON\"]:\n", + " fig.add_trace(\n", + " go.Scatter(x=measure_comp.index, y=measure_comp[col], mode=\"lines\",\n", + " name=col, line=dict(color=color_map[col])),\n", + " row=1, col=1\n", + " )\n", + "for col in [\"T_ins\", \"T_insulation_PYTHON\"]:\n", + " fig.add_trace(\n", + " go.Scatter(x=measure_comp.index, y=measure_comp[col], mode=\"lines\",\n", + " name=col, line=dict(color=color_map[col])),\n", + " row=1, col=2\n", + " )\n", + "fig.update_layout(\n", + " title=\"Comparaison Python vs Measurement\",\n", + " width=1000,\n", + " height=500,\n", + " legend=dict(orientation=\"h\", yanchor=\"bottom\", y=-0.2, xanchor=\"center\", x=0.5)\n", + ")\n", + "\n", + "fig.show()" + ], + "id": "2d347ef906deccdb", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "# 3. Identification\n", + "Now, we proceed to finding optimal values for these parameters by minimizing the coefficient of variation of root mean square error (cv_rmse) between one or several measured nodes, and one or several relevant outputs of our model.\n", + "\n", + "**Step-by-Step Process**\n", + "\n", + "1. **Define the model and parameters**: by defining `OpaqueWallSimple` (this we already did) and specifying the parameters to be identify. Each parameter includes a name, an interval of possible values, a type, and an initial value.\n", + "\n", + "2. **Instantiate an objective function**: We create an instance of the `ObjectiveFunction` class, providing the model, simulation options, list of parameters, and indicators. The `scipy_obj_function` method of `ObjectiveFunction` will be used as the objective function for optimization. This method calculates for instance the cv_rmse of measured vs simulated temperatures, for the given parameter values.\n", + "\n", + "3. **Perform optimization**: We use minimization functions (for instance `minimize_scalar` from `scipy.optimize`) to minimize the objective function.\n" + ], + "id": "8fa01e0b0e27f2ef" + }, + { + "cell_type": "markdown", + "id": "881eb3948326386e", + "metadata": {}, + "source": [ + "## Definition of an objective function\n", + "In parameter optimization, we aim to adjust certain model parameters to minimize the difference between simulated and observed data. The objective function is a scalar function that quantifies this difference. In this case, the CV_RMSE (Coefficient of Variation of Root Mean Square Error) is used as a measure of how well the model output matches the reference measurements.\n", + "\n", + "**How the ObjectiveFunction Class Works** : The ObjectiveFunction class simplifies the process of optimizing model parameters by providing a structured way to:\n", + "\n", + "- Run simulations: For a given set of parameters, the model is simulated over the input data.\n", + "- Calculate error metrics (e.g., CV_RMSE): The model output is compared to the reference measurements, and a scalar error metric is calculated (such as the CV_RMSE).\n", + "- Optimize: The class can then be used with optimization algorithms (like scipy.optimize or pymoo) to adjust the model parameters in order to minimize the error metric.\n", + "\n", + "\n", + "To do so, the `ObjectiveFunction` class is designed to facilitate the optimization of model parameters using `scipy.optimize` or `pymoo` optimization methods by encapsulating the logic for simulation and indicator calculation. The `ObjectiveFunction` class takes a model, simulation options, a list of parameters to be calibrated, and a list of indicators as input. It provides methods to calculate the objective function, which can be used by optimization routines to find the optimal parameters.\n", + "\n", + "### Attributes\n", + "\n", + "- **model**: The model to be calibrated.\n", + "- **simulation_options**: A dictionary containing simulation options, including input data.\n", + "- **param_list**: A list of dictionaries specifying the parameters to be calibrated.\n", + "- **indicators_config**: A dictionary where the keys are the names of the indicators corresponding to the model outputs, and the values are either:\n", + " - An aggregation function to compute the indicator (e.g., np.mean, np.sum).\n", + " - A tuple (function, reference_data) if a comparison with reference data is required (e.g., sklearn.metrics.mean_squared_error, corrai.metrics.mae).\n", + "- **scipy_obj_indicator**: The name of the indicator used as the objective function for optimization with scipy.optimize. By default, it is the first key in indicators_config." + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Definition of parameters\n", + "Each decision variable of the optimization problem must be described using a `Parameter` object.\n", + "A parameter specifies:\n", + "* `name` — The variable name (string, must be unique within the problem).\n", + "* `ptype` — Variable type, one of:\n", + " * `\"Real\" `— Continuous real number\n", + " * `\"Integer\"` — Discrete integer\n", + " * `\"Binary\"` — Boolean, domain {False, True} (set automatically if no domain is given)\n", + " *` \"Choice\"` — Categorical variable with a fixed set of discrete options\n", + "* `Domain definition` — Choose exactly one of:\n", + " * ` interval=(lo, hi) `— Lower and upper bounds (for \"Real\" and \"Integer\", optional for \"Binary\" if you want (0,1))\n", + " * ` values=(v1, v2, …)` — Explicit list/tuple of allowed values (for \"Choice\", and optionally for \"Integer\", \"Real\", or \"Binary\")\n", + "*` Optional fields`:\n", + " * `init_value` — Initial value (or tuple/list of initial values for batch runs); must be within the defined domain\n", + " * `relabs` — `\"Absolute\"` or `\"Relative\"`\n", + " * `model_property` — String or tuple specifying the corresponding property in the simulation/model\n", + " * `min_max_interval` — Optional extra bounds for use in analysis/validation" + ], + "id": "bd3e20dc28aad0" + }, + { + "cell_type": "markdown", + "id": "323fe63a31424dcc", + "metadata": {}, + "source": "## Application: One-dimensional optimization problem" + }, + { + "cell_type": "markdown", + "id": "984adbbfb7c96a81", + "metadata": {}, + "source": [ + "First, we can try scalar functions optimization from scipy (different methods: brent, boulded, golden ... see documentation on scipy website).\n", + "For Scypi, each objective function is minimized for optimization:\n", + "- Here we chose as indicators the temperature calculated and measured within the wall insulation. Note this could be another node (a heat flux densitiy, another temperature node).\n", + "- The identified parameter here is $alpha$.\n", + "\n", + "Let's define the parameter as Real, between 0 and 1, with a default value of 0.5. By default, it will be considered as an absolute value." + ] + }, + { + "cell_type": "code", + "id": "7bb570cb25cc3c66", + "metadata": {}, + "source": [ + "from corrai.base.parameter import Parameter\n", + "\n", + "calibration_params = Parameter(\n", + " name='alpha',\n", + " interval=(0,1),\n", + " ptype=\"Real\",\n", + " model_property='alpha'\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "ff18639556ad2235", + "metadata": {}, + "source": [ + "We can now instanciate an objective function, using `ObjectiveFunction`." + ] + }, + { + "cell_type": "code", + "id": "9a24e457cc8ad243", + "metadata": {}, + "source": [ + "from corrai.optimize import ObjectiveFunction\n", + "from corrai.base.metrics import cv_rmse\n", + "\n", + "obj_func = ObjectiveFunction(\n", + " model=OpaqueWallSimple(),\n", + " simulation_options=sim_opt,\n", + " parameters=[calibration_params],\n", + " indicators_config={\"T_ins\": (cv_rmse, feat_train[\"T_ins\"])},\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Let's see how ObjectiveFunction works by calculating the objective function value for a given value of alpha:", + "id": "5dc30a4f9746a12e" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "obj_func.function(\n", + " parameter_value_pairs = [(calibration_params, 0.5)] # Example value for alpha\n", + ")" + ], + "id": "1e6c800f938e74f0", + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "33c9a83c1b01e35", + "metadata": {}, + "source": [ + "The `minimize_scalar` function in `scipy.optimize` is used for scalar function minimization, specifically for one-dimensional optimization problems. This function finds the minimum value of a scalar function over a specified interval. The main methods are `Brent`, `bounded`, and `golden`.\n", + "\n", + "The function returns an optimization result object that contains information about the optimization process and the final solution.\n", + "\n", + "- **Brent** :The Brent method uses Brent’s algorithm, which combines a parabolic interpolation with the golden section search. This method does not require the interval bounds.\n", + "- **Golden** : Employs the golden section search method, which reduces the interval of uncertainty using the golden ratio. Simple and reliable for unimodal functions, but may be slower than Brent's method.\n", + "- **Bounded** : Restricts the search to the specified bounds using a combination of golden section search and parabolic interpolation.\n", + "Advantages: Ensures that the solution remains within the given bounds, making it ideal for constrained problems." + ] + }, + { + "cell_type": "code", + "id": "78db962a6495b017", + "metadata": {}, + "source": [ + "from scipy.optimize import minimize_scalar" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "d5092d877401ac2", + "metadata": {}, + "source": [ + "import warnings\n", + "warnings.filterwarnings('ignore', category=RuntimeWarning)\n", + "\n", + "result = minimize_scalar(\n", + " obj_func.scipy_obj_function, \n", + " bounds=obj_func.bounds[0],\n", + " method=\"Bounded\"\n", + ")\n", + "\n", + "result" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "4cbf2fc9729e00c3", + "metadata": {}, + "source": "A solution is found with a value of 0.12 for alpha and 4.36 for the CV_RMSE. Let's check if the parameter value is close to the boundaries." + }, + { + "cell_type": "code", + "id": "26f4a8819c02136f", + "metadata": {}, + "source": [ + "obj_func.bounds" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "48dc0d58db1cfc5a", + "metadata": {}, + "source": [ + "It is indeed not far from 0.1 but not a the limit. Let's now run the simulation using this parameter value and compare with the initial simulation." + ] + }, + { + "cell_type": "code", + "id": "2ae801d71552398c", + "metadata": {}, + "source": [ + "parameter_names = [\"alpha\"]\n", + "parameter_dict1 = {param_name: result.x for i, param_name in enumerate(parameter_names)}\n", + "\n", + "result_optim = model.simulate(\n", + " property_dict=parameter_dict1,\n", + " simulation_options=sim_opt\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "77aa000a449d0e", + "metadata": {}, + "source": [ + "import plotly.graph_objects as go\n", + "\n", + "fig = go.Figure()\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=feat_train.index,\n", + " y=feat_train[\"T_ins\"],\n", + " fill=None,\n", + " mode='lines',\n", + " line_color='green',\n", + " name=\"T_insulation - Measurement\"\n", + "))\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=init_res.index,\n", + " y=init_res[\"T_ins\"],\n", + " fill=None,\n", + " mode='lines',\n", + " line_color='orange',\n", + " name=\"T_insulation - Initial results\"\n", + "))\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=result_optim.index,\n", + " y=result_optim[\"T_ins\"],\n", + " fill=None,\n", + " mode='lines',\n", + " line_color='brown',\n", + " name=\"T_insulation - Optimization results\"\n", + "))\n", + "\n", + "\n", + "fig.update_layout(\n", + " title='Optimization vs. Measurement ',\n", + " xaxis_title='Date',\n", + " yaxis_title='Temperature [K]')\n", + "\n", + "fig.show()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "679cbfeaa19bb4c4", + "metadata": {}, + "source": [ + "Results are closer to measurements but still far off." + ] + }, + { + "cell_type": "markdown", + "id": "cdb92f3584436e59", + "metadata": {}, + "source": "## Application: multi- objectives and multi parameters optimization" + }, + { + "cell_type": "markdown", + "id": "1b52c09b40f97821", + "metadata": {}, + "source": [ + "Let's use Pymoo, integrated into the `Problem` class of `CorrAI` and multi-parameters optimization with multi_objectives.\n", + "\n", + "Note that :\n", + "- All **objectives are minimized**.\n", + "- All **constraints** must be provided in the form **g(x) ≤ 0** (inequalities).\n", + "\n", + "`Problem` aggregates all evaluator outputs into a single dictionary and then **extracts**:\n", + "- **Objectives** in the order of `objective_ids` → matrix **F**\n", + "- **Constraints** in the order of `constraint_ids` → matrix **G** (≤ 0)\n", + "\n", + "`Problem` takes as arguments:\n", + "- **`parameters`**: as defined earlier from the class `Parameter`\n", + "- **`evaluators`** (or objective functions)\n", + "- **`objective_ids`** : Names (keys) to extract from evaluator outputs to build **F** (in order).\n", + "- **`constraint_ids`** : Optional names to extract for **G** (inequalities ≤ 0). If omitted, `G` is empty.\n" + ] + }, + { + "cell_type": "markdown", + "id": "1a18680c67860070", + "metadata": {}, + "source": "### Several parameters, one objective" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "# Surface of the tested wall\n", + "S_wall = 7\n", + "\n", + "# Thickness of layer\n", + "ep_concrete = 0.200 #m\n", + "ep_ins = 0.140 #m\n", + "\n", + "# Conductivity of layer\n", + "lambda_concrete = 0.13 # (W/mK)\n", + "lambda_ins = 0.031 # W/(mK)\n", + "\n", + "# Density of layer\n", + "rho_concrete = 2400 # kg/m3\n", + "rho_ins = 32.5 # kg/m3\"\n", + "\n", + "sc_concrete = 880 # J/kg.K\n", + "sc_ins = 1000 # J/kg.K\n", + "\n", + "# solar paremetesr\n", + "alpha = 0.2 # absorption coefficient\n", + "epsilon = 0.8 # emissivity\n", + "fview = 0.5 # view factor of tested wall" + ], + "id": "cc809d95ef604660", + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "Let's define now a list of two parameters, with one as a relative value.\n", + "The specified interval defines the relative search space around the initial value.\n", + "If no initial value is provided, the model should have a `get_property_value()` method to retrieve a default value." + ], + "id": "4b4b576895a2adde" + }, + { + "cell_type": "code", + "id": "382639dbbdf0b7f5", + "metadata": {}, + "source": [ + "calibration_params = [\n", + " Parameter(\n", + " name='R_ext',\n", + " init_value= 0.04/S_wall,\n", + " interval=(0.4, 1.6), # search between 40% and 160% of initial value\n", + " relabs=\"Relative\",\n", + " ptype=\"Real\",\n", + " model_property='R_ext',\n", + " ),\n", + " Parameter(\n", + " name='alpha',\n", + " interval=(0.2, 0.8),\n", + " ptype=\"Real\",\n", + " model_property='alpha',\n", + " ),\n", + "]" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "49e611448d80337b", + "metadata": {}, + "source": "Here, we can add the interferace temperature (between insulation and concrete panels) as an indicator." + }, + { + "cell_type": "code", + "id": "ef97cc58e43c40d6", + "metadata": {}, + "source": [ + "from corrai.base.metrics import cv_rmse\n", + "\n", + "obj_func = ObjectiveFunction(\n", + " model=OpaqueWallSimple(),\n", + " simulation_options=sim_opt,\n", + " parameters=calibration_params,\n", + " indicators_config={\n", + " \"T_ins\": (cv_rmse, feat_train[\"T_ins\"]),\n", + " \"T_interface\": (cv_rmse, feat_train[\"T_interface\"]),\n", + " },\n", + " scipy_obj_indicator=\"T_ins\",\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "93799f08a89ee390", + "metadata": {}, + "source": [ + "Now let's instanciate the problem using the classe `Problem`:" + ] + }, + { + "cell_type": "code", + "id": "40c58ef20557e69a", + "metadata": {}, + "source": [ + "from corrai.optimize import Problem\n", + "\n", + "problem = Problem(\n", + " parameters=calibration_params,\n", + " evaluators=[obj_func],\n", + " objective_ids=[\"T_ins\", \"T_interface\"],\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "5c560de6fcd540c7", + "metadata": {}, + "source": [ + "For a two objective problem, we choose here **NSGA2**, as a well-known multi-objective optimization algorithm based on non-dominated sorting and crowding.\n", + "List of algorithms here https://pymoo.org/algorithms/list.html#nb-algorithms-list.\n", + "\n", + "If the verbose=True, some printouts during the algorithm’s execution are provided. This can very from algorithm to algorithm. Here, we execute NSGA2 on a problem where pymoo has no knowledge about the optimum. Each line represents one iteration. The first two columns are the current generation counter and the number of evaluations so far. For constrained problems, the next two columns show the minimum constraint violation (cv (min)) and the average constraint violation (cv (avg)) in the current population. This is followed by the number of non-dominated solutions (n_nds) and two more metrics which represents the movement in the objective space." + ] + }, + { + "cell_type": "code", + "id": "36c4a5e29c3580da", + "metadata": {}, + "source": [ + "from pymoo.algorithms.moo.nsga2 import NSGA2\n", + "from pymoo.optimize import minimize\n", + "from pymoo.operators.crossover.pntx import TwoPointCrossover\n", + "from pymoo.termination import get_termination\n", + "from pymoo.algorithms.moo.nsga2 import NSGA2\n", + "from pymoo.operators.crossover.sbx import SBX\n", + "from pymoo.operators.mutation.pm import PM\n", + "from pymoo.operators.sampling.rnd import FloatRandomSampling" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "14d5c56706972fb0", + "metadata": {}, + "source": [ + "algorithm = NSGA2(\n", + " pop_size=50,\n", + " #n_offsprings=10,\n", + " #sampling=FloatRandomSampling(),\n", + " crossover=SBX(prob=0.9, eta=15),\n", + " mutation=PM(eta=20),\n", + " eliminate_duplicates=True\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "e7742f7b075e7bc4", + "metadata": {}, + "source": [ + "Here, we will run 15 generation with a population of 50." + ] + }, + { + "cell_type": "code", + "id": "7023f1282a7b0398", + "metadata": {}, + "source": [ + "termination = get_termination(\"n_gen\", 15)\n", + "\n", + "res = minimize(problem,\n", + " algorithm,\n", + " termination,\n", + " seed=42,\n", + " verbose=True)\n", + "\n", + "print(\"Best solution found: \\nX = %s\\nF = %s\" % (res.X, res.F))" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "5c3a4c4acc8b32f2", + "metadata": {}, + "source": "We can now visualize the objectives functions results using the `Scatter` function from pymoo." + }, + { + "cell_type": "code", + "id": "9d9ff70f4e4e5873", + "metadata": {}, + "source": [ + "from pymoo.visualization.scatter import Scatter\n", + "Scatter().add(res.F).show()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "3310ee8c9851ee39", + "metadata": {}, + "source": [ + "For a bi-objective problem,and helping us chosing the best set of parameters value, we can use the decomposition method called Augmented Scalarization Function (ASF), a well-known metric in the multi-objective optimization literature.\n", + "Let us assume the are equally important by setting the weights to 0.5 and 0.5 and setting these" + ] + }, + { + "cell_type": "code", + "id": "430753101a36d277", + "metadata": {}, + "source": [ + "from pymoo.decomposition.asf import ASF\n", + "F = res.F\n", + "approx_ideal = F.min(axis=0)\n", + "approx_nadir = F.max(axis=0)\n", + "nF = (F - approx_ideal) / (approx_nadir - approx_ideal)\n", + "\n", + "fl = nF.min(axis=0)\n", + "fu = nF.max(axis=0)\n", + "weights = np.array([0.5, 0.5])\n", + "decomp = ASF()\n", + "\n", + "i = decomp.do(nF, 1/weights).argmin()\n", + "\n", + "parameter_names = [param.name for param in calibration_params]\n", + "parameter_dict = {param_name: res.X[i][j] for j, param_name in enumerate(parameter_names)}\n", + "\n", + "print(\n", + " \"Best regarding ASF: Point \\ni = %s\\nF = %s\" % (i, F[i]),\n", + " parameter_dict\n", + ")\n" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "6af5abd476a2d1ce", + "metadata": {}, + "source": [ + "The estimated parameters are consistent with our expectations but at the limits of our bounds. This shows the model is not either adapted to physical behaviour of our system, or boundaries are not large enough. Either way, we can compare the profile of measured indoor temperature with the output that the model predicts given the identified optimal parameters.\n", + "\n", + "Since we used relative parameters, we need to create a list of parameter-value pairs to run the simulation and use the method `simulate_parameter` from Model(ABC), which will convert the given percentage found by the algorithm into an absolute value for R_ext." + ] + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "parameter_value_pairs = [\n", + " (param, res.X[i][j])\n", + " for j, param in enumerate(calibration_params)\n", + "]" + ], + "id": "df6d385e415776b3", + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "9a396bd186cd8dce", + "metadata": {}, + "source": [ + "result_optim = model.simulate_parameter(\n", + " parameter_value_pairs=parameter_value_pairs,\n", + " simulation_options=sim_opt,\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "77bd39626a63b358", + "metadata": {}, + "source": [ + "import plotly.graph_objects as go\n", + "from plotly.subplots import make_subplots\n", + "\n", + "fig = make_subplots(rows=1, cols=2, subplot_titles=(\"T_insulation\", \"T_interface\"))\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=init_res.index,\n", + " y=feat_train[\"T_ins\"],\n", + " mode=\"lines\",\n", + " line_color=\"green\",\n", + " name=\"Measurement\"\n", + "), row=1, col=1)\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=init_res.index,\n", + " y=init_res[\"T_ins\"],\n", + " mode=\"lines\",\n", + " line_color=\"orange\",\n", + " name=\"Initial results\"\n", + "), row=1, col=1)\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=feat_train.index,\n", + " y=result_optim[\"T_ins\"],\n", + " mode=\"lines\",\n", + " line_color=\"brown\",\n", + " name=\"Optimization results\"\n", + "), row=1, col=1)\n", + "\n", + "# --- Droite : T_interface ---\n", + "fig.add_trace(go.Scatter(\n", + " x=init_res.index,\n", + " y=feat_train[\"T_interface\"],\n", + " mode=\"lines\",\n", + " line_color=\"green\",\n", + " name=\"Measurement\"\n", + "), row=1, col=2)\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=init_res.index,\n", + " y=init_res[\"T_interface\"],\n", + " mode=\"lines\",\n", + " line_color=\"orange\",\n", + " name=\"Initial results\"\n", + "), row=1, col=2)\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=feat_train.index,\n", + " y=result_optim[\"T_interface\"],\n", + " mode=\"lines\",\n", + " line_color=\"brown\",\n", + " name=\"Optimization results\"\n", + "), row=1, col=2)\n", + "\n", + "# Layout\n", + "fig.update_layout(\n", + " title=\"Optimization vs. Measurement\",\n", + " xaxis_title=\"Date\",\n", + " yaxis_title=\"Temperature [°C]\",\n", + " showlegend=True\n", + ")\n", + "\n", + "fig.show()\n" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "2d5b8b9c95df81cc", + "metadata": {}, + "source": [ + "#### Validation set\n", + "An important step is to check identified parameters on validation set. Let's try on an new period using the last identified values." + ] + }, + { + "cell_type": "code", + "id": "9aa4f7a4cc5c7ddc", + "metadata": {}, + "source": [ + "validation_set = reference_df.loc[\"2024-09-08 00:00\":\"2024-09-14 00:00\"]\n", + "validation_set.loc[:,\"time_sec\"] = datetime_to_seconds(validation_set.index)\n", + "\n", + "validation_set = validation_set.resample('5min').mean()\n", + "second_index = datetime_to_seconds(validation_set.index)\n", + "\n", + "new_sim_opt={\n", + " \"dataframe\":validation_set,\n", + " \"startTime\": second_index[0],\n", + " \"endTime\": second_index[-1], \n", + "}" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "cad5e270e1e6ab01", + "metadata": {}, + "source": [ + "validation_results = model.simulate_parameter(\n", + " parameter_value_pairs=parameter_value_pairs,\n", + " simulation_options=new_sim_opt,\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "c894241ee9e19da6", + "metadata": {}, + "source": [ + "import plotly.graph_objects as go\n", + "\n", + "fig = go.Figure()\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=validation_results.index,\n", + " y=validation_results[\"T_ins\"] ,\n", + " fill=None,\n", + " mode='lines',\n", + " line_color='orange',\n", + " name=\"T_insulation - Validation result\"\n", + "))\n", + "\n", + "fig.add_trace(go.Scatter(\n", + " x=validation_results.index,\n", + " y=validation_set[\"T_ins\"],\n", + " fill=None,\n", + " mode='lines',\n", + " line_color='green',\n", + " name=\"T_insulation - Measurement\"\n", + "))\n", + "\n", + "fig.update_layout(\n", + " title='Simulation vs. Measurement ',\n", + " xaxis_title='Date',\n", + " yaxis_title='Temperature [K]')\n", + "\n", + "fig.show()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "84673e7340839047", + "metadata": {}, + "source": [ + "cv_rmse(\n", + " validation_results[\"T_ins\"],\n", + " validation_set[\"T_ins\"]\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "As we can see, the results are not very good. We could try to identifiy more parameters, or change the model to better represent the physical phenomena.", + "id": "8c604268cd14f32b" + }, + { + "cell_type": "markdown", + "id": "f85f87661de5c8c3", + "metadata": {}, + "source": [ + "## 4.3.3 Mixed parameters" + ] + }, + { + "cell_type": "markdown", + "id": "77844027c51e6f9b", + "metadata": {}, + "source": [ + "In some calibrations, some decision variables are discrete rather than purely continuous—e.g., on/off features (booleans), choices among a few valid modes, or integers for counts. Standard continuous optimizers used earlier assume a smooth search space and therefore won’t handle these variables correctly.\n", + "\n", + "To reflect realistic design decisions and configuration toggles, we now introduce a model with a boolean switch (e.g., enabling/disabling a physical effect) and restrict alpha to a small set of admissible values. This mixed-variable setup better captures practical constraints and uncertainty (e.g., unknown presence of a phenomenon, or controller setpoint options), and requires a mixed-variable optimization strategy." + ] + }, + { + "cell_type": "code", + "id": "f38515d84c5b71cd", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "from corrai.base.model import Model\n", + "\n", + "class OpaqueWallBool(Model):\n", + " def simulate(\n", + " self,\n", + " property_dict: dict,\n", + " simulation_options: dict,\n", + " simulation_kwargs: dict | None = None,\n", + " ) -> pd.DataFrame:\n", + "\n", + " default_parameters = {\n", + " \"R_ext\": 0.005,\n", + " \"R_int\": 0.01,\n", + " \"R_concrete\": 0.10,\n", + " \"R_ins\": 0.32,\n", + " \"C_concrete\": 2.95e6,\n", + " \"C_ins\": 3.64e4,\n", + " \"alpha\": 0.2,\n", + " \"S_wall\": 7,\n", + " \"epsilon\": 0.4,\n", + " \"fview\": 0.5,\n", + " \"has_LW_radiation\": True\n", + " }\n", + " parameters = {**default_parameters, **property_dict}\n", + "\n", + " R_ext = parameters[\"R_ext\"]\n", + " R_int = parameters[\"R_int\"]\n", + " R_concrete = parameters[\"R_concrete\"]\n", + " R_ins = parameters[\"R_ins\"]\n", + " C_concrete = parameters[\"C_concrete\"]\n", + " C_ins = parameters[\"C_ins\"]\n", + " alpha = parameters[\"alpha\"]\n", + " S_wall = parameters[\"S_wall\"]\n", + " epsilon = parameters[\"epsilon\"]\n", + " fview = parameters[\"fview\"]\n", + " has_LW_radiation = parameters[\"has_LW_radiation\"]\n", + "\n", + " sigma = 5.67e-8 # W/m^2/K^4\n", + "\n", + " df = simulation_options[\"dataframe\"]\n", + " time = df[\"time_sec\"].values\n", + " T_ext = df[\"T_ext\"].values\n", + " T_int = df[\"T_int\"].values\n", + " Q_rad = df[\"Pyr\"].values\n", + "\n", + " startTime = simulation_options.get(\"startTime\", time[0])\n", + " stopTime = simulation_options.get(\"stopTime\", time[-1])\n", + "\n", + " mask = (time >= startTime) & (time <= stopTime)\n", + " time = time[mask]\n", + " T_ext = T_ext[mask]\n", + " T_int = T_int[mask]\n", + " Q_rad = Q_rad[mask]\n", + "\n", + " # init\n", + " T_se = np.zeros(len(time))\n", + " T_concrete = np.zeros(len(time))\n", + " T_ins = np.zeros(len(time))\n", + " T_interface = np.zeros(len(time))\n", + " T_si = np.zeros(len(time))\n", + " T_sky = np.zeros(len(time))\n", + "\n", + " T_se[0] = T_ext[0]\n", + " T_concrete[0] = 299\n", + " T_ins[0] = T_int[0]\n", + " T_interface[0] = (T_ins[0] + T_concrete[0]) / 2\n", + " T_si[0] = T_int[0]\n", + " T_sky[0] = T_int[0]\n", + "\n", + " for t in range(1, len(time)):\n", + " dt = time[t] - time[t - 1]\n", + "\n", + "\n", + " ## boolean to turn off long wave radiation exchange with environnement and sky\n", + " T_sky[t] = 0.0552 * (T_ext[t] ** 1.5) if has_LW_radiation else 0.0\n", + "\n", + " Q_rad_sky = (\n", + " epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_sky[t] ** 4) * S_wall\n", + " if has_LW_radiation else 0.0\n", + " )\n", + " Q_rad_amb = (\n", + " epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_ext[t - 1] ** 4) * S_wall\n", + " if has_LW_radiation else 0.0\n", + " )\n", + " Q_rad_dir = Q_rad[t - 1] * alpha * S_wall\n", + "\n", + " T_se[t] = (\n", + " T_ext[t - 1] / R_ext\n", + " + T_ins[t - 1] / (R_ins / 2)\n", + " + Q_rad_dir - Q_rad_sky - Q_rad_amb\n", + " ) / (1 / R_ext + 1 / (R_ins / 2))\n", + "\n", + " T_interface[t] = (\n", + " T_ins[t - 1] / (R_ins / 2) + T_concrete[t - 1] / (R_concrete / 2)\n", + " ) / (1 / (R_concrete / 2) + 1 / (R_ins / 2))\n", + "\n", + " T_si[t] = (\n", + " T_int[t - 1] / R_int + T_concrete[t - 1] / (R_concrete / 2)\n", + " ) / (1 / R_int + 1 / (R_concrete / 2))\n", + "\n", + " T_ins[t] = T_ins[t - 1] + dt / C_ins * (\n", + " (T_se[t] - T_ins[t - 1]) / (R_ins / 2)\n", + " + (T_interface[t] - T_ins[t - 1]) / (R_ins / 2)\n", + " )\n", + "\n", + " T_concrete[t] = T_concrete[t - 1] + dt / C_concrete * (\n", + " (T_interface[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", + " + (T_si[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", + " )\n", + "\n", + " # output\n", + " df_out = pd.DataFrame(\n", + " {\n", + " \"T_concrete\": T_concrete,\n", + " \"T_interface\": T_interface,\n", + " \"T_ins\": T_ins,\n", + " },\n", + " index=df.index[mask],\n", + " )\n", + " self.simulation_options = simulation_options\n", + " return df_out" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "6c69db5bf099ef9", + "metadata": {}, + "source": [ + "from corrai.base.parameter import Parameter\n", + "\n", + "mixed_params = [\n", + " Parameter(name=\"alpha\", values=(0.2, 0.4, 0.5), ptype=\"Choice\", model_property=\"alpha\"),\n", + " Parameter(name=\"epsilon\", interval=(0, 1), ptype=\"Real\", model_property=\"epsilon\"),\n", + " Parameter(name=\"has_LW_radiation\", ptype=\"Binary\", model_property=\"has_LW_radiation\"),\n", + "]" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "c1003131b39bbf24", + "metadata": {}, + "source": [ + "from corrai.optimize import Problem, ObjectiveFunction\n", + "from corrai.base.metrics import cv_rmse\n", + "\n", + "obj_func = ObjectiveFunction(\n", + " model=OpaqueWallBool(),\n", + " simulation_options=sim_opt,\n", + " parameters=mixed_params,\n", + " indicators_config={\n", + " \"T_ins\": (cv_rmse, feat_train[\"T_ins\"]),\n", + " },\n", + ")\n", + "\n", + "problem = Problem(\n", + " parameters=mixed_params,\n", + " evaluators=[obj_func],\n", + " objective_ids=[\"T_ins\"],\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "f4a5b54c01f43a76", + "metadata": {}, + "source": [ + "from pymoo.termination import get_termination\n", + "from pymoo.algorithms.moo.nsga2 import RankAndCrowdingSurvival\n", + "from pymoo.core.mixed import MixedVariableGA\n", + "from pymoo.optimize import minimize\n", + "\n", + "algorithm = MixedVariableGA(pop_size=50, survival=RankAndCrowdingSurvival())\n", + "termination = get_termination(\"n_gen\", 5)\n", + "\n", + "res = minimize(problem,\n", + " algorithm,\n", + " termination,\n", + " seed=42,\n", + " verbose=True)\n", + "\n", + "print(\"Best solution found: \\nX = %s\\nF = %s\" % (res.X, res.F))" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "29b9e0c7e9436863", + "metadata": {}, + "source": [ + "parameter_value_pairs = [(p, res.X[p.name]) for p in mixed_params]\n", + "\n", + "model.simulate_parameter(\n", + " parameter_value_pairs=parameter_value_pairs,\n", + " simulation_options=sim_opt,\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "code", + "source": "", + "id": "3038c362af13a0dc", + "outputs": [], + "execution_count": null + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/tutorials/MeasureDat example_Cleaning and selection of data.ipynb b/tutorials/MeasureDat example_Cleaning and selection of data.ipynb deleted file mode 100644 index 500cdd0..0000000 --- a/tutorials/MeasureDat example_Cleaning and selection of data.ipynb +++ /dev/null @@ -1,813 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "id": "398f247c", - "metadata": {}, - "outputs": [], - "source": [ - "from pathlib import Path" - ] - }, - { - "cell_type": "markdown", - "id": "553f8dc4", - "metadata": {}, - "source": [ - "---\n", - "# Loading and processing measured data with MeasuredDats" - ] - }, - { - "cell_type": "markdown", - "id": "c5ae542b-fc43-449a-98a1-6e911cef0a28", - "metadata": {}, - "source": [ - "The goal of this tutorial is to provide a comprehensive workflow for treating measured data on a test bench using **CorrAI** MeasuredDats. \n", - "At first sight, a dataset may look fine, but missing values or incorrect variations are not always visible. The following steps are proposed to ensure data quality.\n", - "\n", - "#### 1- Identify anomalies:\n", - "Purpose: Establish boundaries for acceptable data values and rates of change.\n", - "- __upper__ and __lower values__ as boundaries: Define permissible ranges for measured values. Data points outside these ranges are flagged as anomalies.\n", - "- __upper__ and __lower \"rates__\" : Set thresholds for acceptable rates of change in measured values over time. Values exceeding these thresholds indicate anomalies.\n", - "\n", - "#### 2- Missing data interpolation\n", - "Purpose: Ensure continuity and reliability of data for analysis.\n", - "- __Interpolation Methods__: Use techniques like linear interpolation to estimate missing data points between known values.\n", - "- __Handling Edge Cases__: Correct errors at the start or end of a time series by extrapolating from the first or last known value.\n", - "\n", - "Physical models and analytical tools usually require complete data sets.\n", - "\n", - "#### 3- Selecting dataset for modelling\n", - "Purpose: Manage large datasets effectively and select relevant set of information.\n", - "\n", - "- __Resampling__: Aggregate data into larger time intervals (e.g., from 1-minute to 10-minute intervals) using resampling techniques.\n", - "- __Selection Criteria__: Choose optimal time intervals based on system requirements and modeling resolution. For example, select days with consistent weather conditions.\n", - "\n", - "By following these steps in your data treatment workflow with CorrAI, you can enhance the reliability and usability of your measured data for analysis and modeling.\n", - "---" - ] - }, - { - "cell_type": "markdown", - "id": "3b85508d", - "metadata": { - "jp-MarkdownHeadingCollapsed": true - }, - "source": [ - "## Use case\n", - "\n", - "The data used here are measurements collected from a real-scale benchmark conducted by Nobatek's BEF (Banc d'Essais Façade), which provides experimental cells for testing building façade solutions. Heat exchanges in a cell are limited to five of its faces, while the sixth face is dedicated to the tested solution. Internal temperature and hydrometry conditions can be controlled or monitored, and external conditions, such as temperatures and solar radiation, are measured.\n", - "\n", - "The experimental setup is presented in the following figures:\n", - "\n", - "| Figure 1: picture of the benchmark | Figure 2: wall layers from the inside (right) to the outside (left) |\n", - "| :---: | :---: |\n", - "| | |\n", - "\n", - "Additional details about the data:\n", - "- The measurement campaign spanned from 13/12/2017 to 13/06/2018.\n", - "- The acquisition timestep is around 1 minute.\n" - ] - }, - { - "cell_type": "markdown", - "id": "3768a5bb", - "metadata": {}, - "source": [ - "# Measured data analysis and correction" - ] - }, - { - "cell_type": "markdown", - "id": "61f9bac0", - "metadata": {}, - "source": [ - "Measured data are loaded using pandas python library." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5ba6c97b", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "21aeb02e", - "metadata": {}, - "outputs": [], - "source": [ - "raw_data = pd.read_csv(\n", - " Path(r\"resources/tuto_data.csv\"),\n", - " sep=\",\",\n", - " index_col=0,\n", - " parse_dates=True\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "3250c929", - "metadata": {}, - "source": [ - "Plotting the raw temperatures gives precious information on the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fc901bec", - "metadata": {}, - "outputs": [], - "source": [ - "raw_data['T_ext'].plot()" - ] - }, - { - "cell_type": "markdown", - "id": "ebe828d3-e1d3-4f6d-b606-0c70d51b249a", - "metadata": {}, - "source": [ - "However, missing values or incorrect variations are not always visible." - ] - }, - { - "cell_type": "markdown", - "id": "7ae77f39", - "metadata": {}, - "source": [ - "## Using Class MeasuredDat to perform operations on data\n", - "\n", - "The MeasuredDats **corrai** object is designed to specify transformations to apply to a measured dataset and to visualize their effects.\n", - "The measurements are classified in _categories_ (eg. temperature, power, control, etc.).\n", - "There are 3 kinds of transformations :\n", - "- _Category level_: specify transformations \"in parallel\" that will be applied to specified categories\n", - "- _Common transformation_: apply transformation to all categories\n", - "- _Resampling_: process data using a time rule and an aggregation method. It may be used to align data on a regular time index or reduce the size of the dataset\n" - ] - }, - { - "cell_type": "markdown", - "id": "1460e850-7193-4ee3-8feb-6b3ab6781311", - "metadata": {}, - "source": [ - "#### Presentation of MeasuredDats" - ] - }, - { - "cell_type": "markdown", - "id": "e466f8b9", - "metadata": {}, - "source": [ - "The MeasuredDats **corrai** object is designed to specify transformations to apply to a measured dataset and to visualize their effects.\n", - "The measurements are classified in _categories_ (eg. temperature, power, control, etc.).\n", - "There are 3 kinds of transformations :\n", - "- _Category level_: specify transformations \"in parallel\" that will be applied to specified categories\n", - "- _Common transformation_: apply transformation to all categories\n", - "- _Resampling_: process data using a time rule and an aggregation method. It may be used to align data on a regular time index or reduce the size of the dataset\n", - "\n", - "**** Presentation of MeasuredDats\n", - "\n", - "MeasuredDats uses **Scikit Learn** _pipelines_. The transformers are corrai.custom_transfomers objects, they inherit from scikit base class BaseEstimator and TransformerMixin. These transformer ensure that Pandas DataFrame with DateTimeIndex are used through the process\n", - "\n", - "You can list the transformers keys from corrai.measure.Transformer.\n", - "\n", - "Refer to corrai.transformers documentation to configure transformers arguments.\n", - "\n", - "The figure below describes a \"pipeline\" that apply a series of corrections to the dataset.\n", - "\n", - "\n", - "\n", - "4 successive transformations are applied: 2 category transformations, a common transformation and a resampling operation.\n", - "\n", - "The **categories** are specified using data_type_dict :\n", - "```\n", - "category_dict = {\n", - " \"Temperatures\": [\"temp_1\", \"temp_2\"],\n", - " \"Power\": [\"pow_1\"],\n", - " \"radiation\": [\"Sol_rad\"]\n", - " }\n", - "```\n", - "\n", - "The **category transformations** are specified using category_transformations :\n", - "```\n", - "category_transformations = {\n", - " \"Temperatures\":{\n", - " \"ANOMALIES\": [\n", - " [Transformer.DROP_THRESHOLD, {\"upper\": 40000, \"lower\": 0}],\n", - " [Transformer.DROP_TIME_GRADIENT, {\"upper_rate\": 5000, \"lower_rate\": 0}],\n", - " ],\n", - " },\n", - " \"Power\": {\n", - " \"ANOMALIES\": [\n", - " [Transformer.DROP_THRESHOLD, {\"upper\": 50, \"lower\": -2}],\n", - " [Transformer.DROP_TIME_GRADIENT, {\"upper_rate\": 5000, \"lower_rate\": 0}],\n", - " ],\n", - " \"PROCESS\": [\n", - " [Transformer.APPLY_EXPRESSION, {\"expression\": \"X / 1000\"}]\n", - " ],\n", - " },\n", - " \"radiation\": {}\n", - "},\n", - "```\n", - "\n", - "- The dictionary keys must match the category defined in category_dict\n", - "- For each category you can specify as much transformer as you want. Similar name must be given in each category if you want transformer to be used in the same _\"category transformation\"_ (eg. ANOMALIES)\n", - "- For each transformer a list of transformation is given. They are defined by a list with two elements [custom_transformer key, {custom transformer args}]\n", - "- If the category doesn't require any transformation, specify an empty dictionary\n", - "\n", - "The **common transformations** are specified using common_transformations:\n", - "\n", - "```\n", - "common_transformations={\n", - " \"COMMON\": [\n", - " [Transformer.INTERPOLATE, {\"method\": 'linear'}],\n", - " [Transformer.FILL_NA, {\"method\": 'bfill'}]\n", - " ]\n", - "}\n", - "```\n", - "- The dictionary keys are the names of the common transformers\n", - "- For each transformer, a list of transformations is given. They are defined by a list of two elements [custom_transformer key, {custom transformer args}]\n", - "\n", - "The **Resampler** is configured as follows:\n", - "```\n", - "resampler_agg_methods={\n", - " \"Temperatures\": AggMethod.MEAN\n", - "}\n", - "```\n", - "\n", - "- An optional key \"RESAMPLE\" may be given to specify the category aggregation method in case of resampling. By default, resampling method is _mean_. If you want \"mean\" for all categories, an empty dictionary may be specified (default value).\n", - "\n", - "\n", - "The **transformer list** :\n", - "Lastly you can specify the order of the transformations using transformers_list. For example transformers_list = [\"ANOMALIES\", \"COMMON\", \"RESAMPLER\", \"PROCESS\"]\n", - "- If transformer_list is left to None, transformers list hold all the category_transformers, than all the common transformers\n", - "- If \"RESAMPLER\" is not present in transformer_list, but a resampling_rule is provided, the \"RESAMPLER\" will automatically be added at the end of the transformers_list\n", - "- This list can be changed at all time\n", - "- You don't have to use all the transformers\n" - ] - }, - { - "cell_type": "markdown", - "id": "6256a228", - "metadata": {}, - "source": [ - "Below is an example for the dataset we just loaded." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8c38cb7b", - "metadata": {}, - "outputs": [], - "source": [ - "from corrai.measure import MeasuredDats, Transformer, AggMethod" - ] - }, - { - "cell_type": "markdown", - "id": "36f28fed", - "metadata": {}, - "source": [ - "You can print the list of all available transformers from the class Transformer (using enums) to help you configure MeasuredDats. More information on these transformers are available in the corrai.measure.py script." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8470f497", - "metadata": {}, - "outputs": [], - "source": [ - "list(Transformer)" - ] - }, - { - "cell_type": "markdown", - "id": "b45eaeeb", - "metadata": {}, - "source": [ - "Likewise, you can check the available aggregation methods from the class AggMethod." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c837dfe0", - "metadata": {}, - "outputs": [], - "source": [ - "list(AggMethod)" - ] - }, - { - "cell_type": "markdown", - "id": "0f879ab8-d831-40cc-a183-ff6216cd3d76", - "metadata": {}, - "source": [ - "Let's first instanciate MeasuredDats, providing :\n", - "- the dataframe\n", - "- a dictionary mapping data categories to column name\n", - "- transformations (common transformation and resamplers)\n", - "- and a transformers list, provinding the order of transformation (RESAMPLE at the end by Defaults).s" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ee32ac52", - "metadata": {}, - "outputs": [], - "source": [ - "my_data = MeasuredDats(\n", - " data = raw_data,\n", - " category_dict = {\n", - " \"temperatures\": [\n", - " 'T_Wall_Ins_1', 'T_Wall_Ins_2', 'T_Ins_Ins_1', 'T_Ins_Ins_2',\n", - " 'T_Ins_Coat_1', 'T_Ins_Coat_2', 'T_int_1', 'T_int_2', 'T_ext', 'T_garde'\n", - " ],\n", - " \"illuminance\": [\"Lux_CW\"],\n", - " \"radiation\": [\"Sol_rad\"]\n", - " },\n", - " category_transformations = {\n", - " \"temperatures\": {\n", - " \"ANOMALIES\": [\n", - " [Transformer.DROP_THRESHOLD, {\"upper\": 100, \"lower\": -20}],\n", - " [Transformer.DROP_TIME_GRADIENT, {\"upper_rate\": 2, \"lower_rate\": 0}]\n", - " ],\n", - " },\n", - " \"illuminance\": {\n", - " \"ANOMALIES\": [\n", - " [Transformer.DROP_THRESHOLD, {\"upper\": 1000, \"lower\": 0}],\n", - " ],\n", - " },\n", - " \"radiation\": {\n", - " \"ANOMALIES\": [\n", - " [Transformer.DROP_THRESHOLD, {\"upper\": 1000, \"lower\": 0}],\n", - " ],\n", - " }\n", - " },\n", - " common_transformations={\n", - " \"COMMON\": [\n", - " [Transformer.INTERPOLATE, {\"method\": 'linear'}],\n", - " [Transformer.BFILL, {}],\n", - " ]\n", - " },\n", - " transformers_list=[\"ANOMALIES\", \"COMMON\"]\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "80ee27fd", - "metadata": {}, - "source": [ - "Note that transformers_list could have been left to None. Here, we are applying the _anomalies_ transformer, then the _common_ transformer." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "98dc6d7b", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_pipeline()" - ] - }, - { - "cell_type": "markdown", - "id": "e32c1647-b898-434d-92fc-280e4bad3f1e", - "metadata": {}, - "source": [ - "#### 1. Identify anomalies" - ] - }, - { - "cell_type": "markdown", - "id": "7c54297b", - "metadata": {}, - "source": [ - "The plot method can be used to plot the data. Provide a list to the argument cols to specify the entries you want to plot. A new y axis will be created for each data type. Choose begin and end dates for specific days." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8ac115e7", - "metadata": { - "pycharm": { - "is_executing": true - } - }, - "outputs": [], - "source": [ - "my_data.plot(\n", - " cols=['T_Wall_Ins_1', 'Sol_rad', 'Lux_CW'],\n", - " begin='2018-04-15',\n", - " end='2018-04-18',\n", - " title='Plot uncorrected data',\n", - " marker_raw=True,\n", - " line_raw=False,\n", - " plot_raw = True,\n", - " resampling_rule='15min'\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "70e79b06", - "metadata": {}, - "source": [ - "Plotted data are the corrected_data obtained after going through the pipeline described in the MeasuredDats object's transformers_list. You can specify an alternative transformers_list using the plot function argument transformers_list.\n", - "\n", - "Use plot_raw=True to display raw data. This is useful to assess the impact of the correction and of the resampling methods." - ] - }, - { - "cell_type": "markdown", - "id": "efea3932", - "metadata": {}, - "source": [ - "\n", - "The get_missing_values_stats method gives information on the amount of missing values.\n", - "You can get it for raw, corrected or partially corrected data depending on the transformers specified in transformers_list.\n", - "\n", - "Let's have a look for raw data. We specify an empty transformers_list that create a pipeline with a single Identity transformer." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "26c4db7c", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_missing_value_stats(transformers_list=[])" - ] - }, - { - "cell_type": "markdown", - "id": "e555a612-947e-4cf0-b6a6-a3467c877062", - "metadata": {}, - "source": [ - "Before correction, the journal shows that ~2% of the data is missing for the temperature sensor and ~3% for external temperature, \"garde\" temperature and solar radiation. It corresponds to data having a timestamp, but with missing value. In this specific case, this is not related to sensors errors. 2 distinct acquisition device were used to perform the measurement. The merging of the data from the two devices created troubles in timestamp " - ] - }, - { - "cell_type": "markdown", - "id": "344b3561-ea95-436d-85f6-5d2487224f69", - "metadata": {}, - "source": [ - "Now let's apply the ANOMALIES transformer to delete invalid data according to the specifications." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "386abd88", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_missing_value_stats([\"ANOMALIES\"])" - ] - }, - { - "cell_type": "markdown", - "id": "a08fd878", - "metadata": {}, - "source": [ - "It looks like the applied corrections removed several data.\n", - "For example, the sensors measuring the cell internal temperature have now up to __4.6%__ of missing data.\n", - "\n", - "Few corrections were applied to the outside temperature sensor.\n", - "\n", - "The journal of correction holds further information on the gaps of data.\n", - "For example, we might want to know more about the missing values of T_int_1." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f0a01c87", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_gaps_description(cols=[\"T_int_1\"], transformers_list=[\"ANOMALIES\"])" - ] - }, - { - "cell_type": "markdown", - "id": "8c0b3f59", - "metadata": {}, - "source": [ - "- There are 11220 gaps\n", - "- 75% of these gaps do not exceed 1 timestep (~1min)\n", - "- The longest is 1h\n", - "\n", - "It is also possible to \"aggregate\" the gaps to know when at least one of the data is missing." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "724533ce", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_gaps_description(transformers_list=[\"ANOMALIES\"])[\"combination\"]" - ] - }, - { - "cell_type": "markdown", - "id": "61c767b8", - "metadata": {}, - "source": [ - "- There are 28007 gaps (~10% of the dataset).\n", - "- The size of 75% of these gaps do not exceed 2 minutes\n", - "- The biggest gap lasts about 1 hour\n", - "\n", - "There is not a lot of difference. It looks like the values are missing at the same timestamps. This is good news, it means that there are a lot of periods with all data available" - ] - }, - { - "cell_type": "markdown", - "id": "1275e599", - "metadata": {}, - "source": [ - "The plotting method plot_gaps can be used to visualize where the gap(s) happened.\n", - "\n", - "This dataset holds a lot of values, hence we only plot the input 'T_int_1' here, as it is supposed to have the more gaps.\n", - "\n", - "We are interested in gaps lasting more than 15 minutes." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "88573e39", - "metadata": {}, - "outputs": [], - "source": [ - "import datetime as dt\n", - "my_data.plot_gaps(\n", - " cols=['T_Wall_Ins_1', 'Sol_rad', 'Lux_CW'],\n", - " begin='2018-03-25',\n", - " end='2018-03-25',\n", - " gaps_timestep=dt.timedelta(minutes=15),\n", - " transformers_list=[\"ANOMALIES\"]\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0f229b95", - "metadata": {}, - "source": [ - "There seem to be only 1 gap greater than 15 minutes, it happens the 2018-03-25 between ~02:00 and ~3:00.\n", - "This is the gap we identified in the correction journal." - ] - }, - { - "cell_type": "markdown", - "id": "4e5a7b15", - "metadata": {}, - "source": [ - "#### 2- Missing data interpolation\n", - "In our example, the same method is used to fill the gaps for all categories.\n", - "It is described in common_transformations\n", - "Below is the object transformers list:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "24a391a1", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.transformers_list" - ] - }, - { - "cell_type": "markdown", - "id": "f799dc35", - "metadata": {}, - "source": [ - "To get the corrected data, we just need to call get_corrected_data method, with default arguments" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b28d0e1c", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_corrected_data()" - ] - }, - { - "cell_type": "markdown", - "id": "c68c870f", - "metadata": {}, - "source": [ - "We can check the effect of the transformation :" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5ca472de", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_missing_value_stats()" - ] - }, - { - "cell_type": "markdown", - "id": "16ccf064", - "metadata": {}, - "source": [ - "Wow, perfect dataset !\n", - "\n", - "**Be careful** Having zero missing data doesn't imply zero issues; a dataset that was initially of poor quality remains so even if gaps are filled by copying values or interpolating between (_what seems to be_) valid points.\n" - ] - }, - { - "cell_type": "markdown", - "id": "6b3effa3", - "metadata": {}, - "source": [ - "#### 3- Reducing the dataset size\n", - "As we noted earlier, 1 min timestep is too small.\n", - "Regarding the physical phenomenon involved here, we could say that 5min is ok.\n", - "Let's have a look at the corrected data versus the raw data.\n", - "We select a period around the gap we identified (from the 2018-03-24 to the 2018-03-26)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6feafd9f", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.plot(\n", - " title=\"Raw data versus corrected data\",\n", - " cols=['T_int_1'],\n", - " begin='2018-03-24 00:00:00',\n", - " end='2018-03-26 05:00:00',\n", - " plot_raw=True,\n", - " plot_corrected=True,\n", - " line_raw=False,\n", - " marker_corrected=False,\n", - " resampling_rule='5min'\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "b69ab339", - "metadata": {}, - "source": [ - "On the above graph, you can see the effects of mean resampling, that diminishes the number of points and smooths out the data.\n", - "\n", - "The gap have been filled out, using linear interpolation at the required timestep. It is important to compare your data before and after applying the correction methods. Resampling with a large timestep can lead to a loss of information. \n", - "Depending on the modelkkbg aoori\n", - "For modelling considerations, let's select two datasets containing days with " - ] - }, - { - "cell_type": "markdown", - "id": "b405bfcb-db04-4d3a-bda6-e18383db8138", - "metadata": {}, - "source": [ - "# Conclusion" - ] - }, - { - "cell_type": "markdown", - "id": "4d1f0ff1-0c9d-4936-9482-ba865da78c0e", - "metadata": {}, - "source": [ - "Each dataset selection should be guided by the specific requirements of the modeling approach, ensuring that the data provided fits the analytical framework but also enhances the model’s predictive capability and applicability to real-world scenarios.\n", - " \n", - "Datasets characterized by prolonged periods of stable external conditions such as consistent temperatures without significant fluctuations, will be prioritized for steady-state modeling. For dynamic modeling applications — where system responses to varying inputs over time are crucial — datasets with diverse environmental profiles(varying solar radiation intensities across different days and seasons) should be preferred. \n" - ] - }, - { - "cell_type": "markdown", - "id": "3cc951dd-1f96-4622-bbd9-e47402187ae7", - "metadata": {}, - "source": [ - "For instance, here is a plot during winter, with rather cold outside temperatures, and days with different profiles of solar radiations (from low to high values)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "403eb4b0-9073-473a-8f27-8c709b284604", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.plot(\n", - " title=\"Raw data versus corrected data\",\n", - " cols=['T_int_1', \"T_ext\", \"Sol_rad\" ],\n", - " begin='2018-01-09 00:00:00',\n", - " end='2018-01-11 23:00:00',\n", - " plot_raw=True,\n", - " plot_corrected=True,\n", - " line_raw=False,\n", - " marker_corrected=False,\n", - " resampling_rule='5min'\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "01c26452-f61d-4aaa-8ea2-f3bad150261c", - "metadata": {}, - "source": [ - "Alternatively, if we aim to characterize the wall according to ISO 9869-1:2023, which mandates steady temperatures during nighttime and a significant temperature difference between indoor and outdoor environments (>10 or 15 K), the measurements from the following nights would be pertinent." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fa89101b-c392-4bea-a12e-0adb3218be5a", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.plot(\n", - " title=\"Raw data versus corrected data\",\n", - " cols=['T_int_1', \"T_ext\", \"Sol_rad\" ],\n", - " begin='2018-03-25 00:00:00',\n", - " end='2018-03-28 23:00:00',\n", - " plot_raw=True,\n", - " plot_corrected=True,\n", - " line_raw=False,\n", - " marker_corrected=False,\n", - " resampling_rule='5min'\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "d2442c67-e045-4cfd-baa1-5a45a673fb30", - "metadata": {}, - "source": [ - "For future use, the corrected data during a period of interest can be saved with:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3198769e-b58f-4813-a860-ceca00292e88", - "metadata": {}, - "outputs": [], - "source": [ - "my_data.get_corrected_data(resampling_rule=\"5min\").loc[\"2018-01-09\":\"2018-01-11\"].to_csv(\n", - " \"resources/3day_study_df.csv\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c59334ae-6e1d-48bc-a685-d5284cace375", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git "a/tutorials/SA and Identification example_Modelica & python model of a green fa\303\247ade.ipynb" "b/tutorials/SA and Identification example_Modelica & python model of a green fa\303\247ade.ipynb" deleted file mode 100644 index 810c3e7..0000000 --- "a/tutorials/SA and Identification example_Modelica & python model of a green fa\303\247ade.ipynb" +++ /dev/null @@ -1,2248 +0,0 @@ -{ - "cells": [ - { - "metadata": {}, - "cell_type": "markdown", - "source": "This tutorial can be found and ran in the GITHUB libray corrAI: https://github.com/BuildingEnergySimulationTools/corrai", - "id": "fe658bc78054a398" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "---\n", - "# Characterisation method tutorial" - ], - "id": "d62323ea2b70844" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "The aim of this tutorial is to provide a complete workflow for material physical properties identification of a reference wall using **Modelitool** (modelica simulator) and **CorrAI** models by following these steps:\n", - "\n", - "1. **Measurement import and verification**:\n", - " - We will import measurement dataframe into the notebook and check measurement before selecting an analysis period.\n", - " \n", - "2. **Model creation/import**:\n", - " - We will either create a new mathematical model or impor model (FMU, openModelica).\n", - "\n", - "3. **Performing sensitivity analysis**:\n", - " - Sensitivity analysis will be conducted to determine how different input variables affect the output of the model. This step helps in identifying the most influential parameters in the model.\n", - "\n", - "4. **Model identification**:\n", - " - Using the results from the sensitivity analysis, we will identify the parameters that need to be adjusted to improve the model's accuracy. This involves fitting the model to the data and refining its parameters.\n", - "\n", - "---" - ], - "id": "fc9dbe513aaef7ed" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "# Introduction", - "id": "4ed710c97edd95e1" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## Use case presentation\n", - "\n", - "A \"real-scale\" test bench is used. The **O3BET** (or in this example BEF test bench in Anglet, France) offers experimental conditions to evaluate building façade solutions. Heat exchanges in a cell are restricted on five of its faces, while the sixth face is dedicated to the tested solution. Internal temperature and humidity conditions can be controlled or monitored. External conditions, including temperatures and solar radiation, are measured.\n", - "\n", - "The tested technology here is a green façade, coupled wiht insulated panels. The experimental setup is presented in the following picture: one cell is equiped with the technology (right one) and another serves as a reference (insulation only).\n", - "\n", - "| Figure : Pictures of reference wall and Urban Canopee installation |\n", - "| :---: |\n", - "| |\n", - "\n", - "Sensors (heatflux density meters, thermocouples, RTD) are positioned in several parts: in the middle of insulation panels, between insulation and concrete layers, between leaves, in substrates, indoor. Climatic conditions (external temperature, incident solar radiation) are also monitored.\n", - "\n", - "\n", - "| Figure : Sensors installation scheme |\n", - "| :---: |\n", - "| | \n", - "\n", - "- Measure campaign spans from april 2024 to october 2024\n", - "- Acquisition timestep is 60 secondes minimum." - ], - "id": "49fdc09f8bf34966" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## Identification framework\n", - "The following framework is proposed to identify the **REFERENCE** wall thermal conductivity, and provides:\n", - "- Physical model description using Python\n", - "- Sensitivity analysis to identify materials properties which have an influence on the discrepancy between model outputs and measured phenomenon\n", - "- Wall thermal conductivity identification using optimization algorithm" - ], - "id": "eb68fb42d92f1785" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "# 1. Measurement import and verification\n", - "\n", - "First, let's load some generic libraries (pandas, Path). We can also define the function datetime_to_seconds, which will be later used to convert datetimes to seconds.\n", - "\n", - "\n", - "Then we can load the reference cell measurement data on python that will be used as boundary conditions. Note that the data loaded here should be cleaned beforhand (see Tutorial **\"MeasuredDat example\"** if needed)" - ], - "id": "cad4938a77f4a783" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import pandas as pd\n", - "import os\n", - "import plotly.express as px\n", - "from pathlib import Path\n", - "pd.options.mode.chained_assignment = None # default='warn'" - ], - "id": "77c679d36151b736", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import datetime as dt\n", - "\n", - "def datetime_to_seconds(index_datetime):\n", - " time_start = dt.datetime(index_datetime[0].year, 1, 1, tzinfo=dt.timezone.utc)\n", - " new_index = index_datetime.to_frame().diff().squeeze()\n", - " new_index.iloc[0] = dt.timedelta(\n", - " seconds=index_datetime[0].timestamp() - time_start.timestamp()\n", - " )\n", - " sec_dt = [elmt.total_seconds() for elmt in new_index]\n", - " return list(pd.Series(sec_dt).cumsum())" - ], - "id": "2f8d425878b94320", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": "TUTORIAL_DIR = Path(os.getcwd()).as_posix()", - "id": "59473d207b322155", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "reference_df = pd.read_csv(\n", - " Path(TUTORIAL_DIR) / \"resources/tuto_data_SA.csv\",\n", - " index_col=0,\n", - " sep=\";\",\n", - " decimal=\".\",\n", - " parse_dates=True\n", - ")" - ], - "id": "6d01322236314e2b", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": "reference_df.head()", - "id": "8a81862b0c17b1b4", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": "reference_df.isna().any().any()", - "id": "5a3e4bb69c962f76", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "There seems to be no missing data. Let's first plot all data to check if nothing stands out. Although data was cleaned (supposedly, some measurement errors and irregularities might have been missed in the process.", - "id": "3da1cce8d7320cb5" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "px.line(reference_df)", - "id": "bed89b439bf7437", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Here we can see that one of the temperature sensors installed in the insulation panels stopped measuring on september 7th. Only T_ins_2 will be used as the insulation temperature.", - "id": "c579c990290e894c" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "reference_df.loc[:,\"T_ins\"] = reference_df['T_ins_2'] # middle of insulation temperature\n", - "reference_df.loc[:,\"T_int\"] =reference_df[['Tint_1', 'Tint_2']].mean(axis=1) # indoor temperature\n", - "reference_df.loc[:,\"T_interface\"] = reference_df[['T_interface_1', 'T_interface_2']].mean(axis=1) # interface temperature, between insulation and concrete panels" - ], - "id": "b84fb1e0db1d3855", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "For modeling purpose, we convert the temperature from °C to Kelvin.", - "id": "afb9340941fec786" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "temperatures = [\n", - " 'T_ins',\n", - " 'T_int',\n", - " 'T_interface', \n", - " 'T_ext'\n", - "]\n", - "reference_df[temperatures] += 273.15" - ], - "id": "d29a52e89115cdd2", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Also, for the first simulation, let's chose a short period with both sunny and cloudy days:", - "id": "6b4ea7c50fac69a0" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "simulation_df = reference_df.loc[\"2024-09-04 00:00\":\"2024-09-09 00:00\"]", - "id": "342da2f0046b008a", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "# 2. Modeling approach and set-up", - "id": "ac2b9b75817d9ed5" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "## 2.1 Proposed model ", - "id": "8848934ec278aa7" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "For this example we propose a resistance/capacity approach. Based on electrical circuit analogy, each layer of the wall is modeled by two resistances and a capacity:\n", - "\n", - "\n", - "| Figure : RC model|\n", - "| :---: |\n", - "| | \n", - "\n", - "\n", - "Note that it is recommended to start with a **very simple** model (e.g., one resistance, one capacity with only Text and Tint as boundary conditions) and gradually make it more complex as you identify parameters. For the sake of this tutorial, the proposed model is already a bit detailed.\n", - "\n", - "\n", - "The following is a brief description of the thermal model:\n", - "\n", - "- Each wall layer is modeled by 2 thermal resistances and a capacity.\n", - " - Resistances to create a gradiant and better resolution of distribution of heat flow : $ R_1 = R_2 = \\frac{ep_{layer}}{lambda_{layer} \\times 2} $ \n", - " - Capacity in the middle of both our layers, representing its thermal mass and ability to store heat. : $ C = ep_{layer} \\times rho_{layer} \\times cap_{layer} $\n", - " \n", - "- Inside and outside convection/conduction transfers are model as a constant value thermal resistance.\n", - "\n", - "- Infrared transfers are considered :\n", - " - With the sky, with $ T_{sky} = 0.0552T_{ext}^{1.5} $ as the sky is a significant source of infrared radiation, especially at night. This radiation can have a considerable impact on the thermal behavior of the system, influencing both heating and cooling processes\n", - " - With the surrounding considered to be at $ T_{ext} $ as surroundings or environment also emit infrared radiation\n", - "\n", - "- Short wave solar radiation heat flux is computed $Sw_{gain} = Pyr \\times \\alpha_{coat} $ with $Pyr$ the measured solar radiation onthe wall (W/m²) and $\\alpha_{coat}$ the coating solar absorbtion coefficient.\n", - "\n", - "- Temperatures $ T_{ext}$ and $T_{int} $ are boundary conditions. $ T_{int}$ represents the temperature within the controlled environment of the system.\n", - "\n", - "Here are somes theoretical parameters for the model:" - ], - "id": "be34c58477a92f22" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "# Surface of the tested wall\n", - "S_wall = 7\n", - "\n", - "# Thickness of layer\n", - "ep_concrete = 0.200 #m\n", - "ep_ins = 0.140 #m\n", - "\n", - "# Conductivity of layer\n", - "lambda_concrete = 0.13 # (W/mK)\n", - "lambda_ins = 0.031 # W/(mK)\n", - "\n", - "# Density of layer\n", - "rho_concrete = 2400 # kg/m3 \n", - "rho_ins = 32.5 # kg/m3\"\n", - "\n", - "\n", - "sc_concrete = 880 # J/kg.K\n", - "sc_ins = 1000 # J/kg.K\n", - "\n", - "# solar paremetesr\n", - "alpha = 0.2 # absorption coefficient\n", - "epsilon = 0.8 # emissivity \n", - "fview = 0.5 # view factor of tested wall" - ], - "id": "536c528c698d62a", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## 2.3 Define a simulator\n", - "\n", - "You can either load a Modelica model or FMU (using **Modelitool** library, see: https://github.com/BuildingEnergySimulationTools/modelitool), \n", - "or directly write a model on Python. For either, here are some in common parameters definition. \n", - "\n", - "For the model to work with corrAI and modelitool libraries (sensitivity analyses, optimisation, etc.), the model will be written as a class with :\n", - "- A parameter dictionary,\n", - "- Simulation options.\n", - "\n", - "How does it work:\n", - "\n", - "- Parameters: The wall's thermal resistances (R_ext, R_int, R_concrete, R_ins) and heat capacities (C_concrete, C_ins) are initialized with default values, but can be overridden by user inputs. Other physical parameters include the wall surface area (S_wall), view factor (fview), and material emissivity (epsilon).\n", - "- Dataframe: The model uses a dataframe input which includes external and internal temperatures (T_ext, T_int), time (time_sec), and solar radiation (Pyr from a pyranometer).\n", - "- Initial Conditions: The temperatures of the external surface, concrete, insulation, and interfaces are initialized based on the external and internal temperatures at the start.\n", - "- Time-Stepping: The simulation proceeds over time steps, updating temperatures at each layer.\n", - "- Radiative Heat Transfer: The model calculates radiative heat transfer between the wall surface and the sky, ambient, and direct solar radiation.\n", - "- Temperature Update: At each time step, temperatures are updated based on thermal resistances and capacitances of each layer using a finite difference approach." - ], - "id": "fa7b692f2ef788f9" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "from corrai.base.model import Model\n", - "\n", - "class OpaqueWallSimple(Model):\n", - " def simulate(\n", - " self,\n", - " property_dict: dict,\n", - " simulation_options: dict,\n", - " simulation_kwargs: dict | None = None,\n", - " **kwargs,\n", - " ) -> pd.DataFrame:\n", - "\n", - " default_parameters = {\n", - " \"R_ext\": 0.005,\n", - " \"R_int\": 0.01,\n", - " \"R_concrete\": 0.10,\n", - " \"R_ins\": 0.32,\n", - " \"C_concrete\": 2.95e6,\n", - " \"C_ins\": 3.64e4,\n", - " \"alpha\": 0.2,\n", - " \"S_wall\": 7,\n", - " \"epsilon\": 0.4,\n", - " \"fview\": 0.5,\n", - " }\n", - " parameters = {**default_parameters, **property_dict}\n", - "\n", - " R_ext = parameters[\"R_ext\"]\n", - " R_int = parameters[\"R_int\"]\n", - " R_concrete = parameters[\"R_concrete\"]\n", - " R_ins = parameters[\"R_ins\"]\n", - " C_concrete = parameters[\"C_concrete\"]\n", - " C_ins = parameters[\"C_ins\"]\n", - " alpha = parameters[\"alpha\"]\n", - " S_wall = parameters[\"S_wall\"]\n", - " epsilon = parameters[\"epsilon\"]\n", - " fview = parameters[\"fview\"]\n", - "\n", - " sigma = 5.67e-8 # W/m^2/K^4\n", - "\n", - " df = simulation_options[\"dataframe\"]\n", - " time = df[\"time_sec\"].values\n", - " T_ext = df[\"T_ext\"].values\n", - " T_int = df[\"T_int\"].values\n", - " Q_rad = df[\"Pyr\"].values\n", - "\n", - " startTime = simulation_options.get(\"startTime\", time[0])\n", - " stopTime = simulation_options.get(\"stopTime\", time[-1])\n", - "\n", - " mask = (time >= startTime) & (time <= stopTime)\n", - " time = time[mask]\n", - " T_ext = T_ext[mask]\n", - " T_int = T_int[mask]\n", - " Q_rad = Q_rad[mask]\n", - "\n", - " # init\n", - " T_se = np.zeros(len(time))\n", - " T_concrete = np.zeros(len(time))\n", - " T_ins = np.zeros(len(time))\n", - " T_interface = np.zeros(len(time))\n", - " T_si = np.zeros(len(time))\n", - " T_sky = np.zeros(len(time))\n", - "\n", - " T_se[0] = T_ext[0]\n", - " T_concrete[0] = 299\n", - " T_ins[0] = T_int[0]\n", - " T_interface[0] = (T_ins[0] + T_concrete[0]) / 2\n", - " T_si[0] = T_int[0]\n", - " T_sky[0] = T_int[0]\n", - "\n", - " for t in range(1, len(time)):\n", - " dt = time[t] - time[t - 1]\n", - "\n", - " T_sky[t] = 0.0552 * (T_ext[t] ** 1.5)\n", - "\n", - " Q_rad_sky = epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_sky[t] ** 4) * S_wall\n", - " Q_rad_amb = epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_ext[t - 1] ** 4) * S_wall\n", - " Q_rad_dir = Q_rad[t - 1] * alpha * S_wall\n", - "\n", - " T_se[t] = (\n", - " T_ext[t - 1] / R_ext\n", - " + T_ins[t - 1] / (R_ins / 2)\n", - " + Q_rad_dir - Q_rad_sky - Q_rad_amb\n", - " ) / (1 / R_ext + 1 / (R_ins / 2))\n", - "\n", - " T_interface[t] = (\n", - " T_ins[t - 1] / (R_ins / 2) + T_concrete[t - 1] / (R_concrete / 2)\n", - " ) / (1 / (R_concrete / 2) + 1 / (R_ins / 2))\n", - "\n", - " T_si[t] = (\n", - " T_int[t - 1] / R_int + T_concrete[t - 1] / (R_concrete / 2)\n", - " ) / (1 / R_int + 1 / (R_concrete / 2))\n", - "\n", - " T_ins[t] = T_ins[t - 1] + dt / C_ins * (\n", - " (T_se[t] - T_ins[t - 1]) / (R_ins / 2)\n", - " + (T_interface[t] - T_ins[t - 1]) / (R_ins / 2)\n", - " )\n", - "\n", - " T_concrete[t] = T_concrete[t - 1] + dt / C_concrete * (\n", - " (T_interface[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", - " + (T_si[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", - " )\n", - "\n", - " # output\n", - " df_out = pd.DataFrame(\n", - " {\n", - " \"T_concrete\": T_concrete,\n", - " \"T_interface\": T_interface,\n", - " \"T_ins\": T_ins,\n", - " },\n", - " index=df.index[mask],\n", - " )\n", - " self.simulation_options = simulation_options\n", - " return df_out" - ], - "id": "ce77813f073f7e13", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Datetime should be in second: we can use datetime_to_seconds function defined earlier. Moreover, data are in minutes, we should resample them to 5min samples.", - "id": "49afd017d15ef580" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "simulation_df.loc[:,\"time_sec\"] = datetime_to_seconds(simulation_df.index)\n", - "simulation_df_resample = simulation_df.resample(\"5min\").mean()" - ], - "id": "81614ca843787295", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "Now, let's define: \n", - "- simulation options, with starttime, endtime, and a dataframe for boundary conditions\n", - "- a dictionary, containing values for ou different parameters" - ], - "id": "946f30e15c17a2f" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "second_index = datetime_to_seconds(simulation_df_resample.index)", - "id": "c4af9c61d72ec5cb", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "simulation_options_PYTH={\n", - " \"dataframe\":simulation_df_resample,\n", - " \"startTime\": second_index[0],\n", - " \"endTime\": second_index[-1], \n", - "}" - ], - "id": "71b2a7cf44a5deb4", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "parameter_dict_PYTH = {\n", - " \"R_ext\": 0.04/S_wall, \n", - " \"R_int\": 0.13/S_wall, \n", - " \"R_concrete\": 1 / (lambda_concrete / ep_concrete) / 2 / S_wall, \n", - " \"R_ins\": 1 / (lambda_ins / ep_ins) / 2 / S_wall, \n", - " \"C_ins\": rho_ins*ep_ins*S_wall*sc_ins, \n", - " \"C_concrete\": rho_concrete*ep_concrete*S_wall*sc_concrete, \n", - " \"alpha\": alpha, \n", - " \"S_wall\": S_wall, \n", - " \"epsilon\": epsilon,\n", - " 'fview': fview\n", - "}" - ], - "id": "58ae79a228160f5", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "We can now instantiate the model and run the simulation. ", - "id": "db06964b679af24" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "simu_PYTH = OpaqueWallSimple()\n", - "\n", - "init_res_PYTH = simu_PYTH.simulate(\n", - " property_dict=parameter_dict_PYTH,\n", - " simulation_options=simulation_options_PYTH\n", - ")" - ], - "id": "34845fea7fc10aad", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": "init_res_PYTH.head()", - "id": "549136e8960dc769", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Let's compare results to measurement:", - "id": "8e2d5abe27a2cf9b" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "#renaming\n", - "copy_res = init_res_PYTH\n", - "copy_res.index = copy_res.index.tz_localize(None)\n", - "\n", - "copy_res = copy_res.rename(columns={\n", - " \"T_concrete\": \"T_concrete_PYTHON\",\n", - " \"T_interface\": \"T_interface_PYTHON\",\n", - " \"T_ins\": \"T_insulation_PYTHON\",\n", - "})" - ], - "id": "4d225f8a2dcdd2b6", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "measure_comp = pd.concat([\n", - " simulation_df_resample[[\"T_interface\", \"T_ins\"]], \n", - " copy_res[[\"T_interface_PYTHON\", \"T_insulation_PYTHON\" ]]], axis = 1)\n", - "\n", - "color_map = {\n", - " \"T_interface\": \"darkblue\", \n", - " \"T_interface_PYTHON\": \"blue\", \n", - " \"T_ins\": \"darkgreen\", \n", - " \"T_insulation_PYTHON\": \"green\" \n", - "}\n", - "\n", - "fig = px.line(measure_comp)\n", - "\n", - "for trace in fig.data:\n", - " trace_name = trace.name\n", - " if trace_name in color_map:\n", - " trace.line.color = color_map[trace_name]\n", - " \n", - "fig.show()" - ], - "id": "374575bc5edf48b4", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Not good. A sensitivity analysis should be performed to rank the parameters by order of influence on the error between measured temperature and model prediction.\n", - "id": "ec5e69a99a60a6dd" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "# 3. Sensitivity analysis\n", - "It is very important to know how our defined parameters have an influence on the model prediction. Therefore, we use a sensitivity analysis to \"rank\" the parameter by order of influence\n", - "on the model error. " - ], - "id": "ccdfb5cca982ee18" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## 3.1. Error function\n", - "The chosen error function is the CV_RMSE. The formula for CV_RMSE is given by:\n", - "\n", - "$$\n", - "CV\\_RMSE = \\frac{RMSE}{\\bar{y}}\n", - "$$\n", - "\n", - "Where:\n", - "- *RMSE* is the root mean squared error,\n", - "- *bar{y}* is the mean of the observed values.\n", - "\n", - "The RMSE is calculated as:\n", - "\n", - "$$\n", - "RMSE = \\sqrt{\\frac{1}{n} \\sum_{i=1}^{n} (y_i - \\hat{y}_i)^2}\n", - "$$\n", - "\n", - "Where:\n", - "- *n* is the number of observations,\n", - "- *y_i* is the observed value for the \\( i \\)-th observation,\n", - "- *hat{y}_i* is the predicted value for the \\( i \\)-th observation.\n", - "\n", - "The CV_RMSE measures the variation of the RMSE relative to the mean of the observed values. It provides a standardized measure of the error, which can be useful for comparing the performance of different models across different datasets.\n", - "Here, we can chose the error function as the CV_RMSE between measured temperature(s) and model prediction." - ], - "id": "e46bcafc7717aa5a" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## 3.2. Tested parameters\n", - "\n", - "The chosen parameters are all the model parameters θ=(R_concrete, R_ins, C_ins, C_concrete, alpha, epsilon, R_ext, R_int ).\n", - "Each decision variable of the optimization problem must be described using a `Parameter` object.\n", - "A parameter specifies:\n", - "* `name` — The variable name (string, must be unique within the problem).\n", - "* `ptype` — Variable type, one of:\n", - " * `\"Real\" `— Continuous real number\n", - " * `\"Integer\"` — Discrete integer\n", - " * `\"Binary\"` — Boolean, domain {False, True} (set automatically if no domain is given)\n", - " *` \"Choice\"` — Categorical variable with a fixed set of discrete options\n", - "* `Domain definition` — Choose exactly one of:\n", - " * ` interval=(lo, hi) `— Lower and upper bounds (for \"Real\" and \"Integer\", optional for \"Binary\" if you want (0,1))\n", - " * ` values=(v1, v2, …)` — Explicit list/tuple of allowed values (for \"Choice\", and optionally for \"Integer\", \"Real\", or \"Binary\")\n", - "*` Optional fields`:\n", - " * `init_value` — Initial value (or tuple/list of initial values for batch runs); must be within the defined domain\n", - " * `relabs` — `\"Absolute\"` or `\"Relative\"` (or a boolean flag, depending on usage in your model)\n", - " * `model_property` — String or tuple specifying the corresponding property in the simulation/model\n", - " * `min_max_interval` — Optional extra bounds for use in analysis/validation" - ], - "id": "da0b289ff76e21de" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "from corrai.base.parameter import Parameter", - "id": "ea7f3dc574b396f3", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "frac_p = 0.20 # 20%\n", - "frac_conv = 0.40 # 40%\n", - "\n", - "params = [\n", - " Parameter(\n", - " name='R_concrete',\n", - " interval=(1-frac_p, 1+frac_p),\n", - " ptype=\"Real\",\n", - " init_value=1 / (lambda_concrete / ep_concrete) / 2 / S_wall,\n", - " relabs=\"Relative\",\n", - " model_property='R_concrete',\n", - " ),\n", - " Parameter(\n", - " name='R_ins',\n", - " interval=(1-frac_p, 1+frac_p),\n", - " init_value=1 / (lambda_ins / ep_ins) / 2 / S_wall,\n", - " relabs=\"Relative\",\n", - " ptype=\"Real\",\n", - " model_property='R_ins',\n", - " ),\n", - " Parameter(\n", - " name='C_ins',\n", - " interval=(1-frac_p, 1+frac_p),\n", - " init_value=rho_ins*ep_ins*S_wall*sc_ins,\n", - " relabs=\"Relative\",\n", - " ptype=\"Real\",\n", - " model_property='C_ins',\n", - " ),\n", - " Parameter(\n", - " name='C_concrete',\n", - " interval=(1-frac_p, 1+frac_p),\n", - " init_value=rho_concrete*ep_concrete*S_wall*sc_concrete,\n", - " relabs=\"Relative\",\n", - " ptype=\"Real\",\n", - " model_property='C_concrete',\n", - " ),\n", - " Parameter(\n", - " name='alpha',\n", - " interval=(0.1, 0.6),\n", - " ptype=\"Real\",\n", - " model_property='alpha',\n", - " ),\n", - " Parameter(\n", - " name='epsilon',\n", - " interval=(0.2, 0.9),\n", - " ptype=\"Real\",\n", - " model_property='epsilon',\n", - " ),\n", - " Parameter(\n", - " name='R_ext',\n", - " init_value= 0.04/S_wall,\n", - " interval=(1-frac_conv, 1+frac_conv),\n", - " ptype=\"Real\",\n", - " relabs=\"Relative\",\n", - " model_property='R_ext',\n", - " ),\n", - " Parameter(\n", - " name='R_int',\n", - " init_value= 0.13/S_wall,\n", - " interval=(1-frac_conv, 1+frac_conv),\n", - " ptype=\"Real\",\n", - " relabs=\"Relative\",\n", - " model_property='R_int',\n", - " ),\n", - "]" - ], - "id": "8f8e2bab5b874374", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## 3.3. Problem description\n", - "We can now load from `corrai.sensitivity` module a sensitivity method.\n", - "\n", - "*Note: for now, only SOBOL, FAST, RBD_FAST, \n", - "and MORRIS methods are implemented.*" - ], - "id": "64dfc055e9c59627" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "### 3.1.A SOBOL method", - "id": "3c70d720457ec5a9" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "from corrai.sensitivity import SobolSanalysis", - "id": "f8e6cd080e0732af", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "sa_study = SobolSanalysis(\n", - " parameters=params,\n", - " model=simu_PYTH,\n", - " simulation_options=simulation_options_PYTH,\n", - ")" - ], - "id": "cba431b6fc562b4e", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "The method add sample draw a sample of parameters to be simulated. Each method has its sampling method. Please see SALib documentation for further explanation (https://salib.readthedocs.io/en/latest/index.html)\n", - "\n", - "Note that:\n", - "- Convergence properties of the Sobol' sequence is only valid if\n", - " `N` (100) is equal to `2^n`.\n", - " N (int) – The number of samples to generate. Ideally a power of 2 and <= skip_values.\n", - "- Convergence properties of the Fast' method is only valid if sample size N > 4M^2 (M=4 by default)" - ], - "id": "2c00547d792def6c" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "sa_study.add_sample(N=2**6, n_cpu=-1)", - "id": "4fe53d7e43bada10", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "First, we can first take a look on the parallel coordinate plot of all parameter values and one of the simulation outputs aggregated according to a chosen aggregation method:", - "id": "23e24bd8271d28e5" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "sa_study.plot_pcp(\n", - " aggregations={\"T_ins\": np.mean},\n", - ")" - ], - "id": "16a854b3cfe48368", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "Now, let's perform a **Sobol sensitivity analysis** on the following indicator: cv_rmse on `T_ins` .\n", - "The `analyze()` method computes Sobol indices based on a chosen performance metric—in this case, the **cv_rmse**—by comparing model predictions of `T_ins` against the provided reference time series (measured or baseline data)." - ], - "id": "ca2c82746fa97244" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "sa_study.analyze(\n", - " indicator=\"T_ins\",\n", - " method=\"cv_rmse\",\n", - " reference_time_series=simulation_df_resample[\"T_ins\"],\n", - ")" - ], - "id": "990ac0bb9acddf35", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "The `plot_bar()` method displays the Sobol sensitivity indices as a bar chart, allowing a quick comparison of parameter importance.\n", - "It uses the aggregated results from the `analyze()` step (e.g., based on `cv_rmse`) and shows first-order indices for each parameter, making it easy to identify which parameters most influence the chosen performance metric.\n", - "\n", - "The sum of all the indices should be close to 1. Also, the mean confidence interval should be very low. In that case, results of the sensitivity analysis can be considered as robust." - ], - "id": "f5c2a855e8324d1d" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "sa_study.plot_bar(\"T_ins\")", - "id": "3305230bc9a51f3d", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "The parameter epsilon appears to have the most influence on the model errors. This impact is calculated on the overall error, but will depend on the time of the day. Let's observe the parameters' impact dynamically with a 15minutes frequency.\n", - "\n", - "The `plot_dynamic_metric()` method extends the analysis to a frequency view of the selected performance metric.\n", - "Here, we apply it to `T_ins` using the `cv_rmse` metric, comparing model output against the `reference_time_series` at regular intervals.\n", - "\n", - "Key arguments:\n", - "- **`freq`**: Controls the temporal resolution for aggregation (e.g., `freq=\"2h\"` computes sensitivity indices every two hours). This enables tracking how parameter influence changes over time.\n", - "- **`method`**: The metric used for comparison (here, `cv_rmse` between model predictions and the reference).\n", - "- **`reference_time_series`**: The baseline data against which each simulation is evaluated.\n", - "\n", - "This approach is particularly useful in dynamic systems, where parameter importance may vary significantly across different time periods.\n", - "\n", - "*Calculation can take time according to the number of simulations and frequency of data.*\n" - ], - "id": "e53ad52d3a6e9896" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "sa_study.plot_dynamic_metric(\"T_ins\", freq=\"2h\", method=\"cv_rmse\", reference_time_series=simulation_df_resample[\"T_ins\"] )", - "id": "fd866b363245ec8", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "### 3.1.B MORRIS method", - "id": "a31e489e8d7ee834" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Note that a first screening using MORRIS method (as well as `RBD-FAST` and `RBD` methods) could have been performed.", - "id": "204bb3136c2d6d5b" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "For MORRIS method, two indices, µj* for the mean of the absolute values of these effects and σj for the standard deviation of these effects, are calculated as follows:\n", - "\n", - "$$\n", - "mu_{j}^{*} = \\frac{1}{r} \\sum_{i=1}^{r} E_{ij}\n", - "$$\n", - "\n", - "$$\n", - "sigma_{j} = \\sqrt{\\frac{1}{r-1} \\sum_{i=1}^{r} (E_{ij} - \\mu_{j}^{*})^2}\n", - "$$" - ], - "id": "bd08f44deb200457" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from corrai.sensitivity import MorrisSanalysis\n", - "\n", - "sa_study = MorrisSanalysis(\n", - " parameters=params,\n", - " model=simu_PYTH,\n", - " simulation_options=simulation_options_PYTH,\n", - ")" - ], - "id": "71c9b5fa64936e4a", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": "sa_study.add_sample(N=3**2, n_cpu=-1)", - "id": "a9ef9dae1e1b2296", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "We can plot all simulations in one graph and compare the simulated internal temperature to measured T_int. Argument show_legends can be set to True if you want see associated parameters values.", - "id": "8ed249dc6495c87e" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "%load_ext autoreload\n", - "%autoreload 2" - ], - "id": "dace560c00468bda", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from corrai.sampling import plot_sample\n", - "\n", - "plot_sample(\n", - " results=sa_study.results,\n", - " indicator=\"T_ins\",\n", - " reference_timeseries = simulation_df_resample[\"T_ins\"],\n", - " x_label=\"time\",\n", - " y_label=\"simulated temperature of T_ins [K]\",\n", - " show_legends=False\n", - ")" - ], - "id": "d04fbd553b2ec3a2", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "This graph can be very instructive as at some moments, simulations are far from measurements. It show that whatever the values of our parameters, it still does not fit reality: this is either due to a problem of measurement, or to our modeling approach (physical inconsistency, physical phenomenon not properly taken into account, etc.).\n", - "id": "69c09ae315745853" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "sa_study.evaluate(\n", - " model = simu_PYTH,\n", - " simulation_options=simulation_options_PYTH,\n", - ")" - ], - "id": "d068e11a7572ee9c", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "The highter $mu_{j}$, the more the parameter $j$ contributes to an uncertain output, and the higher $sigma_{j}$, the more pronounced the interaction effects between the model parameters are. Plotting $sigma_{j}$ against $mu_{j}^{}$ is often used to distinguish factors with negligible, linear, and/or interaction effects.", - "id": "3cb2bc12ba38b9af" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "sa_study.plot_scatter(indicator=\"T_ins\")", - "id": "ccd6dc275855f2be", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Besides this visual interpretation, it is also possible to calculate the Euclidean distance $d$ to the origin to obtain the total effect of the uncertain parameters:", - "id": "345e88c76ddfeebd" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "sa_study.plot_bar(indicator=\"T_ins\")", - "id": "6a4ceb844b59ab20", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "Five parameters seem to have more impact on the error than other: $alpha$, $R_{ins}$, $epsilon$, $R_{ext}$, and $R_{concrete}$. \n", - "These results are consistant with SOBOL method." - ], - "id": "9a387a5d3c8c5cc1" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## 3.4 Conclusion on sensitivity analysis\n", - "\n", - "The sensitivity analysis allows us to rank the influence of uncertain parameter\n", - "on an indicator. \n", - "\n", - "Results with Sobol are consistant with Morris. $epsilon$ and $alpha$ are the most influencial parameters, followed by $R_{ins}$, $R_{ext}$, and $R_{concrete}$.\n", - "\n", - "In the following section, we will see how to use corrai to identify the\n", - "optimal values for these parameters in order to fit the measurement." - ], - "id": "76b78d1d9e64eca" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "# 4. Identification\n", - "Now, we proceed to finding optimal values for these parameters by minimizing the coefficient of variation of root mean square error (cv_rmse) between one or several measured nodes, and one or several relevant outputs of our model.\n", - "\n", - "\n", - "### Step-by-Step Process\n", - "\n", - "1. **Define the model and parameters**: by defining `OpaqueWallSimple`(this we already did) and specifying the parameters to be identify. Each parameter includes a name, an interval of possible values, a type, and an initial value.\n", - "\n", - "2. **Instantiate an objective function**: We create an instance of the `ObjectiveFunction` class, providing the model, simulation options, list of parameters, and indicators. The `scipy_obj_function` method of `ObjectiveFunction` will be used as the objective function for optimization. This method calculates the cv_rmse for the given parameter values.\n", - "\n", - "3. **Perform optimization**: We use the `minimize_scalar` function from `scipy.optimize` to minimize the objective function. Different methods can be chosen for optimization, such as Brent, bounded, and golden.\n" - ], - "id": "4c84cb82b435f019" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "## 4.1. Objective function\n", - "In parameter optimization, we aim to adjust certain model parameters to minimize the difference between simulated and observed data. The objective function is a scalar function that quantifies this difference. In this case, the CV_RMSE (Coefficient of Variation of Root Mean Square Error) is used as a measure of how well the model output matches the reference measurements.\n", - "\n", - "**How the ObjectiveFunction Class Works** : The ObjectiveFunction class simplifies the process of optimizing model parameters by providing a structured way to:\n", - "\n", - "- Run simulations: For a given set of parameters, the model is simulated over the input data.\n", - "- Calculate error metrics (e.g., CV_RMSE): The model output is compared to the reference measurements, and a scalar error metric is calculated (such as the CV_RMSE).\n", - "Optimize: The class can then be used with optimization algorithms (like scipy.optimize or pymoo) to adjust the model parameters in order to minimize the error metric.\n", - "Attributes of the ObjectiveFunction Class\n", - "\n", - "To do so, the `ObjectiveFunction` class is designed to facilitate the optimization of model parameters using `scipy.optimize` or `pymoo` optimization methods by encapsulating the logic for simulation and indicator calculation. The `ObjectiveFunction` class takes a model, simulation options, a list of parameters to be calibrated, and a list of indicators as input. It provides methods to calculate the objective function, which can be used by optimization routines to find the optimal parameters.\n", - "\n", - "### Attributes\n", - "\n", - "- **model**: The model to be calibrated.\n", - "- **simulation_options**: A dictionary containing simulation options, including input data.\n", - "- **param_list**: A list of dictionaries specifying the parameters to be calibrated.\n", - "- **indicators_config**: A dictionary where the keys are the names of the indicators corresponding to the model outputs, and the values are either:\n", - " - An aggregation function to compute the indicator (e.g., np.mean, np.sum).\n", - " - A tuple (function, reference_data) if a comparison with reference data is required (e.g., sklearn.metrics.mean_squared_error, corrai.metrics.mae).\n", - "- **scipy_obj_indicator**: The name of the indicator used as the objective function for optimization with scipy.optimize. By default, it is the first key in indicators_config.\n", - "\n", - "Let's define an anlysis daterange and redefine simulation optiouns:" - ], - "id": "8fa01e0b0e27f2ef" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "feat_train = simulation_df_resample.loc[\"2024-09-04 00:00\":\"2024-09-07 00:00\"]\n", - "\n", - "second_index = datetime_to_seconds(feat_train.index)\n", - "\n", - "simulation_options_PYTH = {\n", - " \"dataframe\": feat_train,\n", - " \"startTime\": second_index[0],\n", - " \"endTime\": second_index[-1],\n", - "}" - ], - "id": "881eb3948326386e", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "## 4.2. One-dimensional optimization problem", - "id": "97d7597af7e451db" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "First, we can try scalar functions optimization from scipy (different methods: brent, boulded, golden ... see documentation on scipy website).\n", - "For Scypi, each objective function is minimized for optimization:\n", - "- Here we chose as indicators the temperature calculated and measured within the wall insulation. Note this could be another node (a heat flux densitiy, another temperature node).\n", - "- The identified parameter is $alpha$." - ], - "id": "323fe63a31424dcc" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Let's define a reference dictionnary, setting observation and prediction to be used for the CV_RMSE calculation. (Here they have have the same name, so it can be confusing)", - "id": "984adbbfb7c96a81" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from corrai.base.metrics import cv_rmse\n", - "\n", - "reference_dict = {\"T_ins\": (cv_rmse, feat_train[\"T_ins\"])}" - ], - "id": "ceaf13ac2715d351", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "We also need to set which parameter should be calibrated.", - "id": "d1cf9c7f458228e6" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "calibration_params = [Parameter(\n", - " name='epsilon',\n", - " interval=(0.1, 0.8),\n", - " ptype=\"Real\",\n", - " model_property='epsilon'\n", - ")]" - ], - "id": "8ca836679de47ae6", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "We can now instanciate an objective function, using `ObjectiveFunction`.", - "id": "7bb570cb25cc3c66" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "\n", - "from corrai.surrogate import ObjectiveFunction\n", - "\n", - "obj_func = ObjectiveFunction(\n", - " model=OpaqueWallSimple(),\n", - " simulation_options=simulation_options_PYTH,\n", - " parameters=calibration_params,\n", - " indicators_config=reference_dict,\n", - ")" - ], - "id": "ff18639556ad2235", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "The `minimize_scalar` function in `scipy.optimize` is used for scalar function minimization, specifically for one-dimensional optimization problems. This function finds the minimum value of a scalar function over a specified interval. The main methods are `Brent`, `bounded`, and `golden`.\n", - "\n", - "The function returns an optimization result object that contains information about the optimization process and the final solution.\n", - "\n", - "- **Brent** :The Brent method uses Brent’s algorithm, which combines a parabolic interpolation with the golden section search. This method does not require the interval bounds.\n", - "- **Golden** : Employs the golden section search method, which reduces the interval of uncertainty using the golden ratio. Simple and reliable for unimodal functions, but may be slower than Brent's method.\n", - "- **Bounded** : Restricts the search to the specified bounds using a combination of golden section search and parabolic interpolation.\n", - "Advantages: Ensures that the solution remains within the given bounds, making it ideal for constrained problems." - ], - "id": "9a24e457cc8ad243" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "from scipy.optimize import minimize_scalar", - "id": "33c9a83c1b01e35", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import warnings\n", - "warnings.filterwarnings('ignore', category=RuntimeWarning)\n", - "\n", - "result = minimize_scalar(\n", - " obj_func.scipy_obj_function, \n", - " bounds=obj_func.bounds[0],\n", - " method=\"Bounded\"\n", - ")\n", - "\n", - "result" - ], - "id": "78db962a6495b017", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "A solution is found with a value of 0.12 for alpha and 0.36 for the CV_RMSE. Let's check if the parameter value is close to the boundaries.", - "id": "d5092d877401ac2" - }, - { - "metadata": {}, - "cell_type": "code", - "source": "obj_func.bounds", - "id": "4cbf2fc9729e00c3", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "It is indeed not far from 0.1 but not a the limit. Let's now run the simulation using this parameter value and compare with the initial simulation.", - "id": "26f4a8819c02136f" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "parameter_names = [param.name for param in calibration_params]\n", - "parameter_dict1 = {param_name: result.x for i, param_name in enumerate(parameter_names)}\n", - "\n", - "result_optim = simu_PYTH.simulate(\n", - " property_dict=parameter_dict1,\n", - " simulation_options=simulation_options_PYTH\n", - ")" - ], - "id": "48dc0d58db1cfc5a", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import plotly.graph_objects as go\n", - "\n", - "fig = go.Figure()\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=simulation_df_resample[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='green',\n", - " name=\"T_insulation - Measurement\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=init_res_PYTH[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='orange',\n", - " name=\"T_insulation - Initial results\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=result_optim[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='brown',\n", - " name=\"T_insulation - Optimization results\"\n", - "))\n", - "\n", - "\n", - "fig.update_layout(\n", - " title='Optimization vs. Measurement ',\n", - " xaxis_title='Date',\n", - " yaxis_title='Temperature [K]')\n", - "\n", - "fig.show()" - ], - "id": "2ae801d71552398c", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Results are closer to measurements but still far off.", - "id": "77aa000a449d0e" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "## 4.3. Multi- objectives and parameters optimization", - "id": "679cbfeaa19bb4c4" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "Let's use Pymoo, integrated into the `Problem` class of `CorrAI` and multi-parameters optimization with multi_objectives.\n", - "\n", - "Note that :\n", - "- All **objectives are minimized**.\n", - "- All **constraints** must be provided in the form **g(x) ≤ 0** (inequalities).\n", - "\n", - "`Problem` aggregates all evaluator outputs into a single dictionary and then **extracts**:\n", - "- **Objectives** in the order of `objective_ids` → matrix **F**\n", - "- **Constraints** in the order of `constraint_ids` → matrix **G** (≤ 0)\n", - "\n", - "`Problem` takes as arguments:\n", - "- **`parameters`**: as defined earlier from the class `Parameter`\n", - "- **`evaluators`** (or objective functions)\n", - "- **`objective_ids`** : Names (keys) to extract from evaluator outputs to build **F** (in order).\n", - "- **`constraint_ids`** : Optional names to extract for **G** (inequalities ≤ 0). If omitted, `G` is empty.\n" - ], - "id": "cdb92f3584436e59" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "### 4.3.1. Let's try with few parameters", - "id": "1b52c09b40f97821" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "frac_p = 0.20\n", - "\n", - "calibration_params = [\n", - "\n", - " Parameter(\n", - " name='R_ins',\n", - " interval=(1-frac_p, 1+frac_p),\n", - " init_value=1 / (lambda_ins / ep_ins) / 2 / S_wall,\n", - " relabs=\"Relative\",\n", - " ptype=\"Real\",\n", - " model_property='R_ins',\n", - " ),\n", - " Parameter(\n", - " name='alpha',\n", - " interval=(0.1, 0.9),\n", - " ptype=\"Real\",\n", - " model_property='alpha',\n", - " ),\n", - " Parameter(\n", - " name='epsilon',\n", - " interval=(0.1, 0.9),\n", - " ptype=\"Real\",\n", - " model_property='epsilon',\n", - " ),\n", - "]" - ], - "id": "1a18680c67860070", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Here, we can add the interferace temperature (between insulation and concrete panels) as an indicator.", - "id": "382639dbbdf0b7f5" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from corrai.base.metrics import cv_rmse\n", - "\n", - "obj_func = ObjectiveFunction(\n", - " model=OpaqueWallSimple(),\n", - " simulation_options=simulation_options_PYTH,\n", - " parameters=calibration_params,\n", - " indicators_config={\n", - " \"T_ins\": (cv_rmse, feat_train[\"T_ins\"]),\n", - " \"T_interface\": (cv_rmse, feat_train[\"T_interface\"]),\n", - " },\n", - " scipy_obj_indicator=\"T_ins\",\n", - ")" - ], - "id": "49e611448d80337b", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Now let's instanciate the problem using the classe `Problem`:", - "id": "ef97cc58e43c40d6" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from corrai.multi_optimize import Problem\n", - "\n", - "problem = Problem(\n", - " parameters=calibration_params,\n", - " evaluators=[obj_func],\n", - " objective_ids=[\"T_ins\", \"T_interface\"],\n", - ")" - ], - "id": "93799f08a89ee390", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "For a two objective problem, we choose here **NSGA2**, as a well-known multi-objective optimization algorithm based on non-dominated sorting and crowding.\n", - "List of algorithms here https://pymoo.org/algorithms/list.html#nb-algorithms-list.\n", - "\n", - "If the verbose=True, some printouts during the algorithm’s execution are provided. This can very from algorithm to algorithm. Here, we execute NSGA2 on a problem where pymoo has no knowledge about the optimum. Each line represents one iteration. The first two columns are the current generation counter and the number of evaluations so far. For constrained problems, the next two columns show the minimum constraint violation (cv (min)) and the average constraint violation (cv (avg)) in the current population. This is followed by the number of non-dominated solutions (n_nds) and two more metrics which represents the movement in the objective space." - ], - "id": "40c58ef20557e69a" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "\n", - "from pymoo.optimize import minimize\n", - "from pymoo.termination import get_termination\n", - "from pymoo.algorithms.moo.nsga2 import NSGA2\n", - "from pymoo.operators.crossover.sbx import SBX\n", - "from pymoo.operators.mutation.pm import PM" - ], - "id": "5c560de6fcd540c7", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "algorithm = NSGA2(\n", - " pop_size=50,\n", - " #n_offsprings=10,\n", - " #sampling=FloatRandomSampling(),\n", - " crossover=SBX(prob=0.9, eta=15),\n", - " mutation=PM(eta=20),\n", - " eliminate_duplicates=True\n", - ")" - ], - "id": "36c4a5e29c3580da", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Here, we will run 15 generation with a population of 50.", - "id": "14d5c56706972fb0" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "termination = get_termination(\"n_gen\", 15)\n", - "\n", - "res = minimize(problem,\n", - " algorithm,\n", - " termination,\n", - " seed=42,\n", - " verbose=True)\n", - "\n", - "print(\"Best solution found: \\nX = %s\\nF = %s\" % (res.X, res.F))" - ], - "id": "e7742f7b075e7bc4", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "Let's visualize the objectives functions results.", - "id": "7023f1282a7b0398" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from pymoo.visualization.scatter import Scatter\n", - "Scatter().add(res.F).show()" - ], - "id": "5c3a4c4acc8b32f2", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "For a bi-objective problem,and helping us chosing the best set of parameters value, we can use the decomposition method called Augmented Scalarization Function (ASF), a well-known metric in the multi-objective optimization literature.\n", - "Let us assume the are equally important by setting the weights to 0.5 and 0.5 and setting these" - ], - "id": "9d9ff70f4e4e5873" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from pymoo.decomposition.asf import ASF\n", - "F = res.F\n", - "approx_ideal = F.min(axis=0)\n", - "approx_nadir = F.max(axis=0)\n", - "nF = (F - approx_ideal) / (approx_nadir - approx_ideal)\n", - "\n", - "fl = nF.min(axis=0)\n", - "fu = nF.max(axis=0)\n", - "weights = np.array([0.5, 0.5])\n", - "decomp = ASF()\n", - "\n", - "i = decomp.do(nF, 1/weights).argmin()\n", - "\n", - "parameter_names = [param.name for param in calibration_params]\n", - "parameter_dict = {param_name: res.X[i][j] for j, param_name in enumerate(parameter_names)}\n", - "\n", - "print(\n", - " \"Best regarding ASF: Point \\ni = %s\\nF = %s\" % (i, F[i]),\n", - " parameter_dict\n", - ")\n" - ], - "id": "3310ee8c9851ee39", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "The estimated parameters seem consistent with our expectations. We can compare the profile of measured indoor temperature with the output that the model predicts given the identified optimal parameters. ", - "id": "430753101a36d277" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "result_optim = simu_PYTH.simulate(\n", - " property_dict=parameter_dict,\n", - " simulation_options=simulation_options_PYTH,\n", - ")" - ], - "id": "6af5abd476a2d1ce", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import plotly.graph_objects as go\n", - "\n", - "fig = go.Figure()\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=simulation_df_resample[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='green',\n", - " name=\"T_insulation - Measurement\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=init_res_PYTH[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='orange',\n", - " name=\"T_insulation - Initial results\"\n", - "))\n", - "\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=result_optim[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='brown',\n", - " name=\"T_insulation - Optimization results\"\n", - "))\n", - "\n", - "\n", - "fig.update_layout(\n", - " title='Optimization vs. Measurement ',\n", - " xaxis_title='Date',\n", - " yaxis_title='Temperature [K]')\n", - "\n", - "fig.show()" - ], - "id": "9a396bd186cd8dce", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import plotly.graph_objects as go\n", - "\n", - "fig = go.Figure()\n", - "\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=simulation_df_resample[\"T_interface\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='green',\n", - " name=\"T_interface - Measurement\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=init_res_PYTH[\"T_interface\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='orange',\n", - " name=\"T_interface - Initial results\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=result_optim[\"T_interface\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='brown',\n", - " name=\"T_interface - Optimization results\"\n", - "))\n", - "\n", - "fig.update_layout(\n", - " title='Optimization vs. Measurement ',\n", - " xaxis_title='Date',\n", - " yaxis_title='Temperature [K]')\n", - "\n", - "fig.show()" - ], - "id": "3b13a58a1375404", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "### 4.3.2. All relevant parameters", - "id": "77bd39626a63b358" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "We can try to identify all five parameters at once, with larger intervals.", - "id": "bda927cd380f88d6" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "frac_p = 0.50\n", - "frac_conv = 0.50\n", - "\n", - "calibration_params = [\n", - " Parameter(\n", - " name='R_concrete',\n", - " interval=(1-frac_p, 1+frac_p),\n", - " ptype=\"Real\",\n", - " init_value=1 / (lambda_concrete / ep_concrete) / 2 / S_wall,\n", - " relabs=\"Relative\",\n", - " model_property='R_concrete',\n", - " ),\n", - " Parameter(\n", - " name='R_ins',\n", - " interval=(1-frac_p, 1+frac_p),\n", - " init_value=1 / (lambda_ins / ep_ins) / 2 / S_wall,\n", - " relabs=\"Relative\",\n", - " ptype=\"Real\",\n", - " model_property='R_ins',\n", - " ),\n", - " Parameter(\n", - " name='alpha',\n", - " interval=(0.1, 0.9),\n", - " ptype=\"Real\",\n", - " model_property='alpha',\n", - " ),\n", - " Parameter(\n", - " name='epsilon',\n", - " interval=(0.1, 0.9),\n", - " ptype=\"Real\",\n", - " model_property='epsilon',\n", - " ),\n", - " Parameter(\n", - " name='R_ext',\n", - " init_value= 0.04/S_wall,\n", - " interval=(0.02/S_wall, 0.06/S_wall),\n", - " ptype=\"Real\",\n", - " model_property='R_ext',\n", - " ),\n", - "]\n", - "\n", - "obj_func = ObjectiveFunction(\n", - " model=OpaqueWallSimple(),\n", - " simulation_options=simulation_options_PYTH,\n", - " parameters=calibration_params,\n", - " indicators_config={\n", - " \"T_ins\": (cv_rmse, feat_train[\"T_ins\"]),\n", - " \"T_interface\": (cv_rmse, feat_train[\"T_interface\"]),\n", - " },\n", - " scipy_obj_indicator=\"T_ins\",\n", - ")\n", - "\n", - "problem = Problem(\n", - " parameters=calibration_params,\n", - " evaluators=[obj_func],\n", - " objective_ids=[\"T_ins\", \"T_interface\"],\n", - ")\n", - "\n", - "algorithm = NSGA2(\n", - " pop_size=100,\n", - " crossover=SBX(prob=0.9, eta=15),\n", - " mutation=PM(eta=20),\n", - " eliminate_duplicates=True\n", - ")\n", - "\n", - "termination = get_termination(\"n_gen\", 10)\n", - "\n", - "res = minimize(problem,\n", - " algorithm,\n", - " termination,\n", - " seed=42,\n", - " verbose=True)\n", - "\n", - "print(\"Best solution found: \\nX = %s\\nF = %s\" % (res.X, res.F))" - ], - "id": "a3905edeb57f7d7", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from pymoo.visualization.scatter import Scatter\n", - "Scatter().add(res.F).show()" - ], - "id": "58a5a1ba59b54b0c", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from pymoo.decomposition.asf import ASF\n", - "F = res.F\n", - "approx_ideal = F.min(axis=0)\n", - "approx_nadir = F.max(axis=0)\n", - "nF = (F - approx_ideal) / (approx_nadir - approx_ideal)\n", - "\n", - "fl = nF.min(axis=0)\n", - "fu = nF.max(axis=0)\n", - "weights = np.array([0.9, 0.1])\n", - "decomp = ASF()\n", - "\n", - "i = decomp.do(nF, 1/weights).argmin()\n", - "\n", - "parameter_names = [param.name for param in calibration_params]\n", - "parameter_dict2 = {param_name: res.X[i][j] for j, param_name in enumerate(parameter_names)}\n", - "\n", - "print(\"Best regarding ASF: Point \\ni = %s\\nF = %s\" % (i, F[i]),\n", - " parameter_dict2\n", - " )" - ], - "id": "d22d1be82c4dd73a", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "result_optim2 = simu_PYTH.simulate(\n", - " property_dict=parameter_dict2,\n", - " simulation_options=simulation_options_PYTH,\n", - ")" - ], - "id": "6dd3323e92a6a6e", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import plotly.graph_objects as go\n", - "\n", - "fig = go.Figure()\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=simulation_df_resample[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='green',\n", - " name=\"T_insulation - Measurement\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=init_res_PYTH[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='orange',\n", - " name=\"T_insulation - Initial results\"\n", - "))\n", - "\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=result_optim[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='brown',\n", - " name=\"T_insulation - Optimization results 1\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=init_res_PYTH.index,\n", - " y=result_optim2[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='pink',\n", - " name=\"T_insulation - Optimization results 2\"\n", - "))\n", - "\n", - "fig.update_layout(\n", - " title='Optimization vs. Measurement ',\n", - " xaxis_title='Date',\n", - " yaxis_title='Temperature [K]')\n", - "\n", - "fig.show()" - ], - "id": "9184271f2f7df10d", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "#### Validation set\n", - "An important step is to check identified parameters on validation set. Let's try on an new period using the last identified values." - ], - "id": "4b9e79b1b72c11ab" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "validation_set = reference_df.loc[\"2024-09-08 00:00\":\"2024-09-14 00:00\"]\n", - "validation_set.loc[:,\"time_sec\"] = datetime_to_seconds(validation_set.index)\n", - "\n", - "validation_set = validation_set.resample('5min').mean()\n", - "second_index = datetime_to_seconds(validation_set.index)\n", - "\n", - "new_simulation_options_PYTH={\n", - " \"dataframe\":validation_set,\n", - " \"startTime\": second_index[0],\n", - " \"endTime\": second_index[-1], \n", - "}" - ], - "id": "2d5b8b9c95df81cc", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "validation_results = simu_PYTH.simulate(\n", - " parameter_dict2,\n", - " new_simulation_options_PYTH\n", - ")" - ], - "id": "9aa4f7a4cc5c7ddc", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import plotly.graph_objects as go\n", - "\n", - "fig = go.Figure()\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=validation_results.index,\n", - " y=validation_results[\"T_ins\"] ,\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='orange',\n", - " name=\"T_insulation - Validation result\"\n", - "))\n", - "\n", - "fig.add_trace(go.Scatter(\n", - " x=validation_results.index,\n", - " y=validation_set[\"T_ins\"],\n", - " fill=None,\n", - " mode='lines',\n", - " line_color='green',\n", - " name=\"T_insulation - Measurement\"\n", - "))\n", - "\n", - "fig.update_layout(\n", - " title='Simulation vs. Measurement ',\n", - " xaxis_title='Date',\n", - " yaxis_title='Temperature [K]')\n", - "\n", - "fig.show()" - ], - "id": "cad5e270e1e6ab01", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "cv_rmse(\n", - " validation_results[\"T_ins\"],\n", - " validation_set[\"T_ins\"]\n", - ")" - ], - "id": "c894241ee9e19da6", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": "## 4.3.3 Mixed parameters", - "id": "84673e7340839047" - }, - { - "metadata": {}, - "cell_type": "markdown", - "source": [ - "In some calibrations, some decision variables are discrete rather than purely continuous—e.g., on/off features (booleans), choices among a few valid modes, or integers for counts. Standard continuous optimizers used earlier assume a smooth search space and therefore won’t handle these variables correctly.\n", - "\n", - "To reflect realistic design decisions and configuration toggles, we now introduce a model with a boolean switch (e.g., enabling/disabling a physical effect) and restrict alpha to a small set of admissible values. This mixed-variable setup better captures practical constraints and uncertainty (e.g., unknown presence of a phenomenon, or controller setpoint options), and requires a mixed-variable optimization strategy." - ], - "id": "f85f87661de5c8c3" - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "from corrai.base.model import Model\n", - "\n", - "class OpaqueWallBool(Model):\n", - " def simulate(\n", - " self,\n", - " property_dict: dict,\n", - " simulation_options: dict,\n", - " simulation_kwargs: dict | None = None,\n", - " **kwargs,\n", - " ) -> pd.DataFrame:\n", - "\n", - " default_parameters = {\n", - " \"R_ext\": 0.005,\n", - " \"R_int\": 0.01,\n", - " \"R_concrete\": 0.10,\n", - " \"R_ins\": 0.32,\n", - " \"C_concrete\": 2.95e6,\n", - " \"C_ins\": 3.64e4,\n", - " \"alpha\": 0.2,\n", - " \"S_wall\": 7,\n", - " \"epsilon\": 0.4,\n", - " \"fview\": 0.5,\n", - " \"has_LW_radiation\": True\n", - " }\n", - " parameters = {**default_parameters, **property_dict}\n", - "\n", - " R_ext = parameters[\"R_ext\"]\n", - " R_int = parameters[\"R_int\"]\n", - " R_concrete = parameters[\"R_concrete\"]\n", - " R_ins = parameters[\"R_ins\"]\n", - " C_concrete = parameters[\"C_concrete\"]\n", - " C_ins = parameters[\"C_ins\"]\n", - " alpha = parameters[\"alpha\"]\n", - " S_wall = parameters[\"S_wall\"]\n", - " epsilon = parameters[\"epsilon\"]\n", - " fview = parameters[\"fview\"]\n", - " has_LW_radiation = parameters[\"has_LW_radiation\"]\n", - "\n", - " sigma = 5.67e-8 # W/m^2/K^4\n", - "\n", - " df = simulation_options[\"dataframe\"]\n", - " time = df[\"time_sec\"].values\n", - " T_ext = df[\"T_ext\"].values\n", - " T_int = df[\"T_int\"].values\n", - " Q_rad = df[\"Pyr\"].values\n", - "\n", - " startTime = simulation_options.get(\"startTime\", time[0])\n", - " stopTime = simulation_options.get(\"stopTime\", time[-1])\n", - "\n", - " mask = (time >= startTime) & (time <= stopTime)\n", - " time = time[mask]\n", - " T_ext = T_ext[mask]\n", - " T_int = T_int[mask]\n", - " Q_rad = Q_rad[mask]\n", - "\n", - " # init\n", - " T_se = np.zeros(len(time))\n", - " T_concrete = np.zeros(len(time))\n", - " T_ins = np.zeros(len(time))\n", - " T_interface = np.zeros(len(time))\n", - " T_si = np.zeros(len(time))\n", - " T_sky = np.zeros(len(time))\n", - "\n", - " T_se[0] = T_ext[0]\n", - " T_concrete[0] = 299\n", - " T_ins[0] = T_int[0]\n", - " T_interface[0] = (T_ins[0] + T_concrete[0]) / 2\n", - " T_si[0] = T_int[0]\n", - " T_sky[0] = T_int[0]\n", - "\n", - " for t in range(1, len(time)):\n", - " dt = time[t] - time[t - 1]\n", - "\n", - "\n", - " ## boolean to turn off long wave radiation exchange with environnement and sky\n", - " T_sky[t] = 0.0552 * (T_ext[t] ** 1.5) if has_LW_radiation else 0.0\n", - "\n", - " Q_rad_sky = (\n", - " epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_sky[t] ** 4) * S_wall\n", - " if has_LW_radiation else 0.0\n", - " )\n", - " Q_rad_amb = (\n", - " epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_ext[t - 1] ** 4) * S_wall\n", - " if has_LW_radiation else 0.0\n", - " )\n", - " Q_rad_dir = Q_rad[t - 1] * alpha * S_wall\n", - "\n", - " T_se[t] = (\n", - " T_ext[t - 1] / R_ext\n", - " + T_ins[t - 1] / (R_ins / 2)\n", - " + Q_rad_dir - Q_rad_sky - Q_rad_amb\n", - " ) / (1 / R_ext + 1 / (R_ins / 2))\n", - "\n", - " T_interface[t] = (\n", - " T_ins[t - 1] / (R_ins / 2) + T_concrete[t - 1] / (R_concrete / 2)\n", - " ) / (1 / (R_concrete / 2) + 1 / (R_ins / 2))\n", - "\n", - " T_si[t] = (\n", - " T_int[t - 1] / R_int + T_concrete[t - 1] / (R_concrete / 2)\n", - " ) / (1 / R_int + 1 / (R_concrete / 2))\n", - "\n", - " T_ins[t] = T_ins[t - 1] + dt / C_ins * (\n", - " (T_se[t] - T_ins[t - 1]) / (R_ins / 2)\n", - " + (T_interface[t] - T_ins[t - 1]) / (R_ins / 2)\n", - " )\n", - "\n", - " T_concrete[t] = T_concrete[t - 1] + dt / C_concrete * (\n", - " (T_interface[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", - " + (T_si[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", - " )\n", - "\n", - " # output\n", - " df_out = pd.DataFrame(\n", - " {\n", - " \"T_concrete\": T_concrete,\n", - " \"T_interface\": T_interface,\n", - " \"T_ins\": T_ins,\n", - " },\n", - " index=df.index[mask],\n", - " )\n", - " self.simulation_options = simulation_options\n", - " return df_out" - ], - "id": "77844027c51e6f9b", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "mixed_params = [\n", - " Parameter(name=\"alpha\", values=(0.2, 0.4, 0.5), ptype=\"Choice\", model_property=\"alpha\"),\n", - " Parameter(name=\"epsilon\", interval=(0, 1), ptype=\"Real\", model_property=\"epsilon\"),\n", - " Parameter(name=\"has_LW_radiation\", ptype=\"Binary\", model_property=\"has_LW_radiation\"),\n", - "]" - ], - "id": "f38515d84c5b71cd", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from corrai.multi_optimize import Problem\n", - "\n", - "obj_func = ObjectiveFunction(\n", - " model=OpaqueWallBool(),\n", - " simulation_options=simulation_options_PYTH,\n", - " parameters=mixed_params,\n", - " indicators_config={\n", - " \"T_ins\": (cv_rmse, feat_train[\"T_ins\"]),\n", - " },\n", - " scipy_obj_indicator=\"T_ins\",\n", - ")\n", - "\n", - "problem = Problem(\n", - " parameters=mixed_params,\n", - " evaluators=[obj_func],\n", - " objective_ids=[\"T_ins\"],\n", - ")" - ], - "id": "b2801bfaec82fcd6", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": [ - "from pymoo.termination import get_termination\n", - "from pymoo.algorithms.moo.nsga2 import RankAndCrowdingSurvival\n", - "from pymoo.core.mixed import MixedVariableGA\n", - "from pymoo.optimize import minimize\n", - "\n", - "algorithm = MixedVariableGA(pop_size=50, survival=RankAndCrowdingSurvival())\n", - "termination = get_termination(\"n_gen\", 5)\n", - "\n", - "res = minimize(problem,\n", - " algorithm,\n", - " termination,\n", - " seed=42,\n", - " verbose=True)\n", - "\n", - "print(\"Best solution found: \\nX = %s\\nF = %s\" % (res.X, res.F))" - ], - "id": "6c69db5bf099ef9", - "outputs": [], - "execution_count": null - }, - { - "metadata": {}, - "cell_type": "code", - "source": "", - "id": "c1003131b39bbf24", - "outputs": [], - "execution_count": null - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/tutorials/Sensitivity analysis_python model of an opaque wall.ipynb b/tutorials/Sensitivity analysis_python model of an opaque wall.ipynb new file mode 100644 index 0000000..aea3e30 --- /dev/null +++ b/tutorials/Sensitivity analysis_python model of an opaque wall.ipynb @@ -0,0 +1,1234 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "fe658bc78054a398", + "metadata": {}, + "source": [ + "This tutorial can be found and ran in the GITHUB libray corrAI: https://github.com/BuildingEnergySimulationTools/corrai" + ] + }, + { + "cell_type": "markdown", + "id": "d62323ea2b70844", + "metadata": {}, + "source": [ + "---\n", + "# Characterisation method tutorial\n", + "\n", + "The aim of this tutorial is to provide tools for sensitivity analyses of models using **CorrAI** modules. It contains several parts: \n", + "\n", + "1. **Model definition**: presentation of the test case used here, and definition of a mathematical model describing it.\n", + " \n", + "2. **Measurement import**: Loading of measurements performed on a test benchto serve as reference data for comparison with the model.\n", + "\n", + "3. **Performing sensitivity analysis**: Sensitivity analysis conducted to identify materials properties which have an influence on the discrepancy between model outputs and measured phenomenon\n", + "---\n" + ] + }, + { + "cell_type": "markdown", + "id": "4ed710c97edd95e1", + "metadata": {}, + "source": [ + "# 1. Model definition" + ] + }, + { + "cell_type": "markdown", + "id": "49fdc09f8bf34966", + "metadata": {}, + "source": [ + "## Use case presentation\n", + "\n", + "A \"real-scale\" test bench is used. The **O3BET** (or in this example BEF test bench in Anglet, France) offers experimental conditions to evaluate building façade solutions. Heat exchanges in a cell are restricted on five of its faces, while the sixth face is dedicated to the tested solution. Internal temperature and humidity conditions can be controlled or monitored. External conditions, including temperatures and solar radiation, are measured.\n", + "\n", + "The tested technology here is a green façade, coupled wiht insulated panels. The experimental setup is presented in the following picture: one cell is equiped with the technology (right one) and another serves as a reference (insulation only).\n", + "\n", + "
\n", + " \n", + "
\n", + " Figure : Pictures of reference wall and vegatal wall installation \n", + "
\n", + "\n", + "Sensors (heatflux density meters, thermocouples, RTD) are positioned in several parts: in the middle of insulation panels, between insulation and concrete layers, between leaves, in substrates, indoor. Climatic conditions (external temperature, incident solar radiation) are also monitored.\n", + "\n", + "\n", + "
\n", + "
\n", + " Figure : Sensors installation scheme\n", + "
\n", + "\n", + "- Measure campaign spans from april 2024 to october 2024\n", + "- Acquisition timestep is 60 secondes minimum." + ] + }, + { + "cell_type": "markdown", + "id": "f19fd128-6cc0-4fcc-bcfa-1dcd34e4aaa5", + "metadata": {}, + "source": [ + "## Description of the proposed model " + ] + }, + { + "cell_type": "markdown", + "id": "3e4ee318-e7ed-4acd-bea2-eabf9c532a0f", + "metadata": {}, + "source": [ + "The following framework is proposed to identify the **REFERENCE** wall thermal conductivity.\n", + "\n", + "For this example we propose a resistance/capacity approach. Based on electrical circuit analogy, each layer of the wall is modeled by two resistances and a capacity:\n", + "\n", + "\n", + "| Figure : RC model|\n", + "| :---: |\n", + "| | \n", + "\n", + "\n", + "Note that it is recommended to start with a **very simple** model (e.g., one resistance, one capacity with only Text and Tint as boundary conditions) and gradually make it more complex as you identify parameters. For the sake of this tutorial, the proposed model is already a bit detailed.\n", + "\n", + "\n", + "The following is a brief description of the thermal model:\n", + "\n", + "- Each wall layer is modeled by 2 thermal resistances and a capacity.\n", + " - Resistances to create a gradiant and better resolution of distribution of heat flow : $ R_1 = R_2 = \\frac{ep_{layer}}{lambda_{layer} \\times 2} $ \n", + " - Capacity in the middle of both our layers, representing its thermal mass and ability to store heat. : $ C = ep_{layer} \\times rho_{layer} \\times cap_{layer} $\n", + " \n", + "- Inside and outside convection/conduction transfers are model as a constant value thermal resistance.\n", + "\n", + "- Infrared transfers are considered :\n", + " - With the sky, with $ T_{sky} = 0.0552T_{ext}^{1.5} $ as the sky is a significant source of infrared radiation, especially at night. This radiation can have a considerable impact on the thermal behavior of the system, influencing both heating and cooling processes\n", + " - With the surrounding considered to be at $ T_{ext} $ as surroundings or environment also emit infrared radiation\n", + "\n", + "- Short wave solar radiation heat flux is computed $Sw_{gain} = Pyr \\times \\alpha_{coat} $ with $Pyr$ the measured solar radiation onthe wall (W/m²) and $\\alpha_{coat}$ the coating solar absorbtion coefficient.\n", + "\n", + "- Temperatures $ T_{ext}$ and $T_{int} $ are boundary conditions. $ T_{int}$ represents the temperature within the controlled environment of the system.\n", + "\n", + "Here are somes theoretical parameters for the model:" + ] + }, + { + "cell_type": "code", + "id": "536c528c698d62a", + "metadata": {}, + "source": [ + "# Surface of the tested wall\n", + "S_wall = 7\n", + "\n", + "# Thickness of layer\n", + "ep_concrete = 0.200 #m\n", + "ep_ins = 0.140 #m\n", + "\n", + "# Conductivity of layer\n", + "lambda_concrete = 0.13 # (W/mK)\n", + "lambda_ins = 0.031 # W/(mK)\n", + "\n", + "# Density of layer\n", + "rho_concrete = 2400 # kg/m3 \n", + "rho_ins = 32.5 # kg/m3\"\n", + "\n", + "\n", + "sc_concrete = 880 # J/kg.K\n", + "sc_ins = 1000 # J/kg.K\n", + "\n", + "# solar paremetesr\n", + "alpha = 0.2 # absorption coefficient\n", + "epsilon = 0.8 # emissivity \n", + "fview = 0.5 # view factor of tested wall" + ], + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Defining a Simulator in corrAI\n", + "\n", + "The following `OpaqueWallSimple` class implements a simplified wall heat transfer model, written purely in Python, and compatible with corrAI. It follows the `Model` interface, which guarantees that it can be used in calibration, sensitivity analysis, and optimization workflows and ensures a common interface across different model types: Pure Python models, FMU models (via `ModelicaFmuModel`), Modelica models (via [modelitool](https://github.com/BuildingEnergySimulationTools/modelitool))...\n", + "\n", + "A valid simulator must implement the following concepts:\n", + "\n", + "- **`simulate(property_dict, simulation_options, simulation_kwargs)`**\n", + " Main entry point. Runs the simulation and returns results as a `pandas.DataFrame` indexed by time.\n", + "\n", + "- **`property_dict`**\n", + " Dictionary of model parameters `{property_name: value}`.\n", + " These overwrite default values to perform calibration, parameter sweeps, or optimization.\n", + "Here, the wall is described by default physical properties, which can be overridden at runtime :\n", + "\n", + "| Parameter | Description | Unit |\n", + "|----------------|--------------------------------------------|--------|\n", + "| **R_ext** | External thermal resistance | K/W |\n", + "| **R_int** | Internal thermal resistance | K/W |\n", + "| **R_concrete** | Concrete thermal resistance | K/W |\n", + "| **R_ins** | Insulation thermal resistance | K/W |\n", + "| **C_concrete** | Heat capacity of concrete | J/K |\n", + "| **C_ins** | Heat capacity of insulation | J/K |\n", + "| **alpha** | Solar absorption coefficient | - |\n", + "| **S_wall** | Wall surface area | m² |\n", + "| **epsilon** | Emissivity | - |\n", + "| **fview** | View factor to the sky/ambient | - |\n", + "\n", + "\n", + "- **`simulation_options`**\n", + " Dictionary of simulation environment settings (time span, timestep, input data).\n", + " Defines how the model will be executed, and what external conditions are applied. Here, the model requires a dataframe containing environmental inputs:\n", + "\n", + "| Input | Description | Unit |\n", + "|:------------:|:------------------------------------:|:-----:|\n", + "| **time_sec** | Simulation time | s |\n", + "| **T_ext** | Outdoor air temperature | °C |\n", + "| **T_int** | Indoor air temperature | °C |\n", + "| **Pyr** | Solar radiation from pyranometer | W/m² |\n", + "\n", + "\n", + "\n", + "By always structuring models with these inputs and outputs, any simulator (FMU, Modelica, or Python) can be plugged into the corrAI workflow without rewriting downstream code.\n", + "\n", + "The simulation returns a `pandas.DataFrame` with T_concrete, T_interface, T_ins." + ], + "id": "bbfcb2748f2bb20f" + }, + { + "cell_type": "code", + "id": "1f574968-5e84-4ab2-abd2-d93270bf0aea", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "from corrai.base.model import Model\n", + "\n", + "class OpaqueWallSimple(Model):\n", + " def simulate(\n", + " self,\n", + " property_dict: dict,\n", + " simulation_options: dict,\n", + " simulation_kwargs: dict | None = None,\n", + " ) -> pd.DataFrame:\n", + "\n", + " default_parameters = {\n", + " \"R_ext\": 0.005,\n", + " \"R_int\": 0.01,\n", + " \"R_concrete\": 0.10,\n", + " \"R_ins\": 0.32,\n", + " \"C_concrete\": 2.95e6,\n", + " \"C_ins\": 3.64e4,\n", + " \"alpha\": 0.2,\n", + " \"S_wall\": 7,\n", + " \"epsilon\": 0.4,\n", + " \"fview\": 0.5,\n", + " }\n", + " parameters = {**default_parameters, **property_dict}\n", + "\n", + " R_ext = parameters[\"R_ext\"]\n", + " R_int = parameters[\"R_int\"]\n", + " R_concrete = parameters[\"R_concrete\"]\n", + " R_ins = parameters[\"R_ins\"]\n", + " C_concrete = parameters[\"C_concrete\"]\n", + " C_ins = parameters[\"C_ins\"]\n", + " alpha = parameters[\"alpha\"]\n", + " S_wall = parameters[\"S_wall\"]\n", + " epsilon = parameters[\"epsilon\"]\n", + " fview = parameters[\"fview\"]\n", + "\n", + " sigma = 5.67e-8 # W/m^2/K^4\n", + "\n", + " df = simulation_options[\"dataframe\"]\n", + " time = df[\"time_sec\"].values\n", + " T_ext = df[\"T_ext\"].values\n", + " T_int = df[\"T_int\"].values\n", + " Q_rad = df[\"Pyr\"].values\n", + "\n", + " startTime = simulation_options.get(\"startTime\", time[0])\n", + " stopTime = simulation_options.get(\"stopTime\", time[-1])\n", + "\n", + " mask = (time >= startTime) & (time <= stopTime)\n", + " time = time[mask]\n", + " T_ext = T_ext[mask]\n", + " T_int = T_int[mask]\n", + " Q_rad = Q_rad[mask]\n", + "\n", + " # init\n", + " T_se = np.zeros(len(time))\n", + " T_concrete = np.zeros(len(time))\n", + " T_ins = np.zeros(len(time))\n", + " T_interface = np.zeros(len(time))\n", + " T_si = np.zeros(len(time))\n", + " T_sky = np.zeros(len(time))\n", + "\n", + " T_se[0] = T_ext[0]\n", + " T_concrete[0] = 299 - 273.15\n", + " T_ins[0] = T_int[0]\n", + " T_interface[0] = (T_ins[0] + T_concrete[0]) / 2\n", + " T_si[0] = T_int[0]\n", + " T_sky[0] = T_int[0]\n", + "\n", + " for t in range(1, len(time)):\n", + " dt = time[t] - time[t - 1]\n", + "\n", + " T_sky[t] = 0.0552 * (T_ext[t] ** 1.5)\n", + "\n", + " Q_rad_sky = epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_sky[t] ** 4) * S_wall\n", + " Q_rad_amb = epsilon * fview * sigma * (T_se[t - 1] ** 4 - T_ext[t - 1] ** 4) * S_wall\n", + " Q_rad_dir = Q_rad[t - 1] * alpha * S_wall\n", + "\n", + " T_se[t] = (\n", + " T_ext[t - 1] / R_ext\n", + " + T_ins[t - 1] / (R_ins / 2)\n", + " + Q_rad_dir - Q_rad_sky - Q_rad_amb\n", + " ) / (1 / R_ext + 1 / (R_ins / 2))\n", + "\n", + " T_interface[t] = (\n", + " T_ins[t - 1] / (R_ins / 2) + T_concrete[t - 1] / (R_concrete / 2)\n", + " ) / (1 / (R_concrete / 2) + 1 / (R_ins / 2))\n", + "\n", + " T_si[t] = (\n", + " T_int[t - 1] / R_int + T_concrete[t - 1] / (R_concrete / 2)\n", + " ) / (1 / R_int + 1 / (R_concrete / 2))\n", + "\n", + " T_ins[t] = T_ins[t - 1] + dt / C_ins * (\n", + " (T_se[t] - T_ins[t - 1]) / (R_ins / 2)\n", + " + (T_interface[t] - T_ins[t - 1]) / (R_ins / 2)\n", + " )\n", + "\n", + " T_concrete[t] = T_concrete[t - 1] + dt / C_concrete * (\n", + " (T_interface[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", + " + (T_si[t] - T_concrete[t - 1]) / (R_concrete / 2)\n", + " )\n", + "\n", + " # output\n", + " df_out = pd.DataFrame(\n", + " {\n", + " \"T_concrete\": T_concrete,\n", + " \"T_interface\": T_interface,\n", + " \"T_ins\": T_ins,\n", + " },\n", + " index=df.index[mask],\n", + " )\n", + " self.simulation_options = simulation_options\n", + " return df_out" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "21e9de06-7c25-475a-b48d-f598c3ade9ce", + "metadata": {}, + "source": [ + "We can try to run. Let's create dummy data:\n", + "- T_ext: a daily sinusoidal variation around 20°C (e.g., min 15°C at night, max 30°C during the day).\n", + "- T_int: more stable indoor temperature (e.g. around 22°C ±1).\n", + "- Pyr: global solar radiation in the form of a bell curve (0 at night, max around midday at ~800 W/m²)." + ] + }, + { + "cell_type": "code", + "id": "ce77813f073f7e13", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "def create_typical_input_df(day_seconds: int = 86400, step: int = 300) -> pd.DataFrame:\n", + " time = np.arange(0, day_seconds + step, step)\n", + " T_ext = 22.5 + 7.5 * np.sin(2 * np.pi * (time / day_seconds - 0.25))\n", + " T_int = 22 + 0.5 * np.sin(2 * np.pi * time / day_seconds)\n", + " Pyr = 800 * np.maximum(0, np.sin(np.pi * time / day_seconds))\n", + "\n", + " df = pd.DataFrame(\n", + " {\n", + " \"T_ext\": T_ext,\n", + " \"T_int\": T_int,\n", + " \"Pyr\": Pyr,\n", + " \"Pyr\": Pyr,\n", + " },\n", + " index=pd.Index(time, name=\"time_sec\")\n", + " )\n", + " df[\"time_sec\"]=df.index\n", + " return df\n", + "df_typical = create_typical_input_df()\n", + "df_typical.head()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "5ab5f4fc-889b-4f29-9428-39d94e3c6d44", + "metadata": {}, + "source": [ + "Now, let's define: \n", + "- **simulation options**, with start-time, end-time, and a dataframe for boundary conditions\n", + "- **a dictionary**, containing values for our different parameters" + ] + }, + { + "cell_type": "code", + "id": "71b2a7cf44a5deb4", + "metadata": {}, + "source": [ + "sim_opt ={\n", + " \"dataframe\":df_typical,\n", + " \"startTime\": df_typical.index[0],\n", + " \"endTime\": df_typical.index[-1], \n", + "}" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "58ae79a228160f5", + "metadata": {}, + "source": [ + "param_dict = {\n", + " \"R_ext\": 0.04/S_wall, \n", + " \"R_int\": 0.13/S_wall, \n", + " \"R_concrete\": 1 / (lambda_concrete / ep_concrete) / 2 / S_wall, \n", + " \"R_ins\": 1 / (lambda_ins / ep_ins) / 2 / S_wall, \n", + " \"C_ins\": rho_ins*ep_ins*S_wall*sc_ins, \n", + " \"C_concrete\": rho_concrete*ep_concrete*S_wall*sc_concrete, \n", + " \"alpha\": alpha, \n", + " \"S_wall\": S_wall, \n", + " \"epsilon\": epsilon,\n", + " 'fview': fview\n", + "}" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "a2a5619d-1a3a-4d2a-8e5c-0dbfe6f7fa80", + "metadata": {}, + "source": [ + "model = OpaqueWallSimple()\n", + "model.simulate(\n", + " property_dict = param_dict, \n", + " simulation_options=sim_opt\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "cad4938a77f4a783", + "metadata": {}, + "source": [ + "# 2. Measurement import\n", + "Let's now load the reference cell measurement data on python that will be used as boundary conditions. Note that although the data loaded here should have be cleaned beforhand, it is always important to check it before using it for analyses." + ] + }, + { + "cell_type": "markdown", + "id": "5d96fa61-978c-45a5-be16-8fd7145fa0cc", + "metadata": {}, + "source": [ + "## Loading the dataframe" + ] + }, + { + "cell_type": "code", + "id": "59473d207b322155", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "import numpy as np\n", + "import os\n", + "import plotly.express as px\n", + "from pathlib import Path\n", + "pd.options.mode.chained_assignment = None # default='warn'\n", + "\n", + "TUTORIAL_DIR = Path(os.getcwd()).as_posix()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "6d01322236314e2b", + "metadata": {}, + "source": [ + "reference_df = pd.read_csv(\n", + " Path(TUTORIAL_DIR) / \"resources/tuto_data_SA.csv\",\n", + " index_col=0,\n", + " sep=\";\",\n", + " decimal=\".\",\n", + " parse_dates=True\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "8a81862b0c17b1b4", + "metadata": {}, + "source": [ + "reference_df.head()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "019ca4ac-1752-46e1-93c4-a66ad3c8cbbc", + "metadata": {}, + "source": [ + "## Check missing or inconsistent data" + ] + }, + { + "cell_type": "code", + "id": "5a3e4bb69c962f76", + "metadata": {}, + "source": [ + "reference_df.isna().any().any()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "3da1cce8d7320cb5", + "metadata": {}, + "source": [ + "There seems to be no missing data. Let's first plot all data to check if nothing stands out. Although data was cleaned (supposedly, some measurement errors and irregularities might have been missed in the process." + ] + }, + { + "cell_type": "code", + "id": "bed89b439bf7437", + "metadata": {}, + "source": [ + "px.line(reference_df)" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "c579c990290e894c", + "metadata": {}, + "source": [ + "Here we can see that one of the temperature sensors installed in the insulation panels stopped measuring on september 7th. Only T_ins_2 will be used as the insulation temperature." + ] + }, + { + "cell_type": "code", + "id": "b84fb1e0db1d3855", + "metadata": {}, + "source": [ + "reference_df.loc[:,\"T_ins\"] = reference_df['T_ins_2'] # middle of insulation temperature\n", + "reference_df.loc[:,\"T_int\"] =reference_df[['Tint_1', 'Tint_2']].mean(axis=1) # indoor temperature\n", + "reference_df.loc[:,\"T_interface\"] = reference_df[['T_interface_1', 'T_interface_2']].mean(axis=1) # interface temperature, between insulation and concrete panels" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "185a08dd-5af9-4f16-b596-611e11e6a773", + "metadata": {}, + "source": [ + "## Run the model with real boundary conditions" + ] + }, + { + "cell_type": "markdown", + "id": "6b4ea7c50fac69a0", + "metadata": {}, + "source": [ + "Let's rerun the model with real measurements, now. The following simulation period contans a short period with both sunny and cloudy days." + ] + }, + { + "cell_type": "code", + "id": "342da2f0046b008a", + "metadata": {}, + "source": [ + "simulation_df = reference_df.loc[\"2024-09-04 00:00\":\"2024-09-09 00:00\"]" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "e3db40a5-0c49-48ef-9002-f83eb164c4d9", + "metadata": {}, + "source": [ + "To create a \"time_sec\" column used by the model, we can use the following function:" + ] + }, + { + "cell_type": "code", + "id": "2f8d425878b94320", + "metadata": {}, + "source": [ + "import datetime as dt\n", + "\n", + "def datetime_to_seconds(index_datetime):\n", + " time_start = dt.datetime(index_datetime[0].year, 1, 1, tzinfo=dt.timezone.utc)\n", + " new_index = index_datetime.to_frame().diff().squeeze()\n", + " new_index.iloc[0] = dt.timedelta(\n", + " seconds=index_datetime[0].timestamp() - time_start.timestamp()\n", + " )\n", + " sec_dt = [elmt.total_seconds() for elmt in new_index]\n", + " return list(pd.Series(sec_dt).cumsum())" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "c1c3f01e-ec04-4c58-80b9-87732551dbe5", + "metadata": {}, + "source": [ + "simulation_df[\"time_sec\"] = datetime_to_seconds(simulation_df.index)\n", + "\n", + "sim_opt ={\n", + " \"dataframe\":simulation_df,\n", + " \"startTime\":simulation_df[\"time_sec\"].iloc[0],\n", + " \"endTime\": simulation_df[\"time_sec\"].iloc[-1], \n", + "}" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "a3d96c7e-f6d6-429c-88fe-b0816620d342", + "metadata": {}, + "source": [ + "model = OpaqueWallSimple()\n", + "init_res = model.simulate(\n", + " property_dict = param_dict, \n", + " simulation_options=sim_opt\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "8e2d5abe27a2cf9b", + "metadata": {}, + "source": "Let's compare results of simulation to measurement:" + }, + { + "cell_type": "code", + "id": "4d225f8a2dcdd2b6", + "metadata": {}, + "source": [ + "init_res.index = init_res.index.tz_localize(None)\n", + "init_res = init_res.rename(columns={\n", + " \"T_concrete\": \"T_concrete_PYTHON\",\n", + " \"T_interface\": \"T_interface_PYTHON\",\n", + " \"T_ins\": \"T_insulation_PYTHON\",\n", + "})\n", + "measure_comp = pd.concat([\n", + " simulation_df[[\"T_interface\", \"T_ins\"]], \n", + " init_res[[\"T_interface_PYTHON\", \"T_insulation_PYTHON\" ]]], axis = 1)\n", + "\n", + "color_map = {\n", + " \"T_interface\": \"darkblue\", \n", + " \"T_interface_PYTHON\": \"blue\", \n", + " \"T_ins\": \"darkgreen\", \n", + " \"T_insulation_PYTHON\": \"green\" \n", + "}\n", + "\n", + "import plotly.graph_objects as go\n", + "from plotly.subplots import make_subplots\n", + "\n", + "fig = make_subplots(rows=1, cols=2, subplot_titles=(\"Interface layer\", \"Insulation layer\"))\n", + "\n", + "color_map = {\n", + " \"T_interface\": \"darkblue\", \n", + " \"T_interface_PYTHON\": \"blue\", \n", + " \"T_ins\": \"darkgreen\", \n", + " \"T_insulation_PYTHON\": \"green\" \n", + "}\n", + "for col in [\"T_interface\", \"T_interface_PYTHON\"]:\n", + " fig.add_trace(\n", + " go.Scatter(x=measure_comp.index, y=measure_comp[col], mode=\"lines\",\n", + " name=col, line=dict(color=color_map[col])),\n", + " row=1, col=1\n", + " )\n", + "for col in [\"T_ins\", \"T_insulation_PYTHON\"]:\n", + " fig.add_trace(\n", + " go.Scatter(x=measure_comp.index, y=measure_comp[col], mode=\"lines\",\n", + " name=col, line=dict(color=color_map[col])),\n", + " row=1, col=2\n", + " )\n", + "fig.update_layout(\n", + " title=\"Comparaison Python vs Measurement\",\n", + " width=1000,\n", + " height=500,\n", + " legend=dict(orientation=\"h\", yanchor=\"bottom\", y=-0.2, xanchor=\"center\", x=0.5)\n", + ")\n", + "\n", + "fig.show()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "ec5e69a99a60a6dd", + "metadata": {}, + "source": [ + "Not good. A sensitivity analysis should be performed to rank the parameters by order of influence on the error between measured temperature and model prediction.\n" + ] + }, + { + "cell_type": "markdown", + "id": "ccdfb5cca982ee18", + "metadata": {}, + "source": [ + "# 3. Sensitivity analysis\n", + "It is very important to know how our defined parameters have an influence on the model prediction. Therefore, we use a sensitivity analysis to \"rank\" the parameter by order of influence\n", + "on the model error. " + ] + }, + { + "cell_type": "markdown", + "id": "e46bcafc7717aa5a", + "metadata": {}, + "source": [ + "## Error function\n", + "The chosen error function is the CV_RMSE. The formula for CV_RMSE is given by:\n", + "\n", + "$$\n", + "CV\\_RMSE = \\frac{RMSE}{\\bar{y}}\n", + "$$\n", + "\n", + "Where:\n", + "- *RMSE* is the root mean squared error,\n", + "- *bar{y}* is the mean of the observed values.\n", + "\n", + "The RMSE is calculated as:\n", + "\n", + "$$\n", + "RMSE = \\sqrt{\\frac{1}{n} \\sum_{i=1}^{n} (y_i - \\hat{y}_i)^2}\n", + "$$\n", + "\n", + "Where:\n", + "- *n* is the number of observations,\n", + "- *y_i* is the observed value for the \\( i \\)-th observation,\n", + "- *hat{y}_i* is the predicted value for the \\( i \\)-th observation.\n", + "\n", + "The CV_RMSE measures the variation of the RMSE relative to the mean of the observed values. It provides a standardized measure of the error, which can be useful for comparing the performance of different models across different datasets.\n", + "Here, we can chose the error function as the CV_RMSE between measured temperature(s) and model prediction." + ] + }, + { + "cell_type": "markdown", + "id": "da0b289ff76e21de", + "metadata": {}, + "source": [ + "## Tested parameters\n", + "\n", + "The chosen parameters are the following model parameters : R_concrete, R_ins, C_ins, C_concrete, alpha, epsilon, R_ext, R_int.\n", + "\n", + "\n", + "Each decision variable of the optimization problem must be described using a `Parameter` object.\n", + "A parameter specifies:\n", + "* `name` — The variable name (string, must be unique within the problem).\n", + "* `ptype` — Variable type, one of:\n", + " * `\"Real\" `— Continuous real number\n", + " * `\"Integer\"` — Discrete integer\n", + " * `\"Binary\"` — Boolean, domain {False, True} (set automatically if no domain is given)\n", + " *` \"Choice\"` — Categorical variable with a fixed set of discrete options\n", + "* `Domain definition` — Choose exactly one of:\n", + " * ` interval=(lo, hi) `— Lower and upper bounds (for \"Real\" and \"Integer\", optional for \"Binary\" if you want (0,1))\n", + " * ` values=(v1, v2, …)` — Explicit list/tuple of allowed values (for \"Choice\", and optionally for \"Integer\", \"Real\", or \"Binary\")\n", + "*` Optional fields`:\n", + " * `init_value` — Initial value (or tuple/list of initial values for batch runs); must be within the defined domain\n", + " * `relabs` — `\"Absolute\"` or `\"Relative\"` (or a boolean flag, depending on usage in your model)\n", + " * `model_property` — String or tuple specifying the corresponding property in the simulation/model\n", + " * `min_max_interval` — Optional extra bounds for use in analysis/validation" + ] + }, + { + "cell_type": "code", + "id": "8f8e2bab5b874374", + "metadata": {}, + "source": "from corrai.base.parameter import Parameter", + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "64dfc055e9c59627", + "metadata": {}, + "source": [ + "frac_p = 0.40 # 20%\n", + "frac_conv = 0.60 # 40%\n", + "\n", + "params = [\n", + " Parameter(\n", + " name='R_concrete',\n", + " interval=(1-frac_p, 1+frac_p),\n", + " ptype=\"Real\",\n", + " init_value=1 / (lambda_concrete / ep_concrete) / 2 / S_wall,\n", + " relabs=\"Relative\",\n", + " model_property='R_concrete',\n", + " ),\n", + " Parameter(\n", + " name='R_ins',\n", + " interval=(1-frac_p, 1+frac_p),\n", + " init_value=1 / (lambda_ins / ep_ins) / 2 / S_wall,\n", + " relabs=\"Relative\",\n", + " ptype=\"Real\",\n", + " model_property='R_ins',\n", + " ),\n", + " Parameter(\n", + " name='C_ins',\n", + " interval=(1-frac_p, 1+frac_p),\n", + " init_value=rho_ins*ep_ins*S_wall*sc_ins,\n", + " relabs=\"Relative\",\n", + " ptype=\"Real\",\n", + " model_property='C_ins',\n", + " ),\n", + " Parameter(\n", + " name='C_concrete',\n", + " interval=(1-frac_p, 1+frac_p),\n", + " init_value=rho_concrete*ep_concrete*S_wall*sc_concrete,\n", + " relabs=\"Relative\",\n", + " ptype=\"Real\",\n", + " model_property='C_concrete',\n", + " ),\n", + " Parameter(\n", + " name='alpha',\n", + " interval=(0.1, 0.6),\n", + " ptype=\"Real\",\n", + " model_property='alpha',\n", + " ),\n", + " Parameter(\n", + " name='epsilon',\n", + " interval=(0.2, 0.9),\n", + " ptype=\"Real\",\n", + " model_property='epsilon',\n", + " ),\n", + " Parameter(\n", + " name='R_ext',\n", + " init_value= 0.04/S_wall,\n", + " interval=(1-frac_conv, 1+frac_conv),\n", + " ptype=\"Real\",\n", + " relabs=\"Relative\",\n", + " model_property='R_ext',\n", + " ),\n", + " Parameter(\n", + " name='R_int',\n", + " init_value= 0.13/S_wall,\n", + " interval=(1-frac_conv, 1+frac_conv),\n", + " ptype=\"Real\",\n", + " relabs=\"Relative\",\n", + " model_property='R_int',\n", + " ),\n", + "]" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "3c70d720457ec5a9", + "metadata": {}, + "source": [ + "## Sensitivity analysis methods\n", + "We can now load from `corrai.sensitivity` module a sensitivity method.\n", + "\n", + "*Note: for now, only SOBOL, FAST, RBD_FAST, \n", + "and MORRIS methods are implemented.*" + ] + }, + { + "cell_type": "markdown", + "id": "204bb3136c2d6d5b", + "metadata": {}, + "source": [ + "### MORRIS method" + ] + }, + { + "cell_type": "markdown", + "id": "bd08f44deb200457", + "metadata": {}, + "source": [ + "We should start off with a screening using MORRIS method." + ] + }, + { + "cell_type": "markdown", + "id": "71c9b5fa64936e4a", + "metadata": {}, + "source": [ + "For MORRIS method, two indices, µj* for the mean of the absolute values of these effects and σj for the standard deviation of these effects, are calculated as follows:\n", + "\n", + "$$\n", + "mu_{j}^{*} = \\frac{1}{r} \\sum_{i=1}^{r} E_{ij}\n", + "$$\n", + "\n", + "$$\n", + "sigma_{j} = \\sqrt{\\frac{1}{r-1} \\sum_{i=1}^{r} (E_{ij} - \\mu_{j}^{*})^2}\n", + "$$" + ] + }, + { + "cell_type": "markdown", + "id": "d057bfd1-76ee-414a-acc4-c518c907a4be", + "metadata": {}, + "source": [ + "Let's resample the data to 5 minute-steps. " + ] + }, + { + "cell_type": "code", + "id": "ccab3e5e-49cb-4eb4-b461-8eb51e0046e2", + "metadata": {}, + "source": [ + "simulation_df_resample = simulation_df.resample(\"5min\").mean()\n", + "simulation_df_resample[\"time_sec\"] = datetime_to_seconds(simulation_df_resample.index)\n", + "\n", + "sim_opt ={\n", + " \"dataframe\":simulation_df_resample,\n", + " \"startTime\":simulation_df_resample[\"time_sec\"].iloc[0],\n", + " \"endTime\": simulation_df_resample[\"time_sec\"].iloc[-1], \n", + "}" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "a9ef9dae1e1b2296", + "metadata": {}, + "source": [ + "from corrai.sensitivity import MorrisSanalysis\n", + "\n", + "sa_study = MorrisSanalysis(\n", + " parameters=params,\n", + " model=model,\n", + " simulation_options=sim_opt,\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "8ed249dc6495c87e", + "metadata": {}, + "source": [ + "sa_study.add_sample(N=2**6, n_cpu=-1)" + ], + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "First, we can plot all simulations in one graph and compare the simulated internal temperature to measured T_int. Argument show_legends can be set to True if you want see associated parameters values.", + "id": "d00ae88ab5626060" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "sa_study.sampler.plot_sample(\n", + " indicator=\"T_ins\",\n", + " reference_timeseries = simulation_df_resample[\"T_ins\"],\n", + " x_label=\"time\",\n", + " y_label=\"simulated temperature of T_ins [°C]\",\n", + " show_legends=False\n", + ")" + ], + "id": "9947412501c04aff", + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "d068e11a7572ee9c", + "metadata": {}, + "source": "First, we can first take a look on the parallel coordinate plot of all parameter values and one of the simulation outputs aggregated according to a chosen aggregation method:This graph can be very instructive as at some moments, simulations are far from measurements. It show that whatever the values of our parameters, it still does not fit reality: this is either due to a problem of measurement, or to our modeling approach (physical inconsistency, physical phenomenon not properly taken into account, etc.).\n" + }, + { + "cell_type": "markdown", + "id": "b9a30a39-9ae7-4bf0-92b2-ca1e5ef68cd7", + "metadata": {}, + "source": [ + "Let's perform a **Morris sensitivity analysis** on the following indicator: cv_rmse on `T_ins`. \n", + "The `analyze()` method computes Sobol indices based on a chosen performance metric—in this case, the **cv_rmse**—by comparing model predictions of `T_ins` against the provided reference time series (measured or baseline data)." + ] + }, + { + "cell_type": "code", + "id": "3cb2bc12ba38b9af", + "metadata": {}, + "source": [ + "res_analysis = sa_study.analyze(\n", + " indicator=\"T_ins\",\n", + " method=\"cv_rmse\",\n", + " reference_time_series=simulation_df_resample[\"T_ins\"],\n", + ")\n", + "res_analysis[\"cv_rmse_T_ins\"]" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "ccd6dc275855f2be", + "metadata": {}, + "source": [ + "The highter $mu_{j}$, the more the parameter $j$ contributes to an uncertain output, and the higher $sigma_{j}$, the more pronounced the interaction effects between the model parameters are. Plotting $sigma_{j}$ against $mu_{j}^{}$ is often used to distinguish factors with negligible, linear, and/or interaction effects." + ] + }, + { + "cell_type": "code", + "id": "345e88c76ddfeebd", + "metadata": {}, + "source": [ + "sa_study.plot_scatter(\n", + " indicator=\"T_ins\",\n", + " method=\"cv_rmse\",\n", + " reference_time_series=simulation_df_resample[\"T_ins\"],\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "6a4ceb844b59ab20", + "metadata": {}, + "source": [ + "Besides this visual interpretation, it is also possible to calculate the Euclidean distance $d$ to the origin to obtain the total effect of the uncertain parameters:" + ] + }, + { + "cell_type": "code", + "id": "9a387a5d3c8c5cc1", + "metadata": {}, + "source": [ + "sa_study.plot_bar(\n", + " indicator=\"T_ins\",\n", + " method=\"cv_rmse\",\n", + " reference_time_series=simulation_df_resample[\"T_ins\"],\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "76b78d1d9e64eca", + "metadata": {}, + "source": [ + "Five parameters seem to have more impact on the error than other: $alpha$, $R_{ext}$, $R_{ins}$, and $R_{concrete}$. \n", + "Let's check if these results are consistant with SOBOL method." + ] + }, + { + "cell_type": "markdown", + "id": "f8e6cd080e0732af", + "metadata": {}, + "source": [ + "### SOBOL method" + ] + }, + { + "cell_type": "code", + "id": "cba431b6fc562b4e", + "metadata": {}, + "source": [ + "from corrai.sensitivity import SobolSanalysis" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "2c00547d792def6c", + "metadata": {}, + "source": [ + "sa_study = SobolSanalysis(\n", + " parameters=params,\n", + " model=model,\n", + " simulation_options=sim_opt,\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "4fe53d7e43bada10", + "metadata": {}, + "source": [ + "The method add sample draw a sample of parameters to be simulated. Each method has its sampling method. Please see SALib documentation for further explanation (https://salib.readthedocs.io/en/latest/index.html)\n", + "\n", + "Note that:\n", + "- Convergence properties of the Sobol' sequence is only valid if\n", + " `N` (100) is equal to `2^n`.\n", + " N (int) – The number of samples to generate. Ideally a power of 2 and <= skip_values.\n", + "- Convergence properties of the Fast' method is only valid if sample size N > 4M^2 (M=4 by default)" + ] + }, + { + "cell_type": "code", + "id": "23e24bd8271d28e5", + "metadata": {}, + "source": "sa_study.add_sample(N=2**7, n_cpu=-1)", + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "990ac0bb9acddf35", + "metadata": {}, + "source": "As for MORRIS, we can perform a **Sobol sensitivity analysis** on the following indicator: cv_rmse on `T_ins` ." + }, + { + "cell_type": "code", + "id": "f5c2a855e8324d1d", + "metadata": {}, + "source": [ + "res_analysis =sa_study.analyze(\n", + " indicator=\"T_ins\",\n", + " method=\"cv_rmse\",\n", + " reference_time_series=simulation_df_resample[\"T_ins\"],\n", + ")\n", + "print(\"Sum of S1 indices is\", res_analysis[\"cv_rmse_T_ins\"][\"S1\"].sum())" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "3305230bc9a51f3d", + "metadata": {}, + "source": [ + "The `plot_bar()` method displays the Sobol sensitivity indices as a bar chart, allowing a quick comparison of parameter importance.\n", + "It uses the aggregated results from the `analyze()` step (e.g., based on `cv_rmse`) and shows first-order indices for each parameter, making it easy to identify which parameters most influence the chosen performance metric.\n", + "\n", + "The sum of all the indices should be close to 1. Also, the mean confidence interval should be very low. In that case, results of the sensitivity analysis can be considered as robust." + ] + }, + { + "cell_type": "code", + "id": "e53ad52d3a6e9896", + "metadata": {}, + "source": [ + "sa_study.plot_bar(\n", + " indicator=\"T_ins\",\n", + " method=\"cv_rmse\",\n", + " reference_time_series=simulation_df_resample[\"T_ins\"],\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "fd866b363245ec8", + "metadata": {}, + "source": [ + "The parameter alpha appears to have the most influence on the model errors. This impact is calculated on the overall error, but will depend on the time of the day. Let's observe the parameters' impact dynamically with a 15minutes frequency.\n", + "\n", + "The `plot_dynamic_metric()` method extends the analysis to a frequency view of the selected performance metric.\n", + "Here, we apply it to `T_ins` using the `cv_rmse` metric, comparing model output against the `reference_time_series` at regular intervals.\n", + "\n", + "Key arguments:\n", + "- **`freq`**: Controls the temporal resolution for aggregation (e.g., `freq=\"2h\"` computes sensitivity indices every two hours). This enables tracking how parameter influence changes over time.\n", + "- **`method`**: The metric used for comparison (here, `cv_rmse` between model predictions and the reference).\n", + "- **`reference_time_series`**: The baseline data against which each simulation is evaluated.\n", + "\n", + "This approach is particularly useful in dynamic systems, where parameter importance may vary significantly across different time periods.\n", + "\n", + "*Calculation can take time according to the number of simulations and frequency of data.*\n" + ] + }, + { + "cell_type": "code", + "id": "a31e489e8d7ee834", + "metadata": {}, + "source": [ + "sa_study.plot_dynamic_metric(\"T_ins\", freq=\"2h\", method=\"cv_rmse\", reference_time_series=simulation_df_resample[\"T_ins\"] )" + ], + "outputs": [], + "execution_count": null + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": "Finally, we can also take a look at the parallel coordinate plot of all parameter values and one of the simulation outputs aggregated according to a chosen aggregation method. This is a first step to foresee correct values of parameters to fit the measurement.", + "id": "84e6f96f7d3746b9" + }, + { + "metadata": {}, + "cell_type": "code", + "source": [ + "sa_study.plot_pcp(\n", + " indicator=\"T_ins\" ,\n", + " method=\"cv_rmse\",\n", + " reference_time_series=simulation_df_resample[\"T_ins\"],\n", + ")" + ], + "id": "9ea5a609f200cdbf", + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "4c84cb82b435f019", + "metadata": {}, + "source": [ + "# Conclusion on sensitivity analysis\n", + "\n", + "The sensitivity analysis allowed us to rank the influence of uncertain parameters\n", + "on an indicator. \n", + "\n", + "Results with Sobol are consistant with Morris. $alpha$ is the most influencial parameters, followed by $R_{ext}$.\n", + "\n", + "In the Identification tutorial, modules of corrai are used to identify the\n", + "optimal values for these parameters in order to fit the measurement. See : https://github.com/BuildingEnergySimulationTools/corrai/tutorials/Sensitivityanalysis for more details." + ] + }, + { + "cell_type": "code", + "id": "568bc28f10756902", + "metadata": {}, + "source": [], + "outputs": [], + "execution_count": null + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/tutorials/images/sensors.png b/tutorials/images/sensors.png index 578f2f6..b0cd2b2 100644 Binary files a/tutorials/images/sensors.png and b/tutorials/images/sensors.png differ