Dev
This documentation is for individuals who want either to contribute to Heatmapper or to deploy it to a server.
For reference, you should look at Running Client-Side PyShiny, as those instructions are also applicable to hosting Heatmapper on a server.
The setup.sh script located in the root of the repository can set up a complete environment for running Heatmapper, including creating a virtual environment, installing dependencies, cloning Heatmapper, and resolving LFS files. It’s a bash script, so deployment on a Windows server will need to be done manually. In the repository’s file list, find setup.sh. Clicking on it opens GitHub’s file viewer, which has a download button in the top-right corner. Or, if you only have access to a terminal, you can fetch the script with curl:
curl -O https://raw.githubusercontent.com/WishartLab/heatmapper2/main/setup.sh
From there, make it executable:
chmod +x setup.sh
Then, place it into the directory you want Heatmapper to live in and run it. setup.sh will create two directories:
- The Python virtual environment in venv
- The Heatmapper source code in heatmapper2
Once the script is finished, or you’ve manually handled dependencies and installation, you’ll next need to activate the virtual environment (assuming you’re using a venv and not just installing dependencies to the system). From the folder containing the venv folder, run:
source venv/bin/activate
You can deactivate the virtual environment at any time by typing:
deactivate
Now, enter the heatmapper2 directory. For batch deployment, there are two scripts which automate the process:
- deploy.sh will deploy each application on the host, starting at port 8000 for Expression, and ending with 8006 for Spatial. Each application will run in a separate process, so the script (and user session) can be closed without tearing down the applications themselves.
- teardown.sh will send a KILL signal to all applications listening on ports 8000-8006. If you’re only selectively hosting Heatmapper’s applications, this might kill unrelated applications if they’re listening on one of those ports.
However, if you want to be more selective about which applications are run, you have two primary options:
- Running it as a PyShiny application. To do this, navigate into the project to run, such as expression, and enter its src directory. From there, execute: shiny run --host 0.0.0.0. The host argument is important, as it makes the application listen on all network interfaces. If you want to enable reloading, so that changes to the src folder are picked up by the application without needing to stop it, add --reload. To specify a port, use the --port argument.
- Running it as a static WebAssembly application. This mode will instead serve a connecting client the WebAssembly files, which are then run on their computer. The project folder, such as expression, has two sub-folders, src and site. From the project folder, simply run python3 -m http.server --directory site --bind localhost 8008, where the value 8008 specifies the port. Note that --bind localhost only accepts connections from the same machine; bind to 0.0.0.0 instead if other hosts need to reach the site.
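Before running teardown.sh on a host that only runs some of the applications, it can help to see which of the ports 8000-8006 actually have something listening on them. The following is a minimal sketch using only Python’s standard library; the helper names PortInUse and ListeningPorts are illustrative and are not part of Heatmapper.

```python
from socket import AF_INET, SOCK_STREAM, socket


def PortInUse(port, host="127.0.0.1", timeout=0.5):
	"""
	@brief Check whether something accepts TCP connections on a port.
	@param port The TCP port to probe.
	@param host The address to probe; defaults to the local machine.
	@param timeout Seconds to wait before giving up.
	@returns True if a connection could be opened, False otherwise.
	"""
	with socket(AF_INET, SOCK_STREAM) as probe:
		probe.settimeout(timeout)
		# connect_ex returns 0 on success instead of raising an exception.
		return probe.connect_ex((host, port)) == 0


def ListeningPorts(first=8000, last=8006):
	"""
	@brief List the ports in an inclusive range that accept connections.
	@param first The first port to probe.
	@param last The last port to probe.
	@returns A list of port numbers that are in use.
	"""
	return [port for port in range(first, last + 1) if PortInUse(port)]
```

Calling ListeningPorts() returns the ports in the range that are in use. It only reports that something is listening, not whether that process is a Heatmapper application.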
This section outlines some general guidance on working within the Heatmapper repository.
For the sake of consistency, Python code should:
- Always use from imports, rather than importing the entire module: Do from shiny import App, not import shiny
- Use double quotes rather than single quotes for strings
- Use tabs, rather than spaces
- For naming convention:
  - Local variables should use snake_case
  - Global variables, functions, classes, and Shiny IDs should use PascalCase
- Prefer code that is more concise. If a function only has a single line, put it in the function definition, such as async def Reset(): await DataCache.Purge(input)
- Strive to consistently document the code-base. All non-trivial functions should have doc strings, which should follow Doxygen format.
- Use shared.py definitions over creating something custom. If functionality is missing, add it to the shared.py implementation.
- Always use the Cache object for handling input.
- Always use the Filter function to determine column names.
- Always use the NavBar function to create a navigation bar shared across all applications.
- shared.py should always be a symlink within the src folder. Do not copy it.
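As a rough illustration of these conventions, the sketch below applies them to a small standard-library example. The names ExampleSuffixes, IsExample, and CountExamples are made up for this purpose and do not exist in Heatmapper.

```python
from pathlib import Path

# Global variables use PascalCase, and strings use double quotes.
ExampleSuffixes = {".csv", ".tsv"}


# A single-line function lives on the same line as its definition.
def IsExample(path): return Path(path).suffix.lower() in ExampleSuffixes


def CountExamples(paths):
	"""
	@brief Count how many paths look like example tables.
	@param paths An iterable of path strings.
	@returns The number of paths whose suffix is in ExampleSuffixes.
	"""
	# Local variables use snake_case, and indentation uses tabs.
	example_count = sum(1 for path in paths if IsExample(path))
	return example_count
```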
When creating a new application, there are a few things to note:
- You should create a DataCache variable from the shared.Cache class, which will handle all your user input. This should be in the server function.
- If you need to extend the Cache, such as adding more file types, create a function that you can pass to the Cache call.
  - Treat it like a switch statement. You will be passed a single argument, path. Compare against the suffix to see if it matches your custom file type. If it doesn’t, return DataCache.DefaultHandler(path). Do not modify the Default Handler, as doing so bogs down all the applications.
- FileSelection should be used to generate the UI for uploading/selecting input. Importantly:
  - It will create the Shiny input IDs SourceFile for whether the user is selecting Upload/Example, File for the user-uploaded file, and Example for the selected example. Additionally, it will create the ExampleInfoButton and ExampleInfo IDs. ID conflicts cause Shiny to fail.
  - You will need to manually set ExampleInfo. The easiest way is to make a reactive function that looks at a dictionary defined in the server: def ExampleInfo(): return Info[input.Example()]
  - The multiple argument should be used with caution. It requires you to manually handle parsing input. See Spatial for an implementation.
- The MainTab function supports adding additional tabs via the *args argument. See Spatial or Expression for implementations. It will create the IDs Heatmap, which should be your main page heatmap, and Table, which you shouldn’t need to touch, as it handles creating all the associated values; the tab set itself has an ID of MainTab. You may need to add the Reset ID so that your reactive functions update when the user updates the table.
- You will need to manually filter columns. This involves calling Filter in a reactive function with the following arguments:
  - The input, usually (await DataCache.Load(input)).columns
  - The type of column to look for; see shared.py for values.
  - A UI element to update, such as NameColumn
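The switch-statement style handler described above might look something like the following sketch. It is self-contained, so DefaultHandler here is a stand-in for DataCache.DefaultHandler with assumed behaviour (returning the file’s text), and the .pairs suffix is a hypothetical custom file type.

```python
from pathlib import Path


def DefaultHandler(path):
	"""
	@brief Stand-in for DataCache.DefaultHandler (assumed behaviour).
	@param path The file to read.
	@returns The file contents as text.
	"""
	return Path(path).read_text()


def HandleInput(path):
	"""
	@brief Handle a custom file type, deferring everything else.
	@param path The file to read.
	@returns Parsed rows for .pairs files, else the default handler's result.
	"""
	suffix = Path(path).suffix.lower()
	if suffix == ".pairs":
		# Hypothetical format: whitespace-separated fields, one row per line.
		with open(path) as file: return [line.split() for line in file if line.strip()]
	# Anything unrecognised falls through to the default handler.
	return DefaultHandler(path)
```

In an application, the final line would return DataCache.DefaultHandler(path), and the function would be handed to the Cache when it is created.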
When changes are made within the code-base, they are not reflected in the WebAssembly site, which can cause incongruity when pushed to GitHub. Run the rebase.sh script at the root of the repository to bring the WebAssembly site up to date across all applications.
Heatmapper is designed to be easily deployed for different purposes, and to this effect most of the interface can be modified without having to modify the code itself (technically you do modify code, but only so that the configuration is bundled into the WebAssembly build).
Each project contains a config.py file, a Python file which provides defaults and overrides for every configurable option in that program. However, the base config.py is within Heatmapper’s version control system, which means that modifying it can cause conflicts when attempting to update. For that reason, you should copy config.py, creating a file named user.py. Heatmapper will first check if user.py exists and use that for configuration, only falling back to config.py if the former doesn’t exist. Do not modify config.py. Consider the configuration provided in Pairwise:
# Distance/Correlation
"MatrixType": Config(selected="Distance", visible=True),
This variable is attached to the associated input.MatrixType, which defines whether the user wants to select a Distance Matrix or a Correlation Matrix. Let’s break it down:
- MatrixType is the input name, and cannot be modified as it’s explicitly used within the main program. You cannot add new configurations (every user input that can be modified is already present in the file).
- Config is from shared.py, and is simply a class that wraps configuration. Every configuration is a Config object.
- selected is the only required argument of any configuration. This specifies what Heatmapper should assign as the default value when loading the application. A comment above each Config outlines what your values can be. Some configurations use value instead, simply because some inputs “select” a value, such as the titular ui.input_select, whereas others simply have a value, such as ui.input_checkbox. The configuration already provides the correct keyword, so this has no impact on actually configuring the application so long as the original configuration keyword isn’t deleted.
- visible is an optional argument that defaults to True. When visible is True, the associated user input in the sidebar will be visible when loading the application, and the user can make modifications to the value. When visible is False, the input will be hidden from the sidebar, and the user will be unable to change the selected value. This is useful where an application has no need for the option to be available (such as only needing to display Distance Matrices), and helps declutter the sidebar and prevent user confusion.
- Finally, something that is not shown in any of the default configurations is that the Config class takes any keyword argument and stores it, applying it directly to the Shiny input object. Therefore, if we wanted to make sure that MatrixType’s radio buttons are not inline, we could modify the configuration to "MatrixType": Config(selected="Distance", inline=False). You may notice that Heatmapper already defines inline=True within Pairwise’s code, but Config objects check for these conflicts and prefer the value from the configuration. You can therefore override all of the parameters of the input, save the input type itself. Refer to Shiny’s excellent documentation if you want to make any such changes, and note that some modifications may cause issues with the application (e.g. specifying multiple=True where Heatmapper does not expect multiple inputs).
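To make the override behaviour concrete, here is a minimal sketch of how a class like Config could let configured keywords win over the ones written in the application. This is a stand-in written for illustration, not the class from shared.py; FakeRadioButtons is a hypothetical replacement for a Shiny input function, and returning None for hidden inputs is an assumption.

```python
class Config:
	"""Stand-in for shared.Config; the real class may differ."""

	def __init__(self, visible=True, **kwargs):
		self.visible = visible
		self.kwargs = kwargs

	def UI(self, factory, **kwargs):
		"""
		@brief Build an input, letting configured keywords win conflicts.
		@param factory The function that creates the input.
		@param kwargs The keywords written in the application code.
		@returns Whatever factory returns, or None when the input is hidden.
		"""
		# Assumption: a hidden input is simply not built in this sketch.
		if not self.visible: return None
		# Later keys win, so the configuration overrides the in-code values.
		return factory(**{**kwargs, **self.kwargs})


# Hypothetical stand-in for a Shiny input function; it echoes its keywords.
def FakeRadioButtons(**kwargs): return kwargs


MatrixType = Config(selected="Distance", inline=False)
Built = MatrixType.UI(FakeRadioButtons, id="MatrixType", inline=True)
```

Here Built carries inline=False even though the application code asked for inline=True, which mirrors the conflict resolution described above.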
Heatmapper has some configurations that do not have an associated value. There are two such types, both of which warrant additional explanation:
- Configurations that are only there for visibility. Consider: "DownloadTable": Config(). This is an input that doesn’t expose any “values”; it’s simply a button. These configurations exist to toggle visibility of features through the visible keyword.
- Configurations that are dynamic inputs. Examples include "Keys": Config() in Spatial, and "KeyColumn": Config() in Geomap. These inputs are dynamically updated by Heatmapper because input files often have different column names for the same values, such as some files using NAME, others using KEY, etc. While these configurations support both the selected= and visible= keywords, the behavior differs in important ways:
  - When visible=True, the selected value is used as the default, so long as it exists in the data. If you define selected="NAME", Heatmapper will default to that column (case-sensitive; remember, the user can still change this value when the input is visible) until a file is provided where the column does not exist. When that happens, Heatmapper will use its Filtering mechanism and automatically choose a more appropriate column name.
  - When visible=False, the selected value is constant and unchanging. Even if the column doesn’t exist in the input data, Heatmapper will use it. This means that you need to be very careful with what you select for a default value, and what input you provide to the application: if the column name doesn’t exist, Heatmapper will not rectify the incongruity and will simply fail to render.
Column filtering is an important facet of Heatmapper’s design, so it’s recommended not to touch the dynamic inputs, and especially not to disable their visibility, as that ties the application to hard-coded values that are antithetical to its design. However, if your use-case requires very specific file formats, where the column names are known and will not change, disabling the Filtering can reduce user confusion.
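The difference between the two visibility modes for dynamic inputs can be sketched as a small function. This is only an illustration of the behaviour described above: ResolveColumn is a made-up name, and the candidates list with its case-insensitive matching is an assumed stand-in for Heatmapper’s Filter mechanism.

```python
def ResolveColumn(columns, selected, visible, candidates):
	"""
	@brief Pick a column following the two visibility modes.
	@param columns Column names present in the loaded data.
	@param selected The configured default, compared case-sensitively.
	@param visible Whether the input is shown to the user.
	@param candidates Assumed fallback names, standing in for Filter.
	@returns The column to use; it may be absent from columns when hidden.
	"""
	# A hidden input is constant, even when the column is missing.
	if not visible: return selected
	# A visible input keeps the configured default while it exists.
	if selected in columns: return selected
	# Otherwise fall back to the first candidate found in the data.
	lowered = {column.lower(): column for column in columns}
	for candidate in candidates:
		if candidate.lower() in lowered: return lowered[candidate.lower()]
	return columns[0] if columns else None
```

With visible=False the function returns the configured name even when the data lacks it, which is the situation where rendering would fail.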
If you’re working within the code-base, you may wonder how to actually work with Configuration values. In essence, they’re just wrappers on Shiny’s input values (if the input UIs aren’t visible, that’s literally all they are). They can’t be used in reactive decorators, but with caching you shouldn’t need to use reactive decorators in the first place.
Configuration variables are optional. You can use regular Shiny inputs just as well as you can use configuration values. A regular Shiny input does not need a configuration value, but a configuration value always needs a Shiny input to wrap. To create a Configuration value, there are three steps:
- Define the Config object within the config.py file. See the above Configuration section on its structure.
- Wrap the ui.input value in the app_ui with the Configuration’s UI member. For example, if you have a config "MatrixType": Config(), you’ll want to take the Shiny input with id="MatrixType" within the app_ui, and change it to: config.MatrixType.UI(ui.input, id="MatrixType", ...) Some things to note:
  - Pass the ui.input function itself rather than calling it; it does not take the keyword arguments directly, so don’t write ui.input(id="MatrixType", ...)
  - You must exclusively use keyword arguments, and they’ll be passed to the ui.input object
- Replace uses of input.X() with config.X(). Don’t use them in reactive decorators.
Heatmapper employs two types of Caching, Web Resource Caching and Computation Caching:
Web Resource Caching should always be utilized, and if you fetch information using the DataCache it will be done automatically. You’ll need to use the FileSelection function within your app_ui. If you need to fetch more than just a single example, you can fetch any arbitrary content using the Cache. Consider an example from Geomap. Firstly, you need to define a reactive variable, and an updater function:
JSON = reactive.value(None)
# ...
@reactive.effect
@reactive.event(input.JSONUpload, input.JSONSelection, input.JSONFile)
async def UpdateGeoJSON(): JSON.set(await DataCache.Load(
	input,
	source_file=input.JSONUpload(),
	example_file=input.JSONSelection(),
	source=URL,
	input_switch=input.JSONFile(),
	default=None
))
# ...
json = JSON()
Some things to note:
- Use a reactive variable. Constantly querying the Cache is wasteful and inefficient.
- Ensure you have reactive decorators. This is one of the only functions in Heatmapper that should have decorators, as this will cause the reactive variable to be modified, and will trigger all functions that rely on it.
- Make it asynchronous; as with decorators, this will be the only function where you should do this, and you should only let the server call this function. When you need the value, call the variable: json = JSON().
- You cannot use configuration values for the reactive values. You need to use regular Shiny input values.
- Note the arguments to the Cache:
  - source_file is a Shiny ui.input_file. Shiny and Heatmapper handle taking user input and parsing it.
  - example_file dictates the name of the example file. You have two formats in this regard:
    - A file name relative to the source variable. By default, this points to your example directory, so if you have a file stored in example_input/my_test, input.JSONSelection() can simply be my_test.
    - A URL. If the example_file starts with https://, source will be completely ignored and the example_file will be fetched directly. Look at Geomap’s example files to see how one of the examples is fetched from outside the normal place, simply by using a URL.
  - source defines where example_file will be located. Usually, this is the example_input folder for the application, but you can set it wherever you want. For this example, the URL points to data within the Geomap folder. Importantly, this source has to be local when running as a server, or remote when running as WebAssembly. You’ll need to use the Pyodide variable in shared to know what mode Heatmapper is running in; for Geomap, it sets the URL to ../data in server mode and a link to GitHub otherwise. Heatmapper expects example files to be located on disk when not running under Pyodide.
  - input_switch defines the input that determines whether we’re expecting an example file or a user-uploaded file. If it’s equal to "Upload", it’ll be looking at source_file, otherwise it looks at example_file.
  - default defines what to return if there’s nothing to return. This defaults to a DataFrame, but you may want to change it to whatever type you expect to return, otherwise you might get unexpected objects when there is nothing to return.
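The rules for these arguments can be summarised in a short sketch. ResolveSource is a hypothetical helper that mirrors the description above and is not part of shared.py; the Pyodide check uses the "pyodide" in sys.modules idiom, which may or may not be how the variable in shared is defined, and the remote URL is a placeholder.

```python
from sys import modules

# Assumption: detect the WebAssembly runtime by checking for the pyodide module.
Pyodide = "pyodide" in modules

# Local folder in server mode; the remote address is a placeholder, not Heatmapper's.
URL = "https://example.com/data" if Pyodide else "../data"


def ResolveSource(input_switch, source_file, example_file, source):
	"""
	@brief Decide what would be read for a given set of Load arguments.
	@param input_switch "Upload" for user files, anything else for examples.
	@param source_file The user-uploaded file.
	@param example_file An example name relative to source, or a full URL.
	@param source The folder or base URL that holds the examples.
	@returns The uploaded file, a URL, or source joined with example_file.
	"""
	if input_switch == "Upload": return source_file
	# A full URL bypasses source entirely.
	if example_file.startswith("https://"): return example_file
	base = source.rstrip("/")
	return f"{base}/{example_file}"
```

For instance, ResolveSource("Example", None, "my_test", "example_input") gives example_input/my_test, matching the relative-name case described above.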
Heatmapper also supports arbitrary computation caching, although you’ll need to go out of your way to use it. In essence, you’ll be using three functions of your Cache object: In(), Get(), and Store(). Firstly, you’ll need to make a list of the inputs that the computation uses. That way, a change to any input ensures that a stale cached object isn’t returned. Heatmapper makes no effort to ensure all your inputs are accounted for. Consider the image caching used by Pairwise, Expression, and Image. At the start of each Heatmap call, it creates a list of inputs:
inputs = [
	input.File() if input.SourceFile() == "Upload" else input.Example(),
	input.Image(),
	config.ColorMap(),
	config.Opacity(),
	config.Algorithm(),
	config.Levels(),
	config.Features(),
	config.TextSize(),
	config.DPI(),
]
Notice that we take the value of these (i.e. it’s a list of values, not a list of reactive objects), and that we can be conditional about what values truly make up the hash (we don’t need both File and Example, we just need whichever is selected). Then, we use the first function, In():
if not DataCache.In(inputs):
	# ...
It’s recommended to check for the absence of the object in the Cache, compute it and place it in the Cache, and then return it, so that both branches of the condition share the same return statement. When the object isn’t in the Cache, the application does the regular computation to create the output and then stores it in the Cache:
# Render the figure into an in-memory PNG.
b = BytesIO()
fig.savefig(b, format="png", dpi=config.DPI())
# Rewind so that read() returns the whole image.
b.seek(0)
DataCache.Store(b.read(), inputs)
Note that we cannot store Matplotlib plots directly; instead, we save the plot as an image and store the image’s bytes within the Cache, associating them with the inputs used to make it. Finally, we return the object within the Cache:
b = DataCache.Get(inputs)
# delete=False keeps the file on disk after the with block ends.
with NamedTemporaryFile(delete=False, suffix=".png") as temp:
	temp.write(b)
	temp.close()
	img: types.ImgData = {"src": temp.name, "height": f"{config.Size()}vh"}
	return img
The Temporary File shenanigans aren’t important; what is important is that we use Get() to retrieve the stored bytes, and then return them appropriately.
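Putting the three calls together, the check, compute, store, and return pattern can be sketched with a tiny in-memory cache. ComputeCache is a stand-in that only imitates the In(), Get(), and Store() interface described above; how the real Cache derives its keys is not shown here, so hashing the repr of the input list is an assumption, and Render is a hypothetical expensive computation.

```python
from hashlib import sha256


class ComputeCache:
	"""Stand-in for the In/Get/Store part of shared.Cache; the real class may differ."""

	def __init__(self): self.objects = {}

	def Key(self, inputs):
		# Assumption: the input values have stable repr() output.
		return sha256(repr(inputs).encode()).hexdigest()

	def In(self, inputs): return self.Key(inputs) in self.objects

	def Store(self, value, inputs): self.objects[self.Key(inputs)] = value

	def Get(self, inputs): return self.objects[self.Key(inputs)]


# Records every real computation so that cache hits can be observed.
Calls = []


def Render(inputs):
	"""Hypothetical expensive computation that returns bytes."""
	Calls.append(inputs)
	return f"rendered:{inputs}".encode()


def CachedRender(cache, inputs):
	"""
	@brief Compute only when absent, then return through a single path.
	@param cache The cache to consult.
	@param inputs The list of input values that determine the output.
	@returns The cached bytes for these inputs.
	"""
	if not cache.In(inputs): cache.Store(Render(inputs), inputs)
	return cache.Get(inputs)
```

Calling CachedRender twice with the same list runs Render once, while changing any value in the list, such as the DPI, produces a new key and a fresh computation.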