Skip to content

Initial Dataset Management GUI#197

Closed
robertbartel wants to merge 46 commits intoNOAA-OWP:masterfrom
robertbartel:e21/dataset_mgr_gui/main
Closed

Initial Dataset Management GUI#197
robertbartel wants to merge 46 commits intoNOAA-OWP:masterfrom
robertbartel:e21/dataset_mgr_gui/main

Conversation

@robertbartel
Copy link
Copy Markdown
Contributor

No description provided.

super().__init__(*args, **kwargs)

def _get_dataset_content_details(self, dataset_name: str):
result = asyncio.get_event_loop().run_until_complete(self.dataset_client.get_dataset_content_details(name=dataset_name))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there are several calls to asyncio.get_event_loop().run_until_complete. It might help to have some synchronous versions of these functions on the dataset client since there appears to be some intention to call them synchronously. Maybe.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base functions for sending data through websockets are (and I am pretty sure must be) async, as are the functions making sure the client is authenticated (which relies on the former also). It would take a decent bit of changing to be able to do this and have everything work, but we can look more at that in the future.


def get(self, request, *args, **kwargs):
request_type = request.GET.get("request_type", None)
if request_type == 'download_file':
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if chain leads me to believe that splitting this out across several smaller views might help.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did originally start to do that, but it occurred to me that this kind of violated the intent of having only a single websocket (at least per authenticated session) open for all client communication. In fairness, that isn't working properly at the moment, but I think we want to try to have it that way eventually, or else we'd move to REST. And if we are eventually going for that, I don't think we want too many different views (i.e., because eventually we'll have to consolidate them), although I could be assessing that incorrectly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, if there's a plan to consolidate splitting them out doesn't make a whole lot of sense.

return self._get_dataset_json(dataset_name=request.GET.get("name", None))
elif request_type == 'contents':
return self._get_dataset_content_details(dataset_name=request.GET.get("name", None))
if request_type == 'delete':
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be handled through a delete handler instead of a get handler. I don't know what all's going on in the workflow (like what the authentication looks like), but it looks like a user could possibly delete datasets through a browser address.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, but that should eventually get protected by our authentication and authorization mechanisms. And then this relates to my other comment about how we (probably) want to consolidate views now, or will eventually, for how we want our websocket communication to work.

return self._delete_dataset(dataset_name=request.GET.get("name", None))

# TODO: finish
return JsonResponse({}, status=400)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Django has a specialized response that might help here at django.http.HttpResponseBadRequest. If you use it it can help out when diagnosing issues in debug mode.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^
Related, do we want to provide any context as to why a 400 was the response? i.e. expected one of request_type: .

errors, warnings, info = extract_log_data(kwargs)

# Gather map of serialized datasets, keyed by dataset name
serial_dataset_map = asyncio.get_event_loop().run_until_complete(self.get_datasets())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another asyncio.get_event_loop().run_until_complete call. It might help to have two get_datasets - get_datasets and get_datasets_async, with the former just calling the latter with the event loop wrapper.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can, but that seems cluttered. Not that I haven't already done that kind of cluttering in places ...

Do you think that would just enhance developer usability of the client class, or were there other advantages you were anticipating?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that I'm the type of goober that would end up calling self.get_datasets() and forget to use the asyncio logic. That's just about the only thing that the change would address.

FileWrapper(custom_filelike, chunk_size),
content_type="application/octet-stream"
)
response['Content-Length'] = asyncio.get_event_loop().run_until_complete(self.dataset_client.get_item_size(dataset_name, item_name))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StreamingHttpResponses are a little wonky - I don't think they are compatible with the Content-Length header and will clash with Gunicorn. It might be better to have a websocket that emits this data and let the client build the blob.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have this part working anyway, so it can be implemented differently. I can pull this and other such things out entirely, but I wasn't sure if parts might be useful to show the direction I was trying to go.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's precisely why it should be left in for now. I just wanted to drop some notes while I was looking at it. I don't think any of us has the bandwidth to make major changes at the moment.

arguments[0].appendChild(row);
}

function buildDatasetFilesEmptyDiv() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might help to define these styles in a css document rather than baking them into the template. They'd be easier to change and help reduce some code bloat.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I started down that road for several things but didn't get everything and wanted to push this. I plan to do more clean up, it's just a question of whether it's before or after the review and merge of this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will be able to remove a lot of this if / when we move to css grid / flexbox. I think we are all on the same page though.

});
}

function ajaxBuildDatasetFilesTable(dataset_name, contents_json, table) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be easier to take advantage of a template tag here instead of programmatic element creation. That'd help reduce possible nesting errors and make things a little easier to read (since you'd be reading markup instead of js).

*/
</script>
<style>
#header {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably consolidate these and styles from other templates into a common css document to help maintain consistency between pages. Branding, for instance, could be used on just about every page, for example.

@@ -0,0 +1,797 @@
<!DOCTYPE html>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Templates can inherit from base templates and override/fill out different elements from the parent. Using a base template here could probably cut down the size of this template down by a large portion and make it so the only things that need to be written here are the things specific to dataset management and dataset management alone.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and we can share common things like stylesheets.

</style>
</head>
<body onload="initGuiViews()">
<div id="container">
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of boilerplate could be farmed out to a base template so you don't have to worry about all that here.

@robertbartel
Copy link
Copy Markdown
Contributor Author

With PRs #197 and #199, I believe once one of them is merged, the other will need modifications to resolve conflicts.

@christophertubbs
Copy link
Copy Markdown
Contributor

I approved #199, so this will need to be modified.

from typing import Dict


class AbstractDatasetView(View, DMODMixin, ABC):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the desire to add blanket implementation here for downstream classes, but given that all methods and properties are implemented, is the use of abstract here to indicate to downstream users that they should inherit from this? Still coming up the speed on the codebase so this might be an idiom ive not picked up on yet.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially yes. In particular, to reusably abstract a view that has a dataset client property. But it's not especially meaningful on it's own.

from pathlib import Path
from typing import List, Optional, Tuple, Type

GUI_STATIC_SSL_DIR = Path('/usr/maas_portal/ssl')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to mark this as a TODO in the future to make this configurable.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should also be handled by Django as well. Shouldn't need to be too necessary. Should be in settings.py

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for catching this. A comment noting this implicit behavior would be appreciated!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason I thought I put that in settings.py.


def read(self, blksize: int) -> AnyStr:

result = asyncio.get_event_loop().run_until_complete(
Copy link
Copy Markdown
Member

@aaraney aaraney Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to try catch here? If the download_item_block call raises an exception and a downstream user of DatasetFileWebsocketFilelike catches an exception, the state of their DatasetFileWebsocketFilelike could be corrupted or at least unclear what the behavior should be. Meaning, here, _read_index, if caught during a failed read, should continue from the failure point. Is this the expected behavior, or should _read_index be reset to 0?

updateFormatChange(selection) {
let dy_div = document.getElementById(this.dynamicVarsDivId);
//let initial_length = dy_div.childNodes.length;
dy_div.childNodes.forEach(child => child.remove());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to #198. See this comment specifically. This should be changed to the following:

// remove all dynamic input child elements
while (dy_div.firstChild) {
  dy_div.removeChild(dy_div.firstChild);
}

super().__init__(*args, **kwargs)

def _get_dataset_content_details(self, dataset_name: str):
result = asyncio.get_event_loop().run_until_complete(self.dataset_client.get_dataset_content_details(name=dataset_name))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we might want to wrap these methods (all methods with leading _) with try except's and return something meaningful to the client if the request fails. I didn't go looking, but if the result has something like an ok property, we might want to return a non-200 here.


# Gather map of serialized datasets, keyed by dataset name
serial_dataset_map = asyncio.get_event_loop().run_until_complete(self.get_datasets())
serial_dataset_list = [serial_dataset_map[d] for d in serial_dataset_map]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is nit-picky, but for code readability, serial_dataset_list = list( serial_dataset_map.values()) might be preferred.

if self.last_response.success:
return self.last_response.data
else:
return {}
Copy link
Copy Markdown
Member

@aaraney aaraney Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to return something to the caller indicating why, if we know, the response was unsuccessful. thoughts?

if self.last_response.success:
return self.last_response.data
else:
return -1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ ditto

if self.last_response.success:
return self.last_response.data
else:
return ''
Copy link
Copy Markdown
Member

@aaraney aaraney Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ ditto. For non-dict return types, does it make sense to raise an exception here?

@robertbartel
Copy link
Copy Markdown
Contributor Author

FWIW, I don't particularly understand why tests are (now) failing after rebasing in other upstream changes. The same tests are not failing locally for me or when run on ucs3. Given that, I think we should proceed with this (pending final review approval).

Either ``communication.MessageEventType.DATASET_MANAGEMENT`` or
``communication.MessageEventType.DATA_TRANSMISSION``.
"""
# TODO:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue to track this? Or will this be completed in this PR?

from .AbstractDatasetView import AbstractDatasetView


class DatasetManagementView(AbstractDatasetView):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on adding a delete method here to handle deleting a dataset? Or will that fall under the responsibility of post? Meaning, post would then likely handle creation, deletion, and modification.

Updating service to respond to GET_DATASET_ITEMS queries and requests
for data with an offset start (i.e. partials).
Adding initial second view display and manage existing datasets, along
with navigation functionality to toggle between "manage" and "create"
views.
Sending serialized datasets to HTML template as list/array rather than
dict/map.
Implementing layout and initial details-viewing behavior for dataset
management GUI view.
Adding commented-out line for host bind-mount volume to GUI Docker
service config to make debugging Django template and Javascript code
faster by mounted code from the host into the running service.
Make DatasetManagementView inherit from AbstractDatasetView so certain
required pieces could be centrally located and reused.
Adding several classes used by Dataset management GUI view.
Cleanup, a bit of additional behavior, and attempts at getting downloads
working.
Adding incomplete DatasetFileWebsocketFilelike class, since it is
referenced by other things.
Fixing issue from some upstream changes where numpy was still required
but no longer included implicitly/transitively as a dependency.
Fixing issue from some upstream changes where migrations were necessary
for dataset management GUI to function (even though they weren't used
there) but were not run automatically.
@robertbartel
Copy link
Copy Markdown
Contributor Author

Closing; consolidating these and other changes in #240.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants