Problems setting up the host-manager on smurf-so10 #425

@jlashner

I'm currently trying to set up the hostmanager on smurf-so10 for satp3 the same way I did for satp1, but I'm encountering errors in the hostmanager agent that I have not seen before.

Here is the PR with the configuration I'm testing: https://github.com/simonsobs/ocs-deployment-configs/pull/364

When parsing the configs, the hostmanager throws this error:

Failed to interpret /home/cryo/repos/ocs-deployment-configs/satp3/smurf-so10-satp3/docker-compose-ocs.yml and/or its service states: 'com.docker.compose.project.working_dir'

After investigating, it seems this is thrown in the _InspectContainer function when the container info does not have the key com.docker.compose.project.working_dir.
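A minimal sketch of the failure mode, not the actual hostmanager code: the helper name get_working_dir and the sample label dict are hypothetical. Indexing a label dict with an absent key raises a KeyError whose message is just the quoted key, which matches the tail of the error above. A lookup with dict.get returns None instead:

```python
WORKING_DIR_LABEL = "com.docker.compose.project.working_dir"


def get_working_dir(labels):
    """Return the compose working dir from a label dict, or None if absent."""
    return labels.get(WORKING_DIR_LABEL)


# Hypothetical labels for a container that lacks the working_dir label.
labels = {
    "com.docker.compose.project": "smurf-so10-satp3",
    "com.docker.compose.service": "smurf-streamer-s3",
}

try:
    labels[WORKING_DIR_LABEL]  # direct indexing fails on the missing key
    message = None
except KeyError as err:
    message = str(err)  # the missing key, wrapped in single quotes

print(message)
print(get_working_dir(labels))
```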

The version of docker on smurf-so10-satp3 is 24.0.2.

I do not know why this is happening, but there are a few strange things going on here:

  1. The command docker compose -f <file-path> ps does not restrict its output to containers from the specified file (it also lists smurf-jupyter and the smurf-streamer containers). This differs from the docker-compose -f <file-path> ps command, as shown below:
    cryo@smurf-srv-SO10:~/repos/ocs-deployment-configs/satp3/smurf-so10-satp3$ docker compose -f docker-compose-ocs.yml ps
    NAME                            IMAGE                                               COMMAND                  SERVICE                         CREATED             STATUS              PORTS
    ocs-daq-sync-smurf-so10         simonsobs/socs:v0.5.0-5-g6688b61-dev                "dumb-init ocs-agent…"   ocs-daq-sync-smurf-so10         3 hours ago         Up 2 hours          
    ocs-daq-sync-timestreams-so10   simonsobs/socs:v0.5.0-5-g6688b61-dev                "dumb-init ocs-agent…"   ocs-daq-sync-timestreams-so10   3 hours ago         Up 2 hours          
    ocs-det-controller-c2s3         simonsobs/ocs-pysmurf-agent:v0.5.0-5-g6688b61-dev   "dumb-init ocs-agent…"   ocs-det-controller-c2s3         2 hours ago         Up 2 hours          
    ocs-det-controller-c2s4         simonsobs/ocs-pysmurf-agent:v0.5.0-5-g6688b61-dev   "dumb-init ocs-agent…"   ocs-det-controller-c2s4         2 hours ago         Up 2 hours          
    ocs-det-crate-2                 simonsobs/socs:v0.5.0-5-g6688b61-dev                "dumb-init ocs-agent…"   ocs-det-crate-2                 2 hours ago         Up 2 hours          
    ocs-det-monitor-so10            simonsobs/socs:v0.5.0-5-g6688b61-dev                "dumb-init ocs-agent…"   ocs-det-monitor-so10            3 hours ago         Up 2 hours          
    ocs-ocs-det-controller-c2s5     simonsobs/ocs-pysmurf-agent:v0.5.0-5-g6688b61-dev   "dumb-init ocs-agent…"   ocs-det-controller-c2s5         2 hours ago         Up 2 hours          
    smurf-jupyter                   simonsobs/ocs-pysmurf-agent:v0.5.0-5-g6688b61-dev   "jupyter notebook /d…"   smurf-jupyter                   2 hours ago         Up 2 hours          
    smurf-streamer-s3               simonsobs/smurf-streamer:v0.4.4                     "python3 -u /usr/loc…"   smurf-streamer-s3               2 hours ago         Up 2 hours          
    smurf-streamer-s4               simonsobs/smurf-streamer:v0.4.4                     "python3 -u /usr/loc…"   smurf-streamer-s4               2 hours ago         Up 2 hours          
    smurf-streamer-s5               simonsobs/smurf-streamer:v0.4.4                     "python3 -u /usr/loc…"   smurf-streamer-s5               2 hours ago         Up 2 hours          
    cryo@smurf-srv-SO10:~/repos/ocs-deployment-configs/satp3/smurf-so10-satp3$ docker-compose -f docker-compose-ocs.yml ps
                Name                           Command               State   Ports
    ------------------------------------------------------------------------------
    ocs-daq-sync-smurf-so10         dumb-init ocs-agent-cli          Up           
    ocs-daq-sync-timestreams-so10   dumb-init ocs-agent-cli          Up           
    ocs-det-controller-c2s3         dumb-init ocs-agent-cli -- ...   Up           
    ocs-det-controller-c2s4         dumb-init ocs-agent-cli -- ...   Up           
    ocs-det-crate-2                 dumb-init ocs-agent-cli          Up           
    ocs-det-monitor-so10            dumb-init ocs-agent-cli          Up           
    ocs-ocs-det-controller-c2s5     dumb-init ocs-agent-cli -- ...   Up           
    
  2. Non-ocs dockers don't have the "com.docker.compose.project.working_dir" label in the docker config:
    > docker inspect smurf-streamer-s3
        ...
        "OnBuild": null,
        "Labels": {
            "com.docker.compose.config-hash": "9fa9d8eb97a1556d425c4dd8e8d66281cb464ad53df8cc9deaf55f5bafcc56ea",
            "com.docker.compose.container-number": "1",
            "com.docker.compose.oneoff": "False",
            "com.docker.compose.project": "smurf-so10-satp3",
            "com.docker.compose.service": "smurf-streamer-s3",
            "com.docker.compose.version": "1.23.2",
            "org.opencontainers.image.ref.name": "ubuntu",
            "org.opencontainers.image.version": "20.04"
        }
       ...
    
    Compare this with the ocs-dockers, which do have the working_dir label:
        >> docker inspect ocs-det-controller-c2s3
           "OnBuild": null,
           "Labels": {
               "com.docker.compose.config-hash": "8fda13dbfd4be6557d96768e045c9538dcdb0cafa3e6b81e36d051991acfad02",
               "com.docker.compose.container-number": "1",
               "com.docker.compose.depends_on": "",
               "com.docker.compose.image": "sha256:abe4709f4e4675f3fe5e5a06e65a8b608f115f85175fbe71b880df4bcb5765f9",
               "com.docker.compose.oneoff": "False",
               "com.docker.compose.project": "smurf-so10-satp3",
               "com.docker.compose.project.config_files": "/home/cryo/repos/ocs-deployment-configs/satp3/smurf-so10-satp3/docker-compose-ocs.yml",
               "com.docker.compose.project.working_dir": "/home/cryo/repos/ocs-deployment-configs/satp3/smurf-so10-satp3",
               "com.docker.compose.replace": "91a5ea4ddc892c8ac0c8ff21f2e89f1085e354fb54c8bec6b15dba41047ff3ca",
               "com.docker.compose.service": "ocs-det-controller-c2s3",
               "com.docker.compose.version": "2.18.1",
               "org.opencontainers.image.ref.name": "ubuntu",
               "org.opencontainers.image.version": "20.04"
           }
    

It could be that something is wrong in the configuration that makes this happen; however, it's not obvious to me what, or what differs between this configuration and the changes I made and tested successfully on satp1.
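To check which containers lack the label, the docker inspect output can be summarized per container. This is a rough sketch, assuming the usual docker inspect JSON layout (a list of objects with "Name" and "Config" → "Labels"); the function name summarize_labels is hypothetical, and the sample data is trimmed from the two inspect outputs above. It also reports the com.docker.compose.version label, since the two containers above differ there (1.23.2 vs 2.18.1), which may or may not be related:

```python
import json

WORKING_DIR_LABEL = "com.docker.compose.project.working_dir"
VERSION_LABEL = "com.docker.compose.version"


def summarize_labels(inspect_json):
    """Map container name -> (compose version label, has working_dir label)."""
    summary = {}
    for entry in json.loads(inspect_json):
        # "Config" or "Labels" may be missing or null; treat both as empty.
        labels = (entry.get("Config") or {}).get("Labels") or {}
        name = entry.get("Name", "").lstrip("/")
        summary[name] = (labels.get(VERSION_LABEL), WORKING_DIR_LABEL in labels)
    return summary


# Sample input trimmed from the inspect outputs shown in this issue.
sample = json.dumps([
    {
        "Name": "/smurf-streamer-s3",
        "Config": {"Labels": {
            "com.docker.compose.project": "smurf-so10-satp3",
            "com.docker.compose.version": "1.23.2",
        }},
    },
    {
        "Name": "/ocs-det-controller-c2s3",
        "Config": {"Labels": {
            "com.docker.compose.project": "smurf-so10-satp3",
            "com.docker.compose.project.working_dir":
                "/home/cryo/repos/ocs-deployment-configs/satp3/smurf-so10-satp3",
            "com.docker.compose.version": "2.18.1",
        }},
    },
])

summary = summarize_labels(sample)
for name, (version, has_dir) in sorted(summary.items()):
    print(name, version, has_dir)
```

On the host, the input could come from something like docker inspect $(docker ps -q), with the output passed to summarize_labels as a string.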
