[edge_model_serving] Torch serving is not working properly since the model is not found #116

@bojeanson

Description

When launching the Torch model serving Docker container named edge_torch_serving, the logs indicate that the model was not found, and therefore no model is served. Please fix this so that the model is served.

Here are the logs:

$ docker run torch_serving:latest

WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2025-03-06T10:43:03,353 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
nvidia-smi not available or failed: Cannot run program "nvidia-smi": error=2, No such file or directory
2025-03-06T10:43:03,369 [DEBUG] main org.pytorch.serve.util.ConfigManager - xpu-smi not available or failed: Cannot run program "xpu-smi": error=2, No such file or directory
2025-03-06T10:43:03,370 [WARN ] main org.pytorch.serve.util.ConfigManager - Your torchserve instance can access any URL to load models. When deploying to production, make sure to limit the set of allowed_urls in config.properties
2025-03-06T10:43:03,383 [INFO ] main org.pytorch.serve.util.TokenAuthorization - 
######
TorchServe now enforces token authorization by default.
This requires the correct token to be provided when calling an API.
Key file located at /torch_serving/key_file.json
Check token authorization documenation for information: https://github.com/pytorch/serve/blob/master/docs/token_authorization_api.md 
######

2025-03-06T10:43:03,383 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2025-03-06T10:43:03,408 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
2025-03-06T10:43:03,439 [INFO ] main org.pytorch.serve.ModelServer - 
Torchserve version: 0.12.0
TS Home: /home/venv/lib/python3.9/site-packages
Current directory: /torch_serving
Temp directory: /home/model-server/tmp
Metrics config path: /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
Number of GPUs: 0
Number of CPUs: 2
Max heap size: 490 M
Python executable: /home/venv/bin/python
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: /torch_serving/models
Initial Models: fastrcnn=fastrcnn.mar
Log dir: /torch_serving/logs
Metrics dir: /torch_serving/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 2
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false
Workflow Store: /torch_serving/models
CPP log config: N/A
Model config: N/A
System metrics command: default
Model API enabled: false
2025-03-06T10:43:03,445 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: fastrcnn.mar
2025-03-06T10:43:03,455 [INFO ] main org.pytorch.serve.archive.model.ModelArchive - createTempDir /home/model-server/tmp/models/bd728281476241cd80b314c98089185b
2025-03-06T10:43:03,456 [WARN ] main org.pytorch.serve.ModelServer - Failed to load model: fastrcnn.mar
org.pytorch.serve.archive.model.ModelNotFoundException: Model not found at: fastrcnn.mar
        at org.pytorch.serve.archive.model.ModelArchive.downloadModel(ModelArchive.java:118) ~[model-server.jar:?]
        at org.pytorch.serve.wlm.ModelManager.createModelArchive(ModelManager.java:185) ~[model-server.jar:?]
        at org.pytorch.serve.wlm.ModelManager.registerModel(ModelManager.java:143) ~[model-server.jar:?]
        at org.pytorch.serve.ModelServer.initModelStore(ModelServer.java:266) [model-server.jar:?]
        at org.pytorch.serve.ModelServer.startRESTserver(ModelServer.java:399) [model-server.jar:?]
        at org.pytorch.serve.ModelServer.startAndWait(ModelServer.java:124) [model-server.jar:?]
        at org.pytorch.serve.ModelServer.main(ModelServer.java:105) [model-server.jar:?]
2025-03-06T10:43:03,461 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2025-03-06T10:43:03,482 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2025-03-06T10:43:03,483 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2025-03-06T10:43:03,484 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2025-03-06T10:43:03,484 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2025-03-06T10:43:03,485 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
2025-03-06T10:43:03,633 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:a8807015d002,timestamp:1741257783
2025-03-06T10:43:03,634 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:87.50933074951172|#Level:Host|#hostname:a8807015d002,timestamp:1741257783
2025-03-06T10:43:03,635 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:8.297050476074219|#Level:Host|#hostname:a8807015d002,timestamp:1741257783
2025-03-06T10:43:03,635 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:8.7|#Level:Host|#hostname:a8807015d002,timestamp:1741257783
2025-03-06T10:43:03,636 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:366.80078125|#Level:Host|#hostname:a8807015d002,timestamp:1741257783
2025-03-06T10:43:03,636 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1499.2421875|#Level:Host|#hostname:a8807015d002,timestamp:1741257783
2025-03-06T10:43:03,636 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:81.3|#Level:Host|#hostname:a8807015d002,timestamp:1741257783
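
The startup banner above reports `Model Store: /torch_serving/models` and `Initial Models: fastrcnn=fastrcnn.mar`, and the exception says `Model not found at: fastrcnn.mar`. One way to narrow this down is to check whether the archive actually exists in the model store inside the image. The following is a minimal sketch of that check, not a confirmed diagnosis; the helper names `parse_startup_log` and `missing_archives` are hypothetical and only rely on the two banner lines shown in the logs.

```python
import re
from pathlib import Path


def parse_startup_log(log_text):
    """Extract the model store path and the initial models from a startup banner.

    Assumes the banner contains lines of the form shown in this issue:
    'Model Store: <path>' and 'Initial Models: name=archive.mar[,...]'.
    """
    store = re.search(r"^Model Store:\s*(.+)$", log_text, re.MULTILINE)
    initial = re.search(r"^Initial Models:\s*(.+)$", log_text, re.MULTILINE)

    models = {}
    if initial and initial.group(1).strip() != "N/A":
        for entry in initial.group(1).split(","):
            entry = entry.strip()
            if not entry:
                continue
            name, sep, archive = entry.partition("=")
            if sep:
                # Entry written as "name=archive.mar"
                models[name.strip()] = archive.strip()
            else:
                # Entry written as a bare "archive.mar"; use the stem as the name
                models[Path(entry).stem] = entry

    model_store = store.group(1).strip() if store else None
    return model_store, models


def missing_archives(model_store, models):
    """Return the archive names that are not present as files in model_store."""
    root = Path(model_store)
    return [archive for archive in models.values() if not (root / archive).is_file()]
```

Run inside the container (or against a copy of its filesystem), `missing_archives("/torch_serving/models", {"fastrcnn": "fastrcnn.mar"})` returning a non-empty list would suggest that the archive was never copied or generated into the model store at build time.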

Acceptance criteria

Torch serving correctly serves the model.
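
A possible way to verify this criterion is to inspect the body returned by the management API's list-models endpoint (`GET /models` on the management address, port 8081 in the logs) and confirm that `fastrcnn` appears in it. The sketch below assumes the response has the shape `{"models": [{"modelName": ..., "modelUrl": ...}]}`; the helper names are hypothetical. Since the logs state that token authorization is enforced, the actual request would also need the token from the key file, which is left out here.

```python
import json


def served_model_names(list_models_body):
    """Return the model names found in a list-models response body (JSON text).

    Assumption: the body looks like
    {"models": [{"modelName": "...", "modelUrl": "..."}]}.
    """
    payload = json.loads(list_models_body)
    return [model.get("modelName") for model in payload.get("models", [])]


def is_served(list_models_body, model_name="fastrcnn"):
    """Check whether model_name is listed as registered in the response body."""
    return model_name in served_model_names(list_models_body)
```

An empty `models` list in the response would match the failure described in this issue, where the server starts but no model is registered.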

Metadata

Assignees

No one assigned

    Labels

    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests