[SC-11676] Improve API timeout support to prevent timeouts with large datasets by juanmleng · Pull Request #411 · validmind/validmind-library

juanmleng · 2025-08-07T22:04:40Z

Pull Request Description

What and why?

What

Added configurable batching for figure uploads via VM_FIGURE_MAX_BATCH_SIZE environment variable (defaults to 20 figures per batch)
Made API request timeout configurable via VM_API_TIMEOUT environment variable (defaults to 30 seconds)
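As a rough illustration of how the two settings above might be read, here is a minimal sketch. The helper name `read_positive_int` and the choice to fall back to the default on missing or invalid values are assumptions for illustration, not the library's actual code; only the variable names and defaults come from this PR description.

```python
import os

DEFAULT_API_TIMEOUT = 30  # seconds, default stated in this PR
DEFAULT_FIGURE_MAX_BATCH_SIZE = 20  # figures per batch, default stated in this PR


def read_positive_int(name: str, default: int) -> int:
    """Hypothetical helper: read a positive integer setting from the environment.

    Falls back to `default` when the variable is unset, empty, not an
    integer, or not positive (the fallback policy is an assumption).
    """
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


api_timeout = read_positive_int("VM_API_TIMEOUT", DEFAULT_API_TIMEOUT)
figure_batch_size = read_positive_int(
    "VM_FIGURE_MAX_BATCH_SIZE", DEFAULT_FIGURE_MAX_BATCH_SIZE
)
```

Reading the values at call time rather than hardcoding them is what lets users tune both settings without code changes.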

Why

Test results with many figures (e.g., 20+ plots) could cause timeouts, depending on the network environment, when all figures were uploaded simultaneously. Batching prevents overwhelming the API.
Sequential batch processing with concurrent uploads within each batch provides a balance between speed and stability, reducing failed uploads due to resource constraints.
Configurable timeout and batch size allow users to adjust based on their network conditions without code changes.
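The "sequential batches, concurrent uploads within each batch" pattern described above can be sketched with `asyncio.gather`. The names `chunked`, `upload_in_batches`, and `upload_one` are hypothetical and stand in for whatever the library actually uses; this is a sketch of the technique, not the merged implementation.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


def chunked(items: Sequence[Any], size: int) -> List[Sequence[Any]]:
    """Split `items` into consecutive slices of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


async def upload_in_batches(
    figures: Sequence[Any],
    upload_one: Callable[[Any], Awaitable[Any]],
    batch_size: int = 20,
) -> List[Any]:
    """Upload figures batch by batch.

    Uploads inside one batch run concurrently; the next batch starts only
    after every upload in the current batch has finished.
    """
    results: List[Any] = []
    for batch in chunked(figures, batch_size):
        results.extend(await asyncio.gather(*(upload_one(f) for f in batch)))
    return results


if __name__ == "__main__":
    async def fake_upload(figure: str) -> str:
        # Stand-in for a real API call; only simulates latency.
        await asyncio.sleep(0.01)
        return f"uploaded {figure}"

    names = [f"figure-{i}" for i in range(5)]
    print(asyncio.run(upload_in_batches(names, fake_upload, batch_size=2)))
```

With this shape, the batch size caps the number of requests in flight at any moment, which is the trade-off between speed and stability described above.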

How to test

Add the following code before importing validmind:

import os
os.environ["LOG_LEVEL"] = "DEBUG"

To reproduce a timeout error, set VM_API_TIMEOUT=5 in your .env and run the application_scorecard_full_suite.ipynb notebook.
To check the effect of batch processing, set both VM_API_TIMEOUT=5 and VM_FIGURE_MAX_BATCH_SIZE=2 and rerun the notebook; with the smaller batches, the timeout errors should no longer occur.

What needs special review?

Dependencies, breaking changes, and deployment notes

Release notes

Checklist

cachafla

Nice

github-actions · 2025-08-08T07:22:10Z

PR Summary

This PR introduces several functional improvements related to API logging and figure upload processing in the project. The key changes include:

In the API client, the HTTP client timeout is now configurable via an environment variable (VM_API_TIMEOUT) instead of being hardcoded to 30 seconds. This allows dynamic adjustment of API call timeouts without code changes.
A new decorator, log_api_operation, has been implemented in the logging module. This decorator enables function-level logging for API operations. It logs the operation (using a provided operation name or the function name) and supports the extraction of a key from the function arguments for more informative logs. The decorator activates only if the logging level is set to DEBUG (unless forced), aiming to reduce unnecessary logging in production.
In the result module, the processing of figure uploads has been enhanced by breaking the figures into configurable batches. The batch size is determined by the environment variable (VM_FIGURE_MAX_BATCH_SIZE) with a default of 20, which allows better control over resource usage during the upload process. Each batch upload is wrapped with the new log_api_operation decorator to track the operation.
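To make the decorator behaviour described above concrete, here is a minimal sketch of a DEBUG-gated logging decorator for async functions. The signature (`operation_name`, `extract_key`), the logger name, and the message format are all assumptions; the real `log_api_operation` may differ, and the "force" option mentioned in the summary is not modelled here.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("api_sketch")  # hypothetical logger name


def log_api_operation(operation_name=None, extract_key=None):
    """Hypothetical sketch: log an async API call only when DEBUG is enabled."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            if logger.isEnabledFor(logging.DEBUG):
                # Use the provided operation name, else the function name.
                name = operation_name or func.__name__
                # Optionally pull an identifying key out of the call arguments.
                key = extract_key(args, kwargs) if extract_key else None
                suffix = f" (key={key})" if key is not None else ""
                logger.debug("API operation: %s%s", name, suffix)
            return await func(*args, **kwargs)

        return wrapper

    return decorator


@log_api_operation(
    operation_name="upload figure",
    extract_key=lambda args, kwargs: kwargs.get("key"),
)
async def upload_figure(*, key: str) -> str:
    """Stand-in for a real upload; it only yields control once."""
    await asyncio.sleep(0)
    return f"uploaded {key}"


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(asyncio.run(upload_figure(key="roc_curve")))
```

Checking `isEnabledFor` before building the message keeps the decorator close to free when the log level is above DEBUG.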

Overall, the changes aim to provide more flexibility and transparency in API operations and improve the handling of batch uploads for figures.

Test Suggestions

Verify that the API client session timeout correctly reads the value from the VM_API_TIMEOUT environment variable and falls back to 30 when not set.
Test the log_api_operation decorator to ensure it logs the operation name and extracted key correctly, particularly when the logger is in DEBUG mode.
Create tests for the alog_figure function to ensure that it is properly wrapped by log_api_operation and logs as expected.
Simulate different sizes of the figures list to confirm that batching works correctly with the batch size determined by VM_FIGURE_MAX_BATCH_SIZE.
Run tests with different log levels to ensure that the logging behavior changes appropriately when not in DEBUG mode.
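One way to approach the batching suggestion above is to check a few list sizes around the batch boundary. The sketch below tests a hypothetical `split_into_batches` stand-in, since the library's internal batching helper is not shown in this PR; the checks themselves (order preserved, expected batch count, no oversized batch) should carry over to the real code.

```python
import math


def split_into_batches(items, batch_size):
    """Hypothetical stand-in for the library's batching logic."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def check_batching(n_figures, batch_size):
    """Assert basic batching invariants and return the batch lengths."""
    figures = [f"figure-{i}" for i in range(n_figures)]
    batches = split_into_batches(figures, batch_size)
    # Flattening the batches must give back the original list, in order.
    assert [f for batch in batches for f in batch] == figures
    # The number of batches is the ceiling of n_figures / batch_size.
    assert len(batches) == math.ceil(n_figures / batch_size)
    # No batch is empty or larger than the configured size.
    assert all(1 <= len(batch) <= batch_size for batch in batches)
    return [len(batch) for batch in batches]


# Sizes below, at, and above the default batch size of 20.
for n in (0, 1, 19, 20, 21, 45):
    check_batching(n, 20)
```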

AnilSorathiya

Lgtm 👍

Make timeout configurable and send figures in batches

625a874

juanmleng self-assigned this Aug 7, 2025

juanmleng added the bug (Something isn't working) label Aug 7, 2025

juanmleng requested review from AnilSorathiya, cachafla and johnwalz97 August 7, 2025 22:12

cachafla approved these changes Aug 8, 2025


2.8.31

b348d7d

AnilSorathiya approved these changes Aug 8, 2025


juanmleng merged commit 60fdeaf into main Aug 8, 2025
7 of 8 checks passed

juanmleng deleted the juan/sc-11676/improve-api-timeout-support-to-prevent-timeouts-with-large-datasets branch August 8, 2025 07:58

juanmleng merged 2 commits into main from juan/sc-11676/improve-api-timeout-support-to-prevent-timeouts-with-large-datasets
