
Prototype Python Rest API for sumo explorer #662

Open
magnesj wants to merge 9 commits into dev from fastapi-prototype-01

magnesj (Owner) commented Feb 8, 2026

PR Type

Enhancement


Description

  • Add Python FastAPI server wrapper for Sumo Explorer API

  • Implement RiaSumoExplorerConnector for local server management

  • Add preferences for Python path, server port, and auto-start

  • Create integration plan for connector selection in cloud data UI


Diagram Walkthrough

flowchart LR
  A["ResInsight C++"] -->|RiaSumoExplorerConnector| B["FastAPI Server"]
  B -->|HTTP Requests| C["Python Process"]
  C -->|fmu.sumo.explorer| D["Sumo Cloud API"]
  A -->|Preferences| E["RiaPreferencesSumoExplorer"]
  E -->|Config| B
  F["RiaConnectorTools"] -->|Auto-start| B
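The auto-start edge in the diagram implies that the caller has to wait until the server actually answers before sending data requests. A minimal sketch of that readiness check, assuming the `/health` endpoint returns `{"status": "healthy", ...}` as in the server code quoted later in this PR. The function name `wait_for_server_ready` and its parameters are hypothetical and only illustrate the polling pattern; this is not the connector's actual implementation.

```python
import json
import time
import urllib.error
import urllib.request


def wait_for_server_ready(base_url: str, timeout_s: float = 10.0, interval_s: float = 0.2) -> bool:
    """Poll GET {base_url}/health until it reports "healthy" or the timeout expires."""
    # Bypass any system proxy settings: the server is assumed to run on loopback.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with opener.open(f"{base_url}/health", timeout=interval_s + 1.0) as resp:
                payload = json.loads(resp.read().decode("utf-8"))
                if resp.status == 200 and payload.get("status") == "healthy":
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            # Not accepting connections yet, or the body was not valid JSON: retry.
            pass
        time.sleep(interval_s)
    return False
```

Polling with a deadline keeps startup bounded: a server that never comes up yields `False` instead of hanging the caller.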

File Walkthrough

Relevant files

Enhancement (15 files)
RiaApplication.cpp: Add Sumo Explorer connector initialization (+23/-0)
RiaApplication.h: Declare Sumo Explorer connector member and factory (+8/-5)
RiaPreferences.cpp: Initialize Sumo Explorer preferences in main preferences (+16/-0)
RiaPreferences.h: Add Sumo Explorer preferences field and accessor (+5/-0)
RiaPreferencesSumoExplorer.cpp: Implement Sumo Explorer preferences with configuration (+94/-0)
RiaPreferencesSumoExplorer.h: Define Sumo Explorer preferences class structure (+49/-0)
RiaConnectorTools.cpp: Add auto-start logic for Sumo Explorer server (+20/-0)
RiaSumoExplorerConnector.cpp: Implement full Sumo Explorer connector with server lifecycle (+878/-0)
RiaSumoExplorerConnector.h: Define Sumo Explorer connector interface and data types (+126/-0)
RiaSumoExplorerDefines.cpp: Implement Sumo Explorer configuration constants (+47/-0)
RiaSumoExplorerDefines.h: Define Sumo Explorer data structures and utilities (+89/-0)
__init__.py: Create Python package initialization for server (+8/-0)
models.py: Define Pydantic models for API responses (+76/-0)
sumo_client.py: Implement Sumo Explorer API wrapper client (+356/-0)
sumo_explorer_server.py: Implement FastAPI server with REST endpoints (+255/-0)

Configuration changes (3 files)
CMakeLists_files.cmake: Add Sumo Explorer preferences to build system (+2/-0)
CMakeLists_files.cmake: Add Sumo Explorer connector files to build system (+4/-0)
CMakeLists.txt: Install Python Sumo Explorer server files (+8/-0)

Dependencies (1 file)
requirements.txt: Define Python dependencies for server (+6/-0)

Documentation (2 files)
README.md: Document server architecture and API endpoints (+51/-0)
sumo-explorer-integration.md: Outline integration plan for connector selection UI (+376/-0)

magnesj and others added 8 commits February 8, 2026 10:56
Plan outlines approach to make Sumo Explorer connector data available
in RimCloudDataSourceCollection with preference-based connector selection.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* Refactor and extend percentile calculation utilities

Refactored percentile and quantile calculation logic for improved flexibility and maintainability. Added helper functions for quantile/percentile validation, introduced a new vector-based calculatePercentiles method, and updated calculateStatisticsCurves to use it. Refactored nearest-rank and interpolated percentile methods to accept percentiles as vectors and improved parameter validation. Enhanced documentation throughout and updated function signatures in the header for clarity.

qodo-code-review bot commented Feb 8, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Arbitrary process execution

Description: The server launcher executes a user-configurable pythonPath (or python from PATH) via
QProcess::start(), enabling arbitrary local code execution if the preferences/config are
tampered with (e.g., setting pythonPath to a malicious executable or script).
RiaSumoExplorerConnector.cpp [62-123]

Referred Code
bool RiaSumoExplorerConnector::startServer()
{
    if ( m_serverRunning )
    {
        RiaLogging::info( "Sumo Explorer server already running" );
        return true;
    }

    // Determine Python executable
    QString pythonCmd = m_pythonPath.isEmpty() ? "python" : m_pythonPath;

    // Determine server script path
    QString appPath    = QCoreApplication::applicationDirPath();
    QString serverPath = appPath + "/Python/sumo_explorer_server/sumo_explorer_server.py";

    if ( !QFile::exists( serverPath ) )
    {
        setError( QString( "Sumo Explorer server script not found: %1" ).arg( serverPath ) );
        return false;
    }



 ... (clipped 41 lines)
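One way to narrow this finding is to validate the configured interpreter path before launching anything. The sketch below is written in Python for brevity and only illustrates the checks; the function name `resolve_python_executable` and the name-based policy are assumptions, not part of the PR. The name check is a sanity check against misconfiguration, not a security boundary, since a tampered preference could still point at a malicious file named `python`.

```python
import os
import shutil
from typing import Optional


def resolve_python_executable(configured_path: str) -> Optional[str]:
    """Return an absolute path to a plausible Python interpreter, or None."""
    # Empty preference: fall back to a PATH lookup, mirroring the "python" default.
    candidate = configured_path.strip() or shutil.which("python3") or shutil.which("python")
    if not candidate:
        return None

    # Resolve symlinks so the checks apply to the file that would really run.
    resolved = os.path.realpath(candidate)

    # Must be an existing regular file with execute permission.
    if not (os.path.isfile(resolved) and os.access(resolved, os.X_OK)):
        return None

    # Assumed policy: accept only binaries whose name looks like an interpreter.
    name = os.path.basename(resolved).lower()
    if not name.startswith(("python", "pypy")):
        return None
    return resolved
```

Rejecting the path early lets the caller report a clear configuration error instead of starting an arbitrary process.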
Unauthenticated local API

Description: The FastAPI service exposes Sumo data endpoints without any authentication/authorization,
so any local user/process that can reach 127.0.0.1: can query assets/cases/ensembles and
retrieve Base64-encoded parquet data using the current user's Sumo credentials.
sumo_explorer_server.py [89-237]

Referred Code
@app.get("/assets", response_model=List[Asset])
async def get_assets():
    """Get list of available assets (fields)"""
    if sumo_client is None or not sumo_client.is_connected:
        raise HTTPException(status_code=503, detail="Not connected to Sumo")

    try:
        assets = sumo_client.get_assets()
        return assets
    except Exception as e:
        logger.error(f"Failed to get assets: {e}")
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/cases/{field_name}", response_model=List[Case])
async def get_cases(field_name: str):
    """Get cases for a specific field"""
    if sumo_client is None or not sumo_client.is_connected:
        raise HTTPException(status_code=503, detail="Not connected to Sumo")

    try:


 ... (clipped 128 lines)
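A common mitigation for a loopback-only service is a per-session shared secret: the parent process generates a random token, passes it to the server at startup, and sends it with every request. A minimal sketch of the token check, using only the standard library. The header name `X-Api-Token` and both function names are hypothetical; in FastAPI such a check would typically be wired in as a dependency, which is not shown here.

```python
import hmac
import secrets
from typing import Mapping


def generate_session_token() -> str:
    """Create an unguessable token for one server session."""
    return secrets.token_urlsafe(32)


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Check the (hypothetical) X-Api-Token header against the session token."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {key.lower(): value for key, value in headers.items()}
    supplied = normalised.get("x-api-token", "")
    # compare_digest avoids leaking the token through timing differences.
    return hmac.compare_digest(supplied.encode("utf-8"), expected_token.encode("utf-8"))
```

With this in place, another local process that can reach the port still cannot query data unless it also obtains the token.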
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Blocking call deadlock: The new blocking helper wrapAndCallNetworkRequest starts an event loop but never quits it
on network completion (and never invokes replyHandler), causing requests to hang until
timeout and leaving results unhandled.

Referred Code
void RiaSumoExplorerConnector::wrapAndCallNetworkRequest( std::function<void()>                        requestCallable,
                                                          const std::function<void( QNetworkReply* )>& replyHandler )
{
    QEventLoop eventLoop;

    QTimer timer;
    timer.setSingleShot( true );

    QObject::connect( &timer, &QTimer::timeout, [&] { RiaLogging::error( "Sumo Explorer request timed out." ); } );
    QObject::connect( &timer, &QTimer::timeout, &eventLoop, &QEventLoop::quit );

    // Call the function that will execute the request
    requestCallable();

    timer.start( RiaSumoExplorerDefines::requestTimeoutMillis() );
    eventLoop.exec( QEventLoop::ProcessEventsFlag::ExcludeUserInputEvents );
}
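The intended control flow is "wake up on completion or on timeout, whichever comes first, then run the reply handler". The sketch below shows that flow in Python with `threading.Event`; it is not Qt code and all names are hypothetical. In the Qt version the equivalent of `done.set()` would be connecting the reply's finished signal to the event loop's quit slot.

```python
import threading
from typing import Any, Callable, Dict


def call_blocking(
    start_request: Callable[[Callable[[Any], None]], None],
    reply_handler: Callable[[Any], None],
    timeout_s: float,
) -> bool:
    """Start a request, wait for completion or timeout, then handle the reply."""
    done = threading.Event()
    result: Dict[str, Any] = {}

    def on_finished(reply: Any) -> None:
        result["reply"] = reply
        done.set()  # The step missing in the reviewed code: completion ends the wait.

    start_request(on_finished)

    if not done.wait(timeout_s):
        return False  # Timed out; the reply handler is not invoked.

    reply_handler(result["reply"])  # The second missing step: deliver the result.
    return True
```

A request that completes in 50 ms returns after about 50 ms rather than after the full timeout, and the handler always sees the reply.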


Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Leaks internal errors: Multiple endpoints return HTTPException(..., detail=str(e)), exposing internal exception
messages to API consumers instead of providing a generic user-facing error while logging
details securely.

Referred Code
@app.get("/assets", response_model=List[Asset])
async def get_assets():
    """Get list of available assets (fields)"""
    if sumo_client is None or not sumo_client.is_connected:
        raise HTTPException(status_code=503, detail="Not connected to Sumo")

    try:
        assets = sumo_client.get_assets()
        return assets
    except Exception as e:
        logger.error(f"Failed to get assets: {e}")
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/cases/{field_name}", response_model=List[Case])
async def get_cases(field_name: str):
    """Get cases for a specific field"""
    if sumo_client is None or not sumo_client.is_connected:
        raise HTTPException(status_code=503, detail="Not connected to Sumo")

    try:


 ... (clipped 128 lines)
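One way to address this is to log the full exception server-side and return only a generic message with a correlation id. A minimal sketch, assuming a helper named `to_public_error` (hypothetical) whose returned message would be used as the `detail` of the 500 response instead of `str(e)`.

```python
import logging
import uuid
from typing import Tuple

logger = logging.getLogger("sumo_explorer_server")


def to_public_error(exc: Exception, action: str) -> Tuple[str, str]:
    """Log the exception in full and return (error_id, generic client message)."""
    error_id = uuid.uuid4().hex
    # The traceback and exception text go to the server log only.
    logger.error("%s failed [error_id=%s]", action, error_id, exc_info=exc)
    # The client sees the action and a reference id, never the exception text.
    message = f"Internal error during '{action}'. Reference: {error_id}"
    return error_id, message
```

The reference id lets a user report a failure that can be matched to the detailed log entry without exposing paths, tokens, or stack traces over the API.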


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
No authz/authn: The newly added FastAPI server exposes data endpoints (e.g., /assets, /summary/data)
without any authentication/authorization controls, relying only on localhost binding and
thereby failing the checklist’s requirement for proper access control on external inputs.

Referred Code
app = FastAPI(
    title="Sumo Explorer API",
    description="REST API wrapper for Sumo Explorer",
    version="1.0.0",
    lifespan=lifespan,
)


@app.get("/health", response_model=HealthResponse)
async def health_check():
    """Health check endpoint"""
    return HealthResponse(status="healthy", version="1.0.0")


@app.get("/status", response_model=StatusResponse)
async def status():
    """Get Sumo connection status"""
    if sumo_client is None:
        return StatusResponse(
            connected=False, environment=None, error="Sumo client not initialized"
        )


 ... (clipped 157 lines)


Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No user context: The new server auto-start and server lifecycle logging does not include any user
identifier or equivalent context needed to reconstruct “who initiated” the critical
action.

Referred Code
// Start Sumo Explorer server if auto-start is enabled
if ( preferences->sumoExplorerPreferences()->autoStartServer() )
{
    RiaLogging::info( "Auto-starting Sumo Explorer server..." );
    auto* connector = RiaApplication::instance()->makeSumoExplorerConnector();
    if ( connector && !connector->isServerRunning() )
    {
        if ( connector->startServer() )
        {
            RiaLogging::info( "Sumo Explorer server started successfully" );
        }
        else
        {
            RiaLogging::warning( QString( "Failed to auto-start Sumo Explorer server: %1" ).arg( connector->lastError() ) );
        }
    }
}
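To show what "user context" could look like in a lifecycle log line, here is a small Python sketch that prefixes the action with the OS user name. The `key=value` format and both function names are assumptions for illustration; the C++ code would need an equivalent source for the user name.

```python
import getpass


def current_user() -> str:
    """Best-effort OS user name; falls back to "unknown" if it cannot be determined."""
    try:
        return getpass.getuser()
    except Exception:
        # getpass can fail in stripped-down environments without a user database.
        return "unknown"


def audit_message(action: str, outcome: str) -> str:
    """Build a log line that records who triggered a lifecycle action."""
    return f"user={current_user()} action={action} outcome={outcome}"
```

For example, `audit_message("auto_start_server", "success")` yields a line starting with `user=` that can be passed to the existing logger.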


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Logs server stderr: The connector logs raw Python server stdout/stderr (including on failures), which may
contain sensitive details (e.g., environment, tokens, stack traces) depending on
server/library behavior and thus needs review/redaction.

Referred Code
// Read any error output before stopping
if ( m_serverProcess )
{
    QString stdErr = QString::fromUtf8( m_serverProcess->readAllStandardError() );
    QString stdOut = QString::fromUtf8( m_serverProcess->readAllStandardOutput() );

    if ( !stdErr.isEmpty() )
    {
        RiaLogging::error( QString( "Server stderr: %1" ).arg( stdErr ) );
    }
    if ( !stdOut.isEmpty() )
    {
        RiaLogging::info( QString( "Server stdout: %1" ).arg( stdOut ) );
    }
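If the raw server output is to be kept in the log, a redaction pass over the text before logging reduces the risk. A minimal Python sketch of such a filter; the two patterns (bearer tokens and `key=value` style secrets) are assumptions about what might appear in the output and are certainly not exhaustive.

```python
import re

# Each entry: (compiled pattern, replacement). Patterns are illustrative only.
_PATTERNS = [
    # "Bearer <token>" as seen in Authorization headers.
    (re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._~+/=-]+"), r"\1[REDACTED]"),
    # token=..., password: ..., api_key=... and similar key/value pairs.
    (re.compile(r"(?i)(token|password|secret|api[_-]?key)(\s*[=:]\s*)\S+"), r"\1\2[REDACTED]"),
]


def redact(text: str) -> str:
    """Replace likely credentials in process output with a placeholder."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Applied to stderr and stdout before they reach the logger, this keeps the diagnostic value of the output while masking the most obvious credential formats.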


Compliance status legend
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label


qodo-code-review bot commented Feb 8, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
High-level
Consider a simpler C++/Python integration

Instead of using a local FastAPI server for C++/Python communication, consider a
more direct integration with a library like Pybind11. This would simplify the
architecture by removing the need for process management and an HTTP layer.

Examples:

ApplicationLibCode/Application/Tools/Cloud/RiaSumoExplorerConnector.cpp [62-156]
bool RiaSumoExplorerConnector::startServer()
{
    if ( m_serverRunning )
    {
        RiaLogging::info( "Sumo Explorer server already running" );
        return true;
    }

    // Determine Python executable
    QString pythonCmd = m_pythonPath.isEmpty() ? "python" : m_pythonPath;

 ... (clipped 85 lines)
ApplicationLibCode/Application/Tools/Cloud/RiaSumoExplorerConnector.cpp [245-250]
void RiaSumoExplorerConnector::requestAssetsBlocking()
{
    auto requestCallable = [this]() { requestAssets(); };
    auto replyHandler    = [this]( QNetworkReply* reply ) { parseAssetsReply( reply ); };
    wrapAndCallNetworkRequest( requestCallable, replyHandler );
}

Solution Walkthrough:

Before:

// C++ Connector
class RiaSumoExplorerConnector {
public:
    bool startServer() {
        m_serverProcess = new QProcess();
        // ... setup process arguments for "uvicorn", working directory, etc. ...
        m_serverProcess->start("python", args);
        m_serverProcess->waitForStarted();
        return waitForServerReady(); // Ping /health endpoint
    }

    void requestAssetsBlocking() {
        // ... make HTTP GET request to "http://127.0.0.1:54527/assets" ...
        // ... wait for reply and parse JSON response ...
    }
private:
    QProcess* m_serverProcess;
    QNetworkAccessManager* m_networkManager;
};

After:

// C++ Wrapper using Pybind11
class SumoExplorerWrapper {
public:
    SumoExplorerWrapper() {
        pybind11::gil_scoped_acquire acquire;
        // from fmu.sumo.explorer import Explorer
        auto explorer_module = pybind11::module_::import("fmu.sumo.explorer");
        m_explorer = explorer_module.attr("Explorer")("prod");
    }

    std::vector<Asset> get_assets() {
        pybind11::gil_scoped_acquire acquire;
        // assets = self._explorer.cases
        pybind11::list py_assets = m_explorer.attr("cases");
        // ... convert py_assets to std::vector<Asset> ...
        return assets;
    }
private:
    pybind11::object m_explorer;
};
Suggestion importance[1-10]: 9

Why: This is a high-impact architectural suggestion that proposes a valid alternative (Pybind11) to the complex process and network management introduced, which could significantly simplify the code, improve performance, and increase robustness.

High
Possible issue
Correct implementation of blocking requests

Refactor blocking data request methods like requestAssetsBlocking to fix a
performance issue. The current implementation causes them to always wait for a
full timeout. The fix involves creating the QNetworkRequest and QNetworkReply
directly within these methods and then using a corrected helper function to
block until the reply is finished.

ApplicationLibCode/Application/Tools/Cloud/RiaSumoExplorerConnector.cpp [245-250]

 void RiaSumoExplorerConnector::requestAssetsBlocking()
 {
-    auto requestCallable = [this]() { requestAssets(); };
-    auto replyHandler    = [this]( QNetworkReply* reply ) { parseAssetsReply( reply ); };
-    wrapAndCallNetworkRequest( requestCallable, replyHandler );
+    if ( !m_serverRunning )
+    {
+        RiaLogging::error( "Sumo Explorer server not running. Please start the server first." );
+        return;
+    }
+
+    QString url = makeUrl( "/assets" );
+    RiaLogging::debug( QString( "Requesting assets from: %1" ).arg( url ) );
+
+    QNetworkRequest request( url );
+    QNetworkReply*  reply = m_networkManager->get( request );
+
+    wrapAndCallNetworkRequest( reply );
+
+    if ( reply->error() == QNetworkReply::NoError )
+    {
+        parseAssetsReply( reply );
+    }
+    else
+    {
+        RiaLogging::error( QString( "Failed to request assets from %1: %2" ).arg( url ).arg( reply->errorString() ) );
+        if ( reply->error() == QNetworkReply::ConnectionRefusedError )
+        {
+            m_serverRunning = false;
+            RiaLogging::error( "Server connection refused. The server may have stopped or failed to start properly." );
+        }
+    }
+    reply->deleteLater();
 }
Suggestion importance[1-10]: 8

Why: This suggestion correctly identifies a significant performance bug where blocking network calls always wait for the full timeout. It provides a comprehensive fix for the pattern, which improves application responsiveness.

Medium
Avoid blocking the server's event loop

To prevent blocking the server's event loop, run expensive I/O operations in the
FastAPI server within a separate thread pool. Use
fastapi.concurrency.run_in_threadpool for functions like get_summary_data and
get_parameters.

GrpcInterface/Python/sumo_explorer_server/sumo_explorer_server.py [165-201]

+from fastapi.concurrency import run_in_threadpool
+...
 @app.get("/summary/data", response_model=SummaryDataResponse)
 async def get_summary_data(
     case_id: str = Query(..., description="Case ID"),
     ensemble: str = Query(..., description="Ensemble name"),
     vector: str = Query(..., description="Vector name"),
 ):
     """Get summary data as Base64-encoded parquet"""
     if sumo_client is None or not sumo_client.is_connected:
         raise HTTPException(status_code=503, detail="Not connected to Sumo")
 
     try:
-        # Get parquet data
-        parquet_bytes = sumo_client.get_summary_data(case_id, ensemble, vector)
+        # Get parquet data in a thread to avoid blocking the event loop
+        parquet_bytes = await run_in_threadpool(
+            sumo_client.get_summary_data, case_id, ensemble, vector
+        )
 
         if not parquet_bytes:
             raise HTTPException(
                 status_code=404, detail=f"No data found for vector {vector}"
             )
 
         # Encode as Base64
         data_base64 = base64.b64encode(parquet_bytes).decode("utf-8")
 
         # Estimate row count (parquet metadata would be better, but this is simpler)
         row_count = len(parquet_bytes) // 100  # Rough estimate
 
         return SummaryDataResponse(
             case_id=case_id,
             ensemble_name=ensemble,
             vector_name=vector,
             data_base64=data_base64,
             row_count=row_count,
         )
     except HTTPException:
         raise
     except Exception as e:
         logger.error(f"Failed to get summary data for {case_id}/{ensemble}/{vector}: {e}")
         raise HTTPException(status_code=500, detail=str(e))

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

Why: This suggestion correctly identifies that blocking I/O calls in an async function will block the server's event loop, which is a critical performance issue. The proposed solution using run_in_threadpool is the standard and correct way to handle this in FastAPI, significantly improving server responsiveness.

Medium
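The effect of moving a blocking call off the event loop can be demonstrated with the standard library alone. The sketch below uses `asyncio.to_thread` (Python 3.9+), which plays the same role here as `run_in_threadpool`; `blocking_fetch` is a hypothetical stand-in for the Sumo client call and simply sleeps.

```python
import asyncio
import time
from typing import List, Tuple


def blocking_fetch(delay_s: float) -> bytes:
    """Stand-in for a synchronous client call that blocks on I/O."""
    time.sleep(delay_s)
    return b"parquet-bytes"


async def heartbeat(ticks: List[float], interval_s: float) -> None:
    """Record a timestamp on every loop iteration; only runs if the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval_s)


async def fetch_without_blocking(delay_s: float = 0.2) -> Tuple[bytes, int]:
    """Run the blocking call in a worker thread while the event loop keeps ticking."""
    ticks: List[float] = []
    beat = asyncio.create_task(heartbeat(ticks, 0.01))
    try:
        data = await asyncio.to_thread(blocking_fetch, delay_s)
    finally:
        beat.cancel()
    return data, len(ticks)
```

Running `asyncio.run(fetch_without_blocking())` returns the data together with a tick count well above one, showing that other coroutines kept running. Calling `blocking_fetch` directly inside the coroutine would stall the heartbeat for the full delay.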
Use the correct API for fetching assets

Refactor the get_assets method in sumo_client.py to use the direct
self._explorer.get_assets() API call. This is more efficient and correct than
the current implementation, which derives assets by iterating through all cases.

GrpcInterface/Python/sumo_explorer_server/sumo_client.py [60-88]

 def get_assets(self) -> List[Asset]:
     """
     Get list of available assets (fields)
 
     Returns:
         List of Asset objects
     """
     if not self._connected or self._explorer is None:
         logger.warning("Not connected to Sumo")
         return []
 
     try:
-        # Get all cases and extract unique assets
-        cases = self._explorer.cases
-        assets_dict = {}
-
-        for case in cases:
-            asset_name = getattr(case, "name", "Unknown")
-            asset_id = getattr(case, "uuid", asset_name)
-
-            if asset_id not in assets_dict:
-                assets_dict[asset_id] = Asset(
-                    asset_id=asset_id, kind="field", name=asset_name
+        # Get all assets directly
+        sumo_assets = self._explorer.get_assets()
+        assets = []
+        for asset in sumo_assets:
+            assets.append(
+                Asset(
+                    asset_id=asset.uuid,
+                    kind="field",
+                    name=asset.name,
                 )
-
-        return list(assets_dict.values())
+            )
+        return assets
     except Exception as e:
         logger.error(f"Failed to get assets: {e}")
         return []
Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies an inefficient and potentially incorrect implementation for fetching assets. Using the direct API call self._explorer.get_assets() is more efficient and robust, improving both performance and correctness.

Medium
