Explore alternatives to lftp jobs -v parsing for transfer status #294

@nitrobass24

Summary

The lftp jobs -v parser is the most fragile part of the codebase. PTY line-wrapping causes recurring crashes because lftp's verbose output contains long lines (filenames + chunk progress) that break at arbitrary byte positions. We've fixed these reactively (#253, #258, #260, #290, #293) but the root cause remains.

This issue explores alternatives to eliminate or reduce dependence on jobs -v parsing.

Current architecture

pexpect spawns lftp in a PTY
  → sends "jobs -v"
  → reads output from PTY
  → LftpJobStatusParser parses 500+ line output
  → builds per-file transfer states (speed, ETA, chunk progress)
  → ModelBuilder uses transfer states to set file DOWNLOADING/QUEUED status

The PTY has a column width. Despite setting COLUMNS=10000 and setwinsize(24, 10000), some environments (Unraid, certain SSH configs) override these, causing line wrapping.

What jobs -v provides

| Data | Source | Used for |
|---|---|---|
| Job ID, type, state | jobs (no -v needed) | Job tracking |
| Overall mirror progress (4.7G/29G 16%) | jobs (no -v needed) | UI progress bar |
| Overall speed and ETA | jobs (no -v needed) | UI speed display |
| Per-file transfer state (DOWNLOADING) | -v only | Individual file status in UI |
| Per-file speed and ETA | -v only | Per-file ETA in UI |
| Chunk positions | -v only | Not used directly (already available in .lftp-pget-status files) |

Key insight

The lines that crash the parser are all from -v verbose output — \chunk, \transfer, and their associated progress lines with long filenames. The top-level jobs output (job headers, overall progress) uses short lines that don't wrap.
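
To make the failure mode concrete, here is a minimal sketch of how terminal wrapping defeats a line-at-a-time parser. The regex and the sample progress line are illustrative assumptions: they approximate the shape of lftp's verbose output and are not copied from LftpJobStatusParser or from a captured session.

```python
import re

# Hypothetical single-line pattern for a per-file progress line.
# The real LftpJobStatusParser patterns are not shown in this issue.
PROGRESS_RE = re.compile(r"^\s*`(?P<name>.+)' at (?P<pos>\d+) \((?P<pct>\d+)%\)")


def hard_wrap(text, columns):
    """Split text into fixed-width rows, the way a terminal wraps a long line."""
    return [text[i:i + columns] for i in range(0, len(text), columns)]


def parse_lines(lines):
    """Return (matched, unmatched) for a parser that looks at one line at a time."""
    matched, unmatched = [], []
    for line in lines:
        m = PROGRESS_RE.match(line)
        if m:
            matched.append(m.groupdict())
        else:
            unmatched.append(line)
    return matched, unmatched


# Illustrative sample only: a long filename followed by progress fields.
long_name = "Some.Very.Long.Release.Name." + "x" * 120 + ".mkv"
line = f"    `{long_name}' at 1048576 (12%) 1.2M/s eta:5m"

# Wide terminal: the line arrives whole and parses.
ok, bad = parse_lines(hard_wrap(line, 10000))

# 80-column terminal: the same line arrives as fragments, none of which match.
wrapped_ok, wrapped_bad = parse_lines(hard_wrap(line, 80))
```

With 10000 columns the line matches once. At 80 columns it arrives as three fragments and none of them matches, so the parser sees three lines it cannot classify.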

Options

Option A: Drop -v flag (simplest)

Call jobs instead of jobs -v. We keep the top-level job info (ID, type, overall progress, speed, ETA) and lose per-file chunk detail.

What we lose:

  • Per-file DOWNLOADING vs QUEUED distinction within a directory mirror
  • Per-file speed and ETA

What we keep:

  • Overall transfer speed and ETA
  • Job state (running, queued)
  • Overall progress percentage

How to compensate:

  • Per-file progress: read from .lftp-pget-status files locally (we already do this in SystemScanner for partial file sizes)
  • Per-file state: infer from local file size changes between scan intervals (file growing = DOWNLOADING)
  • Per-file ETA: derive from size delta over time

Impact: Eliminates all chunk/transfer line parsing. Parser becomes trivial. No more PTY wrapping crashes.
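
The compensation ideas above can be sketched as follows. This is a rough illustration, not existing code: FileSample and infer_transfer are hypothetical names, and the downloaded byte count is assumed to come from whatever the scanner already reports for partial files.

```python
from dataclasses import dataclass


@dataclass
class FileSample:
    downloaded: int    # bytes downloaded so far, as reported by the scanner
    timestamp: float   # seconds, e.g. from time.monotonic()


def infer_transfer(prev, curr, total_size):
    """Infer (state, speed in bytes/s, eta in seconds) from two scan samples.

    Hypothetical helper: a file that grew between scans is treated as
    DOWNLOADING, anything else as QUEUED.
    """
    elapsed = curr.timestamp - prev.timestamp
    grown = curr.downloaded - prev.downloaded
    if elapsed <= 0 or grown <= 0:
        # No growth observed: cannot tell a stalled transfer from a queued one.
        return ("QUEUED", 0.0, None)
    speed = grown / elapsed
    remaining = max(total_size - curr.downloaded, 0)
    return ("DOWNLOADING", speed, remaining / speed)


# Example: 2000 bytes gained in 2 seconds, 10000 bytes still to go.
state, speed, eta = infer_transfer(
    FileSample(downloaded=1000, timestamp=0.0),
    FileSample(downloaded=3000, timestamp=2.0),
    total_size=13000,
)
```

One limitation of this approach: a stalled transfer and a queued file look identical, and the speed is only as fresh as the scan interval.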

Option B: Hybrid — jobs for status + local filesystem for per-file detail

  • Use jobs (no -v) for top-level job status, speed, and ETA
  • Read .lftp-pget-status files for per-file chunk progress
  • Use LocalScanner file size deltas for per-file speed estimation
  • This is essentially Option A with explicit compensation for the lost per-file data
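
A sketch of reading chunk progress from a status file, for illustration. It assumes the .lftp-pget-status format is a size= line followed by N.pos= and N.limit= pairs per chunk; that format should be verified against real files before relying on it. parse_pget_status is a hypothetical helper, not the existing SystemScanner code.

```python
import re

# Assumed line shapes: "size=<bytes>", "<n>.pos=<bytes>", "<n>.limit=<bytes>".
_CHUNK_RE = re.compile(r"^(\d+)\.(pos|limit)=(\d+)$")


def parse_pget_status(text):
    """Parse the assumed status format into size, downloaded bytes, and chunks."""
    size = None
    chunks = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("size="):
            size = int(line[len("size="):])
            continue
        m = _CHUNK_RE.match(line)
        if m:
            index, key, value = int(m.group(1)), m.group(2), int(m.group(3))
            chunks.setdefault(index, {})[key] = value
    if size is None:
        raise ValueError("no size= line found")
    # Each chunk still has (limit - pos) bytes to fetch; skip incomplete pairs.
    remaining = sum(
        c["limit"] - c["pos"]
        for c in chunks.values()
        if "pos" in c and "limit" in c
    )
    return {"size": size, "downloaded": size - remaining, "chunks": chunks}


sample = "size=1000\n0.pos=100\n0.limit=500\n1.pos=700\n1.limit=1000\n"
status = parse_pget_status(sample)
```

Because each line of this file is short and read from disk rather than a PTY, it is not subject to terminal wrapping.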

Option C: Drop PTY entirely — pipe commands to lftp stdin

Instead of pexpect/PTY, spawn lftp with subprocess.Popen and pipe commands to stdin. No PTY means no terminal wrapping.

Considerations:

  • lftp prompts and output formatting may differ without a PTY
  • Need to handle command/response synchronization without pexpect's expect patterns
  • Larger refactor than Options A/B
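
One possible way to handle synchronization without expect patterns is to follow each command with an echo of a unique marker and read until the marker appears. The sketch below shows the idea with subprocess.Popen. PipedSession and the marker string are hypothetical, and a small Python child stands in for lftp so the example runs anywhere. Whether lftp flushes its output promptly without a PTY is untested here and would need checking.

```python
import subprocess
import sys

SENTINEL = "__CMD_DONE__"  # hypothetical marker; must never occur in real output


class PipedSession:
    """Send commands over stdin and read replies up to a sentinel line."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,
        )

    def run(self, command):
        # Queue the command, then an echo of the sentinel to mark its end.
        self.proc.stdin.write(f"{command}\necho {SENTINEL}\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if not line:
                raise RuntimeError("process exited before the sentinel was seen")
            line = line.rstrip("\n")
            if line == SENTINEL:
                return lines
            lines.append(line)

    def close(self):
        self.proc.stdin.close()
        self.proc.wait(timeout=5)
        self.proc.stdout.close()


# Stand-in child that only understands "echo <text>"; replace with the lftp
# command line in a real experiment.
STAND_IN = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for l in sys.stdin:\n"
    "    l = l.strip()\n"
    "    if l.startswith('echo '):\n"
    "        print(l[5:])\n",
]

session = PipedSession(STAND_IN)
first = session.run("echo hello")
second = session.run("echo a b")
session.close()
```

The blocking readline has no timeout, so a real implementation would also need a way to recover if the child hangs.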

Option D: Use lftp's log file

lftp supports set log:file /path/to/log. We could write transfer events to a log file and parse that instead of polling jobs -v.

Considerations:

  • Log format may not have all the data we need
  • Adds filesystem I/O instead of PTY I/O
  • Need to investigate what lftp logs contain
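
Whatever the log turns out to contain, polling it would need an incremental reader that only returns lines added since the last poll. A format-agnostic sketch follows. LogTail is a hypothetical helper, and the log content used in the demo is made up purely to exercise the reader. It says nothing about what lftp actually logs.

```python
import os
import tempfile


class LogTail:
    """Return only the complete lines appended since the previous read."""

    def __init__(self, path):
        self.path = path
        self.offset = 0
        self._partial = b""

    def read_new_lines(self):
        try:
            size = os.path.getsize(self.path)
        except FileNotFoundError:
            return []
        if size < self.offset:
            # File was truncated or replaced; start again from the top.
            self.offset = 0
            self._partial = b""
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
            self.offset = f.tell()
        data = self._partial + data
        # Everything before the last newline is complete; keep the tail for later.
        *complete, self._partial = data.split(b"\n")
        return [line.decode("utf-8", errors="replace") for line in complete]


# Demo with made-up content, including a line written in two pieces.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lftp.log")
    tail = LogTail(path)
    before_create = tail.read_new_lines()
    with open(path, "ab") as f:
        f.write(b"first event\nsecond ev")
    first_read = tail.read_new_lines()
    with open(path, "ab") as f:
        f.write(b"ent\n")
    second_read = tail.read_new_lines()
```

Holding back the trailing partial line avoids handing a half-written record to the parser, which is the file-based analogue of the wrapping problem.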

Recommendation

Start with Option A (drop the -v flag). It's a minimal change (the jobs -v command becomes jobs) and eliminates the entire class of PTY wrapping bugs. The per-file detail loss can be compensated with local filesystem data we already collect.

If per-file speed/ETA is important to users, follow up with Option B to restore that data from local sources.
