Question about reproducing inference timing in Fig. 3 #28

@taulabsio

Dear authors,

I am trying to reproduce the inference-time comparison shown in Fig. 3 of your paper "ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model", and I have a follow-up question about the reported efficiency.

Using what I believe is a comparable setup, I measure an inference time of around 0.07 seconds per 1-second audio clip. In Fig. 3, however, your method appears to achieve approximately 0.01 seconds, a gap of roughly 7x. I would really appreciate some clarification on how to reach that level of efficiency.

In particular, could you help clarify:

  1. Scope of timing:
    Does the reported 0.01 s include only the ARTalk motion generation network, or also upstream/downstream steps such as HuBERT feature extraction, VQ decoding, FLAME reconstruction, or rendering?

  2. Input setup:
    Was the timing measured strictly on a 1-second clip, or within a larger temporal window (e.g., 4 seconds / 100 frames) with amortized cost?

  3. Measurement methodology:
    Were results averaged over multiple runs with warm-up iterations excluded?
    Was GPU synchronization (e.g., torch.cuda.synchronize()) used to ensure accurate timing?

  4. Inference configuration:
    Was inference performed with model.eval() and torch.no_grad()?
    Was mixed precision (FP16) or any other optimization (e.g., TensorRT) used?

  5. Hardware consistency:
    I assume the reported number corresponds to an A100 GPU. Could you confirm this, and tell me whether any specific settings (batch size, CUDA version, etc.) are important for reproducing the result?

  6. Code reference:
    If available, could you point me to the exact script or section of the code used to generate the timing results in Fig. 3?

I want to ensure that I am measuring the same portion of the pipeline and using a fair setup, so any guidance would be extremely helpful.
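
To make the questions above concrete, here is a minimal sketch of the measurement pattern I have in mind for items 2 to 4: warm-up runs are discarded, the device is synchronized before and after each timed call, and the result is normalized by clip length to give an amortized cost per second of audio. This is not your script, only an illustration. The names `benchmark`, `seconds_per_audio_second`, and `fake_inference` are hypothetical, and the placeholder workload stands in for the real forward pass.

```python
import statistics
import time


def benchmark(fn, warmup=5, runs=20, sync=None):
    """Time fn() `runs` times after `warmup` untimed calls.

    `sync` is an optional zero-argument callable that blocks until all
    queued device work has finished (hypothetical hook; on a CUDA device
    this would typically be torch.cuda.synchronize).
    Returns a list of per-call wall-clock times in seconds.
    """
    def _sync():
        if sync is not None:
            sync()

    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn()
    _sync()

    times = []
    for _ in range(runs):
        _sync()                       # nothing pending before the clock starts
        start = time.perf_counter()
        fn()
        _sync()                       # wait for the work to actually finish
        times.append(time.perf_counter() - start)
    return times


def seconds_per_audio_second(times, clip_seconds):
    """Amortized cost: mean time per call divided by the clip length."""
    return statistics.mean(times) / clip_seconds


if __name__ == "__main__":
    # Placeholder workload standing in for one inference call.
    def fake_inference():
        sum(i * i for i in range(10_000))

    times = benchmark(fake_inference, warmup=3, runs=10)
    print(f"mean per call: {statistics.mean(times):.6f} s")
    print(f"per audio second (4 s clip): {seconds_per_audio_second(times, 4.0):.6f} s")
```

With PyTorch on a GPU, `fn` would wrap the forward pass under `model.eval()` and `torch.no_grad()`, and `sync` would be `torch.cuda.synchronize`. Whether `fn` covers only motion generation or also feature extraction and decoding is exactly what item 1 asks about.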

Thank you very much for your time and for your excellent work.

@xg-chu @latifah221b
