
TinyModel

Tiny, deployable text classification baseline for rapid product iteration

Model · Space · Hub · Live preview

TinyModel1 is a practical starter model line for text classification. End users consume the deployed Hugging Face model and Space endpoints. The maintainer deployment policy lives in texts/HUGGING_FACE_DEPLOYMENT_INTERNAL.md.

Repository: HyperlinksSpace/TinyModel

TinyModel1 on Hugging Face

Availability in Russia

Some features may not work reliably from Russia, for example the live preview or other flows that depend on third-party hosts or regions that are blocked or throttled. If you hit that, you can try third-party tools such as the free tier of 1VPN (browser extension or app) or Happ (paid subscription). One place people buy Happ subscriptions is this Telegram bot. These are all third-party services; use them at your own discretion and follow applicable laws.

Model card (README): On the Hub, the model card is the README.md file at the root of the model repo (same URL as the model). In this repository, the card template is implemented by write_model_card() in scripts/train_tinymodel1_agnews.py; training writes README.md and artifact.json next to the weights. We do not run CI that republishes the card by downloading full model weights into the repo or runner caches; to update the card, either retrain and publish, or edit README.md directly on the Hub and leave the weights unchanged.

1) Local testing

Train locally after cloning the repo:

python scripts/train_tinymodel1_agnews.py --output-dir .tmp/TinyModel-local

Quick local inference sanity check:

python -c "from transformers import pipeline; p=pipeline('text-classification', model='.tmp/TinyModel-local', tokenizer='.tmp/TinyModel-local'); print(p('Stocks rallied after central bank comments', top_k=None))"

Expected local output folder:

  • .tmp/TinyModel-local/model.safetensors
  • .tmp/TinyModel-local/config.json
  • .tmp/TinyModel-local/tokenizer.json
  • .tmp/TinyModel-local/README.md
  • .tmp/TinyModel-local/artifact.json
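A quick stdlib sanity check for that output folder (a sketch, not a repo script; the file list simply mirrors the expected outputs above):

```python
from pathlib import Path

# Files a successful local training run is expected to produce
# (mirrors the list above).
EXPECTED = [
    "model.safetensors",
    "config.json",
    "tokenizer.json",
    "README.md",
    "artifact.json",
]

def missing_artifacts(output_dir: str) -> list[str]:
    """Return the expected artifact files that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_artifacts(".tmp/TinyModel-local")
    print("OK" if not missing else f"missing: {missing}")
```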

2) Using the Hub model and Space

Load the published model by id (no local files required):

python -c "from transformers import pipeline; p=pipeline('text-classification', model='HyperlinksSpace/TinyModel1', tokenizer='HyperlinksSpace/TinyModel1'); print(p('Stocks rallied after central bank comments', top_k=None))"

Or open the demo: direct app · on the Hub.

Quick checks:

  • Space loads; inference returns labels and scores; no errors in Space logs.
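With top_k=None, a transformers text-classification pipeline returns a list of {label, score} dicts per input. A small helper for picking the winning class (the helper and the scores below are illustrative, not repo code or real model output):

```python
def top_label(predictions: list[dict]) -> tuple[str, float]:
    """Pick the highest-scoring label from a top_k=None pipeline result."""
    best = max(predictions, key=lambda d: d["score"])
    return best["label"], best["score"]

# Illustrative output shape for a 4-class AG News model;
# the scores here are made up for the example.
example = [
    {"label": "Business", "score": 0.91},
    {"label": "World", "score": 0.04},
    {"label": "Sports", "score": 0.03},
    {"label": "Sci/Tech", "score": 0.02},
]
print(top_label(example))  # ('Business', 0.91)
```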

3) GitHub Actions workflows

Workflow definitions live under .github/workflows/. Trigger them from Actions → select the workflow → Run workflow. Runners use ubuntu-latest unless you change the workflow.

Repository secrets (Settings → Secrets and variables → Actions)

Configure these once per repository (or organization). They are not committed to git.

  • HF_TOKEN (used by all workflows below): Hugging Face access token with write permission to create/update models and Spaces in the target namespace.
  • KAGGLE_USERNAME (used by train-via-kaggle-to-hf.yml only): your Kaggle username (same value as in Kaggle Account → API).
  • KAGGLE_KEY (used by train-via-kaggle-to-hf.yml only): Kaggle API key from Account → Create New API Token.

No other GitHub secrets are read by these workflows. Internal step outputs (GITHUB_ENV) such as KAGGLE_OWNER / KAGGLE_KERNEL_SLUG are set automatically during the Kaggle run.

Core flows (validated on the GitHub Actions free tier)

  • Deploy versioned Space to Hugging Face: deploy-hf-space-versioned.yml
  • Train on Hugging Face Jobs and publish versioned model: train-hf-job-versioned.yml
  • deploy-hf-space-versioned.yml — Builds the Gradio Space with scripts/build_space_artifact.py and uploads {namespace}/TinyModel{version}Space.

    • Secrets: HF_TOKEN.
    • Workflow inputs: version, namespace, model_id (for example HyperlinksSpace/TinyModel1).
  • train-hf-job-versioned.yml — Submits training on Hugging Face Jobs, then publishes {namespace}/TinyModel{version}.

    • Secrets: HF_TOKEN (also passed into the remote job so it can run publish_hf_artifact.py).
    • Workflow inputs: version, namespace, optional commit_sha (empty = current workflow SHA), flavor, timeout, max_train_samples, max_eval_samples, epochs, batch_size, learning_rate.
    • If Hugging Face returns 402 Payment Required for Jobs, add billing/credits on your HF account or train locally and publish with scripts/publish_hf_artifact.py (see texts/HUGGING_FACE_DEPLOYMENT_INTERNAL.md).
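Both versioned workflows publish to ids of the form {namespace}/TinyModel{version}. A hypothetical helper (not one of the repo scripts) showing how such an id is assembled and lightly validated:

```python
import re

def model_repo_id(namespace: str, version: int) -> str:
    """Build a Hub id of the form {namespace}/TinyModel{version}."""
    # Namespace must look like a Hub user/org name; this pattern is an
    # assumption for the sketch, not the Hub's exact validation rule.
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", namespace):
        raise ValueError(f"invalid namespace: {namespace!r}")
    if version < 1:
        raise ValueError("version must be a positive integer")
    return f"{namespace}/TinyModel{version}"

print(model_repo_id("HyperlinksSpace", 1))  # HyperlinksSpace/TinyModel1
```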

Optional: train via Kaggle

  • Train via Kaggle and publish to Hugging Face: train-via-kaggle-to-hf.yml
  • train-via-kaggle-to-hf.yml — Creates a Kaggle kernel run, trains, downloads outputs, and pushes {namespace}/TinyModel{version} to the Hub.
    • Secrets: KAGGLE_USERNAME, KAGGLE_KEY, and HF_TOKEN (for upload to Hugging Face).
    • Workflow inputs: version, namespace, max_train_samples, max_eval_samples, epochs, batch_size, learning_rate.
    • External quota: Kaggle GPU/CPU weekly limits and any Kaggle compute credits your account uses; not covered by GitHub Actions alone.

4) Further development

Illustrative directions for evolving the TinyModel line (pick what matches your product goals):

  • Accuracy and capacity — Train on more AG News samples or epochs; adjust the tiny BERT config (depth, width, sequence length); add LR schedules, warmup, or regularization suited to your budget.
  • Domains and label sets — Fine-tune on proprietary or niche corpora; replace the four AG News classes with your own taxonomy and a labeled dataset.
  • Shipping inference — Document ONNX or quantized exports for edge and serverless; add batch-inference examples; optionally wire a Hugging Face Inference Endpoint for a stable HTTP API.
  • Space and API UX — Batch inputs, per-class thresholds, richer examples, or client snippets (Python and JavaScript) for integrators.
  • Evaluation discipline — Fixed test split, confusion matrix, calibration, and versioned eval reports alongside artifact.json.
  • Repository hygiene — Lightweight CI (lint, script smoke tests) that never pulls large weights; optional Hub Collections or docs that link model, Space, and release notes.

Nothing here is committed on a fixed timeline; treat it as a backlog of sensible next steps for a small classification stack.
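As one concrete instance of the evaluation-discipline bullet, a stdlib-only confusion-matrix sketch (the helper and the sample labels are illustrative; the repo does not ship this code):

```python
from collections import Counter

def confusion_matrix(y_true: list[str], y_pred: list[str]) -> Counter:
    """Count (gold, predicted) label pairs; keys are (gold, pred) tuples."""
    return Counter(zip(y_true, y_pred))

# Made-up predictions over the four AG News classes, for illustration only.
gold = ["World", "Sports", "Business", "Business", "Sci/Tech"]
pred = ["World", "Sports", "Business", "World", "Sci/Tech"]
cm = confusion_matrix(gold, pred)
print(cm[("Business", "World")])  # 1 misclassified Business article
```

Dumping such counts alongside artifact.json for each version gives a cheap, diffable eval report.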
