
Phoenix: Recommendation System

This repository contains JAX example code for the Phoenix recommendation system, which powers content ranking and retrieval. Phoenix uses transformer-based architectures for both retrieval (finding relevant candidates from millions of items) and ranking (ordering a smaller set of candidates by predicted engagement).

Note: The sample transformer implementation in this repository is ported from the Grok-1 open source release by xAI. The core transformer architecture comes from Grok-1, adapted here for recommendation system use cases with custom input embeddings and attention masking for candidate isolation. This code is representative of the model used internally with the exception of specific scaling optimizations.

Overview
Architecture
Key Design Decisions
Running the Code
License

Overview

Phoenix is a recommendation system that predicts user engagement (likes, reposts, replies, etc.) for content. It operates in two stages:

1. Retrieval: Efficiently narrows millions of candidates down to hundreds or thousands using approximate nearest neighbor (ANN) search
2. Ranking: Scores and orders the retrieved candidates using a more expressive transformer model

Architecture

Two-Stage Recommendation Pipeline

┌─────────────────────────────────────────────────────────────────────────────────┐
│                           RECOMMENDATION PIPELINE                               │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                 │
│   ┌──────────┐     ┌─────────────────────┐     ┌─────────────────────┐          │
│   │          │     │                     │     │                     │          │
│   │   User   │────▶│   STAGE 1:          │────▶│   STAGE 2:          │────▶ Feed│
│   │ Request  │     │   RETRIEVAL         │     │   RANKING           │          │
│   │          │     │   (Two-Tower)       │     │   (Transformer)     │          │
│   └──────────┘     │                     │     │                     │          │
│                    │   Millions → 1000s  │     │   1000s → Ranked    │          │
│                    └─────────────────────┘     └─────────────────────┘          │
│                                                                                 │
└─────────────────────────────────────────────────────────────────────────────────┘

Retrieval: Two-Tower Model

The retrieval stage uses a two-tower architecture that enables efficient similarity search at scale.

How Retrieval Works

1. User Tower: Encodes user features and engagement history through a transformer to produce a normalized user embedding [B, D]
2. Candidate Tower: Computes normalized embeddings for all items in the corpus [N, D]
3. Similarity Search: Retrieves top-K candidates using dot product similarity
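The three steps above can be sketched in a few lines. This is a minimal NumPy illustration, not the repository's implementation: the random arrays stand in for the tower outputs, `retrieve_top_k` is a hypothetical helper, and the search is exact rather than approximate. The same array operations are available in `jax.numpy`.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def retrieve_top_k(user_emb, corpus_emb, k):
    """Exact top-k by dot product (a large corpus would use ANN search instead).

    user_emb:   [B, D] normalized user embeddings
    corpus_emb: [N, D] normalized candidate embeddings
    returns (indices [B, k], scores [B, k]), best match first
    """
    scores = user_emb @ corpus_emb.T                         # [B, N]
    top_idx = np.argsort(-scores, axis=-1)[:, :k]            # [B, k]
    top_scores = np.take_along_axis(scores, top_idx, axis=-1)
    return top_idx, top_scores

rng = np.random.default_rng(0)
B, N, D, K = 2, 1000, 16, 5
user_emb = l2_normalize(rng.normal(size=(B, D)))      # stand-in for the user tower output
corpus_emb = l2_normalize(rng.normal(size=(N, D)))    # stand-in for the candidate tower output
top_idx, top_scores = retrieve_top_k(user_emb, corpus_emb, K)
```

Because both sides are normalized, every score lies in [-1, 1], and the candidate embeddings can be computed once and indexed ahead of time, independently of any user.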

Ranking: Transformer with Candidate Isolation

The ranking model uses a transformer architecture in which candidates cannot attend to each other during inference. This is a critical design choice: it ensures that the score for a candidate does not depend on which other candidates are in the batch.

Ranking Model Architecture

                              PHOENIX RANKING MODEL
    ┌────────────────────────────────────────────────────────────────────────────┐
    │                                                                            │
    │                              OUTPUT LOGITS                                 │
    │                        [B, num_candidates, num_actions]                    │
    │                                    │                                       │
    │                                    │ Unembedding                           │
    │                                    │ Projection                            │
    │                                    │                                       │
    │                    ┌───────────────┴───────────────┐                       │
    │                    │                               │                       │
    │                    │    Extract Candidate Outputs  │                       │
    │                    │    (positions after history)  │                       │
    │                    │                               │                       │
    │                    └───────────────┬───────────────┘                       │
    │                                    │                                       │
    │                    ┌───────────────┴───────────────┐                       │
    │                    │                               │                       │
    │                    │         Transformer           │                       │
    │                    │     (with special masking)    │                       │
    │                    │                               │                       │
    │                    │   Candidates CANNOT attend    │                       │
    │                    │   to each other               │                       │
    │                    │                               │                       │
    │                    └───────────────┬───────────────┘                       │
    │                                    │                                       │
    │    ┌───────────────────────────────┼───────────────────────────────┐       │
    │    │                               │                               │       │
    │    ▼                               ▼                               ▼       │
    │ ┌──────────┐              ┌─────────────────┐              ┌────────────┐  │
    │ │   User   │              │     History     │              │ Candidates │  │
    │ │Embedding │              │   Embeddings    │              │ Embeddings │  │
    │ │ [B, 1, D]│              │    [B, S, D]    │              │  [B, C, D] │  │
    │ │          │              │                 │              │            │  │
    │ │ User     │              │ Posts + Authors │              │ Posts +    │  │
    │ │ Hashes   │              │ + Actions +     │              │ Authors +  │  │
    │ │          │              │ Product Surface │              │ Product    │  │
    │ └──────────┘              └─────────────────┘              │ Surface    │  │
    │                                                            └────────────┘  │
    │                                                                            │
    └────────────────────────────────────────────────────────────────────────────┘
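To make the shapes in the diagram concrete, here is a small NumPy sketch of the data flow from bottom to top. It assumes the user embedding occupies a single sequence position of width D; `transformer_stub` and `w_unembed` are hypothetical placeholders, not names from the repository.

```python
import numpy as np

B, S, C, D, NUM_ACTIONS = 2, 4, 3, 8, 5
rng = np.random.default_rng(0)

user = rng.normal(size=(B, 1, D))          # one user position per example (assumed)
history = rng.normal(size=(B, S, D))       # S history positions
candidates = rng.normal(size=(B, C, D))    # C candidate positions

# Concatenate along the sequence axis: [user | history | candidates].
tokens = np.concatenate([user, history, candidates], axis=1)   # [B, 1+S+C, D]

def transformer_stub(x):
    """Placeholder for the masked transformer; identity keeps the shapes visible."""
    return x

hidden = transformer_stub(tokens)                   # [B, 1+S+C, D]

# Keep only the positions after the user and history tokens.
candidate_hidden = hidden[:, 1 + S:, :]             # [B, C, D]

# Unembedding projection to one logit per candidate per action.
w_unembed = rng.normal(size=(D, NUM_ACTIONS))       # hypothetical weights
logits = candidate_hidden @ w_unembed               # [B, C, NUM_ACTIONS]
```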

Attention Mask: Candidate Isolation

A key detail is the attention mask that prevents candidates from attending to each other while still allowing them to attend to the user and history:

                    ATTENTION MASK VISUALIZATION

         Keys (what we attend TO)
         ─────────────────────────────────────────────▶

         │ User │    History (S positions)    │   Candidates (C positions)    │
    ┌────┼──────┼─────────────────────────────┼───────────────────────────────┤
    │    │      │                             │                               │
    │ U  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✗   ✗   ✗    │
    │    │      │                             │                               │
    ├────┼──────┼─────────────────────────────┼───────────────────────────────┤
 Q  │    │      │                             │                               │
 u  │ H  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✗   ✗   ✗    │
 e  │ i  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✗   ✗   ✗    │
 r  │ s  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✗   ✗   ✗    │
 i  │ t  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✗   ✗   ✗    │
 e  │    │      │                             │                               │
 s  ├────┼──────┼─────────────────────────────┼───────────────────────────────┤
    │    │      │                             │  DIAGONAL ONLY (self-attend)  │
 │  │ C  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✓   ✗   ✗   ✗   ✗   ✗   ✗    │
 │  │ a  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✓   ✗   ✗   ✗   ✗   ✗    │
 │  │ n  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✓   ✗   ✗   ✗   ✗    │
 │  │ d  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✓   ✗   ✗   ✗    │
 │  │ i  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✓   ✗   ✗    │
 │  │ d  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✗   ✓   ✗    │
 ▼  │ s  │  ✓   │  ✓   ✓   ✓   ✓   ✓   ✓   ✓  │  ✗   ✗   ✗   ✗   ✗   ✗   ✓    │
    │    │      │                             │                               │
    └────┴──────┴─────────────────────────────┴───────────────────────────────┘

    ✓ = Can attend (1)          ✗ = Cannot attend (0)

    Legend:
    ├─ User + History: Full bidirectional attention among themselves
    ├─ Candidates → User/History: Candidates CAN attend to user and history  
    └─ Candidates → Candidates: Candidates CANNOT attend to each other (only self)
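The mask in the visualization can be built directly from the history length and the number of candidates. The NumPy sketch below is illustrative only: `candidate_isolation_mask` and `masked_attention` are hypothetical names, and the attention is a toy single-head layer with identity projections. It also checks the property the mask is meant to guarantee, namely that swapping out the other candidates leaves a given candidate's output unchanged.

```python
import numpy as np

def candidate_isolation_mask(history_len, num_candidates):
    """Boolean [T, T] mask with T = 1 + S + C; True means the query row may attend to the key column."""
    prefix = 1 + history_len                     # user + history positions
    total = prefix + num_candidates
    mask = np.zeros((total, total), dtype=bool)
    mask[:, :prefix] = True                      # every query sees user + history
    cand = np.arange(prefix, total)
    mask[cand, cand] = True                      # each candidate also sees itself
    return mask

def masked_attention(x, mask):
    """Toy single-head self-attention with identity projections (illustration only)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    scores = np.where(mask, scores, -np.inf)     # blocked pairs get zero weight
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

S, C, D = 4, 3, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(1 + S + C, D))
mask = candidate_isolation_mask(S, C)
out = masked_attention(x, mask)

# Replace every candidate except the first and run the layer again.
first = 1 + S
x_swapped = x.copy()
x_swapped[first + 1:] = rng.normal(size=(C - 1, D))
out_swapped = masked_attention(x_swapped, mask)

unchanged = np.allclose(out[first], out_swapped[first])   # first candidate is unaffected
```

The same argument carries across stacked layers: user and history positions never read from candidates, so the context each candidate attends to is identical no matter which other candidates share the batch.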

Key Design Decisions

1. Hash-Based Embeddings

Both the retrieval and ranking models use multiple hash functions for embedding lookup.
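One common way to realize this idea is sketched below in NumPy. Every detail here is an assumption made for illustration: the hash family (`HASH_A`, `HASH_B`), the table sizes, the use of one table per hash function, and the choice to sum the looked-up rows are not taken from the repository.

```python
import numpy as np

NUM_HASHES, NUM_BUCKETS, D = 2, 1000, 8

# Hypothetical multipliers and offsets, one pair per hash function.
HASH_A = np.array([2654435761, 40503], dtype=np.uint64)
HASH_B = np.array([12345, 67891], dtype=np.uint64)

rng = np.random.default_rng(0)
tables = rng.normal(size=(NUM_HASHES, NUM_BUCKETS, D))   # one embedding table per hash function

def hash_ids(ids):
    """Map non-negative integer IDs [n] to bucket indices [n, NUM_HASHES]."""
    ids = np.asarray(ids, dtype=np.uint64)[:, None]
    # uint64 arithmetic wraps on overflow, which is fine for hashing.
    return ((ids * HASH_A + HASH_B) % np.uint64(NUM_BUCKETS)).astype(np.int64)

def embed(ids):
    """Look up one row per hash function and sum them (summing is an assumption)."""
    buckets = hash_ids(ids)                               # [n, NUM_HASHES]
    rows = tables[np.arange(NUM_HASHES), buckets]         # [n, NUM_HASHES, D]
    return rows.sum(axis=1)                               # [n, D]

vectors = embed([101, 202, 303])                          # [3, D]
```

With several hash functions, two IDs that collide in one table are unlikely to collide in all of them, so they still receive distinct combined embeddings while the tables stay much smaller than the ID space.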

2. Shared Architecture

The retrieval user tower uses the same transformer architecture as the ranking model.

3. Multi-Action Prediction

The ranking model predicts multiple engagement types simultaneously:

Output: [B, num_candidates, num_actions]
                              │
                              ▼
        ┌─────────────────────────────────────┐
        │ Like │ Repost │ Reply │ Click │ ... │
        └─────────────────────────────────────┘
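A short NumPy sketch of how such an output might be consumed follows. It assumes each action is treated as an independent binary prediction (a sigmoid per logit) and that the per-action probabilities are collapsed into one ranking score with a weighted sum; the README states neither, and the weights shown are made up.

```python
import numpy as np

ACTIONS = ["like", "repost", "reply", "click"]   # names from the diagram; the real list is longer

def action_probabilities(logits):
    """Independent sigmoid per action, so several actions can be likely at once."""
    return 1.0 / (1.0 + np.exp(-logits))

def combine(probs, weights):
    """Weighted sum over the action axis, giving one score per candidate."""
    return probs @ weights

rng = np.random.default_rng(0)
B, C = 2, 3
logits = rng.normal(size=(B, C, len(ACTIONS)))   # stand-in for the ranking model output
probs = action_probabilities(logits)             # [B, C, num_actions]

weights = np.array([1.0, 2.0, 3.0, 0.5])         # hypothetical per-action weights
scores = combine(probs, weights)                 # [B, C]
order = np.argsort(-scores, axis=-1)             # candidate indices, best first
```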

Running the Code

Installation

Install uv

Running the Ranker

uv run run_ranker.py

Running Retrieval

uv run run_retrieval.py

Running Tests

uv run pytest test_recsys_model.py test_recsys_retrieval_model.py
