39 changes: 39 additions & 0 deletions .github/workflows/release-flashinfer-jit-cache-wheel.yml
@@ -122,3 +122,42 @@ jobs:
          name: ${{ steps.artifact-name.outputs.name }}
          retention-days: 7
          path: flashinfer-jit-cache/dist/*

  release:
    needs: build-wheel
    runs-on: [self-hosted, Linux, x86_64]
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ inputs.tag }}

      - uses: actions/download-artifact@v4
        with:
          path: dist/
          merge-multiple: true
          pattern: wheel-*

      - run: ls -lah dist/

      - uses: softprops/action-gh-release@v1
        with:
          tag_name: ${{ inputs.tag }}
          files: |
            dist/flashinfer_jit_cache*.whl

      - name: Clone wheel index
        run: git clone https://oauth2:${WHL_TOKEN}@github.com/flashinfer-ai/whl.git flashinfer-whl
        env:
          WHL_TOKEN: ${{ secrets.WHL_TOKEN }}

      - name: Update wheel index
        run: python3 scripts/update_flashinfer_jit_cache_whl_index.py

      - name: Push wheel index
        run: |
          cd flashinfer-whl
          git config --local user.name "github-actions[bot]"
          git config --local user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add -A
          git commit -m "update flashinfer-jit-cache whl"
          git push
31 changes: 31 additions & 0 deletions scripts/update_flashinfer_jit_cache_whl_index.py
@@ -0,0 +1,31 @@
import hashlib
import pathlib
import re

for path in sorted(pathlib.Path("dist").glob("*.whl")):
    with open(path, "rb") as f:
        sha256 = hashlib.sha256(f.read()).hexdigest()
Comment on lines +6 to +7

medium

Reading the entire file into memory to calculate the SHA256 hash can be inefficient and cause high memory usage for large wheel files. It's a better practice to read the file in chunks to avoid this.

Suggested change
-    with open(path, "rb") as f:
-        sha256 = hashlib.sha256(f.read()).hexdigest()
+    sha256_hash = hashlib.sha256()
+    with open(path, "rb") as f:
+        while chunk := f.read(4096):
+            sha256_hash.update(chunk)
+    sha256 = sha256_hash.hexdigest()
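
For reference, the chunked read can also be factored into a small helper, which keeps the loop body of the script short. This is only a sketch: the function name `sha256_of_file` and the 1 MiB default chunk size are illustrative assumptions, not part of this PR.

```python
import hashlib
import pathlib


def sha256_of_file(path: pathlib.Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in fixed-size chunks.

    Hypothetical helper; the name and default chunk size are assumptions.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which ends the loop.
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is identical to hashing the whole file at once; only peak memory use changes, since at most `chunk_size` bytes are held at a time.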

    # Extract version and CUDA version from wheel name
    # Example: flashinfer_jit_cache-1.2.3+cu128-cp39-abi3-manylinux_2_28_x86_64.whl
    # Example: flashinfer_jit_cache-1.2.3rc1+cu128-cp39-abi3-manylinux_2_28_x86_64.whl
    # Example: flashinfer_jit_cache-1.2.3.post1+cu128-cp39-abi3-manylinux_2_28_x86_64.whl
    match = re.search(
        r"flashinfer_jit_cache-([0-9]+\.[0-9]+\.[0-9]+[a-z0-9.]*)\+cu(\d+)-",

medium

The regular expression for parsing the version string is quite specific and might break if the versioning scheme changes slightly (e.g., from 1.2.3 to 1.3). Using a more general pattern would make the script more robust and less prone to breaking on future version format changes.

Suggested change
-        r"flashinfer_jit_cache-([0-9]+\.[0-9]+\.[0-9]+[a-z0-9.]*)\+cu(\d+)-",
+        r"flashinfer_jit_cache-([\w.]+)\+cu(\d+)-",
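
To make the difference concrete, the sketch below runs both patterns against the three example names from the script's comments plus one hypothetical two-component version (`1.3`), which does not appear in the PR and is included only for illustration. The helper name `parse` is likewise an assumption.

```python
import re

# Pattern currently in the PR: requires three numeric components.
STRICT = re.compile(r"flashinfer_jit_cache-([0-9]+\.[0-9]+\.[0-9]+[a-z0-9.]*)\+cu(\d+)-")
# Pattern proposed in the review: any run of word characters and dots.
GENERAL = re.compile(r"flashinfer_jit_cache-([\w.]+)\+cu(\d+)-")

SUFFIX = "-cp39-abi3-manylinux_2_28_x86_64.whl"
NAMES = [
    "flashinfer_jit_cache-1.2.3+cu128" + SUFFIX,
    "flashinfer_jit_cache-1.2.3rc1+cu128" + SUFFIX,
    "flashinfer_jit_cache-1.2.3.post1+cu128" + SUFFIX,
    "flashinfer_jit_cache-1.3+cu128" + SUFFIX,  # hypothetical two-component version
]


def parse(pattern, name):
    """Return (version, cuda) if the pattern matches the wheel name, else None."""
    m = pattern.search(name)
    return m.groups() if m else None


if __name__ == "__main__":
    for name in NAMES:
        print(name, parse(STRICT, name), parse(GENERAL, name))
```

Both patterns agree on the three documented names; only the general one accepts the two-component version. The trade-off is that the general pattern also accepts strings that are not valid versions, so it relies on the wheel filenames being well formed.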

        path.name,
    )
    if not match:
        print(f"Warning: Could not parse wheel name: {path.name}")
        continue

    ver, cu = match.groups()

    # Create directory structure: cu{version}/flashinfer-jit-cache/
    # No torch subdirectory since we don't separate by torch version
    index_dir = pathlib.Path(f"flashinfer-whl/cu{cu}/flashinfer-jit-cache")
    index_dir.mkdir(parents=True, exist_ok=True)

    base_url = "https://github.com/flashinfer-ai/flashinfer/releases/download"
    full_url = f"{base_url}/v{ver}/{path.name}#sha256={sha256}"

    with (index_dir / "index.html").open("a") as f:
        f.write(f'<a href="{full_url}">{path.name}</a><br>\n')
Comment on lines +30 to +31

high

Opening the index.html file in append mode ("a") can lead to duplicate entries if the script is run multiple times on the same set of wheel files. This could result in a malformed package index. A more robust approach would be to collect all links for each CUDA version first, and then write the index.html file once for each, overwriting any existing file. This ensures the index is always clean and correct.
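
One possible shape for this, sketched under a few assumptions: links are grouped per CUDA version, merged with whatever links `index.html` already contains (keyed by wheel filename), and the file is then written once. The merge step is a variation on the suggestion above; it is there because the index in the cloned repository may already list wheels from earlier releases, and a plain overwrite would drop them. The function name `update_index`, its `(wheel_name, sha256_hex)` input format, and the `LINK_RE` pattern are hypothetical; lines in an existing index that do not look like the links this script writes are discarded.

```python
import pathlib
import re
from collections import defaultdict

WHEEL_RE = re.compile(r"flashinfer_jit_cache-([\w.]+)\+cu(\d+)-")
# Matches the link lines written by the script; captures the wheel filename.
LINK_RE = re.compile(r'<a href="[^"]*">([^<]+)</a>')
BASE_URL = "https://github.com/flashinfer-ai/flashinfer/releases/download"


def update_index(entries, index_root):
    """Write one index.html per CUDA version without duplicate links.

    entries: iterable of (wheel_name, sha256_hex) pairs (assumed input format).
    index_root: directory of the cloned wheel-index repository.
    """
    by_cuda = defaultdict(dict)
    for name, sha256 in entries:
        match = WHEEL_RE.search(name)
        if not match:
            print(f"Warning: Could not parse wheel name: {name}")
            continue
        ver, cu = match.groups()
        url = f"{BASE_URL}/v{ver}/{name}#sha256={sha256}"
        by_cuda[cu][name] = f'<a href="{url}">{name}</a><br>'

    for cu, new_links in by_cuda.items():
        index_dir = pathlib.Path(index_root) / f"cu{cu}" / "flashinfer-jit-cache"
        index_dir.mkdir(parents=True, exist_ok=True)
        index_file = index_dir / "index.html"

        # Start from the links already in the index, keyed by wheel filename.
        links = {}
        if index_file.exists():
            for line in index_file.read_text().splitlines():
                found = LINK_RE.search(line)
                if found:
                    links[found.group(1)] = line.strip()

        # New entries replace existing ones with the same filename.
        links.update(new_links)
        index_file.write_text("".join(links[n] + "\n" for n in sorted(links)))
```

Because each filename maps to exactly one line, running the update twice on the same wheels leaves the index unchanged.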