chore: add dedicated claude skill for RNE #800
Conversation
@@ -0,0 +1,660 @@
---
name: react-native-executorch
description: Build on-device AI into React Native apps using ExecuTorch. Provides hooks for LLMs, computer vision, OCR, audio processing, and embeddings without cloud dependencies. Use when building AI features into mobile apps - AI chatbots, image recognition, speech processing, or text search.
Suggested change: "Build on-device AI into React Native apps using ExecuTorch." → "Integrate on-device AI into React Native apps using ExecuTorch." (rest of the description unchanged)
Maybe mention SWM here
Also, I feel like this strongly implies that it's just ExecuTorch, while it's more than that :D
I would rather not mention SWM here, since based on the description the agent decides whether to use this skill or not (if it's relevant). However, I can add some references/links to SWM in the file.
## Overview

React Native Executorch is a library that enables on-device AI model execution in React Native applications. It provides hooks and utilities for running machine learning models directly on mobile devices without requiring cloud infrastructure or internet connectivity (after initial model download).
Suggested change: "It provides hooks and utilities for running machine learning models" → "It provides APIs for running machine learning models" (rest of the sentence unchanged)
### Use Case 2: Image Recognition & Tagging

**Trigger:** User needs to classify images, detect objects, or recognize content in photos

**Steps:**

1. Select vision model (classification, detection, or segmentation)
2. Load model for image processing task
3. Pass image URI and process results
4. Display detections or classifications in app UI

**Result:** App that understands image content without sending data to servers

**Reference:** [./references/reference-cv.md](./references/reference-cv.md)
Maybe we should rename this use case to Computer vision, or something more general?
1. Select vision model (classification, detection, or segmentation)
2. Load model for image processing task
3. Pass image URI and process results
I feel like we don't need to mention the URI part here, since we will definitely introduce more ways to pass images.
**Supported tasks:**

- **Speech-to-Text** - Transcribe audio to text (supports English and multilingual)
Suggested change: "(supports English and multilingual)" → "(supports multiple languages including English)"
**What to do:**

1. Choose a model from available LLM options (consider device memory constraints)
2. Use the `useLLM` hook to load the model
or the TS API
Use `useLLM` with tool definitions to allow the model to call predefined functions.

**What to do:**

1. Define tools with name, description, and parameter schema
2. Configure the LLM with tool definitions
3. Implement callbacks to execute tools when the model requests them
4. Parse tool results and pass them back to the model

**Reference:** [./references/reference-llms.md](./references/reference-llms.md) - Tool Calling section

---
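The define/dispatch pattern in steps 1 and 3 above can be sketched independently of the library. The `ToolCall` shape and schema fields below are illustrative assumptions, not the library's real types — check the Tool Calling section of the LLM reference for those:

```typescript
// Hypothetical shape of a tool-call request from the model (assumption).
type ToolCall = { toolName: string; arguments: Record<string, unknown> };

// Step 1: define tools with name, description, and parameter schema.
const tools = [
  {
    name: 'getWeather',
    description: 'Returns current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
];

// Step 3: map tool names to implementations the app executes on request.
const implementations: Record<string, (args: Record<string, unknown>) => string> = {
  getWeather: (args) => `Sunny in ${args.city}`,
};

// Step 4: run the requested tool; its return value is what gets fed back
// to the model as the tool result.
function executeToolCall(call: ToolCall): string {
  const impl = implementations[call.toolName];
  if (!impl) throw new Error(`Unknown tool: ${call.toolName}`);
  return impl(call.arguments);
}
```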
### I want structured data extraction from text

Use `useLLM` with structured output generation using JSON schema validation.

**What to do:**

1. Define a schema (JSON Schema or Zod) for desired output format
2. Configure the LLM with the schema
3. Generate responses and validate against the schema
4. Use the validated structured data in your app

**Reference:** [./references/reference-llms.md](./references/reference-llms.md) - Structured Output section

---
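Step 3 (validate against the schema) can be illustrated with a hand-rolled check; in practice you would use the JSON Schema/Zod support described in the Structured Output section. The `Contact` shape is a made-up example:

```typescript
// A made-up target shape for extraction (illustrative only).
interface Contact {
  name: string;
  email: string;
}

// Parse the model's raw text response and validate it against the schema.
// Hand-rolled here for clarity; a Zod schema would do this declaratively.
function parseContact(raw: string): Contact {
  const data = JSON.parse(raw);
  if (typeof data.name !== 'string' || typeof data.email !== 'string') {
    throw new Error('Response does not match the Contact schema');
  }
  return { name: data.name, email: data.email };
}
```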
### I want to classify or recognize objects in images

Use `useClassification` for simple categorization or `useObjectDetection` for locating specific objects.

**What to do:**

1. Choose appropriate computer vision model based on task
2. Load the model with the appropriate hook
3. Pass image URI (local, remote, or base64)
4. Process results (classifications, detections with bounding boxes)

**Reference:** [./references/reference-cv.md](./references/reference-cv.md)

**Model options:** [./references/reference-models.md](./references/reference-models.md) - Classification and Object Detection sections

---
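Step 4 (process results) typically means ranking class probabilities. A small library-independent helper, assuming the classification result is a `{ label: probability }` map as described in the CV reference:

```typescript
// Pick the k most probable labels from a { label: probability } map,
// sorted from most to least probable.
function topK(probs: Record<string, number>, k: number): [string, number][] {
  return Object.entries(probs)
    .sort(([, a], [, b]) => b - a)
    .slice(0, k);
}
```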
### I want to extract text from images

Use `useOCR` for horizontal text or `useVerticalOCR` for vertical text (experimental).

**What to do:**

1. Choose appropriate OCR model and recognizer matching your target language
2. Load the model with `useOCR` or `useVerticalOCR` hook
3. Pass image URI
4. Extract detected text regions with bounding boxes and confidence scores
5. Process results based on your application needs

**Reference:** [./references/reference-ocr.md](./references/reference-ocr.md)

**Model options:** [./references/reference-models.md](./references/reference-models.md) - OCR section

---
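Steps 4-5 above can be sketched as confidence filtering over detected regions. The `text`/`score` field names are assumptions for illustration — the actual result shape (which also carries bounding boxes) is in the OCR reference:

```typescript
// Assumed shape of one detected text region (verify against the OCR reference).
interface Detection {
  text: string;
  score: number;
}

// Keep confident detections and join them into a single string.
function extractText(detections: Detection[], minScore = 0.5): string {
  return detections
    .filter((d) => d.score >= minScore)
    .map((d) => d.text)
    .join(' ');
}
```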
### I want to convert speech to text or text to speech

Use `useSpeechToText` for transcription or `useTextToSpeech` for voice synthesis.

**What to do:**

- **For Speech-to-Text:** Capture or load audio, ensure 16kHz sample rate, transcribe
- **For Text-to-Speech:** Prepare text, specify voice parameters, generate audio waveform, play using audio context

**Reference:** [./references/reference-audio.md](./references/reference-audio.md)

**Model options:** [./references/reference-models.md](./references/reference-models.md) - Speech to Text and Text to Speech sections

---
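The 16kHz requirement above means audio captured at other rates must be resampled first. A naive linear-interpolation sketch to show the idea (a real app should use a proper DSP resampler with anti-aliasing):

```typescript
// Naively resample a mono Float32 waveform to 16 kHz by linear interpolation.
function resampleTo16k(samples: Float32Array, srcRate: number): Float32Array {
  const ratio = srcRate / 16000;
  const outLen = Math.floor(samples.length / ratio);
  const out = new Float32Array(outLen);
  for (let i = 0; i < outLen; i++) {
    const pos = i * ratio;          // fractional source position
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, samples.length - 1);
    const frac = pos - i0;
    out[i] = samples[i0] * (1 - frac) + samples[i1] * frac;
  }
  return out;
}
```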
### I want to find similar images or texts

Use `useImageEmbeddings` for images or `useTextEmbeddings` for text.

**What to do:**

1. Load appropriate embeddings model
2. Generate embeddings for your content
3. Compute similarity metrics (cosine similarity, dot product)
4. Use similarity scores for search, clustering, or deduplication

**Reference:**

- Text: [./references/reference-nlp.md](./references/reference-nlp.md)
- Images: [./references/reference-cv-2.md](./references/reference-cv-2.md)

---
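Step 3 is plain vector math over the embedding arrays, independent of the library. A cosine-similarity helper:

```typescript
// Cosine similarity between two equal-length embedding vectors:
// dot(a, b) / (|a| * |b|), in [-1, 1]; 1 means identical direction.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

For search (step 4), compute the query embedding once and rank all stored embeddings by this score.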
### I want to apply artistic filters to photos

Use `useStyleTransfer` to apply predefined artistic styles to images.

**What to do:**

1. Choose from available artistic styles (Candy, Mosaic, Udnie, Rain Princess)
2. Load the style transfer model
3. Pass image URI
4. Retrieve and use the stylized image

**Reference:** [./references/reference-cv-2.md](./references/reference-cv-2.md)

**Model options:** [./references/reference-models.md](./references/reference-models.md) - Style Transfer section

---
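The four steps above, as a minimal component sketch. It assumes the hook shape described in the CV reference (`isReady`, `forward(imageUri)` resolving to an image URI) and the `STYLE_TRANSFER_CANDY` constant; the exact option name for the model source has varied between versions, so verify everything against your installed version's docs:

```tsx
import React, { useState } from 'react';
import { Button, Image, View } from 'react-native';
// Hook and constant names assumed from the docs - verify for your version.
import { useStyleTransfer, STYLE_TRANSFER_CANDY } from 'react-native-executorch';

export function StylizedPhoto({ uri }: { uri: string }) {
  // Steps 1-2: choose a style and load the model.
  const model = useStyleTransfer({ model: STYLE_TRANSFER_CANDY });
  const [styledUri, setStyledUri] = useState<string | null>(null);

  // Steps 3-4: pass the image and keep the stylized result.
  const stylize = async () => {
    const result = await model.forward(uri);
    setStyledUri(result);
  };

  return (
    <View>
      <Button title="Stylize" onPress={stylize} disabled={!model.isReady} />
      {styledUri && (
        <Image source={{ uri: styledUri }} style={{ width: 256, height: 256 }} />
      )}
    </View>
  );
}
```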
### I want to generate images from text

Use `useTextToImage` to create images based on text descriptions.

**What to do:**

1. Load the text-to-image model
2. Provide text description (prompt)
3. Optionally specify image size and number of generation steps
4. Receive generated image (may take 20-60 seconds depending on device)

**Reference:** [./references/reference-cv-2.md](./references/reference-cv-2.md)

**Model options:** [./references/reference-models.md](./references/reference-models.md) - Text to Image section

---
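Step 3's optional parameters can be normalized before calling the model. The defaults and bounds below are illustrative assumptions, not values from the library:

```typescript
interface TextToImageOptions {
  size?: number;
  steps?: number;
}

// Fill in defaults and reject nonsensical values before generation.
// The default values (512 px, 20 steps) are made up for illustration.
function normalizeOptions(opts: TextToImageOptions): { size: number; steps: number } {
  const size = opts.size ?? 512;
  const steps = opts.steps ?? 20;
  if (!Number.isInteger(size) || size <= 0) throw new Error('size must be a positive integer');
  if (!Number.isInteger(steps) || steps <= 0) throw new Error('steps must be a positive integer');
  return { size, steps };
}
```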
Please rephrase all of these so there is information about the respective TS API class, not just about the hook.
Do you think we should directly mention all TypeScript API equivalents in all reference files as well? For now I followed an approach where I describe hooks in detail and mention the TypeScript API implementation only in the additional resources section (with a link to the docs).
Audio must be in correct sample rate for processing:

- **Speech-to-Text input:** 16kHz sample rate
Suggested change: "**Speech-to-Text input:** 16kHz sample rate" → "**Speech-to-Text or VAD input:** 16kHz sample rate"
1. Choose appropriate computer vision model based on task
2. Load the model with the appropriate hook
3. Pass image URI (local, remote, or base64)
Same as Jakub suggested earlier
1. Choose from available artistic styles (Candy, Mosaic, Udnie, Rain Princess)
2. Load the style transfer model
3. Pass image URI
And here as well (probably every CV task)
Description
This PR adds a Claude Skill for RN Executorch that can help with building, prototyping and debugging RNE apps.
Introduces a breaking change?
Type of change
Tested on
Testing instructions
The same version of this skill was uploaded to this repository, so it's possible to use this skill globally now.
To do this, run:
(after merging this PR it will be possible to add this skill to the project with `npx skills add software-mansion/react-native-executorch`)
Screenshots
Related issues
Checklist
Additional notes