Skip to content

Add dynamic expression display with clean chat interface#17

Merged
hiyouga merged 8 commits intomainfrom
copilot/add-emoji-display-widget
Dec 28, 2025
Merged

Add dynamic expression display with clean chat interface#17
hiyouga merged 8 commits intomainfrom
copilot/add-emoji-display-widget

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Dec 28, 2025

Implements a live expression viewer that parses [Expression: ...] and [Action: ...] tags from AI responses to display corresponding character images from assets/gen_imgs/, simulating natural conversational expressions. Tags are automatically removed from chat messages for a clean user experience.

Changes

  • UI Enhancement: Added gr.Image component displaying character expressions, positioned in right column with 4:1 ratio layout
  • Regex Parsing: Extracts expression (neutral/smile/serious/confused/surprised/sad) and action (none/nod/shake/wave/jump/point) from complete response content
  • Clean Display: Expression and action tags are removed from chat messages before display - users only see the actual message content
  • Simplified Implementation: Parse once from complete content instead of character-by-character parsing
  • Optimized Performance: Single combined regex pattern for tag removal
  • Fallback: Uses default avatar when expression images missing
  • Configuration: EXPRESSION_IMGS_DIR environment variable for custom image directory

Implementation

Expression images follow naming convention {expression}_{action}.jpg. Tags are parsed and removed in a single pass:

def _parse_expression_and_action(self, content: str) -> tuple[str, str, str]:
    expression = "neutral"
    action = "none"
    
    expression_match = re.search(r"\[Expression:\s*(\w+)\]", content, re.IGNORECASE)
    if expression_match:
        expression = expression_match.group(1).lower()
    
    action_match = re.search(r"\[Action:\s*(\w+)\]", content, re.IGNORECASE)
    if action_match:
        action = action_match.group(1).lower()
    
    # Remove tags from content in a single pass
    cleaned_content = re.sub(r"\[(Expression|Action):\s*\w+\]", "", content, flags=re.IGNORECASE)
    cleaned_content = re.sub(r"\n\s*\n\s*\n", "\n\n", cleaned_content)
    cleaned_content = cleaned_content.strip()
    
    return expression, action, cleaned_content

Simplified streaming with cleaned content:

for msg in ai_messages:
    # Get complete content and parse expression/action once
    content = msg.get("content", "")
    expression, action, cleaned_content = self._parse_expression_and_action(content)
    image_path = self._get_expression_image_path(expression, action)
    
    # Stream the cleaned content character by character
    for char in cleaned_content:
        history[-1]["content"] += char
        time.sleep(self.streaming_delay)
        yield history, image_path

Screenshots

Initial state with default expression:

Clean chat display - expression tags removed from messages:

The chat shows only the actual message content ("你好,我是Ema。" and "请问有什么可以帮助你的吗?") while expression metadata controls the character image display on the right.

Original prompt

我现在使用 scripts/generate_character_images.py 已经生成了一系列人物表情,放在了assets/gen_imgs 里面,在webui里面建立一个ema的表情展示器(用 gr.Image)控件,同时通过正则方法来解析ai 返回的content,基于ai返回的expression和action在表情展示器里面展示对应的图片。模拟和真实人物对话的表情变化


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI changed the title [WIP] Add emoji display widget for character expressions Add dynamic expression display with real-time character image updates Dec 28, 2025
Copilot AI requested a review from hiyouga December 28, 2025 18:39
Copilot AI changed the title Add dynamic expression display with real-time character image updates Add dynamic expression display with clean chat interface Dec 28, 2025
Copilot AI requested a review from hiyouga December 28, 2025 18:53
Copilot AI and others added 8 commits December 29, 2025 02:55
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
@hiyouga hiyouga force-pushed the copilot/add-emoji-display-widget branch from 43f01ac to 92b1e43 Compare December 28, 2025 18:55
@hiyouga hiyouga marked this pull request as ready for review December 28, 2025 18:55
Copilot AI review requested due to automatic review settings December 28, 2025 18:56
@hiyouga hiyouga merged commit 887c340 into main Dec 28, 2025
5 checks passed
@hiyouga hiyouga deleted the copilot/add-emoji-display-widget branch December 28, 2025 18:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a dynamic expression display feature that parses expression and action tags from AI responses and displays corresponding character images, creating a more immersive chat experience. The implementation cleanly removes metadata tags from chat messages while using them to update the character's visual representation in real-time.

Key changes:

  • Added expression image viewer using gr.Image component with regex-based parsing of [Expression: ...] and [Action: ...] tags
  • Implemented tag removal from chat messages for clean user-facing content display
  • Modified UI layout to 3:1 column ratio with expression display in right column

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
mini_ema/ui/chat_ui.py Adds expression/action parsing methods, updates UI layout to include expression image viewer, and modifies streaming logic to handle both chat content and expression images
mini_ema/bot/simple_bot.py Updates example responses to include expression and action tags demonstrating the new feature
README.md Documents the new expression display feature, image generation script usage, and supported expressions/actions

Comment on lines +80 to +81
# Clean up extra whitespace and newlines
cleaned_content = re.sub(r"\n\s*\n\s*\n", "\n\n", cleaned_content)
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whitespace cleanup pattern only handles 3+ consecutive newlines, but after removing expression/action tags, there may be exactly 2 newlines left (e.g., when tags are at the start like "[Expression: smile] [Action: wave]\n\n你好"). This can leave unwanted leading newlines in the cleaned content. Consider using a more comprehensive cleanup that handles leading/trailing whitespace and consolidates multiple newlines to at most 2, for example:

  1. First remove the tags
  2. Then consolidate any sequence of 2+ newlines (with possible whitespace between them) to exactly 2 newlines
  3. Finally strip leading and trailing whitespace

A pattern like r"\n\s*\n+" -> "\n\n" would better handle these cases.

Suggested change
# Clean up extra whitespace and newlines
cleaned_content = re.sub(r"\n\s*\n\s*\n", "\n\n", cleaned_content)
# Clean up extra whitespace and newlines:
# - Collapse any sequence of 2+ newlines (with optional whitespace) to exactly 2 newlines
# - Then strip leading and trailing whitespace
cleaned_content = re.sub(r"\n\s*\n+", "\n\n", cleaned_content)

Copilot uses AI. Check for mistakes.

Generated images are named in the format: `{expression}_{action}.jpg` (e.g., `smile_wave.jpg`, `sad_none.jpg`)

**Note:** Image generation uses the Gemini API and may incur costs. Placeholder images are included by default.
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The statement "Placeholder images are included by default" is potentially misleading. The implementation falls back to the default Ema avatar (EMA_AVATAR) when specific expression images are not found, but no placeholder images for each expression/action combination are included in the repository. Consider revising to: "Falls back to the default Ema avatar when expression images are not generated."

Suggested change
**Note:** Image generation uses the Gemini API and may incur costs. Placeholder images are included by default.
**Note:** Image generation uses the Gemini API and may incur costs. Falls back to the default Ema avatar when expression images are not generated.

Copilot uses AI. Check for mistakes.
Comment on lines +176 to +194
with gr.Row():
with gr.Column(scale=3):
# Bot selector
bot_selector = gr.Dropdown(
choices=list(self.bots.keys()),
value=list(self.bots.keys())[0],
label="🤖 Select Bot",
interactive=True,
)

# User name input
username_input = gr.Textbox(
value="Phoenix",
label="👤 Username",
placeholder="Enter username...",
interactive=True,
)

with gr.Column(scale=1):
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions "4:1 ratio layout" but the actual implementation uses scale=3 and scale=1, which creates a 3:1 ratio, not 4:1. Consider updating the PR description to match the implementation or adjusting the code if 4:1 was intended.

Copilot uses AI. Check for mistakes.
Comment on lines +69 to +76
expression_match = re.search(r"\[Expression:\s*(\w+)\]", content, re.IGNORECASE)
if expression_match:
expression = expression_match.group(1).lower()

# Try to match action pattern: [Action: <action>]
action_match = re.search(r"\[Action:\s*(\w+)\]", content, re.IGNORECASE)
if action_match:
action = action_match.group(1).lower()
Copy link

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex patterns are recompiled on every call to this method. For better performance, consider compiling the regex patterns once as class-level constants and reusing them. For example, define EXPRESSION_PATTERN = re.compile(r"\[Expression:\s*(\w+)\]", re.IGNORECASE) and ACTION_PATTERN = re.compile(r"\[Action:\s*(\w+)\]", re.IGNORECASE) at the class level, then use EXPRESSION_PATTERN.search(content) instead of re.search(...).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants