Implementation of example storage and example selector classes [draft] by 3LayerPerceptron · Pull Request #428 · deeppavlov/chatsky

3LayerPerceptron · 2025-04-02T17:05:56Z

Description

Please describe here what changes are made and why.

Checklist

I have performed a self-review of the changes

List here tasks to complete in order to mark this PR as ready for review.

To Consider

Add tests (if functionality is changed)
Update API reference / tutorials / guides
Update CONTRIBUTING.md (if devel workflow is changed)
Update .ignore files, scripts (such as lint), distribution manifest (if files are added/deleted)
Search for references to changed entities in the codebase

github-actions

It appears this PR is a release PR (change its base from master if that is not the case).

Here's a release checklist:

Update package version
Update poetry.lock
Change PR merge option
Update template repo
Search for objects to be deprecated
Test parts not covered with pytest:
- web_api tutorials
- Test integrations with external services (telegram; stats)

chatsky/llm/example_selector.py

3LayerPerceptron · 2025-04-02T18:21:20Z

Also, while writing this, I had some questions about the architecture of the solution.

The main architectural problem is that in langchain, ExampleSelector is the holder of the examples. Our wrapper also tries to be the holder of the examples.

This leads to the problems described below:

Example ownership problem
If we really want to keep ownership of the examples on the wrapper side, we run into the fact that the examples in the selector and in the wrapper are not kept in sync with each other.
Information will also be duplicated, since both the wrapper and the selector own copies of the same examples.
This is because the selection logic is implemented in the selector, and it needs to know what to choose from.

Possible fix
Add an add_example method and a constructor to our wrapper that keep the examples in sync.
The examples would also have to be passed to the select_examples method to avoid duplication.

**This raises questions.**
Why do we need an extra level of abstraction in the form of a wrapper?
If the selector has to be given the examples anyway, why not just implement the selector logic as a method of the wrapper?

Suppose we decide that only one of the classes can be the holder.

If the holder is the langchain selector, why do we need the wrapper? Why not just use an overridden selector?

If the holder is the wrapper, why do we inherit from the langchain base at all?
In that case we end up not using one of the selector's methods, and using the other only because we inherit from that base.
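To make the ownership question concrete, here is a minimal sketch of the "single holder" option, where the wrapper stores nothing itself and forwards everything to the selector. `KeywordSelector` and `ExampleWrapper` are hypothetical names used only for illustration; the only thing borrowed from langchain is the pair of method names `add_example` / `select_examples`, and no langchain code is imported.

```python
from typing import Dict, List, Optional

Example = Dict[str, str]


class KeywordSelector:
    """Hypothetical selector: the only object that owns the examples."""

    def __init__(self) -> None:
        self.examples: List[Example] = []

    def add_example(self, example: Example) -> None:
        self.examples.append(example)

    def select_examples(self, input_variables: Dict[str, str]) -> List[Example]:
        # Pick the examples whose input shares at least one word with the query.
        words = set(input_variables["input"].split())
        return [e for e in self.examples if words & set(e["input"].split())]


class ExampleWrapper:
    """Hypothetical wrapper that keeps no copy of the examples."""

    def __init__(self, selector: KeywordSelector, examples: Optional[List[Example]] = None) -> None:
        self._selector = selector
        # Constructor examples go straight to the selector, so there is one source of truth.
        for example in examples or []:
            self.add_example(example)

    def add_example(self, example: Example) -> None:
        self._selector.add_example(example)

    def select(self, user_input: str) -> List[Example]:
        return self._selector.select_examples({"input": user_input})


selector = KeywordSelector()
wrapper = ExampleWrapper(
    selector,
    [
        {"input": "3 + 4", "output": "7"},
        {"input": "capital of France", "output": "Paris"},
    ],
)
selected = wrapper.select("capital of Spain")
```

With this layout nothing is duplicated, but the wrapper adds little beyond a convenience constructor, which is exactly the concern raised in the questions above.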

chatsky/llm/example_selector.py

RLKRo

Why do we need an extra level of abstraction in the form of a wrapper?

It's desirable to provide both simple and advanced ways of adding examples.

ExamplePrompt(
    examples=[
        {"input": "3+4", "output": "7"},
    ]
)

-- this is simple and easy to use in scripts.

example_selector = SemanticSimilarityExampleSelector(vector_store=Chroma(...))

example_selector.add_examples(...)

ExamplePrompt(
    examples=example_selector
)

-- harder to set up but more customizable.

Another layer of abstraction would allow the same API for both options.
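One way to read "the same API for both options" is a small normalization step that turns a plain list into a selector and passes an existing selector through unchanged. The sketch below is an assumption about how that could look, not the code from this PR: `StaticExampleSelector` here is a simplified stand-in that always returns every example, and `as_selector` is a hypothetical helper.

```python
from typing import Any, Dict, List

Example = Dict[str, str]


class StaticExampleSelector:
    """Simplified stand-in: stores a fixed list and always returns all of it."""

    def __init__(self, examples: List[Example]) -> None:
        self.examples = list(examples)

    def add_example(self, example: Example) -> None:
        self.examples.append(example)

    def select_examples(self, input_variables: Dict[str, str]) -> List[Example]:
        return list(self.examples)


def as_selector(examples: Any) -> Any:
    """Hypothetical helper: accept either a list of examples or a selector-like object."""
    if isinstance(examples, list):
        return StaticExampleSelector(examples)
    if hasattr(examples, "select_examples"):
        return examples
    raise TypeError(f"expected a list or a selector, got {type(examples).__name__}")


# Simple path: a literal list, convenient in scripts.
simple = as_selector([{"input": "3+4", "output": "7"}])

# Advanced path: a preconfigured selector is used as-is.
custom = StaticExampleSelector([])
custom.add_example({"input": "2+2", "output": "4"})
advanced = as_selector(custom)
```

Either way, downstream code only ever calls `select_examples`, so the prompt class does not need to know which option the user chose.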

3LayerPerceptron · 2025-04-30T12:20:19Z

All coding is done and tests have been added.
Moving on to writing proper docstrings and the guide page.

3LayerPerceptron · 2025-05-17T17:52:01Z

I ran poe lint and refactored the code accordingly.
The most recent commit also removes the prompt specification from the integration tests: changes to the Prompt-related classes broke it, and to my understanding those changes are still ongoing.

RLKRo · 2025-05-21T11:50:13Z

chatsky/llm/langchain_context.py

-            prompt = Prompt.model_validate(element)
-            prompt_langchain_message = await message_to_langchain(await prompt.message(ctx), ctx, source="human")
-
+            prompt_messages = await element.to_langchain_messages(ctx)


Element is of type Any. It should be converted to Prompt first.

RLKRo · 2025-05-21T11:51:41Z

chatsky/llm/prompt.py

+from chatsky.core import BaseResponse, AnyResponse, MessageInitTypes, Message, Context
+from chatsky.llm._langchain_imports import HumanMessage, AIMessage, SystemMessage
+from langchain_core.example_selectors.base import BaseExampleSelector
+from chatsky.llm.example_selector import to_langchain_context


Imports should be grouped:
https://peps.python.org/pep-0008/#imports

RLKRo · 2025-05-21T11:55:07Z

chatsky/llm/prompt.py

+    Uses Langchain's example selectors and prompt templates for few-shot learning.
+    """
+    template: Optional[str] = Field(None)
+    examples: Optional[BaseExampleSelector] = None


This should be Union[StaticExampleSelector, BaseExampleSelector] to allow initializing the class as

FewShotExamplePrompt(examples=[ {"input": "", "output": ""}, ])

RLKRo · 2025-05-21T13:55:55Z

tests/llm/test_prompt.py

+
+
+@pytest.fixture(name="ctx")
+def ctx() -> Context:


What is the difference between this fixture and book_context?
They return almost the same thing.

RLKRo · 2025-05-21T13:56:13Z

tests/llm/test_prompt.py

+# --------------------
+# Tests for BasePrompt
+# --------------------
+class TestBasePrompt:


No need for this comment.

RLKRo · 2025-05-21T14:11:22Z

tests/llm/test_prompt.py

+        assert isinstance(result[0], SystemMessage)
+        assert "Query: 2 + 2" in result[0].content[0]["text"]
+        assert "Answer: 4" in result[0].content[0]["text"]
+        assert "Answer the following:" in result[0].content[0]["text"]


I think it would be better to compare result to the full expected value.
e.g.

assert result == [SystemMessage("...")]
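The difference between the two assertion styles can be shown with a small sketch. `SystemMessage` below is a dataclass stand-in with a single string field (the real langchain class and the content layout used in the test differ), and both message lists are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemMessage:
    """Stand-in for langchain's SystemMessage, reduced to one text field."""

    content: str


EXPECTED_TEXT = "Query: 2 + 2\nAnswer: 4\n\nAnswer the following:"


def substring_checks_pass(result: List[SystemMessage]) -> bool:
    # Mirrors the style of the original assertions: each fragment is checked separately.
    text = result[0].content
    return all(part in text for part in ("Query: 2 + 2", "Answer: 4", "Answer the following:"))


good = [SystemMessage(EXPECTED_TEXT)]
# Hypothetical regression: an unwanted fragment leaks into the prompt.
bad = [SystemMessage(EXPECTED_TEXT + "\nQuery: \nAnswer: ")]

substring_verdicts = (substring_checks_pass(good), substring_checks_pass(bad))
equality_verdicts = (good == [SystemMessage(EXPECTED_TEXT)], bad == [SystemMessage(EXPECTED_TEXT)])
```

The substring checks accept both lists, while the whole-value comparison rejects the one with the stray fragment, which is the practical argument for the equality form.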

RLKRo · 2025-05-21T14:16:51Z

chatsky/llm/prompt.py

+                prefix=self.prefix,
+                suffix=self.suffix,
+            )
+            text = prompt_template.format(input=user_input, output="")


I don't think we should support including user_input in this prompt.
User request is already added to context history in get_langchain_context.

RLKRo · 2025-05-21T14:51:40Z

tests/llm/test_prompt.py

+        assert called_args["vars"]["input"] == ""
+
+    @pytest.mark.asyncio
+    async def test_message_to_langchain_error(ctx, monkeypatch):


What is the purpose of this test?

RLKRo · 2025-05-21T14:58:30Z

tests/llm/test_prompt.py

+            await prompt.to_langchain_messages(ctx)
+
+    @pytest.mark.asyncio
+    async def test_template_without_prefix_suffix(self, ctx, monkeypatch):


I don't see the point of this test.
Empty strings are allowed in langchain's FewShotPromptTemplate and our FewShotExamplePrompt does not do anything special with suffix or prefix.

It would make more sense to pass None as prefix and suffix instead of empty strings, since our model allows None but FewShotPromptTemplate accepts only strings.

RLKRo · 2025-05-21T15:11:31Z

tests/llm/test_prompt.py

There are a lot of tests, but many of them seem unnecessary, while some more important tests concerning UX are missing.
E.g. test what happens when prefix is None and template is not, or vice versa (there should be a warning or an exception if prefix is used without template).
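A rough sketch of the behaviour such a test would pin down is given below. `FewShotSettings` is a hypothetical dataclass stand-in for the template/prefix/suffix fields of the prompt class, not the actual model. Raising `ValueError` rather than emitting a warning is just one of the two options mentioned above, and treating suffix the same way as prefix is an extra assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FewShotSettings:
    """Hypothetical stand-in for the prompt's template, prefix and suffix fields."""

    template: Optional[str] = None
    prefix: Optional[str] = None
    suffix: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: prefix or suffix without a template is a user error.
        if self.template is None and (self.prefix is not None or self.suffix is not None):
            raise ValueError("prefix/suffix have no effect without a template")

    def template_kwargs(self) -> Dict[str, str]:
        # None is mapped to "" for a formatter that only accepts strings.
        return {"prefix": self.prefix or "", "suffix": self.suffix or ""}


def rejects_prefix_without_template() -> bool:
    try:
        FewShotSettings(prefix="Examples:")
    except ValueError:
        return True
    return False


settings = FewShotSettings(
    template="Query: {input}\nAnswer: {output}",
    prefix=None,
    suffix="Answer the following:",
)
kwargs = settings.template_kwargs()
```

A test suite built around this would cover the None-versus-string cases directly instead of the empty-string case questioned earlier in the review.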

Implementation of example storage and example selector classes

8210a84

github-actions bot reviewed Apr 2, 2025

3LayerPerceptron commented Apr 2, 2025

chatsky/llm/example_selector.py

3LayerPerceptron commented Apr 2, 2025

chatsky/llm/example_selector.py

3LayerPerceptron commented Apr 2, 2025

chatsky/llm/example_selector.py

3LayerPerceptron commented Apr 2, 2025

chatsky/llm/example_selector.py

RLKRo reviewed Apr 3, 2025

add BasePrompt and FewShotExamplePrompt class

5890636

RLKRo reviewed Apr 3, 2025

chatsky/llm/prompt.py

3LayerPerceptron and others added 2 commits April 10, 2025 17:18

New selector class architecture implemented, proper structured output support added

cb0dde5

fix conflict with type

f8682a6

RLKRo reviewed Apr 11, 2025

3LayerPerceptron and others added 3 commits April 17, 2025 18:58

Code refactoring + to_langchain_context() method made to be standalone function

7347119

Move get_langchain_context to prompt.py and align types

3e79763

Example selector has been cleaned + Covered with tests

9a57e98

RLKRo reviewed Apr 24, 2025

tests/llm/test_example_selector.py

chatsky/llm/example_selector.py

3LayerPerceptron added 4 commits April 26, 2025 15:26

imports have been formatted

19493f3

Signatures fixed; Redundant test cases (add_example) removed

2c2e3ef

to_langchain_context and its tests are now async

9f456b0

unit test added; big test decomposed; to_langchain_context signature changed for clarity;

276154f

RLKRo reviewed May 5, 2025

3LayerPerceptron added 7 commits May 10, 2025 19:11

StaticExampleSelector.unpack_model() is static now

072a8de

All tests have been merged into test_example_selector.py

c61722d

convoluted tests - fixed; requested test added

9c332b4

tests fixed via fixtures; code refactored accordingly

1a7f6f4

docstrings added

5cee964

Langchain test - fixed;

0ade971

tutorial added

e8ded28

katimanova and others added 11 commits May 15, 2025 17:54

Integrate example_selector in FewShotExample prompt with priority

8ef981d

change Prompt field into BasePrompt

a9683c3

unused method removed

cab25c8

now scope specified in a fixture call

44dc25a

naming in fixture changed

1fc8b43

testing file formatted

2fe428e

docstrings fixed

006ac5e

test - checking prompt structure (procees... )

e35b655

guide clarification

8947a81

simple implementation logic for context

f036ad8

fix unused parameters and correct logical in class Few Shot Example Prompt

73eea4d

RLKRo changed the base branch from master to dev May 16, 2025 18:00

RLKRo reviewed May 16, 2025

chatsky/llm/example_selector.py

tutorials/llm/5_example_selector.py

tests/llm/test_example_selector.py

katimanova and others added 3 commits May 16, 2025 22:09

correct using function message_to_langchain & to_langchain_messages

791e8d7

refactoring

b4b4790

prompt specification removed due to frequent changes in prompt class

f50d0bb

katimanova and others added 6 commits May 18, 2025 22:07

fix format prompt.py by linters

5c1942e

tutorial final change

38d0f2d

Add tests for BasePrompt

a3ac6b2

add test for Prompt

fb8e352

add TestFewShotExamplePrompt and TestPromptIntegration

ee98435

add tutorials

7caa257

RLKRo reviewed May 21, 2025

katimanova added 6 commits May 21, 2025 19:10

wrapper in Prompt

5fdfee2

grouping of imports

a638dea

correct type examples and remove user_input for template

dc63888

remove test_init_with_message and test_init_with_str

85beda4

replace test test_prompt_with_base_response

98b280c

uniform assert

d7823e2

3 participants