Releases · dais-polymtl/flock

15 Dec 22:29

v0.2.1

ad676de

Latest

This is a simple enhancement release based on feedback after we released v0.2.0. MODEL(s) and PROMPT(s) can be defined with CREATE commands as GLOBAL (persisting across multiple databases) or LOCAL to the database. By default and if not specified, the model or prompt defined is LOCAL.

What's Changed

Openai-Compatible Providers by @dorbanianas in #97
Add the global and local prompts/models support by @dorbanianas in #99
Added a Global and Local example to the product review notebook by @dorbanianas in #104
Upgrade to duckdb 1.1.3 by @dorbanianas in #106

New Contributors

@aborruso made their first contribution in #95

Full Changelog: v0.2.0...v0.2.1

Contributors

aborruso and anasdorbani

Assets 12

11 Dec 02:53

queryproc

v0.2.0

b92ae14

FlockMTL v0.2.0 "St-Viateur Bagel"

What's Changed

remove redundant code by @dorbanianas in #41
Add llm_max & llm_min Aggregate Functions for Selecting Most/Least Relevant Row by group by @dorbanianas in #43
create llm_rerank aggregate function by @dorbanianas in #44
change prompt_name for aggregate functions into search_query by @dorbanianas in #47
Refactor/change max min to first last by @dorbanianas in #48
Fix Combine for multithreading & windows build issues by @dorbanianas in #49
Feature/model provider setup by @SunnyYasser in #45
fix windows and linux builds by @dorbanianas in #50
drop model_name new lines removal from modal manager by @dorbanianas in #51
fix llm_rerank index accessing and gpt-4o-mini ranked list issue by @dorbanianas in #52
Added Jupyter Notebook for FlockMTL setup by @SunnyYasser in #57
Feature/getting started with flock mtl example by @SunnyYasser in #58
Added python script for FlockMTL examples by @SunnyYasser in #59
Add llm_reduce Aggregate Function by @dorbanianas in #53
Add the batching to the llm_embedding by @dorbanianas in #54
Ollama provider onboarding by @SunnyYasser in #55
Feat/batching on output max size by @dorbanianas in #56
Refactor llm_* Function Call Structure for Enhanced Flexibility in Prompt Specification by @dorbanianas in #61
Add Support for Prompt Versioning by @dorbanianas in #63
Add Support for Configuring Secrets for Multiple Model Providers by @dorbanianas in #64
Add fusion_relative Function for Score Normalization and Max Aggregation by @dorbanianas in #65
Add Support for Configuring Secrets for Multiple Model Providers by @dorbanianas in #66
Fix Windows Build by Enabling C++17 by @dorbanianas in #67
Prompt Manager Refactoring by @dorbanianas in #68
Integrate New Prompt Manager Implementation within Scalar and Aggregate Functions by @dorbanianas in #69
Refactor and Improve Model Manager Implementation by @dorbanianas in #70
move the custom parser from flockmtl/core/parser into flockmtl/custom… by @dorbanianas in #73
Enhanced String Manipulation with duckdb_fmt::format by @dorbanianas in #75
Fix the missing double quotation mark by @dorbanianas in #76
Refactor/functions registry by @dorbanianas in #77
Refactor the scalar and aggregate functions by @dorbanianas in #78
Add the full website with the GH action by @dorbanianas in #81
Fix deploy gh action by @dorbanianas in #82
Enable workflow dispatch by @dorbanianas in #83
Remove Env Variable Requirement and Add Support for New Secrets in Secret Manager by @dorbanianas in #80
Fix the redifinition of supported providers by @dorbanianas in #84
fix typo and add paths ignore to main distribution pipeline by @dorbanianas in #86
Fix secret queries by @dorbanianas in #87
Udpate the readme file by @dorbanianas in #85
udpate the azure type into azure_llm in secret manager by @dorbanianas in #88
Add Product Review Notebook Script & Edit the GH Action Paths by @dorbanianas in #89
Update README.md by @dorbanianas in #90

Full Changelog: v0.1.0...v0.2.0

Contributors

yassersun and anasdorbani

Assets 14

21 Oct 23:58

queryproc

v0.1.0

1bd8ac0

FlockMTL v0.1.0 "Schwartz's Deli"

FlockMTL v0.1.0 is a DuckDB extension that integrates language model (LLM) capabilities directly into your queries and workflows. This experimental extension enables DuckDB users to add semantic analysis (classification, filtering, completion, all w/ structured output) and embeddings using GPT models—all from within SQL commands. Following the tradition of declarativity, we introduce an administrative view of MODEL(s) and PROMPT(s) akin to TABLE(s).

Key Features:

LLM Integration: Seamlessly call OpenAI models, such as GPT-4, directly from SQL with different function signatures to obtain structured output, VARCHAR completions, boolean filters.
Prompt Definition: Define and store custom prompts for repeated use across queries. Similarly, fix the query once with a prompt name and iterate on the prompt separately or even change it later.
Model Definition: Define and switch between different models by giving them names and updating the models name without a change to the query.

The full documentation can be found in our repo's README file.

Next, we demonstrate how to use the DuckDB LLM Extension and using product reviews analysis as an examples using a product_reviews table with attributes review_text, review_id, and customer_name. You can also manage LLM prompts and models but we do not show those examples here.

1. Text Generation with `llm_complete`

llm_complete generates text based on a given prompt and LLM model. The following examples show how this functionality can be used in real-world scenarios.

Basic Text Generation – Product Description:
Imagine you have a list of product names, and you want to automatically generate product descriptions based on the product name.

SELECT product_name, llm_complete('generate_product_description', 'default', {'product_name': product_name}, {'max_tokens': 150}) AS product_description 
FROM products;

2. JSON Output with `llm_complete_json`

llm_complete_json is useful when you want structured output, such as detailed classification or multi-part answers.

Example: Sentiment and Detailed Analysis:
Classify whether a review is positive or negative and provide a detailed analysis of the sentiment in JSON format.
```
SELECT review_id, llm_complete_json('detailed_sentiment_analysis', 'default', {'text': review_text}, {'max_tokens': 200}) AS analysis 
FROM product_reviews;
```
Result:

review_id analysis

112233 {"sentiment": "negative", "confidence": 0.85, "keywords": ["bad service", "slow response"]}

review_id	analysis
112233	{"sentiment": "negative", "confidence": 0.85, "keywords": ["bad service", "slow response"]}

3. Filtering with `llm_filter`

Use llm_filter to filter records based on LLM-powered classifications, such as identifying specific sentiments or themes in product reviews.

Example: Filter Reviews Mentioning Specific Features:
Filter reviews to return only those that mention specific product features, such as "battery life" or "performance."
```
SELECT review_id, customer_name, review_text 
FROM product_reviews 
WHERE llm_filter('mentions_feature', 'default', {'text': review_text, 'feature': 'battery life'});
```
Result:

review_id customer_name review_text

56789 John Doe "The battery life on this phone is fantastic! It lasted me a full two days without needing a recharge."

review_id	customer_name	review_text
56789	John Doe	"The battery life on this phone is fantastic! It lasted me a full two days without needing a recharge."

4. Text Embedding with `llm_embedding`

llm_embedding generates vector embeddings for text, which can be used for tasks like semantic similarity, clustering, or advanced search.

Example: Find Reviews Similar to a Target Review:
Generate embeddings for each review and compare them to a target review to find similar reviews based on their semantic content.

WITH input_embedding AS (
    SELECT llm_embedding({'text': 'Comfortable to wear, but the sound quality is just average. Expected better for the price.'},
             'text-embedding-3-small') AS embedding
),
review_embeddings AS (
    SELECT 
        review_id, 
        array_distance(input_embedding.embedding::DOUBLE[1536], 
                       llm_embedding({'text': review_text}, 'text-embedding-3-small')::DOUBLE[1536]) AS similarity
    FROM product_reviews, input_embedding
)
SELECT review_id AS similar_review_id
FROM review_embeddings
ORDER BY similarity
LIMIT 1;

Result:

similar_review_id
4

Assets 12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

What's Changed

New Contributors

Contributors

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

What's Changed

Contributors

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

1. Text Generation with `llm_complete`

2. JSON Output with `llm_complete_json`

3. Filtering with `llm_filter`

4. Text Embedding with `llm_embedding`

Uh oh!

Releases: dais-polymtl/flock

v0.2.1 Enhancement Release

What's Changed

New Contributors

Contributors

Uh oh!

FlockMTL v0.2.0 "St-Viateur Bagel"

What's Changed

Contributors

Uh oh!

FlockMTL v0.1.0 "Schwartz's Deli"

1. Text Generation with llm_complete

2. JSON Output with llm_complete_json

3. Filtering with llm_filter

4. Text Embedding with llm_embedding

Uh oh!

1. Text Generation with `llm_complete`

2. JSON Output with `llm_complete_json`

3. Filtering with `llm_filter`

4. Text Embedding with `llm_embedding`