πŸ“ Smart Notes Application

A powerful Flask-based notes application with advanced hybrid search capabilities and an AI-powered chatbot using OpenSearch and DeepSeek Chat for intelligent note exploration.

✨ Features

  • πŸ“ Complete Note Management: Create, edit, delete, and organize your notes
  • πŸ” Hybrid Search: Semantic + lexical search powered by OpenSearch
  • πŸ€– RAG Chatbot: AI assistant that answers questions using your notes via DeepSeek Chat
  • 🏷️ Organization: Categories, tags, and color-coding for easy note organization
  • ⭐ Favorites: Mark important notes for quick access
  • πŸ—‘οΈ Soft Delete: Recover accidentally deleted notes
  • πŸ‘€ User Authentication: Secure login and registration system
  • πŸ’… Modern UI: Beautiful interface with Tailwind CSS and custom SCSS
  • πŸ’¬ Chat Sessions: Maintain conversation history with the AI chatbot

πŸ—οΈ Architecture

┌─────────────┐     ┌──────────────┐     ┌─────────────────┐
│   Browser   │────▶│ Flask App    │────▶│   SQLite DB     │
│  (Frontend) │     │  (Backend)   │     │  (Notes, Users) │
└─────────────┘     └──────────────┘     └─────────────────┘
                           │
                           ├──────────────▶┌─────────────────┐
                           │               │  OpenSearch     │
                           │               │  - Hybrid Search│
                           │               │  - Embeddings   │
                           │               └─────────────────┘
                           │
                           └──────────────▶┌─────────────────┐
                                           │  DeepSeek API   │
                                           │  - RAG Chatbot  │
                                           └─────────────────┘

🚀 Tech Stack

Backend

  • Flask 3.1.2 - Web framework
  • SQLAlchemy - ORM for database operations
  • SQLite - Lightweight database
  • Flask-Login - User session management

Search & AI

  • OpenSearch 3.4.0 - Hybrid search engine with ML capabilities
  • Sentence Transformers - Text embeddings (all-MiniLM-L6-v2)
  • DeepSeek Chat API - RAG-powered conversational AI

Frontend

  • Tailwind CSS - Utility-first CSS framework
  • SCSS - CSS preprocessing
  • Vanilla JavaScript - Interactive UI components

DevOps

  • Docker & Docker Compose - Containerization
  • Node.js & npm - Build tools

📋 Prerequisites

Ensure you have the following installed:

  • Python 3.8 or higher
  • Node.js 14.x or higher
  • Docker and Docker Compose
  • npm or yarn
  • DeepSeek API key (get one from the DeepSeek API platform)

πŸ› οΈ Installation

1. Clone the Repository

git clone https://github.com/IsmailKattan/Notebook.git
cd Notebook

2. Create Virtual Environment

Windows:

python -m venv env
env\Scripts\activate

macOS/Linux:

python3 -m venv env
source env/bin/activate

3. Install Python Dependencies

pip install -r requirements.txt

4. Install Node.js Dependencies

npm install

5. Set Up Environment Variables

Create a .env file in the project root:

# Flask Configuration
SECRET_KEY=your-secret-key-here-change-in-production
SQLALCHEMY_DATABASE_URI=sqlite:///notebook.db

# OpenSearch Configuration
OPENSEARCH_HOST=localhost
OPENSEARCH_PORT=9200

# DeepSeek Configuration
DEEPSEEK_API_KEY=sk-your-deepseek-api-key-here
DEEPSEEK_MODEL=deepseek-chat

# Chatbot Configuration
MAX_CONTEXT_NOTES=5
MAX_CONTEXT_LENGTH=2000

6. Start OpenSearch

# Start OpenSearch in detached mode
docker-compose up -d

# Verify OpenSearch is running
curl http://localhost:9200

Wait 1-2 minutes for OpenSearch to fully initialize.

βš™οΈ OpenSearch Configuration

You have two options for configuring OpenSearch:

Option A: Automated Setup (Recommended)

Use the provided Python script to automatically configure everything:

# Make sure your virtual environment is activated
# Set your DeepSeek API key in environment
export DEEPSEEK_API_KEY=sk-your-deepseek-api-key-here  # Linux/macOS
# or
set DEEPSEEK_API_KEY=sk-your-deepseek-api-key-here     # Windows CMD
# or
$env:DEEPSEEK_API_KEY="sk-your-deepseek-api-key-here"  # Windows PowerShell

# Run the setup script
python opensearch_setup.py

The script will:

  • ✅ Configure all cluster settings
  • ✅ Register model groups and models
  • ✅ Create pipelines and indexes
  • ✅ Set up the DeepSeek connector
  • ✅ Create the RAG search pipeline
  • ✅ Output all environment variables for your .env file

Copy the generated environment variables to your .env file!

Option B: Manual Setup with cURL

If you prefer manual control or want to understand each step, run the following requests in order once OpenSearch is up:

Step 1: Configure ML Commons Settings

curl --location --request PUT 'http://localhost:9200/_cluster/settings' \
--header 'Content-Type: application/json' \
--data '{
  "persistent": {
    "plugins.ml_commons.allow_registering_model_via_url": "true",
    "plugins.ml_commons.only_run_on_ml_node": "false",
    "plugins.ml_commons.model_access_control_enabled": "true",
    "plugins.ml_commons.native_memory_threshold": "99"
  }
}'

Step 2: Register Model Group

curl --location 'localhost:9200/_plugins/_ml/model_groups/_register' \
--header 'Content-Type: application/json' \
--data '{
  "name": "note_search_with_highlighter",
  "description": "Models for note search with highlighter"
}'

πŸ“ Save the model_group_id from the response (e.g., aHkiSpsBv7PT9JWEQJcl)

Step 3: Register Sentence Transformer Model

curl --location 'http://localhost:9200/_plugins/_ml/models/_register' \
--header 'Content-Type: application/json' \
--data '{
  "name": "huggingface/sentence-transformers/all-MiniLM-L6-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT",
  "model_group": "aHkiSpsBv7PT9JWEQJcl"
}'

Replace aHkiSpsBv7PT9JWEQJcl with your model_group_id.

πŸ“ Save the task_id from the response (e.g., bXkjSpsBv7PT9JWEZZfG)

Step 4: Wait for Model Registration

curl --location 'http://localhost:9200/_plugins/_ml/tasks/bXkjSpsBv7PT9JWEZZfG'

Replace with your task_id. Wait until state: "COMPLETED".

πŸ“ Save the model_id from the response (e.g., cXkjSpsBv7PT9JWEbZeJ)

Step 5: Create Embedding Pipeline

curl --location --request PUT 'http://localhost:9200/_ingest/pipeline/note-embedding-pipeline' \
--header 'Content-Type: application/json' \
--data '{
  "description": "Generate embeddings for note content",
  "processors": [
    {
      "text_embedding": {
        "model_id": "cXkjSpsBv7PT9JWEbZeJ",
        "field_map": {
          "content": "content_embedding"
        }
      }
    }
  ]
}'

Replace cXkjSpsBv7PT9JWEbZeJ with your model_id.

Step 6: Create Notes Index

curl --location --request PUT 'http://localhost:9200/notes' \
--header 'Content-Type: application/json' \
--data '{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "default_pipeline": "note-embedding-pipeline",
      "knn": true
    },
    "analysis": {
      "analyzer": {
        "default": {
          "type": "standard"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "note_id": {"type": "integer"},
      "title": {"type": "text", "analyzer": "standard"},
      "content": {"type": "text", "analyzer": "standard"},
      "text_to_embed": {"type": "text"},
      "embedding": {
        "type": "knn_vector",
        "dimension": 384,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "lucene",
          "parameters": {
            "ef_construction": 128,
            "m": 24
          }
        }
      },
      "category": {"type": "keyword"},
      "tags": {"type": "keyword"},
      "user_id": {"type": "integer"},
      "created_at": {"type": "date"},
      "color": {"type": "keyword"}
    }
  }
}'
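
To sanity-check the index and its default pipeline, you can insert one note by hand and let OpenSearch generate the embedding on ingest. A minimal sketch (assumes the requests package; the field values are purely illustrative — in normal use the Flask app indexes notes for you through its OpenSearch service):

import requests

doc = {
    "note_id": 1,
    "title": "Grocery list",
    "content": "Buy oat milk, coffee beans, and bread.",
    "category": "personal",
    "tags": ["shopping"],
    "user_id": 1,
    "created_at": "2024-01-01T10:00:00Z",
    "color": "yellow",
}

# The index's default_pipeline (note-embedding-pipeline) runs on ingest,
# so an embedding field is added to the stored document automatically.
resp = requests.put("http://localhost:9200/notes/_doc/1", json=doc)
resp.raise_for_status()
print(resp.json()["result"])  # "created" on the first run, "updated" afterwards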

Step 7: Register Semantic Highlighter Model

curl --location 'http://localhost:9200/_plugins/_ml/models/_register?deploy=true' \
--header 'Content-Type: application/json' \
--data '{
  "name": "amazon/sentence-highlighting/opensearch-semantic-highlighter-v1",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT",
  "function_name": "QUESTION_ANSWERING"
}'

πŸ“ Save the task_id from the response (e.g., DRe_XJsBQ3Oda9xdkRE4)

Step 8: Wait for Highlighter Model Deployment

curl --location 'http://localhost:9200/_plugins/_ml/tasks/DRe_XJsBQ3Oda9xdkRE4'

Replace DRe_XJsBQ3Oda9xdkRE4 with your task_id. Wait until state: "COMPLETED".

πŸ“ Save the model_id from the response (e.g., Dhe_XJsBQ3Oda9xdmBFJ)

Step 9: Create Hybrid RRF Pipeline

curl --location --request PUT 'http://localhost:9200/_search/pipeline/hybrid-rrf-pipeline' \
--header 'Content-Type: application/json' \
--data '{
  "description": "Post processor for hybrid RRF search",
  "phase_results_processors": [
    {
      "score-ranker-processor": {
        "combination": {
          "technique": "rrf"
        }
      }
    }
  ]
}'
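
With this pipeline in place, a hybrid query sends one BM25 clause and one neural (k-NN) clause and lets RRF fuse the two rankings. A minimal sketch (assumes the requests package; substitute the embedding model_id you saved in Step 4, and adjust the vector field name if your pipeline writes embeddings to a field other than the "embedding" field mapped in Step 6):

import requests

EMBEDDING_MODEL_ID = "cXkjSpsBv7PT9JWEbZeJ"  # your model_id from Step 4
query_text = "meeting notes about the search project"

body = {
    "query": {
        "hybrid": {
            "queries": [
                {"match": {"content": query_text}},        # lexical (BM25)
                {"neural": {"embedding": {                  # semantic (k-NN)
                    "query_text": query_text,
                    "model_id": EMBEDDING_MODEL_ID,
                    "k": 10,
                }}},
            ]
        }
    },
    "size": 5,
}

resp = requests.post(
    "http://localhost:9200/notes/_search",
    params={"search_pipeline": "hybrid-rrf-pipeline"},
    json=body,
)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"].get("title"), hit["_score"])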

🤖 DeepSeek RAG Chatbot Configuration

Step 10: Create DeepSeek Connector

curl --location 'localhost:9200/_plugins/_ml/connectors/_create' \
--header 'Content-Type: application/json' \
--data '{
  "name": "DeepSeek Chat",
  "description": "Connector for DeepSeek Chat API",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "endpoint": "api.deepseek.com",
    "model": "deepseek-chat"
  },
  "credential": {
    "deepSeek_key": "sk-your-deepseek-api-key-here"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "headers": {
        "Content-Type": "application/json",
        "Authorization": "Bearer ${credential.deepSeek_key}"
      },
      "request_body": "{\"model\": \"${parameters.model}\", \"messages\": ${parameters.messages}}"
    }
  ]
}'

πŸ“ Save the connector_id from the response (e.g., 9ockcJsByj6d9U--59XV)

Step 11: Register DeepSeek Model

curl --location 'http://localhost:9200/_plugins/_ml/models/_register?deploy=true' \
--header 'Content-Type: application/json' \
--data '{
  "name": "DeepSeek Chat model",
  "function_name": "remote",
  "description": "DeepSeek Chat",
  "model_group": "aHkiSpsBv7PT9JWEQJcl",
  "connector_id": "9ockcJsByj6d9U--59XV"
}'

Replace aHkiSpsBv7PT9JWEQJcl with your model_group_id and 9ockcJsByj6d9U--59XV with your connector_id.

πŸ“ Save the task_id from the response

Step 12: Wait for Model Deployment

curl --location 'http://localhost:9200/_plugins/_ml/tasks/YOUR_TASK_ID'

Wait until state: "COMPLETED".

πŸ“ Save the model_id from the response (e.g., -ocscJsByj6d9U--oNXX)

Step 13: Create RAG Search Pipeline

curl --location --request PUT 'http://localhost:9200/_search/pipeline/my-conversation-search-pipeline-deepseek-chat' \
--header 'Content-Type: application/json' \
--data '{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "Notes RAG Pipeline",
        "description": "RAG pipeline using DeepSeek Chat",
        "model_id": "-ocscJsByj6d9U--oNXX",
        "context_field_list": ["title", "content", "category", "tags"],
        "system_prompt": "You are a helpful assistant that helps users explore and understand their personal notes. Use the provided context from the user'\''s notes to answer questions accurately and helpfully. Reference specific notes by title when relevant. If the context doesn'\''t contain enough information, acknowledge this politely. Be conversational and helpful.",
        "user_instructions": "Answer based on these notes from my collection"
      }
    }
  ]
}'

Replace -ocscJsByj6d9U--oNXX with your DeepSeek model_id.

Step 14: Test the RAG Pipeline

curl --location 'http://localhost:9200/notes/_search?search_pipeline=my-conversation-search-pipeline-deepseek-chat' \
--header 'Content-Type: application/json' \
--data '{
  "query": {
    "match": {
      "content": "test"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_question": "What notes do I have?",
      "llm_model": "deepseek-chat"
    }
  }
}'

✅ Configuration Complete! Proceed to the next section to build and run the application.

🎨 Build and Run

1. Build CSS Assets

# One-time build
npm run build

# Watch mode (auto-rebuild on changes)
npm run watch

2. Start the Application

python app.py

The application will be available at http://localhost:5000

3. Default Login

  • Username: ismail
  • Password: ismail

⚠️ Important: Change these credentials in production!

πŸ“ Project Structure

notebook/
├── app/
│   ├── __init__.py                 # Flask app factory
│   ├── models/                     # Database models
│   │   ├── user.py
│   │   ├── note.py
│   │   ├── category.py
│   │   ├── chat_session.py
│   │   └── chat_message.py
│   ├── routes/                     # Flask blueprints
│   │   ├── auth.py                 # Authentication
│   │   ├── main.py                 # Home/dashboard
│   │   ├── note.py                 # Note CRUD
│   │   └── chatbot.py              # Chatbot API
│   ├── services/                   # Business logic
│   │   ├── auth_service.py
│   │   ├── note_service.py
│   │   ├── category_service.py
│   │   ├── opensearch_service.py
│   │   ├── rag_service.py
│   │   ├── session_service.py
│   │   └── message_service.py
│   ├── static/                     # Static assets
│   │   ├── dist/                   # Compiled CSS
│   │   ├── src/                    # SCSS source
│   │   ├── js/                     # JavaScript
│   │   └── images/                 # Icons & images
│   ├── templates/                  # Jinja2 templates
│   │   ├── base.html
│   │   ├── index.html
│   │   ├── login.html
│   │   ├── register.html
│   │   └── chatbot_popup.html
│   └── utils/                      # Helper utilities
│       └── Note_search_result.py
├── instance/
│   └── notebook.db                 # SQLite database
├── app.py                          # Application entry point
├── config.py                       # Configuration
├── docker-compose.yml              # OpenSearch container
├── requirements.txt                # Python dependencies
├── package.json                    # Node.js dependencies
├── tailwind.config.js              # Tailwind configuration
└── .env                            # Environment variables

🔧 Configuration

Environment Variables

Edit .env to customize:

# Security
SECRET_KEY=your-secret-key

# Database
SQLALCHEMY_DATABASE_URI=sqlite:///notebook.db

# OpenSearch
OPENSEARCH_HOST=localhost
OPENSEARCH_PORT=9200

# DeepSeek
DEEPSEEK_API_KEY=sk-your-key
DEEPSEEK_MODEL=deepseek-chat

# RAG Settings
MAX_CONTEXT_NOTES=5
MAX_CONTEXT_LENGTH=2000
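
config.py reads these values at startup. As an illustration only (the real config.py in this repository may differ), a typical python-dotenv based loader looks like this, assuming the python-dotenv package is installed:

import os
from dotenv import load_dotenv

load_dotenv()  # pull variables from .env into the process environment

class Config:
    SECRET_KEY = os.environ.get("SECRET_KEY", "dev-only-secret")
    SQLALCHEMY_DATABASE_URI = os.environ.get("SQLALCHEMY_DATABASE_URI", "sqlite:///notebook.db")
    OPENSEARCH_HOST = os.environ.get("OPENSEARCH_HOST", "localhost")
    OPENSEARCH_PORT = int(os.environ.get("OPENSEARCH_PORT", 9200))
    DEEPSEEK_API_KEY = os.environ.get("DEEPSEEK_API_KEY", "")
    DEEPSEEK_MODEL = os.environ.get("DEEPSEEK_MODEL", "deepseek-chat")
    MAX_CONTEXT_NOTES = int(os.environ.get("MAX_CONTEXT_NOTES", 5))
    MAX_CONTEXT_LENGTH = int(os.environ.get("MAX_CONTEXT_LENGTH", 2000))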

🎯 Features Deep Dive

Hybrid Search

Combines two retrieval methods and fuses their results:

  1. Lexical Search: Traditional keyword matching using BM25
  2. Semantic Search: Vector similarity using sentence embeddings
  3. RRF Fusion: Reciprocal Rank Fusion merges the two rankings (see the toy sketch below)
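
A toy illustration of the RRF idea (not the OpenSearch implementation): every document scores 1 / (k + rank) in each result list it appears in, and the summed scores decide the fused order.

def rrf_fuse(result_lists, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["note-3", "note-7", "note-1"]    # BM25 order
semantic = ["note-7", "note-2", "note-3"]   # k-NN order
print(rrf_fuse([lexical, semantic]))        # notes ranked well in both lists come first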

RAG Chatbot

The chatbot uses Retrieval-Augmented Generation:

  1. Retrieval: Searches your notes using hybrid search
  2. Augmentation: Adds relevant notes as context
  3. Generation: DeepSeek Chat generates responses based on your notes (one round trip is sketched below)
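
In OpenSearch terms, all three stages happen in a single search call against the RAG pipeline from Step 13. A minimal sketch (assumes the requests package; the pipeline and model come from Steps 11-13):

import requests

question = "What notes do I have about the search project?"

body = {
    "query": {"match": {"content": question}},  # retrieval: find candidate notes
    "size": 5,
    "ext": {
        "generative_qa_parameters": {           # augmentation + generation settings
            "llm_question": question,
            "llm_model": "deepseek-chat",
        }
    },
}

resp = requests.post(
    "http://localhost:9200/notes/_search",
    params={"search_pipeline": "my-conversation-search-pipeline-deepseek-chat"},
    json=body,
)
# The RAG response processor typically returns its answer under ext.retrieval_augmented_generation.
print(resp.json()["ext"]["retrieval_augmented_generation"]["answer"])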

Benefits:

  • Answers are grounded in your actual notes
  • Cites specific notes when relevant
  • Maintains conversation history
  • Understands natural language queries

πŸ› Troubleshooting

OpenSearch Won't Start

# Check container status
docker ps -a

# View logs
docker-compose logs -f opensearch

# Restart
docker-compose restart opensearch

Model Registration Fails

  • Ensure sufficient memory (512MB minimum)
  • Check Java heap settings in docker-compose.yml
  • Verify the ML Commons plugin is active (a quick probe is sketched below)
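
A quick probe from Python (assumes the requests package): confirm the ML plugin is installed and list the registered models and their states.

import requests

plugins = requests.get("http://localhost:9200/_cat/plugins?format=json").json()
print(sorted({p["component"] for p in plugins}))  # expect "opensearch-ml" in this list

models = requests.post(
    "http://localhost:9200/_plugins/_ml/models/_search",
    json={"query": {"match_all": {}}, "size": 20},
).json()
for hit in models.get("hits", {}).get("hits", []):
    print(hit["_source"].get("name"), hit["_source"].get("model_state"))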

Chatbot Not Working

  • Verify DeepSeek API key is valid
  • Check RAG pipeline is created correctly
  • Test the DeepSeek connector/model on its own first (see the sketch below)
  • Review application logs for errors
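
To test the DeepSeek connection on its own, call the remote model's predict API directly. A sketch (assumes the requests package; use the DeepSeek model_id you saved in Step 12):

import requests

DEEPSEEK_MODEL_ID = "-ocscJsByj6d9U--oNXX"  # your model_id from Step 12

resp = requests.post(
    f"http://localhost:9200/_plugins/_ml/models/{DEEPSEEK_MODEL_ID}/_predict",
    json={"parameters": {"messages": [{"role": "user", "content": "Say hello"}]}},
)
print(resp.status_code)
print(resp.json())  # a DeepSeek chat completion should appear in the inference results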

CSS Not Loading

# Rebuild CSS
npm run build

# Check for errors
npm run watch

Database Issues

# Reset database (deletes all data!)
rm instance/notebook.db
python app.py

📊 API Endpoints

Notes API

  • GET / - Dashboard with all notes
  • POST /notes/create - Create new note
  • PUT /notes/<id>/edit - Edit note
  • DELETE /notes/<id>/delete - Soft delete note
  • POST /notes/search - Search notes

Chatbot API

  • GET /api/chatbot/health - Health check
  • POST /api/chatbot/chat - Send message to chatbot (example request below)
  • GET /api/chatbot/sessions - List chat sessions
  • POST /api/chatbot/sessions - Create new session
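
For example, a chat request from Python might look like the sketch below. The payload field names (message, session_id) and the login form field names are assumptions for illustration only — check app/routes/chatbot.py and app/routes/auth.py for the actual schemas. The endpoint requires an authenticated session.

import requests

session = requests.Session()

# Log in first so the session cookie is set (default dev credentials shown above).
# The form field names here are assumed, not confirmed.
session.post("http://localhost:5000/auth/login", data={"username": "ismail", "password": "ismail"})

resp = session.post(
    "http://localhost:5000/api/chatbot/chat",
    json={"message": "What notes do I have about groceries?", "session_id": 1},  # illustrative fields
)
print(resp.status_code, resp.json())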

Authentication API

  • POST /auth/register - User registration
  • POST /auth/login - User login
  • GET /auth/logout - User logout

🚀 Deployment

Production Checklist

  • Change default user credentials
  • Generate a strong SECRET_KEY (one way is shown below)
  • Use PostgreSQL instead of SQLite
  • Disable debug mode and serve with a production WSGI server (Flask 3 ignores FLASK_ENV)
  • Enable HTTPS
  • Configure proper OpenSearch cluster
  • Set up backup for database
  • Monitor API usage (DeepSeek)
  • Configure rate limiting
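
One way to generate a strong SECRET_KEY for the .env file:

import secrets

# 64 hex characters (~256 bits of randomness) is plenty for Flask session signing
print(secrets.token_hex(32))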

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Commit changes: git commit -am 'Add feature'
  4. Push to branch: git push origin feature-name
  5. Submit a pull request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • OpenSearch - Powerful search and analytics engine
  • DeepSeek - Advanced language model API
  • Flask - Lightweight web framework
  • Sentence Transformers - State-of-the-art text embeddings
  • Tailwind CSS - Modern utility-first CSS framework
