From d4bd228bf619b55c3fe4e8d527dbb1102d5e4d30 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 12 Sep 2025 11:45:10 +0000 Subject: [PATCH 1/2] Initial plan From 7ef86ff821fde745b8e8247be374c4c0818b32d3 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 12 Sep 2025 11:56:28 +0000 Subject: [PATCH 2/2] Complete Turing ES documentation improvements - comprehensive user-friendly guides Co-authored-by: alegauss <331174+alegauss@users.noreply.github.com> --- .../turing_0_getting_started.markdown | 6 + docs/_documentation/turing_7_faq.markdown | 6 + .../turing_8_troubleshooting.markdown | 6 + .../turing_9_best_practices.markdown | 6 + docs/turing/0.3.10/getting-started.adoc | 300 +++++++ .../0.3.10/turing-administration-guide.adoc | 40 +- docs/turing/0.3.10/turing-best-practices.adoc | 734 ++++++++++++++++ .../turing/0.3.10/turing-developer-guide.adoc | 734 +++++++++++++++- docs/turing/0.3.10/turing-faq.adoc | 585 +++++++++++++ .../0.3.10/turing-installation-guide.adoc | 95 +- .../turing/0.3.10/turing-troubleshooting.adoc | 811 ++++++++++++++++++ docs/turing/index.markdown | 70 +- 12 files changed, 3368 insertions(+), 25 deletions(-) create mode 100644 docs/_documentation/turing_0_getting_started.markdown create mode 100644 docs/_documentation/turing_7_faq.markdown create mode 100644 docs/_documentation/turing_8_troubleshooting.markdown create mode 100644 docs/_documentation/turing_9_best_practices.markdown create mode 100644 docs/turing/0.3.10/getting-started.adoc create mode 100644 docs/turing/0.3.10/turing-best-practices.adoc create mode 100644 docs/turing/0.3.10/turing-faq.adoc create mode 100644 docs/turing/0.3.10/turing-troubleshooting.adoc diff --git a/docs/_documentation/turing_0_getting_started.markdown b/docs/_documentation/turing_0_getting_started.markdown new file mode 100644 index 0000000..0270b59 --- /dev/null +++ 
b/docs/_documentation/turing_0_getting_started.markdown @@ -0,0 +1,6 @@ +--- +title: Getting Started Guide +description: Quick start guide to get up and running with Turing ES in 30 minutes. +docurl: /turing/0.3.10/getting-started/ +product: turing +--- \ No newline at end of file diff --git a/docs/_documentation/turing_7_faq.markdown b/docs/_documentation/turing_7_faq.markdown new file mode 100644 index 0000000..9a76849 --- /dev/null +++ b/docs/_documentation/turing_7_faq.markdown @@ -0,0 +1,6 @@ +--- +title: Frequently Asked Questions +description: Common questions and answers about Turing ES installation, configuration, and usage. +docurl: /turing/0.3.10/faq/ +product: turing +--- \ No newline at end of file diff --git a/docs/_documentation/turing_8_troubleshooting.markdown b/docs/_documentation/turing_8_troubleshooting.markdown new file mode 100644 index 0000000..61c637c --- /dev/null +++ b/docs/_documentation/turing_8_troubleshooting.markdown @@ -0,0 +1,6 @@ +--- +title: Troubleshooting Guide +description: Comprehensive guide to diagnose and resolve common Turing ES issues. +docurl: /turing/0.3.10/troubleshooting/ +product: turing +--- \ No newline at end of file diff --git a/docs/_documentation/turing_9_best_practices.markdown b/docs/_documentation/turing_9_best_practices.markdown new file mode 100644 index 0000000..b332a6f --- /dev/null +++ b/docs/_documentation/turing_9_best_practices.markdown @@ -0,0 +1,6 @@ +--- +title: Best Practices Guide +description: Proven strategies and recommendations for optimal Turing ES implementation and usage. 
+docurl: /turing/0.3.10/best-practices/ +product: turing +--- \ No newline at end of file diff --git a/docs/turing/0.3.10/getting-started.adoc b/docs/turing/0.3.10/getting-started.adoc new file mode 100644 index 0000000..a09298f --- /dev/null +++ b/docs/turing/0.3.10/getting-started.adoc @@ -0,0 +1,300 @@ += Viglet Turing ES: Getting Started Guide +Viglet Team +:page-layout: documentation +:organization: Viglet Turing +ifdef::backend-pdf[:toc: left] +:toclevels: 5 +:toc-title: Table of Content +:doctype: book +:revnumber: 0.3.10 +:revdate: 25-12-2024 +:source-highlighter: rouge +:pdf-theme: viglet +:pdf-themesdir: {docdir}/../themes/ +:page-breadcrumb-title: Getting Started Guide +:page-permalink: /turing/0.3.10/getting-started/ +:imagesdir: ../../../ +:page-pdf: /turing/turing-getting-started-0.3.10.pdf +:page-product: turing + +[preface] +== Welcome to Turing ES! + +Congratulations on choosing Viglet Turing ES! This guide will help you get up and running quickly with your new semantic search and AI-powered navigation system. + +*What you'll accomplish:* + +* ✅ Understand what Turing ES can do for you +* ✅ Install and configure Turing ES +* ✅ Create your first search site +* ✅ Index some sample content +* ✅ Experience the power of semantic search + +*Time required:* Approximately 30-45 minutes + +:numbered: + +== What is Turing ES? + +Viglet Turing ES is an intelligent search and navigation platform that goes beyond traditional keyword matching. It understands the meaning and context of your content, making it easier for users to find exactly what they're looking for. + +=== Key Features + +**🔍 Semantic Search** +Unlike traditional search engines that only match keywords, Turing ES understands the meaning behind queries. For example, searching for "apple fruit" will prioritize fruit-related content over technology articles about Apple Inc. 
+ +**🤖 AI-Powered Navigation** +Built-in Natural Language Processing (NLP) capabilities help organize and categorize content automatically. Choose from multiple NLP providers like OpenNLP, spaCy, CoreNLP, or OpenText Content Analytics. + +**💬 Chatbot Integration** +Create intelligent chatbots that can answer questions based on your content, providing instant support to your users. + +**⚡ Enterprise Ready** +Built on Apache Solr, Turing ES can handle large volumes of content with fast response times and high availability. + +=== Common Use Cases + +* *Corporate Intranets*: Help employees find policies, procedures, and resources quickly +* *E-commerce Sites*: Improve product discovery with intelligent search +* *Documentation Sites*: Enable users to find answers using natural language +* *Content Portals*: Organize and navigate large content repositories +* *Knowledge Bases*: Power intelligent customer support systems + +== Before You Begin + +=== System Requirements + +Make sure your system meets these minimum requirements: + +.Hardware Requirements +[%header,cols="1,2,2"] +|=== +|Component |Minimum |Recommended +|RAM |4 GB |8 GB or more +|CPU |2 cores |4 cores or more +|Storage |10 GB free space |20 GB or more +|=== + +.Software Requirements +[%header,cols="1,2"] +|=== +|Software |Requirement +|Java |Java 21 (JDK or JRE) +|Operating System |Windows 10+, macOS 10.14+, or Linux +|Web Browser |Chrome, Firefox, Safari, or Edge (latest versions) +|=== + +=== What You'll Need + +Before starting, gather these items: + +* ✅ Administrative access to your system +* ✅ Java 21 installed (we'll verify this) +* ✅ Sample content to index (optional - we provide examples) +* ✅ About 30-45 minutes of time + +== Step 1: Verify Java Installation + +Turing ES requires Java 21. Let's check if you have it installed: + +=== Windows Users + +1. Open Command Prompt (press `Win + R`, type `cmd`, press Enter) +2. Type the following command and press Enter: ++ +---- +java -version +---- ++ +3. 
You should see output similar to: ++ +---- +java version "21.0.1" 2023-10-17 +Java(TM) SE Runtime Environment (build 21.0.1+12-LTS-29) +Java HotSpot(TM) 64-Bit Server VM (build 21.0.1+12-LTS-29, mixed mode) +---- + +=== macOS/Linux Users + +1. Open Terminal +2. Type the following command and press Enter: ++ +---- +java -version +---- ++ +3. You should see similar output as above + +=== If Java 21 is Not Installed + +If you don't have Java 21, download it from: + +* https://www.oracle.com/java/technologies/downloads/[Oracle JDK 21] (Recommended) +* https://adoptium.net/[Eclipse Temurin JDK 21] (Free alternative) + +IMPORTANT: Make sure to set the `JAVA_HOME` environment variable to point to your Java installation directory. + +== Step 2: Download Turing ES + +1. Visit the https://github.com/openviglet/turing/releases/latest[latest release page] +2. Download the `viglet-turing.jar` file (approximately 271 MB) +3. Save it to a directory where you want to run Turing ES (e.g., `/opt/turing` on Linux/Mac or `C:\turing` on Windows) + +TIP: Choose a directory with a short path and no spaces for easier command-line usage. + +== Step 3: Start Turing ES + +Now let's start your Turing ES server: + +=== First Startup + +1. Open your command prompt or terminal +2. Navigate to the directory where you saved the JAR file +3. Run the following command: ++ +---- +java -jar viglet-turing.jar +---- ++ +4. Wait for the startup to complete (you'll see "Started TurApplication") +5. Open your web browser and go to: http://localhost:2700 + +The first startup will take a few minutes as Turing ES initializes the database and search engine. + +=== Login to the Admin Console + +1. In your browser, go to http://localhost:2700 +2. Use these default credentials: + * *Username:* `admin` + * *Password:* `admin` + +CAUTION: Remember to change the default password in production environments! 
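The "wait for the startup to complete" step in Step 3 can also be scripted. The following is a minimal sketch, not part of Turing ES: it polls the admin URL from this guide (`http://localhost:2700`) until the server answers. The function name `wait_until_up`, its parameters, and the choice to treat any non-5xx response as "up" are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url, timeout_s=300.0, interval_s=5.0, probe=None, sleep=time.sleep):
    """Poll `url` until it answers or `timeout_s` elapses.

    Returns True if the server answered in time, False otherwise.
    `probe` and `sleep` are injectable (hypothetical hooks) so the
    loop can be exercised without a running server.
    """

    def default_probe(target):
        try:
            with urllib.request.urlopen(target, timeout=5) as response:
                return response.status < 500
        except urllib.error.HTTPError as err:
            # The server is running but rejected the request (e.g. 401/403).
            return err.code < 500
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: still starting.
            return False

    check = probe or default_probe
    deadline = time.monotonic() + timeout_s
    while True:
        if check(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Calling `wait_until_up("http://localhost:2700")` right after launching the JAR blocks until the console is reachable or five minutes have passed, which covers the slower first startup described above.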
+ +== Step 4: Create Your First Site + +Once logged in, let's create your first search site: + +=== Create a New Site + +1. Click on *"Sites"* in the navigation menu +2. Click *"Add Site"* +3. Fill in the basic information: + * *Name:* `My First Site` + * *Description:* `A sample site to test Turing ES` + * *Default Locale:* Select your preferred language + +4. Click *"Save"* + +=== Configure Site Settings + +Your new site will have a unique identifier (like `my-first-site`). This will be used in API calls and URLs. + +== Step 5: Index Sample Content + +Let's add some content to see Turing ES in action: + +=== Using the Web Interface + +1. Navigate to *"Content"* > *"Import"* +2. You can upload documents in various formats: + * PDF files + * Word documents + * HTML files + * Plain text files + +=== Using the API (Optional) + +For developers, you can also use the REST API to index content: + +---- +curl -X POST http://localhost:2700/api/sn/my-first-site/import \ + -H "Content-Type: application/json" \ + -d '{ + "title": "Welcome to Turing ES", + "content": "This is my first document in Turing ES. It demonstrates semantic search capabilities.", + "url": "https://example.com/welcome", + "date": "2024-12-25" + }' +---- + +== Step 6: Test Your Search + +Now let's see Turing ES in action: + +=== Web Search Interface + +1. Go to http://localhost:2700/sites/my-first-site +2. Try searching for terms related to your indexed content +3. Notice how Turing ES provides: + * Relevant results even with partial matches + * Highlighted search terms + * Suggested queries + +=== API Search + +You can also test the search API directly: + +---- +curl "http://localhost:2700/api/sn/my-first-site/search?q=welcome" +---- + +This returns JSON results that you can integrate into your applications. + +== What's Next? + +Congratulations! You now have Turing ES up and running. 
Here are some next steps: + +=== Immediate Next Steps + +* 📖 Read the link:administration-guide/[Administration Guide] to learn about configuration options +* 🔧 Explore the link:developer-guide/[Developer Guide] for API integration examples +* 🔌 Check out link:connectors/[Connectors] to import content from various sources + +=== Advanced Configuration + +* Configure NLP providers for enhanced semantic understanding +* Set up user authentication and authorization +* Configure clustering for high availability +* Implement custom search rankings and rules + +=== Get Help + +* 💬 Join our https://github.com/openviglet/turing/discussions[Community Discussions] +* 🐛 Report issues on https://github.com/openviglet/turing/issues[GitHub Issues] +* 📧 Contact the team: opensource@viglet.com + +== Troubleshooting + +=== Common Issues + +**"Port 2700 already in use"** +Another application is using port 2700. You can start Turing ES on a different port: +---- +java -jar viglet-turing.jar --server.port=8080 +---- + +**"Java version not supported"** +Make sure you're using Java 21. Check your Java version with `java -version`. + +**"Out of memory errors"** +Increase JVM heap size: +---- +java -Xmx4g -jar viglet-turing.jar +---- + +**"Cannot connect to search engine"** +Turing ES includes an embedded Solr instance. If you see connection errors, wait a few minutes for full startup or check the logs for errors. + +== Summary + +You've successfully: + +* ✅ Installed and started Turing ES +* ✅ Created your first search site +* ✅ Indexed sample content +* ✅ Performed your first searches +* ✅ Learned about next steps + +Turing ES is now ready for you to explore its full potential. Whether you're building a corporate intranet, e-commerce site, or knowledge base, you have the foundation for intelligent search and navigation. + +Welcome to the world of semantic search! 
🎉 \ No newline at end of file diff --git a/docs/turing/0.3.10/turing-administration-guide.adoc b/docs/turing/0.3.10/turing-administration-guide.adoc index e1654a6..6690bd7 100644 --- a/docs/turing/0.3.10/turing-administration-guide.adoc +++ b/docs/turing/0.3.10/turing-administration-guide.adoc @@ -18,12 +18,48 @@ ifdef::backend-pdf[:toc: left] :page-product: turing [preface] -= Preface +== Welcome to Turing ES Administration -Viglet Turing ES (https://viglet.com/turing) is an open source solution (https://github.com/openviglet), which has Semantic Navigation and Chatbot as its main features. You can choose from several NLPs to enrich the data. All content is indexed in Solr as search engine. +This guide helps you configure, manage, and optimize your Viglet Turing ES installation. Whether you're a system administrator setting up your first search site or managing an enterprise deployment, this guide provides the knowledge you need. + +**What you'll learn:** + +* 🏗️ **Architecture Overview** - Understand how Turing ES components work together +* 🔧 **Configuration** - Set up NLP providers, data sources, and search settings +* 👥 **User Management** - Control access and permissions +* 📊 **Monitoring & Performance** - Keep your system running smoothly +* 🔒 **Security** - Implement best practices for production environments + +**Who should use this guide:** + +* System Administrators +* DevOps Engineers +* IT Managers +* Anyone responsible for maintaining Turing ES + +TIP: New to Turing ES? Start with the link:getting-started/[Getting Started Guide] first. :numbered: +== Understanding Turing ES + +Before diving into administration tasks, let's understand what you're working with. 
+ +**Viglet Turing ES** is an intelligent search and navigation platform that combines: + +* **Search Engine** (Apache Solr) - Handles indexing and search operations +* **NLP Processing** - Extracts meaning and context from content +* **Web Interface** - Provides user-friendly administration and search +* **REST APIs** - Enables integration with other systems + +=== How It All Works Together + +1. **Content Ingestion** → Content is imported from various sources (files, databases, APIs) +2. **NLP Processing** → Natural language processors analyze and enrich the content +3. **Indexing** → Processed content is stored in the search engine (Solr) +4. **Search & Discovery** → Users find content through web interface or APIs +5. **Continuous Learning** → The system improves based on user interactions + == Architecture [#img-architecture] diff --git a/docs/turing/0.3.10/turing-best-practices.adoc b/docs/turing/0.3.10/turing-best-practices.adoc new file mode 100644 index 0000000..228fc66 --- /dev/null +++ b/docs/turing/0.3.10/turing-best-practices.adoc @@ -0,0 +1,734 @@ += Viglet Turing ES: Best Practices Guide +Viglet Team +:page-layout: documentation +:organization: Viglet Turing +ifdef::backend-pdf[:toc: left] +:toclevels: 5 +:toc-title: Table of Content +:doctype: book +:revnumber: 0.3.10 +:revdate: 25-12-2024 +:source-highlighter: rouge +:pdf-theme: viglet +:pdf-themesdir: {docdir}/../themes/ +:page-breadcrumb-title: Best Practices +:page-permalink: /turing/0.3.10/best-practices/ +:imagesdir: ../../../ +:page-pdf: /turing/turing-best-practices-0.3.10.pdf +:page-product: turing + +[preface] +== Best Practices for Turing ES + +This guide shares proven strategies and recommendations for getting the most out of your Turing ES implementation. From initial setup to production optimization, these practices will help you build a robust, efficient, and user-friendly search experience. 
+ +**What you'll learn:** + +* 🏗️ **Setup & Configuration** - Start with a solid foundation +* 📊 **Content Strategy** - Organize and structure your data effectively +* ⚡ **Performance Optimization** - Keep your system running fast +* 🔒 **Security & Compliance** - Protect your data and users +* 📈 **Monitoring & Maintenance** - Ensure long-term success +* 👥 **User Experience** - Create search experiences users love + +:numbered: + +== Planning Your Implementation + +=== Content Analysis + +Before implementing Turing ES, understand your content landscape: + +**Content Audit Checklist:** +* ✅ **Volume**: How much content will you index? (documents, pages, records) +* ✅ **Types**: What formats? (PDFs, web pages, databases, images) +* ✅ **Languages**: Multi-language content requirements? +* ✅ **Growth Rate**: How quickly does content volume increase? +* ✅ **Update Frequency**: How often does content change? +* ✅ **Access Patterns**: Which content is accessed most frequently? + +**Example Content Matrix:** +[cols="2,1,2,2"] +|=== +|Content Type |Volume |Update Frequency |Priority + +|Product Catalog +|50,000 items +|Daily +|High + +|Documentation +|1,000 pages +|Weekly +|High + +|News Articles +|10,000 articles +|Daily +|Medium + +|Support Tickets +|100,000 records +|Hourly +|Low +|=== + +=== Architecture Planning + +**Small to Medium Deployments (< 1M documents):** +* Single Turing ES instance +* PostgreSQL database +* Embedded Solr +* 8GB RAM, 4 CPU cores + +**Large Deployments (1M+ documents):** +* Multiple Turing ES instances with load balancer +* PostgreSQL cluster or Oracle RAC +* SolrCloud cluster (3+ nodes) +* 16GB+ RAM per instance + +**High Availability Setup:** +* 3+ Turing ES instances across different availability zones +* Database replication/clustering +* Distributed search index +* Health monitoring and auto-failover + +=== Capacity Planning + +**Storage Requirements:** +* Index size is typically 20-30% of source content size +* Database storage for 
metadata: ~1KB per document +* Log storage: Plan for 1-2GB per month per instance + +**Performance Benchmarks:** +* Target search response time: < 200ms for 95% of queries +* Indexing throughput: 100-1000 documents per minute (depends on content complexity) +* Concurrent users: 50-500 per instance (depending on query complexity) + +== Content Strategy Best Practices + +=== Document Structure + +**Optimize Content for Search:** + +1. **Clear Titles**: Use descriptive, keyword-rich titles + * ✅ Good: "Turing ES Installation Guide for Windows" + * ❌ Poor: "Setup Instructions" + +2. **Structured Metadata**: Include consistent metadata fields + * Author, creation date, categories, tags + * Custom fields for domain-specific information + +3. **Content Hierarchy**: Organize with clear sections and headings + * Use proper heading structures (H1, H2, H3) + * Include table of contents for long documents + +**Example Document Structure:** +[source,json] +---- +{ + "title": "Customer Onboarding Process", + "content": "Detailed process description...", + "summary": "Brief overview of the onboarding steps", + "category": ["processes", "customer-service"], + "tags": ["onboarding", "workflow", "customer"], + "author": "Jane Smith", + "department": "Customer Success", + "date_created": "2024-12-25T10:00:00Z", + "date_updated": "2024-12-25T15:30:00Z", + "security_level": "internal", + "language": "en", + "content_type": "process-document" +} +---- + +=== Content Tagging Strategy + +**Implement Consistent Taxonomy:** + +1. **Hierarchical Categories**: + * Level 1: Broad topics (Products, Services, Support) + * Level 2: Specific areas (Product A, Installation, Troubleshooting) + * Level 3: Detailed topics (Feature X, Windows Install, Error Codes) + +2. **Tagging Guidelines**: + * Use lowercase, hyphenated tags: "customer-service", "api-documentation" + * Limit to 5-10 tags per document + * Create tag glossary for consistency + +3. 
**Metadata Standards**: + * Date formats: ISO 8601 (2024-12-25T10:00:00Z) + * Author format: "FirstName LastName" + * Categories: Use predefined vocabulary + +=== Multi-Language Content + +**Best Practices for Internationalization:** + +1. **Language Detection**: Enable automatic language detection +2. **Locale-Specific Fields**: Configure language-specific analyzers +3. **Translation Strategy**: + * Maintain separate documents per language + * Link related translations with common ID + +[source,json] +---- +{ + "title_en": "User Guide", + "title_es": "Guía del Usuario", + "title_pt": "Guia do Usuário", + "content_en": "English content...", + "content_es": "Contenido en español...", + "content_pt": "Conteúdo em português...", + "language": "en", + "translation_group": "user-guide-v1" +} +---- + +== Search Configuration Best Practices + +=== Field Boosting Strategy + +**Configure Relevance Scoring:** + +1. **Title Boost**: 3.0x - Titles are most important for relevance +2. **Summary/Description**: 2.0x - Brief descriptions carry weight +3. **Tags/Categories**: 2.5x - Metadata helps with precision +4. **Content Body**: 1.0x - Base relevance score +5. **Author/Source**: 1.5x - May be relevant for expert content + +**Example Configuration:** +[source,json] +---- +{ + "field_boosts": { + "title": 3.0, + "summary": 2.0, + "tags": 2.5, + "content": 1.0, + "author": 1.5, + "category": 2.0 + } +} +---- + +=== Search Interface Design + +**Create Intuitive Search Experiences:** + +1. **Search Box Placement**: + * Prominent position (top of page) + * Adequate size (minimum 400px wide) + * Placeholder text with examples + +2. **Auto-Complete Configuration**: + * Show 5-8 suggestions + * Include popular searches + * Mix of query completions and direct results + +3. **Faceted Search**: + * Start with 3-5 most useful facets + * Use clear, user-friendly labels + * Show result counts per facet value + +4. 
**Results Display**: + * Title (linked to full content) + * Snippet with search term highlighting + * Metadata (date, author, category) + * Relevance indicators + +**Search Interface Example:** +[source,html] +---- +
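<!-- Search interface markup was not recoverable from the source. Surviving text: facet heading "Filter by:", facet group "Content Type", result title "Installation Guide", result snippet "Step-by-step instructions for installing Turing ES..." -->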
+---- + +=== Query Understanding + +**Enhance Search Intelligence:** + +1. **Synonyms Configuration**: + * Business-specific terminology + * Common abbreviations and acronyms + * Alternative spellings and variations + +2. **Stop Words Management**: + * Remove domain-specific noise words + * Keep important business terms + * Language-specific considerations + +3. **Query Expansion**: + * Add related terms automatically + * Use thesaurus for term enrichment + * Implement "did you mean?" suggestions + +**Synonym Examples:** +[source,text] +---- +# Technology synonyms +api,rest api,web service,endpoint +login,signin,authentication,auth +config,configuration,settings,setup + +# Business synonyms +customer,client,user +purchase,buy,order +support,help,assistance +---- + +== Performance Optimization + +=== Indexing Performance + +**Optimize Content Processing:** + +1. **Batch Processing**: + * Index documents in batches of 100-1000 + * Use bulk import APIs when possible + * Schedule large imports during off-peak hours + +2. **Content Optimization**: + * Limit document size (< 10MB per document) + * Extract text efficiently from binary formats + * Remove unnecessary whitespace and formatting + +3. **NLP Configuration**: + * Choose appropriate NLP providers for your content + * Balance processing quality vs. speed + * Consider caching NLP results for similar content + +**Batch Import Example:** +[source,bash] +---- +# Good: Batch import +curl -X POST http://localhost:2700/api/sn/site/import/bulk \ + -H "Content-Type: application/json" \ + -d '[ + {"title": "Doc 1", "content": "..."}, + {"title": "Doc 2", "content": "..."}, + {"title": "Doc 3", "content": "..."} + ]' + +# Avoid: one request per document in a loop +# (docs/*.json is a hypothetical directory of per-document JSON files) +for doc in docs/*.json; do + curl -X POST .../import -d @"$doc" +done +---- + +=== Search Performance + +**Optimize Query Response Times:** + +1. 
**Query Optimization**: + * Use specific field searches when possible + * Implement query result caching + * Limit result set sizes with pagination + +2. **Index Optimization**: + * Regular index optimization (weekly) + * Remove unused fields from index + * Configure appropriate field types + +3. **Caching Strategy**: + * Cache popular searches + * Use CDN for static search interface assets + * Implement browser caching for results + +**Performance Monitoring:** +[source,bash] +---- +# Monitor search response times +curl -w "@curl-format.txt" "http://localhost:2700/api/sn/site/search?q=test" + +# curl-format.txt contents: + time_namelookup: %{time_namelookup}\n + time_connect: %{time_connect}\n + time_appconnect: %{time_appconnect}\n + time_pretransfer: %{time_pretransfer}\n + time_redirect: %{time_redirect}\n + time_starttransfer: %{time_starttransfer}\n + ----------\n + time_total: %{time_total}\n +---- + +=== System Resource Management + +**Optimize Hardware Usage:** + +1. **Memory Management**: + * Allocate 50-70% of system RAM to Java heap + * Leave sufficient memory for OS and other processes + * Monitor garbage collection frequency + +2. **CPU Optimization**: + * Use appropriate number of processing threads + * Balance between concurrent users and processing power + * Consider CPU-intensive NLP processing + +3. **Storage Optimization**: + * Use SSD storage for search index + * Implement log rotation + * Archive old content appropriately + +**JVM Tuning Example:** +[source,bash] +---- +# Production JVM settings +java -Xmx8g -Xms4g \ + -XX:+UseG1GC \ + -XX:MaxGCPauseMillis=200 \ + -XX:+UseStringDeduplication \ + -XX:+UnlockExperimentalVMOptions \ + -XX:G1NewSizePercent=20 \ + -XX:G1MaxNewSizePercent=30 \ + -jar viglet-turing.jar +---- + +== Security & Compliance + +=== Access Control + +**Implement Proper Security:** + +1. 
**Authentication Strategy**: + * Change default passwords immediately + * Use strong password policies + * Implement multi-factor authentication for admin accounts + +2. **Authorization Model**: + * Role-based access control (RBAC) + * Principle of least privilege + * Regular access reviews + +3. **API Security**: + * Use HTTPS in production + * Implement rate limiting + * Validate all input parameters + +**Security Configuration Example:** +[source,properties] +---- +# Strong password requirements +turing.security.password.min-length=12 +turing.security.password.require-special-chars=true +turing.security.password.require-numbers=true + +# Session security +server.servlet.session.cookie.secure=true +server.servlet.session.cookie.http-only=true +server.servlet.session.timeout=30m + +# API rate limiting +turing.api.rate-limit.requests-per-minute=100 +turing.api.rate-limit.burst-capacity=20 +---- + +=== Data Protection + +**Protect Sensitive Information:** + +1. **Content Classification**: + * Identify sensitive content types + * Implement content-level security + * Use encryption for sensitive data + +2. **Privacy Compliance**: + * GDPR compliance for EU users + * Data retention policies + * Right to deletion implementation + +3. **Audit Logging**: + * Log all administrative actions + * Track content access patterns + * Implement log retention policies + +**Data Classification Example:** +[source,json] +---- +{ + "title": "Employee Handbook", + "content": "...", + "security_classification": "internal", + "access_groups": ["employees", "hr"], + "retention_period": "7years", + "contains_pii": false, + "geographic_restrictions": ["EU", "US"] +} +---- + +=== Backup & Recovery + +**Implement Comprehensive Backup Strategy:** + +1. **Database Backups**: + * Daily automated backups + * Test restore procedures monthly + * Off-site backup storage + +2. 
**Search Index Backups**: + * Index snapshots before major updates + * Replication to secondary sites + * Quick restore capabilities + +3. **Configuration Backups**: + * Version control for configuration files + * Document all customizations + * Export/import procedures for site configurations + +**Backup Script Example:** +[source,bash] +---- +#!/bin/bash +# Daily backup script +set -euo pipefail + +DATE=$(date +%Y%m%d) +BACKUP_DIR="/backups/turing-$DATE" + +# Create backup directory +mkdir -p "$BACKUP_DIR" + +# Database backup +pg_dump -h localhost -U turing turing > "$BACKUP_DIR/database.sql" + +# Solr index backup +# Note: Solr runs this backup asynchronously; confirm it has finished +# (for example with command=details) before compressing the directory. +curl "http://localhost:8983/solr/turing/replication?command=backup&location=$BACKUP_DIR" + +# Configuration backup +cp -r /opt/turing/config "$BACKUP_DIR/" + +# Compress backup +tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR" +rm -rf "$BACKUP_DIR" + +echo "Backup completed: $BACKUP_DIR.tar.gz" +---- + +== Monitoring & Maintenance + +=== Health Monitoring + +**Implement Comprehensive Monitoring:** + +1. **System Metrics**: + * CPU, memory, disk usage + * Network connectivity + * JVM heap and garbage collection + +2. **Application Metrics**: + * Search response times + * Indexing throughput + * Error rates and types + +3. **Business Metrics**: + * Search volume and patterns + * Content usage statistics + * User satisfaction indicators + +**Monitoring Setup Example:** +[source,yaml] +---- +# Docker Compose file: Turing ES with Prometheus and Grafana +version: '3' +services: + turing: + image: viglet/turing + ports: + - "2700:2700" + environment: + - MANAGEMENT_ENDPOINTS_WEB_EXPOSURE_INCLUDE=health,metrics,prometheus + + prometheus: + image: prom/prometheus + ports: + - "9090:9090" + volumes: + - ./prometheus.yml:/etc/prometheus/prometheus.yml + + grafana: + image: grafana/grafana + ports: + - "3000:3000" +---- + +=== Regular Maintenance + +**Establish Maintenance Routines:** + +1. **Daily Tasks**: + * Monitor system health dashboards + * Check error logs for anomalies + * Verify backup completion + +2. 
**Weekly Tasks**: + * Review search analytics + * Optimize search index + * Update content that's changed + +3. **Monthly Tasks**: + * Security patch updates + * Performance trend analysis + * Capacity planning review + +4. **Quarterly Tasks**: + * Disaster recovery testing + * Security audit and access review + * Content strategy evaluation + +**Maintenance Checklist:** +[cols="2,1,3"] +|=== +|Task |Frequency |Notes + +|System Health Check +|Daily +|CPU, memory, disk space, error logs + +|Index Optimization +|Weekly +|Improve search performance + +|Security Updates +|Monthly +|OS, Java, application patches + +|Backup Testing +|Monthly +|Verify restore procedures work + +|Performance Review +|Monthly +|Response times, throughput analysis + +|Disaster Recovery Test +|Quarterly +|Full failover testing + +|Content Audit +|Quarterly +|Remove outdated, add missing content +|=== + +=== Analytics & Insights + +**Leverage Search Data for Improvement:** + +1. **Query Analysis**: + * Most popular search terms + * Zero-result queries (improve content) + * Query refinement patterns + +2. **Content Performance**: + * Most/least accessed content + * Content freshness analysis + * User engagement metrics + +3. **User Behavior**: + * Search session patterns + * Click-through rates + * Time spent with results + +**Analytics Dashboard Metrics:** +* **Search Volume**: Queries per day/hour +* **Performance**: Average response time, 95th percentile +* **Success Rate**: Queries returning results vs. zero results +* **Content Health**: Freshness, coverage, quality scores +* **User Satisfaction**: Click-through rates, session duration + +== User Experience Best Practices + +=== Search Interface Design + +**Create Intuitive Experiences:** + +1. **Progressive Disclosure**: + * Start with simple search + * Reveal advanced options as needed + * Don't overwhelm new users + +2. 
**Responsive Design**: + * Mobile-first approach + * Touch-friendly interface elements + * Consistent experience across devices + +3. **Accessibility**: + * Keyboard navigation support + * Screen reader compatibility + * High contrast options + +=== Content Discovery + +**Help Users Find What They Need:** + +1. **Faceted Navigation**: + * Clear category hierarchies + * Visual indicators of filter states + * Easy filter removal + +2. **Related Content**: + * "More like this" suggestions + * Popular content recommendations + * Trending topics + +3. **Search Guidance**: + * Search tips and examples + * Query suggestion improvements + * Clear error messages and recovery paths + +=== Performance Expectations + +**Meet User Performance Standards:** + +1. **Response Time Targets**: + * Search results: < 200ms + * Auto-complete: < 100ms + * Content loading: < 1 second + +2. **Progressive Loading**: + * Show search interface immediately + * Load results as they become available + * Provide loading indicators + +3. **Offline Capabilities**: + * Cache recent searches + * Provide basic functionality without connectivity + * Clear messaging about offline state + +== Conclusion + +Implementing these best practices will help you create a robust, efficient, and user-friendly search experience with Turing ES. Remember: + +* **Start Simple**: Begin with basic functionality and improve iteratively +* **Monitor Continuously**: Use metrics to guide improvements +* **Think Like Users**: Design from the user's perspective +* **Plan for Growth**: Build scalable solutions from the start +* **Stay Secure**: Implement security best practices from day one + +Success with Turing ES comes from understanding your users' needs, organizing content effectively, and continuously optimizing based on real usage patterns. The effort invested in following these practices will pay dividends in user satisfaction and system reliability. 
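The response-time targets and the 95th-percentile metric mentioned in this guide can be checked with a few lines of code. The following is a minimal sketch, assuming latency samples in milliseconds have already been collected (for example from access logs). The sample values and the helper names `percentile` and `meets_target` are hypothetical and not part of Turing ES.

```python
import math

# Targets from the "Response Time Targets" list, in milliseconds.
TARGETS_MS = {"search": 200, "autocomplete": 100}


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def meets_target(samples, kind, pct=95):
    """True if the pct-th percentile latency is below the target for `kind`."""
    return percentile(samples, pct) < TARGETS_MS[kind]


# Hypothetical search latencies (ms); replace with real measurements.
search_latencies = [23, 41, 38, 120, 95, 60, 180, 45, 52, 210]
print(percentile(search_latencies, 95), meets_target(search_latencies, "search"))
```

Checking a high percentile rather than the average keeps a handful of slow queries from hiding behind many fast ones.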
+ +For additional guidance, consult the link:faq/[FAQ], link:troubleshooting/[Troubleshooting Guide], or reach out to the https://github.com/openviglet/turing/discussions[community] for support. + +Happy searching! 🚀 \ No newline at end of file diff --git a/docs/turing/0.3.10/turing-developer-guide.adoc b/docs/turing/0.3.10/turing-developer-guide.adoc index d94f7a3..4a4ffdb 100644 --- a/docs/turing/0.3.10/turing-developer-guide.adoc +++ b/docs/turing/0.3.10/turing-developer-guide.adoc @@ -18,40 +18,736 @@ ifdef::backend-pdf[:toc: left] :page-product: turing [preface] -== Preface +== Welcome, Developer! 👩‍💻 -Viglet Turing ES (https://viglet.com/turing) is an open source solution (https://github.com/openviglet), which has Semantic Navigation and Chatbot as its main features. You can choose from several NLPs to enrich the data. All content is indexed in Solr as search engine. +Ready to integrate Turing ES into your applications or contribute to its development? This guide provides everything you need to get started with Turing ES APIs, SDKs, and development environment setup. + +**What you'll learn:** + +* 🚀 **Quick API Integration** - Get up and running with REST APIs +* 📡 **Search & Index APIs** - Build powerful search experiences +* 🔧 **Development Setup** - Contribute to Turing ES development +* 💡 **Code Examples** - Real-world integration patterns +* 🛠️ **SDKs & Libraries** - Available tools and frameworks + +**Popular Integration Patterns:** + +* Add semantic search to websites and applications +* Build intelligent chatbots and virtual assistants +* Create content discovery and navigation systems +* Integrate with existing CMS and e-commerce platforms + +TIP: Need to get Turing ES running first? Check the link:getting-started/[Getting Started Guide]. + +:numbered: + +== API Quick Start + +Let's get you making API calls to Turing ES in minutes. 
+ +=== Base URL Structure + +All Turing ES APIs follow this pattern: + +---- +http://localhost:2700/api/{service}/{site-id}/{endpoint} +---- + +Where: +* `{service}` - Service type (`sn` for Semantic Navigation, `se` for Search Engine) +* `{site-id}` - Your site identifier (e.g., "my-first-site") +* `{endpoint}` - Specific API endpoint + +=== Authentication + +Most read operations don't require authentication, but administrative operations do: + +[source,bash] +---- +curl -X POST http://localhost:2700/api/auth/login \ + -H "Content-Type: application/json" \ + -d '{"username": "admin", "password": "admin"}' +---- + +=== Your First Search + +Try this simple search request: + +[source,bash] +---- +curl "http://localhost:2700/api/sn/my-first-site/search?q=welcome" +---- + +Response: +[source,json] +---- +{ + "results": { + "numFound": 1, + "start": 0, + "docs": [ + { + "title": "Welcome to Turing ES", + "content": "This is my first document...", + "url": "https://example.com/welcome", + "score": 0.95 + } + ] + }, + "facets": {}, + "queryInfo": { + "query": "welcome", + "responseTime": 23 + } +} +---- + +=== Index Content + +Add content to your search index: + +[source,bash] +---- +curl -X POST http://localhost:2700/api/sn/my-first-site/import \ + -H "Content-Type: application/json" \ + -d '{ + "title": "Getting Started with APIs", + "content": "Learn how to integrate Turing ES APIs into your applications.", + "url": "https://example.com/api-guide", + "date": "2024-12-25T10:00:00Z", + "category": ["documentation", "api"] + }' +---- + +== Common API Endpoints + +=== Search APIs + +**Basic Search** +---- +GET /api/sn/{site}/search?q={query} +---- + +**Advanced Search with Filters** +---- +GET /api/sn/{site}/search?q={query}&fq=category:api&rows=20&start=0 +---- + +**Autocomplete Suggestions** +---- +GET /api/sn/{site}/ac?q={partial-query} +---- + +**Search by Date Range** +---- +GET /api/sn/{site}/search?q={query}&fq=date:[2024-01-01T00:00:00Z TO 2024-12-31T23:59:59Z] 
+---- + +=== Content Management APIs + +**Import Single Document** +---- +POST /api/sn/{site}/import +Content-Type: application/json +---- + +**Bulk Import** +---- +POST /api/sn/{site}/import/bulk +Content-Type: application/json +---- + +**Delete Document** +---- +DELETE /api/sn/{site}/document/{id} +---- + +=== NLP APIs + +**Spell Check** +---- +GET /api/nlp/spellcheck?q={text}&locale=en +---- + +**Language Detection** +---- +POST /api/nlp/detect-language +Content-Type: application/json +{"text": "Hello, this is a sample text"} +---- + +== SDK Examples + +=== JavaScript/Node.js + +[source,javascript] +---- +// Install: npm install axios + +const axios = require('axios'); + +class TuringClient { + constructor(baseUrl = 'http://localhost:2700', siteId) { + this.baseUrl = baseUrl; + this.siteId = siteId; + } + + async search(query, options = {}) { + const params = new URLSearchParams({ + q: query, + ...options + }); + + const response = await axios.get( + `${this.baseUrl}/api/sn/${this.siteId}/search?${params}` + ); + + return response.data; + } + + async addDocument(doc) { + return axios.post( + `${this.baseUrl}/api/sn/${this.siteId}/import`, + doc + ); + } +} + +// Usage +const client = new TuringClient('http://localhost:2700', 'my-site'); + +// Search +const results = await client.search('artificial intelligence'); +console.log(`Found ${results.results.numFound} results`); + +// Add document +await client.addDocument({ + title: 'AI in Modern Applications', + content: 'Artificial Intelligence is transforming...', + url: 'https://example.com/ai-guide' +}); +---- + +=== Python + +[source,python] +---- +# Install: pip install requests + +import requests +import json + +class TuringClient: + def __init__(self, base_url='http://localhost:2700', site_id=None): + self.base_url = base_url + self.site_id = site_id + + def search(self, query, **kwargs): + params = {'q': query, **kwargs} + url = f"{self.base_url}/api/sn/{self.site_id}/search" + + response = requests.get(url, 
params=params)
+        return response.json()
+
+    def add_document(self, doc):
+        url = f"{self.base_url}/api/sn/{self.site_id}/import"
+
+        response = requests.post(url, json=doc)
+        return response.json()
+
+# Usage
+client = TuringClient(site_id='my-site')
+
+# Search
+results = client.search('machine learning', rows=10)
+print(f"Found {results['results']['numFound']} results")
+
+# Add document
+client.add_document({
+    'title': 'Introduction to ML',
+    'content': 'Machine learning is a subset of AI...',
+    'url': 'https://example.com/ml-intro'
+})
+----
+
+=== Java
+
+[source,java]
+----
+// Add dependencies: OkHttp (HTTP client) and Gson (JSON mapping)
+
+import java.io.IOException;
+import java.net.URLEncoder;
+import java.nio.charset.StandardCharsets;
+
+import okhttp3.*;
+import com.google.gson.Gson;
+
+// SearchResponse and Document are application-defined classes that
+// mirror the JSON payloads used by the search and import endpoints.
+public class TuringClient {
+    private final OkHttpClient client = new OkHttpClient();
+    private final String baseUrl;
+    private final String siteId;
+    private final Gson gson = new Gson();
+
+    public TuringClient(String baseUrl, String siteId) {
+        this.baseUrl = baseUrl;
+        this.siteId = siteId;
+    }
+
+    public SearchResponse search(String query) throws IOException {
+        // Encode the query so spaces and special characters are safe in the URL
+        String url = String.format("%s/api/sn/%s/search?q=%s",
+            baseUrl, siteId, URLEncoder.encode(query, StandardCharsets.UTF_8));
+
+        Request request = new Request.Builder()
+            .url(url)
+            .build();
+
+        try (Response response = client.newCall(request).execute()) {
+            return gson.fromJson(response.body().string(),
+                SearchResponse.class);
+        }
+    }
+
+    public void addDocument(Document doc) throws IOException {
+        String url = String.format("%s/api/sn/%s/import", baseUrl, siteId);
+
+        RequestBody body = RequestBody.create(
+            gson.toJson(doc),
+            MediaType.get("application/json")
+        );
+
+        Request request = new Request.Builder()
+            .url(url)
+            .post(body)
+            .build();
+
+        // Close the response to release the connection, and surface HTTP errors
+        try (Response response = client.newCall(request).execute()) {
+            if (!response.isSuccessful()) {
+                throw new IOException("Import failed: HTTP " + response.code());
+            }
+        }
+    }
+}
+
+// Usage
+TuringClient client = new TuringClient("http://localhost:2700", "my-site");
+
+SearchResponse results = client.search("deep learning");
+System.out.println("Found " + results.getResults().getNumFound() + " results");
+----
+
+== Integration Patterns
+
+=== 
Website Search + +Add intelligent search to your website: + +[source,html] +---- + + + + Smart Search + + +
+ +
+
+ + + + +---- + +=== React Component + +[source,jsx] +---- +import React, { useState, useEffect } from 'react'; + +const TuringSearch = ({ siteId = 'my-site' }) => { + const [query, setQuery] = useState(''); + const [results, setResults] = useState([]); + const [loading, setLoading] = useState(false); + + useEffect(() => { + const search = async () => { + if (query.length < 2) { + setResults([]); + return; + } + + setLoading(true); + try { + const response = await fetch( + `http://localhost:2700/api/sn/${siteId}/search?q=${encodeURIComponent(query)}` + ); + const data = await response.json(); + setResults(data.results.docs); + } catch (error) { + console.error('Search failed:', error); + } finally { + setLoading(false); + } + }; + + const timeoutId = setTimeout(search, 300); // Debounce + return () => clearTimeout(timeoutId); + }, [query, siteId]); + + return ( +
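    <div>
+      <input
+        type="text"
+        value={query}
+        onChange={(e) => setQuery(e.target.value)}
+        placeholder="Search..."
+        className="search-input"
+      />
+
+      {loading && <div>Searching...</div>}
+
+      <div>
+        {results.map((doc, index) => (
+          <div key={index}>
+            <h3><a href={doc.url}>{doc.title}</a></h3>
+            <p>{doc.content?.substring(0, 200)}...</p>
+          </div>
+        ))}
+      </div>
+    </div>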
+ ); +}; + +export default TuringSearch; +---- :numbered: -== More Documentation +== Contributing to Turing ES Development + +Want to contribute to Turing ES? Here's how to set up your development environment. + +=== Development Environment Setup + +==== Prerequisites + +* *Java 21* - https://adoptium.net/temurin/releases/?package=jdk&version=21[Eclipse Temurin] (recommended) +* *Node.js 16+* - https://nodejs.org/en/download/[Download Node.js] +* *Git* - https://git-scm.com/downloads[Download Git] +* *IDE* - IntelliJ IDEA, VS Code, or Eclipse + +==== Technology Stack + +**Backend:** +* Spring Boot 3.x (REST APIs and web services) +* JPA/Hibernate (data persistence) +* Apache Solr (search engine) +* H2/PostgreSQL/Oracle/SQL Server (database options) + +**Frontend:** +* AngularJS (legacy UI - being migrated) +* Angular 12+ (new UI in development) +* Primer CSS (GitHub's design system) + +**Build & Deploy:** +* Maven (dependency management and builds) +* Docker & Docker Compose (containerization) +* GitHub Actions (CI/CD) + +=== Quick Development Setup + +==== 1. Clone the Repository + +[source,bash] +---- +git clone https://github.com/openviglet/turing.git +cd turing +---- + +==== 2. Build the Project + +[source,bash] +---- +# Build everything +mvn clean install + +# Or build specific modules +mvn clean install -pl turing-app +---- + +==== 3. Run in Development Mode + +**Option A: Full Development Mode (with UI hot reload)** +[source,bash] +---- +mvn spring-boot:run -pl turing-app +---- + +**Option B: API-only Mode (faster startup)** +[source,bash] +---- +mvn spring-boot:run -pl turing-app -Dspring-boot.run.profiles=dev-api +---- + +**Option C: With External Services (using Docker)** +[source,bash] +---- +# Start dependencies +docker-compose up -d + +# Run the application +mvn spring-boot:run -pl turing-app -Dspring-boot.run.profiles=external +---- + +=== Development Workflow + +==== Making Changes + +1. 
**Create a feature branch:**
+[source,bash]
+----
+git checkout -b feature/my-awesome-feature
+----
+
+2. **Make your changes** in the appropriate modules:
+   * `turing-app/` - Main application
+   * `turing-nlp/` - NLP integrations
+   * `turing-commons/` - Shared utilities
+
+3. **Test your changes:**
+[source,bash]
+----
+# Run unit tests
+mvn test
+
+# Run integration tests
+mvn integration-test
+
+# Run specific test class
+mvn test -Dtest=MyTestClass
+----
+
+4. **Submit a pull request** with:
+   * Clear description of changes
+   * Tests for new functionality
+   * Documentation updates if needed
+
+=== Project Structure
+
+----
+turing/
+├── turing-app/              # Main Spring Boot application
+│   ├── src/main/java/       # Java source code
+│   ├── src/main/resources/  # Configuration and static files
+│   └── src/main/angular/    # Angular frontend (new UI)
+├── turing-nlp/              # NLP provider integrations
+├── turing-commons/          # Shared utilities and models
+├── turing-connectors/       # Data source connectors
+└── docker-compose.yml       # Development dependencies
+----
+
+=== Debugging
+
+==== IDE Debugging
+
+**IntelliJ IDEA:**
+1. Open the project root in IntelliJ
+2. Create a Spring Boot run configuration for `turing-app`
+3. Set breakpoints and run in debug mode
+
+**VS Code:**
+1. Install Java Extension Pack
+2. Use the built-in Spring Boot debugging support
+3. Create `.vscode/launch.json` configuration
+
+==== Remote Debugging
+
+[source,bash]
+----
+mvn spring-boot:run -pl turing-app -Dspring-boot.run.jvmArguments="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005"
+----
+
+With `suspend=y` the JVM waits for a debugger before starting the application. Then connect your IDE debugger to `localhost:5005`.
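If port 5005 is already taken, or you would rather not have the JVM wait for a debugger, the remote debugging command can be parameterized. The following is a minimal sketch that assembles the Maven invocation with the JDK's `-agentlib:jdwp` agent option. The `debug_command` helper is hypothetical and exists only for illustration; the module name `turing-app` is taken from this guide.

```python
import shlex


def debug_command(port=5005, suspend=True, module="turing-app"):
    """Build the Maven command that starts the app with a JDWP debug agent."""
    agent = (
        "-agentlib:jdwp=transport=dt_socket,server=y,"
        f"suspend={'y' if suspend else 'n'},address={port}"
    )
    return [
        "mvn", "spring-boot:run", "-pl", module,
        f"-Dspring-boot.run.jvmArguments={agent}",
    ]


# Print a copy-pasteable command line for port 5006 without startup suspension.
print(shlex.join(debug_command(port=5006, suspend=False)))
```

Returning the command as a list keeps each argument intact, so it can also be passed directly to `subprocess.run` without shell quoting concerns.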
+ +=== Testing + +==== Unit Tests +[source,bash] +---- +# Run all unit tests +mvn test + +# Run tests for specific module +mvn test -pl turing-app + +# Run with coverage report +mvn test jacoco:report +---- + +==== Integration Tests +[source,bash] +---- +# Run integration tests +mvn integration-test + +# Run with test containers (requires Docker) +mvn integration-test -Dspring.profiles.active=testcontainers +---- + +==== API Testing +[source,bash] +---- +# Start the application +mvn spring-boot:run -pl turing-app + +# Test endpoints (in another terminal) +curl http://localhost:2700/api/sn/sample/search?q=test +---- + +=== Database Development -Technical documentation on Turing ES is available at https://docs.viglet.com/turing. +==== H2 Console (Development) +When running with H2 database, access the console at: +http://localhost:2700/h2-console -== Open Source Development +**Connection details:** +* JDBC URL: `jdbc:h2:mem:testdb` +* Username: `sa` +* Password: _(empty)_ -You can collaborate with Turing, participating in its development. Below are the steps to create your Turing environment. +==== Using PostgreSQL for Development -=== Development Structure +1. Start PostgreSQL with Docker: +[source,bash] +---- +docker run --name turing-postgres -e POSTGRES_DB=turing -e POSTGRES_USER=turing -e POSTGRES_PASSWORD=turing -p 5432:5432 -d postgres:13 +---- -==== Frameworks -Turing ES was developed using Spring Boot (https://spring.io/projects/spring-boot) for its backend. +2. Run with PostgreSQL profile: +[source,bash] +---- +mvn spring-boot:run -pl turing-app -Dspring-boot.run.profiles=postgresql +---- -The UI is currently using AngularJS (https://angularjs.org), but a new UI is being developed using Angular 12 (https://angular.io) with Primer CSS (https://primer.style/css). +=== Frontend Development -In addition to Java, you also need to have Git (https://git-scm.com/downloads) and NodeJS (https://nodejs.org/en/download/) installed. 
+==== Legacy AngularJS UI -==== Databases -By default it uses the H2 database (https://www.h2database.com), but can be changed to other databases using Spring Boot properties. It comes bundled with OpenNLP (https://opennlp.apache.org/) in the same JVM. +The current production UI is built with AngularJS. To work on it: -==== Programming Language and Deploy -It uses Java 21 (https://adoptium.net/temurin/releases/?package=jdk&version=21) and its deployment is done with Maven and works on Unix and Windows. +[source,bash] +---- +cd turing-app/src/main/resources/public -==== Docker -To use Semantic Navigation and Chatbot you must have a Solr (https://solr.apache.org) service available. If you prefer to work with all the services Turing depends on, you can use docker-compose (https://docs.docker.com/compose/install) to start these services, we use the Docker Desktop (https://www.docker.com/products/docker-desktop) installed on computer. +# Install dependencies (first time only) +npm install -==== IDE -You can use Spring Tools 4 for Eclipse (https://spring.io/tools) or Eclipse (https://www.eclipse.org/downloads/) or Visual Studio Code (https://code.visualstudio.com/) or IntelliJ (https://www.jetbrains.com/pt-br/idea/) as IDEs. +# Watch for changes (optional) +npm run watch +---- + +==== New Angular UI (In Development) + +The new UI is being built with Angular 12+: + +[source,bash] +---- +cd turing-app/src/main/angular + +# Install dependencies +npm install + +# Start development server +ng serve + +# Build for production +ng build +---- + +=== Docker Development + +==== Full Docker Setup + +[source,bash] +---- +# Build the Docker image +docker build -t viglet/turing . 
+ +# Run with Docker Compose (includes Solr, PostgreSQL) +docker-compose up -d + +# View logs +docker-compose logs -f turing +---- + +==== Development with External Services + +Use Docker only for dependencies: + +[source,bash] +---- +# Start only external services +docker-compose up -d solr postgres + +# Run Turing ES locally +mvn spring-boot:run -pl turing-app -Dspring-boot.run.profiles=external +---- + +=== Contribution Guidelines + +==== Code Style + +* Follow Java coding conventions +* Use meaningful variable and method names +* Add JavaDoc for public APIs +* Write unit tests for new functionality + +==== Pull Request Process + +1. Fork the repository +2. Create a feature branch +3. Make your changes with tests +4. Update documentation if needed +5. Submit a pull request with clear description + +==== Reporting Issues + +Before reporting bugs or requesting features: + +1. Check existing issues on GitHub +2. Provide detailed reproduction steps +3. Include system information and logs +4. 
Use issue templates when available + +=== Getting Help + +**Community Support:** +* GitHub Discussions: https://github.com/openviglet/turing/discussions +* Issues: https://github.com/openviglet/turing/issues + +**Documentation:** +* API Reference: link:../api-reference/[API Documentation] +* Architecture Guide: link:../architecture/[Architecture Overview] + +**Contact:** +* Email: opensource@viglet.com +* Twitter: @VigletTuring + +:numbered: === Download diff --git a/docs/turing/0.3.10/turing-faq.adoc b/docs/turing/0.3.10/turing-faq.adoc new file mode 100644 index 0000000..51c88d5 --- /dev/null +++ b/docs/turing/0.3.10/turing-faq.adoc @@ -0,0 +1,585 @@ += Viglet Turing ES: Frequently Asked Questions +Viglet Team +:page-layout: documentation +:organization: Viglet Turing +ifdef::backend-pdf[:toc: left] +:toclevels: 5 +:toc-title: Table of Content +:doctype: book +:revnumber: 0.3.10 +:revdate: 25-12-2024 +:source-highlighter: rouge +:pdf-theme: viglet +:pdf-themesdir: {docdir}/../themes/ +:page-breadcrumb-title: FAQ +:page-permalink: /turing/0.3.10/faq/ +:imagesdir: ../../../ +:page-pdf: /turing/turing-faq-0.3.10.pdf +:page-product: turing + +[preface] +== Frequently Asked Questions + +Got questions about Turing ES? You're in the right place! This FAQ covers the most common questions about installation, configuration, usage, and troubleshooting. + +Can't find your answer here? Check our https://github.com/openviglet/turing/discussions[Community Discussions] or https://github.com/openviglet/turing/issues[GitHub Issues]. + +:numbered: + +== General Questions + +=== What is Turing ES? + +**Viglet Turing ES** is an open-source semantic search and navigation platform. It combines artificial intelligence (AI) and natural language processing (NLP) to create intelligent search experiences that understand user intent rather than just matching keywords. 
+ +**Key features:** +* Semantic search that understands context and meaning +* AI-powered chatbots and virtual assistants +* Multiple NLP provider support (OpenNLP, spaCy, CoreNLP, OpenText) +* Built on Apache Solr for enterprise-grade performance +* RESTful APIs for easy integration + +=== How is Turing ES different from traditional search engines? + +Traditional search engines match keywords literally. Turing ES understands the *meaning* behind queries: + +**Traditional Search:** +Query: "apple fruit nutrition" +Results: Might return articles about Apple Inc. computers mixed with fruit content + +**Turing ES Semantic Search:** +Query: "apple fruit nutrition" +Results: Prioritizes nutritional information about apple fruits, understanding the context + +=== Is Turing ES free to use? + +Yes! Turing ES is completely free and open source under the Apache 2.0 license. You can: +* Use it commercially without restrictions +* Modify the source code for your needs +* Distribute your modifications +* Get support from the community + +=== What's the difference between Turing ES and Elasticsearch? + +Both are search platforms, but they serve different purposes: + +**Elasticsearch:** +* General-purpose search and analytics engine +* Requires separate NLP/AI components +* Focus on speed and scale +* More technical setup required + +**Turing ES:** +* Semantic search and navigation platform +* Built-in AI/NLP capabilities +* Focus on understanding user intent +* User-friendly administration interface + +== Installation & Setup + +=== What are the system requirements? + +**Minimum Requirements:** +* Java 21 (JDK or JRE) +* 4 GB RAM +* 10 GB disk space +* Windows 10+, macOS 10.14+, or Linux + +**Recommended for Production:** +* 8+ GB RAM +* 4+ CPU cores +* 50+ GB disk space (depending on content volume) +* Load balancer for high availability + +=== Can I use Java 17 or Java 11? + +No, Turing ES requires Java 21 specifically. 
Earlier versions are not supported due to dependencies on Java 21 features and Spring Boot 3.x requirements. + +To check your Java version: +---- +java -version +---- + +Download Java 21 from: +* https://adoptium.net/[Eclipse Temurin] (free) +* https://www.oracle.com/java/technologies/downloads/[Oracle JDK] + +=== How do I change the default port (2700)? + +You can change the port when starting Turing ES: + +---- +java -jar viglet-turing.jar --server.port=8080 +---- + +Or set it permanently in `application.properties`: +---- +server.port=8080 +---- + +=== Can I use an external database? + +Yes! While Turing ES uses H2 database by default, you can configure it to use: + +* PostgreSQL (recommended for production) +* Oracle Database +* Microsoft SQL Server +* MySQL/MariaDB + +Configure in `application.properties`: +---- +spring.datasource.url=jdbc:postgresql://localhost:5432/turing +spring.datasource.username=turing +spring.datasource.password=your-password +spring.jpa.database-platform=org.hibernate.dialect.PostgreSQLDialect +---- + +=== How do I run Turing ES in production? + +For production deployments: + +1. **Use external database** (PostgreSQL recommended) +2. **Configure proper memory settings:** + ---- + java -Xmx8g -Xms4g -jar viglet-turing.jar + ---- +3. **Set up SSL/HTTPS** with reverse proxy (nginx, Apache) +4. **Configure clustering** for high availability +5. **Set up monitoring** and log management +6. **Change default passwords** +7. **Configure backups** for database and search index + +== Configuration & Usage + +=== How do I create a new search site? + +1. Log into the admin console (http://localhost:2700) +2. Go to **Sites** > **Add Site** +3. Fill in the details: + * Name: Friendly name for your site + * Description: Brief description + * Default Locale: Primary language +4. Click **Save** + +Your site will get a unique identifier (e.g., "my-site") used in API calls. + +=== How do I index content? 
+ +There are several ways to add content to Turing ES: + +**Web Interface:** +1. Go to **Content** > **Import** +2. Upload files (PDF, Word, HTML, text) +3. Content is automatically processed and indexed + +**REST API:** +---- +curl -X POST http://localhost:2700/api/sn/my-site/import \ + -H "Content-Type: application/json" \ + -d '{ + "title": "Document Title", + "content": "Document content here...", + "url": "https://example.com/doc1" + }' +---- + +**Connectors:** +* File system connector +* Database connector +* Web crawler +* SharePoint connector +* And more... + +=== What file formats does Turing ES support? + +Turing ES can process many document formats: + +**Supported formats:** +* PDF documents +* Microsoft Word (.docx, .doc) +* PowerPoint presentations +* Excel spreadsheets +* HTML web pages +* Plain text files +* XML documents +* JSON data +* CSV files + +**Through connectors:** +* Database records +* Web pages (via crawler) +* SharePoint documents +* File system content + +=== How do I configure NLP providers? + +Turing ES supports multiple NLP providers for content analysis: + +1. Go to **NLP** > **Providers** +2. Click **Add Provider** +3. Choose from available options: + * **OpenNLP** (included, no setup required) + * **spaCy** (requires separate spaCy service) + * **Stanford CoreNLP** (requires CoreNLP server) + * **OpenText Content Analytics** (commercial) + +4. Configure endpoint URLs and API keys as needed +5. Enable the provider for your sites + +=== Can I customize search rankings? + +Yes! Turing ES provides several ways to influence search results: + +**Boost Fields:** +Increase relevance of specific fields (title, content, tags) + +**Targeting Rules:** +Create rules that promote specific content for certain queries + +**Custom Scoring:** +Implement custom relevance algorithms + +**Facet Configuration:** +Set up filters and categories for refined search + +Configure these in **Sites** > **Your Site** > **Search Configuration**. 
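To build intuition for what boosting a field does, here is a conceptual sketch. It is not how Turing ES or Solr computes relevance internally. It only illustrates the idea that a per-field match score is multiplied by that field's boost before the scores are combined. The field names, boost values, and scores are all made up for the example.

```python
def boosted_score(field_scores, boosts):
    """Combine per-field match scores into one score using field boosts."""
    return sum(
        score * boosts.get(field, 1.0)  # fields without a boost count once
        for field, score in field_scores.items()
    )


# Hypothetical boosts: a title match counts three times as much as a body match.
boosts = {"title": 3.0, "tags": 2.0, "content": 1.0}

# Hypothetical per-field match scores for two documents.
docs = {
    "doc-a": {"title": 0.2, "content": 0.9},
    "doc-b": {"title": 0.8, "content": 0.3},
}

ranked = sorted(docs, key=lambda d: boosted_score(docs[d], boosts), reverse=True)
print(ranked)
```

With these numbers, the document with the stronger title match ranks first even though its body match is weaker, which is the effect a title boost is meant to produce.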
+ +== API & Integration + +=== How do I integrate Turing ES with my website? + +The easiest way is using the REST API: + +**Basic search integration:** +---- +// JavaScript example +fetch(`http://localhost:2700/api/sn/my-site/search?q=${query}`) + .then(response => response.json()) + .then(data => { + // Display search results + data.results.docs.forEach(doc => { + console.log(doc.title, doc.content); + }); + }); +---- + +**Available APIs:** +* Search API - Perform searches +* Autocomplete API - Search suggestions +* Import API - Add/update content +* NLP APIs - Language processing +* Administration APIs - Site management + +=== Is there an SDK or client library? + +While there's no official SDK yet, the REST APIs are straightforward to use with any HTTP client: + +**Popular HTTP clients:** +* **JavaScript:** fetch(), axios, jQuery +* **Python:** requests, httpx +* **Java:** OkHttp, Apache HttpClient +* **PHP:** Guzzle, cURL +* **C#:** HttpClient +* **Go:** net/http + +We provide code examples in the link:developer-guide/[Developer Guide]. + +=== Can I use Turing ES with mobile apps? + +Absolutely! The REST APIs work great with mobile applications: + +**iOS:** Use URLSession or Alamofire +**Android:** Use Retrofit or OkHttp +**React Native:** Use fetch() or axios +**Flutter:** Use http package + +The JSON responses are mobile-friendly and efficient. + +=== How do I handle authentication in API calls? + +For read-only operations (search), authentication is usually not required. + +For administrative operations: + +1. **Get authentication token:** +---- +curl -X POST http://localhost:2700/api/auth/login \ + -H "Content-Type: application/json" \ + -d '{"username": "admin", "password": "admin"}' +---- + +2. 
**Use token in subsequent requests:** +---- +curl -X POST http://localhost:2700/api/sn/my-site/import \ + -H "Authorization: Bearer YOUR_TOKEN_HERE" \ + -H "Content-Type: application/json" \ + -d '{"title": "New Document", ...}' +---- + +== Troubleshooting + +=== Turing ES won't start - port already in use + +**Error:** "Port 2700 already in use" + +**Solutions:** +1. **Use different port:** + ---- + java -jar viglet-turing.jar --server.port=8080 + ---- + +2. **Find what's using port 2700:** + ---- + # Windows + netstat -ano | findstr :2700 + + # Linux/Mac + lsof -i :2700 + ---- + +3. **Kill the process** or choose a different port + +=== Out of memory errors + +**Error:** "OutOfMemoryError" or "GC overhead limit exceeded" + +**Solutions:** +1. **Increase heap size:** + ---- + java -Xmx4g -jar viglet-turing.jar + ---- + +2. **For large content volumes:** + ---- + java -Xmx8g -Xms4g -jar viglet-turing.jar + ---- + +3. **Check available system memory:** + ---- + # Linux/Mac + free -h + + # Windows + wmic OS get TotalVisibleMemorySize /value + ---- + +=== Search returns no results + +**Possible causes:** + +1. **No content indexed yet** + - Check **Content** > **Browse** in admin console + - Verify content was successfully imported + +2. **Wrong site ID in API calls** + - Check site identifier in **Sites** section + - Use exact site ID in API URLs + +3. **Content not committed to search index** + - Content indexing may take a few seconds + - Check **System** > **Indexing Status** + +4. **NLP processing failed** + - Check **Logs** for NLP errors + - Verify NLP providers are configured and running + +=== Cannot connect to database + +**Error:** Database connection failures + +**Solutions:** + +1. **For H2 (default):** Usually resolves automatically on restart + +2. 
**For external databases:** + - Verify database server is running + - Check connection settings in `application.properties` + - Ensure database and user exist + - Test connectivity: `telnet db-host 5432` + +3. **Check firewall settings** and network connectivity + +=== NLP provider not working + +**Issues with NLP processing:** + +1. **OpenNLP (built-in):** Should work automatically +2. **spaCy:** Ensure spaCy service is running on specified port +3. **CoreNLP:** Check CoreNLP server status and endpoint URL +4. **OpenText:** Verify license and service availability + +**Debugging steps:** +1. Check **NLP** > **Providers** status +2. View logs for NLP errors +3. Test NLP endpoint directly +4. Disable and re-enable provider + +=== Performance is slow + +**Common performance issues:** + +**Large result sets:** +- Use pagination (`rows=20&start=0`) +- Add filters to narrow results (`fq=category:news`) + +**Complex queries:** +- Optimize search queries +- Use field-specific searches when possible +- Configure proper field boosting + +**Hardware limitations:** +- Monitor CPU and memory usage +- Consider scaling horizontally +- Optimize database queries +- Add more RAM or CPU cores + +**Search index optimization:** +- Regular index optimization +- Remove unused fields +- Configure proper caching + +== Advanced Topics + +=== Can I customize the user interface? + +Yes, you can customize the Turing ES interface: + +**Built-in customization:** +* Themes and branding in admin console +* Custom CSS and JavaScript +* Logo and color scheme changes + +**Full customization:** +* Build custom search interface using APIs +* Integrate search into existing applications +* Create mobile apps with search functionality + +The new Angular-based UI (in development) will provide more customization options. + +=== How do I backup and restore Turing ES? 
+ +**Database backup:** +* H2: Copy the database files from the data directory +* PostgreSQL: Use `pg_dump` utility +* Other databases: Use vendor-specific backup tools + +**Search index backup:** +* Solr index files are in the Solr data directory +* Use Solr backup/restore APIs +* Consider replication for high availability + +**Configuration backup:** +* Export site configurations via admin interface +* Backup `application.properties` and custom configurations + +**Full restore process:** +1. Restore database from backup +2. Restore Solr index +3. Apply configuration files +4. Restart Turing ES + +=== How do I scale Turing ES for high traffic? + +**Horizontal scaling:** +* Load balancer in front of multiple Turing ES instances +* Shared database for all instances +* Distributed Solr cluster (SolrCloud) + +**Vertical scaling:** +* Increase RAM and CPU cores +* SSD storage for better I/O performance +* Optimize JVM settings + +**Caching strategies:** +* Redis for session and query caching +* CDN for static assets +* Database query optimization + +**Monitoring:** +* Application performance monitoring (APM) +* Database performance metrics +* Search response times +* System resource utilization + +== Getting Help + +=== Where can I get support? + +**Free Community Support:** +* https://github.com/openviglet/turing/discussions[GitHub Discussions] - Ask questions and share ideas +* https://github.com/openviglet/turing/issues[GitHub Issues] - Report bugs and request features +* Documentation - Comprehensive guides and tutorials + +**Professional Support:** +* Email: opensource@viglet.com +* Custom development and consulting services +* Enterprise support agreements + +**Social Media:** +* Twitter: @VigletTuring +* Facebook: facebook.com/viglet +* YouTube: Viglet channel with tutorials + +=== How do I report a bug? + +When reporting bugs, please include: + +1. **Turing ES version** (`java -jar viglet-turing.jar --version`) +2. **Java version** (`java -version`) +3. 
**Operating system** (Windows 10, Ubuntu 20.04, etc.) +4. **Steps to reproduce** the issue +5. **Expected vs actual behavior** +6. **Error logs** (check console output and log files) +7. **Configuration details** (if relevant) + +Submit bug reports at: https://github.com/openviglet/turing/issues + +=== How can I contribute to Turing ES? + +We welcome all kinds of contributions: + +**Code contributions:** +* Bug fixes and improvements +* New features and enhancements +* Performance optimizations +* Documentation improvements + +**Non-code contributions:** +* Testing and bug reports +* Documentation writing and translation +* Community support and discussions +* Tutorials and examples + +**Getting started:** +1. Fork the repository +2. Read the link:developer-guide/[Developer Guide] +3. Join GitHub discussions +4. Submit pull requests + +=== Is there professional training available? + +While formal training programs are not currently available, we provide: + +**Free learning resources:** +* Comprehensive documentation +* Video tutorials on YouTube +* Code examples and demos +* Community workshops (occasional) + +**Custom training:** +* On-site training for enterprise customers +* Custom workshops for development teams +* Consulting services for complex implementations + +Contact opensource@viglet.com for professional training inquiries. + +== Still Have Questions? + +If you can't find the answer to your question here: + +1. **Search the documentation** - Use the search function on this site +2. **Check GitHub Discussions** - Someone may have asked the same question +3. **Join the community** - Ask questions and help others +4. **Contact us directly** - opensource@viglet.com + +We're here to help make your Turing ES experience successful! 
🚀 \ No newline at end of file diff --git a/docs/turing/0.3.10/turing-installation-guide.adoc b/docs/turing/0.3.10/turing-installation-guide.adoc index 227ad6e..bd6ab2f 100644 --- a/docs/turing/0.3.10/turing-installation-guide.adoc +++ b/docs/turing/0.3.10/turing-installation-guide.adoc @@ -17,10 +17,103 @@ ifdef::backend-pdf[:toc: left] :page-pdf: /turing/turing-installation-guide-0.3.10.pdf :page-product: turing -include::_adoc_includes/turing/0.3.9/installation/preface.adoc[] +[preface] +== Preface + +Viglet Turing ES (https://viglet.com/turing) is an open source solution (https://github.com/openviglet) whose main features are Semantic Navigation and Chatbot. You can choose from several NLP providers to enrich the data. All content is indexed in Solr, which serves as the search engine. + +This installation guide provides step-by-step instructions for setting up Turing ES in different environments, from simple standalone installations to enterprise-grade deployments. + +== Installation Options + +Choose the installation method that best fits your needs: + +[.lead] +**🚀 Quick Start (Recommended for Testing)** +Get Turing ES running in minutes with the embedded database and search engine. This option suits evaluation, development, and small deployments. + +**🏢 Production Setup** +Configure Turing ES with external databases and clustering for enterprise environments. + +**🐳 Docker Deployment** +Use containerized deployment for cloud and DevOps environments. :numbered: +[[quick-start]] +== Quick Start Installation + +This section gets you up and running with Turing ES in just a few minutes using the embedded H2 database and Solr search engine. It is intended for testing, development, and small deployments. 
+ +=== Prerequisites + +Before starting, ensure you have: + +* *Java 21* - Download from https://www.oracle.com/java/technologies/downloads/[Oracle] or https://adoptium.net/[Eclipse Temurin] +* *4GB RAM minimum* (8GB recommended) +* *10GB free disk space* +* *Administrative privileges* on your system + +=== Step 1: Verify Java Installation + +Open your terminal or command prompt and verify Java 21 is installed: + +[source,bash] +---- +java -version +---- + +You should see output similar to: +---- +java version "21.0.1" 2023-10-17 +Java(TM) SE Runtime Environment (build 21.0.1+12-LTS-29) +---- + +If Java 21 is not installed, download and install it from the links above. + +WARNING: Turing ES requires Java 21. Earlier versions are not supported. + +=== Step 2: Download Turing ES + +1. Visit the https://github.com/openviglet/turing/releases/latest[latest release page] +2. Download `viglet-turing.jar` (approximately 271 MB) +3. Save it to your preferred directory (e.g., `/opt/turing` or `C:\turing`) + +=== Step 3: Start Turing ES + +Navigate to the directory where you saved the JAR file and run: + +[source,bash] +---- +java -jar viglet-turing.jar +---- + +The first startup takes 2-3 minutes as Turing ES initializes the embedded database and search engine. + +TIP: To run with more memory: `java -Xmx4g -jar viglet-turing.jar` + +=== Step 4: Access the Admin Console + +Once started (you'll see "Started TurApplication"), open your browser to: + +**http://localhost:2700** + +Use these default credentials: +* *Username:* `admin` +* *Password:* `admin` + +IMPORTANT: Change the default password in production environments! + +=== Step 5: Create Your First Site + +1. Click *"Sites"* → *"Add Site"* +2. Enter a name like "My First Site" +3. Click *"Save"* + +Congratulations! Turing ES is now running. Continue reading for production setup options. 
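The Java check from Step 1 can also be scripted. The following is a minimal sketch and not part of the Turing ES distribution: the helper name `java_major_version` is hypothetical, and it assumes the first line of `java -version` output contains a quoted version string, as in the sample output shown in Step 1.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Turing ES): print the major Java
# version found in one line of `java -version` output.
java_major_version() {
  # $1 is a line such as: java version "21.0.1" 2023-10-17
  version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  major=${version%%.*}
  if [ "$major" = "1" ]; then
    # Legacy numbering: "1.8.0_292" means Java 8
    rest=${version#1.}
    major=${rest%%.*}
  fi
  # Drop any non-numeric suffix such as "-ea"
  major=${major%%[!0-9]*}
  printf '%s\n' "$major"
}

# `java -version` writes to stderr, so redirect it before reading the first line.
if command -v java >/dev/null 2>&1; then
  major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
  if [ -n "$major" ] && [ "$major" -ge 21 ]; then
    echo "Java $major found: OK for Turing ES"
  else
    echo "Java 21 is required, found: ${major:-unknown}"
  fi
else
  echo "java not found on PATH"
fi
```

If the first line of output is not a version line (for example when `JAVA_TOOL_OPTIONS` is set), the sketch reports the version as unknown rather than guessing.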
+ +--- + [[installing-java]] == Installing Java diff --git a/docs/turing/0.3.10/turing-troubleshooting.adoc b/docs/turing/0.3.10/turing-troubleshooting.adoc new file mode 100644 index 0000000..3b3de74 --- /dev/null +++ b/docs/turing/0.3.10/turing-troubleshooting.adoc @@ -0,0 +1,811 @@ += Viglet Turing ES: Troubleshooting Guide +Viglet Team +:page-layout: documentation +:organization: Viglet Turing +ifdef::backend-pdf[:toc: left] +:toclevels: 5 +:toc-title: Table of Content +:doctype: book +:revnumber: 0.3.10 +:revdate: 25-12-2024 +:source-highlighter: rouge +:pdf-theme: viglet +:pdf-themesdir: {docdir}/../themes/ +:page-breadcrumb-title: Troubleshooting Guide +:page-permalink: /turing/0.3.10/troubleshooting/ +:imagesdir: ../../../ +:page-pdf: /turing/turing-troubleshooting-0.3.10.pdf +:page-product: turing + +[preface] +== Troubleshooting Turing ES + +Having issues with Turing ES? This guide helps you diagnose and resolve common problems quickly. From installation issues to performance problems, we've got you covered. + +**Quick Help:** +* 🚀 **Installation Issues** - Can't get Turing ES started? +* 🔍 **Search Problems** - Not finding the results you expect? +* ⚡ **Performance Issues** - Running slowly or timing out? +* 🔌 **Integration Problems** - APIs not working as expected? +* 🗄️ **Database Issues** - Connection or data problems? + +TIP: Most issues can be resolved by checking the logs. Look for error messages in the console output or log files. + +:numbered: + +== Getting Help with Logs + +=== Viewing Console Logs + +When running Turing ES from the command line, logs appear in your terminal: + +[source,bash] +---- +java -jar viglet-turing.jar +---- + +Look for ERROR, WARN, or FATAL messages that indicate problems. 
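To pick those messages out of a busy console, the output can be piped through a filter. This is a small sketch: the function name `filter_problems` and the sample log lines are made up for illustration, and the real Turing ES log layout may differ.

```shell
#!/bin/sh
# Hypothetical helper: keep only lines that mention ERROR, WARN, or FATAL.
# This is a plain substring match, so a line containing WARNING also passes.
filter_problems() {
  grep -E 'ERROR|WARN|FATAL'
}

# Demo on sample lines (illustrative only, not real Turing ES output).
printf '%s\n' \
  '2025-01-01 10:00:00 INFO  Started TurApplication' \
  '2025-01-01 10:00:05 WARN  Slow query detected' \
  '2025-01-01 10:00:09 ERROR Connection refused' | filter_problems
```

With a running instance, the same filter could be applied as `java -jar viglet-turing.jar 2>&1 | filter_problems`.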
+ +=== Log Files + +Turing ES creates log files in the `logs/` directory: + +* `turing.log` - Main application logs +* `turing-error.log` - Error-specific logs +* `access.log` - HTTP request logs + +=== Increasing Log Detail + +For debugging, enable debug logging: + +[source,bash] +---- +java -jar viglet-turing.jar --logging.level.com.viglet.turing=DEBUG +---- + +Or add to `application.properties`: +[source,properties] +---- +logging.level.com.viglet.turing=DEBUG +logging.level.org.springframework=INFO +---- + +== Installation & Startup Issues + +=== Java Version Problems + +**Problem:** "Unsupported Java version" or startup fails + +**Symptoms:** +* Application won't start +* Error mentioning Java version compatibility +* ClassNotFoundException errors + +**Solutions:** + +1. **Verify Java 21 is installed:** +[source,bash] +---- +java -version +# Should show version 21.x.x +---- + +2. **Check JAVA_HOME environment variable:** +[source,bash] +---- +echo $JAVA_HOME +# Should point to Java 21 installation +---- + +3. **Install Java 21 if needed:** +* Download from https://adoptium.net/[Eclipse Temurin] +* Update PATH and JAVA_HOME variables +* Restart terminal/command prompt + +**Windows JAVA_HOME setup:** +[source,cmd] +---- +set JAVA_HOME=C:\Program Files\Eclipse Adoptium\jdk-21.0.1.12-hotspot +set PATH=%JAVA_HOME%\bin;%PATH% +---- + +**Linux/macOS JAVA_HOME setup:** +[source,bash] +---- +export JAVA_HOME=/usr/lib/jvm/java-21-openjdk +export PATH=$JAVA_HOME/bin:$PATH +---- + +=== Port Already in Use + +**Problem:** "Port 2700 already in use" or "Address already in use" + +**Symptoms:** +* Application starts but immediately exits +* Error message about port binding +* Cannot access http://localhost:2700 + +**Solutions:** + +1. **Find what's using the port:** +[source,bash] +---- +# Windows +netstat -ano | findstr :2700 + +# Linux/macOS +lsof -i :2700 +---- + +2. 
**Kill the conflicting process:** +[source,bash] +---- +# Windows (replace PID with actual process ID) +taskkill /PID 1234 /F + +# Linux/macOS +kill -9 PID +---- + +3. **Use a different port:** +[source,bash] +---- +java -jar viglet-turing.jar --server.port=8080 +---- + +4. **Access on new port:** +http://localhost:8080 + +=== Out of Memory Errors + +**Problem:** OutOfMemoryError, GC overhead limit, or slow startup + +**Symptoms:** +* Application crashes with memory errors +* Very slow performance +* Frequent garbage collection messages + +**Solutions:** + +1. **Increase heap size:** +[source,bash] +---- +# For 4GB heap +java -Xmx4g -Xms2g -jar viglet-turing.jar + +# For 8GB heap +java -Xmx8g -Xms4g -jar viglet-turing.jar +---- + +2. **Check available system memory:** +[source,bash] +---- +# Linux +free -h + +# macOS +vm_stat + +# Windows +wmic OS get TotalVisibleMemorySize /value +---- + +3. **Optimize garbage collection:** +[source,bash] +---- +java -Xmx4g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -jar viglet-turing.jar +---- + +=== Permission Errors + +**Problem:** File permission or access denied errors + +**Symptoms:** +* Cannot create database files +* Cannot write to logs directory +* Configuration files not readable + +**Solutions:** + +1. **Run from writable directory:** +[source,bash] +---- +mkdir /home/user/turing +cd /home/user/turing +java -jar viglet-turing.jar +---- + +2. **Fix file permissions (Linux/macOS):** +[source,bash] +---- +chmod 755 viglet-turing.jar +chown user:user viglet-turing.jar +---- + +3. **Run as administrator (Windows):** +* Right-click Command Prompt +* Select "Run as administrator" + +== Database Connection Issues + +=== H2 Database Problems + +**Problem:** Cannot connect to embedded H2 database + +**Symptoms:** +* "Database may be already in use" error +* Cannot access H2 console +* Data not persisting between restarts + +**Solutions:** + +1. 
**Check for multiple instances:** +[source,bash] +---- +# Look for other java processes running Turing +ps aux | grep turing +---- + +2. **Clean stale database lock files:** +[source,bash] +---- +# Stop Turing ES first, then remove only the H2 lock and trace files. +# Do not use *.db here: it also matches the H2 data file (*.mv.db) and would delete your data. +rm -f *.lock.db *.trace.db +---- + +3. **Use different database path:** +[source,bash] +---- +java -jar viglet-turing.jar --spring.datasource.url="jdbc:h2:file:./data/turing" +---- + +=== External Database Connection + +**Problem:** Cannot connect to PostgreSQL/MySQL/Oracle + +**Symptoms:** +* Connection timeout errors +* Authentication failures +* SSL connection errors + +**Solutions:** + +1. **Verify database server is running:** +[source,bash] +---- +# Test connectivity +telnet db-server 5432 # PostgreSQL +telnet db-server 3306 # MySQL +---- + +2. **Check connection configuration:** +[source,properties] +---- +# In application.properties +spring.datasource.url=jdbc:postgresql://localhost:5432/turing +spring.datasource.username=turing +spring.datasource.password=your-password + +# Test connection +spring.datasource.hikari.connection-test-query=SELECT 1 +---- + +3. **Verify database and user exist:** +[source,sql] +---- +-- PostgreSQL (run inside psql; meta-commands must be on their own line) +-- List databases +\l +-- List roles +\du +GRANT ALL ON DATABASE turing TO turing; + +-- MySQL +SHOW DATABASES; +SELECT User FROM mysql.user; +GRANT ALL PRIVILEGES ON turing.* TO 'turing'@'%'; +---- + +4. **Check firewall and network:** +[source,bash] +---- +# Test network connectivity +ping db-server +nmap -p 5432 db-server +---- + +== Search & Indexing Issues + +=== No Search Results + +**Problem:** Searches return no results despite having content + +**Symptoms:** +* All searches return 0 results +* Recently added content not appearing +* Previously working searches now fail + +**Solutions:** + +1. **Check if content is indexed:** +* Go to Admin Console → Content → Browse +* Verify documents are listed +* Check document count in site statistics + +2. 
**Verify site configuration:** +[source,bash] +---- +# Check API with correct site ID +curl "http://localhost:2700/api/sn/YOUR-SITE-ID/search?q=test" +---- + +3. **Check Solr index status:** +* Go to Admin Console → System → Search Engine +* Verify Solr connection +* Check index statistics + +4. **Reindex all content:** +* Go to Admin Console → Content → Reindex +* Or use API: +[source,bash] +---- +curl -X POST "http://localhost:2700/api/sn/YOUR-SITE/reindex" +---- + +=== Search Results Not Relevant + +**Problem:** Search returns irrelevant or poorly ranked results + +**Solutions:** + +1. **Check field boosting:** +* Go to Site Configuration → Search Settings +* Increase boost for title and important fields +* Example: title^3.0 content^1.0 tags^2.0 + +2. **Configure NLP providers:** +* Ensure NLP providers are active +* Check semantic processing is working +* Test with simple, then complex queries + +3. **Review targeting rules:** +* Create rules for specific query patterns +* Promote important content for key terms + +=== Content Not Indexing + +**Problem:** New content not appearing in search results + +**Symptoms:** +* Upload succeeds but content not searchable +* API import returns success but no results +* Some file types not being processed + +**Solutions:** + +1. **Check file format support:** +* Verify file type is supported (PDF, Word, HTML, etc.) +* Check file size limits +* Ensure file is not corrupted + +2. **Monitor indexing queue:** +* Go to Admin Console → System → Indexing Status +* Check for failed indexing jobs +* Look for error messages in logs + +3. **Verify NLP processing:** +[source,bash] +---- +# Check NLP provider status +curl "http://localhost:2700/api/nlp/providers" + +# Test NLP directly +curl -X POST "http://localhost:2700/api/nlp/analyze" \ + -H "Content-Type: application/json" \ + -d '{"text": "test content", "provider": "opennlp"}' +---- + +4. 
**Manual content refresh:** +[source,bash] +---- +# Force refresh of specific document +curl -X PUT "http://localhost:2700/api/sn/YOUR-SITE/document/DOC-ID/refresh" +---- + +== Performance Issues + +=== Slow Search Response + +**Problem:** Search queries take too long to return results + +**Symptoms:** +* Search timeouts (>30 seconds) +* High CPU usage during searches +* Users complaining about slow interface + +**Solutions:** + +1. **Optimize search queries:** +[source,bash] +---- +# Use pagination to limit results +curl "http://localhost:2700/api/sn/site/search?q=test&rows=20&start=0" + +# Add filters to narrow results +curl "http://localhost:2700/api/sn/site/search?q=test&fq=category:news" +---- + +2. **Check system resources:** +[source,bash] +---- +# Monitor CPU and memory +top # Linux/macOS +htop # Better alternative +taskmgr # Windows +---- + +3. **Optimize Solr configuration:** +* Increase Solr cache sizes +* Configure proper warming queries +* Use appropriate request handlers + +4. **Database optimization:** +[source,sql] +---- +-- Add indexes for frequently queried fields +CREATE INDEX idx_content_date ON content(date_created); +CREATE INDEX idx_content_site ON content(site_id); + +-- Analyze query performance +EXPLAIN ANALYZE SELECT * FROM content WHERE site_id = 'my-site'; +---- + +=== High Memory Usage + +**Problem:** Turing ES is consuming too much memory + +**Solutions:** + +1. **Optimize JVM settings:** +[source,bash] +---- +java -Xmx4g -Xms2g \ + -XX:+UseG1GC \ + -XX:MaxGCPauseMillis=200 \ + -XX:+UseStringDeduplication \ + -jar viglet-turing.jar +---- + +2. **Monitor memory usage:** +[source,bash] +---- +# Write GC activity to gc.log using unified JVM logging. +# The older -XX:+PrintGCTimeStamps flag was removed in Java 9 and stops Java 21 from starting. +java "-Xlog:gc*:file=gc.log:time,uptime" \ + -jar viglet-turing.jar +---- + +3. 
**Configure connection pooling:** +[source,properties] +---- +# Limit database connections +spring.datasource.hikari.maximum-pool-size=20 +spring.datasource.hikari.minimum-idle=5 + +# Configure cache sizes +spring.cache.caffeine.spec=maximumSize=10000,expireAfterWrite=1h +---- + +=== Disk Space Issues + +**Problem:** Running out of disk space + +**Solutions:** + +1. **Check disk usage:** +[source,bash] +---- +# Check available space +df -h # Linux/macOS +dir C:\ # Windows + +# Find large files +du -h --max-depth=1 | sort -hr +---- + +2. **Clean up log files:** +[source,bash] +---- +# Archive old logs +gzip logs/*.log + +# Configure log rotation: these are application.properties keys, not shell commands. +# Spring Boot 2.4 and later name them as follows +# (older releases use logging.file.max-size and logging.file.max-history): +# logging.logback.rollingpolicy.max-file-size=100MB +# logging.logback.rollingpolicy.max-history=30 +---- + +3. **Optimize search index:** +[source,bash] +---- +# Optimize Solr index +curl "http://localhost:8983/solr/YOUR-CORE/update?optimize=true" + +# Remove unused fields from index +# Configure in schema.xml or managed schema +---- + +== API & Integration Issues + +=== API Authentication Failures + +**Problem:** Getting 401 Unauthorized errors + +**Solutions:** + +1. **Check authentication endpoint:** +[source,bash] +---- +curl -X POST http://localhost:2700/api/auth/login \ + -H "Content-Type: application/json" \ + -d '{"username": "admin", "password": "admin"}' +---- + +2. **Verify token usage:** +[source,bash] +---- +# Get token from login response +TOKEN="your-jwt-token-here" + +# Use in API calls +curl -H "Authorization: Bearer $TOKEN" \ + "http://localhost:2700/api/admin/sites" +---- + +3. **Check token expiration:** +* Tokens expire after a set time (default 24 hours) +* Implement token refresh in your application +* Handle 401 errors by re-authenticating + +=== CORS Issues + +**Problem:** Cross-origin requests blocked in web browsers + +**Symptoms:** +* API works with cURL but fails in JavaScript +* Browser console shows CORS errors +* Preflight OPTIONS requests failing + +**Solutions:** + +1. 
**Configure CORS in application.properties:** +[source,properties] +---- +# Allow all origins (development only) +turing.cors.allowed-origins=* + +# Production: specify domains +turing.cors.allowed-origins=https://mysite.com,https://app.mysite.com +turing.cors.allowed-methods=GET,POST,PUT,DELETE,OPTIONS +turing.cors.allowed-headers=* +---- + +2. **Use server-side proxy:** +[source,javascript] +---- +// Instead of direct API calls, proxy through your server +fetch('/api/proxy/search?q=test') // Your server proxies to Turing ES + .then(response => response.json()) +---- + +3. **Configure reverse proxy (production):** +[source,nginx] +---- +# nginx configuration +location /api/ { + proxy_pass http://localhost:2700/api/; + add_header Access-Control-Allow-Origin *; + add_header Access-Control-Allow-Methods "GET, POST, OPTIONS"; +} +---- + +=== JSON Parsing Errors + +**Problem:** API responses not parsing correctly + +**Solutions:** + +1. **Check response content-type:** +[source,bash] +---- +curl -I "http://localhost:2700/api/sn/site/search?q=test" +# Should return: Content-Type: application/json +---- + +2. **Validate JSON structure:** +[source,bash] +---- +curl "http://localhost:2700/api/sn/site/search?q=test" | jq . +# jq will highlight JSON syntax errors +---- + +3. **Handle encoding issues:** +[source,javascript] +---- +// Ensure proper encoding in requests +fetch('/api/sn/site/search?q=' + encodeURIComponent(query)) +---- + +== NLP & AI Issues + +=== NLP Provider Not Working + +**Problem:** Natural Language Processing not functioning + +**Solutions:** + +1. **Check provider status:** +* Go to Admin Console → NLP → Providers +* Verify enabled providers show as "Active" +* Test individual providers + +2. **OpenNLP (built-in) issues:** +[source,bash] +---- +# Check OpenNLP models are loaded +curl "http://localhost:2700/api/nlp/providers/opennlp/status" +---- + +3. 
**External NLP service issues:** +[source,bash] +---- +# Test spaCy service +curl "http://localhost:2800/process" \ + -H "Content-Type: application/json" \ + -d '{"text": "test sentence"}' + +# Test CoreNLP service +# --globoff stops curl from treating the braces and commas in the URL as a glob pattern +curl --globoff "http://localhost:9000/?properties={\"annotators\":\"tokenize,ssplit,pos\"}" \ + -d "This is a test sentence." +---- + +4. **Configuration problems:** +[source,properties] +---- +# Check NLP provider URLs in application.properties +turing.nlp.spacy.url=http://localhost:2800 +turing.nlp.corenlp.url=http://localhost:9000 +turing.nlp.opentext.url=http://localhost:40000 +---- + +== Production Environment Issues + +=== SSL/HTTPS Configuration + +**Problem:** Setting up secure connections + +**Solutions:** + +1. **Configure SSL certificate:** +[source,properties] +---- +# In application.properties +server.port=8443 +server.ssl.key-store=keystore.jks +server.ssl.key-store-password=yourpassword +server.ssl.key-store-type=JKS +---- + +2. **Generate self-signed certificate (testing only):** +[source,bash] +---- +keytool -genkeypair -alias turing \ + -keyalg RSA -keysize 2048 \ + -storetype JKS -keystore keystore.jks \ + -validity 365 +---- + +3. **Use reverse proxy (recommended):** +[source,nginx] +---- +# nginx HTTPS proxy +server { + listen 443 ssl; + server_name turing.example.com; + + ssl_certificate /path/to/certificate.pem; + ssl_certificate_key /path/to/private-key.pem; + + location / { + proxy_pass http://localhost:2700; + proxy_set_header Host $host; + proxy_set_header X-Forwarded-Proto https; + } +} +---- + +=== Load Balancer Configuration + +**Problem:** Setting up high availability + +**Solutions:** + +1. **Session affinity:** +[source,properties] +---- +# Name the session cookie so the load balancer can key sticky sessions on it +# (stickiness itself is configured on the load balancer) +server.servlet.session.cookie.name=TURING-SESSION +server.servlet.session.timeout=30m +---- + +2. **Health check endpoint:** +[source,bash] +---- +# Configure load balancer to check +curl "http://localhost:2700/actuator/health" +---- + +3. 
**Shared database configuration:** +[source,properties] +---- +# All instances use same database +spring.datasource.url=jdbc:postgresql://db-cluster:5432/turing +spring.jpa.hibernate.ddl-auto=validate +---- + +== Getting Additional Help + +=== Collecting Diagnostic Information + +When asking for help, collect this information: + +1. **System information:** +[source,bash] +---- +# Java version +java -version + +# System info +uname -a # Linux/macOS +systeminfo # Windows + +# Available memory +free -h # Linux +vm_stat # macOS +---- + +2. **Turing ES version:** +[source,bash] +---- +java -jar viglet-turing.jar --version +---- + +3. **Configuration files:** +* application.properties +* Any custom configuration files +* Environment variables + +4. **Log files:** +* Console output during error +* turing.log file contents +* Database logs if relevant + +5. **Network configuration:** +[source,bash] +---- +# Check listening ports +netstat -an | grep LISTEN +---- + +=== Where to Get Help + +**Community Support:** +* https://github.com/openviglet/turing/discussions[GitHub Discussions] +* https://github.com/openviglet/turing/issues[Bug Reports] + +**Professional Support:** +* Email: opensource@viglet.com +* Enterprise support contracts available + +**Documentation:** +* link:getting-started/[Getting Started Guide] +* link:faq/[Frequently Asked Questions] +* link:developer-guide/[Developer Guide] + +Remember: The community is here to help! Don't hesitate to ask questions and share your solutions with others. 🚀 \ No newline at end of file diff --git a/docs/turing/index.markdown b/docs/turing/index.markdown index f505ea7..a3b5060 100644 --- a/docs/turing/index.markdown +++ b/docs/turing/index.markdown @@ -1,8 +1,72 @@ --- layout: product title: Turing ES Documentation -banner-title: Turing ES Reference Documentation -description: Documentation about Turing ES. 
+banner-title: Turing ES - Semantic Search & AI-Powered Navigation +description: Complete documentation for Viglet Turing ES - Open Source Semantic Navigation and Search Engine with AI/NLP capabilities. product: turing product-name: Turing ES ---- \ No newline at end of file +--- + +## Welcome to Turing ES! 🚀 + +**Viglet Turing ES** is a powerful, open-source semantic search and navigation platform that transforms how you organize and discover content. Built with AI and Natural Language Processing (NLP) at its core, Turing ES helps you create intelligent search experiences and chatbots that understand your users' intent. + +### What Makes Turing ES Special? + +🔍 **Smart Search**: Goes beyond keyword matching with semantic understanding +🤖 **AI-Powered**: Integrates with multiple NLP providers (OpenNLP, spaCy, CoreNLP, OpenText) +💬 **Chatbot Ready**: Built-in conversational capabilities +⚡ **Fast & Scalable**: Powered by Apache Solr for enterprise-grade performance +🔗 **Easy Integration**: RESTful APIs and comprehensive connector support +🌍 **Multi-Language**: Supports multiple languages and locales + +### Quick Start Options + +
+- 🚀 **New to Turing ES?** Start with our step-by-step installation guide: Installation Guide
+- ⚙️ **System Administrator?** Configure and manage your Turing ES instance: Admin Guide
+- 👩‍💻 **Developer?** Integrate Turing ES into your applications: Developer Guide
+ +### Use Cases + +Turing ES is perfect for: + +- **Corporate Intranets**: Intelligent content discovery across your organization +- **E-commerce**: Smart product search with natural language queries +- **Documentation Sites**: Help users find answers faster +- **Content Management**: Organize and navigate large content repositories +- **Customer Support**: AI-powered chatbots and knowledge bases + +--- + +## Documentation Overview + +### 📚 Core Guides +- [**Installation Guide**](0.3.10/installation-guide/) - Get Turing ES up and running +- [**Administration Guide**](0.3.10/administration-guide/) - Configure and manage your system +- [**Developer Guide**](0.3.10/developer-guide/) - APIs, SDKs, and integration examples +- [**Connectors Guide**](0.3.10/connectors/) - Connect your data sources + +### 📖 Additional Resources +- [**Release Notes**](0.3.10/release-notes/) - What's new in the latest version +- [**API Reference**](0.3.10/developer-guide/#api-reference) - Complete API documentation + +### 🎯 Quick Links +- [Download Latest Release](https://github.com/openviglet/turing/releases/latest) +- [GitHub Repository](https://github.com/openviglet/turing) +- [Community Support](https://github.com/openviglet/turing/discussions) \ No newline at end of file