diff --git a/.cursor/rules/development-workflow.mdc b/.cursor/rules/development-workflow.mdc
new file mode 100644
index 0000000..bca8c3c
--- /dev/null
+++ b/.cursor/rules/development-workflow.mdc
@@ -0,0 +1,36 @@
+---
+description:
+globs:
+alwaysApply: false
+---
+# Development Workflow Guide
+
+## Building and Testing
+The project uses Make for build automation. Common tasks are defined in the [Makefile](mdc:Makefile):
+
+```bash
+make          # Build the project
+make test     # Run tests
+make bench    # Run benchmarks
+make deps     # Install dependencies
+```
+
+## Tool Development
+When implementing new tools:
+
+1. Define the tool struct and interface implementation in `pkg/tools/[toolname]/[toolname].go`
+2. Follow the Tool interface defined in [pkg/tools/tool.go](mdc:pkg/tools/tool.go)
+3. Write tests in `pkg/tools/[toolname]/[toolname]_test.go`
+4. Example: [pkg/tools/math/math.go](mdc:pkg/tools/math/math.go)
+
+## Testing Guidelines
+- Write unit tests for all new functionality in `pkg/tools/[toolname]/[toolname]_test.go`
+- Include benchmarks for performance-critical operations
+- Use table-driven tests where appropriate
+- Example: [pkg/tools/rand/rand.go](mdc:pkg/tools/rand/rand.go)
+
+## Code Organization
+- Keep tool implementations modular and focused
+- Follow Go best practices and idioms
+- Use meaningful package names and file structure
+- Example: [pkg/tools/math/math.go](mdc:pkg/tools/math/math.go)
diff --git a/.cursor/rules/git-commit-practices.mdc b/.cursor/rules/git-commit-practices.mdc
new file mode 100644
index 0000000..b93c988
--- /dev/null
+++ b/.cursor/rules/git-commit-practices.mdc
@@ -0,0 +1,5 @@
+---
+description:
+globs:
+alwaysApply: false
+---
diff --git a/.cursor/rules/llm-integration.mdc b/.cursor/rules/llm-integration.mdc
new file mode 100644
index 0000000..f9356e2
--- /dev/null
+++ b/.cursor/rules/llm-integration.mdc
@@ -0,0 +1,33 @@
+---
+description:
+globs:
+alwaysApply: false
+---
+# LLM Integration Guide
+
+## Overview
+The project uses
Language Model (LLM) integration for processing. The core components are:
+
+## LLM Interface
+- [pkg/llm/llm.go](mdc:pkg/llm/llm.go) defines the LLM interface and registry
+- All LLM implementations must implement the Process method
+- Registry pattern allows multiple LLM backends
+
+## OpenAI Implementation
+- [pkg/llm/openai/openai.go](mdc:pkg/llm/openai/openai.go) provides OpenAI integration
+- Handles API communication and response parsing
+- Supports environment-based configuration
+
+## Integration Guidelines
+When implementing new LLM providers:
+
+1. Create a new package under `pkg/llm/`
+2. Implement the LLM interface
+3. Handle API keys and configuration securely
+4. Follow the OpenAI implementation as a reference
+
+## Usage Example
+```go
+llm := openai.New(apiKey)
+response, err := llm.Process(systemPrompt, userInput)
+```
diff --git a/.cursor/rules/math-tool.mdc b/.cursor/rules/math-tool.mdc
new file mode 100644
index 0000000..785abc6
--- /dev/null
+++ b/.cursor/rules/math-tool.mdc
@@ -0,0 +1,85 @@
+---
+description:
+globs:
+alwaysApply: false
+---
+# Math Tool Implementation Guide
+
+The math tool in [pkg/tools/math/math.go](mdc:pkg/tools/math/math.go) provides mathematical expression parsing and calculation capabilities for the Gendo runtime. This guide outlines key implementation details and testing requirements.
+
+## Core Components
+
+### Expression Extraction
+The tool extracts mathematical expressions from natural language input through these key functions:
+
+1. `extractFirstExpression`: Main entry point that handles:
+   - Natural language prefixes (e.g., "What is")
+   - Quoted expressions
+   - Word-based operators
+   - Multiple expression formats
+
+2. `tryExtractExpression`: Low-level parser that:
+   - Handles numeric values and operators
+   - Maintains expression validity
+   - Removes unnecessary punctuation
+   - Preserves negative numbers
+
+3. 
`convertWordOperators`: Converts natural language operators to symbols:
+   - "plus" → "+"
+   - "minus" → "-"
+   - "times" or "multiplied by" → "*"
+   - "divided by" → "/"
+
+### Expression Parsing
+The `parseExpression` function handles:
+- Operator precedence
+- Multiple operators
+- Decimal numbers
+- Error conditions
+
+## Testing Requirements
+
+All changes must be verified through [pkg/tools/math/math_test.go](mdc:pkg/tools/math/math_test.go). Test cases must cover:
+
+1. Natural Language Processing:
+   - Prefixes ("What is")
+   - Suffixes ("equals", "is")
+   - Word operators
+   - Mixed formats
+
+2. Mathematical Operations:
+   - Basic arithmetic
+   - Negative numbers
+   - Decimal values
+   - Multiple operators
+
+3. Error Handling:
+   - Invalid expressions
+   - Division by zero
+   - Missing operators
+   - Invalid numbers
+
+## Integration Testing
+
+The calculator example in [examples/calculator.gendo](mdc:examples/calculator.gendo) serves as an integration test. Changes must be verified by running:
+
+```bash
+echo 'What is 1+1?' | ./gendo --verbose examples/calculator.gendo
+```
+
+## Development Guidelines
+
+1. Follow TDD practices:
+   - Write tests first
+   - Verify edge cases
+   - Maintain test coverage
+
+2. Error Handling:
+   - Return meaningful error messages
+   - Validate input thoroughly
+   - Handle edge cases gracefully
+
+3. Code Organization:
+   - Keep functions focused and single-purpose
+   - Document complex logic
+   - Use clear variable names
diff --git a/.cursor/rules/project-structure.mdc b/.cursor/rules/project-structure.mdc
new file mode 100644
index 0000000..5f801dd
--- /dev/null
+++ b/.cursor/rules/project-structure.mdc
@@ -0,0 +1,30 @@
+---
+description:
+globs:
+alwaysApply: false
+---
+# Project Structure Guide
+
+## Overview
+Gendo is a Go-based tool system with a modular architecture.
The main components are:
+
+## Core Components
+- [pkg/tools/tool.go](mdc:pkg/tools/tool.go) - Defines the core Tool interface and Registry
+- [pkg/llm/llm.go](mdc:pkg/llm/llm.go) - Language Model interface and Registry
+- [pkg/llm/openai/openai.go](mdc:pkg/llm/openai/openai.go) - OpenAI LLM implementation
+
+## Tools
+The tools package contains various tool implementations:
+
+### I/O Tools
+- [pkg/tools/io/io.go](mdc:pkg/tools/io/io.go) - File reading and writing tools
+- [pkg/tools/io/io_test.go](mdc:pkg/tools/io/io_test.go) - Tests and benchmarks for I/O tools
+
+### Math Tools
+- [pkg/tools/math/math.go](mdc:pkg/tools/math/math.go) - Mathematical operations tool
+
+### Random Tools
+- [pkg/tools/rand/rand.go](mdc:pkg/tools/rand/rand.go) - Random number generation tool
+
+## Build System
+- [Makefile](mdc:Makefile) - Build automation and development tasks
diff --git a/.cursor/rules/testing-requirements.mdc b/.cursor/rules/testing-requirements.mdc
new file mode 100644
index 0000000..9d914be
--- /dev/null
+++ b/.cursor/rules/testing-requirements.mdc
@@ -0,0 +1,63 @@
+---
+description:
+globs:
+alwaysApply: false
+---
+# Gendo Runtime Testing Requirements
+
+## Test-Driven Development
+
+All changes to the Gendo runtime must follow strict test-driven development practices:
+
+1. Write tests before implementing features
+2. Ensure all code paths are covered by tests
+3. Verify edge cases and error conditions
+4. 
Document test cases with clear descriptions
+
+## Testing Layers
+
+### Unit Tests
+- Each package must have comprehensive unit tests
+- Test files should be named `*_test.go`
+- Use table-driven tests for multiple test cases
+- Mock external dependencies appropriately
+
+### Integration Tests
+- Examples in [examples/](mdc:examples/) serve as integration tests
+- Do not modify example files unless updating the specification
+- Verify changes against example scripts
+- Use verbose logging for debugging
+
+### Runtime Verification
+- Build the project using `make`
+- Run example scripts to verify runtime behavior
+- Test with the local LLM endpoint (http://localhost:18080/v1)
+- Do not use external APIs or install new software
+
+## LLM Integration
+
+When testing LLM-related functionality:
+
+1. Use the local endpoint only
+2. Preserve complete prompt definitions
+3. Verify prompt construction
+4. Test error handling for LLM responses
+
+## Error Handling
+
+Test cases must verify:
+
+1. Invalid inputs
+2. Malformed scripts
+3. Network failures
+4. Resource limitations
+5. Edge cases in data processing
+
+## Documentation
+
+All test files must include:
+
+1. Clear test case descriptions
+2. Expected inputs and outputs
+3. Error conditions being tested
+4. Any special setup requirements
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 100644
index 0000000..88954fd
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,34 @@
+When reviewing code with GitHub Copilot, please ensure all code follows these fundamental principles. Every piece of
+code must be implemented as a small, independent module with clear boundaries and responsibilities. Each module should
+have a single, well-defined purpose and minimal dependencies on other modules.
+
+All code must be accompanied by comprehensive unit tests. Test files should be written before implementing features,
+following test-driven development practices.
Each test file must cover all code paths, including edge cases and error
+conditions. Use table-driven tests for multiple test cases and mock external dependencies appropriately.
+
+Code organization is critical. Keep implementations modular and focused, following language-specific best practices and
+idioms. Use meaningful package names and maintain a clear file structure. Each module should be self-contained and
+easily testable in isolation.
+
+Error handling must be thorough and explicit. Test cases should verify invalid inputs, malformed data, network failures,
+resource limitations, and edge cases in data processing. All error conditions must be properly documented and tested.
+
+Documentation is essential. All code must include clear descriptions of its purpose, expected inputs and outputs, error
+conditions, and any special setup requirements. Comments should explain why code is written a certain way, not what it
+does.
+
+When reviewing code, ensure that all dependencies are properly managed and that the code follows the project's
+established patterns and conventions. Code should be maintainable, readable, and follow the principle of least surprise.
+
+Integration tests are required for any code that interacts with external systems or other modules. These tests should
+verify the correct behavior of the system as a whole, not just individual components.
+
+Performance considerations should be taken into account for any code that processes data or handles user interactions.
+Include benchmarks for performance-critical operations and ensure that the code scales appropriately.
+
+Security is paramount. All code must be reviewed for potential security vulnerabilities, especially when handling user
+input or interacting with external systems. Follow the principle of least privilege and implement proper input
+validation and sanitization.
+
+Remember that code quality is not just about functionality but also about maintainability, readability, and reliability.
+Every line of code should be written with these principles in mind.
diff --git a/.gitignore b/.gitignore
index 6f72f89..28b3c74 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,9 @@
+# Compiled gendo runtime
+gendo
+
+# JetBrains configurations
+.idea
+
 # If you prefer the allow list template instead of the deny list, see community template:
 # https://github.com/github/gitignore/blob/main/community/Golang/Go.AllowList.gitignore
 #
diff --git a/LICENSE b/LICENSE
index 4d19e16..a2bbd6e 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2025 HyperifyIO
+Copyright (c) 2025 Jaakko Heusala
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
diff --git a/Makefile b/Makefile
new file mode 100644
index 0000000..eb73d7f
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,55 @@
+.PHONY: all build test bench coverage clean deps lint help
+
+# Default target
+all: build
+
+# Build the project
+build:
+	@echo "Building..."
+	go build -o gendo ./cmd/gendo
+
+# Run all tests
+test:
+	@echo "Running tests..."
+	go test -v ./...
+
+# Run benchmarks
+bench:
+	@echo "Running benchmarks..."
+	go test -bench=. -benchmem ./...
+
+# Run tests with coverage
+coverage:
+	@echo "Running tests with coverage..."
+	go test -coverprofile=coverage.out ./...
+	go tool cover -html=coverage.out
+
+# Clean build artifacts
+clean:
+	@echo "Cleaning..."
+	rm -f gendo
+	rm -f coverage.out
+
+# Install dependencies
+deps:
+	@echo "Installing dependencies..."
+	go mod download
+	go mod tidy
+
+# Run linter
+lint:
+	@echo "Running linter..."
+	golangci-lint run
+
+# Help target
+help:
+	@echo "Available targets:"
+	@echo "  all       - Build the project (default)"
+	@echo "  build     - Build the project"
+	@echo "  test      - Run all tests"
+	@echo "  bench     - Run benchmarks"
+	@echo "  coverage  - Run tests with coverage report"
+	@echo "  clean     - Remove build artifacts"
+	@echo "  deps      - Install dependencies"
+	@echo "  lint      - Run linter"
+	@echo "  help      - Show this help message"
\ No newline at end of file
diff --git a/README.md b/README.md
index a8dcb71..99d1c7c 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,174 @@
-# gendo
-Gendo Programming Language
+***This is an old version.***
+
+Latest development is here: https://github.com/hyperifyio/gnd/issues/31
+
+---
+
+# Gendo Language Specification v0.2
+
+## 1. Introduction
+
+Gendo is a minimalist, prompt-based programming language designed for live, incremental code generation and execution via small, local AI models. Programs consist of self-contained **nodes**—each defining behavior or invoking AI prompts—that pass plain-text streams to each other, enabling rapid composition of functionality without mutable global state or hidden dependencies.
+
+## 2. Core Concepts
+
+### 2.1 Nodes
+
+- **Definition**: `nodeID : refID refID … [: prompt text]`
+  - `nodeID`: unique integer identifier
+  - `refID`s: list of nodeIDs this node can call
+  - `prompt text` *(optional)*: instructions for the AI when this node is invoked
+- **Invocation**: `[errorDest !] [dest <] src input text`
+  - Routes stdout (`dest`) and stderr (`errorDest`) to designated nodes
+  - Defaults: stdout→1, stderr→2
+
+### 2.2 Streams
+
+- Nodes exchange plain text. AI-enabled nodes transform input via their prompt; passthrough nodes output input verbatim.
+- Errors are first-class data, buffered and routed like stdout.
+
+### 2.3 Default Handlers
+
+You can set **default** destinations for stdout and stderr across subsequent invocations by writing a line with only the handler syntax.
Whitespace may be used to indent purely for readability; it has no semantic effect.
+
+```gendo
+# Only redefine stdout default to node 3 (errors still go to node 2)
+3 <
+
+# Errors still go to previously set default (node 2)
+5 Another input
+
+# You can override the default by specifying both handlers on a command:
+# Here, errors→5, stdout→6 for this line only
+5 ! 6 < Overridden command text
+```
+
+You can individually redefine defaults:
+
+```gendo
+# Only redefine stdout default to node 3 (errors still go to node 2)
+3 <
+
+    # Errors still go to previously set default (node 2)
+    5 Another input
+
+    # You can override the default by specifying it
+    5 < 6 Second command text
+```
+
+The default handlers remain in effect until redefined or the script ends.
+
+## 3. Structured Control Flow
+
+*(Looping and conditionals TBD—let's agree on design here before fleshing out.)*
+
+## 4. Modular Units & Files
+
+*(Modular units, namespaces, and imports TBD—let's agree on design before fleshing this out.)*
+
+## 5. Built-in Utilities
+
+> **Note:** Each tool-backed node requires enabling the corresponding tool in the Gendo runtime configuration. If a tool (e.g., `math`, `rand`, `read`, `write`) is not enabled, attempting to invoke its node will result in an error.
+
+
+### 5.1 Math
+
+Gendo uses explicit **tool nodes** for arithmetic. If a node’s ref list includes the special `tool` directive, the runtime connects it to the math evaluator.
+
+- **Definition Syntax**: `nodeID : tool : math [config...]`
+  - `tool` marks a tool-backed node.
+  - Optional `config` may specify precision or mode (e.g., `float`).
+
+**Example Definition**
+```gendo
+# Node 50 runs the host math evaluator
+50 : tool : math
+```
+
+**Example Invocation**
+```gendo
+# Evaluate an expression
+< 50 3 * (2 + 5)
+# → 21
+```
+
+Tool nodes are sandboxed and only execute their designated operation.
+
+### 5.2 Random
+
+Gendo defines **tool nodes** for randomness.
Including `tool` with `rand` uses the host RNG.
+
+- **Definition Syntax**: `nodeID : tool : rand [config...]`
+  - `config` may specify distribution (`uniform`, `normal`) or bounds.
+
+**Example Definition**
+```gendo
+# Node 51 runs the host RNG
+51 : tool : rand
+```
+
+**Example Invocation**
+```gendo
+# Generate a random integer in [1,100]
+< 51 1 100
+# → 73 (example)
+```
+
+Tool nodes are sandboxed and only execute their designated operation.
+
+### 5.3 I/O & Persistence
+
+Gendo also uses **tool nodes** for safe, sandboxed file operations. Include `tool` in the ref list and specify `read` or `write` as the tool name.
+
+- **Definition Syntax**: `nodeID : tool : read|write [filename]`
+  - `read` nodes take no input arguments and output the contents of the named file.
+  - `write` nodes accept stdin and save it to the named file, returning a confirmation message.
+
+**Example Definitions**
+```gendo
+
+# Node 10 prints its input
+10 : tool : echo
+
+# Node 60 reads "config.json"
+60 : tool : read : config.json
+
+# Node 61 writes to "results.txt"
+61 : tool : write : results.txt
+```
+
+**Example Invocations**
+```gendo
+# Load configuration and send to node 61
+# → {"threshold":10}
+61 < 60
+
+# Write to node 61
+61 < 10 Some computed output text
+# → "Written to results.txt"
+```
+
+- Filenames are sandboxed and isolated per program; no arbitrary paths allowed.
+
+## 6. Safety & Concurrency
+
+Gendo emphasizes reliability and performance:
+
+- **Stateless Nodes**: By default, nodes have no hidden state; all side effects occur through explicit tool nodes (e.g., I/O), ensuring predictable behavior.
+- **Error Handling**: Errors are treated as first-class data. You choose where to route error messages via the `errorDest !` syntax; unhandled errors by default go to node 2. This allows logging, retries, or feeding errors into AI prompts for recovery.
+- **Concurrency and Parallelism**: The runtime can execute independent node invocations in parallel when there are no data dependencies. This lets you leverage multi-core CPUs without adding complex syntax.
+- **Sandboxing**: Tool nodes (math, rand, read, write) are isolated from arbitrary host resources. Filesystem and network access occur only through sandboxed APIs, preventing unauthorized operations.
+
+## 7. Data Model
+
+Gendo operates purely on plain text streams. Each node receives a string and returns a string. For structured data (e.g., JSON), simply define your prompts or AI nodes to parse and emit valid JSON. Gendo does not enforce data schemas, offering maximum flexibility.
+
+## 8. Community and Next Steps
+
+Gendo invites developers to build small, focused units that grow at runtime via AI. Its minimal core encourages experimentation:
+
+- **Extensibility**: Community-contributed tools and node libraries can add capabilities (e.g., HTTP, database connectors) without altering the core.
+- **Safety**: All extensions must register as explicit tools and respect sandbox rules.
+- **Example Library**: Curated sets of nodes for common tasks (e.g., data processing pipelines, chat bots).
+
+*Gendo makes it so.*
diff --git a/cmd/gendo/main.go b/cmd/gendo/main.go
new file mode 100644
index 0000000..2a7edbc
--- /dev/null
+++ b/cmd/gendo/main.go
@@ -0,0 +1,34 @@
+// Package main is the entry point for the Gendo CLI tool.
+// It handles command-line argument parsing and initializes the Gendo runtime.
+// The tool accepts a script file as input and optional flags for verbosity and model selection.
+package main
+
+import (
+	"flag"
+	"fmt"
+	"os"
+
+	"gendo/internal/gendo"
+	"gendo/pkg/log"
+)
+
+func main() {
+	verbose := flag.Bool("verbose", false, "Enable verbose logging")
+	model := flag.String("model", "", "Model to use for LLM (overrides GENDO_MODEL environment variable)")
+	flag.StringVar(model, "m", "", "Model to use for LLM (shorthand)")
+	flag.Parse()
+
+	args := flag.Args()
+	if len(args) != 1 {
+		fmt.Fprintf(os.Stderr, "Usage: %s [-verbose] [-m model]