217 changes: 197 additions & 20 deletions seatunnel-skill/SKILL.md
@@ -4,11 +4,13 @@ description: Apache SeaTunnel - A multimodal, high-performance, distributed data
author: auto-generated by repo2skill
platform: github
source: https://github.com/apache/seatunnel
tags: [data-integration, data-pipeline, etl, elt, real-time-streaming, batch-processing, cdc, distributed-computing, apache, java]
website_source: https://github.com/apache/seatunnel-website
tags: [data-integration, data-pipeline, etl, elt, real-time-streaming, batch-processing, cdc, distributed-computing, apache, java, docusaurus, documentation]
version: 2.3.13
generated: 2026-01-28
generated: 2026-02-26
license: Apache 2.0
repository: apache/seatunnel
website_repository: apache/seatunnel-website
---

# Apache SeaTunnel OpenCode Skill
@@ -45,10 +47,10 @@ mvn clean install -pl seatunnel-core -DskipTests
Visit the [official download page](https://seatunnel.apache.org/download) and select your version:

```bash
# Example: Download SeaTunnel 2.3.12
wget https://archive.apache.org/dist/seatunnel/2.3.12/apache-seatunnel-2.3.12-bin.tar.gz
tar -xzf apache-seatunnel-2.3.12-bin.tar.gz
cd apache-seatunnel-2.3.12
VERSION=2.3.12
wget https://archive.apache.org/dist/seatunnel/${VERSION}/apache-seatunnel-${VERSION}-bin.tar.gz
tar -xzf apache-seatunnel-${VERSION}-bin.tar.gz
cd apache-seatunnel-${VERSION}
```

#### 4. Basic Configuration
@@ -130,12 +132,17 @@ seatunnel.sh -c config/hello_world.conf -e flink
- **Full + Incremental**: Combined approach

3. **100+ Pre-built Connectors**
- Databases: MySQL, PostgreSQL, Oracle, SQL Server, MongoDB
- Databases: MySQL, PostgreSQL, Oracle, SQL Server, MongoDB, DB2, OceanBase
- Data Warehouses: Snowflake, BigQuery, Redshift, Iceberg
- Data Lakes: Hive, Iceberg, Hudi, Paimon
- Cloud SaaS: Salesforce, Shopify, Google Sheets
- Message Queues: Kafka, RabbitMQ, Pulsar
- Search Engines: Elasticsearch, OpenSearch
- Object Storage: S3, GCS, HDFS
- Message Queues: Kafka, RabbitMQ, Pulsar, RocketMQ, ActiveMQ
- Search Engines: Elasticsearch, OpenSearch, Easysearch
- OLAP Engines: ClickHouse, StarRocks, Doris, Druid
- Time-series Databases: IoTDB, TDengine, InfluxDB
- Vector Databases: Milvus, Qdrant
- Graph Databases: Neo4j
- Object Storage: S3, GCS, HDFS, OssFile, CosFile

4. **Multi-Engine Support**
- **Zeta Engine**: Lightweight, standalone deployment (no Spark/Flink required)
@@ -162,9 +169,9 @@ seatunnel.sh -c config/hello_world.conf -e flink

### Developer-Friendly

- **SQL-like Configuration**: Intuitive job definition syntax
- **SQL-like Configuration**: Intuitive HOCON job definition syntax
- **Visual Web UI**: Drag-and-drop job builder (SeaTunnel Web Project)
- **Extensive Documentation**: Comprehensive guides and examples
- **Extensive Documentation**: Comprehensive guides with i18n (English + Chinese)
- **Community Support**: Active community via Slack and mailing lists

### Production Ready
@@ -521,13 +528,21 @@ sink {
### Core Connector Types

#### Source Connectors
- **Jdbc**: Generic JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server)
- **Jdbc**: Generic JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server, DB2, OceanBase)
- **Kafka**: Apache Kafka topics
- **Mysql**: MySQL with CDC support
- **MongoDB**: MongoDB collections
- **PostgreSQL**: PostgreSQL with CDC
- **S3**: Amazon S3 and compatible storage
- **S3/OssFile/CosFile/HdfsFile**: Object storage / file systems
- **Http**: HTTP/HTTPS endpoints
- **Hive/Iceberg/Hudi/Paimon**: Data lake formats
- **Elasticsearch/Easysearch**: Search engines
- **Redis/HBase/Cassandra**: NoSQL databases
- **Pulsar/RocketMQ/RabbitMQ/ActiveMQ**: Message queues
- **ClickHouse/StarRocks/Doris/Druid**: OLAP engines
- **IoTDB/TDengine/InfluxDB**: Time-series databases
- **Milvus/Qdrant**: Vector databases
- **Neo4j**: Graph databases
- **FakeSource**: For testing and development

#### Transform Connectors
@@ -544,8 +559,12 @@ sink {
- **Redis**: Write to Redis
- **HBase**: Write to HBase tables
- **StarRocks**: Write to StarRocks tables
- **ClickHouse/Doris/Druid**: Write to OLAP engines
- **Hive/Iceberg/Hudi/Paimon**: Write to data lakes
- **Console**: Output to console (testing)

Where a sink makes sense, the source connectors listed above also have a corresponding sink implementation.

### Configuration Options

#### Common Source Options
@@ -771,6 +790,160 @@ source {

---

## Website & Documentation System

The official documentation site (https://seatunnel.apache.org) is managed through the `apache/seatunnel-website` repository, built with **Docusaurus 2.4.3**.

### How Documentation Works

Documentation content lives in the **main codebase** (`apache/seatunnel` repo, `docs/` directory). The website repo fetches docs via `npm run sync` (or `tools/build-docs.js`) before building.

The sync process:
1. Creates a `swap/` directory and clones the `apache/seatunnel` codebase into it
2. Copies `docs/en` to the website's `docs/` directory
3. Copies `docs/zh` to `i18n/zh-CN/` for the Chinese translations
4. Copies `docs/images` to both `static/image_en` and `static/image_zh`
5. Copies `docs/sidebars.js` to the website root
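
The copy steps above (2 to 5) can be sketched as a small shell function operating on two local directories. This is only an illustration of the described flow, not the actual logic of `tools/build-docs.js`: the function name `sync_docs` is hypothetical, and the exact Chinese destination (`i18n/zh-CN/docusaurus-plugin-content-docs/current`, the usual Docusaurus location for the current version) is an assumption.

```shell
# Hypothetical sketch of the documented copy steps; not the real build-docs.js.
# $1: checkout of apache/seatunnel (step 1, the clone into swap/, is left to the caller)
# $2: root of the seatunnel-website checkout
sync_docs() {
  local src="$1"
  local site="$2"
  # Assumed destination for the current-version Chinese docs (Docusaurus convention).
  local zh_dest="$site/i18n/zh-CN/docusaurus-plugin-content-docs/current"

  [ -d "$src/docs/en" ] || { echo "missing $src/docs/en" >&2; return 1; }

  mkdir -p "$site/docs" "$zh_dest" "$site/static/image_en" "$site/static/image_zh"

  # Step 2: English docs become the site's docs/ directory.
  cp -R "$src/docs/en/." "$site/docs/"

  # Step 3: Chinese docs go under i18n/zh-CN/.
  if [ -d "$src/docs/zh" ]; then
    cp -R "$src/docs/zh/." "$zh_dest/"
  fi

  # Step 4: the same images are copied for both locales.
  if [ -d "$src/docs/images" ]; then
    cp -R "$src/docs/images/." "$site/static/image_en/"
    cp -R "$src/docs/images/." "$site/static/image_zh/"
  fi

  # Step 5: the sidebar definition lands at the website root.
  if [ -f "$src/docs/sidebars.js" ]; then
    cp "$src/docs/sidebars.js" "$site/sidebars.js"
  fi
}
```

Called as `sync_docs swap/seatunnel .` from the website root, it would reproduce the layout described above for a codebase that has already been cloned.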

### Website Repository Structure
```
seatunnel-website/
├── blog/ # Blog posts (EN)
├── community/ # Community docs
│ ├── contribution_guide/ # Contribution guidelines
│ │ ├── contribute.md
│ │ ├── committer.md
│ │ ├── code-review.md
│ │ ├── release.md
│ │ └── subscribe.md
│ └── submit_guide/ # Submission guidelines
│ ├── document.md
│ ├── license.md
│ └── submit-code.md
├── docs/ # Current version docs (synced from main repo)
├── docusaurus.config.js # Site configuration
├── i18n/zh-CN/ # Chinese translations
│ ├── docusaurus-plugin-content-blog/
│ ├── docusaurus-plugin-content-docs/
│ └── docusaurus-plugin-content-docs-community/
├── package.json
├── sidebars.js # Doc sidebar config (synced)
├── src/
│ ├── components/
│ ├── css/
│ ├── pages/
│ │ ├── home/ # Homepage
│ │ ├── team/ # Team page (with Base64 avatars)
│ │ ├── user/ # User showcase
│ │ └── versions/ # Version selector
│ └── styles/
├── static/ # Static assets
│ ├── doc/image/, image_en/, image_zh/
│ ├── home/, image/, user/
│ └── js/google_translate_init.js
├── tools/
│ ├── build-docs.js # Doc sync script
│ ├── common.js # Shared constants
│ ├── version.js # Version management
│ ├── image-copy.js # Image processing
│ └── fetch-team-avatars.js # Team avatar updater
├── versioned_docs/ # Historical version docs (20 versions: 1.x ~ 2.3.12)
├── versioned_sidebars/ # Historical sidebars
├── versions.json # Version registry
└── seatunnel_web_versions.json # Web UI version registry
```

### Documentation Categories (per version)

Each version's docs contain:
- `about.md` - Project overview
- `command/` - CLI commands (e.g., connector-check)
- `concept/` - Core concepts (config, schema-evolution, speed-limit, SQL config, event-listener)
- `connector-v2/` - Connector documentation
- `source/` - Source connector docs (100+ connectors)
- `sink/` - Sink connector docs
- `changelog/` - Per-connector changelogs
- `formats/` - Data formats (Avro, Canal JSON, Debezium JSON, OGG JSON, Protobuf)
- `contribution/` - Contributor guides
- `faq.md` - Frequently asked questions
- `other-engine/` - Flink/Spark engine specifics
- `seatunnel-engine/` - Zeta engine docs + telemetry
- `start-v2/` - Getting started guides
- `docker/` - Docker deployment
- `kubernetes/` - K8s deployment
- `locally/` - Local setup
- `transform-v2/` - Transform documentation (SQL, etc.)

### Versioned Documentation

20 historical versions maintained: `1.x`, `2.1.0`~`2.1.3`, `2.2.0-beta`, `2.3.0-beta`~`2.3.12`

Total: ~3691 versioned doc files + ~1652 i18n files

### Website Development

```bash
# Clone
git clone git@github.com:apache/seatunnel-website.git
cd seatunnel-website

# Install dependencies
npm install

# Sync docs from main repo
npm run sync
# Or with SSH: export PROTOCOL_MODE=ssh && npm run sync

# Dev server (English)
npm run start

# Dev server (Chinese)
npm run start-zh

# Production build (needs ~10GB heap)
npm run build
# Or faster parallel build
npm run build:fast

# Serve built site
npm run serve
```

### Adding a New Version

```bash
# 1. Create versioned snapshot
npm run version <target_version>

# 2. Update download links
# Edit src/pages/download/st_data.json
```
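
Before creating a snapshot, it can help to check whether the target version is already listed in `versions.json`. The sketch below assumes `versions.json` is the standard Docusaurus registry, a JSON array of quoted version strings; the helper name `version_registered` is hypothetical and not part of the repository's tooling.

```shell
# Hypothetical pre-check: succeeds (exit 0) if the version already appears
# as a quoted string in versions.json, fails otherwise.
# $1: version to look for, $2: registry file (defaults to ./versions.json)
version_registered() {
  local version="$1"
  local file="${2:-versions.json}"
  # Match the quoted form so that 2.3.1 does not match "2.3.12".
  grep -qF "\"$version\"" "$file"
}

# Example usage (not executed here):
#   if version_registered "<target_version>"; then
#     echo "already registered, skipping snapshot"
#   else
#     npm run version <target_version>
#   fi
```

Matching on the quoted string is a deliberately simple choice that avoids a JSON parser dependency; a stricter check could parse the array instead.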

### Branching Strategy

| Branch | Purpose |
|---|---|
| `main` | Default development branch |
| `asf-site` | Production (https://seatunnel.apache.org) |
| `asf-staging` | Staging (https://seatunnel.staged.apache.org) |

### CI/CD

GitHub Actions workflow (`.github/workflows/deploy.yml`):
- Triggers: push to `main`, PRs to `main`, daily cron (5:00 AM UTC)
- Node.js 18.20.7
- Steps: `npm install` -> `npm run sync` -> `npm run build` -> deploy to `asf-site` branch

### Key Conventions

- Directory names: lowercase, underscore-separated, plural (e.g., `scripts`, `components`)
- JS/static files: lowercase, dash-separated (e.g., `render-dom.js`)
- Images: stored under `static/{module_name}`
- Styles: placed in `src/css/`
- Most pages have an "Edit this page" link pointing to the GitHub source
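
The two naming rules above lend themselves to a quick automated check. The following is a minimal sketch with hypothetical helper names (`is_dash_case_file`, `is_underscore_dir`); it only validates case and separators, and cannot verify that a directory name is plural.

```shell
# Hypothetical check for JS/static file names: lowercase, dash-separated,
# with at least one extension (e.g. render-dom.js).
is_dash_case_file() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*(\.[a-z0-9]+)+$'
}

# Hypothetical check for directory names: lowercase, underscore-separated
# (e.g. scripts, contribution_guide). Plurality is not checked.
is_underscore_dir() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]+(_[a-z0-9]+)*$'
}
```

For example, `is_dash_case_file render-dom.js` succeeds while `is_dash_case_file RenderDom.js` fails, so the functions could be wired into a pre-commit hook or a lint step if desired.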

---

## Development

### Project Architecture
@@ -785,7 +958,7 @@ seatunnel/
│ ├── seatunnel-engine-flink/
│ ├── seatunnel-engine-spark/
│ └── seatunnel-engine-zeta/
├── seatunnel-connectors/ # Connector implementations
├── seatunnel-connectors/ # 100+ connector implementations
│ ├── seatunnel-connectors-*/ # One per connector type
└── seatunnel-dist/ # Distribution package
```
@@ -1159,11 +1332,14 @@ java -jar target/seatunnel-web-*.jar

## Resources

### Official Documentation
### Official Links
- [SeaTunnel Official Website](https://seatunnel.apache.org/)
- [GitHub Repository](https://github.com/apache/seatunnel)
- [GitHub (Engine)](https://github.com/apache/seatunnel)
- [GitHub (Website)](https://github.com/apache/seatunnel-website)
- [GitHub (Web UI)](https://github.com/apache/seatunnel-web)
- [Documentation Hub](https://seatunnel.apache.org/docs/)
- [Connector List](https://seatunnel.apache.org/docs/2.3.12/connector-v2/overview)
- [Downloads](https://seatunnel.apache.org/download)

### Community
- [Slack Channel](https://the-asf.slack.com/archives/C01CB5186TL)
@@ -1184,8 +1360,9 @@ java -jar target/seatunnel-web-*.jar
- [Distributed Systems Concepts](https://en.wikipedia.org/wiki/Distributed_computing)

### Version History
- **2.3.12** (Stable) - Current recommended version
- **2.3.12** (Latest Stable) - Current recommended version
- **2.3.13-SNAPSHOT** (Development)
- 20 historical versions maintained in documentation (1.x ~ 2.3.12)
- [All Releases](https://archive.apache.org/dist/seatunnel/)

---
@@ -1207,6 +1384,6 @@ Apache License 2.0 - See [LICENSE](https://github.com/apache/seatunnel/blob/mast

---

**Last Updated**: 2026-01-28
**Last Updated**: 2026-02-26
**Skill Version**: 2.3.13
**Status**: Production Ready ✓
**Sources**: apache/seatunnel + apache/seatunnel-website