From 6ce1e8f758adbbc40c669466acff6e498bcdb4aa Mon Sep 17 00:00:00 2001
From: The-Inceptions <83852285+The-Inceptions@users.noreply.github.com>
Date: Mon, 23 Feb 2026 05:27:53 +0000
Subject: [PATCH] docs: fill empty pages and fix broken internal links
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three pages were empty/blank and are now fully written:
- contributing/contributing.md: developer guide covering git workflow,
  build/test commands, engine architecture (mermaid diagram), and a
  step-by-step plugin authoring guide based on
  engine/plugins/api/chaos.go
- asset_db/index.md: asset database overview with supported backends
  (SQLite/PostgreSQL/Neo4j), configuration, and how it fits together
- data_sources/data_sources.md: credentials setup guide and a full
  reference table for all 38 data sources with credential types and TTLs

Five pages had broken internal links which are now fixed:
- open_asset_model/relations/index.md: removed two references to
  ../framework_tools/assoc.md (page does not exist); corrected
  ../assetdb/triples.md → ../../asset_db/triples.md (wrong path)
- open_asset_model/properties/index.md: same two fixes as above
- open_asset_model/assets/index.md: same two fixes as above
- open_asset_model/assets/identifier.md: fixed double extension typo
  contact_record.md.md → contact_record.md
- asset_db/postgres.md: fixed ../configuration/configuration.md →
  ../configuration/index.md (file does not exist at that path)

Verified with: mkdocs build --strict (0 warnings)

Co-Authored-By: Claude Sonnet 4.6
---
 docs/asset_db/index.md                     |  77 ++++++
 docs/asset_db/postgres.md                  |   2 +-
 docs/contributing/contributing.md          | 307 +++++++++++++++++++++
 docs/data_sources/data_sources.md          | 129 +++++++++
 docs/open_asset_model/assets/identifier.md |   2 +-
 docs/open_asset_model/assets/index.md      |   3 +-
 docs/open_asset_model/properties/index.md  |   3 +-
 docs/open_asset_model/relations/index.md   |   6 +-
 8 files changed, 520 insertions(+), 9 deletions(-)

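Note for reviewers (not part of the commit): the path corrections listed
in the commit message follow from how a relative Markdown link resolves
against the directory of the page that contains it. The sketch below is
one way to check that by hand. It assumes forward-slash paths rooted at
docs/, and `resolveLink` is a hypothetical helper written for this note;
it is not part of MkDocs or Amass.

```go
package main

import (
	"fmt"
	"path"
)

// resolveLink is a hypothetical helper: it returns the repository path
// that a relative Markdown link points to when it appears on page.
// Both arguments use forward slashes.
func resolveLink(page, link string) string {
	// A relative link is resolved against the directory of its page;
	// path.Join also cleans the ".." segments.
	return path.Join(path.Dir(page), link)
}

func main() {
	page := "docs/open_asset_model/relations/index.md"

	// Link as it was before this patch.
	fmt.Println(resolveLink(page, "../assetdb/triples.md"))
	// Link as corrected by this patch.
	fmt.Println(resolveLink(page, "../../asset_db/triples.md"))
}
```

The old link resolves to docs/open_asset_model/assetdb/triples.md, while
the corrected link resolves to docs/asset_db/triples.md, next to the
docs/asset_db/index.md page that this patch fills in.
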
diff --git a/docs/asset_db/index.md b/docs/asset_db/index.md
index e69de29b..fdb87b51 100644
--- a/docs/asset_db/index.md
+++ b/docs/asset_db/index.md
@@ -0,0 +1,77 @@
+# :simple-owasp: Asset Database
+
+The **Asset Database** is the persistent store for all
+entities and relations discovered during an Amass
+enumeration. It is backed by the
+[`owasp-amass/asset-db`][asset-db] library, which
+provides a unified repository interface over multiple
+storage backends.
+
+[asset-db]: https://github.com/owasp-amass/asset-db
+
+Every Amass **Session** receives its own database
+connection. Assets are stored as typed entities
+following the [Open Asset Model][oam]; the connections
+between them are stored as directed edges (relations)
+with optional properties attached to both entities
+and edges.
+
+[oam]: ../open_asset_model/index.md
+
+## :material-database: Supported Backends
+
+| Backend | Connection String | Best For |
+| :--- | :--- | :--- |
+| **SQLite** | Auto-created | Dev, quick scans |
+| **PostgreSQL** | `postgres://u:p@host:5432/db` | Production |
+| **Neo4j** | `neo4j://u:p@host:7687/db` | Graph queries |
+
+If no database is configured, Amass automatically
+creates a local SQLite database in its configuration
+directory.
+
+## :material-cog: Configuring the Database
+
+Set the connection string in `config.yaml`:
+
+```yaml
+options:
+  database: "postgres://amass:amass4OWASP@host:5432/db"
+```
+
+You can also configure the database via environment
+variables, which take precedence over the config file:
+
+```text
+AMASS_DB_USER      database username
+AMASS_DB_PASSWORD  database password
+AMASS_DB_HOST      database host
+AMASS_DB_PORT      database port
+AMASS_DB_NAME      database name
+```
+
+## :material-graph-outline: How It Fits Together
+
+The asset database sits at the center of the data
+model. The Amass engine writes every discovered asset
+into the database during enumeration. Assets are
+stored once and referenced many times — the same IP
+address discovered by multiple data sources is a
+single entity with multiple `SourceProperty`
+annotations.
+
+Relations between assets (e.g., an FQDN resolving to
+an IP address via a `dns_record` relation) are stored
+as directed edges. This graph structure is what makes
+the [Triples Query Language](triples.md) possible.
+
+## :material-page-next: Next Steps
+
+- [PostgreSQL Setup](postgres.md) — production
+  PostgreSQL setup guide
+- [Triples Query Language](triples.md) — traverse
+  the asset graph with subject-predicate-object queries
+
+---
+
+*© 2025 Jeff Foley — Licensed under Apache 2.0.*
diff --git a/docs/asset_db/postgres.md b/docs/asset_db/postgres.md
index ab91c5b8..ba643532 100644
--- a/docs/asset_db/postgres.md
+++ b/docs/asset_db/postgres.md
@@ -101,7 +101,7 @@ Amass is now ready to store data in your PostgreSQL database. This enables you t
 
 ## See Also
 
-* [Amass Configuration](../configuration/configuration.md)
+* [Amass Configuration](../configuration/index.md)
 * [PostgreSQL Documentation](https://www.postgresql.org/docs/current/index.html)
 * [PostgreSQL `pg_trgm` Extension Docs](https://www.postgresql.org/docs/current/pgtrgm.html)
 * [Managing Environment Variables Securely](https://direnv.net/)
diff --git a/docs/contributing/contributing.md b/docs/contributing/contributing.md
index 8b137891..ff389fdd 100644
--- a/docs/contributing/contributing.md
+++ b/docs/contributing/contributing.md
@@ -1 +1,308 @@
+# :simple-owasp: Contributing to OWASP Amass
 
+Welcome to the OWASP Amass Project! This guide covers
+everything you need to contribute — whether you're filing
+a bug, writing documentation, developing a data source
+plugin, or working on the engine internals.
+
+## :material-forum: Getting Involved
+
+Start by joining the community:
+
+- **Discord:** — the
+  primary place for discussion and questions
+- **GitHub Issues:**
+  —
+  check what needs help
+- **Repos:** All under :
+  `amass`, `asset-db`, `open-asset-model`, `docs`
+
+## :material-source-branch: Git Workflow
+
+### Forking
+
+Go requires code to live at its original import path.
+Use `git remote` to set up your fork against the
+original repository:
+
+```bash
+# 1. Fork on GitHub, then from your local clone:
+git remote rename origin upstream
+git remote add origin \
+    git@github.com:YOUR_USERNAME/amass.git
+
+# 2. Fetch updates from upstream
+git fetch upstream
+git rebase upstream/develop
+```
+
+### Branch and PR Rules
+
+- Branch from the tip of `develop`, **not** `main`
+- Do not `--force` push onto `develop` (only allowed
+  when reverting a broken commit)
+- Rebase your branch on top of `develop` before
+  opening a PR: `git rebase upstream/develop`
+- All PRs target `develop` — do not open PRs
+  against `main`
+
+## :material-code-tags: Code Standards
+
+```bash
+# Required before every commit
+gofmt -w .
+
+# Lint (same flags as CI)
+golangci-lint run --timeout=60m ./...
+
+# Build (CGO must be disabled)
+CGO_ENABLED=0 go install -v ./...
+
+# Test
+CGO_ENABLED=0 go test -v ./...
+CGO_ENABLED=0 go test -v \
+    -run TestName ./path/to/package/...
+```
+
+Go version: **1.26.0**. `CGO_ENABLED=0` is required
+for all builds and tests.
+
+## :material-engine-outline: Engine Architecture
+
+Understanding how the engine works is essential for
+plugin development and engine contributions.
+
+```mermaid
+flowchart TD
+    A["REST API\n:4000\nPOST /sessions/{id}/assets"] --> B
+
+    B["Session Backlog\nSQLite\nclaim/ack/lease"]
+
+    B -->|"ClaimNext(assetType)"| C
+
+    C["Dispatcher\nrunEvents loop\nauto-scales / 5 s"]
+
+    C -->|"consistent hash ring"| D
+
+    D["PipelinePool\n1 pool per asset type\nFQDN/IP: 4-32\nothers: 1-4"]
+
+    D --> E["Pipeline Stages\nPos 1-50\nFIFO/DynPool\nor Parallel"]
+
+    E --> F["Handler Callback\nfunc(*Event) error"]
+
+    F -->|"DispatchEvent\nnew assets"| B
+```
+
+### Key Components
+
+**SessionManager** — each enumeration runs in an
+isolated `Session`. Each session has its own database
+connection, `Backlog`, `Scope`, structured logger,
+and temp directory.
+
+**Registry** — plugins register `Handler` structs at
+startup, keyed by `EventType` (asset type) and
+`Position` (priority 1–50). The registry builds
+pipelines from these registrations.
+
+**Dispatcher** — claims events from session backlogs,
+routes them to asset-type-specific `PipelinePool`
+instances via a consistent hash ring, and auto-scales
+pool size every 5 seconds based on backlog depth.
+
+**PipelinePool** — per-asset-type pool. A consistent
+hash ring with 50 slots routes events to specific
+instances (ensuring locality). FQDN and IPAddress
+pools scale between 4 and 32 instances; all other
+types scale between 1 and 4. Hard cap is
+`maxInstances × 2`.
+
+**Backlog** — durable SQLite-backed work queue with
+claim/ack/lease semantics. Items progress through:
+`queued → leased (in-flight) → done`. Expired leases
+automatically return items to `queued`, enabling
+recovery from failures.
+
+### Handler Instance Constants
+
+Defined in `engine/plugins/support/support.go`:
+
+| Constant | Value | Typical Use |
+| :--- | :---: | :--- |
+| `MinHandlerInstances` | 4 | Low-volume handlers |
+| `MidHandlerInstances` | 16 | Standard API sources |
+| `HighHandlerInstances` | 32 | High-throughput |
+| `MaxHandlerInstances` | 64 | DNS apex, bulk res. |
+
+## :material-puzzle-outline: Writing a Plugin
+
+Plugins live in `engine/plugins/` organized by
+category (`api/`, `dns/`, `scrape/`, `whois/`,
+`enrich/`, `brute/`, `horizontals/`,
+`service_discovery/`).
+
+### Step 1 — Implement the Plugin Interface
+
+```go
+// engine/types/registry.go
+type Plugin interface {
+    Name() string
+    // Called once at startup; register handlers here
+    Start(r Registry) error
+    // Clean up goroutines and channels
+    Stop()
+}
+```
+
+### Step 2 — Register Handlers in Start
+
+```go
+func (p *myPlugin) Start(r et.Registry) error {
+    p.log = r.Log().WithGroup("plugin").
+        With("name", p.name)
+
+    if err := r.RegisterHandler(&et.Handler{
+        Plugin: p,
+        Name:   "MyPlugin-Handler",
+        // Priority 1-50; lower runs first
+        Position: 21,
+        // Max concurrency (4/16/32/64)
+        MaxInstances: support.MidHandlerInstances,
+        // Asset type that triggers this handler
+        EventType: oam.FQDN,
+        // Asset types this handler may produce
+        Transforms: []string{
+            string(oam.FQDN),
+        },
+        Callback: p.check,
+    }); err != nil {
+        return err
+    }
+
+    p.log.Info("Plugin started")
+    return nil
+}
+```
+
+Handler field reference:
+
+| Field | Description |
+| :--- | :--- |
+| `Position` | Priority 1–50; lower runs first |
+| `Exclusive` | If `true`, sole handler at this pos |
+| `MaxInstances` | Enables `DynamicPool` when `> 1` |
+| `Transforms` | Asset types you will dispatch |
+| `EventType` | OAM asset type triggering handler |
+| `Callback` | `func(*Event) error` — the logic |
+
+### Step 3 — Implement the Callback
+
+The pattern below is drawn from
+`engine/plugins/api/chaos.go`, the clearest
+minimal example:
+
+```go
+func (p *myPlugin) check(e *et.Event) error {
+    // 1. Type-assert the triggering asset
+    fqdn, ok := e.Entity.Asset.(*oamdns.FQDN)
+    if !ok {
+        return errors.New(
+            "failed to extract the FQDN asset")
+    }
+
+    // 2. Confirm the SLD is in scope
+    if !support.HasSLDInScope(e) {
+        return nil
+    }
+
+    // 3. Confirm credentials are present
+    ds := e.Session.Config().
+        GetDataSourceConfig(p.name)
+    if ds == nil || len(ds.Creds) == 0 {
+        return nil
+    }
+
+    // 4. Check TTL — skip if processed recently
+    since, err := support.TTLStartTime(
+        e.Session.Config(),
+        string(oam.FQDN),
+        string(oam.FQDN),
+        p.name,
+    )
+    if err != nil {
+        return err
+    }
+    if support.AssetMonitoredWithinTTL(
+        e.Session, e.Entity, p.source, since) {
+        return nil
+    }
+
+    // 5. Query, store, dispatch
+    names := p.query(e, fqdn.Name)
+    support.MarkAssetMonitored(
+        e.Session, e.Entity, p.source)
+
+    if len(names) > 0 {
+        entities := support.StoreFQDNsWithSource(
+            e.Session, names, p.source,
+            p.name, p.name+"-Handler")
+        support.ProcessFQDNsWithSource(
+            e, entities, p.source)
+    }
+    return nil
+}
+```
+
+Common support utilities
+(`engine/plugins/support/support.go`):
+
+| Function | Purpose |
+| :--- | :--- |
+| `HasSLDInScope(e)` | True if SLD is in scope |
+| `TTLStartTime(...)` | Cache window cutoff time |
+| `AssetMonitoredWithinTTL(...)` | Already done? |
+| `MarkAssetMonitored(...)` | Record processing |
+| `StoreFQDNsWithSource(...)` | Bulk-create FQDNs |
+| `ProcessFQDNsWithSource(...)` | Dispatch events |
+
+### Step 4 — Register in load.go
+
+Add your constructor to the `pluginNewFuncs` slice
+in `engine/plugins/load.go`:
+
+```go
+var pluginNewFuncs = []func() et.Plugin{
+    // ...existing plugins...
+    mypackage.NewMyPlugin,
+}
+```
+
+### Step 5 — Add Credentials Template
+
+If your plugin requires credentials, add a
+commented-out entry to `resources/datasources.yaml`:
+
+```yaml
+# - name: MyDataSource
+#   ttl: 1440
+#   creds:
+#     account:
+#       apikey: null
+```
+
+## :material-source-pull: Contributing to Other Repos
+
+The same git workflow applies to all repos under the
+`owasp-amass` organization:
+
+- **`asset-db`** — database schema, repository
+  interface, and migrations
+- **`open-asset-model`** — asset, relation, and
+  property type definitions in Go
+- **`docs`** — this documentation site;
+  `mkdocs build --strict` must pass before
+  opening a PR
+
+---
+
+*© 2025 Jeff Foley — Licensed under Apache 2.0.*
diff --git a/docs/data_sources/data_sources.md b/docs/data_sources/data_sources.md
index 8b137891..430e662d 100644
--- a/docs/data_sources/data_sources.md
+++ b/docs/data_sources/data_sources.md
@@ -1 +1,130 @@
+# :simple-owasp: Data Sources
 
+Data sources are plugins that query external services
+to discover assets. They include API integrations, web
+scrapers, and WHOIS/RDAP lookups. Each data source
+registers one or more Handlers that are triggered when
+specific asset types are processed by the engine.
+
+All data sources that require credentials are configured
+through a `datasources.yaml` file.
+
+## :material-key: Configuring Credentials
+
+Point Amass to your data sources file from
+`config.yaml`:
+
+```yaml
+options:
+  datasources: "./datasources.yaml"
+```
+
+Each source in `datasources.yaml` follows this format:
+
+```yaml
+datasources:
+  - name: Shodan
+    ttl: 10080  # optional: cache TTL (minutes)
+    creds:
+      account:
+        apikey: YOUR_API_KEY
+```
+
+For sources requiring a username and API key:
+
+```yaml
+  - name: PassiveTotal
+    creds:
+      account:
+        username: your@email.com
+        apikey: YOUR_API_KEY
+```
+
+For sources requiring a username and password:
+
+```yaml
+  - name: CIRCL
+    creds:
+      account:
+        username: YOUR_USERNAME
+        password: YOUR_PASSWORD
+```
+
+Multiple credential accounts are supported per
+source — useful for rate limit distribution:
+
+```yaml
+  - name: C99
+    creds:
+      account1:
+        apikey: FIRST_KEY
+      account2:
+        apikey: SECOND_KEY
+```
+
+## :material-tune: Global Options
+
+Set a minimum TTL floor that applies to all sources:
+
+```yaml
+global_options:
+  minimum_ttl: 1440  # 1 day (minutes) — default
+```
+
+Sources without an explicit `ttl` value in their entry
+use this global floor.
+
+## :material-table: Available Data Sources
+
+All 38 data sources require credentials. Default TTL
+is 1440 minutes (1 day) unless listed otherwise. The
+full template for all sources is in
+[`resources/datasources.yaml`][ds-yaml].
+
+[ds-yaml]:
+  https://github.com/owasp-amass/amass/blob/main/resources/datasources.yaml
+
+| Source | Cred Type | TTL |
+| :--- | :--- | :---: |
+| 360PassiveDNS | API Key | 3600 |
+| ASNLookup | API Key | 1440 |
+| Ahrefs | API Key | 4320 |
+| AlienVault | API Key | 1440 |
+| BeVigil | API Key | 1440 |
+| BigDataCloud | API Key | 1440 |
+| BinaryEdge | API Key | 10080 |
+| BufferOver | API Key | 1440 |
+| BuiltWith | API Key | 10080 |
+| C99 | API Key (multi-acct) | 4320 |
+| CertCentral | User + API Key | 1440 |
+| Chaos | API Key | 4320 |
+| CIRCL | User + Password | 1440 |
+| DNSDB | API Key | 4320 |
+| DNSRepo | API Key | 1440 |
+| Detectify | API Key | 1440 |
+| FOFA | User + API Key | 10080 |
+| FullHunt | API Key | 1440 |
+| GitHub | API Key | 4320 |
+| GitLab | API Key | 4320 |
+| HackerTarget | API Key | 1440 |
+| Hunter | API Key | 1440 |
+| IntelX | API Key | 1440 |
+| IPdata | API Key | 1440 |
+| IPinfo | API Key | 1440 |
+| LeakIX | API Key | 1440 |
+| Netlas | API Key | 1440 |
+| PassiveTotal | User + API Key | 10080 |
+| PentestTools | API Key | 10080 |
+| PublicWWW | API Key | 10080 |
+| SecurityTrails | API Key | 1440 |
+| Shodan | API Key | 10080 |
+| URLScan | API Key | 1440 |
+| VirusTotal | API Key | 10080 |
+| WhoisXMLAPI | API Key | 1440 |
+| Yandex | User + API Key | 1440 |
+| ZETAlytics | API Key | 1440 |
+| ZoomEye | User + Password | 1440 |
+
+---
+
+*© 2025 Jeff Foley — Licensed under Apache 2.0.*
diff --git a/docs/open_asset_model/assets/identifier.md b/docs/open_asset_model/assets/identifier.md
index a869ddc3..3f9e06bb 100644
--- a/docs/open_asset_model/assets/identifier.md
+++ b/docs/open_asset_model/assets/identifier.md
@@ -53,7 +53,7 @@ allrel --> contact
 
 | Relation Type | Relation Label | Target Assets | Description |
 | :-----------------: | :----------------: | :--------------: | :------------ |
-| [`SimpleRelation`](../relations/simple_relation.md) | `registration_agency` | [`ContactRecord`](./contact_record.md.md) | Links a registration ID with the registering agency |
+| [`SimpleRelation`](../relations/simple_relation.md) | `registration_agency` | [`ContactRecord`](./contact_record.md) | Links a registration ID with the registering agency |
 | [`SimpleRelation`](../relations/simple_relation.md) | `issuing_authority` | [`ContactRecord`](./contact_record.md) | Links an identifier with its issuing authority |
 | [`SimpleRelation`](../relations/simple_relation.md) | `issuing_agent` | [`ContactRecord`](./contact_record.md) | Links an identifier with its issuing agent |
 
diff --git a/docs/open_asset_model/assets/index.md b/docs/open_asset_model/assets/index.md
index 0d0b4699..4e369c7f 100644
--- a/docs/open_asset_model/assets/index.md
+++ b/docs/open_asset_model/assets/index.md
@@ -102,8 +102,7 @@ Take a look at the pages where details are provided for each asset type.
 
 - [Relations](../relations/index.md) – Overview of Relations in the Open Asset Model.
 - [Properties](../properties/index.md) - Overview of a Property in the Open Asset Model.
-- [Triples](../assetdb/triples.md) – Querying the graph with SPARQL‑inspired triples.
-- [Assoc Tool](../framework_tools/assoc.md) – Using the command-line tool that queries the graph.
+- [Triples](../../asset_db/triples.md) – Querying the graph with SPARQL‑inspired triples.
 
 ---
 
diff --git a/docs/open_asset_model/properties/index.md b/docs/open_asset_model/properties/index.md
index 972b73de..3f228b35 100644
--- a/docs/open_asset_model/properties/index.md
+++ b/docs/open_asset_model/properties/index.md
@@ -81,8 +81,7 @@ Explore the types of data used to enrich and explain assets in the graph:
 
 - [Assets](../assets/index.md) – The core entities in the graph.
 - [Relations](../relations/index.md) – Overview of Relations in the Open Asset Model.
-- [Triples](../assetdb/triples.md) – Performing graph queries using SPARQL-style syntax.
-- [Assoc Tool](../framework_tools/assoc.md) – Using the command-line tool that queries the graph.
+- [Triples](../../asset_db/triples.md) – Performing graph queries using SPARQL-style syntax.
 
 ---
 
diff --git a/docs/open_asset_model/relations/index.md b/docs/open_asset_model/relations/index.md
index d1ed1320..a35b42cd 100644
--- a/docs/open_asset_model/relations/index.md
+++ b/docs/open_asset_model/relations/index.md
@@ -37,7 +37,7 @@ Each relation answers three questions:
 }
 ```
 
-See specific relation types for actual JSON field names. Also, see the [Assoc Tool](../framework_tools/assoc.md) and [Triples](../assetdb/triples.md) for more information regarding how to query the data collected.
+See specific relation types for actual JSON field names. Also, see [Triples](../../asset_db/triples.md) for more information regarding how to query the data collected.
 
 ## :material-graph-outline: Common Relation Labels (Partial)
 
@@ -82,8 +82,8 @@ Learn more about the structure and usage of the model:
 
 - [Assets](../assets/index.md) – The core entities in the graph.
 - [Properties](../properties/index.md) – Descriptive metadata that enrich assets.
-- [Triples](../assetdb/triples.md) – Performing graph queries using SPARQL-style syntax.
-- [Assoc Tool](../framework_tools/assoc.md) – Using the command-line tool that queries the graph.
+- [Triples](../../asset_db/triples.md) – Performing graph queries using SPARQL-style syntax.
+
 ---
 
 *© 2025 Jeff Foley — Licensed under Apache 2.0.*
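
Trailing note for reviewers (not part of the patch): the pages touched
here all link to the Triples page, and the new asset_db/index.md
describes relations as directed edges, for example an FQDN resolving to
an IP address via a `dns_record` relation. The Go sketch below only
illustrates that subject-predicate-object shape. The `Edge` type, the
`match` helper, and the sample hostnames and addresses are hypothetical;
this is not the asset-db schema and not the Triples query syntax.

```go
package main

import "fmt"

// Edge is a hypothetical, simplified stand-in for a stored relation:
// a directed, labeled link from one asset to another.
type Edge struct {
	From  string // subject asset, e.g. an FQDN
	Label string // relation label, e.g. "dns_record"
	To    string // object asset, e.g. an IP address
}

// match returns the edges that agree with every non-empty field of
// pattern; an empty field acts as a wildcard.
func match(edges []Edge, pattern Edge) []Edge {
	var out []Edge
	for _, e := range edges {
		if (pattern.From == "" || pattern.From == e.From) &&
			(pattern.Label == "" || pattern.Label == e.Label) &&
			(pattern.To == "" || pattern.To == e.To) {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Sample data using documentation-reserved names and addresses.
	edges := []Edge{
		{"www.example.com", "dns_record", "192.0.2.10"},
		{"mail.example.com", "dns_record", "192.0.2.25"},
		{"example.com", "placeholder_label", "www.example.com"},
	}

	// Every edge labeled dns_record, whatever its endpoints.
	for _, e := range match(edges, Edge{Label: "dns_record"}) {
		fmt.Printf("%s -[%s]-> %s\n", e.From, e.Label, e.To)
	}
}
```

Because each asset is stored once, a query of this shape walks edges
between shared entities instead of comparing duplicated rows.
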