68 changes: 68 additions & 0 deletions .dockerignore
@@ -0,0 +1,68 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Virtual environments
venv/
env/
ENV/

# IDE
.vscode/
.idea/
*.swp
*.swo
*~

# OS
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# Docker
Dockerfile
docker-compose.yml
.dockerignore

# Git
.git/
.gitignore

# Documentation
*.md
!README.md

# Logs
*.log

# Test files
tests/
.pytest_cache/

# TorBot specific
*.json
!requirements.json
src/*.json
56 changes: 56 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,56 @@
# Contributing to TorBot

Thank you for your interest in contributing to TorBot! This document provides guidelines for contributors.

## Getting Started

1. Fork the repository
2. Clone your fork locally
3. Create a virtual environment: `python -m venv venv`
4. Activate it: `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Unix/Mac)
5. Install dependencies: `pip install -r requirements.txt`

## Making Changes

1. Create a new branch for your feature/fix
2. Make your changes
3. Test your changes thoroughly
4. Update documentation if necessary
5. Update CHANGELOG.md with your changes

## Testing

Run the test suite before submitting:
```bash
python -m pytest tests/
```

## Code Style

- Follow PEP 8 guidelines
- Use meaningful variable and function names
- Add comments for complex logic
- Ensure all new code has appropriate error handling

## Security Considerations

- Never commit sensitive information (API keys, passwords, etc.)
- Validate all user inputs
- Use secure coding practices
- Test with various edge cases
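
As one illustration of validating user input, a crawler entry point might reject malformed URLs before making any request. The sketch below is only an assumption for illustration: `validate_url` and `ALLOWED_SCHEMES` are hypothetical names, not existing TorBot code, and only the standard library is used.

```python
from urllib.parse import urlparse

# Assumption: only plain web URLs are accepted for crawling.
ALLOWED_SCHEMES = {"http", "https"}


def validate_url(raw: str) -> str:
    """Return the stripped URL, or raise ValueError if it is not usable.

    Hypothetical helper for illustration; not part of TorBot.
    """
    url = raw.strip()
    parsed = urlparse(url)  # the scheme is returned lowercased
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported URL scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no host")
    return url
```

Raising `ValueError` early keeps the failure close to the input that caused it, which also makes the edge cases easy to cover in tests.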

## Submitting Changes

1. Push your branch to your fork
2. Create a pull request with:
   - Clear description of changes
   - Reference to any related issues
   - Testing evidence/screenshots if applicable

## Review Process

- All PRs require review before merging
- Address feedback promptly
- Ensure CI checks pass

Thank you for contributing to TorBot!
71 changes: 57 additions & 14 deletions Dockerfile
@@ -1,22 +1,65 @@
# Use an official Python 3.11.4 image as the base
FROM python:3.11.4
# Multi-stage Dockerfile for TorBot
# Stage 1: Build stage
FROM python:3.11.4 AS builder

# Set a working directory within the container
WORKDIR /app
# Set working directory
WORKDIR /build

# Install system dependencies required for building Python packages
RUN apt-get update && apt-get install -y \
    gcc \
    g++ \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for better layer caching
COPY requirements.txt .

# Create virtual environment and install dependencies
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime stage
FROM python:3.11.4-slim AS runtime

# Create non-root user for security
RUN groupadd -r torbot && useradd -r -g torbot torbot

# Clone the TorBot repository from GitHub
RUN git clone https://github.com/DedSecInside/TorBot.git /app
# Install runtime dependencies only
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install dependencies
RUN pip install -r /app/requirements.txt
# Copy virtual environment from builder stage
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Set the SOCKS5_PORT environment variable
# Set working directory
WORKDIR /app

# Copy application code
COPY --chown=torbot:torbot . /app

# Set environment variables
ENV PYTHONPATH="/app"
ENV SOCKS5_PORT=9050
ENV PYTHONUNBUFFERED=1

# Expose the port specified in the .env file
# Switch to non-root user
USER torbot

# Expose port
EXPOSE $SOCKS5_PORT

# Run the TorBot script
CMD ["poetry", "run", "python", "torbot"]
# Example way to run the container:
# docker run --network="host" your-image-name poetry run python torbot -u https://www.example.com --depth 2 --visualize tree --save json
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import sys; sys.exit(0)"

# Default command
CMD ["python", "torbot.py", "--help"]

# Labels for better image management
LABEL maintainer="TorBot Team"
LABEL version="4.2.0"
LABEL description="TorBot - A web scraping and analysis tool with Tor support"
45 changes: 35 additions & 10 deletions README.md
@@ -1,4 +1,4 @@
<pre>
<pre>

████████╗ ██████╗ ██████╗ ██████╗ ██████╗ ████████╗
╚══██╔══╝██╔═══██╗██╔══██╗ ██╔══██╗██╔═████╗╚══██╔══╝
@@ -47,6 +47,11 @@
- Poetry (Optional)

### Python Dependencies
- All dependencies have been updated to latest secure versions (2024)
- Compatible with httpx 0.28.1+ (fixed proxy configuration)
- Enhanced error handling for NLP operations
- Updated security patches for all dependencies
- Added lxml>=5.3.0 for improved XML/HTML parsing

(see pyproject.toml or requirements.txt for more details)

@@ -55,26 +60,43 @@
### TorBot

#### Using `venv`
* If using Python ^3.4,
* If using Python ^3.9,
```sh
python -m venv torbot_venv
source torbot_venv/bin/activate
source torbot_venv/bin/activate # On Windows: source torbot_venv/Scripts/activate
pip install -r requirements.txt
pip install -e .
./main.py --help
python main.py --help
```

#### Using `docker`
#### Using Docker (Multi-stage build)

**Build the optimized image:**
```sh
docker build -t {image_name} .
docker build -t torbot:latest .
```

# Running without Tor
docker run {image_name} poetry run python torbot -u https://example.com --depth 2 --visualize tree --save json --disable-socks5
**Run with Docker Compose (Recommended):**
```sh
docker-compose up torbot
```

# Running with Tor
docker run --network="host" {image_name} poetry run python torbot -u https://example.com --depth 2 --visualize tree --save json --disable-socks5
**Run manually:**
```sh
# Basic usage
docker run --rm torbot:latest -u https://example.com --depth 2 --visualize tree --save json

# With Tor proxy
docker run --rm --network="host" torbot:latest -u https://example.onion --depth 2 --visualize tree

# Custom SOCKS5 proxy
docker run --rm torbot:latest -u https://example.onion --host 127.0.0.1 --port 9050 --depth 2
```

**Environment Variables:**
- `SOCKS5_HOST`: SOCKS5 proxy host (default: 127.0.0.1)
- `SOCKS5_PORT`: SOCKS5 proxy port (default: 9050)
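
How TorBot itself reads these variables is not shown in this diff. As a minimal sketch, assuming the defaults listed above, an application could resolve them like this; `socks5_settings` is a hypothetical helper name used only for illustration.

```python
import os


def socks5_settings(environ=None):
    """Return (host, port) for the SOCKS5 proxy.

    Hypothetical helper for illustration; the defaults mirror the
    values documented above (127.0.0.1 and 9050).
    """
    env = os.environ if environ is None else environ
    host = env.get("SOCKS5_HOST", "127.0.0.1")
    port = int(env.get("SOCKS5_PORT", "9050"))  # ValueError if not numeric
    if not 0 < port < 65536:
        raise ValueError(f"SOCKS5_PORT out of range: {port}")
    return host, port
```

For example, `socks5_settings({"SOCKS5_PORT": "9150"})` returns `("127.0.0.1", 9150)`. Passing a mapping instead of relying on `os.environ` keeps the function easy to test.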

### Options
<pre>
usage: Gather and analyze data from Tor sites.
@@ -103,6 +125,9 @@ Read more about torrc here : [Torrc](https://github.com/DedSecInside/TorBoT/blob
- [x] Implement BFS Search for webcrawler
- [x] Improve stability (Handle errors gracefully, expand test coverage, etc.)
- [x] Increase test coverage
- [x] Multi-stage Docker build for optimized container
- [x] Docker Compose support
- [x] Enhanced security with non-root container user
- [ ] Save the most recent search results to a database
- [ ] Randomize Tor Connection (Random Header and Identity)
- [ ] Keyword/Phrase Search