7 changes: 7 additions & 0 deletions .github/workflows/build-tools.yml
@@ -34,6 +34,12 @@ jobs:
            path: crates/synapse-cli
            ext: .exe

          # Atomicity Checker (Linux)
          - target: x86_64-unknown-linux-gnu
            os: ubuntu-latest
            tool: atomicity-checker
            path: tools/atomicity-checker

    steps:
      - uses: actions/checkout@v4

@@ -75,6 +81,7 @@ jobs:

          # Linux
          cp downloaded-artifacts/synapse-x86_64-unknown-linux-gnu/synapse bin/synapse-linux
          cp downloaded-artifacts/atomicity-checker-x86_64-unknown-linux-gnu/atomicity-checker bin/atomicity-checker-linux

          # Windows
          cp downloaded-artifacts/synapse-x86_64-pc-windows-msvc/synapse.exe bin/synapse.exe
36 changes: 36 additions & 0 deletions .github/workflows/commit-atomicity.yml
@@ -0,0 +1,36 @@
# 🔍 Commit Atomicity - Ensures commits follow the atomic commit principle
# Part of Git-Core Protocol

name: 🔍 Commit Atomicity

on:
  pull_request:
    branches: [main]
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: read

jobs:
  check-atomicity:
    name: Check Atomicity
    runs-on: ubuntu-latest

    steps:
      - name: 📋 Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history needed for commit analysis

      - name: 🦀 Setup Rust
        uses: dtolnay/rust-toolchain@stable

      - name: 🏗️ Build Atomicity Checker
        run: cargo build --release -p atomicity-checker

      - name: 🔍 Run Atomicity Check
        run: |
          ./target/release/atomicity-checker \
            --range "origin/${{ github.base_ref }}..HEAD" \
            --config ".github/atomicity-config.yml"
Comment on lines +33 to +36

⚠️ Potential issue | 🟠 Major

Fetch the base ref before computing origin/<base>..HEAD.

Line 35 assumes origin/${{ github.base_ref }} is present locally. In PR checkouts this can be missing, causing range resolution failures and a false-red workflow.

Suggested workflow patch
       - name: 🦀 Setup Rust
         uses: dtolnay/rust-toolchain@stable

+      - name: 📥 Fetch base branch ref
+        run: |
+          git fetch --no-tags --prune --depth=1 origin \
+            +refs/heads/${{ github.base_ref }}:refs/remotes/origin/${{ github.base_ref }}
+
       - name: 🏗️ Build Atomicity Checker
         run: cargo build --release -p atomicity-checker
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/commit-atomicity.yml around lines 33 - 36, The workflow
step that runs the atomicity checker uses the range string "origin/${{
github.base_ref }}..HEAD" but assumes the base branch exists locally; update the
job to fetch the base ref before running the checker (e.g., add a git fetch
origin "${{ github.base_ref }}" or a broader git fetch --prune --unshallow
origin +refs/heads/${{ github.base_ref }}:refs/remotes/origin/${{
github.base_ref }}) so the referenced origin/${{ github.base_ref }} is present
and the atomicity-checker range resolves correctly.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -7,7 +7,7 @@ members = [
    "crates/synapse-immune",
    "apps/desktop/src-tauri",
    "crates/synapse-cognition",
-   ".github/actions/structure-validator",
+   ".github/actions/structure-validator", "tools/atomicity-checker",
]

# Ensure Cargo doesn't treat this crate as an implicit workspace member.
20 changes: 20 additions & 0 deletions tools/atomicity-checker/Cargo.toml
@@ -0,0 +1,20 @@
[package]
name = "atomicity-checker"
version.workspace = true
edition.workspace = true
license.workspace = true
authors.workspace = true
repository.workspace = true
description.workspace = true

[dependencies]
clap = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
serde_yaml = "0.9"
regex = { workspace = true }
anyhow = { workspace = true }
colored = "2"
git2 = "0.19"
globset = "0.4"
Comment on lines +14 to +19


medium

For better build reproducibility and to avoid potential issues from transitive dependencies, it's a good practice to pin your dependencies to more specific versions rather than using broad version requirements.

Suggested change
-serde_yaml = "0.9"
-regex = { workspace = true }
-anyhow = { workspace = true }
-colored = "2"
-git2 = "0.19"
-globset = "0.4"
+serde_yaml = "0.9.34"
+regex = { workspace = true }
+anyhow = { workspace = true }
+colored = "2.1.0"
+git2 = "0.19.0"
+globset = "0.4.14"

chrono = { workspace = true }
141 changes: 141 additions & 0 deletions tools/atomicity-checker/src/checker.rs
@@ -0,0 +1,141 @@
use crate::config::Config;
use crate::models::{CommitInfo, FileInfo};
use globset::{Glob, GlobSet, GlobSetBuilder};
use anyhow::Result;
use std::collections::HashSet;
use regex::Regex;

pub struct Checker {
    config: Config,
    concern_globs: Vec<(String, GlobSet)>,
    exclude_glob: GlobSet,
    bot_regexes: Vec<Regex>,
}

impl Checker {
    pub fn new(config: Config) -> Result<Self> {
        let mut concern_globs = Vec::new();
        for (concern, patterns) in &config.concern_patterns {
            let mut builder = GlobSetBuilder::new();
            for pattern in patterns {
                builder.add(Glob::new(pattern)?);
            }
            concern_globs.push((concern.clone(), builder.build()?));
        }
Comment on lines +17 to +24

⚠️ Potential issue | 🟡 Minor

Non-deterministic iteration order may cause inconsistent categorization.

config.concern_patterns is a HashMap, and iterating over it yields entries in arbitrary order. If a file matches multiple concern patterns, the first match wins (line 59-61), leading to potentially different results across runs.

🔧 Proposed fix to ensure deterministic ordering
+        let mut sorted_concerns: Vec<_> = config.concern_patterns.iter().collect();
+        sorted_concerns.sort_by_key(|(k, _)| k.clone());
+        
         let mut concern_globs = Vec::new();
-        for (concern, patterns) in &config.concern_patterns {
+        for (concern, patterns) in sorted_concerns {
             let mut builder = GlobSetBuilder::new();
             for pattern in patterns {
                 builder.add(Glob::new(pattern)?);
             }
             concern_globs.push((concern.clone(), builder.build()?));
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        let mut concern_globs = Vec::new();
-        for (concern, patterns) in &config.concern_patterns {
-            let mut builder = GlobSetBuilder::new();
-            for pattern in patterns {
-                builder.add(Glob::new(pattern)?);
-            }
-            concern_globs.push((concern.clone(), builder.build()?));
-        }
+        let mut sorted_concerns: Vec<_> = config.concern_patterns.iter().collect();
+        sorted_concerns.sort_by_key(|(k, _)| k.clone());
+        let mut concern_globs = Vec::new();
+        for (concern, patterns) in sorted_concerns {
+            let mut builder = GlobSetBuilder::new();
+            for pattern in patterns {
+                builder.add(Glob::new(pattern)?);
+            }
+            concern_globs.push((concern.clone(), builder.build()?));
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/atomicity-checker/src/checker.rs` around lines 17 - 24, The iteration
over config.concern_patterns (a HashMap) is non-deterministic which can change
which concern is chosen when a file matches multiple patterns; make the ordering
deterministic by collecting the entries from config.concern_patterns into a Vec,
sorting that Vec by the concern key (or by a stable comparator), then building
concern_globs from that sorted list using GlobSetBuilder and builder.build() so
the first-match logic remains stable; update the loop that constructs
concern_globs (and any subsequent loop that assigns concerns using those
globsets) to use the sorted sequence instead of iterating the HashMap directly.
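The ordering fix can be sketched in isolation. The helper below is hypothetical (the suggested patch above sorts inline inside `Checker::new`), but it shows the core idea: sort the `HashMap` entries by key before building matchers, so "first match wins" resolves identically on every run.

```rust
use std::collections::HashMap;

// Hypothetical helper: produce (concern, patterns) pairs in a stable order.
// HashMap iteration order is arbitrary; sorting by key fixes it.
fn ordered_concerns(patterns: &HashMap<String, Vec<String>>) -> Vec<(String, Vec<String>)> {
    let mut entries: Vec<(String, Vec<String>)> = patterns
        .iter()
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();
    // Any stable total order works; lexicographic key order is the obvious choice.
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}
```

With patterns for `source` and `docs`, the result always lists `docs` first, no matter how the `HashMap` happens to hash its keys in a given run.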


        let mut exclude_builder = GlobSetBuilder::new();
        for pattern in &config.exclude_patterns {
            exclude_builder.add(Glob::new(pattern)?);
        }
        let exclude_glob = exclude_builder.build()?;

        let mut bot_regexes = Vec::new();
        for pattern in &config.bot_patterns {
            bot_regexes.push(Regex::new(pattern)?);
        }

        Ok(Checker {
            config,
            concern_globs,
            exclude_glob,
            bot_regexes,
        })
    }

    pub fn is_bot(&self, author: &str) -> bool {
        if !self.config.ignore_bots {
            return false;
        }
        for re in &self.bot_regexes {
            if re.is_match(author) {
                return true;
            }
        }
        false
Comment on lines +49 to +54


medium

The loop to check for a bot author can be simplified by using the any iterator adapter. This is more idiomatic and makes the code more concise and readable.

        self.bot_regexes.iter().any(|re| re.is_match(author))

    }

    pub fn categorize_file(&self, path: &str) -> String {
        for (concern, globset) in &self.concern_globs {
            if globset.is_match(path) {
                return concern.clone();
            }
        }

        // Fallback categorization logic if not matched by config
        if path.contains("test") || path.contains("spec") {
            return "tests".to_string();
        }
        if path.ends_with(".md") {
            return "docs".to_string();
        }
Comment on lines +64 to +70


medium

The fallback categorization logic appears to be a regression compared to the original shell script it replaces. The script had more comprehensive checks for test files (e.g., _test.rs, *.test.rs) and documentation files (e.g., README). Consider making this logic more robust to match the previous implementation, or ideally, making these fallback patterns configurable as well.


"other".to_string()
Comment on lines +64 to +72

⚠️ Potential issue | 🟡 Minor

Fallback categorization is case-sensitive, potentially missing test files.

The substring checks path.contains("test") and path.contains("spec") are case-sensitive. Paths like Tests/, __TESTS__/, or Spec/ would fall through to "other".

🔧 Proposed fix for case-insensitive matching
         // Fallback categorization logic if not matched by config
-        if path.contains("test") || path.contains("spec") {
+        let path_lower = path.to_lowercase();
+        if path_lower.contains("test") || path_lower.contains("spec") {
             return "tests".to_string();
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        // Fallback categorization logic if not matched by config
-        if path.contains("test") || path.contains("spec") {
-            return "tests".to_string();
-        }
-        if path.ends_with(".md") {
-            return "docs".to_string();
-        }
-        "other".to_string()
+        // Fallback categorization logic if not matched by config
+        let path_lower = path.to_lowercase();
+        if path_lower.contains("test") || path_lower.contains("spec") {
+            return "tests".to_string();
+        }
+        if path.ends_with(".md") {
+            return "docs".to_string();
+        }
+        "other".to_string()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/atomicity-checker/src/checker.rs` around lines 64 - 72, The fallback
categorization uses case-sensitive substring checks on the path variable and
misses paths like "Tests" or "Spec"; modify the logic in the block that checks
path.contains("test") / path.contains("spec") and path.ends_with(".md") to
operate on a normalized lowercase string (e.g., let lc = path.to_lowercase())
and use lc.contains("test"), lc.contains("spec"), and lc.ends_with(".md") so
matching becomes case-insensitive while preserving the same return values
("tests", "docs", "other").

    }

    pub fn is_excluded(&self, path: &str) -> bool {
        self.exclude_glob.is_match(path)
    }

    pub fn analyze_commit(&self, sha: String, message: String, author: String, files: Vec<String>) -> CommitInfo {
        let mut concerns = HashSet::new();
        let mut file_infos = Vec::new();

        for file in files {
            if self.is_excluded(&file) {
                continue;
            }

            let concern = self.categorize_file(&file);
            concerns.insert(concern.clone());
            file_infos.push(FileInfo {
                path: file,
                concern,
            });
        }

        let count = concerns.len();
        let is_atomic = count <= self.config.max_concerns;

        CommitInfo {
            sha,
            message,
            author,
            concerns: concerns.into_iter().collect(),
            count,
            is_atomic,
            files: file_infos,
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::config::Config;
    use std::collections::HashMap;

    #[test]
    fn test_categorization() {
        let mut concern_patterns = HashMap::new();
        concern_patterns.insert("source".to_string(), vec!["src/**/*.rs".to_string()]);
        concern_patterns.insert("docs".to_string(), vec!["*.md".to_string()]);

        let config = Config {
            enabled: true,
            mode: "warning".to_string(),
            ignore_bots: true,
            bot_patterns: vec![],
            max_concerns: 1,
            concern_patterns,
            exclude_patterns: vec!["Cargo.lock".to_string()],
        };

        let checker = Checker::new(config).unwrap();

        assert_eq!(checker.categorize_file("src/main.rs"), "source");
        assert_eq!(checker.categorize_file("README.md"), "docs");
        assert_eq!(checker.categorize_file("other.txt"), "other");
        assert!(checker.is_excluded("Cargo.lock"));
        assert!(!checker.is_excluded("src/main.rs"));
    }
}
48 changes: 48 additions & 0 deletions tools/atomicity-checker/src/config.rs
@@ -0,0 +1,48 @@
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::fs;
use std::path::Path;
use anyhow::Result;

#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct Config {
    #[serde(default = "default_enabled")]
    pub enabled: bool,
    #[serde(default = "default_mode")]
    pub mode: String,
    #[serde(default = "default_ignore_bots")]
    pub ignore_bots: bool,
    #[serde(default)]
    pub bot_patterns: Vec<String>,
    #[serde(default = "default_max_concerns")]
    pub max_concerns: usize,
    #[serde(default)]
    pub concern_patterns: HashMap<String, Vec<String>>,
    #[serde(default)]
    pub exclude_patterns: Vec<String>,
}

fn default_enabled() -> bool { true }
fn default_mode() -> String { "warning".to_string() }
fn default_ignore_bots() -> bool { true }
fn default_max_concerns() -> usize { 1 }

impl Config {
    pub fn load<P: AsRef<Path>>(path: P) -> Result<Self> {
        let content = fs::read_to_string(path)?;
        let config: Config = serde_yaml::from_str(&content)?;
        Ok(config)
    }

    pub fn default() -> Self {
        Config {
            enabled: true,
            mode: "warning".to_string(),
            ignore_bots: true,
            bot_patterns: vec![],
            max_concerns: 1,
            concern_patterns: HashMap::new(),
            exclude_patterns: vec![],
        }
    }
}
Comment on lines +30 to +48


medium

Instead of a standalone default() method, it's more idiomatic in Rust to implement the Default trait for the Config struct. This allows for usage like Config::default() and in struct initializations with ..Default::default(), improving consistency with the Rust ecosystem.

impl Config {
    pub fn load<P: AsRef<Path>>(path: P) -> Result<Self> {
        let content = fs::read_to_string(path)?;
        let config: Config = serde_yaml::from_str(&content)?;
        Ok(config)
    }
}

impl Default for Config {
    fn default() -> Self {
        Self {
            enabled: true,
            mode: "warning".to_string(),
            ignore_bots: true,
            bot_patterns: vec![],
            max_concerns: 1,
            concern_patterns: HashMap::new(),
            exclude_patterns: vec![],
        }
    }
}
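For reference, a config file that `Config::load` would accept might look like the sketch below. This is inferred purely from the struct's fields and serde defaults; the actual `.github/atomicity-config.yml` in the repo may differ, and the regex and glob values shown are illustrative, not from the PR.

```yaml
# Hypothetical .github/atomicity-config.yml (field names from the Config struct)
enabled: true
mode: warning            # matches default_mode(); other values not constrained by the struct
ignore_bots: true
bot_patterns:            # illustrative regexes matched against commit author names
  - "dependabot"
  - "\\[bot\\]$"
max_concerns: 1
concern_patterns:        # concern name -> glob patterns
  source:
    - "src/**/*.rs"
  docs:
    - "*.md"
exclude_patterns:
  - "Cargo.lock"
```

Every field is `#[serde(default)]`-backed, so a partial file (or even an empty one) still deserializes.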

61 changes: 61 additions & 0 deletions tools/atomicity-checker/src/git.rs
@@ -0,0 +1,61 @@
use git2::{Repository, DiffOptions, Oid};
use anyhow::Result;
use std::path::Path;

pub struct GitContext {
    repo: Repository,
}

impl GitContext {
    pub fn open<P: AsRef<Path>>(path: P) -> Result<Self> {
        let repo = Repository::discover(path)?;
        Ok(GitContext { repo })
    }

    pub fn get_commits_in_range(&self, range: &str) -> Result<Vec<Oid>> {
        let mut revwalk = self.repo.revwalk()?;
        revwalk.push_range(range)?;

        let mut oids = Vec::new();
        for oid in revwalk {
            oids.push(oid?);
        }
        Ok(oids)
Comment on lines +19 to +23


medium

This loop for collecting OIDs can be written more idiomatically and concisely using collect() on the iterator. This is a common pattern in Rust for collecting items from an iterator that returns Results into a collection.

Suggested change
-        let mut oids = Vec::new();
-        for oid in revwalk {
-            oids.push(oid?);
-        }
-        Ok(oids)
+        // collect() gathers the iterator's Result<Oid, git2::Error> items;
+        // `?` then converts the error type for the anyhow::Result return.
+        Ok(revwalk.collect::<Result<Vec<_>, _>>()?)

(A bare `revwalk.collect()` would not compile here: the iterator yields `Result<Oid, git2::Error>`, while the function returns `anyhow::Result<Vec<Oid>>`, so the error type needs the `?` conversion.)

    }

    pub fn get_changed_files(&self, commit_oid: Oid) -> Result<Vec<String>> {
        let commit = self.repo.find_commit(commit_oid)?;
        let tree = commit.tree()?;

        let parent_tree = if commit.parent_count() > 0 {
            Some(commit.parent(0)?.tree()?)
        } else {
            None
        };

        let mut opts = DiffOptions::new();
        let diff = self.repo.diff_tree_to_tree(parent_tree.as_ref(), Some(&tree), Some(&mut opts))?;

        let mut files = Vec::new();
        diff.foreach(
            &mut |delta, _| {
                if let Some(path) = delta.new_file().path() {
                    files.push(path.to_string_lossy().into_owned());
                }
Comment on lines +40 to +44

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# First, let's see the file structure and find the exact lines
fd git.rs tools/atomicity-checker/src/

Repository: iberi22/synapse-protocol

Length of output: 102


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Read the relevant section with context
sed -n '35,50p' tools/atomicity-checker/src/git.rs

Repository: iberi22/synapse-protocol

Length of output: 552


🌐 Web query:

git2-rs DiffDelta deleted files new_file path old_file path behavior

💡 Result:

In git2-rs, DiffDelta is a thin wrapper over libgit2’s git_diff_delta, where:

  • old_file() is the “from” side, new_file() is the “to” side. (libgit2.org)
  • Either side can be absent: old may be absent for an added file; new may be absent for a deleted file. (libgit2.org)

Because of that, DiffFile::path() returns an Option<&Path> (it can be null). (docs.rs)

Practical behavior you’ll see

  • Deleted file delta (Delta::Deleted):

    • delta.old_file().path() → Some(old_path)
    • delta.new_file().path() → None (the "new" side doesn't exist)
  • Added file delta (Delta::Added):

    • old_file().path() → None
    • new_file().path() → Some(new_path)
  • Renamed / Copied: usually both are Some(...) and differ.

So for “the path of the file in this delta”, common practice is:

let path = delta.old_file().path().or(delta.new_file().path());

This pattern is used in real code specifically to handle added/deleted cases. (origen-sdk.org)

Note: if you generate a diff with libgit2’s “reverse” option, the meaning of old/new is flipped. (libgit2.org)



Handle deleted files in diff classification to prevent incomplete atomicity results.

The current code only checks delta.new_file().path(), which returns None for deleted files in git2-rs. This causes deleted files to be omitted from diff classification, leading to incomplete and potentially incorrect atomicity analysis.

For deleted files, delta.old_file().path() contains the path (the "new" side does not exist). Use a fallback pattern to capture both added and deleted files:

Suggested fix
         diff.foreach(
             &mut |delta, _| {
-                if let Some(path) = delta.new_file().path() {
+                let path = delta
+                    .new_file()
+                    .path()
+                    .or_else(|| delta.old_file().path());
+                if let Some(path) = path {
                     files.push(path.to_string_lossy().into_owned());
                 }
                 true
             },
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        diff.foreach(
-            &mut |delta, _| {
-                if let Some(path) = delta.new_file().path() {
-                    files.push(path.to_string_lossy().into_owned());
-                }
+        diff.foreach(
+            &mut |delta, _| {
+                let path = delta
+                    .new_file()
+                    .path()
+                    .or_else(|| delta.old_file().path());
+                if let Some(path) = path {
+                    files.push(path.to_string_lossy().into_owned());
+                }
                 true
             },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/atomicity-checker/src/git.rs` around lines 40 - 44, The diff iteration
currently only uses delta.new_file().path() inside the closure passed to
diff.foreach, so deleted files (where new_file().path() is None) are skipped;
update the closure to fall back to delta.old_file().path() when
new_file().path() is None (e.g., try new_file().path().or_else(||
old_file().path())) and push that path into files so both added and deleted
paths are captured for classification (refer to diff.foreach, the closure,
delta.new_file(), delta.old_file(), and the files vector).
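The review's point can be modeled without `git2` at all, using plain `Option`s. Here `delta_path` is a hypothetical stand-in for the `delta.new_file().path()` / `delta.old_file().path()` pair: prefer the "new" side, fall back to the "old" side, so deletions (new side absent) are still captured.

```rust
// Stdlib-only model of the delta-path fallback: each diff delta exposes an
// optional path on the old and new sides; Option::or picks new first, then old.
fn delta_path<'a>(new_side: Option<&'a str>, old_side: Option<&'a str>) -> Option<&'a str> {
    new_side.or(old_side)
}
```

Additions have only a new path, deletions only an old path, and renames have both (where the new path wins), matching the git2-rs behavior described above.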

                true
            },
            None,
            None,
            None,
        )?;

        Ok(files)
    }

    pub fn get_commit_details(&self, commit_oid: Oid) -> Result<(String, String)> {
        let commit = self.repo.find_commit(commit_oid)?;
        let message = commit.message().unwrap_or("").trim().to_string();
        let author = commit.author().name().unwrap_or("").to_string();
        Ok((message, author))
    }
}