Code review · testing · debug pass #3

Closed

opened 2026-05-19 04:42:46 +00:00 by foravo_admin · 0 comments

foravo_admin commented

2026-05-19 04:42:46 +00:00

Owner

Imported from GitHub issue M00C1FER/mesh-review#10.

Source: https://github.com/M00C1FER/mesh-review/issues/10
Original author: @M00C1FER
Original state: closed

Code review · testing · debug pass

mesh-review is a vendor-neutral multi-LLM PR review and summary toolkit. Its two
subcommands (review, summary) share one CLI registry. In the demo, the Sigma
falsification gate dropped false positives.

Scope of this pass

Correctness review
- The Sigma falsification gate (sigma_gate(falsifier=...)): trace the path from
  consensus cluster → falsifier call → decision-to-drop. Edge cases:
  - All falsifiers time out: does the gate default to keep-finding or drop-finding?
  - A falsifier returns malformed JSON: does the gate fail gracefully?
  - confidence > 1.0 or < 0.0: is the value bounds-checked?
- Consensus clustering on (file, severity, line ±2): what happens when severity
  differs between CLIs for the same line? Verify cluster merge semantics.
- Summary aggregation: when one CLI fails, do the remaining CLIs still produce a
  usable summary, or does the whole run abort?
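
The three gate edge cases above come down to one policy question: what should the gate do when a falsifier gives it nothing usable? Below is a minimal sketch of one fail-safe answer, not the actual `sigma_gate` internals. It assumes (hypothetically) that each falsifier is a callable returning a JSON string with `falsified` and `confidence` fields, and that a finding is dropped only on a well-formed, in-range verdict. `Verdict`, `parse_verdict`, `should_drop`, and the 0.8 threshold are illustrative names and values.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Verdict:
    """Hypothetical shape of one falsifier's answer."""
    falsified: bool
    confidence: float


def parse_verdict(raw: str) -> Optional[Verdict]:
    """Return a Verdict, or None when the falsifier output is unusable."""
    try:
        data = json.loads(raw)
        falsified = data["falsified"]
        confidence = float(data["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return None  # malformed JSON, wrong top-level type, or missing field
    if not isinstance(falsified, bool):
        return None
    # Out-of-range (and NaN) confidence is treated as malformed, not clamped.
    if not (0.0 <= confidence <= 1.0):
        return None
    return Verdict(falsified, confidence)


def should_drop(
    cluster: object,
    falsifiers: Iterable[Callable[[object], str]],
    threshold: float = 0.8,  # assumed cut-off, purely illustrative
) -> bool:
    """Drop a finding only when some falsifier confidently refutes it."""
    for falsifier in falsifiers:
        try:
            raw = falsifier(cluster)
        except TimeoutError:
            continue  # a timed-out falsifier casts no vote
        verdict = parse_verdict(raw)
        if verdict is not None and verdict.falsified and verdict.confidence >= threshold:
            return True
    return False  # no usable refutation: keep the finding
```

Under this policy, timeouts and malformed or out-of-range verdicts all resolve to keep-finding. Whether the real gate should do the same, or surface an error instead, is what this pass needs to confirm.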

Test coverage
- Run pytest --cov and identify uncovered branches.
- Add tests for: malformed CLI output (truncated JSON, invalid UTF-8), CLI
  binary-not-found errors, and an empty registry CLI list.
- Add a real-world golden test using examples/broken-repo/ if it exists: assert
  that the gate eliminates a known false positive without dropping a known true
  positive.
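
The malformed-output cases could be written as one table-driven test. The sketch below assumes a hypothetical `parse_cli_output` helper that turns a reviewer CLI's raw stdout into a list of findings and raises a single hypothetical `CliOutputError` for anything unusable; neither name comes from the mesh-review codebase, and the real parser may be shaped differently.

```python
import json


class CliOutputError(ValueError):
    """Hypothetical error raised when a reviewer CLI's stdout is unusable."""


def parse_cli_output(raw: bytes) -> list:
    """Decode and parse raw stdout, mapping every failure to CliOutputError."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise CliOutputError(f"invalid UTF-8: {exc}") from exc
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise CliOutputError(f"malformed JSON: {exc}") from exc
    if not isinstance(data, list):
        raise CliOutputError("expected a JSON array of findings")
    return data


# Each entry is one way a CLI can hand back garbage.
BAD_OUTPUTS = {
    "truncated JSON": b'[{"file": "a.py", "line": 3',
    "invalid UTF-8": b'[{"file": "\xff\xfe"}]',
    "empty output": b"",
    "wrong top-level type": b'{"file": "a.py"}',
}


def test_bad_outputs_raise_cli_output_error():
    for label, raw in BAD_OUTPUTS.items():
        try:
            parse_cli_output(raw)
        except CliOutputError:
            continue
        raise AssertionError(f"{label}: expected CliOutputError")


def test_well_formed_output_is_returned():
    raw = b'[{"file": "a.py", "line": 3, "severity": "high"}]'
    assert parse_cli_output(raw) == [{"file": "a.py", "line": 3, "severity": "high"}]
```

In the actual suite, `pytest.mark.parametrize` over `BAD_OUTPUTS` together with `pytest.raises(CliOutputError)` would be the more idiomatic spelling; plain asserts are used here only to keep the sketch dependency-free.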

Debug edge cases
- The --list-clis flag: does it work when one CLI in the registry is missing?
- GitHub Action invocation path: verify the action file in .github/workflows/ is
  well-formed and runs end-to-end on a sample PR.
- Concurrent invocation: two mesh-review instances against the same PR — do they
  race or deduplicate?
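
For the `--list-clis` question, one behavior worth checking for is that a missing binary is reported rather than aborting the whole listing. The sketch below assumes a hypothetical registry that maps a display name to an executable name and resolves each one with `shutil.which`. `list_clis` and `format_listing` are illustrative helpers, not functions from the codebase, and the names in the usage example are deliberately vendor-neutral placeholders.

```python
import shutil
from typing import Callable, List, Mapping, Optional, Tuple


def list_clis(
    registry: Mapping[str, str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[Tuple[str, Optional[str]]]:
    """Return (name, resolved path or None) for every registered CLI, sorted by name."""
    return [(name, which(exe)) for name, exe in sorted(registry.items())]


def format_listing(rows: List[Tuple[str, Optional[str]]]) -> str:
    """Render the listing; a missing binary is reported, not raised."""
    if not rows:
        return "no CLIs registered"
    return "\n".join(
        f"{name}: {path if path else 'missing (not on PATH)'}" for name, path in rows
    )


if __name__ == "__main__":
    # Placeholder registry; real entries would come from the project's registry.
    print(format_listing(list_clis({"reviewer-a": "reviewer-a-cli"})))
```

Injecting `which` keeps the function testable without touching the real `PATH`, which also covers the binary-not-found and empty-registry cases from the test list.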

Documentation polish (low priority)
- The README's status note says "v0.1 default falsifier is no-op stub" — confirm
  this is still accurate; if a real default has landed, update.

Deliverable

One PR per logical change, using Conventional Commits format. Bundle test additions
that do not change behavior into a single PR; each correctness fix gets its own PR.

Style + scope constraints

Python 3.10/3.11/3.12 (matrix in CI). No 3.9 fallbacks.
Don't change the public sigma_gate / build_consensus signatures without flagging.
Keep the registry vendor-neutral: don't hard-code Anthropic/OpenAI/Ollama names in
core code paths. Examples are fine in docs.

Project state: solo author, pre-1.0, MIT.


foravo_admin closed this issue

2026-05-19 04:42:46 +00:00

foravo_admin referenced this issue

2026-05-19 04:42:46 +00:00

[AUDIT] Automated code-review summary — mesh-review #4