Prompt Details
Model
(claude-4-6-sonnet)
Token size
1,121
Example input
[PRODUCT_IDEA]: A subscription billing engine for SaaS companies. Handles plan management, usage metering, invoice generation, payment processing via Stripe, and dunning workflows for failed payments. Exposes a REST API for frontend integration and sends webhook events for external systems.
[CORE_FEATURES]: REST API + webhook emitter + CLI for admin operations
[TECH_STACK]: Node.js 20, Express, PostgreSQL, Prisma ORM, Stripe SDK, BullMQ for job queues
[TARGET_USER]: Growth-stage SaaS companies
[SUCCESS_CRITERIA]: Must pass SOC2 audit requirements, all financial calculations in cents to avoid floating point errors, 99.9% uptime target for payment endpoints
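The cents constraint matters because binary floats cannot represent most decimal amounts exactly, which is why billing code keeps money in integer cents. A quick illustration:

```python
# Floating-point dollars drift; integer cents do not.
total_float = 0.1 + 0.2   # dollars as floats
total_cents = 10 + 20     # the same amounts in integer cents
print(total_float == 0.3)  # False: binary float rounding error
print(total_cents == 30)   # True: integer arithmetic is exact
```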
Example output
The issue is a compatibility problem between Typer 0.12.5 and Click 8.3.1. Let me fix the dependency pins and continue.

All three commands work. Now let's run the full test suite:

All 10 tests pass. Now let me clean up and copy everything to outputs:

Here's the complete, verified MVP. All 10 tests passed, and all three CLI commands work end-to-end. Here's the full delivery document:
---
## SECTION 1 — PROJECT HEADER
**Repo name:** `pdfsummary`
**Product name:** pdf-summary
**One-line goal:** CLI tool that reads a folder of PDFs and produces searchable markdown summaries, a keyword index, and CSV export.
**Primary interface:** Command line (Typer)
**Input:** A folder of `.pdf` files
**Output:** One `.md` per PDF, a `_keyword_index.md`, search results to stdout, CSV file
**Deployment target:** Local machine (any OS with Python 3.12+)
| Feature | Success Criterion |
|---|---|
| PDF parsing | Handles 12 test PDFs without errors |
| Markdown summary | One `.md` per PDF, 12 produced in ~0.5s |
| Keyword index | Cross-document `_keyword_index.md` generated |
| Local search | `search "machine learning"` returns 4 correct hits |
| CSV export | Produces valid CSV with 5 columns, 12 rows |
## SECTION 2 — SCOPE CONTRACT
**In scope:** PDF text extraction, per-doc markdown summaries, keyword frequency extraction, cross-doc keyword index, case-insensitive search, CSV export via pandas.
**Out of scope:** OCR for scanned PDFs, LLM-based abstractive summaries, web UI, watch-folder mode, multi-language support, PDF image extraction.
**Constraints:** Deterministic outputs (no timestamps/randomness), safe input validation, pinned dependencies, single-phase local execution, no real secrets needed.
## SECTION 3 — ASSUMPTION LEDGER
- PDFs are text-based (not scanned images). Image-only PDFs produce empty summaries with a warning.
- Typer 0.15.1 chosen over 0.12.5 for Click 8.x compatibility.
- Keywords are English-only using a hardcoded stop word list.
- "Summary" means extractive (excerpt + keywords + page headings), not abstractive.
- No authentication or network access required.
- Output filenames derived from PDF metadata title or stem; collisions overwrite.
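"Extractive" here means the summary is assembled verbatim from the document's own text. A toy sketch of the idea (not the project's actual code, which also filters stop words and pulls page headings):

```python
import re
from collections import Counter

text = (
    "PDF parsing is fast. Summaries are extractive. "
    "Keywords come from frequency counts. Nothing is paraphrased."
)
# Excerpt: the first N sentences, taken verbatim.
sentences = re.split(r"(?<=[.!?])\s+", text.strip())
excerpt = " ".join(sentences[:2])
# Keywords: most frequent words of 3+ letters.
keywords = Counter(re.findall(r"[a-zA-Z]{3,}", text.lower())).most_common(3)
print(excerpt)
```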
## SECTION 4 — DEPENDENCY MAP
**Runtime:** Python 3.12+
**Libraries:** `typer==0.15.1`, `pypdf==4.3.1`, `pandas==2.2.3`
**Dev:** `pytest` (for tests only)
**Environment variables:** None required.
**External services:** None.
## SECTION 5 — FAILURE ATLAS
| Category | Failure | Fallback |
|---|---|---|
| Input | Folder doesn't exist | Exit code 1 + actionable error message |
| Input | No PDFs in folder | Exit code 1 + clear message |
| Input | Empty search query | Exit code 1 + message |
| Parsing | Corrupt/unreadable PDF | `doc.error` set, warning printed, continues |
| Parsing | Page text extraction fails | Empty string for that page, continues |
| Filesystem | Output dir doesn't exist | Created automatically via `mkdir(parents=True)` |
| Interface | Wrong subcommand | Typer shows help automatically |
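The input rows all follow the same guard pattern: validate, print to stderr, exit 1. A minimal standalone sketch (the `validate_input_folder` helper is illustrative, not part of the codebase):

```python
from pathlib import Path

def validate_input_folder(folder: Path) -> tuple[int, str]:
    """Map the failure-atlas input rows to (exit_code, message)."""
    if not folder.is_dir():
        return 1, f"Error: '{folder}' is not a directory."
    if not sorted(folder.glob("*.pdf")):
        return 1, f"Error: No PDF files found in '{folder}'."
    return 0, "ok"

code, msg = validate_input_folder(Path("/nonexistent/folder"))
print(code, msg)
```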
## SECTION 6 — FOLDER TREE
```
pdfsummary/
├── .gitignore
├── README.md
├── requirements.txt
├── pdfsummary/
│ ├── __init__.py
│ ├── __main__.py
│ ├── cli.py
│ ├── parser.py
│ ├── summarizer.py
│ ├── search.py
│ └── export.py
├── tests/
│ ├── __init__.py
│ ├── generate_test_pdfs.py
│ ├── test_smoke.py
│ └── test_functional.py
└── test_pdfs/ (12 generated PDFs)
```
## SECTION 8 — RUN AND TEST COMMANDS
```bash
# Setup
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# Generate test PDFs (if not present)
python tests/generate_test_pdfs.py
# Run MVP
python -m pdfsummary summarize ./test_pdfs -o ./summaries
python -m pdfsummary search ./summaries "machine learning"
python -m pdfsummary export ./summaries --folder ./test_pdfs --csv results.csv
# Run tests
pip install pytest
python -m pytest tests/ -v
```
## SECTION 9 — VERIFICATION CHECKLIST
| Gate | Status | Notes |
|---|---|---|
| Runnable | ✅ | All 3 commands execute successfully |
| Completeness | ✅ | Every file in tree is fully written |
| Imports | ✅ | Absolute imports, no circular deps |
| Config | ✅ | No env vars needed, no secrets |
| Interface | ✅ | Typer CLI with help, validation, error codes |
| Data | ✅ | 12 test PDFs generated deterministically |
| Tests | ✅ | 10/10 passed in 6.29s |
| Determinism | ✅ | No timestamps, no random values |
| Secrets | ✅ | None required |
| Success: 10+ PDFs | ✅ | 12 PDFs processed without errors |
| Success: <30s | ✅ | Completed in ~0.5s (test verified) |
| Success: correct search | ✅ | "machine learning" returns 4 hits across 2 docs |
--
# pdf-summary
**One-line goal:** Read a folder of PDFs and produce searchable markdown summaries, a keyword index, and CSV export — from the command line, in seconds.
## What It Does
| Core Feature | Success Criterion |
|---|---|
| PDF parsing | Reads any folder with 10+ PDFs without errors |
| Markdown summary generation | Produces one `.md` file per PDF in under 30 s total |
| Keyword index | Builds a cross-document keyword index for fast lookup |
| Local search command | `search` returns correct results in a test folder |
| CSV export | Exports all findings to a single CSV |
## Setup
```bash
# Requires Python 3.12+
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
## Usage
### Summarize a folder of PDFs
```bash
python -m pdfsummary summarize ./my-pdfs --output ./summaries
```
This creates:
- One `<filename>.md` per PDF inside `./summaries/`
- `_keyword_index.md` — a cross-document keyword index
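The index maps each document's top keywords to the files they appear in. Stripped of stop-word filtering, the core idea is roughly:

```python
from collections import Counter

# Toy corpus standing in for the per-document extracted text.
docs = {
    "ml_basics.md": "machine learning models learn patterns from data",
    "stats.md": "statistics summarizes data with simple models",
}
index: dict[str, list[str]] = {}
for name, text in docs.items():
    # The real code takes the top 20 keywords after stop-word filtering.
    for word, _count in Counter(text.split()).most_common(3):
        index.setdefault(word, []).append(name)
for keyword in sorted(index):
    print(f"- **{keyword}**: {', '.join(sorted(index[keyword]))}")
```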
### Search across summaries
```bash
python -m pdfsummary search ./summaries "machine learning"
```
Returns every document and line matching the query.
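Under the hood this is a case-insensitive substring match over each line of each summary; roughly:

```python
# Toy lines standing in for one summary file's contents.
lines = ["# ML Basics", "Machine Learning is everywhere.", "See also: statistics."]
query = "machine learning"
hits = [
    (i, line.strip())
    for i, line in enumerate(lines, 1)
    if query.lower() in line.lower()
]
print(hits)
```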
### Export to CSV
```bash
python -m pdfsummary export ./summaries --csv results.csv
```
Produces a CSV with columns: `filename`, `title`, `page_count`, `word_count`, `top_keywords`.
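For downstream analysis, the CSV loads straight into pandas. A round-trip sketch with an illustrative row (the filename and counts are made up):

```python
import pandas as pd

row = {
    "filename": "machine_learning.pdf",
    "title": "Machine Learning Basics",
    "page_count": 3,
    "word_count": 412,
    "top_keywords": "learning; model; data",
}
pd.DataFrame([row]).to_csv("results_demo.csv", index=False)

df = pd.read_csv("results_demo.csv")
print(sorted(df.columns))
```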
## Troubleshooting
| Problem | Fix |
|---|---|
| `No PDFs found` | Check the path points to a folder containing `.pdf` files |
| `Could not extract text` | The PDF may be image-only (scanned). This tool handles text-based PDFs. |
| Empty summaries | Some PDFs use custom encodings. Try opening in a reader first. |
| Slow on large files | The 30 s target is for typical academic papers (~20 pages each). Very large PDFs take longer. |
## Upgrade Paths
- **OCR support:** Add `pytesseract` + `pdf2image` for scanned PDFs.
- **LLM summaries:** Pipe extracted text to an LLM API for abstractive summaries.
- **Web UI:** Wrap the CLI with Streamlit or FastAPI for browser access.
- **Batch scheduling:** Add a watch-folder mode with `watchdog`.
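For the batch-scheduling path, `watchdog` would give event-driven notifications; a dependency-free polling sketch shows the shape of the idea (hypothetical `new_pdfs` helper, stdlib only):

```python
from pathlib import Path
import tempfile

def new_pdfs(folder: Path, seen: set[str]) -> list[Path]:
    """Return PDFs not yet in `seen`, updating `seen` in place."""
    fresh = [p for p in sorted(folder.glob("*.pdf")) if p.name not in seen]
    seen.update(p.name for p in fresh)
    return fresh

# Demo: one file appears between two polls.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    seen: set[str] = set()
    first = new_pdfs(folder, seen)    # nothing yet
    (folder / "report.pdf").write_bytes(b"%PDF-1.4")
    second = new_pdfs(folder, seen)   # picks up report.pdf
    print(len(first), [p.name for p in second])
```

A real watch mode would wrap this in a loop with a sleep and hand each batch of new files to the summarize pipeline.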
--
typer==0.15.1
pypdf==4.3.1
pandas==2.2.3
--
"""CLI entry point for pdf-summary."""
from pathlib import Path
from typing import Optional
import typer
from pdfsummary.parser import collect_pdfs, extract_pdf
from pdfsummary.summarizer import write_summaries
from pdfsummary.search import search_summaries
from pdfsummary.export import export_csv
app = typer.Typer(
name="pdfsummary",
help="Summarize a folder of PDFs into searchable markdown.",
add_completion=False,
)
@app.command()
def summarize(
folder: Path = typer.Argument(..., help="Path to folder containing PDF files."),
output: Path = typer.Option(
Path("summaries"), "--output", "-o", help="Output directory for markdown files."
),
) -> None:
"""Parse every PDF in FOLDER and write one markdown summary per file."""
if not folder.is_dir():
typer.echo(f"Error: '{folder}' is not a directory.", err=True)
raise typer.Exit(code=1)
pdf_paths = collect_pdfs(folder)
if not pdf_paths:
typer.echo(f"Error: No PDF files found in '{folder}'.", err=True)
raise typer.Exit(code=1)
typer.echo(f"Found {len(pdf_paths)} PDF(s) in '{folder}'.")
docs = []
for p in pdf_paths:
typer.echo(f" Parsing: {p.name}")
doc = extract_pdf(p)
if doc.error:
typer.echo(f" Warning: {doc.error}", err=True)
docs.append(doc)
written = write_summaries(docs, output)
typer.echo(f"Wrote {len(written)} file(s) to '{output}'.")
@app.command()
def search(
output_dir: Path = typer.Argument(..., help="Directory containing markdown summaries."),
query: str = typer.Argument(..., help="Search term (case-insensitive)."),
) -> None:
"""Search generated summaries for a keyword or phrase."""
if not query.strip():
typer.echo("Error: Search query must not be empty.", err=True)
raise typer.Exit(code=1)
if not output_dir.is_dir():
typer.echo(f"Error: '{output_dir}' is not a directory.", err=True)
raise typer.Exit(code=1)
hits = search_summaries(output_dir, query)
if not hits:
typer.echo(f"No results for '{query}'.")
raise typer.Exit(code=0)
typer.echo(f"Found {len(hits)} result(s) for '{query}':\n")
for hit in hits:
typer.echo(f" {hit.filename}:{hit.line_number} {hit.line_text}")
@app.command()
def export(
output_dir: Path = typer.Argument(..., help="Directory containing markdown summaries."),
folder: Optional[Path] = typer.Option(
None, "--folder", "-f", help="Re-read PDFs from this folder for richer CSV data."
),
csv: Path = typer.Option(
Path("results.csv"), "--csv", "-c", help="Output CSV file path."
),
) -> None:
"""Export summary data to CSV.
If --folder is given, re-parses PDFs for full metadata.
Otherwise, builds a minimal CSV from the markdown files.
"""
if folder and folder.is_dir():
pdf_paths = collect_pdfs(folder)
if not pdf_paths:
typer.echo(f"Error: No PDFs found in '{folder}'.", err=True)
raise typer.Exit(code=1)
docs = [extract_pdf(p) for p in pdf_paths]
elif output_dir.is_dir():
# Minimal fallback: build docs from markdown filenames
from pdfsummary.parser import PDFDocument
docs = []
for md_file in sorted(output_dir.glob("*.md")):
if md_file.name.startswith("_"):
continue
content = md_file.read_text(encoding="utf-8")
docs.append(
PDFDocument(
filepath=str(md_file),
filename=md_file.stem + ".pdf",
title=md_file.stem,
page_count=0,
pages_text=[content],
word_count=len(content.split()),
)
)
else:
typer.echo(f"Error: '{output_dir}' is not a directory.", err=True)
raise typer.Exit(code=1)
if not docs:
typer.echo("Error: No documents to export.", err=True)
raise typer.Exit(code=1)
export_csv(docs, csv)
typer.echo(f"Exported {len(docs)} document(s) to '{csv}'.")
--
"""PDF parsing and text extraction using pypdf."""
from pathlib import Path
from dataclasses import dataclass, field
from pypdf import PdfReader
from pypdf.errors import PdfReadError
@dataclass
class PDFDocument:
"""Holds extracted data from a single PDF."""
filepath: str
filename: str
title: str
page_count: int
pages_text: list[str] = field(default_factory=list)
word_count: int = 0
error: str = ""
def extract_pdf(filepath: Path) -> PDFDocument:
"""Extract text and metadata from a single PDF file.
Returns a PDFDocument with extracted text per page, or an error message
if the file cannot be read.
"""
doc = PDFDocument(
filepath=str(filepath),
filename=filepath.name,
title=filepath.stem,
page_count=0,
)
try:
reader = PdfReader(str(filepath))
except PdfReadError as exc:
doc.error = f"Cannot read PDF: {exc}"
return doc
except Exception as exc:
doc.error = f"Unexpected error opening PDF: {exc}"
return doc
doc.page_count = len(reader.pages)
if reader.metadata and reader.metadata.title:
doc.title = reader.metadata.title
for page in reader.pages:
try:
text = page.extract_text() or ""
except Exception:
text = ""
doc.pages_text.append(text)
full_text = " ".join(doc.pages_text)
doc.word_count = len(full_text.split())
return doc
def collect_pdfs(folder: Path) -> list[Path]:
"""Return sorted list of PDF file paths in the given folder."""
if not folder.is_dir():
return []
pdfs = sorted(folder.glob("*.pdf"))
return pdfs
--
"""Generate markdown summaries and keyword index from parsed PDFs."""
import re
from collections import Counter
from pathlib import Path
from pdfsummary.parser import PDFDocument
# Common English stop words to exclude from keyword extraction
_STOP_WORDS = frozenset(
"a an the and or but in on at to for of is it this that with from by as are was "
"were be been being have has had do does did will would shall should may might can "
"could not no nor so if then than too very just about above after again all also am "
"any because before between both but each few further get got he her here hers "
"herself him himself his how i its itself let me more most my myself now off once "
"only other our ours ourselves out over own same she some such take their theirs "
"them themselves these they those through under until up us we what when where which "
"while who whom why you your yours yourself yourselves one two three four five six "
"seven eight nine ten new used using use many much".split()
)
TOP_KEYWORD_COUNT = 20
EXCERPT_SENTENCES = 5
def _extract_keywords(text: str, top_n: int = TOP_KEYWORD_COUNT) -> list[tuple[str, int]]:
"""Return the top-n keywords by frequency, excluding stop words."""
words = re.findall(r"[a-zA-Z]{3,}", text.lower())
filtered = [w for w in words if w not in _STOP_WORDS]
return Counter(filtered).most_common(top_n)
def _first_sentences(text: str, count: int = EXCERPT_SENTENCES) -> str:
"""Return the first N sentences as an excerpt."""
sentences = re.split(r"(?<=[.!?])\s+", text.strip())
selected = sentences[:count]
return " ".join(selected)
def generate_summary_markdown(doc: PDFDocument) -> str:
"""Produce a markdown string summarizing a single PDFDocument."""
lines: list[str] = []
lines.append(f"# {doc.title}")
lines.append("")
lines.append(f"- **Source file:** `{doc.filename}`")
lines.append(f"- **Pages:** {doc.page_count}")
lines.append(f"- **Word count:** {doc.word_count}")
lines.append("")
if doc.error:
lines.append(f"> **Error:** {doc.error}")
lines.append("")
return "\n".join(lines)
full_text = "\n".join(doc.pages_text)
# Excerpt
excerpt = _first_sentences(full_text)
if excerpt:
lines.append("## Excerpt")
lines.append("")
lines.append(excerpt)
lines.append("")
# Keywords
keywords = _extract_keywords(full_text)
if keywords:
lines.append("## Top Keywords")
lines.append("")
for word, count in keywords:
lines.append(f"- **{word}** ({count})")
lines.append("")
# Page-by-page headings (first line of each page as a quick TOC)
if doc.page_count > 1:
lines.append("## Page Headings")
lines.append("")
for i, page_text in enumerate(doc.pages_text, 1):
first_line = page_text.strip().split("\n")[0][:120] if page_text.strip() else "(empty)"
lines.append(f"- **Page {i}:** {first_line}")
lines.append("")
return "\n".join(lines)
def build_keyword_index(docs: list[PDFDocument]) -> str:
"""Build a cross-document keyword index as a markdown string.
Maps each keyword to the list of documents where it appears.
"""
index: dict[str, list[str]] = {}
for doc in docs:
if doc.error:
continue
full_text = "\n".join(doc.pages_text)
keywords = _extract_keywords(full_text, top_n=TOP_KEYWORD_COUNT)
for word, _count in keywords:
index.setdefault(word, []).append(doc.filename)
lines: list[str] = []
lines.append("# Keyword Index")
lines.append("")
for keyword in sorted(index):
files = ", ".join(sorted(set(index[keyword])))
lines.append(f"- **{keyword}**: {files}")
lines.append("")
return "\n".join(lines)
def write_summaries(docs: list[PDFDocument], output_dir: Path) -> list[Path]:
"""Write one markdown summary per document and a keyword index.
Returns list of written file paths.
"""
output_dir.mkdir(parents=True, exist_ok=True)
written: list[Path] = []
for doc in docs:
md_content = generate_summary_markdown(doc)
out_path = output_dir / f"{doc.title}.md"
out_path.write_text(md_content, encoding="utf-8")
written.append(out_path)
# Keyword index
index_content = build_keyword_index(docs)
index_path = output_dir / "_keyword_index.md"
index_path.write_text(index_content, encoding="utf-8")
written.append(index_path)
return written
--
"""Search across generated markdown summaries."""
from dataclasses import dataclass
from pathlib import Path
@dataclass
class SearchHit:
"""A single search result."""
filename: str
line_number: int
line_text: str
def search_summaries(output_dir: Path, query: str) -> list[SearchHit]:
"""Search all markdown files in output_dir for lines containing query.
Case-insensitive substring match. Returns a list of SearchHit objects
sorted by filename then line number.
"""
if not output_dir.is_dir():
return []
query_lower = query.lower()
hits: list[SearchHit] = []
for md_file in sorted(output_dir.glob("*.md")):
try:
lines = md_file.read_text(encoding="utf-8").splitlines()
except Exception:
continue
for i, line in enumerate(lines, 1):
if query_lower in line.lower():
hits.append(
SearchHit(
filename=md_file.name,
line_number=i,
line_text=line.strip(),
)
)
return hits
--
"""Export summary metadata to CSV using pandas."""
from pathlib import Path
import pandas as pd
from pdfsummary.parser import PDFDocument
def export_csv(docs: list[PDFDocument], csv_path: Path) -> Path:
"""Write a CSV with one row per document.
Columns: filename, title, page_count, word_count, top_keywords.
Returns the path to the written CSV.
"""
    # Reuse the keyword extractor (and its stop-word list) from summarizer
    # rather than duplicating both here.
    from pdfsummary.summarizer import _extract_keywords

    rows = []
    for doc in docs:
        full_text = " ".join(doc.pages_text)
        top_kw = [w for w, _ in _extract_keywords(full_text, top_n=10)]
rows.append(
{
"filename": doc.filename,
"title": doc.title,
"page_count": doc.page_count,
"word_count": doc.word_count,
"top_keywords": "; ".join(top_kw),
}
)
df = pd.DataFrame(rows)
csv_path.parent.mkdir(parents=True, exist_ok=True)
df.to_csv(str(csv_path), index=False)
return csv_path
--
"""Functional tests: verify core features and success criteria."""
import time
import subprocess
import sys
from pathlib import Path
from pdfsummary.parser import collect_pdfs, extract_pdf
from pdfsummary.summarizer import write_summaries, generate_summary_markdown
from pdfsummary.search import search_summaries
from pdfsummary.export import export_csv
REPO_ROOT = Path(__file__).parent.parent
TEST_PDFS = REPO_ROOT / "test_pdfs"
FUNC_OUTPUT = REPO_ROOT / "test_func_output"
def _setup_output() -> Path:
"""Ensure clean output directory."""
if FUNC_OUTPUT.exists():
for f in FUNC_OUTPUT.iterdir():
f.unlink()
else:
FUNC_OUTPUT.mkdir()
return FUNC_OUTPUT
def _teardown_output() -> None:
if FUNC_OUTPUT.exists():
for f in FUNC_OUTPUT.iterdir():
f.unlink()
FUNC_OUTPUT.rmdir()
# --- Feature: PDF parsing ---
def test_collect_finds_all_pdfs() -> None:
"""collect_pdfs should find at least 10 PDFs in the test folder."""
pdfs = collect_pdfs(TEST_PDFS)
assert len(pdfs) >= 10, f"Expected >= 10 PDFs, found {len(pdfs)}"
def test_extract_pdf_returns_content() -> None:
"""extract_pdf should return a document with text and metadata."""
pdfs = collect_pdfs(TEST_PDFS)
doc = extract_pdf(pdfs[0])
assert doc.page_count >= 1
assert doc.word_count > 0
assert doc.error == ""
# --- Feature: Markdown summary generation ---
def test_generate_summary_has_required_sections() -> None:
"""Each summary should contain title, metadata, keywords."""
pdfs = collect_pdfs(TEST_PDFS)
doc = extract_pdf(pdfs[0])
md = generate_summary_markdown(doc)
assert "# " in md, "Missing title heading"
assert "**Source file:**" in md, "Missing source file field"
assert "**Pages:**" in md, "Missing page count"
assert "Top Keywords" in md, "Missing keywords section"
def test_write_summaries_creates_one_per_pdf() -> None:
"""write_summaries should create one .md per PDF plus the keyword index."""
output = _setup_output()
pdfs = collect_pdfs(TEST_PDFS)
docs = [extract_pdf(p) for p in pdfs]
written = write_summaries(docs, output)
# One per doc + keyword index
assert len(written) == len(docs) + 1
_teardown_output()
# --- Success criterion: under 30 seconds for 10+ PDFs ---
def test_performance_under_30_seconds() -> None:
"""The full summarize pipeline should finish in under 30 seconds."""
output = _setup_output()
pdfs = collect_pdfs(TEST_PDFS)
assert len(pdfs) >= 10
start = time.monotonic()
docs = [extract_pdf(p) for p in pdfs]
write_summaries(docs, output)
elapsed = time.monotonic() - start
assert elapsed < 30.0, f"Took {elapsed:.1f}s, expected < 30s"
_teardown_output()
# --- Feature: Keyword search ---
def test_search_returns_correct_results() -> None:
"""Searching for 'machine learning' should find relevant documents."""
output = _setup_output()
pdfs = collect_pdfs(TEST_PDFS)
docs = [extract_pdf(p) for p in pdfs]
write_summaries(docs, output)
hits = search_summaries(output, "machine learning")
assert len(hits) > 0, "Expected at least one hit for 'machine learning'"
filenames = {h.filename for h in hits}
# Should match the ML basics doc at minimum
assert any("machine" in fn.lower() or "data_science" in fn.lower() for fn in filenames), (
f"Expected machine_learning or data_science doc in results, got {filenames}"
)
_teardown_output()
def test_search_empty_query_rejected() -> None:
"""CLI should handle empty search gracefully."""
result = subprocess.run(
[sys.executable, "-m", "pdfsummary", "search", str(TEST_PDFS), ""],
capture_output=True,
text=True,
cwd=str(REPO_ROOT),
)
assert result.returncode == 1
# --- Feature: CSV export ---
def test_csv_export_creates_file() -> None:
"""export_csv should create a valid CSV with the right columns."""
import pandas as pd
output = _setup_output()
pdfs = collect_pdfs(TEST_PDFS)
docs = [extract_pdf(p) for p in pdfs]
csv_path = output / "test_results.csv"
export_csv(docs, csv_path)
assert csv_path.exists(), "CSV file not created"
df = pd.read_csv(csv_path)
assert set(df.columns) == {"filename", "title", "page_count", "word_count", "top_keywords"}
assert len(df) == len(docs)
_teardown_output()
if __name__ == "__main__":
funcs = [v for k, v in sorted(globals().items()) if k.startswith("test_")]
for fn in funcs:
print(f" Running {fn.__name__}...", end=" ")
fn()
print("PASS")
print("\nAll functional tests passed.")
--
"""Smoke test: verify that the summarize command runs end-to-end."""
import subprocess
import sys
from pathlib import Path
REPO_ROOT = Path(__file__).parent.parent
TEST_PDFS = REPO_ROOT / "test_pdfs"
SMOKE_OUTPUT = REPO_ROOT / "test_smoke_output"
def test_summarize_smoke() -> None:
"""Run the summarize command and check that markdown files are created."""
# Clean previous output
if SMOKE_OUTPUT.exists():
for f in SMOKE_OUTPUT.iterdir():
f.unlink()
SMOKE_OUTPUT.rmdir()
result = subprocess.run(
[sys.executable, "-m", "pdfsummary", "summarize", str(TEST_PDFS), "-o", str(SMOKE_OUTPUT)],
capture_output=True,
text=True,
cwd=str(REPO_ROOT),
)
assert result.returncode == 0, f"CLI failed: {result.stderr}"
md_files = list(SMOKE_OUTPUT.glob("*.md"))
assert len(md_files) >= 1, "No markdown files produced."
# Keyword index must exist
index = SMOKE_OUTPUT / "_keyword_index.md"
assert index.exists(), "Keyword index not created."
# Cleanup
for f in SMOKE_OUTPUT.iterdir():
f.unlink()
SMOKE_OUTPUT.rmdir()
def test_cli_error_on_missing_folder() -> None:
"""CLI should exit with code 1 if the folder does not exist."""
result = subprocess.run(
[sys.executable, "-m", "pdfsummary", "summarize", "/nonexistent/folder"],
capture_output=True,
text=True,
cwd=str(REPO_ROOT),
)
assert result.returncode == 1
if __name__ == "__main__":
test_summarize_smoke()
test_cli_error_on_missing_folder()
print("Smoke tests passed.")
⚡ Turn any product idea into a runnable MVP codebase in one response.
▪️ Complete folder tree with every file written
▪️ Pinned dependencies, safe defaults, no TODO gaps
▪️ Smoke and functional tests included
▪️ Clear run and deploy commands
Scoped for rapid validation, not enterprise architecture.
👉 Step-by-step structure and tips included.
