Initial commit: skill files, docs site, README

- SKILL.md and pipeline.py from ~/.claude/skills/screenshot-rename/
- docs/index.html — archival/typewriter aesthetic homepage with hero
  monument, problem, 4-stage pipeline, before/after split, run-log
  receipt, ten gotchas, four use cases, install snippets
- MIT license

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Anthony Cardinale
2026-05-04 09:23:02 -04:00
commit 63edc33fc4
6 changed files with 1547 additions and 0 deletions
+3
@@ -0,0 +1,3 @@
__pycache__/
*.pyc
.DS_Store
+21
@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 Roberto Cardinale
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+87
@@ -0,0 +1,87 @@
# screenshot-rename
> A Claude Code skill that turns a folder of timestamp-named screenshots into a folder of human-readable, searchable filenames — using parallel Haiku vision agents.
```
CleanShot 2026-04-15 at 09.14.07.png
CleanShot - Shamel Studio Affiliate Referral Code Modal - 2026-04-15 at 09.14.07.png
```
Built for [CleanShot](https://cleanshot.com)-style screenshot folders, but works on any directory of `.png` / `.gif` / `.mp4` / `.pdf` files named only by timestamp.
## Highlights
- **Parallel** — describes ~200 files in 3 minutes using 10 concurrent Haiku subagents.
- **Safe** — pre-builds the full rename plan in memory, validates uniqueness and target collisions, then renames atomically with file-count audit. Designed after losing 4 files to a `mv` overwrite during prototyping.
- **Handles video/PDF** — extracts the first frame so vision agents can describe them.
- **Resizes for the vision tool** — Retina screenshots exceed Read's image cap; pipeline downsamples to 1568px max.
## Installation
This is a Claude Code skill. Drop the `screenshot-rename/` directory into `~/.claude/skills/`:
```bash
git clone https://gitea.tojo.team/cardinale/screenshot-rename.git ~/.claude/skills/screenshot-rename
```
In your next Claude Code session, ask:
> rename all the cleanshot files in `~/Documents/Screenshots/` based on their content
The skill will activate automatically.
## Usage from the command line
You can also drive the pipeline directly:
```bash
# 1. Prep — extract frames, resize, build batches
python3 pipeline.py prep --src "/path/to/folder" --batch-size 19
# 2. (In a Claude Code session, dispatch one Haiku subagent per
# /tmp/screenshot-rename/full-batch-NN file using the prompt template
# in SKILL.md.)
# 3. Plan — aggregate descriptions, validate, build rename map
python3 pipeline.py plan --src "/path/to/folder"
# 4. Execute — apply the plan, audit file count
python3 pipeline.py execute --src "/path/to/folder"
```
The dispatch step (#2) currently requires a Claude Code session. See [Roadmap](#roadmap).
## Documentation
- **Homepage with worked examples:** [docs/index.html](docs/index.html)
- **Full skill spec:** [SKILL.md](SKILL.md)
- **Pipeline source:** [pipeline.py](pipeline.py)
## The gotchas this skill encodes
This skill exists because every one of these caused real damage during development:
1. The `Read` tool in Claude Code has an image-size cap. Resize first.
2. Vision can't read `.mp4` or multi-page `.pdf` directly. Extract a frame.
3. **Bash regex `[[ =~ ]]` does NOT populate `BASH_REMATCH` in zsh.** Targets become empty. Loops collide on the same filename. Files vanish. Use Python for any filename mutation.
4. `mv` silently overwrites. Use `mv -n` or `os.rename` with explicit pre-existence check.
5. Pre-build the entire rename plan in memory and validate uniqueness before any `mv`.
6. Audit `len(os.listdir(DEST))` before and after. An equal count is the evidence that nothing was overwritten.
7. iCloud-synced files in Time Machine local snapshots are file-provider stubs, not bytes. External backups (Backblaze, Time Machine to physical disk) are the real recovery source.
8. `Bash run_in_background` may exit early on `while read` loops. Run renames foreground via Python.
9. Haiku occasionally returns the resized `.jpg` filename instead of the original `.png`. Validator must try alt extensions.
10. Always preserve the original `.mp4` / `.pdf` extension — describe via the extracted frame, rename the source.
The full discussion is in [SKILL.md](SKILL.md#the-critical-gotchas-every-one-of-these-caused-real-pain).
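
Gotchas 4 and 6 can be sketched in a few lines of Python. This is a minimal illustration, not the code in `pipeline.py`: the function names `safe_rename` and `rename_with_audit` are hypothetical, and the check-then-rename sequence assumes nothing else is writing to the folder at the same time.

```python
import os


def safe_rename(src: str, dst: str) -> None:
    """Rename src to dst, refusing to overwrite an existing file."""
    if not os.path.exists(src):
        raise FileNotFoundError(f"source missing: {src}")
    if os.path.exists(dst):
        # A bare mv (or os.rename on POSIX) would replace dst silently.
        raise FileExistsError(f"target already exists: {dst}")
    os.rename(src, dst)


def rename_with_audit(folder: str, plan: list[tuple[str, str]]) -> int:
    """Apply (old_name, new_name) pairs inside folder and audit the file count."""
    before = len(os.listdir(folder))
    for old_name, new_name in plan:
        safe_rename(os.path.join(folder, old_name), os.path.join(folder, new_name))
    after = len(os.listdir(folder))
    if after != before:
        # A rename never changes the count unless something was overwritten.
        raise RuntimeError(f"file count changed: {before} -> {after}")
    return after
```

A rename that would land on an existing file raises before anything is touched, so a buggy target name fails loudly instead of destroying data.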
## Roadmap
- Direct Anthropic API mode (no Claude Code session required) — needs `ANTHROPIC_API_KEY`
- Custom prompt templates per-folder
- Optional preservation of dots in technical strings (`v2.1` currently becomes `V21`)
- Dry-run flag on `execute`
## License
MIT — see [LICENSE](LICENSE).
+160
@@ -0,0 +1,160 @@
---
name: screenshot-rename
description: Use when renaming a folder of screenshots, images, or short clips with AI-generated descriptive names — particularly CleanShot exports or any directory of images named only by timestamp. Triggers on requests like "rename these screenshots based on their content", "describe each of these images and rename it", or batch rename of files by visual content.
---
# Screenshot Rename
## Overview
Rename a directory of timestamp-named images (PNG / GIF / MP4 / PDF) to include AI-generated content descriptions, dispatched as parallel Haiku subagents from this Claude Code session. Each rename has the form:
```
<original prefix> - <Title Cased Description> - <original timestamp>.<ext>
```
The pipeline is **prep → batch → describe (parallel agents) → validate plan → execute renames** with hard data-loss guards at every stage.
**Core principle:** *Plan in memory, validate exhaustively, then mutate the filesystem in a single pass with `os.rename` and pre-existence checks.* Never let `mv` overwrite — that's how you lose files.
## When to Use
- Renaming CleanShot / screenshot folders by content
- Any image batch where the source filenames are timestamps and the user wants them human-scannable
- ≥ ~10 files (otherwise just rename them inline)
- Files include PNG/GIF and optionally MP4 or PDF (pipeline handles all four)
**Don't use for:**
- Code or text files — vision isn't needed
- Files where the name pattern is already meaningful
- Single-file rename (just do it directly)
## Workflow
```
1. Prep
├─ Extract first frame from each .mp4 (ffmpeg) and .pdf (sips) to /tmp/frames/<base>.jpg
├─ Resize every source image to max 1568px on long edge → /tmp/small/<base>.jpg
└─ Build manifest TSV: <small_image_path>\t<original_filename>
2. Batch
└─ Split manifest into N batches of ≤ 20 lines each (file: full-batch-NN)
3. Describe (parallel)
└─ Dispatch N Haiku subagents (model: "haiku") in a single message
Each agent: reads its batch manifest, uses Read on each image_path,
writes desc-full-NN.tsv with: <original_filename>\t<6-8 word description>
4. Plan (Python)
├─ Aggregate all desc-*.tsv into desc-all.tsv
├─ Validate every line: 6+ words, alnum+space only, source exists, target doesn't,
│ no plan-internal collisions
├─ Truncate descriptions to 8 words max, title-case
└─ Write plan-full.tsv: <original>\t<new_name>
5. Execute (Python, NEVER bash)
├─ Read plan, for each line: pre-check src exists & dst doesn't, then os.rename
├─ Audit before/after file count — must be equal
└─ Log failures, report ok/fail counts
```
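
Step 2 of the workflow above (splitting the manifest into batches) might look like the following in Python. The helper names are hypothetical, and the 1-based, zero-padded numbering is an assumption chosen to match the `full-batch-NN` file names used elsewhere in this document.

```python
def split_into_batches(lines: list[str], batch_size: int = 20) -> list[list[str]]:
    """Split manifest lines into consecutive batches of at most batch_size lines."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [lines[i:i + batch_size] for i in range(0, len(lines), batch_size)]


def batch_name(index: int) -> str:
    """File name for the batch with the given 1-based index (assumed numbering)."""
    return f"full-batch-{index:02d}"
```

With 45 manifest lines and the default size, this yields batches of 20, 20, and 5 lines, in the original order.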
## The Critical Gotchas (every one of these caused real pain)
1. **Read tool has an image-size cap.** Original Retina screenshots can exceed it. **Always downscale** to ≤ 1568px before handing to a subagent. Use `sips -Z 1568 -s format jpeg`.
2. **Vision API can't read .mp4 or multi-page .pdf directly.** Extract the first frame to a JPEG first (`ffmpeg -ss 1 -i in.mp4 -frames:v 1 out.jpg`, `sips -s format jpeg in.pdf --out out.jpg`).
3. **Bash regex with `[[ =~ ]]` + `BASH_REMATCH` does NOT work in zsh.** zsh uses `$match[1]` etc. instead. Pattern silently fails, target name becomes empty, multiple `mv`s collide on the same empty target, files vanish. **Use Python for any filename mutation.** No exceptions.
4. **`mv` silently overwrites.** A loop that constructs target names from a buggy parse will happily destroy your data. Use `mv -n` (no-clobber) in shell, or `os.rename` after `os.path.exists(dst)` check in Python. Never bare `mv`.
5. **Pre-flight the full plan in memory** before mutating the filesystem. Build a list of `(orig, new)` tuples; verify every `new` is unique within the plan, doesn't collide with anything in the destination directory, and that every `orig` exists. Only then start renaming.
6. **File-count audit.** Record `len(os.listdir(DEST))` before and after — must be equal. Any drop = data loss.
7. **iCloud-synced trees and Time Machine local snapshots:** files in the snapshot are *file-provider stubs*, not the bytes. `cat` / `cp` from a snapshot path inside an iCloud-synced folder returns "Operation timed out" with a 0-byte file. **External backups (Backblaze, Time Machine to a real disk) are the actual recovery source for iCloud data**, not local APFS snapshots.
8. **Bash background jobs in the Claude Code Bash tool can die silently.** A `while read` loop redirected from a file may exit immediately when run in the background. **Run renames foreground via Python** — it's the same code path locally and reliably runs to completion.
9. **Haiku occasionally returns the wrong filename extension** (the resized `.jpg` instead of the original `.png`). The plan-builder must accept that and try alternate extensions when the claimed source isn't found in the destination directory.
10. **Always preserve mp4/pdf source files** — the pipeline reads from the resized JPEG but renames the original mp4/pdf. Don't lose the source extension.
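
The pre-flight check in gotcha 5 can be sketched as a pure validation pass that returns a list of problems and touches nothing on disk. This is an illustration under assumptions, not the validator in `pipeline.py`: `validate_plan` is a hypothetical name, and the comparison is case-sensitive, which may miss collisions on a case-insensitive volume.

```python
import os
from collections import Counter


def validate_plan(folder: str, plan: list[tuple[str, str]]) -> list[str]:
    """Return a list of problems; an empty list means the plan looks safe to run."""
    existing = set(os.listdir(folder))
    problems = []

    # Every source named in the plan must exist in the folder.
    for old_name, _ in plan:
        if old_name not in existing:
            problems.append(f"missing source: {old_name}")

    # Every target must be unique within the plan and absent from the folder.
    for new_name, count in Counter(new for _, new in plan).items():
        if count > 1:
            problems.append(f"duplicate target in plan: {new_name}")
        if new_name in existing:
            problems.append(f"target already exists: {new_name}")

    return problems
```

Renaming starts only when the returned list is empty; any entry is a reason to stop and inspect the plan.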
## Quick Reference
| Step | Command |
|---|---|
| Extract mp4 frame | `ffmpeg -y -ss 1 -i "$f" -frames:v 1 -q:v 3 "$out"` |
| Convert pdf to jpg | `sips -s format jpeg "$f" --out "$out"` |
| Resize for vision | `sips -Z 1568 -s format jpeg "$f" --out "$out"` |
| Split TSV into batches of 20 | `awk -v w=DIR 'BEGIN{n=1;c=0} {print > sprintf("%s/batch-%02d", w, n); c++; if(c>=20){c=0;n++}}'` |
| Dispatch agent | Agent tool, `subagent_type=general-purpose`, `model="haiku"`, `run_in_background=true` |
| Execute renames | Python `os.rename` with pre-existence check (NEVER bash `mv` in a loop) |
## Reusable Pipeline
The prep, plan, and rename phases are in `pipeline.py`. The dispatch phase is performed by Claude Code itself (Agent tool calls) and cannot be scripted from inside Python — that's the trade-off of option (b).
Run order:
```bash
# 1. Prep + batch
python3 ~/.claude/skills/screenshot-rename/pipeline.py prep \
--src "/path/to/folder" --batch-size 19
# Now dispatch one Haiku Agent per /tmp/screenshot-rename/full-batch-NN file
# (Claude Code does this — see SKILL.md "Workflow" step 3)
# 2. After all desc-full-NN.tsv files exist:
python3 ~/.claude/skills/screenshot-rename/pipeline.py plan \
--src "/path/to/folder"
# 3. Review the plan, then:
python3 ~/.claude/skills/screenshot-rename/pipeline.py execute \
--src "/path/to/folder"
```
## Subagent Prompt Template
Use exactly this prompt for each batch (substitute the batch number):
```
Describe screenshots so they can be renamed.
Read the manifest at `/tmp/screenshot-rename/full-batch-NN`. Each line: `image_path<TAB>original_filename`.
For EACH line:
1. Use Read on `image_path` (first column) to view the image.
2. Generate a description of EXACTLY 6, 7, or 8 words describing "what app is shown and what the content is". Count your words. Be specific about app names when visible. Use only ASCII letters, numbers, and spaces — NO slashes, colons, dashes, quotes, special characters. Lowercase. 6-8 words.
Output: write `/tmp/screenshot-rename/desc-full-NN.tsv` via Write tool. Each line: `original_filename<TAB>description`. <count> lines total.
Then run `wc -l` on the output file to verify the line count.
Return only "DONE: <count> lines" or an error report.
```
Dispatch all batches **in a single message with multiple Agent tool calls** so they run in parallel. Use `run_in_background=true` so you can keep working.
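
Once descriptions come back, the plan step validates and normalizes them before building the new name. The sketch below shows one way to do that. The regular expression assumes file names shaped like `CleanShot 2026-04-15 at 09.14.07.png`, and both function names are hypothetical rather than taken from `pipeline.py`.

```python
import re

# Assumed shape: "<prefix> <YYYY-MM-DD> at <HH.MM.SS>.<ext>"
TIMESTAMP_NAME = re.compile(
    r"(?P<prefix>.+?) (?P<stamp>\d{4}-\d{2}-\d{2} at \d{2}\.\d{2}\.\d{2})\.(?P<ext>\w+)"
)
ALLOWED = re.compile(r"[A-Za-z0-9 ]+")


def normalize_description(raw: str, min_words: int = 6, max_words: int = 8) -> str:
    """Validate a description, truncate it to max_words, and title-case each word."""
    text = raw.strip()
    if not ALLOWED.fullmatch(text):
        raise ValueError(f"description has disallowed characters: {raw!r}")
    words = text.split()
    if len(words) < min_words:
        raise ValueError(f"description has fewer than {min_words} words: {raw!r}")
    return " ".join(word.capitalize() for word in words[:max_words])


def build_new_name(original: str, description: str) -> str:
    """Build '<prefix> - <Description> - <timestamp>.<ext>' from a timestamp name."""
    match = TIMESTAMP_NAME.fullmatch(original)
    if match is None:
        raise ValueError(f"unrecognized filename pattern: {original!r}")
    title = normalize_description(description)
    return f"{match['prefix']} - {title} - {match['stamp']}.{match['ext']}"
```

The extension is copied from the original name, so a `.mp4` or `.pdf` source keeps its own extension even though it was described from an extracted JPEG.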
## Common Mistakes
| Mistake | What goes wrong | Fix |
|---|---|---|
| `mv $f $newname` in a bash loop | One bug → silent overwrite → data loss | `os.rename` in Python with pre-existence check |
| Building target name with bash regex | zsh doesn't populate BASH_REMATCH; empty targets | Use Python `os.path.splitext` and string ops |
| Sending original Retina images to Read | "Image too large" error mid-batch, partial output | Resize to 1568px first |
| Sending .mp4 to vision | Read fails | Extract first frame to JPEG first |
| Skipping the file-count audit | Silent data loss goes unnoticed | `len(os.listdir(DEST))` before & after — must be equal |
| Trusting Haiku's filename column | 30%+ of entries may have wrong extension | Plan-builder tries alt extensions |
| Running rename loop in background `Bash run_in_background=true` | Background `while read` may exit immediately, 0 progress | Run via Python foreground (it's fast — `os.rename` is just a syscall) |
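
The alternate-extension fallback from the table above might be written as follows. This is a hedged sketch: `resolve_source` and the `ALT_EXTENSIONS` tuple are hypothetical, and the choice to return nothing when more than one candidate exists is an assumption made to avoid guessing.

```python
import os
from typing import Optional

# Extensions to try when the claimed file name is not found (assumed list).
ALT_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".mp4", ".pdf")


def resolve_source(folder: str, claimed: str) -> Optional[str]:
    """Return the real file name for a claimed name, trying alternate extensions."""
    if os.path.exists(os.path.join(folder, claimed)):
        return claimed
    stem, _ = os.path.splitext(claimed)
    candidates = [
        stem + ext
        for ext in ALT_EXTENSIONS
        if os.path.exists(os.path.join(folder, stem + ext))
    ]
    # Accept only an unambiguous match; otherwise leave the row for manual review.
    return candidates[0] if len(candidates) == 1 else None
```

A row whose source cannot be resolved is better reported as a failure than renamed on a guess.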
## Recovery — if something does go wrong
1. **Check `~/Library/Application Support/CleanShot/media/`** — CleanShot keeps a recent media history.
2. **Check external backups (Backblaze, Time Machine to physical disk)** — these contain real file bytes.
3. **Local APFS Time Machine snapshots are NOT useful for iCloud-synced files** — they store file-provider stubs that time out on read.
4. **Check icloud.com → Drive → Recently Deleted** — iCloud keeps deleted files for ~30 days, but `mv` overwrites are NOT "deletes" from iCloud's perspective and may not appear there.
## Real-World Impact
The first run on 196 CleanShot files lost 4 of them to the bash-regex-in-zsh gotcha (rule #3). After the rebuild with Python and `mv -n`, the second run renamed 189 files cleanly with zero loss. This skill exists so that doesn't happen again.
+996
@@ -0,0 +1,996 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>screenshot-rename — a Claude Code skill</title>
<meta name="description" content="A Claude Code skill that turns a folder of timestamp-named screenshots into human-readable, searchable filenames using parallel Haiku vision agents.">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Fraunces:ital,opsz,wght@0,9..144,300;0,9..144,400;0,9..144,500;0,9..144,600;0,9..144,700;1,9..144,400;1,9..144,500;1,9..144,700&family=JetBrains+Mono:wght@400;500;600&display=swap">
<style>
:root {
--paper: #f7f3ed;
--paper-2: #efe7d8;
--paper-3: #e8dfcd;
--ink: #1c1916;
--ink-soft: #3c3530;
--ink-mute: #8d8478;
--accent: #7a1f3d;
--accent-soft: rgba(122,31,61,.10);
--accent-deep: #5d1730;
--rule: #d8cebf;
--rule-soft: #e6dfd2;
--serif: "Fraunces", "Iowan Old Style", "Charter", Georgia, serif;
--mono: "JetBrains Mono", ui-monospace, "SF Mono", Menlo, Consolas, monospace;
--max: 1180px;
--gutter: clamp(20px, 4vw, 56px);
}
* { box-sizing: border-box; }
html, body { margin: 0; padding: 0; }
body {
background: var(--paper);
color: var(--ink);
font-family: var(--serif);
font-feature-settings: "ss01", "ss02", "kern";
font-size: 17px;
line-height: 1.55;
-webkit-font-smoothing: antialiased;
text-rendering: optimizeLegibility;
background-image:
radial-gradient(1200px 400px at 88% -10%, rgba(122,31,61,.05), transparent 60%),
radial-gradient(800px 300px at 0% 20%, rgba(0,0,0,.03), transparent 70%);
}
/* subtle paper grain via SVG noise */
body::before {
content: "";
position: fixed; inset: 0;
background-image: url("data:image/svg+xml;utf8,<svg xmlns='http://www.w3.org/2000/svg' width='180' height='180'><filter id='n'><feTurbulence type='fractalNoise' baseFrequency='0.85' numOctaves='2' stitchTiles='stitch'/><feColorMatrix values='0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.045 0'/></filter><rect width='100%' height='100%' filter='url(%23n)'/></svg>");
pointer-events: none;
z-index: 0;
mix-blend-mode: multiply;
}
a { color: var(--accent); text-decoration: none; }
a:hover { color: var(--accent-deep); text-decoration: underline; text-decoration-thickness: 1px; text-underline-offset: 3px; }
.wrap {
max-width: var(--max);
margin: 0 auto;
padding: 0 var(--gutter);
position: relative;
z-index: 1;
}
/* ───── Header ────────────────────────────────────────── */
header {
border-bottom: 1px solid var(--rule);
background: rgba(247,243,237,.85);
backdrop-filter: saturate(140%) blur(8px);
-webkit-backdrop-filter: saturate(140%) blur(8px);
position: sticky; top: 0; z-index: 10;
}
header .wrap {
display: flex; align-items: baseline;
justify-content: space-between;
padding-top: 14px; padding-bottom: 14px;
gap: 24px;
}
.brand {
font-family: var(--mono);
font-size: 13px;
letter-spacing: 0.04em;
text-transform: lowercase;
color: var(--ink);
}
.brand .dot {
display: inline-block;
width: 7px; height: 7px;
background: var(--accent); border-radius: 50%;
vertical-align: 2px;
margin-right: 8px;
}
nav.top {
font-family: var(--mono);
font-size: 12px;
letter-spacing: 0.04em;
text-transform: lowercase;
display: flex; gap: 22px;
}
nav.top a {
color: var(--ink-mute);
}
nav.top a:hover { color: var(--accent); text-decoration: none; }
/* ───── Hero ────────────────────────────────────────── */
.hero {
padding: clamp(56px, 9vw, 120px) 0 clamp(40px, 6vw, 80px);
position: relative;
}
.eyebrow {
font-family: var(--mono);
font-size: 11.5px;
letter-spacing: 0.16em;
text-transform: uppercase;
color: var(--accent);
display: flex; align-items: center; gap: 12px;
margin-bottom: 28px;
}
.eyebrow .bar {
flex: 0 0 36px; height: 1px; background: var(--accent);
}
h1.display {
font-family: var(--serif);
font-weight: 400;
font-style: italic;
font-variation-settings: "opsz" 144;
font-size: clamp(48px, 8.5vw, 116px);
line-height: 0.96;
letter-spacing: -0.025em;
color: var(--ink);
margin: 0 0 28px;
max-width: 14ch;
}
h1.display .accent {
font-style: normal;
font-weight: 500;
color: var(--accent);
position: relative;
}
h1.display .accent::after {
content: "";
position: absolute;
left: 0; right: 0; bottom: 0.04em;
height: 0.13em;
background: var(--accent-soft);
z-index: -1;
}
.hero .lede {
max-width: 56ch;
font-size: clamp(17px, 1.45vw, 21px);
line-height: 1.55;
color: var(--ink-soft);
margin: 0 0 44px;
}
.hero .lede em { font-style: italic; color: var(--ink); }
.hero-cta {
display: flex; gap: 14px; align-items: center;
font-family: var(--mono);
font-size: 13px;
}
.btn {
display: inline-block;
padding: 12px 22px;
font-family: var(--mono);
font-size: 13px;
letter-spacing: 0.02em;
border-radius: 0;
border: 1px solid var(--ink);
background: var(--ink);
color: var(--paper);
transition: transform .15s ease, background .15s ease;
}
.btn:hover {
background: var(--accent-deep);
border-color: var(--accent-deep);
color: var(--paper);
text-decoration: none;
transform: translateY(-1px);
}
.btn.ghost {
background: transparent;
color: var(--ink);
}
.btn.ghost:hover {
background: transparent;
color: var(--accent);
border-color: var(--accent);
}
/* The hero rename "monument" */
.monument {
margin-top: clamp(60px, 8vw, 100px);
padding: 32px 28px;
background: var(--paper-2);
border: 1px solid var(--rule);
position: relative;
font-family: var(--mono);
font-size: clamp(13px, 1.25vw, 16.5px);
line-height: 1.7;
word-break: break-word;
}
.monument::before {
content: "FILENAME · BEFORE ↓ AFTER";
position: absolute;
top: -10px; left: 22px;
background: var(--paper);
padding: 0 10px;
font-size: 10px;
letter-spacing: 0.18em;
color: var(--ink-mute);
}
.monument .row {
display: grid;
grid-template-columns: 22px 1fr;
gap: 14px;
align-items: baseline;
padding: 6px 0;
}
.monument .row + .row { border-top: 1px dashed var(--rule); }
.monument .glyph {
color: var(--accent);
text-align: center;
font-weight: 600;
}
.monument .before { color: var(--ink-mute); }
.monument .after { color: var(--ink); }
.monument .after .desc { color: var(--accent-deep); font-weight: 500; }
/* ───── Section scaffolding ────────────────────────── */
section { padding: clamp(72px, 10vw, 140px) 0; position: relative; }
section + section { border-top: 1px solid var(--rule-soft); }
.section-label {
font-family: var(--mono);
font-size: 11px;
letter-spacing: 0.22em;
text-transform: uppercase;
color: var(--ink-mute);
margin-bottom: 18px;
display: flex; align-items: center; gap: 12px;
}
.section-label .num {
display: inline-block;
width: 26px; height: 26px;
border: 1px solid var(--ink);
border-radius: 50%;
text-align: center; line-height: 24px;
font-size: 10px; color: var(--ink);
letter-spacing: 0;
}
h2 {
font-family: var(--serif);
font-weight: 400;
font-variation-settings: "opsz" 96;
font-size: clamp(34px, 4.5vw, 56px);
line-height: 1.05;
letter-spacing: -0.02em;
color: var(--ink);
margin: 0 0 28px;
max-width: 22ch;
}
h2 em { font-style: italic; color: var(--accent); }
.lede-2 {
max-width: 60ch;
font-size: 17px;
color: var(--ink-soft);
margin: 0 0 60px;
}
/* ───── Section: the problem ──────────────────────── */
.problem-grid {
display: grid;
grid-template-columns: minmax(0, 5fr) minmax(0, 6fr);
gap: clamp(28px, 4vw, 64px);
align-items: start;
}
@media (max-width: 720px) { .problem-grid { grid-template-columns: 1fr; } }
.opaque-list {
font-family: var(--mono);
font-size: 13.5px;
line-height: 2.05;
color: var(--ink-mute);
border-left: 2px solid var(--rule);
padding: 4px 0 4px 22px;
}
.opaque-list li { list-style: none; }
.opaque-list .ellipsis { color: var(--accent); margin-top: 6px; font-family: var(--serif); font-style: italic; }
.problem-note {
font-size: 17px;
color: var(--ink-soft);
line-height: 1.6;
}
.problem-note strong { font-weight: 500; color: var(--ink); }
.problem-note .stat {
display: block;
font-family: var(--serif);
font-style: italic;
font-variation-settings: "opsz" 96;
font-size: clamp(40px, 5vw, 68px);
line-height: 1;
color: var(--accent);
margin: 28px 0 14px;
}
/* ───── Section: pipeline ─────────────────────────── */
.pipeline {
display: grid;
grid-template-columns: repeat(4, 1fr);
gap: 0;
border: 1px solid var(--rule);
background: var(--paper-2);
}
@media (max-width: 920px) { .pipeline { grid-template-columns: repeat(2, 1fr); } }
@media (max-width: 540px) { .pipeline { grid-template-columns: 1fr; } }
.stage {
padding: 32px 26px;
border-right: 1px solid var(--rule);
position: relative;
}
.stage:last-child { border-right: none; }
@media (max-width: 920px) { .stage:nth-child(2) { border-right: none; } }
.stage .step {
font-family: var(--mono);
font-size: 11px;
letter-spacing: 0.18em;
text-transform: uppercase;
color: var(--ink-mute);
margin-bottom: 14px;
}
.stage h3 {
font-family: var(--serif);
font-weight: 500;
font-style: italic;
font-variation-settings: "opsz" 60;
font-size: 30px;
line-height: 1.05;
margin: 0 0 16px;
color: var(--ink);
letter-spacing: -0.01em;
}
.stage p {
font-size: 14.5px;
color: var(--ink-soft);
margin: 0 0 16px;
line-height: 1.55;
}
.stage code {
font-family: var(--mono);
font-size: 12px;
color: var(--accent-deep);
background: rgba(122,31,61,.07);
padding: 1px 6px;
border-radius: 2px;
}
/* ───── Section: before / after ──────────────────── */
.split {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 0;
border: 1px solid var(--rule);
}
@media (max-width: 720px) { .split { grid-template-columns: 1fr; } }
.split > div {
padding: 40px 32px;
}
.split .lhs {
background: var(--paper-3);
border-right: 1px solid var(--rule);
}
.split .rhs {
background: var(--paper);
position: relative;
}
.split .rhs::before {
content: "→";
position: absolute;
left: -22px; top: 50%;
transform: translateY(-50%);
width: 44px; height: 44px;
background: var(--accent);
color: var(--paper);
border-radius: 50%;
display: grid; place-items: center;
font-family: var(--mono);
font-size: 18px;
}
@media (max-width: 720px) { .split .rhs::before { display: none; } }
.split .tag {
font-family: var(--mono);
font-size: 10.5px;
letter-spacing: 0.18em;
text-transform: uppercase;
color: var(--ink-mute);
margin-bottom: 18px;
}
.split .filename {
font-family: var(--mono);
font-size: clamp(15px, 1.7vw, 19px);
line-height: 1.55;
word-break: break-word;
margin-bottom: 18px;
}
.split .lhs .filename { color: var(--ink-mute); }
.split .rhs .filename { color: var(--ink); }
.split .rhs .filename .desc { color: var(--accent-deep); font-weight: 500; }
.split .meta {
font-size: 14px;
color: var(--ink-soft);
}
.split .meta .row { display: flex; gap: 10px; margin: 4px 0; }
.split .meta .row .k {
font-family: var(--mono);
font-size: 11px;
letter-spacing: 0.08em;
color: var(--ink-mute);
min-width: 90px;
padding-top: 2px;
text-transform: uppercase;
}
/* ───── Section: receipt ─────────────────────────── */
.receipt {
background: var(--paper-2);
border: 1px solid var(--rule);
padding: clamp(36px, 4vw, 56px);
font-family: var(--mono);
font-size: 13.5px;
line-height: 1.85;
color: var(--ink);
position: relative;
max-width: 720px;
margin: 0 auto;
}
.receipt::before, .receipt::after {
content: ""; position: absolute; left: 0; right: 0; height: 8px;
background-image:
radial-gradient(circle at 6px 4px, var(--paper) 4px, transparent 4px);
background-size: 14px 8px;
background-repeat: repeat-x;
}
.receipt::before { top: -1px; }
.receipt::after { bottom: -1px; transform: scaleY(-1); }
.receipt .head {
font-family: var(--serif);
font-style: italic;
font-size: 18px;
color: var(--accent);
margin-bottom: 20px;
padding-bottom: 14px;
border-bottom: 1px dashed var(--rule);
}
.receipt .line { display: flex; justify-content: space-between; gap: 16px; }
.receipt .line .v { color: var(--accent-deep); font-weight: 500; }
.receipt .ok { color: var(--accent); font-weight: 600; }
.receipt .total {
margin-top: 18px;
padding-top: 14px;
border-top: 1px solid var(--ink);
font-weight: 500;
}
/* ───── Section: gotchas ─────────────────────────── */
.gotchas {
list-style: none; padding: 0; margin: 0;
display: grid;
grid-template-columns: 1fr 1fr;
gap: 4px 32px;
counter-reset: g;
}
@media (max-width: 720px) { .gotchas { grid-template-columns: 1fr; } }
.gotchas li {
counter-increment: g;
padding: 22px 0 22px 56px;
border-top: 1px solid var(--rule);
font-size: 16px;
color: var(--ink-soft);
line-height: 1.5;
position: relative;
}
.gotchas li::before {
content: counter(g, decimal-leading-zero);
position: absolute;
left: 0; top: 22px;
font-family: var(--mono);
font-size: 11px;
letter-spacing: 0.12em;
color: var(--accent);
width: 36px;
}
.gotchas li b {
font-family: var(--serif);
font-weight: 500;
font-style: italic;
color: var(--ink);
font-size: 18px;
display: block;
margin-bottom: 4px;
}
.gotchas li code {
font-family: var(--mono);
font-size: 13px;
color: var(--accent-deep);
}
/* ───── Section: use cases ──────────────────────── */
.cases {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 22px;
}
@media (max-width: 720px) { .cases { grid-template-columns: 1fr; } }
.case {
padding: 36px 28px;
background: var(--paper-2);
border: 1px solid var(--rule);
position: relative;
transition: transform .2s ease, box-shadow .2s ease;
}
.case:hover {
transform: translateY(-2px);
box-shadow: 0 12px 30px -16px rgba(28,25,22,.18);
border-color: var(--ink);
}
.case .num {
font-family: var(--mono);
font-size: 11px;
letter-spacing: 0.18em;
color: var(--ink-mute);
margin-bottom: 18px;
}
.case h3 {
font-family: var(--serif);
font-weight: 500;
font-style: italic;
font-size: 26px;
line-height: 1.15;
margin: 0 0 14px;
color: var(--ink);
letter-spacing: -0.01em;
}
.case p {
font-size: 15.5px;
color: var(--ink-soft);
line-height: 1.55;
margin: 0 0 16px;
}
.case .example {
font-family: var(--mono);
font-size: 12.5px;
color: var(--accent-deep);
background: var(--paper);
border-left: 2px solid var(--accent);
padding: 10px 12px;
word-break: break-word;
line-height: 1.5;
}
/* ───── Section: install ─────────────────────────── */
pre.code {
background: var(--ink);
color: #ece4d2;
font-family: var(--mono);
font-size: 13.5px;
line-height: 1.7;
padding: 26px 28px;
margin: 0 0 22px;
border-left: 3px solid var(--accent);
overflow-x: auto;
position: relative;
}
pre.code .c { color: #8d8478; font-style: italic; }
pre.code .k { color: #d8a4b3; }
pre.code .s { color: #e9c98c; }
pre.code .p { color: #c8d3a3; }
pre.code .lbl {
position: absolute;
top: 8px; right: 14px;
font-size: 10px;
letter-spacing: 0.18em;
color: #8d8478;
text-transform: uppercase;
}
.install-grid {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 22px;
}
@media (max-width: 760px) { .install-grid { grid-template-columns: 1fr; } }
.install-grid > .col h3 {
font-family: var(--serif);
font-weight: 500;
font-style: italic;
font-size: 22px;
margin: 0 0 14px;
color: var(--ink);
}
.install-grid > .col p { color: var(--ink-soft); font-size: 15.5px; margin: 0 0 14px; line-height: 1.55; }
/* ───── Footer ─────────────────────────────────── */
footer {
border-top: 1px solid var(--rule);
padding: 56px 0 80px;
margin-top: 0;
font-family: var(--mono);
font-size: 12.5px;
color: var(--ink-mute);
}
footer .wrap {
display: flex;
justify-content: space-between;
align-items: baseline;
flex-wrap: wrap;
gap: 22px;
}
footer .marks { display: flex; gap: 22px; }
footer a { color: var(--ink); }
footer a:hover { color: var(--accent); text-decoration: none; }
footer .colophon {
font-family: var(--serif);
font-style: italic;
font-size: 14px;
color: var(--ink-mute);
max-width: 50ch;
}
/* ───── Reveal animation ─────────────────────── */
@media (prefers-reduced-motion: no-preference) {
.reveal { opacity: 0; transform: translateY(8px); transition: opacity .8s ease, transform .8s ease; }
.reveal.in { opacity: 1; transform: none; }
h1.display, .hero .lede, .hero-cta, .monument {
opacity: 0; transform: translateY(10px);
animation: rise .9s ease forwards;
}
.hero .lede { animation-delay: 0.10s; }
.hero-cta { animation-delay: 0.18s; }
.monument { animation-delay: 0.28s; }
@keyframes rise {
to { opacity: 1; transform: none; }
}
}
/* ───── Selection ────────────────────────────── */
::selection { background: var(--accent); color: var(--paper); }
</style>
</head>
<body>
<header>
<div class="wrap">
<a href="#" class="brand"><span class="dot"></span>screenshot-rename</a>
<nav class="top">
<a href="#problem">problem</a>
<a href="#pipeline">pipeline</a>
<a href="#gotchas">gotchas</a>
<a href="#cases">use cases</a>
<a href="#install">install</a>
<a href="https://gitea.tojo.team/cardinale/screenshot-rename">repo&nbsp;</a>
</nav>
</div>
</header>
<main>
<!-- ───── Hero ────────────────────────────────────── -->
<section class="hero">
<div class="wrap">
<div class="eyebrow"><span class="bar"></span>A Claude Code skill · vision-described renames</div>
<h1 class="display">A folder<br>of timestamps,<br>turned into a <span class="accent">manifest</span>.</h1>
<p class="lede">
Two hundred screenshots, all named <em>CleanShot 2026-04-15 at 09.14.07.png</em>.
Run this skill: ten Haiku subagents read each one in parallel, write a six-to-eight word
description, and rename the file in place, atomically, with the safety nets it cost
the author four files to learn.
</p>
<div class="hero-cta">
<a class="btn" href="#install">Install</a>
<a class="btn ghost" href="https://gitea.tojo.team/cardinale/screenshot-rename">View on Gitea ↗</a>
</div>
<div class="monument">
<div class="row">
<span class="glyph"></span>
<span class="before">CleanShot 2026-04-15 at 09.14.07.png</span>
</div>
<div class="row">
<span class="glyph"></span>
<span class="after">CleanShot · <span class="desc">Shamel Studio Affiliate Referral Code Modal</span> · 2026-04-15 at 09.14.07.png</span>
</div>
</div>
</div>
</section>
<!-- ───── The problem ──────────────────────────────── -->
<section id="problem">
<div class="wrap">
<div class="section-label"><span class="num">01</span>The problem</div>
<h2>You can't <em>find</em> a screenshot you took six months ago.</h2>
<div class="problem-grid">
<ul class="opaque-list reveal">
<li>CleanShot 2025-09-26 at 16.27.39.png</li>
<li>CleanShot 2025-11-19 at 13.12.36.png</li>
<li>CleanShot 2025-12-05 at 11.24.33.png</li>
<li>CleanShot 2026-02-18 at 12.48.31.png</li>
<li>CleanShot 2026-03-04 at 06.13.44.png</li>
<li>CleanShot 2026-03-17 at 22.10.20.mp4</li>
<li>CleanShot 2026-03-21 at 11.46.42.png</li>
<li>CleanShot 2026-04-08 at 12.09.10.png</li>
<li class="ellipsis">…and 187 more</li>
</ul>
<div class="problem-note reveal">
<p>
A timestamp tells you <strong>when</strong> a screenshot was taken.
It doesn't tell you what's <strong>in</strong> it. Spotlight indexes the
pixels reluctantly; iCloud-synced folders less reliably still. The only
way most people find an old screenshot is by remembering, roughly,
what they were doing the week they took it — and scrolling.
</p>
<p>
The real cost isn't filesystem clutter. It's the screenshots you
stopped taking, because past you knew future you wouldn't be able to
surface them.
</p>
<span class="stat">196 →</span>
<p style="margin-top: -4px;">files renamed in the first run that motivated this skill, in three minutes, with zero loss after the second pass. The first pass cost four files. <em>That's why the safety rules below are written the way they are.</em></p>
</div>
</div>
</div>
</section>
<!-- ───── Pipeline ─────────────────────────────────── -->
<section id="pipeline">
<div class="wrap">
<div class="section-label"><span class="num">02</span>The pipeline</div>
<h2>Four stages, <em>in two minutes</em>.</h2>
<p class="lede-2">
The skill does as little as possible, and validates as much as possible.
Subagents handle the work that benefits from parallelism (vision); Python
handles the work that benefits from being correct (filename mutation,
collision detection, the actual <code>os.rename</code>).
</p>
<div class="pipeline reveal">
<div class="stage">
<div class="step">Stage 01</div>
<h3>Prep.</h3>
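<p>Extract one frame from every <code>.mp4</code> or <code>.mov</code> (taken at the one-second mark) and convert every <code>.pdf</code> to a JPEG. Resize every image to <code>1568px</code> max, because Read's image cap is real. Build a manifest TSV and split it into batches.</p>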
<p style="color:var(--ink-mute); font-size: 13px;"><code>ffmpeg · sips · /tmp/screenshot-rename/full-batch-NN</code></p>
</div>
<div class="stage">
<div class="step">Stage 02</div>
<h3>Describe.</h3>
<p>Dispatch one Haiku subagent per batch, in parallel, ten at a time. Each agent reads the images in its batch (19 by default) and writes 6&ndash;8&nbsp;word descriptions to <code>desc-full-NN.tsv</code>.</p>
<p style="color:var(--ink-mute); font-size: 13px;"><code>model · "haiku" · ~$0.30 / 200 files</code></p>
</div>
<div class="stage">
<div class="step">Stage 03</div>
<h3>Plan.</h3>
<p>Aggregate. Validate every line: six or more words once non-alphanumeric characters are stripped, source exists, target doesn't, no plan-internal collisions. Build the full rename map <em>in&nbsp;memory</em>.</p>
<p style="color:var(--ink-mute); font-size: 13px;"><code>plan-full.tsv · zero-error policy</code></p>
</div>
<div class="stage">
<div class="step">Stage 04</div>
<h3>Execute.</h3>
<p>One <code>os.rename</code> per row, with pre-existence check. Audit <code>len(listdir)</code> before&nbsp;and&nbsp;after — it must&nbsp;be&nbsp;equal. <em>That equality is your only proof no overwrites happened.</em></p>
<p style="color:var(--ink-mute); font-size: 13px;"><code>before == after · ok / fail</code></p>
</div>
</div>
</div>
</section>
<!-- ───── Before / After ───────────────────────────── -->
<section id="example">
<div class="wrap">
<div class="section-label"><span class="num">03</span>An actual rename</div>
<h2><em>Before</em> a timestamp.<br>After, a sentence.</h2>
<p class="lede-2">
A real rename from the run that motivated this skill. The description
was generated by Haiku in roughly two seconds.
</p>
<div class="split">
<div class="lhs">
<div class="tag">Before</div>
<div class="filename">CleanShot 2026-03-17 at 22.10.20.mp4</div>
<div class="meta">
<div class="row"><span class="k">Length</span><span>36 chars</span></div>
<div class="row"><span class="k">Searchable</span><span>by date only</span></div>
<div class="row"><span class="k">Tells you</span><span>when</span></div>
</div>
</div>
<div class="rhs">
<div class="tag">After</div>
<div class="filename">CleanShot · <span class="desc">Claude Conversation About Context Calculator Implementation</span> · 2026-03-17 at 22.10.20.mp4</div>
<div class="meta">
<div class="row"><span class="k">Length</span><span>100 chars</span></div>
<div class="row"><span class="k">Searchable</span><span>by content + date</span></div>
<div class="row"><span class="k">Tells you</span><span>what, when</span></div>
</div>
</div>
</div>
<p style="font-family:var(--mono); font-size:12px; color:var(--ink-mute); letter-spacing:0.1em; text-transform:uppercase; margin-top:20px;">
The original timestamp survives unchanged. Sorting still works. The description sits between, set off by spaced hyphens.
</p>
</div>
</section>
<!-- ───── Receipt ──────────────────────────────────── -->
<section>
<div class="wrap">
<div class="receipt reveal">
<div class="head">screenshot-rename · run log · 2026-05-04</div>
<div class="line"><span>source files</span><span class="v">196</span></div>
<div class="line"><span>resized to 1568px</span><span class="v">196</span></div>
<div class="line"><span>frames extracted (mp4 / pdf)</span><span class="v">9</span></div>
<div class="line"><span>batches dispatched</span><span class="v">10 · parallel</span></div>
<div class="line"><span>haiku descriptions returned</span><span class="v">196</span></div>
<div class="line"><span>plan validated</span><span class="v">189 renames · 0 errors</span></div>
<div class="line"><span>plan collisions</span><span class="v">none</span></div>
<div class="line"><span>file count before</span><span class="v">195</span></div>
<div class="line"><span>file count after</span><span class="v">195</span></div>
<div class="line total"><span>renames committed</span><span class="ok">189 ✓</span></div>
<div class="line total"><span>files lost</span><span class="ok">0 ✓</span></div>
</div>
</div>
</section>
<!-- ───── Gotchas ──────────────────────────────────── -->
<section id="gotchas">
<div class="wrap">
<div class="section-label"><span class="num">04</span>The rules that prevent data loss</div>
<h2>Every rule below was <em>paid for</em>.</h2>
<p class="lede-2">
During development, four files were destroyed by a one-line bash mistake.
Each rule names the failure mode that earned its place. None are aspirational.
</p>
<ol class="gotchas">
<li><b>Resize before vision.</b>Retina screenshots exceed Read's image cap. Use <code>sips -Z 1568 -s format jpeg</code> first. The agent will fail mid-batch otherwise.</li>
<li><b>Frames, not videos.</b>The vision tool can't read <code>.mp4</code> or multi-page <code>.pdf</code>. Extract a frame with <code>ffmpeg -ss 1 -frames:v 1</code> and describe that.</li>
<li><b>Never trust bash regex on filenames.</b>zsh's <code>[[ =~ ]]</code> does not populate <code>BASH_REMATCH</code> by default. The captured groups come back empty with no error, the target name ends up empty, and multiple <code>mv</code>s collide on the same target. <em>Use Python.</em></li>
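<li><b><code>mv</code> overwrites silently.</b>One off-by-one in target construction destroys data with no error. Use <code>mv&nbsp;-n</code> in shell, or <code>os.rename</code> after an <code>os.path.exists</code> check in Python; on POSIX systems <code>os.rename</code> also replaces an existing target without complaint, so the check is not optional.</li>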
<li><b>Plan the full rename in memory first.</b>Build every <code>(src, dst)</code> tuple. Verify each <code>dst</code> is unique, doesn't exist, and corresponds to a real <code>src</code>. <em>Then</em> mutate disk.</li>
<li><b>File-count audit, every time.</b><code>len(listdir(DEST))</code> before and after must be equal. Inequality is the only evidence of silent loss you'll get.</li>
<li><b>iCloud snapshots are stubs, not bytes.</b>Files in a Time Machine local snapshot inside an iCloud-synced tree are file-provider stubs. <code>cat</code> them and the read times out. Real recovery comes from external backups.</li>
<li><b>Run renames foreground.</b><code>Bash run_in_background</code> with <code>while read</code> may exit early with no progress. Run via Python in the same shell — <code>os.rename</code> is just a syscall.</li>
<li><b>Validate the filename column.</b>Haiku occasionally returns the resized <code>.jpg</code> name instead of the original <code>.png</code>. The plan-builder must try alternate extensions when the claimed source isn't found.</li>
<li><b>Preserve the original extension.</b>The pipeline reads from a resized JPEG but renames the original <code>.mp4</code> / <code>.pdf</code>. Write the source extension back into the new name.</li>
</ol>
</div>
</section>
<!-- ───── Use cases ────────────────────────────────── -->
<section id="cases">
<div class="wrap">
<div class="section-label"><span class="num">05</span>Use cases</div>
<h2>What this looks like in <em>practice</em>.</h2>
<p class="lede-2">
The skill earns its keep when "Spotlight will find it" stops being true. Four scenarios where it has.
</p>
<div class="cases">
<div class="case">
<div class="num">A · Archive</div>
<h3>An audit of a year of work.</h3>
<p>Run the skill on a year-old screenshot folder. The output is a chronologically sorted
narrative of what you were thinking about, week by week, readable from the filename column
in Finder. No app needed.</p>
<div class="example">CleanShot · Synqora Audit Context Calculator Discussion Continued · 2026-03-15 at 08.08.29.png</div>
</div>
<div class="case">
<div class="num">B · Recall</div>
<h3>"Find the screenshot of the bug from last March."</h3>
<p>Renaming once buys you free-text search forever. <code>mdfind "synqora session load"</code>
surfaces the right file in a fraction of a second, with no manual tagging.</p>
<div class="example">CleanShot · Synqora Session Load Failed Disconnect Reconnecting Error · 2026-04-18 at 13.37.12.png</div>
</div>
<div class="case">
<div class="num">C · Onboarding</div>
<h3>Designer joins. Hand them the folder.</h3>
<p>Instead of curating a deck of "what we've shipped this quarter," point them at the renamed
screenshot folder. The filenames are the deck. Categorize by app, by feature, by
timeline — the descriptions are already there.</p>
<div class="example">CleanShot · Xcode Preview Swiftui Render Table Comparison Tools · 2026-03-21 at 10.47.26.png</div>
</div>
<div class="case">
<div class="num">D · Memory</div>
<h3>A searchable design memory.</h3>
<p>Pair with a periodic re-run on new captures. The folder becomes a queryable artifact:
every screenshot you took, with what was in it, in plain text, in the filesystem you
already use. No new tool to adopt.</p>
<div class="example">CleanShot · Storyboard Browser With Harry Bridges 1933 Rally Shots · 2026-05-03 at 07.58.27.png</div>
</div>
</div>
</div>
</section>
<!-- ───── Install ──────────────────────────────────── -->
<section id="install">
<div class="wrap">
<div class="section-label"><span class="num">06</span>Install &amp; run</div>
<h2>Three commands, <em>one folder</em>.</h2>
<p class="lede-2">
The skill installs as a Claude Code skill. Once cloned into <code>~/.claude/skills/</code>, it
activates automatically when you ask Claude to rename a screenshot folder. It can also be
driven from the command line.
</p>
<pre class="code"><span class="lbl">install</span><span class="c"># clone into your Claude Code skills directory</span>
<span class="k">git</span> clone https://gitea.tojo.team/cardinale/screenshot-rename.git \
<span class="s">~/.claude/skills/screenshot-rename</span></pre>
<div class="install-grid">
<div class="col">
<h3>Driven by Claude Code</h3>
<p>Open Claude Code in any project and say it conversationally. The skill activates from its description and runs the workflow end to end.</p>
<pre class="code" style="border-left-color:var(--accent-deep);"><span class="lbl">claude code</span><span class="p">&gt;</span> rename all the cleanshots in
<span class="s">~/Documents/Screenshots/</span>
based on their content.</pre>
</div>
<div class="col">
<h3>Driven directly from the shell</h3>
<p>For folders too large for a single session, run each stage by hand. Dispatch the Haiku subagents from a Claude Code session in between.</p>
<pre class="code"><span class="lbl">cli</span><span class="k">python3</span> pipeline.py prep --src <span class="s">"./shots"</span>
<span class="c"># dispatch one haiku agent per batch...</span>
<span class="k">python3</span> pipeline.py plan --src <span class="s">"./shots"</span>
<span class="k">python3</span> pipeline.py execute --src <span class="s">"./shots"</span></pre>
</div>
</div>
</div>
</section>
</main>
<footer>
<div class="wrap">
<div class="marks">
<a href="https://gitea.tojo.team/cardinale/screenshot-rename">repository ↗</a>
<a href="./SKILL.md">SKILL.md</a>
<a href="./LICENSE">MIT</a>
</div>
<div class="colophon">
Set in Fraunces &amp; JetBrains Mono. Written after losing four files to a bash regex bug.
</div>
</div>
</footer>
<script>
// Tiny stagger for sections that fade in on scroll.
(function () {
if (!('IntersectionObserver' in window)) return;
const io = new IntersectionObserver((entries) => {
entries.forEach(e => {
if (e.isIntersecting) { e.target.classList.add('in'); io.unobserve(e.target); }
});
}, { threshold: 0.08 });
document.querySelectorAll('.reveal').forEach(el => io.observe(el));
})();
</script>
</body>
</html>
+280
View File
@@ -0,0 +1,280 @@
#!/usr/bin/env python3
"""Screenshot-rename pipeline.

Three subcommands:
  prep     extract frames, resize, build manifest, split into batches
  plan     aggregate desc-*.tsv files, validate, write rename plan
  execute  apply the plan with safety checks

The Haiku-subagent dispatch step happens between `prep` and `plan` and is
performed by Claude Code in-session, not by this script.
"""
import argparse
import os
import re
import subprocess
import sys
from pathlib import Path

# Scratch space for extracted frames, resized images, batches and the plan.
WORK = Path("/tmp/screenshot-rename")
FRAMES = WORK / "frames"
SMALL = WORK / "small"
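

def run(cmd, **kw):
    """Run a command and capture its stdout/stderr as text."""
    return subprocess.run(cmd, capture_output=True, text=True, **kw)


def title_case(s: str) -> str:
    """Capitalize every whitespace-separated word."""
    return " ".join(w.capitalize() for w in s.split())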
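

# --- Illustrative sketch, not part of the original pipeline -------------------
# A compact view of how a raw description becomes the middle segment of a new
# filename, assuming the same truncate, sanitize, capitalize steps that `plan`
# applies. `describe_segment` is a hypothetical helper name and nothing in this
# script calls it.
def describe_segment(desc: str, max_words: int = 8) -> str:
    # Keep at most `max_words` words, strip every non-alphanumeric character,
    # drop words that end up empty, then capitalize each survivor.
    words = ("".join(c for c in w if c.isalnum()) for w in desc.split()[:max_words])
    return " ".join(w.capitalize() for w in words if w)
# Note: str.capitalize lowercases the tail of each word, so "SwiftUI" becomes
# "Swiftui" in the final filename.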


# ---------- prep ----------
def prep(src: Path, batch_size: int, prefix: str) -> None:
    if not src.is_dir():
        sys.exit(f"source not a directory: {src}")
    WORK.mkdir(parents=True, exist_ok=True)
    FRAMES.mkdir(exist_ok=True)
    SMALL.mkdir(exist_ok=True)

    # Match "<prefix> YYYY-MM-DD..." so already-renamed files are skipped.
    pattern = re.compile(rf"^{re.escape(prefix)}\s+\d{{4}}-\d{{2}}-\d{{2}}.*$")
    files = sorted(p for p in src.iterdir() if p.is_file() and pattern.match(p.name))
    if not files:
        sys.exit(f"no matching files (prefix='{prefix}') in {src}")
    print(f"found {len(files)} source files")

    manifest = WORK / "all.tsv"
    with manifest.open("w") as out:
        for f in files:
            base = f.stem
            ext = f.suffix.lower()
            if ext in (".mp4", ".mov"):
                # Videos: grab a single frame at the one-second mark.
                frame = FRAMES / f"{base}.jpg"
                if not frame.exists():
                    run(["ffmpeg", "-y", "-ss", "1", "-i", str(f),
                         "-frames:v", "1", "-q:v", "3", str(frame)])
                if not frame.exists():
                    print(f"WARN ffmpeg failed: {f.name}", file=sys.stderr)
                    continue
                vision_src = frame
            elif ext == ".pdf":
                # PDFs: let sips convert the document to a JPEG.
                frame = FRAMES / f"{base}.jpg"
                if not frame.exists():
                    run(["sips", "-s", "format", "jpeg", str(f), "--out", str(frame)])
                if not frame.exists():
                    print(f"WARN sips failed on pdf: {f.name}", file=sys.stderr)
                    continue
                vision_src = frame
            elif ext in (".png", ".gif", ".jpg", ".jpeg", ".webp"):
                vision_src = f
            else:
                print(f"SKIP unknown ext: {f.name}", file=sys.stderr)
                continue

            # Resize to at most 1568px on the long edge for the vision tool.
            small = SMALL / f"{base}.jpg"
            if not small.exists():
                run(["sips", "-Z", "1568", "-s", "format", "jpeg",
                     str(vision_src), "--out", str(small)])
            if not small.exists():
                print(f"WARN resize failed: {f.name}", file=sys.stderr)
                continue
            out.write(f"{small}\t{f.name}\n")

    # split into batches
    for old in WORK.glob("full-batch-*"):
        old.unlink()
    lines = manifest.read_text().splitlines()
    if not lines:
        sys.exit("no files could be prepared for vision")
    n_batches = max(1, (len(lines) + batch_size - 1) // batch_size)
    for i in range(n_batches):
        chunk = lines[i * batch_size:(i + 1) * batch_size]
        (WORK / f"full-batch-{i+1:02d}").write_text("\n".join(chunk) + "\n")
    print(f"prepped {len(lines)} files into {n_batches} batches in {WORK}")
    print(f"\nDispatch {n_batches} Haiku subagents (one per batch).")
    print(f"After all desc-full-NN.tsv files exist, run: pipeline.py plan --src '{src}'")
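

# --- Illustrative sketch, not part of the original pipeline -------------------
# The batch split at the end of `prep` is a ceiling division followed by
# slicing. `split_batches` is a hypothetical helper name, shown only to make
# the arithmetic easy to check by hand; nothing in this script calls it.
def split_batches(lines: list, batch_size: int) -> list:
    # Ceiling division without floats, with a floor of one batch.
    n_batches = max(1, (len(lines) + batch_size - 1) // batch_size)
    return [lines[i * batch_size:(i + 1) * batch_size] for i in range(n_batches)]
# With the default batch size of 19, 189 manifest lines give 10 batches, while
# 196 lines would give 11, the last one holding 6 entries.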


# ---------- plan ----------
def plan(src: Path, prefix: str, max_words: int) -> None:
    if not src.is_dir():
        sys.exit(f"source not a directory: {src}")
    descs = sorted(WORK.glob("desc-full-*.tsv"))
    if not descs:
        sys.exit(f"no desc-full-*.tsv files found in {WORK}")
    all_lines = []
    for p in descs:
        all_lines.extend(p.read_text().splitlines())
    print(f"aggregated {len(all_lines)} description lines from {len(descs)} batches")

    existing = set(os.listdir(src))
    plan_rows = []
    errors = []
    seen = {}  # new name -> source name, for plan-internal collision checks
    for lineno, line in enumerate(all_lines, 1):
        line = line.rstrip()
        if not line:
            continue
        parts = line.split("\t", 1)
        if len(parts) != 2:
            errors.append(f"L{lineno}: bad split: {line!r}")
            continue
        orig_claimed, desc = parts
        if not orig_claimed.startswith(prefix + " "):
            errors.append(f"L{lineno}: prefix: {orig_claimed!r}")
            continue

        # Find the actual file: Haiku occasionally returns .jpg instead of .png
        orig = orig_claimed
        if orig not in existing:
            base = os.path.splitext(orig_claimed)[0]
            for ext in (".png", ".gif", ".mp4", ".pdf", ".jpg", ".jpeg", ".webp"):
                cand = base + ext
                if cand in existing:
                    orig = cand
                    break
            else:
                # No candidate extension matched a real file.
                errors.append(f"L{lineno}: source not found: {orig_claimed!r}")
                continue

        words = desc.split()
        if len(words) < 6:
            errors.append(f"L{lineno}: <6 words: {orig!r} -> {desc!r}")
            continue
        words = words[:max_words]
        # Strip everything that is not alphanumeric; drop words left empty.
        cleaned = []
        for w in words:
            cw = "".join(c for c in w if c.isalnum())
            if cw:
                cleaned.append(cw)
        if len(cleaned) < 6:
            errors.append(f"L{lineno}: <6 after sanitize: {desc!r}")
            continue
        cleaned = cleaned[:max_words]
        titled = title_case(" ".join(cleaned))

        rest = orig[len(prefix) + 1:]  # everything after "Prefix "
        new = f"{prefix} - {titled} - {rest}"
        if new == orig:
            errors.append(f"L{lineno}: same: {orig!r}")
            continue
        if new in existing:
            errors.append(f"L{lineno}: target exists in DEST: {new!r}")
            continue
        if new in seen:
            errors.append(f"L{lineno}: plan collision: {new!r} from {orig!r} and {seen[new]!r}")
            continue
        seen[new] = orig
        plan_rows.append((orig, new))

    print(f"plan: {len(plan_rows)} renames, {len(errors)} errors")
    if errors:
        print("\nERRORS:")
        for e in errors[:30]:
            print(f" {e}")
        if len(errors) > 30:
            print(f" ... and {len(errors) - 30} more")

    plan_path = WORK / "plan-full.tsv"
    with plan_path.open("w") as f:
        for orig, new in plan_rows:
            f.write(f"{orig}\t{new}\n")
    print(f"\nplan saved: {plan_path}")

    step = max(1, len(plan_rows) // 6)
    print(f"sample (every {step}th row):")
    for i in range(0, len(plan_rows), step):
        orig, new = plan_rows[i]
        print(f" {orig}\n{new}\n")
    print(f"if plan looks good: pipeline.py execute --src '{src}'")
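

# --- Illustrative sketch, not part of the original pipeline -------------------
# The alternate-extension lookup that `plan` performs inline, isolated as a pure
# function so it can be tested without touching disk. `resolve_source` and
# `CANDIDATE_EXTS` are hypothetical names; the sketch assumes the claimed name
# always carries some extension, and it splits on the last dot.
CANDIDATE_EXTS = (".png", ".gif", ".mp4", ".pdf", ".jpg", ".jpeg", ".webp")


def resolve_source(claimed: str, existing: set):
    # Return the real filename behind `claimed`, or None when nothing matches.
    if claimed in existing:
        return claimed
    base, dot, _ = claimed.rpartition(".")
    if not dot:
        base = claimed
    for ext in CANDIDATE_EXTS:
        if base + ext in existing:
            return base + ext
    return None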


# ---------- execute ----------
def execute(src: Path) -> None:
    if not src.is_dir():
        sys.exit(f"source not a directory: {src}")
    plan_path = WORK / "plan-full.tsv"
    if not plan_path.exists():
        sys.exit(f"no plan: {plan_path} (run `pipeline.py plan` first)")

    before = len(os.listdir(src))
    ok = 0
    fail = 0
    fails = []
    with plan_path.open() as f:
        for line in f:
            line = line.rstrip()
            if not line:
                continue
            orig, new = line.split("\t", 1)
            srcp = src / orig
            dstp = src / new
            if not srcp.exists():
                fails.append(f"src missing: {orig}")
                fail += 1
                continue
            # os.rename would replace an existing target silently, so refuse.
            if dstp.exists():
                fails.append(f"target exists: {new}")
                fail += 1
                continue
            try:
                os.rename(srcp, dstp)
                if dstp.exists() and not srcp.exists():
                    ok += 1
                else:
                    fails.append(f"post-check failed: {orig}")
                    fail += 1
            except OSError as e:
                fails.append(f"rename error {orig}: {e}")
                fail += 1

    after = len(os.listdir(src))
    print(f"ok={ok} fail={fail} before={before} after={after}")
    # Log failures first so the record survives even if the audit below exits.
    if fails:
        fails_path = WORK / "rename-fails.txt"
        fails_path.write_text("\n".join(fails))
        print(f"failures logged: {fails_path}")
        for x in fails[:5]:
            print(f" {x}")
    if before != after:
        print("⚠ FILE COUNT CHANGED: investigate immediately")
        sys.exit(2)
    print("file count unchanged ✓")
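

# --- Illustrative sketch, not part of the original pipeline -------------------
# The two safety nets used by `execute`, reduced to their smallest form: refuse
# to rename onto an existing target, and compare directory sizes before and
# after. `safe_rename` and `audited` are hypothetical names, and the sketch
# assumes nothing else writes to the directory while it runs.
import os  # already imported at the top of this file; repeated so the sketch stands alone


def safe_rename(src_path, dst_path) -> bool:
    # os.rename replaces an existing target silently on POSIX, so check first.
    if not os.path.exists(src_path) or os.path.exists(dst_path):
        return False
    os.rename(src_path, dst_path)
    return os.path.exists(dst_path) and not os.path.exists(src_path)


def audited(directory, renames) -> tuple:
    # Apply (old_name, new_name) pairs; return (ok_count, counts_match).
    before = len(os.listdir(directory))
    ok = sum(safe_rename(os.path.join(directory, a), os.path.join(directory, b))
             for a, b in renames)
    return ok, before == len(os.listdir(directory))
# Two plan rows aimed at the same target yield one rename and one refusal, and
# the directory keeps the same number of entries.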


# ---------- main ----------
def main() -> None:
    p = argparse.ArgumentParser(description=__doc__)
    sub = p.add_subparsers(dest="cmd", required=True)

    p_prep = sub.add_parser("prep", help="extract frames, resize, build batches")
    p_prep.add_argument("--src", type=Path, required=True)
    p_prep.add_argument("--batch-size", type=int, default=19)
    p_prep.add_argument("--prefix", default="CleanShot",
                        help="filename prefix to match (default CleanShot)")

    p_plan = sub.add_parser("plan", help="build & validate rename plan")
    p_plan.add_argument("--src", type=Path, required=True)
    p_plan.add_argument("--prefix", default="CleanShot")
    p_plan.add_argument("--max-words", type=int, default=8)

    p_exec = sub.add_parser("execute", help="apply rename plan with safety checks")
    p_exec.add_argument("--src", type=Path, required=True)

    args = p.parse_args()
    if args.cmd == "prep":
        prep(args.src, args.batch_size, args.prefix)
    elif args.cmd == "plan":
        plan(args.src, args.prefix, args.max_words)
    elif args.cmd == "execute":
        execute(args.src)


if __name__ == "__main__":
    main()