Commit Graph

36 Commits

Author SHA1 Message Date
Sean McElwain da538a99ee Derive candidate fields from vision crop OCR 2026-05-30 21:03:41 -05:00
Sean McElwain 5947cc0fe0 Run Tesseract OCR on unmatched vision region crops 2026-05-30 20:46:43 -05:00
Sean McElwain 05c9b6964a Classify and crop unmatched vision regions 2026-05-30 19:31:10 -05:00
Sean McElwain f3e61e877b Normalize and score vision regions against OCR layout 2026-05-30 19:26:07 -05:00
Sean McElwain 70e82123a2 Add OpenCV coarse region detection for vision analysis 2026-05-30 18:33:17 -05:00
Sean McElwain e6ab2f9903 Render PDF pages for vision analysis input 2026-05-30 17:56:03 -05:00
Sean McElwain 3aa2c78ac3 Wire vision candidate generation through backend module 2026-05-30 16:11:30 -05:00
Sean McElwain 7649f48890 Remove scan-backed replica and standardize replica output naming 2026-05-28 21:05:59 -05:00
Sean McElwain 474ab010fe Unify OCR and layout review canonical state 2026-05-25 18:03:55 -05:00
Sean McElwain b27332e530 Remove temporary replica render debug logging 2026-05-25 16:57:24 -05:00
Sean McElwain ea314d733d Fix layout review selected-word style persistence 2026-05-25 16:55:27 -05:00
Sean McElwain 01e081d45a Improve replica text fitting and baseline alignment 2026-05-24 21:36:30 -05:00
Sean McElwain 9fcef4cacd Fix layout review save flow and respect word font sizes in replica rendering 2026-05-24 14:28:26 -05:00
Sean McElwain d24e144490 Use word-level layout data for clean replica rendering 2026-05-11 14:05:27 -05:00
Sean McElwain afd5aaef8c Fix replica output source to use reviewed OCR layout data 2026-05-11 09:50:22 -05:00
Sean McElwain 658240c031 Implement bidirectional OCR and layout review sync 2026-05-10 15:06:19 -05:00
Sean McElwain 8c6e862799 Add debug overlay replica preview and save mode 2026-05-09 15:45:22 -05:00
Sean McElwain 9c0b473ec6 Add layout OCR pipeline and word-level scan-backed overlay 2026-05-09 15:12:39 -05:00
Sean McElwain d292b2d00d Add layout OCR pipeline and clean replica generation 2026-05-09 15:06:20 -05:00
Sean McElwain 610f25c2b8 update app 2026-05-07 13:02:38 -05:00
Sean McElwain 0617ab58c4 refactor: make save-pdf update canonical file without creating artifact versions 2026-04-28 22:35:56 -05:00
Sean McElwain 2521ebd503 feat: Phase 4.4 line item review workflow
- added queue tab to line items page
- added reviewed/approved/excluded/na review toggles
- made reviewed items leave the queue
- added line item source restore controls
- added regenerate line items from OCR
2026-04-18 12:21:36 -05:00
Sean McElwain fcce99a091 feat: Phase 4.3 queue + line item polish
- migrated line item queue to generic document line items
- added detected-count line item rows with add-row
- restored rerun OCR in OCR review tab
- improved line item dates and title case
2026-04-17 20:26:53 -05:00
Sean McElwain 1e37a80894 feat: add file size tracking and formatting
- store file_size_bytes on versions
- add human_size() utility
- display normalized sizes in UI

enables size vs fidelity analysis
2026-04-11 19:28:48 -05:00
Sean McElwain f26f7ddc03 feat: embed PDF metadata for saved versions and validate available versions by metadata match 2026-04-11 17:51:41 -05:00
Sean McElwain c9fdf953e7 feat: unified save flow, path override UI, mirror sync, and automatic pruning of old document versions 2026-04-11 15:11:59 -05:00
Sean McElwain 1cf42242f7 feat(storage): canonical owner-based document paths and overwrite-safe PDF saves
- derive storage path from owner_primary (fallback default)
- route OCR-corrected and field-enriched outputs to canonical records path
- support explicit output_path for save flows
- prevent SameFileError when overwriting canonical file
- keep version history while using stable canonical file path
2026-04-11 09:14:22 -05:00
Sean McElwain 9ebaa6f99e refactor: use time-based hex document ids for new ingests 2026-04-07 13:51:47 -05:00
Sean McElwain ba710db9fa feat: add additional fields, presets, and ownership model (primary/secondary) 2026-04-07 11:25:03 -05:00
Sean McElwain c7dab22f16 feat: improve receipt extraction with reference number and line item pairing 2026-04-06 14:51:05 -05:00
Sean McElwain 87ce5cc2fb fix: OCR editor gutter scroll sync + synthetic layout on ingest + CSS consolidation 2026-04-06 12:07:26 -05:00
Sean McElwain 431372438e feat: finalized Phase 3.5/4.1 workflow and added queue-trash plus UX-1 shell 2026-04-05 12:05:19 -05:00
Sean McElwain 5b9e8f0d01 feat: Phase 4.1 add manual-first extracted fields from reviewed OCR 2026-04-03 16:47:38 -05:00
Sean McElwain bdaff6f781 feat: Phase 3.5 add line-preserving OCR review and corrected searchable PDF generation 2026-04-03 15:07:51 -05:00
Sean McElwain e67a67f80a feat: Phase 3.5 add line-preserving OCR review and corrected searchable PDF generation 2026-04-03 11:56:23 -05:00
Sean McElwain 0d70e6b7bb feat: Phase 3.5 add inbox/upload/server ingest, OCR rerun, and text version tracking 2026-04-03 08:38:13 -05:00
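
Commit 1e37a80894 above mentions a human_size() utility for displaying normalized file sizes, but this log does not show how it is implemented. The following is only a minimal sketch of what such a helper might look like. The signature, the 1024-based units, and the one-decimal formatting are all assumptions, not the repository's actual code.

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count as a short human-readable string.

    Hypothetical sketch: binary (1024-based) steps and one decimal
    place are assumptions, not taken from the repository.
    """
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")

    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if size < 1024 or unit == units[-1]:
            if unit == "B":
                return f"{int(size)} B"
            return f"{size:.1f} {unit}"
        size /= 1024
```

A value such as the file_size_bytes stored on a version (per the same commit) could be passed straight to this helper before display.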