Convertible covers: from paper catalogs to vision-backed PDPs

A niche automotive inventory — convertible tops, soft tops, interior soft goods — lives in manufacturer catalogs. Each part connects to specific makes, models, and years. Without the catalog, you don't know what you have. The client was manually looking up every item in PDF after PDF, searching cross-reference databases, calling distributors for current pricing. That doesn't scale to hundreds of SKUs. That's the problem we solved.

The Problem: Manual Catalog Dependency

The client runs convertible cover inventory — soft tops for classic cars, marine convertible tops, automotive upholstery, and related soft goods. Their inventory spans dozens of manufacturers across decades of production. The products have fitment that varies by make, model, and year — a single convertible top might fit twelve different vehicles, depending on the specific year range and body style.

Every item in the warehouse connects to manufacturer catalog data: part numbers, fitment specifications, material composition, current market pricing. Without the catalog reference, an item is just "a convertible top, maybe for a 1967 Mustang, I think." With the catalog, it becomes: "Haartz convertible top, Haartz 3299 black vinyl, fits 1965-1968 Ford Mustang, current wholesale $340."

The client was looking up items manually: opening manufacturer PDFs (Haartz, Action, Robbins, and others), searching part databases, and calling distributors for current pricing. Twenty items an hour at best. With hundreds of SKUs, this took days. Pricing went stale between catalog updates. Errors accumulated: wrong fitment on listings, mispriced items, incorrect cross-references.

This is fundamentally an inventory operations problem, not an e-commerce problem. The bottleneck isn't selling; it's knowing what you have and what it's worth.

Why Fitment Complexity Makes This Worse

Convertible covers have fitment complexity that generic e-commerce doesn't handle:

Model-specific fitment: One part might fit twelve different vehicles. The SKU alone doesn't encode this. You need the manufacturer catalog to map: part number → make → model → year range → body style. Manual lookup can handle one item; it can't handle hundreds of queries in parallel.

Material variants: The same cover comes in different materials: vinyl, canvas, cloth, Haartz Staybond, Haartz Ultralux. Each material has different specifications, pricing tiers, and inventory sourcing. The SKU "3299" could mean five different materials with five different prices.

Cross-references: Manufacturer part numbers don't map one-to-one to inventory SKUs. One inventory SKU can correspond to multiple manufacturer part numbers across different product lines. Cross-reference tables are required, and they're maintained by the manufacturers, not the client.

Rapid catalog updates: Manufacturers update pricing quarterly. Material costs shift. New colors enter production. Old patterns are discontinued. With manual lookup, data goes stale between one lookup and the next.

You can't build a catalog on human memory. You can't build it on manual lookups every time you need a price. The system needs to own the catalog — structured, queryable, and tied to the inventory at runtime.
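The part number → make → model → year range → body style mapping described above can be sketched as a small data structure. This is a minimal illustration, not the production model: `FitmentRange`, `expand_fitment`, `fits`, and the `"convertible"` body-style value are hypothetical names chosen here, and the sample values reuse the Mustang example from earlier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FitmentRange:
    """One catalog fitment row: a part fits a make/model over a span of years."""
    part_number: str
    make: str
    model: str
    year_start: int
    year_end: int  # inclusive
    body_style: str


def expand_fitment(row: FitmentRange) -> list[str]:
    """Expand a year range into one 'YEAR MAKE MODEL' string per model year."""
    return [
        f"{year} {row.make} {row.model}"
        for year in range(row.year_start, row.year_end + 1)
    ]


def fits(row: FitmentRange, make: str, model: str, year: int) -> bool:
    """True if the fitment row covers the given vehicle (case-insensitive)."""
    return (
        row.make.lower() == make.lower()
        and row.model.lower() == model.lower()
        and row.year_start <= year <= row.year_end
    )


# Sample values mirror the Mustang example used earlier in this write-up.
mustang_top = FitmentRange("3299", "Ford", "Mustang", 1965, 1968, "convertible")
vehicles = expand_fitment(mustang_top)
```

Storing the range once and expanding it on demand keeps a part that fits twelve vehicles as a handful of rows instead of twelve hand-typed strings.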

The Solution: Vision-Backed Product System

We built a three-layer system to address catalog dependency:

Layer 1: Catalog Corpus Ingestion

The system ingests manufacturer catalog data from multiple sources:

PDF catalogs: We parse manufacturer PDFs (Haartz product guides, Robbins catalogs, distributor spreadsheets) and extract part numbers, fitment tables, material specifications, wholesale pricing, and MSRP. Parsing turns data that previously could only be searched inside a PDF into structured records.

Excel and CSV cross-reference files: Manufacturer-provided mappings of part numbers to cross-reference numbers, material codes to descriptions, fitment ranges to model lists. These integrate into the corpus.

Web data: Published specifications from manufacturer websites. Web scraping extracts current data for parts that haven't made it into printed catalogs.
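As a rough sketch of how a cross-reference file like the ones described above could be loaded, the snippet below groups manufacturer part numbers under each inventory SKU using Python's standard `csv` module. The column names (`sku`, `manufacturer`, `part_number`), the SKU values, and every part number other than 3299 are made up for illustration; real manufacturer exports will use their own headers.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export layout and values, for illustration only.
SAMPLE_CSV = """sku,manufacturer,part_number
TOP-0001,Haartz,3299
TOP-0001,Robbins,EXAMPLE-1234
TOP-0002,Haartz,EXAMPLE-3301
TOP-0001,Haartz,3299
"""


def load_cross_references(handle):
    """Group (manufacturer, part_number) pairs under each inventory SKU."""
    xref = defaultdict(list)
    for row in csv.DictReader(handle):
        sku = row["sku"].strip()
        pair = (row["manufacturer"].strip(), row["part_number"].strip())
        if pair not in xref[sku]:  # skip verbatim duplicate rows
            xref[sku].append(pair)
    return dict(xref)


xref = load_cross_references(io.StringIO(SAMPLE_CSV))
```

Here `xref["TOP-0001"]` holds two manufacturer numbers for one SKU, which is the one-to-many relationship that makes a lookup table necessary.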

Every source normalizes to a common schema:

{
  manufacturer: "Haartz",
  part_number: "3299",
  material: "vinyl",
  color: "black",
  fitment: ["1965 Ford Mustang", "1966 Ford Mustang", "1967 Ford Mustang", "1968 Ford Mustang"],
  wholesale_price: 340.00,
  msrp: 595.00,
  active: true
}
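One way to picture the normalization step is a typed record plus a per-source adapter. The field names below come from the schema above. The raw column headers (`Part #`, `Years`, `Vehicle`, and so on) and the helper names are assumptions for this sketch, since each real source has its own layout.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogRecord:
    """Common schema that every ingestion source is normalized into."""
    manufacturer: str
    part_number: str
    material: str
    color: str
    fitment: list[str] = field(default_factory=list)
    wholesale_price: float = 0.0
    msrp: float = 0.0
    active: bool = True


def parse_price(raw: str) -> float:
    """Turn '$1,340.00' style strings into a float."""
    return float(raw.replace("$", "").replace(",", "").strip())


def normalize_row(manufacturer: str, raw: dict[str, str]) -> CatalogRecord:
    """Adapter for one hypothetical source layout (headers are assumed)."""
    first, last = (int(y) for y in raw["Years"].split("-"))
    vehicle = raw["Vehicle"].strip()
    return CatalogRecord(
        manufacturer=manufacturer,
        part_number=raw["Part #"].strip(),
        material=raw["Material"].strip().lower(),
        color=raw["Color"].strip().lower(),
        fitment=[f"{year} {vehicle}" for year in range(first, last + 1)],
        wholesale_price=parse_price(raw["Wholesale"]),
        msrp=parse_price(raw["MSRP"]),
    )


record = normalize_row("Haartz", {
    "Part #": " 3299 ", "Material": "Vinyl", "Color": "Black",
    "Years": "1965-1968", "Vehicle": "Ford Mustang",
    "Wholesale": "$340.00", "MSRP": "$595.00",
})
```

Prices are floats here to match the schema as written. A system that does arithmetic on prices might prefer `decimal.Decimal`.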

The corpus isn't static. When manufacturers release updated PDFs, CSV exports, or web changes, the system re-ingests and recalculates. Pricing stays current.
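Re-ingestion can be thought of as an upsert that also reports what changed. The sketch below is one possible shape, not the system's actual logic. The composite key, the report categories, and the rule that a record missing from a manufacturer's new release is marked inactive are all assumptions, and the 355.00 price is invented for the demo.

```python
def record_key(rec):
    """Assumed identity of a catalog record: part number plus variant."""
    return (rec["manufacturer"], rec["part_number"], rec["material"], rec["color"])


def reingest(corpus, incoming):
    """Upsert incoming records into corpus; report added, repriced, retired keys."""
    report = {"added": [], "repriced": [], "retired": []}
    seen = set()
    for rec in incoming:
        key = record_key(rec)
        seen.add(key)
        old = corpus.get(key)
        if old is None:
            report["added"].append(key)
        elif (old["wholesale_price"], old["msrp"]) != (rec["wholesale_price"], rec["msrp"]):
            report["repriced"].append(key)
        corpus[key] = {**rec, "active": True}
    # Assumption: a record absent from its manufacturer's new release is discontinued.
    manufacturers = {key[0] for key in seen}
    for key, rec in corpus.items():
        if key[0] in manufacturers and key not in seen and rec["active"]:
            rec["active"] = False
            report["retired"].append(key)
    return report


base = {"manufacturer": "Haartz", "part_number": "3299", "material": "vinyl",
        "wholesale_price": 340.00, "msrp": 595.00}
corpus = {}
first = reingest(corpus, [{**base, "color": "black"}, {**base, "color": "tan"}])
# Second release: black is repriced (illustrative number), tan is no longer listed.
second = reingest(corpus, [{**base, "color": "black", "wholesale_price": 355.00}])
```

Returning a change report makes it easy to surface repriced or retired parts to whoever reviews the update.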

Layer 2: Vision Classification Pipeline

Warehouse intake uses computer vision instead of manual text entry:

Mobile photo capture: Warehouse workers photograph items at intake using mobile devices. Multiple angles, detail shots of materials and construction. The intake process is photograph-first, text-entry-second.

Classification model: A computer vision model analyzes each photo:

- Material type classification: vinyl, canvas, cloth, leather, specialty composite
- Color family detection: black, tan, beige, burgundy, custom
- Cover type identification: full top, window glass replacement, trunk mat, tonneau cover

The model learned from the client's own inventory photos — domain-specific training, not generic ImageNet weights. We trained on two hundred labeled images and refined with corrections.

Catalog matching: Each classified item is matched against the corpus to propose a manufacturer, part number, fitment, and pricing. If the match confidence is above the threshold, the proposal is auto-approved. If it is below, the item queues for human review.

The vision model doesn't replace judgment — it accelerates intake by proposing matches that humans verify.
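To make the catalog-matching step concrete, here is a simplified sketch of matching a classification against corpus records. Several things are assumptions for illustration: the `Classification` shape, a `cover_type` field on corpus records (the schema shown earlier does not include one), the placeholder part number `EXAMPLE-2`, and above all the confidence heuristic, which simply splits the classifier's score across all records that share the detected attributes.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    material: str
    color: str
    cover_type: str
    confidence: float  # hypothetical: the model's lowest per-attribute score, 0..1


def propose_matches(cls, corpus):
    """Return (record, confidence) pairs for active records sharing the attributes."""
    candidates = [
        rec for rec in corpus
        if rec["active"]
        and rec["material"] == cls.material
        and rec["color"] == cls.color
        and rec.get("cover_type") == cls.cover_type
    ]
    if not candidates:
        return []
    # Assumed heuristic: attributes alone cannot separate records that share them,
    # so the classifier's confidence is divided evenly across the candidates.
    share = cls.confidence / len(candidates)
    return [(rec, share) for rec in candidates]


corpus = [
    {"part_number": "3299", "material": "vinyl", "color": "black",
     "cover_type": "full top", "active": True},
    {"part_number": "EXAMPLE-2", "material": "vinyl", "color": "black",
     "cover_type": "full top", "active": True},
    {"part_number": "3299", "material": "canvas", "color": "tan",
     "cover_type": "full top", "active": True},
]
unique = propose_matches(Classification("canvas", "tan", "full top", 0.92), corpus)
ambiguous = propose_matches(Classification("vinyl", "black", "full top", 0.92), corpus)
```

Under this heuristic an ambiguous match never clears a high threshold on its own, so it falls through to human review.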

Layer 3: Query API

The product system exposes a query API:

Search: "Find all Haartz vinyl tops for 1965-1968 Mustang." The system queries corpus + inventory, returns matches with pricing.

Price lookup: "Current wholesale for part 3299 in tan canvas." The system returns current wholesale and MSRP, sourced from corpus.

Inventory alignment: Match inventory records to catalog data — flag cross-references that need verification, identify items with stale pricing, highlight items with no catalog match.
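The three query types above could look roughly like this over an in-memory corpus. This is a sketch only: the function names, the flag labels, the inventory row shape, and the tan canvas prices are invented for illustration, and the real API surface is not described in this write-up.

```python
def _fits(rec, vehicle, years):
    """True if any fitment entry names the vehicle within the optional year span."""
    for entry in rec["fitment"]:
        year, _, name = entry.partition(" ")
        if name == vehicle and (years is None or years[0] <= int(year) <= years[1]):
            return True
    return False


def search(corpus, manufacturer=None, material=None, vehicle=None, years=None):
    """Filter catalog records; `years` is an inclusive (first, last) tuple."""
    results = []
    for rec in corpus:
        if manufacturer and rec["manufacturer"] != manufacturer:
            continue
        if material and rec["material"] != material:
            continue
        if vehicle and not _fits(rec, vehicle, years):
            continue
        results.append(rec)
    return results


def price_lookup(corpus, part_number, material, color):
    """Return current wholesale and MSRP for one variant, or None if unknown."""
    for rec in corpus:
        if (rec["part_number"], rec["material"], rec["color"]) == (part_number, material, color):
            return {"wholesale_price": rec["wholesale_price"], "msrp": rec["msrp"]}
    return None


def align_inventory(inventory, corpus):
    """Flag inventory rows with no catalog match or a price that differs from the corpus."""
    flags = []
    for item in inventory:
        price = price_lookup(corpus, item["part_number"], item["material"], item["color"])
        if price is None:
            flags.append((item["sku"], "no_catalog_match"))
        elif item["wholesale_price"] != price["wholesale_price"]:
            flags.append((item["sku"], "stale_pricing"))
    return flags


MUSTANGS = [f"{y} Ford Mustang" for y in range(1965, 1969)]
CORPUS = [
    {"manufacturer": "Haartz", "part_number": "3299", "material": "vinyl", "color": "black",
     "fitment": MUSTANGS, "wholesale_price": 340.00, "msrp": 595.00},
    # Prices on this record are placeholders, not catalog data.
    {"manufacturer": "Haartz", "part_number": "3299", "material": "canvas", "color": "tan",
     "fitment": MUSTANGS, "wholesale_price": 410.00, "msrp": 720.00},
]
hits = search(CORPUS, manufacturer="Haartz", material="vinyl",
              vehicle="Ford Mustang", years=(1965, 1968))
```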

The client workflow shifted from opening a PDF, searching it, and calling a distributor to querying the system in seconds. Same information, queried programmatically.

Architectural Details

The three layers connect through event-driven architecture:

[Photo Intake] → [Vision API] → [Corpus Match] → [Query API]
                                      ↓
                             [Inventory Records]

When a photo enters:

1. Vision API classifies material, color, cover type
2. Classification queries corpus for matching records
3. System proposes manufacturer, part number, pricing
4. If confidence above 85%, auto-approve to inventory
5. If below 85%, queue for human review at the intake station
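The intake steps above can be sketched as a single handler that takes the classifier and the matcher as injected callables. The names `handle_photo` and `IntakeQueues` are hypothetical, and the lists stand in for whatever event bus or queue the real system uses. One more assumption: the text does not say what happens at exactly 85%, so this sketch sends that case to review.

```python
from dataclasses import dataclass, field

AUTO_APPROVE_THRESHOLD = 0.85  # value from the text; tunable


@dataclass
class IntakeQueues:
    """In-memory stand-ins for the inventory store and the human review queue."""
    inventory: list = field(default_factory=list)
    review: list = field(default_factory=list)


def handle_photo(photo_id, classify, match, queues, threshold=AUTO_APPROVE_THRESHOLD):
    """Run the intake steps for one photo and return the routing decision."""
    classification = classify(photo_id)            # classify material, color, cover type
    proposal, confidence = match(classification)   # query corpus, propose a match
    entry = {"photo_id": photo_id, "proposal": proposal, "confidence": confidence}
    if proposal is not None and confidence > threshold:
        queues.inventory.append(entry)             # auto-approve to inventory
        return "auto_approved"
    queues.review.append(entry)                    # queue for human review
    return "needs_review"


queues = IntakeQueues()
fake_classify = lambda photo_id: {"material": "vinyl", "color": "black"}
r1 = handle_photo("p1", fake_classify, lambda c: ({"part_number": "3299"}, 0.92), queues)
r2 = handle_photo("p2", fake_classify, lambda c: ({"part_number": "3299"}, 0.85), queues)
r3 = handle_photo("p3", fake_classify, lambda c: (None, 0.0), queues)
```

Injecting `classify` and `match` keeps the routing logic testable without a vision model or a populated corpus.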

The confidence threshold is tunable — we started conservative at 90%, dropped to 85% after validating initial accuracy. The system learns from corrections at the review step.
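One way to validate a threshold change like the move from 90% to 85% is to replay human-verified proposals against several candidate thresholds and compare how many items would be auto-approved with how often those approvals were correct. The function below is a sketch of that idea, and the sample outcomes are invented for illustration; they are not the client's data.

```python
def threshold_report(outcomes, thresholds=(0.80, 0.85, 0.90, 0.95)):
    """outcomes: (confidence, was_correct) pairs from human-verified proposals."""
    rows = []
    for t in thresholds:
        approved = [ok for conf, ok in outcomes if conf > t]
        rows.append({
            "threshold": t,
            # Share of all items that would skip human review at this threshold.
            "auto_approve_rate": len(approved) / len(outcomes) if outcomes else 0.0,
            # Share of auto-approved items whose proposal was actually correct.
            "precision": sum(approved) / len(approved) if approved else None,
        })
    return rows


# Invented sample data, for illustration only.
sample = [(0.97, True), (0.93, True), (0.91, True), (0.88, True),
          (0.86, True), (0.84, False), (0.82, True), (0.60, False)]
report = threshold_report(sample)
```

In this made-up sample, lowering the threshold from 0.90 to 0.85 raises the auto-approve rate without admitting any incorrect proposals, while 0.80 would start letting errors through.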

What We Delivered

Three outcomes:

Structured catalog intelligence: Every active catalog is normalized, searchable, and programmatically accessible. No more PDF hunting. No more stale pricing. The corpus owns the truth.

Accelerated intake: Vision classification proposes catalog matches at intake. Human verification replaces manual entry. Throughput went from twenty items an hour to two hundred. The bottleneck moved to photography, not lookup.

Fresh pricing: Catalog updates propagate to the query system automatically. When Haartz releases a new price list, re-ingestion takes minutes. No more stale pricing from quarterly PDF reprints.

The client didn't get a generic e-commerce site. They got structured product intelligence that their existing sales workflow uses — not a new sales channel, a new operations capability.

Why This Worked

Three structural reasons:

The problem was catalog lookup, not e-commerce: The client needed to answer questions about inventory: what is this, what fits, what's it worth. That's product intelligence, not shopping cart functionality. Building a catalog was the right thing to build; a storefront would have solved the wrong problem.

Vision matched the input type: Warehouse intake is visual. People photograph parts already — that workflow existed. Photo-first classification with verification was faster than text-entry forms.

Custom fit the domain: Fitment complexity doesn't fit generic e-commerce data models. A custom repository with corpus alignment was cleaner than forcing a square product into a round cart. The SKU + catalog structure is specific to this industry; general-purpose tools couldn't model it well.

What We'd Do Differently

Two improvements:

Corpus first, vision model second: We built the vision model before the corpus was complete. We should have prioritized corpus ingestion, because the model is more useful with richer cross-references to match against.

More ingestion sources earlier: We started with two manufacturer catalogs, expanded to nine. More sources earlier would have improved matching rate from the start. Plan for broader corpus expansion from day one.

Closing

The convertible covers engagement wasn't an e-commerce site build. It was a product intelligence build — catalog corpus ingestion plus vision classification to make inventory queryable.

This pattern transfers to other niche retail where manufacturer catalogs define product truth: automotive soft goods, marine parts, industrial components. If your inventory depends on catalogs, and manual lookup is the bottleneck, that's the problem to solve. The solution is product intelligence, not a shopping cart.
