DealForge AI is an agentic deal intelligence platform that automatically scrapes business-for-sale and CRE listings, enriches incomplete data using Census and industry benchmarks, runs full SBA financing calculations, scores every deal 0–100, and notifies matched investors via SMS — all in under 60 seconds.
The core problem: most business-for-sale listings are incomplete. An estimated 60–70% are missing cash flow, SDE, or other key financial data. Every competitor either skips these listings or forces the user to do manual research. DealForge fills the gaps automatically using a layered intelligence approach, turning "incomplete listing" into "estimated deal with confidence range."
DealForge's 4-tier enrichment engine is what separates it from BizBuySell saved searches, Kumo, and every other listing aggregator. It creates value on the 60–70% of listings that competitors ignore entirely.
How It Works
This is a Karpathy-pattern knowledge base: raw data is collected, then compiled by an LLM into structured articles. The system uses a multi-agent architecture where specialized agents handle discovery, parsing, enrichment, scoring, matching, and notification. The wiki is the source of truth for knowledge; the database is for fast queries.
Market Opportunity
- SBA 7(a) lending hit a record $45B across ~85,000 loans in FY2025, with acquisition-specific volume up ~35% YTD
- 80%+ of sub-$5M buyers use SBA financing — if a deal doesn't work under SBA terms, it doesn't work for most buyers
- 9,500+ completed transactions in the BizBuySell dataset form the foundation of industry benchmarks
- Median sale price rising from $337,750 overall to $375,000 in Q4 2025
- Search fund / ETA community growing rapidly — more first-time acquirers entering the market than ever
DealForge is 3–10x cheaper than the closest competitor for comparable functionality. BizBuySell saved searches are free but offer no analysis. Kumo charges $149–$200/mo. DealForge Tier 1 delivers full analysis, enrichment, and SMS alerts for $15/mo.
The DealForge Score answers one question: "Is this deal worth my time?" It combines seven sub-scores into a single number that accounts for both financial fundamentals and data reliability. A deal scored 85 with HIGH confidence is fundamentally different from 85 with LOW confidence — both numbers are always shown together.
Composite Formula
DealForge Score = (
DSCR_Score * 0.25 + # Can it service debt?
Multiple_Delta * 0.20 + # Is it priced fairly vs industry?
Margin_Delta * 0.15 + # Is it profitable vs industry?
Payback_Score * 0.10 + # How fast do you recover equity?
Cash_After_Debt * 0.10 + # Real take-home after loan
Industry_Risk * 0.10 + # Sector default rate / stability
Data_Confidence * 0.10 # How much do we trust the numbers?
) * 100
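As a minimal sketch, the weighted sum can be written in Python as follows. The function name, the dictionary keys, and the rule that every sub-score is already normalized to 0.0–1.0 are assumptions made for illustration; only the weights come from the formula itself.

```python
# Weights taken from the composite formula; they sum to 1.0.
WEIGHTS = {
    "dscr": 0.25,
    "multiple_delta": 0.20,
    "margin_delta": 0.15,
    "payback": 0.10,
    "cash_after_debt": 0.10,
    "industry_risk": 0.10,
    "data_confidence": 0.10,
}


def dealforge_score(sub_scores: dict) -> float:
    """Combine seven sub-scores (each assumed to be in 0.0-1.0) into a 0-100 score."""
    missing = set(WEIGHTS) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)
    return round(total * 100, 1)


# Hypothetical sub-scores for a single deal.
example = {
    "dscr": 0.8,
    "multiple_delta": 0.7,
    "margin_delta": 0.6,
    "payback": 0.8,
    "cash_after_debt": 0.5,
    "industry_risk": 0.7,
    "data_confidence": 0.7,
}
print(dealforge_score(example))
```

Keeping the weights in one table makes it easy to tune them during beta without touching the scoring logic.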
Sub-Score Definitions
DSCR Score (25% Weight) — Most Important
| DSCR Value | Score | Interpretation |
|---|---|---|
| < 1.00 | 0.0 | Cannot cover debt — deal fails |
| 1.00 | 0.1 | Barely covers — extremely risky |
| 1.25 | 0.3 | SBA minimum threshold |
| 1.50 | 0.5 | Acceptable but thin |
| 2.00 | 0.8 | Good margin of safety |
| ≥ 2.50 | 1.0 | Excellent coverage |
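The table lists scores only at specific DSCR values. One plausible reading, shown here as a sketch, is to interpolate linearly between those breakpoints. The interpolation rule and the function name are assumptions; the breakpoints are from the table.

```python
# (DSCR, score) breakpoints copied from the table above.
DSCR_POINTS = [(1.00, 0.1), (1.25, 0.3), (1.50, 0.5), (2.00, 0.8), (2.50, 1.0)]


def dscr_score(dscr: float) -> float:
    """Map a DSCR value to a 0.0-1.0 sub-score.

    Assumption: values between breakpoints are linearly interpolated.
    """
    if dscr < DSCR_POINTS[0][0]:
        return 0.0  # cannot cover debt
    if dscr >= DSCR_POINTS[-1][0]:
        return 1.0
    for (x0, y0), (x1, y1) in zip(DSCR_POINTS, DSCR_POINTS[1:]):
        if x0 <= dscr <= x1:
            return y0 + (y1 - y0) * (dscr - x0) / (x1 - x0)
    raise AssertionError("unreachable")


for value in (0.9, 1.25, 1.75, 3.0):
    print(value, round(dscr_score(value), 2))
```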
Multiple Delta Score (20% Weight)
Compares the asking price (as a multiple of cash flow) against the industry average. A deal priced 20% or more below average scores 1.0 (great deal), a deal at the average scores 0.7 (fair), one 30% above scores 0.3 (overpriced), and one 60% or more above scores 0.0 (avoid).
Margin Delta Score (15% Weight)
Measures whether the business is more or less profitable than its industry peers. An SDE margin 20% or more above the industry average scores 1.0, a margin at the average scores 0.6, and a margin 50% or more below scores 0.0 (red flag).
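Both delta scores are defined by a handful of anchor points, so a piecewise-linear lookup is one way to sketch them. Two things here are assumptions rather than stated rules: scores between anchors are interpolated, and "above/below average" is measured as a relative difference (deal value divided by industry value, minus one).

```python
def interpolate(x: float, points: list) -> float:
    """Piecewise-linear lookup, clamped to the first and last breakpoints."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise AssertionError("unreachable")


# Anchor points from the text, keyed by fractional difference vs. the industry average.
MULTIPLE_POINTS = [(-0.20, 1.0), (0.0, 0.7), (0.30, 0.3), (0.60, 0.0)]
MARGIN_POINTS = [(-0.50, 0.0), (0.0, 0.6), (0.20, 1.0)]


def multiple_delta_score(deal_multiple: float, industry_multiple: float) -> float:
    delta = deal_multiple / industry_multiple - 1.0  # positive = priced above average
    return interpolate(delta, MULTIPLE_POINTS)


def margin_delta_score(deal_margin: float, industry_margin: float) -> float:
    delta = deal_margin / industry_margin - 1.0  # positive = more profitable than peers
    return interpolate(delta, MARGIN_POINTS)


print(round(multiple_delta_score(2.70, 2.70), 2), round(margin_delta_score(0.20, 0.20), 2))
```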
Payback Score (10% Weight)
| Payback Period | Score | Interpretation |
|---|---|---|
| < 6 months | 1.0 | Excellent — fast equity recovery |
| 6–12 months | 0.8 | Good |
| 12–18 months | 0.5 | Acceptable |
| 18–24 months | 0.3 | Marginal |
| 24–36 months | 0.1 | Slow |
| > 36 months | 0.0 | Capital tied up too long |
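The payback table reads naturally as a step function. In this sketch, the handling of boundary values (exactly 6, 12, 18, 24, or 36 months) is an assumption, since the bands in the table share their endpoints.

```python
def payback_score(payback_months: float) -> float:
    """Step lookup for the payback table.

    Assumption: each band includes its lower bound and excludes its upper bound,
    except that exactly 36 months still falls in the 24-36 band.
    """
    if payback_months < 0:
        raise ValueError("payback period cannot be negative")
    bands = [(6, 1.0), (12, 0.8), (18, 0.5), (24, 0.3)]
    for upper, score in bands:
        if payback_months < upper:
            return score
    return 0.1 if payback_months <= 36 else 0.0


print([payback_score(m) for m in (5, 6, 20, 36, 40)])
```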
Cash After Debt (10%), Industry Risk (10%), Data Confidence (10%)
- Cash After Debt: Absolute dollar take-home after debt service. Ranges from 0.0 ($0 or negative) to 1.0 ($300k+/yr).
- Industry Risk: Based on SBA default rate data. <2% default = 1.0, >6% = 0.1.
- Data Confidence: How much is based on reported data vs. estimates (see Section 05).
Score Interpretation
| Score Range | Classification | Investor Action |
|---|---|---|
| 85–100 | Exceptional | Immediate deep dive — rare find |
| 70–84 | Strong | Worth pursuing — request financials |
| 55–69 | Moderate | Review if flexible on some criteria |
| 40–54 | Weak | Likely pass unless strategic reason |
| 0–39 | Poor | Auto-reject |
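A lookup over the interpretation table might look like the sketch below. The function name is hypothetical, and treating fractional scores by their lower band edge (for example 84.5 counts as Strong) is an assumption.

```python
def classify_score(score: float) -> tuple:
    """Return (classification, investor action) for a 0-100 DealForge Score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    bands = [
        (85, "Exceptional", "Immediate deep dive"),
        (70, "Strong", "Worth pursuing, request financials"),
        (55, "Moderate", "Review if flexible on some criteria"),
        (40, "Weak", "Likely pass unless strategic reason"),
        (0, "Poor", "Auto-reject"),
    ]
    for floor, label, action in bands:
        if score >= floor:
            return label, action
    raise AssertionError("unreachable")


print(classify_score(78))
```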
The DealForge Score is universal — it measures absolute deal quality. The Interest Window (Section 07) is personal — it measures fit for a specific investor. A deal can score 90/100 but not match an investor's window. Only deals passing BOTH trigger SMS alerts.
The SBA 7(a) loan program is the #1 way small businesses change hands in the United States. DealForge assumes SBA financing as the default because it's what 80%+ of sub-$5M buyers use. If a deal doesn't work under SBA terms, it doesn't work for most buyers.
Standard SBA 7(a) Terms (DealForge Defaults)
| Parameter | Default Value | Notes |
|---|---|---|
| Down payment | 10% | Standard for acquisitions without real estate |
| Financed portion | 90% | |
| Loan term | 10 years | Non-real-estate-heavy deals |
| Loan term (with RE) | 25 years | When significant real estate is included |
| Interest rate | Prime + 2.75% | For loans >$250k; variable rate |
| SBA guaranty | 75% of loan amount | For loans >$150k |
| Closing costs | ~3% of asking price | Legal, appraisal, environmental, lender fees |
SBA Guaranty Fee Schedule
| Loan Amount | Fee Rate (on guaranteed portion) |
|---|---|
| ≤ $150,000 | 0% |
| $150,001 – $700,000 | 2.0% |
| $700,001 – $1,000,000 | 3.0% |
| > $1,000,000 | 3.5% |
DSCR — The Key Metric
DSCR = Annual Cash Flow (SDE) / Annual Debt Service
| DSCR | Lender View | DealForge Class |
|---|---|---|
| < 1.00 | Automatic decline | FAIL |
| 1.00–1.24 | Very risky | WEAK |
| 1.25 | Minimum threshold | MARGINAL |
| 1.25–1.50 | Approvable but tight | ACCEPTABLE |
| 1.50–2.00 | Comfortable | GOOD |
| > 2.00 | Excellent safety | STRONG |
Full Calculation Chain
1. Total Acquisition Cost = Asking_Price + Closing_Costs + SBA_Guaranty_Fee
2. Cash Down = Total_Acquisition_Cost * 0.10
3. Loan Amount = Total_Acquisition_Cost - Cash_Down
4. Monthly Payment = Loan * [r(1+r)^n] / [(1+r)^n - 1], where r is the monthly interest rate and n is the number of monthly payments
5. Annual Debt Service = Monthly_Payment * 12
6. DSCR = Cash_Flow / Annual_Debt_Service
7. Cash Flow After Debt = Cash_Flow - Annual_Debt_Service
8. Payback Period (years) = Cash_Down / Cash_Flow_After_Debt
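The eight steps can be sketched as a small, LLM-free Python module. Two points are assumptions rather than documented behavior. First, the guaranty fee depends on the loan amount, which in turn depends on the fee, so this sketch sizes the fee from a preliminary loan that excludes it. Second, the fee tier is treated as a single flat rate chosen by loan size and applied to the 75% guaranteed portion. All function and field names are illustrative.

```python
from dataclasses import dataclass


def guaranty_fee_rate(loan_amount: float) -> float:
    """Fee rate on the guaranteed portion, per the fee schedule."""
    if loan_amount <= 150_000:
        return 0.0
    if loan_amount <= 700_000:
        return 0.02
    if loan_amount <= 1_000_000:
        return 0.03
    return 0.035


def monthly_payment(loan: float, annual_rate: float, years: int) -> float:
    """Standard amortization: Loan * [r(1+r)^n] / [(1+r)^n - 1]."""
    n = years * 12
    r = annual_rate / 12
    if r == 0:
        return loan / n
    growth = (1 + r) ** n
    return loan * r * growth / (growth - 1)


@dataclass
class SbaResult:
    total_cost: float
    cash_down: float
    loan_amount: float
    annual_debt_service: float
    dscr: float
    cash_flow_after_debt: float
    payback_months: float


def sba_analysis(asking_price: float, cash_flow: float, annual_rate: float,
                 years: int = 10, down_pct: float = 0.10,
                 closing_pct: float = 0.03, guaranty_pct: float = 0.75) -> SbaResult:
    closing_costs = asking_price * closing_pct
    # Assumption: size the fee from a preliminary loan that excludes the fee itself.
    prelim_loan = (asking_price + closing_costs) * (1 - down_pct)
    fee = prelim_loan * guaranty_pct * guaranty_fee_rate(prelim_loan)
    total_cost = asking_price + closing_costs + fee                # step 1
    cash_down = total_cost * down_pct                              # step 2
    loan = total_cost - cash_down                                  # step 3
    debt_service = monthly_payment(loan, annual_rate, years) * 12  # steps 4-5
    dscr = cash_flow / debt_service                                # step 6
    after_debt = cash_flow - debt_service                          # step 7
    # Step 8 yields years; convert to months. Non-positive cash flow never pays back.
    payback = cash_down / after_debt * 12 if after_debt > 0 else float("inf")
    return SbaResult(total_cost, cash_down, loan, debt_service, dscr, after_debt, payback)


# Hypothetical deal: $1.0M asking price, $350k cash flow, 10.25% rate.
result = sba_analysis(asking_price=1_000_000, cash_flow=350_000, annual_rate=0.1025)
print(f"DSCR {result.dscr:.2f}, payback {result.payback_months:.1f} months")
```

Because every step is plain arithmetic, the module can be unit-tested deterministically and run without external calls.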
Worked Example: Auto Repair Shop ($2.1M Asking)
This auto repair shop has a DSCR of 2.37 (STRONG), a CF multiple at industry average (2.75x vs 2.70x), and an above-average CF margin (24.5% vs ~20%). The fast equity payback (6 months) makes this a strong acquisition target under SBA financing.
Most business listings are incomplete. An estimated 60–70% are missing cash flow, SDE, or other key financial data. DealForge's 4-tier enrichment framework is the primary differentiator — it fills gaps automatically using a layered intelligence approach.
The 4-Tier Classification
| Tier | Name | Description | Frequency |
|---|---|---|---|
| Tier 1 | Complete Data | Asking price, revenue, SDE, industry, location all present. Validate and score. | ~25–30% |
| Tier 2 | Inferable Data | Key fields missing but estimable from available signals. This is where DealForge creates the most value. | ~50–60% |
| Tier 3 | Needs Outreach | Too little data to estimate reliably. Flag for broker/seller outreach. | ~10–15% |
| Tier 4 | Suspicious Data | Numbers present but don't pass the smell test. Enrich + flag discrepancy. | ~5–10% |
Enrichment Chain Paths
Path A: Employee Count Known (Most Common)
1. Source: LinkedIn, Data Axle, Google Business Profile, listing text
2. Revenue = Employees x Revenue_Per_Employee[NAICS]
3. SDE = Revenue x SDE_Margin%[NAICS]
4. Apply BEA regional price parity adjustment
Path B: Physical Units Known
| Industry | Unit | Revenue Per Unit |
|---|---|---|
| Auto repair | Per bay | $100k–$200k/yr |
| Self-storage | Per unit | $40–$80/unit/yr |
| Assisted living | Per bed | $40k–$80k/yr |
| Car washes | Per bay | $150k–$400k/yr |
| Day care | Per slot | $8k–$15k/yr |
| Laundromats | Per machine | $10k–$18k/yr |
| Restaurants | Per seat | $15k–$25k/yr |
| Gas stations | Per gallon | $0.15–$0.25/gal |
Path C: Square Footage Known
Revenue/SqFt benchmarks: Grocery ($400–$600), QSR ($400–$700), Full service restaurant ($200–$400), Retail ($200–$400), Medical office ($300–$500), Gym/fitness ($30–$50/yr), Self-storage ($8–$15/yr).
Path D: Only Industry + Location Known (Lowest Confidence)
1. Query Census CBP at county level for NAICS code
2. Get average payroll per establishment
3. Revenue = Avg_Payroll / Payroll_to_Revenue_Ratio[NAICS]
4. SDE = Revenue x SDE_Margin%[NAICS]
5. Value = SDE x SDE_Multiple[NAICS]
6. Present as WIDE range with LOW confidence
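Steps 3 to 6 of Path D reduce to a short calculation once the county-level payroll figure is in hand. The sketch below assumes that figure has already been fetched. The `spread` parameter is a hypothetical half-width for the "wide" range, since no width is specified, and the payroll inputs in the usage line are made up.

```python
def path_d_estimate(avg_payroll: float, payroll_to_revenue: float,
                    sde_margin: float, sde_multiple: float,
                    spread: float = 0.40) -> dict:
    """Estimate revenue, SDE, and value from county-level average payroll.

    `spread` is a hypothetical half-width for the wide range.
    """
    if not 0 < payroll_to_revenue <= 1:
        raise ValueError("payroll_to_revenue must be in (0, 1]")
    revenue = avg_payroll / payroll_to_revenue  # step 3
    sde = revenue * sde_margin                  # step 4
    value = sde * sde_multiple                  # step 5
    return {
        "revenue": revenue,
        "sde": sde,
        "value": value,
        "value_range": (value * (1 - spread), value * (1 + spread)),  # step 6
        "confidence": "LOW",
    }


# Illustrative inputs only: the $300k payroll and 30% payroll-to-revenue ratio are
# made up; the 20% margin and 2.80x multiple are the HVAC benchmarks.
estimate = path_d_estimate(300_000, 0.30, 0.20, 2.80)
print(round(estimate["value"]), estimate["confidence"])
```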
Worked Example: HVAC Business, Grapevine TX
1. Classify: Tier 2 (revenue and SDE missing, employee count available), so Path A is selected.
2. Benchmarks for HVAC (NAICS 238220): Rev/employee $175k, SDE margin 20%, SDE multiple 2.80x.
3. Estimated Revenue = 18 employees x $175k = $3.15M.
4. SDE = $3.15M x 0.20 = $630k.
5. Geo-adjusted SDE (RPP 0.97) = $630k x 0.97 = $611k. Range: $489k–$733k (about ±20%).
6. Result: DSCR ~3.4 (STRONG). DealForge Score: ~78/100. Confidence: MEDIUM.
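The Path A arithmetic in this example can be reproduced with a short sketch. The function name is illustrative. Applying the RPP factor to SDE and using a symmetric ±20% band are assumptions inferred from the figures in the example ($489k and $733k around $611k).

```python
def path_a_estimate(employees: int, rev_per_employee: float, sde_margin: float,
                    rpp: float = 1.0, spread: float = 0.20) -> dict:
    """Path A: estimate revenue and SDE from headcount and NAICS benchmarks.

    Assumptions: the regional price parity (RPP) factor is applied to SDE, and
    the range is a symmetric +/-20% band around the adjusted SDE.
    """
    revenue = employees * rev_per_employee
    sde = revenue * sde_margin
    sde_adjusted = sde * rpp
    return {
        "revenue": revenue,
        "sde": sde,
        "sde_adjusted": sde_adjusted,
        "sde_range": (sde_adjusted * (1 - spread), sde_adjusted * (1 + spread)),
    }


hvac = path_a_estimate(employees=18, rev_per_employee=175_000, sde_margin=0.20, rpp=0.97)
print({key: round(hvac[key]) for key in ("revenue", "sde", "sde_adjusted")})
```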
Every deal analysis is only as good as its inputs. The Data Confidence score makes uncertainty explicit. Rule: The DealForge Score and Data Confidence are ALWAYS shown together.
Confidence Levels
| Level | Range | Source | Display | Score Weight |
|---|---|---|---|---|
| HIGH | 85–100% | All key fields from listing/CIM, validated against benchmarks | "Reported by seller" | 1.0 |
| MEDIUM | 60–84% | Estimated from 2+ independent signals | "Estimated from industry data" | 0.7 |
| LOW | 30–59% | Estimated from 1 signal only | "Rough estimate — verify" | 0.3 |
| INSUFFICIENT | <30% | Almost nothing to work with | "Insufficient data — outreach recommended" | 0.0 |
Confidence Point System
| Data Point Available | Points |
|---|---|
| Cash flow / SDE stated | +25 |
| Revenue stated | +20 |
| Asking price stated | +15 |
| Employee count known | +10 |
| Physical units known (bays, beds, etc.) | +10 |
| Square footage known | +5 |
| Industry/NAICS identified | +5 |
| ZIP code identified | +5 |
| Lease terms stated | +3 |
| Years in business stated | +2 |
Total possible: 100 points. Confidence level = sum of available points / 100.
Validation Adjustments
Confidence can be reduced if data fails validation: impossible margin (-10), multiple far outside industry range (-10), Census contradicts stated employees (-15), conflicting signals (-10 each). Confidence can be increased if multiple estimation paths agree within 15% (+10) or Census data supports stated numbers (+10).
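The point system and validation adjustments combine into a single percentage that maps onto the four confidence levels. A sketch under stated assumptions: the field keys are hypothetical names for the rows of the point table, the caller passes the net validation adjustment as one number, and clamping the result to 0–100 is an assumption.

```python
POINTS = {
    "cash_flow": 25,
    "revenue": 20,
    "asking_price": 15,
    "employee_count": 10,
    "physical_units": 10,
    "square_footage": 5,
    "naics": 5,
    "zip_code": 5,
    "lease_terms": 3,
    "years_in_business": 2,
}


def confidence(available: set, adjustments: int = 0) -> tuple:
    """Return (percent, level) from available fields plus validation adjustments.

    `adjustments` is the net of penalties and bonuses, for example -10 for an
    impossible margin. Clamping to 0-100 is an assumption.
    """
    unknown = set(available) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    raw = sum(POINTS[field] for field in available) + adjustments
    percent = max(0, min(100, raw))
    if percent >= 85:
        level = "HIGH"
    elif percent >= 60:
        level = "MEDIUM"
    elif percent >= 30:
        level = "LOW"
    else:
        level = "INSUFFICIENT"
    return percent, level


print(confidence({"cash_flow", "revenue", "asking_price", "naics", "zip_code"}, adjustments=-10))
```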
Every SMS alert shows confidence. Dashboard deals are sortable by confidence. Marginal matches with LOW confidence are deprioritized in digests. Future: investors can set confidence thresholds in their Interest Window.
DealForge uses three valuation approaches depending on asset type and data availability: Income Approach (SDE multiples), Per-Unit/Physical Attribute Approach, and Market Comparison.
1. Income Approach (SDE Multiple Method)
The dominant method for small businesses under $5M. Business Value = SDE x SDE_Multiple[industry]
| Multiple Range | Typical Industries |
|---|---|
| 1.5–2.0x | Routes, nail salons, flower shops, cell phone repair |
| 2.0–2.5x | Restaurants, coffee shops, hair salons, dry cleaners, pet grooming, cleaning |
| 2.5–3.0x | Auto repair, HVAC, plumbing, landscaping, bars, home health |
| 3.0–3.5x | Dental, trucking, day care, websites/ecommerce |
| 3.5–4.5x | Laundromats, gas stations, machine shops, hotels, dog daycare, medical billing |
| 4.5–5.0x | Storage facilities, car washes |
| 5.0x+ | Marinas (6.6x), rubber/plastic manufacturing, industrial machinery |
Overall average across 9,500+ transactions: 2.57x SDE / 0.67x Revenue
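The income approach is a single multiplication, and a revenue multiple gives a quick cross-check. The sketch below uses the overall averages quoted above. The function name and the sample SDE and revenue figures are hypothetical.

```python
from typing import Optional


def income_valuation(sde: float, sde_multiple: float,
                     revenue: Optional[float] = None,
                     revenue_multiple: Optional[float] = None) -> dict:
    """Value a business by SDE multiple, with an optional revenue-multiple cross-check."""
    result = {"sde_value": sde * sde_multiple}
    if revenue is not None and revenue_multiple is not None:
        result["revenue_value"] = revenue * revenue_multiple
    return result


# Overall averages: 2.57x SDE and 0.67x revenue. Sample inputs are made up.
valuation = income_valuation(sde=200_000, sde_multiple=2.57,
                             revenue=750_000, revenue_multiple=0.67)
print({name: round(value) for name, value in valuation.items()})
```

A large gap between the two values is a cue to check whether the stated margin is unusual for the industry.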
2. Per-Unit / Physical Attribute Approach
| Industry | Unit | Value Per Unit | Revenue Per Unit |
|---|---|---|---|
| Auto repair | Bay | $50k–$150k | $100k–$200k/yr |
| Self-storage | Net rentable sqft | $40–$100+ | $8–$15/sqft/yr |
| Assisted living | Licensed bed | $30k–$80k+ | $40k–$80k/yr |
| Car washes | Express tunnel | $1M–$3M+ | $150k–$400k/yr |
| Day care | Licensed slot | $3k–$8k | $8k–$15k/yr |
| Dental | Annual collections | 60–80% | N/A |
| Gas stations | Gallon throughput | $0.15–$0.25/gal | N/A |
SDE Margin Benchmarks
| Industry | SDE Margin |
|---|---|
| Insurance agencies | 40–55% |
| Accounting/tax | 35–50% |
| Software/SaaS | 30–50% |
| Dental offices | 30–40% |
| Laundromats | 30–40% |
| Cleaning businesses | 25–35% |
| Auto repair / HVAC / Plumbing | 15–25% |
| Coffee shops | 15–22% |
| Restaurants | 10–18% |
| Retail | 8–18% |
The Interest Window is an investor's personalized definition of "a deal worth my time." It's a multi-dimensional preference profile where each criterion can be marked strict (must match exactly) or flexible (prefer this, but consider near-misses).
Profile Structure
Each window contains: industries (include/exclude), geography (radius or state), investment size (min/max), min DSCR, max CF multiple, min CF after debt, max payback months, min DealForge score. Each field has a flexibility toggle: strict or flexible.
How Matching Works
Notification Types
| Match Type | Delivery | Description |
|---|---|---|
| HOT MATCH | Immediate SMS | All strict pass + flex score ≥ 0.70. Very likely worth pursuing. |
| MARGINAL | Periodic Digest | All strict pass + flex 0.40–0.69. Close but not perfect. |
| NEAR MISS | Count Only | Close on strict or low flex. "3 near-misses within 15% of criteria" |
| NO MATCH | Silent Log | Doesn't meet strict criteria. Analytics only. |
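The four notification types can be sketched as a small decision function. The thresholds for HOT MATCH and MARGINAL come from the table. How a NEAR MISS is measured is not spelled out, so the `worst_strict_miss` input and its 15% cut-off (borrowed from the sample near-miss message) are assumptions.

```python
def classify_match(strict_passed: bool, flex_score: float,
                   worst_strict_miss: float = 0.0) -> str:
    """Map matching results to a notification type.

    strict_passed: whether every strict criterion passed.
    flex_score: 0.0-1.0 aggregate over the flexible criteria.
    worst_strict_miss: largest fractional shortfall on a failed strict criterion
        (hypothetical input; the 15% cut-off is an assumption).
    """
    if strict_passed:
        if flex_score >= 0.70:
            return "HOT MATCH"   # immediate SMS
        if flex_score >= 0.40:
            return "MARGINAL"    # periodic digest
        return "NEAR MISS"       # low flex score, count only
    if worst_strict_miss <= 0.15:
        return "NEAR MISS"       # close on a strict criterion
    return "NO MATCH"            # silent log


print(classify_match(True, 0.82), classify_match(False, 0.9, worst_strict_miss=0.4))
```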
If the system is scanning but finding nothing, the investor needs to know it's working. Configurable interval (daily/weekly): "847 deals reviewed this week. 0 in your interest window. 3 near-misses. Your agent is actively scanning."
Deals move through a defined set of states from discovery to resolution. Lifecycle events like price drops and stale listings trigger re-scoring and investor notifications.
State Diagram
NEW → ACTIVE → PRICE_CHANGED → STALE → REMOVED
PRICE_CHANGED → ACTIVE (cycles back to ACTIVE after price change)
| State | Description | Action |
|---|---|---|
| NEW | Just discovered by ingestion agent | Parse, classify, enrich, calculate, score, match |
| ACTIVE | Fully analyzed and monitored | Re-check periodically for changes |
| PRICE_CHANGED | Price dropped (high-value event) | Re-score immediately, re-match all windows, notify |
| STALE | Listed >90 days with no changes | Flag as negotiation leverage, suggest lower offer |
| REMOVED | No longer on source platform | Log outcome, notify watchers, capture sale price if sold |
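One way to enforce the lifecycle is an explicit transition table. The sketch below includes the edges shown in the state diagram and adds three that seem necessary in practice; those three are assumptions and are marked as such in the comments.

```python
from enum import Enum


class DealState(Enum):
    NEW = "NEW"
    ACTIVE = "ACTIVE"
    PRICE_CHANGED = "PRICE_CHANGED"
    STALE = "STALE"
    REMOVED = "REMOVED"


TRANSITIONS = {
    DealState.NEW: {DealState.ACTIVE},
    DealState.ACTIVE: {
        DealState.PRICE_CHANGED,
        DealState.STALE,          # assumption: a listing can go stale with no price change
        DealState.REMOVED,        # assumption: a listing can be delisted at any time
    },
    DealState.PRICE_CHANGED: {DealState.ACTIVE, DealState.STALE},
    DealState.STALE: {
        DealState.REMOVED,
        DealState.PRICE_CHANGED,  # assumption: a stale listing can still drop its price
    },
    DealState.REMOVED: set(),
}


def transition(current: DealState, target: DealState) -> DealState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


state = transition(DealState.NEW, DealState.ACTIVE)
state = transition(state, DealState.PRICE_CHANGED)
state = transition(state, DealState.ACTIVE)
print(state.name)
```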
Deduplication Strategy
Same business can appear on multiple platforms (BizBuySell + BizQuest + BusinessesForSale). Match on: exact business name + ZIP, industry + asking price + ZIP (fuzzy), or broker name + description similarity (LLM-assisted). On match: merge into single record, keep best data from each source, note "Found on 3 platforms" as a signal of serious listing.
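The three matching rules can be sketched as follows. The record keys, the 5% price tolerance, and the 0.85 similarity threshold are all hypothetical. The standard-library `difflib.SequenceMatcher` stands in for the LLM-assisted description comparison, so this is a cheaper substitute rather than the method described.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def is_duplicate(a: dict, b: dict, price_tolerance: float = 0.05,
                 description_threshold: float = 0.85) -> bool:
    """Return True if two listing records likely describe the same business."""
    same_zip = a.get("zip") is not None and a.get("zip") == b.get("zip")
    # Rule 1: exact business name + ZIP.
    if same_zip and a.get("name") and normalize(a["name"]) == normalize(b.get("name") or ""):
        return True
    # Rule 2: industry + ZIP + asking price within a tolerance (fuzzy).
    if same_zip and a.get("industry") and a.get("industry") == b.get("industry"):
        price_a, price_b = a.get("asking_price"), b.get("asking_price")
        if price_a and price_b:
            if abs(price_a - price_b) / max(price_a, price_b) <= price_tolerance:
                return True
    # Rule 3: same broker + similar description.
    if a.get("broker") and a.get("broker") == b.get("broker"):
        ratio = SequenceMatcher(None, a.get("description") or "",
                                b.get("description") or "").ratio()
        return ratio >= description_threshold
    return False


first = {"name": "Joe's Auto", "zip": "76051", "industry": "auto repair",
         "asking_price": 500_000}
second = {"name": "joe's  auto", "zip": "76051", "industry": "auto repair",
          "asking_price": 900_000}
print(is_duplicate(first, second))
```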
Lifecycle data becomes market intelligence: average days-to-sale by industry, price reduction frequency, which industries have the most stale inventory, and seasonal patterns in listing volume. This feeds back into scoring and enrichment benchmarks.
The end-to-end flow from deal discovery to investor SMS notification. Event-driven architecture triggered by platform email alerts, sitemap changes, and user uploads. Target: under 1 minute from trigger to SMS.
Full Pipeline Flow
Processing Time Targets
| Stage | Target | Notes |
|---|---|---|
| Trigger → Parse | < 30 sec | Email parsing is fast; URL fetch adds latency |
| Parse → Classify | < 2 sec | Simple rules engine |
| Classify → Enrich | < 5 sec | Lookup tables local; Census cached |
| Enrich → Calculate | < 1 sec | Pure math, no external calls |
| Calculate → Score | < 10 sec | LLM narrative generation is bottleneck |
| Score → Match | < 1 sec | In-memory matching |
| Match → SMS | < 5 sec | Twilio API call |
| Total: Trigger → SMS | < 1 minute | Goal for v1 |
The master lookup table powering DealForge's enrichment engine, scoring, and valuation. Source: BizBuySell transaction data (9,500+ deals, Q1 2021 – Q4 2025). Median sale price: $337,750 (rising to $375,000 in Q4 2025).
Service Businesses
| Industry | SDE Multiple | Rev Multiple | SDE Margin | Rev/Employee | # Deals |
|---|---|---|---|---|---|
| Medical Billing | 4.41x | 1.54x | ~35% | $120–180k | 7 |
| Funeral Homes | 4.36x | 1.63x | ~35% | $100–150k | 10 |
| Laundromats | 4.12x | 1.45x | 30–40% | N/A | 169 |
| Waste Mgmt & Recycling | 3.20x | 0.94x | ~25% | $100–150k | 43 |
| Property Management | 2.72x | 0.93x | ~30% | $80–120k | 59 |
| Landscaping & Yard | 2.56x | 0.76x | 15–25% | $80–120k | 211 |
| Cleaning Businesses | 2.30x | 0.78x | 25–35% | $50–80k | 154 |
| Dry Cleaners | 2.20x | 0.77x | 20–30% | $60–100k | 141 |
Building & Construction
| Industry | SDE Multiple | Rev Multiple | SDE Margin | Rev/Employee | # Deals |
|---|---|---|---|---|---|
| Building Materials | 3.40x | 0.64x | ~18% | $150–250k | 29 |
| Concrete | 3.04x | 0.72x | ~22% | $150–250k | 21 |
| Heavy Construction | 2.98x | 0.70x | ~22% | $150–300k | 65 |
| Electrical & Mechanical | 2.94x | 0.59x | ~20% | $130–200k | 57 |
| HVAC | 2.80x | 0.62x | 15–25% | $150–200k | 123 |
| Plumbing | 2.62x | 0.72x | 15–25% | $140–200k | 61 |
Food & Restaurants
| Industry | SDE Multiple | Rev Multiple | SDE Margin | Rev/Employee | # Deals |
|---|---|---|---|---|---|
| Bars & Taverns | 2.86x | 0.53x | ~18% | $50–80k | 217 |
| Bakeries | 2.68x | 0.54x | ~20% | $50–80k | 93 |
| Coffee Shops | 2.28x | 0.45x | 15–22% | $40–60k | 253 |
| Restaurants | 2.26x | 0.37x | 10–18% | $50–80k | 1,774 |
| Juice Bars | 2.09x | 0.46x | ~20% | $40–60k | 40 |
Healthcare & Fitness
| Industry | SDE Multiple | Rev Multiple | SDE Margin | Rev/Employee | # Deals |
|---|---|---|---|---|---|
| Dental Practices | 3.28x | 0.87x | 30–40% | $150–250k | 22 |
| Assisted Living | 3.18x | 1.21x | ~30% | $40–80k/bed | 27 |
| Home Health Care | 2.84x | 0.60x | ~20% | $60–100k | 92 |
| Medical Practices | 2.58x | 0.70x | 25–35% | $100–200k | 139 |
| Gyms & Fitness | 2.44x | 0.64x | ~25% | $40–60k | 79 |
Automotive & Marine
| Industry | SDE Multiple | Rev Multiple | SDE Margin | # Deals |
|---|---|---|---|---|
| Car Washes | 4.73x | 1.81x | ~35% | 32 |
| Gas Stations | 3.70x | 0.63x | ~15% | 113 |
| Equipment Rental | 3.55x | 0.90x | ~25% | 31 |
| Auto Repair & Service | 2.70x | 0.59x | 15–25% | 247 |
Retail, Manufacturing, Technology & Other
| Industry | SDE Multiple | Rev Multiple | SDE Margin | # Deals |
|---|---|---|---|---|
| Marinas & Fishing | 6.60x | 1.53x | ~23% | 9 |
| Rubber & Plastic Mfg | 5.11x | 1.19x | ~22% | 8 |
| Storage Facilities | 4.60x | 1.15x | ~25% | 8 |
| Dog Daycare & Boarding | 4.40x | 1.15x | ~25% | 24 |
| Nursery & Garden | 4.15x | 0.84x | ~20% | 9 |
| Hotels | 4.02x | 1.53x | ~35% | 8 |
| Machine Shops | 3.72x | 0.93x | ~24% | 46 |
| Software & Apps | 3.41x | 1.82x | 30–50% | 49 |
| Financial Services | 3.41x | 1.75x | ~50% | 20 |
| Liquor Stores | 3.41x | 0.52x | ~15% | 189 |
| Day Care | 3.40x | 0.81x | 15–25% | 78 |
| Websites & Ecommerce | 3.33x | 1.04x | ~30% | 339 |
| Grocery Stores | 3.38x | 0.43x | ~13% | 89 |
| Insurance Agencies | 2.68x | 1.53x | 40–55% | 50 |
| Convenience Stores | 2.82x | 0.41x | ~14% | 100 |
| Hair Salons & Barber | 2.18x | 0.59x | ~25% | 181 |
| Routes | 1.51x | 0.63x | ~40% | 496 |
Auto Repair & Service (NAICS 811111)
Strengths: Essential service, recession-resistant, skilled labor barriers, recurring maintenance revenue. Risks: Key-man risk if owner is primary tech, labor shortages, gradual EV transition, environmental compliance. What makes a good acquisition: 4+ bays, diverse customer base, established employees who stay, long-term lease, modern diagnostics.
HVAC (NAICS 238220)
Key insight: HVAC is one of the most active PE roll-up sectors. Private equity firms (Wrench Group, Service Experts, Apex Service Partners) acquire platform companies then bolt on smaller shops, creating upward pressure on multiples. HVAC deals get snapped up fast — speed of notification matters more here than most industries.
Restaurants (NAICS 722511)
The single largest volume of small business transactions. Many investors explicitly exclude restaurants from their interest window. With 10–18% SDE margins, many fail the DSCR test. DealForge flags: "Restaurant deals under $200k SDE often struggle to meet SBA DSCR requirements."
Self-Storage (NAICS 531130)
Self-storage is a hybrid — both operating business and real estate asset. DealForge runs BOTH calculation methods. Minimal labor (1–2 part-time), recession-resistant, scalable, strong NOI margins (60–70%). Watch for: occupancy rate (85%+ healthy), market saturation from new construction, climate control premium (25–50%).
Tier Structure
Tier Comparison
| Feature | Free | Tier 1 ($15) | Tier 2 ($100) |
|---|---|---|---|
| HOT MATCH SMS alerts | 1/quarter | 3/month | Unlimited |
| Interest windows | 1 | 1 | Unlimited |
| Dashboard access | Limited | Full | Full |
| Periodic digest | No | Yes | Yes |
| Sensitivity analysis | No | No | Yes |
| CIM upload + deep analysis | No | No | Phase 2 |
| Priority processing | No | No | Yes |
Competitive Positioning
| Competitor | Price | What They Offer |
|---|---|---|
| BizBuySell saved search | Free | Basic email alerts, no analysis |
| Kumo | $149–200/mo | AI matching, deal flow aggregation |
| Searcher OS | $49–244/mo | Deal sourcing + CRM |
| DealStream | $49/mo+ | Deal marketplace + matching |
| DealForge Tier 1 | $15/mo | Full analysis + enrichment + SMS |
| DealForge Tier 2 | $100/mo | Everything + multi-window + deep tools |
DealForge is 3–10x cheaper than the closest competitor for comparable functionality. The $15/mo entry point drives adoption; the $100/mo tier captures serious acquirers. Blended ARPU: $27/mo with 80/15/5 tier mix.
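The blended ARPU figure follows from a weighted average of tier prices. The sketch assumes the 80/15/5 mix means 80% Tier 1, 15% Tier 2, and 5% Free, which is the reading that reproduces $27.

```python
# Assumption: the 80/15/5 mix means 80% Tier 1, 15% Tier 2, and 5% Free.
TIER_PRICE = {"tier1": 15.0, "tier2": 100.0, "free": 0.0}
TIER_MIX = {"tier1": 0.80, "tier2": 0.15, "free": 0.05}

blended_arpu = sum(TIER_PRICE[tier] * TIER_MIX[tier] for tier in TIER_PRICE)
print(f"Blended ARPU: ${blended_arpu:.2f}/mo")
```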
Beta Founder Program (First 5–10 Users)
12 months completely free service at chosen tier level. Hand-selected, motivated acquirers who provide weekly feedback and a referenceable review. Purpose: rapid product iteration + first social proof assets. Cost: $1,620–$3,240 total (negligible).
Universal Success Incentive (All Users, All Tiers)
| Trigger | Reward | Condition |
|---|---|---|
| User reports a closed deal sourced via DealForge | 12 months free at current tier | Must provide referenceable review |
| Free-tier user closes | Upgraded to Tier 1 for 12 months ($180 value) | Written testimonial or LinkedIn post |
| Tier 1 user closes | 12 months free ($180 value) | Short video or case study permission |
| Tier 2 user closes | 12 months free ($1,200 value) | Referenceable review |
Why This Works
- Outcome alignment — we only "pay" when users succeed
- Social proof generation — testimonials from actual closed deals are 10x more powerful than generic reviews
- Retention boost — successful closers have near-zero churn during free year
- CAC reduction — high-quality testimonials lower paid acquisition costs 30–50%
- Virality — "Closed a $1.4M deal and got a year free" is shareable
10–14% of paid users close 1 deal/year. Each closure costs ~$324 in revenue. Net ARPU reduction: 8–18%. Offset by: retention boost (30–60% LTV increase per affected user), CAC drop to <$200 via organic testimonials, free-tier conversion rate improvement of 15–30%.
Revenue Per User
Gross Margin
Variable COGS Breakdown (Per User/Month)
| Component | Tier 1 (3 alerts) | Tier 2 (~12 alerts) | Free |
|---|---|---|---|
| Scraping (Apify share) | $1.00 | $1.00 | $0.30 |
| LLM parsing/enrichment | $1.20 | $5.00 | $0.15 |
| SMS (Twilio) | $0.25 | $1.00 | $0.03 |
| Census/enrichment | $0.05 | $0.05 | $0.05 |
| Total | $2.50 | $7.05 | $0.53 |
LTV & Breakeven
| Scenario | LTV | CAC | LTV:CAC | Payback (months) |
|---|---|---|---|---|
| Pessimistic (6% churn) | $387 | $350 | 1.1x | 15 |
| Base (4.5% churn) | $516 | $280 | 1.8x | 10 |
| Optimistic (3% churn) | $774 | $220 | 3.5x | 5 |
Monthly fixed costs: ~$15k–$20k. Contribution margin per user: $23.80. At 20% MoM growth from 200 starting users, breakeven is reached around Month 8 (~860 users). Key levers: churn reduction (every 1% = ~25% LTV increase), Tier 2 upsell, and add-on revenue.
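The breakeven claim can be checked with a short compounding loop. This sketch treats month 0 as the 200-user starting point and assumes every user contributes the stated $23.80 margin. The function name and the `max_months` guard are illustrative.

```python
def breakeven_month(start_users: int = 200, growth: float = 0.20,
                    margin_per_user: float = 23.80,
                    fixed_costs: float = 20_000, max_months: int = 60):
    """First month in which total contribution margin covers fixed costs.

    Month 0 is the starting user base; users compound by `growth` each month.
    Returns (month, users), or None if breakeven is not reached within max_months.
    """
    users = float(start_users)
    for month in range(max_months + 1):
        if users * margin_per_user >= fixed_costs:
            return month, round(users)
        users *= 1 + growth
    return None


print(breakeven_month())                    # upper end of the fixed-cost range
print(breakeven_month(fixed_costs=15_000))  # lower end of the range
```

At the $20k end of the fixed-cost range this lands on Month 8 with about 860 users, matching the figure quoted above; at $15k breakeven arrives a month earlier.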
Phase 0: Beta Launch (Weeks 1–8)
5–10 hand-picked users from direct network. Active acquirers, located in testable markets (Texas/DFW ideal). 1 year free service. Weekly feedback sessions. Goals: validate enrichment accuracy, tune scoring weights, test SMS timing, generate first testimonials.
Phase 1: Early Growth (Months 3–6)
| Channel | Est. CAC | Notes |
|---|---|---|
| Reddit Communities | $50–$100 | r/Entrepreneur (4M), r/smallbusiness (1.5M), r/searchfunds |
| LinkedIn (ETA Community) | $100–$200 | Search fund operators, independent sponsors |
| Broker Partnerships | $150–$250 | Highest quality leads, best retention |
| BizHub User Crossover | $100–$150 | "BizHub on autopilot — it comes to YOU" |
| Paid Ads (backup only) | $300–$500 | Google/LinkedIn. Scale only if organic plateaus. |
Phase 2: The Flywheel (Months 6–12)
Growth Targets
| Metric | Month 6 | Month 12 |
|---|---|---|
| Total users | 500 | 3,000+ |
| Paying users | 400 | 2,500+ |
| MRR | $10,800 | $67,500 |
| Monthly churn | < 5% | < 4% |
| Blended CAC | < $300 | < $250 |
Launch promotion: "If DealForge doesn't surface at least one deal in your interest window within 30 days, get a full refund." Low risk (most markets have sufficient deal flow) but builds confidence for skeptical early adopters.
Core Components
| Layer | Technology | Why |
|---|---|---|
| Backend API | FastAPI (Python) | Fast, async, great for ML/data pipelines |
| Database | PostgreSQL | Reliable, JSON support, full-text search |
| Calculation Engine | Pure Python module | No LLM dependency — deterministic, testable, fast |
| LLM | Claude API (Haiku + Sonnet) | Haiku for parsing, Sonnet for narrative |
| Enrichment Data | Census bulk files (local) | No API dependency in hot path |
| SMS | Twilio | Industry standard, reliable |
| Email Ingestion | Gmail API / IMAP | Parse platform email alerts |
| Frontend | Streamlit (MVP) → Next.js | Speed first, polish later |
| Hosting | Railway/Render → AWS | Easy deploy, then scale |
| Task Queue | Celery + Redis | Background processing for pipeline |
| Knowledge Base | Markdown wiki (filesystem) | Karpathy pattern — LLM-maintained |
LLM Cost Optimization
| Task | Model | Cost/Call | Volume |
|---|---|---|---|
| Listing parsing | Claude Haiku | ~$0.01 | Every listing |
| Enrichment reasoning | Claude Haiku | ~$0.02 | Tier 2+ listings |
| Narrative generation | Claude Sonnet | ~$0.05 | Every scored deal |
| CIM analysis | Claude Sonnet | ~$0.50 | On-demand (Phase 2) |
Monthly LLM cost estimate at 1,000 deals/day: ~$2,160/month (parsing $300 + enrichment $360 + narratives $1,500).
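The monthly estimate can be reproduced from the per-call costs. The sketch assumes a 30-day month and that 60% of listings need enrichment reasoning; the 60% share is inferred from the $360 enrichment figure rather than stated directly.

```python
def monthly_llm_cost(deals_per_day: int = 1_000, days: int = 30,
                     enrichment_share: float = 0.60) -> dict:
    """Monthly LLM spend by task, using the per-call costs from the table.

    Assumption: `enrichment_share` of listings need enrichment reasoning.
    """
    deals = deals_per_day * days
    costs = {
        "parsing": deals * 0.01,                        # Haiku, every listing
        "enrichment": deals * enrichment_share * 0.02,  # Haiku, Tier 2+ listings
        "narratives": deals * 0.05,                     # Sonnet, every scored deal
    }
    costs["total"] = sum(costs.values())
    return {task: round(amount, 2) for task, amount in costs.items()}


print(monthly_llm_cost())
```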
Infrastructure Costs (MVP)
Total MVP infrastructure: $410–$1,100/month
The wiki is the source of truth for knowledge. The database is for fast queries. Code reads from both. Raw data collected in raw/, compiled by LLM into structured .md articles in wiki/. The LLM maintains all content, indexes, and cross-links.
- Census CBP / ZBP (api.census.gov): County and ZIP Code Business Patterns. Establishment counts, employment, payroll by NAICS. Foundation of the enrichment engine. Free but lagged 2–3 years. Use ZBP for hyperlocal density, CBP for actual payroll/employment at county level. Download bulk files (~100–200MB) for production.
- BizBuySell (bizbuysell.com): Largest US business-for-sale marketplace (owned by CoStar Group). Primary listing source. Email alert parsing is the primary ethical ingestion method. Published transaction data (9,500+ deals) powers industry benchmarks. TOS prohibits scraping; email parsing is defensible.
- ATTOM Data (attomdata.com): Property ownership, tax, deed, mortgage data for 155M+ properties. Key for off-market signals: ownership duration, tax delinquency, out-of-state owner detection. API-based, pay per call.
- Apify Platform (apify.com): Scraping-as-a-service with multiple BizBuySell actors. Configurable concurrency, retries, proxy settings. Output as JSON via API. Cost: ~$1–26/mo base + compute. Use carefully per TOS considerations.
- BLS & BEA (bls.gov / bea.gov): Employment wages, GDP by industry, regional price parities (RPP). BEA RPP is critical for geographic adjustments: SF/NYC multiply by 1.1–1.3, rural areas by 0.7–0.9. More current than Census (quarterly).
- Twilio (twilio.com): SMS delivery for HOT MATCH alerts. Cost: ~$0.0079/segment outbound. Standard compliance handles opt-in/opt-out. Retry logic: 3x with backoff, fall back to email.
- FRED API (Federal Reserve) (api.stlouisfed.org): Daily prime rate (series DPRIME) for SBA interest rate calculations. Free, reliable. Current prime (~April 2026): ~7.50%, giving SBA rate ~10.25%.
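The retry behavior noted for Twilio (three attempts with backoff, then email) could be sketched as below. No provider API is called here: `send_sms` and `send_email` are hypothetical callables supplied by the caller, and the exponential 1s, 2s delay schedule is an assumption.

```python
import time
from typing import Callable


def send_with_fallback(send_sms: Callable[[str], None],
                       send_email: Callable[[str], None],
                       message: str, attempts: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Try SMS up to `attempts` times with exponential backoff, then fall back to email.

    `send_sms` and `send_email` are hypothetical callables (for example thin
    wrappers around an SMS provider and an email service) that raise on failure.
    Returns the channel that delivered the message.
    """
    for attempt in range(attempts):
        try:
            send_sms(message)
            return "sms"
        except Exception:
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # wait 1s, then 2s, between retries
    send_email(message)
    return "email"
```

Passing `sleep` as a parameter keeps the function testable without real delays.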
4-digit NAICS (e.g., 8111 for Auto Repair) is the best balance of data quality and specificity. 2-digit is too broad. 6-digit is too sparse at small geographies. Use 3-digit as a useful backup.
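The 4-digit-first, 3-digit-backup rule amounts to a prefix lookup. In this sketch, `benchmarks` is a hypothetical mapping from NAICS prefix to benchmark data, and the two sample entries are illustrative.

```python
from typing import Optional


def benchmark_lookup(naics_code: str, benchmarks: dict) -> Optional[tuple]:
    """Find benchmarks at 4-digit NAICS, falling back to 3-digit.

    Returns (matched_prefix, data) or None. 2-digit prefixes are never used
    because they are too broad.
    """
    digits = naics_code.strip()
    if not digits.isdigit() or len(digits) < 3:
        raise ValueError("expected a numeric NAICS code of at least 3 digits")
    for length in (4, 3):
        prefix = digits[:length]
        if prefix in benchmarks:
            return prefix, benchmarks[prefix]
    return None


# Hypothetical benchmark table keyed by NAICS prefix.
table = {"8111": {"sde_multiple": 2.70}, "238": {"sde_multiple": 2.80}}
print(benchmark_lookup("811111", table))  # matches the 4-digit entry
print(benchmark_lookup("238220", table))  # falls back to the 3-digit entry
```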
Legal compliance is the #1 existential risk for DealForge. Scraping TOS violations, investment advice liability, and broker licensing requirements each require careful navigation. Engage an IP attorney before production launch.
Risk #1: Scraping / TOS Violations
BizBuySell, LoopNet, Crexi, and most marketplaces explicitly prohibit automated scraping. CoStar Group has successfully sued over mass data copying. Risk level: HIGH
Mitigation Strategy (Phased)
| Phase | Method | Risk Level |
|---|---|---|
| Phase 1 (MVP) | Email alert parsing — processing emails sent to you | LOW |
| Phase 1 | User-uploaded listings — user pastes URL or data | LOW |
| Phase 1 | Single-page fetch from alert URLs | LOW |
| Phase 1 | Public data (Census, BLS, BEA) | NONE |
| Phase 2 | Broker partnerships — voluntary submission | NONE |
| Phase 2 | Selective Apify usage (rate-limited, cached) | MODERATE |
| Never | Store/republish raw listing photos or descriptions | PROHIBITED |
Risk #2: Investment Advice Liability
If DealForge says "this deal is strong — pursue it" and the investor loses money, there's potential liability. Mitigation: disclaimers everywhere, never use "buy this" language, show calculation methodology transparently, require user acknowledgment on signup.
Risk #3: Broker Licensing
Current model is SAFE: flat subscription fee regardless of deal outcomes. The 1-year-free incentive is a credit, not a transaction fee. Watch for: per-deal fees (need legal review), direct buyer-seller introductions (some states require license).
Risk #4: Outreach Compliance
| Channel | Key Rules | Fines |
|---|---|---|
| Email (CAN-SPAM) | Physical address, unsubscribe mechanism, honest subjects | Up to $51,744/email |
| Phone/Text (TCPA) | B2B cold calling generally OK; auto-dialed to cell needs consent | $500–$1,500/call |
| Direct Mail | Essentially no restrictions on business addresses | N/A |
| SMS to our users | Users explicitly opt in — compliant via Twilio | Standard |
Pre-Launch Compliance Checklist
- IP attorney consulted on scraping approach (Critical)
- Terms of Service written (user agreement) (Required)
- Privacy Policy written (CCPA-compliant) (Required)
- Investment disclaimer language approved (Required)
- CAN-SPAM compliance for outbound email (Required)
- Twilio SMS opt-in/opt-out flow implemented (Required)
- Business broker licensing checked for target states (Review)
- Data encryption at rest and in transit implemented (Required)
15+ public-data signals indicating a business or property may be available for acquisition, even if not listed. Ranked by signal strength and scored using an Availability Likelihood framework.
High-Signal (Strong Motivation to Sell)
| Signal | Data Source | Points |
|---|---|---|
| Probate/estate filing | County probate court records | +30 |
| Divorce filing | County court records | +30 |
| Federal/state tax liens | Public lien filings | +30 |
| Pre-foreclosure (Lis Pendens) | County recorder | +30 |
| Business dissolution filing | Secretary of State | +30 |
| Bankruptcy filing | PACER / PacerMonitor | +30 |
Medium-Signal (Likely Approaching Exit)
| Signal | Data Source | Points |
|---|---|---|
| Aging owner (65+) | Voter registration + SOS formation date | +15 |
| Lapsed business license | Municipal records | +15 |
| Declining employee count | LinkedIn company page (over time) | +15 |
| Delinquent property taxes | County assessor (ATTOM) | +15 |
| Expired real estate listing | MLS history, LoopNet archives | +15 |
| Long ownership + low assessed value | County assessor (ATTOM) | +15 |
Soft Signals (Worth Monitoring)
| Signal | Data Source | Points |
|---|---|---|
| Out-of-state property owner | County assessor mailing address | +5 |
| Stale Google Business Profile | Google Maps API | +5 |
| Declining Google reviews | Google Maps API | +5 |
| Website domain expiring | WHOIS lookup | +5 |
| Reduced job postings | Indeed/LinkedIn | +5 |
| UCC filing expiration | State UCC records | +5 |
Availability Likelihood Scoring
| Score | Classification | Action |
|---|---|---|
| 60+ points | LIKELY AVAILABLE | Prioritize outreach |
| 30–59 points | POSSIBLE | Monitor, outreach if investor interested |
| <30 points | UNKNOWN | Passive monitoring only |
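Availability Likelihood is an additive score over the observed signals. A minimal sketch, assuming each detected signal is tagged with its strength tier ("high", "medium", or "soft") and that points simply add up with no cap:

```python
# Points per signal tier, from the three signal tables.
SIGNAL_POINTS = {"high": 30, "medium": 15, "soft": 5}


def availability(signals: list) -> tuple:
    """Sum points for observed signals (given by strength tier) and classify."""
    total = sum(SIGNAL_POINTS[strength] for strength in signals)
    if total >= 60:
        return total, "LIKELY AVAILABLE"
    if total >= 30:
        return total, "POSSIBLE"
    return total, "UNKNOWN"


# Example: a probate filing (high) plus an aging owner (medium) and a stale profile (soft).
print(availability(["high", "medium", "soft"]))
```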
Implementation Priority
Off-market signal detection adds significant value but requires additional data source integrations. Start with on-market deal flow (Sections 04–09), prove the core scoring and alerting engine, then layer in off-market capabilities as a Tier 2 differentiator.