#Ai job

soft bronze

Hi, I’m looking to hire someone on a weekly or monthly basis to build and manage a bot that pulls data from the Master Commissioner and County Clerk websites in the counties I target, retrieves property valuation information from the PVA, captures the last sale date and sale price, and filters out properties that don’t meet my criteria so I’m left with a clean, organized list to review. The goal is to automate part of my workflow and save time by eliminating manual sorting. I’m happy to hop on a call to walk you through exactly what I need pulled and how I narrow down my lists. If this is something you can help with, please send me a DM so we can jump on a call. I’m looking to get this started as soon as possible.

quasi anchorBOT

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

fiery orbitBOT

A project like this is doable, but the “gotchas” are almost always (1) per-county variability and (2) each site’s anti-bot measures and terms of service. If you want to get this moving fast with the right contractor, here’s a clean way to scope it so nobody’s guessing.

Key scoping questions (answer these and you’ll get accurate bids)

  1. Which counties + states (start with 1–2 to prove the pipeline)?
  2. For each county source (Master Commissioner / County Clerk / PVA):
    • Is it searchable HTML, PDF lists, image scans, or a paid portal?
    • Any CAPTCHA / login / rate limits?
  3. What’s your matching key across sources (parcel ID, address, owner name)?
  4. What are your filters/criteria? Examples: equity %, last sale date range, assessed value range, property class, delinquent taxes, excluded neighborhoods/ZIPs, etc.
  5. Desired output:
    • CSV / Excel / Google Sheet, or push into a CRM?
    • One row per parcel, with normalized columns?
  6. How often should it run: daily / weekly / monthly + “only new since last run”?
  7. Where to run it: your machine, a VPS, or a managed service? (This affects ongoing cost.)
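
To make question 4 concrete, here is a minimal sketch of how filter criteria could be written down as data and applied to one row per parcel. Everything in it is an assumption for illustration: the field names, the `Criteria` thresholds, and the choice to keep parcels with no recorded sale are placeholders, not your actual rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Parcel:
    # Hypothetical normalized row: one per parcel, merged from all sources.
    parcel_id: str
    address: str
    zip_code: str
    assessed_value: float
    last_sale_date: Optional[date]
    last_sale_price: Optional[float]


@dataclass
class Criteria:
    # All thresholds below are made-up placeholders.
    min_assessed_value: float = 50_000
    max_assessed_value: float = 400_000
    last_sale_before: date = date(2015, 1, 1)
    excluded_zips: frozenset = frozenset()


def passes(parcel: Parcel, c: Criteria) -> bool:
    """Return True if the parcel should stay on the review list."""
    if not (c.min_assessed_value <= parcel.assessed_value <= c.max_assessed_value):
        return False
    # Assumption: parcels with no recorded sale are kept for manual review.
    if parcel.last_sale_date is not None and parcel.last_sale_date >= c.last_sale_before:
        return False
    return parcel.zip_code not in c.excluded_zips
```

Writing the criteria down in this shape before the call lets a contractor bid against explicit rules instead of a verbal description.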

What a good implementation typically looks like

  • Config-driven connectors: each county is effectively a “plugin” because layouts differ.
  • Scrape layer: Playwright (browser automation) for tricky sites + plain HTTP requests with HTML parsing where the site allows it.
  • Normalize + dedupe: address normalization + parcel ID matching.
  • Storage: Postgres/SQLite to track history and avoid reprocessing.
  • Export: CSV/Sheets + a “review queue”.
  • Ops: logging, retries, change detection, and alerts when a county site changes.
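
As a rough illustration of the connector, normalize/dedupe, and storage pieces above, here is a small standard-library sketch. The `connector` registry, the `example_county` fetcher and its made-up records, the abbreviation table, and the `parcels` schema are all hypothetical; a real connector would wrap a Playwright or HTTP scraper, and real address normalization needs far more rules than this.

```python
import re
import sqlite3
from typing import Callable, Dict, Iterable, List

# Registry mapping a county name to a function that returns raw records.
CONNECTORS: Dict[str, Callable[[], Iterable[dict]]] = {}


def connector(county: str):
    """Register a per-county fetch function (hypothetical plugin hook)."""
    def register(fn):
        CONNECTORS[county] = fn
        return fn
    return register


@connector("example_county")
def fetch_example_county() -> List[dict]:
    # Stand-in for a real scraper; the records here are invented.
    return [
        {"parcel_id": "12-345 ", "address": "101  n. Main Street"},
        {"parcel_id": "12-345", "address": "101 N MAIN ST"},
    ]


# Deliberately tiny suffix table, for illustration only.
_ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "DRIVE": "DR"}


def normalize_address(raw: str) -> str:
    """Uppercase, strip punctuation, collapse whitespace, shorten suffixes."""
    tokens = re.sub(r"[^\w\s]", "", raw.upper()).split()
    return " ".join(_ABBREVIATIONS.get(t, t) for t in tokens)


def store_new(conn: sqlite3.Connection, county: str, records: Iterable[dict]) -> int:
    """Insert unseen parcels and return how many were new in this run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parcels ("
        "county TEXT, parcel_id TEXT, address TEXT, "
        "PRIMARY KEY (county, parcel_id))"
    )
    inserted = 0
    for rec in records:
        cur = conn.execute(
            "INSERT OR IGNORE INTO parcels VALUES (?, ?, ?)",
            (county, rec["parcel_id"].strip(), normalize_address(rec["address"])),
        )
        inserted += cur.rowcount  # 1 if inserted, 0 if already stored
    conn.commit()
    return inserted
```

Running `store_new` against the same county twice inserts nothing the second time, which is the “only new since last run” behavior; adding a county means registering one more function rather than touching the pipeline.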

Hiring reality (so expectations are set)

  • A contractor can usually deliver an MVP for 1–2 counties quickly, then each additional county is incremental work.
  • Ongoing maintenance is normal because county sites change. A monthly retainer model makes sense.

Contractor checklist (what to ask before you pick someone)

  • “Show me past work scraping government/public-record sites.”
  • Are they comfortable with Playwright and with working within anti-bot constraints (without doing anything sketchy)?
  • How they handle data quality (address normalization, fuzzy matching, audit trails).
  • Will they ship it as Docker + documented setup so you’re not locked in?

If you paste: