gooseworks-ai / capabilities-job-scraper

Job Scraper

Search LinkedIn and Indeed job postings through Apify and return structured job market data. Use when users want to find open roles, monitor hiring signals, identify companies hiring for specific positions, or research competito

agent: codex · model: gpt-5.5 · snapshot: python312-uv · eval: programmatic · 7 steps · v1.0.0

Deploy Job Scraper to your jetty.io account

A one-click install adds this runbook to a collection on your Jetty account. You can then run it from the Spot dashboard, schedule it, or pipe inputs in via the API.

The shape of the run

7 steps · start to finish.

  1. Environment Setup

    1. Verify /app/results exists and create it if needed.
    2. Verify APIFY_API_TOKEN is present in the environment.
    3. Install Python dependencies if they are missing:
    pip install requests pyyaml
    mkdir -p /app/results
    test -n "$APIFY_API_TOKEN"
    

    If setup fails, write `/app/results/validation_report.json` with `overall_passed=false`, include the failing stage, and stop.
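The same checks could also be expressed in Python, which makes it easier to record the failing stage. This is a minimal sketch, not the runbook's own implementation: the function names, the stage labels, and the `failed_stage` key are assumptions; only `/app/results`, `APIFY_API_TOKEN`, `validation_report.json`, and `overall_passed` come from the step above.

```python
import importlib.util
import json
import os
from pathlib import Path


def check_environment(results_dir="/app/results", env=None):
    """Return (passed, failing_stage). Stage labels are hypothetical."""
    env = os.environ if env is None else env
    try:
        # Equivalent of `mkdir -p`: create the directory only if needed.
        Path(results_dir).mkdir(parents=True, exist_ok=True)
    except OSError:
        return False, "results_dir"
    # Equivalent of `test -n "$APIFY_API_TOKEN"`.
    if not env.get("APIFY_API_TOKEN"):
        return False, "apify_token"
    # The pyyaml package is imported under the module name "yaml".
    for module in ("requests", "yaml"):
        if importlib.util.find_spec(module) is None:
            return False, f"missing_dependency:{module}"
    return True, None


def write_setup_failure(results_dir, stage):
    """Write a failing validation report and return its path."""
    report = {"overall_passed": False, "failed_stage": stage}  # key name assumed
    path = Path(results_dir) / "validation_report.json"
    path.write_text(json.dumps(report, indent=2))
    return path
```

A caller would run `check_environment()` first and, if it returns `False`, call `write_setup_failure` with the returned stage before stopping.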

  2. Interpret the Search Request

    Map the user's request into a concrete job-search plan:
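As an illustration only, a plan of this kind might be built and saved as follows. The page does not show the plan's actual fields, so every key here (`query`, `location`, `platforms`) is hypothetical, as are the default of 50 results and the choice to store `search_plan.json` under `/app/results`; only the file name and `max_results` appear elsewhere in the runbook.

```python
import json
from pathlib import Path

# The two platforms named in the runbook description.
SUPPORTED_PLATFORMS = ("linkedin", "indeed")


def build_search_plan(query, location=None, platforms=SUPPORTED_PLATFORMS, max_results=50):
    """Turn loose request parameters into a validated plan dict (hypothetical shape)."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    # Silently drop platforms this runbook does not cover.
    selected = [p for p in platforms if p in SUPPORTED_PLATFORMS]
    if not selected:
        raise ValueError("no supported platform selected")
    return {
        "query": query,
        "location": location,
        "platforms": selected,
        "max_results": max_results,
    }


def save_search_plan(plan, results_dir="/app/results"):
    """Persist the plan as search_plan.json (location assumed) and return the path."""
    path = Path(results_dir) / "search_plan.json"
    path.write_text(json.dumps(plan, indent=2))
    return path
```

For example, `build_search_plan("data engineer", "Berlin", ["indeed"], 20)` yields a plan that selects only Indeed and caps the run at 20 results.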

  3. Run Apify Actors

    Use `automation-lab/linkedin-jobs-scraper` for LinkedIn and `borderline/indeed-scraper` for Indeed. Start only the actors selected in `search_plan.json`, pass the query and location filters, and cap each actor so the combined normalized output respects `max_results`.
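One way to cap each actor is to divide `max_results` across the selected platforms before starting any run. The sketch below assumes an even split and calls Apify's synchronous "run and get dataset items" REST endpoint with the standard library; the helper names are hypothetical. The actor input keys are deliberately left to the caller, because each actor defines its own input schema and the page does not show it.

```python
import json
import urllib.request

# Actor names as given in the runbook step.
ACTORS = {
    "linkedin": "automation-lab/linkedin-jobs-scraper",
    "indeed": "borderline/indeed-scraper",
}


def split_cap(max_results, platforms):
    """Divide max_results across platforms; earlier platforms absorb the remainder."""
    base, extra = divmod(max_results, len(platforms))
    return {p: base + (1 if i < extra else 0) for i, p in enumerate(platforms)}


def actor_run_url(actor_name):
    """Build the sync-run URL; the REST API writes actor names as user~actor."""
    actor_id = actor_name.replace("/", "~")
    return f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"


def run_actor(actor_name, actor_input, token, timeout=300):
    """POST the actor input and return the dataset items as parsed JSON."""
    request = urllib.request.Request(
        actor_run_url(actor_name),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With `max_results` of 25 and both platforms selected, `split_cap` assigns 13 results to LinkedIn and 12 to Indeed, so the combined output cannot exceed the cap.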

  4. Normalize Job Records

    Transform all source records into this schema:
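The schema itself is not reproduced on this page, so the following is only a sketch of the normalization technique, not the runbook's schema. The target fields (`title`, `company`, `location`, `url`, `source`) and the candidate raw keys are all hypothetical; real actor output may use different names.

```python
# Hypothetical mapping: target field -> raw keys to try, in order.
FIELD_CANDIDATES = {
    "title": ("title", "jobTitle", "positionName"),
    "company": ("company", "companyName"),
    "location": ("location", "jobLocation"),
    "url": ("url", "jobUrl", "link"),
}


def first_present(record, keys):
    """Return the first non-empty value found under any of the given keys."""
    for key in keys:
        value = record.get(key)
        if value not in (None, ""):
            return value
    return None


def normalize_record(raw, source):
    """Map one raw actor record onto the hypothetical common schema."""
    normalized = {
        field: first_present(raw, keys) for field, keys in FIELD_CANDIDATES.items()
    }
    normalized["source"] = source  # which platform the record came from
    return normalized


def normalize_all(records_by_source, max_results):
    """Normalize every record, stopping once max_results is reached."""
    normalized = []
    for source, records in records_by_source.items():
        for raw in records:
            if len(normalized) >= max_results:
                return normalized
            normalized.append(normalize_record(raw, source))
    return normalized
```

Missing fields become `None` rather than being dropped, so every normalized record has the same keys regardless of which platform produced it.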

  5. Analyze Hiring Signals

    Summarize the strongest patterns in `/app/results/summary.md`: companies hiring most actively, recurring role families, locations, salary ranges when available, and caveats about platform coverage or actor failures.
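A summary along these lines could be produced by counting field values across the normalized records. This sketch assumes each record is a dict with hypothetical `company` and `location` keys, and it covers only two of the patterns listed above; the headings and function names are illustrative.

```python
from collections import Counter


def top_counts(jobs, field, limit=5):
    """Count non-empty values of one field and return the most common ones."""
    counts = Counter(job[field] for job in jobs if job.get(field))
    return counts.most_common(limit)


def render_summary(jobs, failures=()):
    """Build Markdown text for summary.md from normalized job records."""
    lines = ["# Hiring Signals", "", f"Total postings analyzed: {len(jobs)}", ""]
    sections = (("Most active companies", "company"), ("Top locations", "location"))
    for heading, field in sections:
        lines.append(f"## {heading}")
        lines.extend(f"- {name}: {count}" for name, count in top_counts(jobs, field))
        lines.append("")
    if failures:
        # Surface actor failures or coverage gaps so readers can weigh the numbers.
        lines.append("## Caveats")
        lines.extend(f"- {note}" for note in failures)
        lines.append("")
    return "\n".join(lines)
```

The returned string can be written to `/app/results/summary.md`; passing a note such as an actor failure in `failures` adds a caveats section.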

  6. Validate Outputs

    Programmatically verify:
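The list of checks is not shown on this page, so the sketch below uses three illustrative checks rather than the runbook's own. The check names, the `source` record field, and the `checks` and `failed_stage` report keys are assumptions; `validation_report.json`, `summary.md`, `max_results`, and `overall_passed` come from the other steps.

```python
import json
from pathlib import Path


def validate_outputs(jobs, max_results, results_dir="/app/results"):
    """Run illustrative checks and write validation_report.json."""
    results = Path(results_dir)
    checks = {
        "summary_exists": (results / "summary.md").is_file(),
        "within_max_results": len(jobs) <= max_results,
        "records_have_source": all(job.get("source") for job in jobs),
    }
    failed = [name for name, ok in checks.items() if not ok]
    report = {
        "overall_passed": not failed,
        "checks": checks,
        # Record the first failing check so a later step can target it.
        "failed_stage": failed[0] if failed else None,
    }
    (results / "validation_report.json").write_text(json.dumps(report, indent=2))
    return report
```

Because the report names the first failing check, a follow-up step can rerun only the part of the pipeline that produced the failure.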

  7. Iterate on Errors (max 3 rounds)

    If validation fails, inspect the first failing stage, apply the targeted fix, and rerun only the affected step. Stop after at most 3 rounds; if the same failure remains, leave `overall_passed=false`.
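The bounded retry described here can be sketched as a small loop. This assumes a hypothetical `validate` callable that returns a report dict with `overall_passed` and `failed_stage` keys, and a hypothetical `fixes` mapping from stage name to a function that repairs and reruns that stage.

```python
MAX_ROUNDS = 3  # upper bound taken from the step title


def iterate_on_errors(validate, fixes, max_rounds=MAX_ROUNDS):
    """Validate, fix the first failing stage, and re-validate up to max_rounds times."""
    report = validate()
    rounds = 0
    while not report["overall_passed"] and rounds < max_rounds:
        fix = fixes.get(report["failed_stage"])
        if fix is None:
            # No targeted fix is known for this stage, so stop early.
            break
        fix()  # repairs and reruns only the affected step
        rounds += 1
        report = validate()
    return report, rounds
```

If the same failure persists after the last round, the final report still carries `overall_passed` set to false, which matches the stopping rule above.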