gooseworks-ai / capabilities-linkedin-job-scraper

LinkedIn Scraper

This runbook finds LinkedIn job postings using the JobSpy Python library and a local `tools/jobspy_scraper.py` wrapper. It turns a requested role, location, company filter, recency window, and output target into a reproducible CSV export with posting metadata and direct URLs.

agent: codex · model: gpt-5.5 · snapshot: python312-uv · eval: programmatic · 7 steps · v1.0.0

Deploy LinkedIn Scraper to your jetty.io

A one-click install adds this runbook to a collection on your Jetty account. You can run it from the Spot dashboard, schedule it, or pipe inputs in via the API.


The shape of the run

7 steps · start to finish.

Step 1

Environment Setup

Install the runtime dependency and create the output directory.

mkdir -p /app/results
python3.12 -m pip install -U python-jobspy --break-system-packages

Verify the scraper wrapper is available. If it is missing, copy it from the source skill assets before running.

test -s tools/jobspy_scraper.py || cp skills/linkedin-job-scraper/scripts/jobspy_scraper.py tools/jobspy_scraper.py
test -s tools/jobspy_scraper.py
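
The same checks can also be expressed in Python, which makes the result easy to record in a report. This is a minimal sketch, not part of the original runbook; the function name `check_environment` and the returned keys are hypothetical. It only assumes that the `python-jobspy` package is importable as `jobspy`.

```python
import importlib.util
from pathlib import Path


def check_environment(wrapper="tools/jobspy_scraper.py", results_dir="/app/results"):
    """Return named boolean checks mirroring the shell commands above."""
    wrapper_path = Path(wrapper)
    return {
        # python-jobspy is imported as `jobspy`; find_spec returns None if absent.
        "jobspy_importable": importlib.util.find_spec("jobspy") is not None,
        # Same meaning as `test -s`: the file exists and is not empty.
        "wrapper_present": wrapper_path.is_file() and wrapper_path.stat().st_size > 0,
        # `mkdir -p` should already have created this directory.
        "results_dir_exists": Path(results_dir).is_dir(),
    }


if __name__ == "__main__":
    for name, passed in check_environment().items():
        print(f"{name}: {'ok' if passed else 'MISSING'}")
```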

Step 2

Resolve Search Inputs

Identify the search term, location, result count, recency, company IDs, description requirement, job type, and remote-only flag from the user request. If a request is underspecified, choose pragmatic defaults and record them in `/app/results/summary.md`.
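
One way to make the resolved inputs and their defaults explicit is a small dataclass. The sketch below is illustrative only: the class `SearchInputs`, the function `resolve_inputs`, and the field names are hypothetical, while the default values follow the Parameters section of this page.

```python
from dataclasses import dataclass, fields
from typing import Optional

JOB_TYPES = {"fulltime", "parttime", "contract", "internship"}


@dataclass
class SearchInputs:
    search_term: str
    location: Optional[str] = None
    results_wanted: int = 25
    hours_old: Optional[int] = None
    company_ids: Optional[str] = None      # comma-separated LinkedIn company IDs
    full_descriptions: bool = False
    job_type: Optional[str] = None         # None means "any"
    remote_only: bool = False
    output_csv: str = "/app/results/linkedin_jobs.csv"


def resolve_inputs(request: dict):
    """Return (inputs, notes); notes lists each default that was applied."""
    if not request.get("search_term"):
        raise ValueError("search_term is required")
    job_type = request.get("job_type")
    if job_type is not None and job_type not in JOB_TYPES:
        raise ValueError(f"unsupported job_type: {job_type!r}")
    names = [f.name for f in fields(SearchInputs)]
    given = {k: v for k, v in request.items() if k in names and v is not None}
    inputs = SearchInputs(**given)
    notes = [
        f"{name} defaulted to {getattr(inputs, name)!r}"
        for name in names
        if name not in given
    ]
    return inputs, notes
```

The `notes` list is meant to be copied into `/app/results/summary.md`, so that every default chosen for an underspecified request is visible to the reader.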

Step 3

Construct the Scraper Command

Build the command with only the filters that apply. Always write the CSV to `/app/results/linkedin_jobs.csv` unless the user requested a different file under `/app/results`.
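
The rule "only the filters that apply" can be sketched as a function that appends an argument only when its value is set. Every flag name below (`--search-term`, `--location`, and so on) is a hypothetical placeholder, since the wrapper's actual command-line interface is not shown here; check the argument parser in `tools/jobspy_scraper.py` before relying on any of them.

```python
from pathlib import PurePosixPath

RESULTS_DIR = PurePosixPath("/app/results")


def build_command(search_term, location=None, results_wanted=25, hours_old=None,
                  company_ids=None, full_descriptions=False, job_type=None,
                  remote_only=False, output_csv="/app/results/linkedin_jobs.csv"):
    """Build an argv list; all flag names are assumed, not taken from the wrapper."""
    out = PurePosixPath(output_csv)
    # Keep the output under /app/results and reject path traversal.
    if RESULTS_DIR not in out.parents or ".." in out.parts:
        raise ValueError(f"output must be under {RESULTS_DIR}: {output_csv}")
    cmd = [
        "python3.12", "tools/jobspy_scraper.py",
        "--search-term", search_term,            # hypothetical flag
        "--results-wanted", str(results_wanted),  # hypothetical flag
        "--output", str(out),                     # hypothetical flag
    ]
    if location:
        cmd += ["--location", location]           # hypothetical flag
    if hours_old is not None:
        cmd += ["--hours-old", str(hours_old)]    # hypothetical flag
    if company_ids:
        cmd += ["--company-ids", company_ids]     # hypothetical flag
    if job_type:
        cmd += ["--job-type", job_type]           # hypothetical flag
    if full_descriptions:
        cmd.append("--fetch-descriptions")        # hypothetical flag
    if remote_only:
        cmd.append("--remote-only")               # hypothetical flag
    return cmd
```

Building a list rather than a shell string avoids quoting problems when a search term or location contains spaces.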

Step 4

Run the Scraper

Execute the command and capture the terminal output for the summary.
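
A minimal sketch of running a command while capturing its terminal output, using only the standard library. The helper name `run_and_capture` and the 600-second timeout are assumptions for illustration; the command passed in would be whatever Step 3 produced.

```python
import subprocess
import sys


def run_and_capture(cmd, timeout=600):
    """Run cmd (an argv list) and return its exit code and captured output."""
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return {
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # Stand-in command so the sketch runs without the scraper installed.
    result = run_and_capture([sys.executable, "-c", "print('scraper placeholder')"])
    print(result["returncode"], result["stdout"].strip())
```

Keeping `stderr` separate from `stdout` makes it easier to quote error messages in the issues section of the summary.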

Step 5

Interpret Results

Read the CSV and summarize the outcome in `/app/results/summary.md`, covering the search parameters, result count, top matches, output path, and any issues.
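
The summary step might look like the sketch below, which reads the CSV with the standard `csv` module and writes a short Markdown file. The function name `summarize_csv`, the headings, and the column names `title`, `company`, and `job_url` are assumptions; adjust them to the columns actually present in the export.

```python
import csv
from pathlib import Path


def summarize_csv(csv_path, summary_path, search_params, top_n=5):
    """Write a Markdown summary of the CSV and return the number of rows."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))

    lines = ["# LinkedIn job search summary", "", "## Search parameters"]
    lines += [f"- {key}: {value}" for key, value in search_params.items()]
    lines += ["", "## Results",
              f"- Result count: {len(rows)}",
              f"- Output path: {csv_path}",
              "", "## Top matches"]
    for row in rows[:top_n]:
        # Column names are assumed; missing columns fall back to placeholders.
        title = row.get("title") or "(no title)"
        company = row.get("company") or "(no company)"
        lines.append(f"- {title} at {company}: {row.get('job_url') or ''}")
    if not rows:
        lines += ["", "## Issues", "- The CSV contains no rows."]

    Path(summary_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(rows)
```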

Step 6

Iterate on Errors (max 3 rounds)

If installation, scraper execution, or result validation fails, iterate for at most 3 rounds. After each fix, rerun the failed step and update `/app/results/validation_report.json`.
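
The bounded fix-and-rerun loop can be sketched as follows. This is one reading of the rule, in which the first attempt does not count as a round; the names `iterate_on_errors`, `step`, and `fix` are hypothetical, and the caller supplies both callables.

```python
def iterate_on_errors(step, fix, max_rounds=3):
    """Run step(); on failure apply fix() and rerun, for at most max_rounds rounds.

    step() returns (ok, detail). fix(detail) attempts a repair.
    Returns (ok, rounds), where rounds records each rerun for the report.
    """
    ok, detail = step()
    rounds = []
    while not ok and len(rounds) < max_rounds:
        fix(detail)
        ok, detail = step()  # rerun only the step that failed
        rounds.append({"round": len(rounds) + 1, "passed": ok, "detail": detail})
    return ok, rounds
```

The returned `rounds` list is meant to be merged into `/app/results/validation_report.json` after each rerun, so the report reflects the latest state even if the run stops early.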

Step 7

Write Validation Report

Write `/app/results/validation_report.json` as a structured report with stages, pass/fail results, and a final status.
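
A report with stages, pass/fail results, and a final status could be written as in the sketch below. The field names `stages`, `name`, `passed`, `message`, and `final_status` are assumptions chosen to match the description under Required outputs; they are not a confirmed schema.

```python
import json
from pathlib import Path


def write_validation_report(path, stages):
    """Write the report and return it.

    stages: list of dicts like {"name": str, "passed": bool, "message": str}.
    The final status is "pass" only if at least one stage ran and all passed.
    """
    all_passed = bool(stages) and all(stage["passed"] for stage in stages)
    report = {
        "stages": stages,
        "final_status": "pass" if all_passed else "fail",
    }
    Path(path).write_text(json.dumps(report, indent=2) + "\n", encoding="utf-8")
    return report
```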

Parameters

Results directory

default: `/app/results`

Directory where required output files are written.

Search term

required

Job title, role, company hiring keyword, or search phrase for LinkedIn postings.

Location

default: none

City, state, country, or `Remote`. Recommended unless using only remote filters.

Results wanted

default: `25`

Maximum number of jobs to fetch.

Recency

default: none

Optional `hours_old` filter for recent postings, such as the last 48 or 72 hours.

Company IDs

default: none

Optional comma-separated LinkedIn company IDs.

Full descriptions

default: `false`

Fetch full job descriptions when detailed content is needed; this is slower.

Job type

default: any

One of `fulltime`, `parttime`, `contract`, or `internship`.

Remote only

default: `false`

Restrict results to remote jobs.

Output CSV

default: `/app/results/linkedin_jobs.csv`

Path for the scraper CSV output.
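
For orientation, the parameters above line up closely with keyword arguments of JobSpy's `scrape_jobs` function. The sketch below builds such a keyword dictionary without importing the library. Whether `tools/jobspy_scraper.py` maps its inputs this way is an assumption, and the helper name `to_scrape_kwargs` is hypothetical.

```python
def to_scrape_kwargs(search_term, location=None, results_wanted=25, hours_old=None,
                     company_ids=None, full_descriptions=False, job_type=None,
                     remote_only=False):
    """Translate runbook parameters into keyword arguments for jobspy.scrape_jobs."""
    kwargs = {
        "site_name": ["linkedin"],
        "search_term": search_term,
        "results_wanted": results_wanted,
        "linkedin_fetch_description": full_descriptions,  # slower when True
        "is_remote": remote_only,
    }
    # Optional filters are added only when they were actually requested.
    if location:
        kwargs["location"] = location
    if hours_old is not None:
        kwargs["hours_old"] = int(hours_old)
    if company_ids:
        # "123, 456" -> [123, 456]
        kwargs["linkedin_company_ids"] = [
            int(part) for part in company_ids.split(",") if part.strip()
        ]
    if job_type:
        kwargs["job_type"] = job_type
    return kwargs
```

With the library installed, the result could be passed as `scrape_jobs(**kwargs)`, which returns a pandas DataFrame that can be saved with `to_csv(output_csv, index=False)`.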

Dependencies

Python 3.10+ · required · Runtime
python-jobspy · required · Python package
tools/jobspy_scraper.py · required · Local script
Network access · required · External
csv-readable output path · required · Filesystem

Required outputs

/app/results/linkedin_jobs.csv
CSV export from JobSpy containing LinkedIn job postings and metadata.
/app/results/summary.md
Human-readable summary of search parameters, result count, top matches, output path, and issues.
/app/results/validation_report.json
Structured validation report with stages, pass/fail results, and final status.
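
A final check that all three required files exist and are non-empty might look like this. It is a small sketch; the helper name `missing_outputs` is hypothetical, and a custom CSV name would need to be substituted in the list.

```python
from pathlib import Path

REQUIRED_OUTPUTS = ("linkedin_jobs.csv", "summary.md", "validation_report.json")


def missing_outputs(results_dir="/app/results"):
    """Return the required file names that are absent or empty."""
    base = Path(results_dir)
    missing = []
    for name in REQUIRED_OUTPUTS:
        path = base / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```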

Origin

source: raw.githubusercontent.com
title: LinkedIn Scraper
attr: high

