Liepin Agentic Sourcing v2
Overview
Automated candidate sourcing pipeline for Liepin (猎聘). It submits a background job that searches for, evaluates, and contacts candidates, combining browser automation with LLM evaluation.
Prerequisites
- Environment variables (in .env):
  - LIEPIN_USERNAME / LIEPIN_PASSWORD
  - JD_WORKER_LLM_PROVIDER, JD_WORKER_LLM_MODEL, JD_WORKER_LLM_API_KEY, JD_WORKER_LLM_BASE_URL
  - RESUME_WORKER_LLM_PROVIDER, RESUME_WORKER_LLM_MODEL, RESUME_WORKER_LLM_API_KEY, RESUME_WORKER_LLM_BASE_URL
- Active Liepin session (must be saved before first use):
  cd skills/liepin-agentic-sourcing
  python3 scripts/login_and_save_session.py
  python3 scripts/session_persistence_probe.py
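A quick pre-flight check can catch a missing variable before any browser work starts. The following is a minimal sketch and not part of the skill: the helper name `missing_env_vars` is hypothetical, it assumes all ten variables listed above are mandatory, and it assumes they have already been loaded into the process environment.

```python
import os

# Variable names copied from the prerequisites list; treating every one of
# them as mandatory is an assumption.
REQUIRED_ENV_VARS = [
    "LIEPIN_USERNAME",
    "LIEPIN_PASSWORD",
    *[
        f"{worker}_WORKER_LLM_{suffix}"
        for worker in ("JD", "RESUME")
        for suffix in ("PROVIDER", "MODEL", "API_KEY", "BASE_URL")
    ],
]


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no argument inspects `os.environ`; passing a plain dict makes the check easy to unit-test.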
How to Run
Step 1: Prepare criteria file
Create a JSON file (e.g. criteria.json) with all required fields:
{
  "position_name": "职位名称",
  "target_count": 3,
  "position_scope": "完整的职位描述/JD文本",
  "cities": ["上海", "深圳"],
  "seniority": "5-10年",
  "must_have_signals": ["关键词1", "关键词2"],
  "preferred_signals": ["加分项1"],
  "hard_rejects": ["排除项1"]
}
| Field | Required | Description |
|-------|----------|-------------|
| position_name | ✅ | Target position name |
| target_count | ✅ | Number of candidates to communicate with |
| position_scope | ✅ | Full JD text, used by JD-worker to generate search keywords |
| cities | ✅ | City filter list |
| seniority | ✅ | Experience band (e.g. "5-10年") |
| must_have_signals | ✅ | Hard requirement keywords |
| preferred_signals | ❌ | Nice-to-have keywords |
| hard_rejects | ❌ | Instant rejection keywords |
⚠️ City and seniority must be explicitly provided. Do not let JD-worker guess them.
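To illustrate the field rules in the table, here is a minimal validation sketch. It is not the check performed by `--dry-run`; the function names are hypothetical, and the expected types are inferred from the example file rather than taken from the skill's code.

```python
import json

# Types inferred from the example criteria file (an assumption).
REQUIRED_FIELDS = {
    "position_name": str,
    "target_count": int,
    "position_scope": str,
    "cities": list,
    "seniority": str,
    "must_have_signals": list,
}
OPTIONAL_FIELDS = {"preferred_signals": list, "hard_rejects": list}


def load_criteria(path):
    """Read a criteria file as UTF-8 JSON."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def validate_criteria(criteria):
    """Return a list of problems; an empty list means the criteria look valid."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        value = criteria.get(field)
        if value is None or value == "" or value == []:
            problems.append(f"missing required field: {field}")
        elif isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{field} should be {expected.__name__}")
    for field, expected in OPTIONAL_FIELDS.items():
        if field in criteria and not isinstance(criteria[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    count = criteria.get("target_count")
    if isinstance(count, int) and not isinstance(count, bool) and count < 1:
        problems.append("target_count should be at least 1")
    return problems
```

Because empty `cities` and `seniority` values are reported as missing, this sketch enforces the rule that both must be supplied explicitly.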
Step 2: Submit the job
cd skills/liepin-agentic-sourcing
python3 scripts/jd_search_and_communicate.py --criteria-file path/to/criteria.json
Dry-run (validate criteria without launching):
python3 scripts/jd_search_and_communicate.py --criteria-file path/to/criteria.json --dry-run
Output: Started Liepin runtime job: <job_id>
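If the job is submitted from another script, the job id can be pulled out of that output line. The sketch below is illustrative only: `extract_job_id` and `submit_job` are hypothetical helpers, the job id format is not documented here (the pattern simply takes the non-whitespace text after the prefix), and it assumes the line is printed to stdout.

```python
import re
import subprocess

# Assumption: the job id is the first run of non-whitespace text after the prefix.
JOB_LINE = re.compile(r"Started Liepin runtime job:\s*(\S+)")


def extract_job_id(output):
    """Return the job id found in the script output, or None if the line is absent."""
    match = JOB_LINE.search(output)
    return match.group(1) if match else None


def submit_job(criteria_path, skill_dir="skills/liepin-agentic-sourcing"):
    """Run the submission script and return the parsed job id (hypothetical wrapper)."""
    result = subprocess.run(
        ["python3", "scripts/jd_search_and_communicate.py",
         "--criteria-file", criteria_path],
        cwd=skill_dir,          # criteria_path is resolved relative to this directory
        capture_output=True,
        text=True,
        check=True,             # raise CalledProcessError on a non-zero exit code
    )
    # Assumption: the line goes to stdout; stderr is checked as a fallback.
    return extract_job_id(result.stdout) or extract_job_id(result.stderr)
```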
Architecture
┌─────────────┐
│  Launcher   │── starts 4 workers as subprocesses
└──────┬──────┘
       │
  ┌────┴────────────────────────────┐
  │                                 │
  ▼                                 ▼
┌───────────┐  search plan  ┌────────────────┐
│ JD-Worker │──────────────▶│ Browser-Worker │
│   (LLM)   │               │  (Playwright)  │
└───────────┘               └───────┬────────┘
                                    │ candidate data
                                    ▼
┌───────────────┐ decisions ┌────────────────┐
│ Resume-Worker │──────────▶│  Orchestrator  │
│     (LLM)     │           │                │
└───────────────┘           └───────┬────────┘
                                    │ replan signal
                                    ▼
                              ┌───────────┐
                              │ JD-Worker │
                              │  (widen/  │
                              │ tighten)  │
                              └───────────┘
Worker responsibilities
| Worker | Role |
|--------|------|
| JD-Worker | Reads JD → generates 5-group keyword search plan via LLM. Supports initial, widen, tighten modes. |
| Browser-Worker | Logs into Liepin, executes searches, collects candidate previews, harvests detail pages, clicks "立即沟通". Uses a single shared session per run — no repeated login/page navigation. |
| Resume-Worker | Evaluates candidates via LLM. Preview → open_detail or skip. Detail → link or skip. Promotes link directly to pending_communication. |
| Orchestrator | Coordinates workers. Decides when to request widen (pool drained, target unmet) or tighten (results > 200, abandon current work immediately). |
Candidate lifecycle
pending_preview_eval → completed(skip) / completed(open_detail)
completed(open_detail) → pending_detail_harvest → detail_harvested → completed(skip) / pending_communication
pending_communication → completed(link)
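The lifecycle can be read as a small state machine. The sketch below is one way to encode it and is not the runtime's actual implementation: the event names `harvested` and `communicated` are hypothetical labels for the harvest and "立即沟通" steps, and the `completed(open_detail)` preview outcome is folded into the move to `pending_detail_harvest`.

```python
# Allowed transitions keyed by (current state, decision or event).
# "skip", "open_detail" and "link" are the Resume-Worker decisions named in
# this document; "harvested" and "communicated" are hypothetical event names.
TRANSITIONS = {
    ("pending_preview_eval", "skip"): "completed(skip)",
    ("pending_preview_eval", "open_detail"): "pending_detail_harvest",
    ("pending_detail_harvest", "harvested"): "detail_harvested",
    ("detail_harvested", "skip"): "completed(skip)",
    ("detail_harvested", "link"): "pending_communication",
    ("pending_communication", "communicated"): "completed(link)",
}


def advance(state, event):
    """Return the next state, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state} + {event}") from None
```

Keeping the table explicit makes it easy to reject impossible moves, such as a `link` decision on a candidate whose detail page has not been harvested.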
Adaptive search
- result_count > 200: immediately tighten (abandons pending work, requests a narrower plan)
- result_count ≤ 200: drains the current pool first, then widen if the target is not reached
- Max 5 search plan iterations per job
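The rules above amount to a small decision function. This is a sketch under stated assumptions, not the Orchestrator's code: the function name, its parameters, and the `"continue"`, `"done"`, and `"stop"` outcomes are hypothetical, `iteration` is taken to be the number of search plans issued so far, and stopping once the fifth plan is exhausted is an assumed reading of the iteration cap.

```python
TIGHTEN_THRESHOLD = 200   # result_count above this triggers tighten
MAX_PLAN_ITERATIONS = 5   # maximum search plans per job


def next_action(result_count, pool_drained, communicated, target_count, iteration):
    """Decide what to do after a search plan has produced result_count results."""
    if communicated >= target_count:
        return "done"
    # A manageable result set is worked through before any replanning.
    if result_count <= TIGHTEN_THRESHOLD and not pool_drained:
        return "continue"
    # Any replan would exceed the per-job cap (assumed behaviour: stop).
    if iteration >= MAX_PLAN_ITERATIONS:
        return "stop"
    # Too many results: abandon pending work and narrow. Otherwise widen.
    return "tighten" if result_count > TIGHTEN_THRESHOLD else "widen"
```

For example, `next_action(350, False, 0, 3, 1)` returns `"tighten"` even though the pool has not been drained, which mirrors the "abandon current work immediately" rule.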
Browser session model
- Preview collection: standalone session, up to 5 pages (100 candidates)
- Detail + Communication: shared session per page
  - Navigate to target page once
  - Click card → harvest detail → if link: click "立即沟通" in same modal → close → next
  - No repeated login or page navigation within a page
Known Page Quirks
AI 帮搜 Wizard Overlay
What it is: Liepin displays a full-screen wizard modal (step 1/3) called "AI 帮搜" on first load of the search page. It renders as <div class="overlay--fSOmM" style="z-index: 1000"> with an inner ✕ button. This overlay intercepts ALL pointer events and blocks any click on the search form.
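How to detect: await page.locator('[class*="overlay--"]').first.is_visible() returns True. Note that is_visible() reports the current state and does not wait for the overlay to appear.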
How to close: Press Escape, or click the inner ✕ button with force=True:
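import asyncio

overlay = page.locator('[class*="overlay--"]').first
# is_visible() returns immediately; Playwright ignores its timeout argument
if await overlay.is_visible():
    close_btn = overlay.locator('text=✕').first
    # force=True skips actionability checks so the overlay cannot intercept the click
    await close_btn.click(force=True, timeout=2000)
    await asyncio.sleep(1)  # brief pause before touching the search form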
Do not use a plain locator('text=✕').click() — the overlay div sits above the inner content and will intercept the click unless force=True is used on the button inside the overlay.
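Both closing options can be combined into one helper that tries Escape first and falls back to the forced click. This is a sketch only: `dismiss_ai_wizard` and `settle_seconds` are hypothetical names, `page` is assumed to be a Playwright async `Page`, and the one-second settle time is carried over from the snippet above rather than measured.

```python
import asyncio

OVERLAY_SELECTOR = '[class*="overlay--"]'


async def dismiss_ai_wizard(page, settle_seconds=1.0):
    """Close the AI 帮搜 overlay if present; return True when no overlay remains."""
    overlay = page.locator(OVERLAY_SELECTOR).first
    if not await overlay.is_visible():
        return True  # nothing to dismiss
    await page.keyboard.press("Escape")
    await asyncio.sleep(settle_seconds)
    if await overlay.is_visible():
        # Escape did not close it: force-click the ✕ button inside the overlay.
        await overlay.locator("text=✕").first.click(force=True, timeout=2000)
        await asyncio.sleep(settle_seconds)
    return not await overlay.is_visible()
```

Returning a boolean lets the caller decide whether to retry, take a screenshot, or abort before interacting with the search form.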
Compound Keyword Input (Verified Flow)
- Match active input by placeholder: 可填写多个or关系的关键词,回车分隔
- Do not index by inputs[i]; always use the current .first matching input
- Group 1 / 2: directly fill the current first active input
- Group 3 / 4 / 5: click 增加要求 once, then fill the current first active input
- Within the same group: click once, then type(keyword) → Enter repeatedly; do not click again between keywords
- When debugging, screenshot after each filled group to preserve evidence before any later timeout
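The flow above can be sketched as a single coroutine. This is an illustration rather than the Browser-Worker's code: `fill_keyword_groups` is a hypothetical name, `page` is assumed to be a Playwright async `Page`, and locating the 增加要求 control by its visible text is an assumption about the page markup.

```python
KEYWORD_PLACEHOLDER = "可填写多个or关系的关键词,回车分隔"
ADD_REQUIREMENT_TEXT = "增加要求"  # assumption: the control is findable by its text


async def fill_keyword_groups(page, groups):
    """Fill up to five keyword groups; each group is a list of OR-ed keywords."""
    for index, keywords in enumerate(groups[:5], start=1):
        if index >= 3:
            # Groups 3-5 need a new input row before they can be filled.
            await page.get_by_text(ADD_REQUIREMENT_TEXT).first.click()
        # Re-resolve the locator for every group instead of indexing a cached list.
        active_input = page.get_by_placeholder(KEYWORD_PLACEHOLDER).first
        await active_input.click()  # click once per group, not per keyword
        for keyword in keywords:
            await page.keyboard.type(keyword)
            await page.keyboard.press("Enter")
```

A debugging variant could call `await page.screenshot(path=...)` after each group, matching the last item in the list above.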
Key Paths
| Path | Purpose |
|------|---------|
| src/liepin_skill/runtime/ | Workers, orchestrator, DB, launcher |
| src/liepin_skill/browser/ | Playwright session + page objects |
| scripts/ | Login, probe, job submission |
| runtime/ | Runtime state (state.db, job logs, browser storage) |
| .env | Credentials and LLM config |
Testing
cd skills/liepin-agentic-sourcing
pytest tests/ -q