You will own the backend and rendering plane that powers our SEO optimization platform: headless browsing at scale, high-quality extraction/crawling, resilient APIs, and secure preview delivery.
You’ll turn complex crawling and AI workflows into clean, observable services with clear SLAs and cost controls, so mid-sized engineering teams can integrate via SDK or API in minutes.
What you’ll build
● Rendering Workers: Playwright/Puppeteer jobs that fetch the full DOM, handle lazy loading and retries with backoff, and emit stable HTML snapshots + signals (title/meta/H1/H2/links/JSON-LD).
● Crawl Orchestrator & Queues: Frontier management (dedupe/normalization), per-host concurrency, jittered retries (429/5xx), idempotent jobs (Redis Streams/SQS/PubSub).
● Core APIs: FastAPI/Node endpoints for /by-url/*, /crawls, /pages, /schema, /authors, /links, /gsc, /exports with OpenAPI, pagination, caching, and rate limits.
● AI Adapters: Cloud AI (OpenAI) integrations for page classification, schema generation, and author E-E-A-T enrichment.
● Preview & Page Serving: Secure “live preview” injector with <base> handling, CSP/nonce safety, and HMAC-signed integration for customer sites.
● Schema & Linking Services: Template/validator (schema.org, null-collapse), internal-link “4 power + 1 new” recommender, exports (CSV/NDJSON).
● User & API Access Plumbing (backend side): API key validation, scopes, quotas, usage metering hooks.
Responsibilities
● Design data models (pages, crawls, recommendations, schema, GSC aggregates) and ship migrations.
● Implement resilient workers (timeouts, retries, circuit breakers) and backpressure with clear SLOs (p95 render, error budget).
● Expose OpenAPI-first REST services with auth (OAuth2 client-creds/API keys), HMAC verification, CORS, and rate limiting.
● Optimize performance and cost: 72h HTML cache, AI digest caching, safe concurrency, minimal token usage.
● Build observability: OpenTelemetry metrics/traces/logs; dashboards for queue depth, render latency, and AI token spend.
● Partner with SaaS/frontend to wire Admin Console flows, previews, exports, and developer-friendly docs.
● Design for 95th percentile render time <4s and crawl concurrency scaling to 50+ parallel workers.
● Work closely with the SaaS/UI team to deliver APIs and SDK-ready outputs.
Minimum qualifications
● 5+ years building backend services in TypeScript/Node and/or Python/FastAPI.
● Hands-on production experience with Playwright or Puppeteer (navigation, waits, routing, auth, proxies).
● Strong with queues (Redis Streams/SQS/PubSub), idempotency patterns, and job orchestration.
● Solid SQL data modeling (Postgres/SQLite), indexing, and query tuning.
● API security: HMAC signing, OAuth2 client-creds, rate limiting, CORS, secrets management (KMS/SM).
● Swagger/OpenAPI, REST APIs, microservices; comfort with serverless AWS services where pragmatic.
Nice-to-haves
● React/TypeScript familiarity for small admin/preview tools (or to partner closely with the SaaS UI).
● SEO extraction: schema.org parsing/validation, JSON-LD generation.
● GSC data ingestion/aggregation; embeddings/NLP awareness for future link relevance.
● Edge/page serving, HAR/network introspection, web-layer security and CSP/nonce patterns.
● Redis (cache/queues), Docker/K8s, HPA on queue depth.
● Willingness to learn and integrate Cloud AI thoughtfully (compact digests, caching, guardrails).