Larksong
Larksong is a community bird-sighting log founded by a small flock of watchers in the Hudson Valley. Members post species, location, and a photo — but the team won't allow direct image uploads, so every photo is fetched server-side from a URL the member supplies.
The Scenario
Larksong started in 2021 as a private mailing list among Hudson Valley birders and grew into a member-funded public log, with two naturalists, a part-time engineer, and a volunteer moderator running the site on $4 dues. After a copyright scare in their first year, the team banned direct photo uploads and built a "link your photo from elsewhere" form instead. The species pages got a separate image-search widget locked to their own domain; the sighting form got a quick safety check the engineer wrote between bird walks.
Challenge Intel
Synopsis
The app exposes two URL-fetching surfaces. The species-page "image search" enforces a strict allowlist (only `https://*.larksong.community/...`). The sighting submit form's "photo URL" field only runs a regex denylist against the raw URL text: `/(localhost|127\.0\.0\.1|0\.0\.0\.0|metadata)/i`. Any alternate loopback representation that doesn't contain those substrings (decimal IPv4, hex IPv4, IPv6) bypasses the check. The fetched body is rendered on the sighting detail page as a base64 data URL; when the response isn't an image, the raw bytes are shown in a fallback `<pre>`, which is what reflects the flag. An internal HTTP server runs on 127.0.0.1:9090 inside the container and serves the FLAG env var at GET /.
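The image-or-fallback rendering described above might look roughly like the sketch below. The function names, the use of the `Content-Type` header to decide what counts as an image, and the HTML escaping are all assumptions made for illustration; the brief does not show the real renderer.

```javascript
// Hypothetical helper: escape text before placing it inside HTML.
function escapeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return text.replace(/[&<>"']/g, (ch) => map[ch]);
}

// Hypothetical renderer: `body` is the fetched response body,
// `contentType` is assumed to come from the response headers.
function renderFetchedPhoto(body, contentType) {
  const bytes = Buffer.from(body);
  if (/^image\//i.test(contentType)) {
    // Image responses are inlined as a base64 data URL.
    const src = `data:${escapeHtml(contentType)};base64,${bytes.toString("base64")}`;
    return `<img src="${src}">`;
  }
  // Anything else falls back to a <pre>, so a non-image reply from an
  // internal service ends up readable on the page.
  return `<pre>${escapeHtml(bytes.toString("utf8"))}</pre>`;
}

console.log(renderFetchedPhoto("placeholder-flag-value", "text/plain"));
```

The fallback branch is the exfiltration channel: the server fetched the URL, and the page shows whatever came back.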
What It Is
A classic medium-difficulty SSRF with a substring-denylist filter bypass. The vulnerable code path is `if (/(localhost|127\.0\.0\.1|0\.0\.0\.0|metadata)/i.test(photoUrl)) reject; else fetch(photoUrl)`. The regex is a substring match against the raw URL string. Many alternate IPv4/IPv6 representations resolve to a loopback address but contain none of those substrings: for example decimal `2130706433`, hex `0x7f000001`, IPv6 `[::1]`, and IPv4-mapped IPv6 `[::ffff:7f00:1]`. Using any of those in `http://<host>:9090/` bypasses the filter, and the internal flag server's response is rendered on the sighting detail page. A secondary "search images by URL" form on the species page looks injectable but enforces a strict allowlist (`https://*.larksong.community`); it is the intentional dead end, and players should map both surfaces before exploiting.
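To see why the substring check fails, the quoted regex can be run against a handful of candidate URLs. This is a minimal sketch: `passesDenylist` is a hypothetical name standing in for the form's "reject, else fetch" branch, and nothing is fetched here.

```javascript
// The denylist regex quoted in this brief, tested against the raw URL text.
const DENYLIST = /(localhost|127\.0\.0\.1|0\.0\.0\.0|metadata)/i;

// Hypothetical helper: true means the form would go on to fetch the URL.
function passesDenylist(photoUrl) {
  return !DENYLIST.test(photoUrl);
}

const candidates = [
  "http://localhost:9090/",        // contains "localhost"
  "http://127.0.0.1:9090/",        // contains "127.0.0.1"
  "http://2130706433:9090/",       // decimal IPv4
  "http://0x7f000001:9090/",       // hex IPv4
  "http://[::1]:9090/",            // IPv6 loopback
  "http://[::ffff:7f00:1]:9090/",  // IPv4-mapped IPv6
];

for (const url of candidates) {
  console.log(passesDenylist(url) ? "allowed " : "rejected", url);
}
```

Only the first two candidates are rejected; the other four contain none of the denied substrings and pass straight through to the fetch.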
Who It's For
Players who have done one or two textbook SSRFs already and are ready to deal with a partial mitigation. Assumes familiarity with alternate IP representations and the difference between substring matching and exact-match validation. No filter-bypass tooling is required; pen-and-paper IP arithmetic is enough.
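The pen-and-paper arithmetic treats the four octets as base-256 digits: 127 × 256³ + 0 × 256² + 0 × 256 + 1 = 2130706433, which is 0x7f000001 in hex. The sketch below does the same calculation; `ipv4ToInteger` is a hypothetical helper name used only for this illustration.

```javascript
// Convert a dotted-quad IPv4 address to its single-integer value.
function ipv4ToInteger(dotted) {
  const parts = dotted.split(".");
  const valid =
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) {
    throw new RangeError(`not a dotted-quad IPv4 address: ${dotted}`);
  }
  // Multiply rather than bit-shift so large addresses stay positive.
  return parts.reduce((acc, part) => acc * 256 + Number(part), 0);
}

const value = ipv4ToInteger("127.0.0.1");
console.log(value);                      // 2130706433
console.log("0x" + value.toString(16));  // 0x7f000001
```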
Skills You'll Practice
- Recognising hostname-denylist filters as bypassable
- Enumerating alternate IPv4/IPv6 loopback representations (decimal, hex, padded, partial)
- Distinguishing strict allowlist vs partial denylist mitigations
- Mapping multiple URL-fetching surfaces before exploiting
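For contrast with the denylist, a strict allowlist decides on the parsed URL rather than on substrings of the raw text. The sketch below is one way the species-page rule `https://*.larksong.community` could be enforced; the function name and the exact checks are assumptions, since the brief does not show the widget's code.

```javascript
// Hypothetical allowlist check: accept only https URLs whose parsed
// hostname is a subdomain of larksong.community.
function isAllowedImageUrl(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl); // URL is a global in Node.js
  } catch {
    return false; // unparseable input is rejected outright
  }
  return (
    parsed.protocol === "https:" &&
    parsed.hostname.endsWith(".larksong.community")
  );
}

console.log(isAllowedImageUrl("https://img.larksong.community/heron.jpg"));  // true
console.log(isAllowedImageUrl("http://2130706433:9090/"));                   // false
console.log(isAllowedImageUrl("https://larksong.community.evil.example/"));  // false
```

Because the decision is made on the parsed scheme and hostname, respelling the loopback address does not help: anything that is not explicitly permitted is refused.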
What You'll Gain
- Why denylist-based SSRF defences are almost always insufficient
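- How Node's WHATWG `URL` parser normalizes alternate IPv4 forms in `hostname` (decimal `2130706433` becomes `127.0.0.1`), which a check on the raw URL string never sees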
- Why a hardened, obvious-looking injection point is a cue to look for a less-guarded sibling feature
- Server-side response reflection as an SSRF exfil channel
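To check what Node's WHATWG `URL` class reports in `hostname` for the alternate forms, a short script is enough. `parsedHostname` is a hypothetical helper name; the script only parses strings and makes no network requests.

```javascript
// Hypothetical helper: return the hostname as the WHATWG URL parser sees it.
function parsedHostname(rawUrl) {
  return new URL(rawUrl).hostname; // URL is a global in Node.js
}

console.log(parsedHostname("http://2130706433:9090/"));  // 127.0.0.1
console.log(parsedHostname("http://0x7f000001:9090/"));  // 127.0.0.1
console.log(parsedHostname("http://127.1:9090/"));       // 127.0.0.1
console.log(parsedHostname("http://[::1]:9090/"));       // [::1]
```

The raw text and the parsed hostname differ for every IPv4 spelling here, so a filter that inspects the raw string is judging a different value from the one the fetch ultimately connects to.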