The Open Licensing Standard for AI Crawlers: giving publishers a voice and AI a smarter path forward, beyond scraping vs. paywalls to balanced collaboration. https://peekthenpay.org/#how-it-works Expect one of these new standards per month for the coming months.
How AI internet scraping is evolving; techniques currently in use:
– Direct HTTP crawlers (traditional crawlers): GPTBot, ClaudeBot, Meta-ExternalAgent, Google-Extended, Bytespider, Amazonbot, Applebot-Extended, CCBot
– Cloud browser infrastructure (Browser-as-a-Service): Browserbase, Hyperbrowser
– Web scraping & data extraction platforms: Firecrawl, Apify, Zyte
– Browser-driven web agents: Comet (Perplexity), Dia (The Browser Company)
– Real-time fetchers (on-demand): ChatGPT-User, OAI-SearchBot, Claude-User, Perplexity-User
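To make the categories above concrete, here is a minimal sketch of how a site could bucket incoming requests by User-Agent. The function name and the returned labels are my own assumptions, not part of any standard. Only the two token-based categories can be matched this way: cloud browsers and browser-driven agents usually present an ordinary browser User-Agent, and some tokens (Google-Extended, as far as I know) are robots.txt controls rather than header strings.

```python
# Tokens taken from the list above; grouping them into sets is an assumption.
DIRECT_CRAWLERS = {
    "GPTBot", "ClaudeBot", "Meta-ExternalAgent", "Google-Extended",
    "Bytespider", "Amazonbot", "Applebot-Extended", "CCBot",
}
REALTIME_FETCHERS = {
    "ChatGPT-User", "OAI-SearchBot", "Claude-User", "Perplexity-User",
}


def classify_user_agent(user_agent: str) -> str:
    """Return a coarse category for a User-Agent header (hypothetical labels)."""
    ua = user_agent.lower()
    # Check on-demand fetchers first, then bulk crawlers.
    if any(token.lower() in ua for token in REALTIME_FETCHERS):
        return "real-time fetcher"
    if any(token.lower() in ua for token in DIRECT_CRAWLERS):
        return "direct crawler"
    # Browser-based agents typically end up here: they look like normal browsers.
    return "unknown"


if __name__ == "__main__":
    print(classify_user_agent("Mozilla/5.0 (compatible; GPTBot/1.0)"))
```

User-Agent strings are trivially spoofed, so this is only a first filter, not a way to verify who is actually crawling.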
Create content and audiences; provide advertising inventory (impressions)
Goal: Maximize revenue per page view (RPM/CPM); balance UX with monetization
Top 5 Companies: Google (YouTube), Meta, Amazon, News Corp, Condé Nast
Business Model: Ad revenue (CPM/CPC/CPA share), subscriptions, hybrid. Typically keep 60-80% of programmatic revenue after tech fees
Software Components: CMS (WordPress, Drupal), audience development tools, DMP/CDP for 1st party data, consent management
2. PUBLISHER AD SERVER
Attribute: Details
Actor: Publisher-Side Ad Server
Role: Manage, prioritize, and deliver ads across direct-sold and programmatic demand sources
Goal: Maximize yield by selecting the highest-paying ad for each impression; enforce business rules
Top 5 Companies: Google Ad Manager (GAM), Xandr Monetize, Kevel (self-serve), Equativ (formerly Smart AdServer)
Business Model: CPM-based serving fees (GAM free tier, then volume-based); SaaS subscription for smaller players
Software Components: Commercial: Google Ad Manager, Xandr, Equativ (formerly Smart AdServer). Open Source: Revive Adserver (legacy). Modules: Trafficking UI, ad decisioning engine, forecasting, reporting, unified auction
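The "ad decisioning engine" goal above (pick the highest-paying ad, subject to business rules) can be sketched in a few lines. This is a toy illustration, not how any of the listed products work: the `Candidate` type, the floor price, and the advertiser blocklist are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    source: str       # e.g. "direct" or "programmatic" (illustrative values)
    cpm: float        # price per thousand impressions
    advertiser: str


def select_ad(
    candidates: Iterable[Candidate],
    floor_cpm: float = 0.0,
    blocked_advertisers: frozenset = frozenset(),
) -> Optional[Candidate]:
    """Apply business rules first, then return the highest-CPM eligible ad."""
    eligible = [
        c for c in candidates
        if c.cpm >= floor_cpm and c.advertiser not in blocked_advertisers
    ]
    # default=None covers the case where every candidate was filtered out.
    return max(eligible, key=lambda c: c.cpm, default=None)


if __name__ == "__main__":
    demand = [
        Candidate("direct", 4.0, "acme"),
        Candidate("programmatic", 6.5, "blocked-co"),
        Candidate("programmatic", 5.0, "globex"),
    ]
    print(select_ad(demand, floor_cpm=1.0, blocked_advertisers=frozenset({"blocked-co"})))
```

Real ad servers also weigh delivery goals and pacing for direct-sold campaigns, which this sketch ignores.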
3. HEADER BIDDING WRAPPER / PREBID
Attribute: Details
Actor: Header Bidding Wrapper
Role: Run parallel auctions across multiple SSPs before calling the ad server; maximize competition
Goal: Increase publisher yield by enabling simultaneous bidding; reduce SSP monopoly power
Top 5 Companies: Prebid.org (standard), Amazon TAM/UAM, Google Open Bidding, Index Exchange Wrapper, PubMatic OpenWrap
Business Model: Prebid.js is free/open-source. Managed services charge fees or require using their SSP. Prebid Server hosts charge tech fees
Software Components: Open Source: Prebid.js (client), Prebid Server (server-to-server). Commercial wrappers: TAM, Open Bidding. Modules: Adapters (per SSP), currency conversion, consent modules, analytics adapters, user ID modules
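The core idea of a header bidding wrapper, asking several SSPs in parallel and keeping only the bids that arrive before a deadline, can be sketched with asyncio. This is not Prebid code: `request_bid` is a stand-in for a network call, and the bidder tuples (name, simulated latency, CPM) are invented for the example.

```python
import asyncio
from typing import List, Optional, Tuple

Bid = Tuple[str, float]  # (ssp name, cpm)


async def request_bid(ssp: str, latency_s: float, cpm: float) -> Bid:
    """Stand-in for an SSP adapter call; latency and price are simulated."""
    await asyncio.sleep(latency_s)
    return ssp, cpm


async def run_header_auction(
    bidders: List[Tuple[str, float, float]], timeout_s: float = 0.05
) -> Optional[Bid]:
    """Query all bidders concurrently and return the best bid received in time."""
    if not bidders:
        return None
    tasks = [asyncio.create_task(request_bid(*b)) for b in bidders]
    done, pending = await asyncio.wait(tasks, timeout=timeout_s)
    # Late bidders are dropped so the page is not held up.
    for task in pending:
        task.cancel()
    bids = [task.result() for task in done]
    return max(bids, key=lambda bid: bid[1], default=None)


if __name__ == "__main__":
    demo = [("ssp_a", 0.01, 2.0), ("ssp_b", 0.02, 3.5), ("ssp_slow", 1.0, 9.0)]
    print(asyncio.run(run_header_auction(demo, timeout_s=0.2)))
```

Note the trade-off the timeout encodes: the slow bidder would have paid the most, but waiting for it would delay the ad server call.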
Neutral marketplace connecting multiple SSPs and DSPs; facilitate real-time auctions
Goal: Maximize liquidity; enable price discovery; provide transaction infrastructure
Top 5 Companies: Google AdX (dominant), Xandr (Microsoft), Magnite, OpenX, Smaato (Verve Group)
Business Model: Transaction fee (often bundled with SSP); typically 10-20%
Software Components: RTB auction engine, QPS infrastructure, fraud filtering, OpenRTB 2.6 compliance, deal management
Note: The exchange vs. SSP distinction has blurred; most SSPs function as exchanges
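As a worked example of the 10-20% transaction fee mentioned above, the following sketch computes what is left after the exchange takes its cut. The function name and the sample numbers are assumptions for illustration only; real fee structures are often bundled and less transparent than this.

```python
from typing import Tuple


def settle(clearing_cpm: float, impressions: int, fee_rate: float) -> Tuple[float, float, float]:
    """Return (gross spend, exchange fee, net payout) for a batch of impressions."""
    if not 0.0 <= fee_rate <= 1.0:
        raise ValueError("fee_rate must be a fraction between 0 and 1")
    gross = clearing_cpm * impressions / 1000  # CPM is priced per 1,000 impressions
    fee = gross * fee_rate
    return gross, fee, gross - fee


if __name__ == "__main__":
    # Hypothetical numbers: 200,000 impressions clearing at a 5.00 CPM, 15% fee.
    gross, fee, net = settle(5.0, 200_000, 0.15)
    print(f"gross={gross:.2f} fee={fee:.2f} net={net:.2f}")
```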
7. CONSENT MANAGEMENT PLATFORM (CMP)
Attribute: Details
Actor: CMP (Consent Management Platform)
Role: Collect, store, and signal user privacy consent for GDPR/CCPA/GPP compliance
CPM-based resolution fees; licensing; bundled with other services
Software Components: Commercial: LiveRamp ATS, UID2, ID5. Open Standards: UID2 (open-source framework), SharedID. Modules: Identity graph, resolution API, Prebid User ID modules, first-party data onboarding
9. VIDEO / CTV SPECIFIC
Attribute: Details
Actor: Video Ad Server / CTV Platform
Role: Serve and measure video ads (instream, outstream, CTV); handle VAST/VPAID
Goal: Deliver video ads with proper tracking; manage pods; measure completion rates
Top 5 Companies: Google Ad Manager (video), FreeWheel (Comcast), SpringServe, Magnite CTV, Innovid
Business Model: CPM-based serving fees (higher than display); SaaS subscription for ad servers
Software Components: Commercial: FreeWheel, SpringServe, Innovid. Standards: VAST 4.2, VPAID (deprecated), SIMID, OMID. Modules: Video player integration, pod management, server-side ad insertion (SSAI), frequency capping across screens
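To illustrate the tracking and completion-rate part of the table above, here is a small sketch that pulls tracking URLs out of a VAST-style document and computes a completion rate. The sample XML is heavily trimmed and would not validate against the full VAST schema; the tracker URLs and function names are made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Trimmed, illustrative VAST-like document (not schema-complete).
SAMPLE_VAST = """<VAST version="4.2">
  <Ad id="demo">
    <InLine>
      <Creatives>
        <Creative>
          <Linear>
            <TrackingEvents>
              <Tracking event="start">https://tracker.example/start</Tracking>
              <Tracking event="complete">https://tracker.example/complete</Tracking>
            </TrackingEvents>
          </Linear>
        </Creative>
      </Creatives>
    </InLine>
  </Ad>
</VAST>"""


def tracking_urls(vast_xml: str) -> Dict[str, List[str]]:
    """Map each tracking event name to the URLs a player should ping."""
    root = ET.fromstring(vast_xml)
    urls: Dict[str, List[str]] = {}
    for node in root.iter("Tracking"):
        urls.setdefault(node.get("event", ""), []).append((node.text or "").strip())
    return urls


def completion_rate(starts: int, completes: int) -> float:
    """Share of started views that reached the 'complete' event."""
    return completes / starts if starts else 0.0


if __name__ == "__main__":
    print(tracking_urls(SAMPLE_VAST))
    print(completion_rate(200, 150))
```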
Beads is a lightweight, graph-based issue tracker designed specifically for AI coding agents (like Claude, GPT-4, etc.) rather than human developers https://github.com/steveyegge/beads
Genkit Go 1.0 seems promising:
– Type-safe AI flows with Go structs and JSON schema validation
– Unified model interface supporting Google AI, Vertex AI, OpenAI, Ollama, and more
– Tool calling, RAG, and multimodal support
– Rich local development tools with a standalone CLI binary and Developer UI
– AI coding assistant integration via the genkit init:ai-tools command for tools like the Gemini CLI
Being non-deterministic, LLM-based AI agents are untestable (in current software engineering terms): the only criterion for evaluating answers is “LGTM”. See “A pragmatic guide to LLM evals for devs”: https://newsletter.pragmaticengineer.com/p/evals
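One way to move beyond “LGTM” is to stop asserting on a single answer and instead measure how often sampled answers satisfy a programmatic check. The sketch below shows that idea only; it is my own illustration, not taken from the linked guide, and `fake_model` is a random stand-in for a real LLM call.

```python
import random
from typing import Callable


def fake_model(prompt: str, rng: random.Random) -> str:
    """Stand-in for a non-deterministic LLM: returns one of several phrasings."""
    return rng.choice(["4", "The answer is 4.", "four", "5"])


def check_contains_four(answer: str) -> bool:
    """Programmatic check replacing a human 'looks good to me'."""
    return "4" in answer or "four" in answer.lower()


def pass_rate(prompt: str, check: Callable[[str], bool], samples: int, seed: int = 0) -> float:
    """Sample the model repeatedly and report the fraction of passing answers."""
    rng = random.Random(seed)
    passed = sum(check(fake_model(prompt, rng)) for _ in range(samples))
    return passed / samples


if __name__ == "__main__":
    rate = pass_rate("What is 2 + 2?", check_contains_four, samples=200)
    # A test suite would gate on a threshold (e.g. rate >= 0.7) instead of exact equality.
    print(f"pass rate: {rate:.2f}")
```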
PACESETTERS is a powerful alliance of 15 partners of diverse scope, scale and focus. The consortium draws on long-term experience, outstanding competences and specific expertise. https://pacesetters.eu/about
“Notably, during Neo’s demo with the WSJ, the robot wasn’t performing any tasks autonomously. However, Børnich says Neo will perform ‘most household tasks autonomously’ when it launches next year, noting that the quality of work ‘varies and will improve dramatically very rapidly as we acquire data.’” Neo Robot is cheating like all the other manufacturers right now. https://www.roadtovr.com/helper-robot-neo-vr-telepresence/
“There’s more to software development than producing a working solution. Someone needs to safeguard design intent and maintainability. Maybe as LLMs democratize coding, existing developers need to evolve into architects who curate the structure of a codebase.” https://mo42.bearblog.dev/help-my-boss-started-programming-with-llms/
Curious to see where this goes. Subliminal Learning: language models transmit behavioral traits via hidden signals in data https://arxiv.org/abs/2507.14805
Apollo mission audio/images in realtime (obviously we have never been to the moon, they did all this with photoshop in the 70s 🙂 ) https://apolloinrealtime.org/
Emissions fell by 4% in Q1 and 2.6% in Q2, while GDP grew by 0.3% and 1%, respectively, compared to the same quarters in 2023, according to the latest statistics. This demonstrates that climate action and economic growth can go hand in hand : https://ec.europa.eu/eurostat/en/web/products-eurostat-news/w/ddn-20241115-2
HTTP (REST) and WebSocket, with support for end-to-end encryption (E2EE)
Identity Management
– ActivityPub: Tied to server domain (e.g., @user@domain.com), uses WebFinger for discovery
– AT Protocol: Portable DIDs for decentralized identity
– Matrix: Tied to homeserver domain (accounts are not currently portable between servers); user ID format is @user:domain.com
Federation
– ActivityPub: Federated, allowing instances to share content and social connections across domains
– AT Protocol: Federated with content and algorithm control
– Matrix: Federated, with real-time, synchronized state across servers
Interoperability
– ActivityPub: Widely interoperable with other ActivityPub-compliant platforms in the Fediverse
– AT Protocol: Designed for custom app experiences, interoperability is in development
– Matrix: Supports interoperability with other Matrix clients; bridges to other protocols (e.g., Slack, IRC)
End-to-End Encryption
– ActivityPub: Not native to protocol but possible with extensions
– AT Protocol: Not natively specified
– Matrix: Built-in and widely supported, particularly in 1:1 and group chats
Moderation
– ActivityPub: Instance-based moderation policies, customizable filters and blocks
– AT Protocol: User-level and instance-level moderation, customizable algorithms
– Matrix: Room-level moderation, with granular permissions for room admins
Popular Platforms
– ActivityPub: Mastodon, PeerTube, Pixelfed, WriteFreely
– AT Protocol: Bluesky Social, upcoming decentralized apps
– Matrix: Element (main Matrix client), Synapse (server), bridges for Slack, Discord, Telegram, etc.
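The identity formats in the comparison above can be made concrete with a short sketch: building the WebFinger lookup URL for a Fediverse handle (per RFC 7033) and splitting a Matrix user ID into its parts. The function names are my own, and the validation is deliberately minimal compared with what the respective specifications require.

```python
from typing import Tuple
from urllib.parse import urlencode


def webfinger_url(handle: str) -> str:
    """Turn '@user@domain.com' into the WebFinger discovery URL for that account."""
    user, domain = handle.lstrip("@").split("@", 1)
    query = urlencode({"resource": f"acct:{user}@{domain}"})
    return f"https://{domain}/.well-known/webfinger?{query}"


def parse_matrix_id(user_id: str) -> Tuple[str, str]:
    """Split '@user:domain.com' into (localpart, server name)."""
    if not user_id.startswith("@") or ":" not in user_id:
        raise ValueError(f"not a Matrix user ID: {user_id!r}")
    localpart, server = user_id[1:].split(":", 1)
    return localpart, server


if __name__ == "__main__":
    print(webfinger_url("@user@domain.com"))
    print(parse_matrix_id("@user:domain.com"))
```

In both cases the domain is part of the identifier itself, which is why these identities are tied to a server, in contrast to the DID-based approach of the AT Protocol.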
Summary of Key Differences
ActivityPub is best suited for federated social networking, particularly for applications that prioritize openness and content sharing across platforms in the Fediverse. It uses an Activity-Object model with JSON-LD and supports instance-based identity.
AT Protocol focuses on user control over content algorithms and portable identities using DIDs, with a vision for interoperability in custom social applications. It is also designed for federated social networks but with more control over data portability and algorithmic transparency.
Matrix Protocol excels in real-time, federated communication, supporting secure, encrypted messaging with granular moderation capabilities. It’s heavily used for chat, VoIP, and collaborative tools, emphasizing interoperability with other platforms through bridges.