Validation?

I recently learned about MediaMath’s custom brain. Excited for the validation of OpenDSP‘s concept of custom bidding logic. But it is a bit limited, being just a polynomial. MediaMath’s custom bid router provides way more flexibility — but you need your own infrastructure! So — I still think our approach — DSL-based scripting — is better, because it combines both!

FinOps

Some time ago a discussion about CIO vs CMO as it comes to ad tech started, and as I see it, it still continues. As a technical professional in ad tech space, I followed it with interest.

As I was building ad tech in the cloud (which usually involves large scale — think many millions QPS), business naturally became quite cost-conscious. It was then when, I, meditating on the above CIO-CMO dichotomy, thought that perhaps the next thing is the CIO (or the CTO) vs — or together with — the CFO.

What if whether to commit cloud resources (and what kind of resources to commit) to a given business problem is dictated not purely by technology but by financial analysis? E.g., a report is worth it if we can accomplish it using spot instances mostly; if it goes beyond certain cost, it is not worth it. Etc.

These are all very abstract and vague thoughts, but why not?

Recently I learned of an effort that seems to more or less agree with that thought — the FinOps foundation, so I am checking it out currently.

Sounds interesting and promising so far.

And nice badge too.

FinOps-Foundation-Community-Member-Badge

A look at app-ads.txt

Introduction

App-ads.txt is a follow-up to IAB’s ads.txt initiative aimed at increasing transparency in the programmatic ad marketplace. Related to it are a number of other initiatives and standards such as supply chain, payment chain, sellers.json, various things coming out of TAG group, etc. See last section for relevant links.

TL;DR: This standard allows an ad inventory buyer (DSP) to verify whether the entity selling the inventory (SSP) is authorized by the publisher to sell it. This is done by looking up the the publisher’s URL for an app (bundle) in the app store, and examining app-ads.txt file at that URL. The file lists authorized sellers of the publisher’s inventory, along with an ID that this publisher should be identified by in the reseller’s system (publisher.id in OpenRTB).

Terminology

Here, the words “publisher” and “developer” may be used interchangeably; ditto for “app” and “bundle”.

First pass implementation

The algorithm is fairly simple: for each app – aka bundle – of interest, grab the app publisher’s domain from the appropriate app store, fetch app-ads.txt file from that domain and parse it. But of course, in theory there is no difference between theory and practice, but in practice there is. In reality, there are some deviations from the standard and exceptional cases that had to be taken care of in the process.

As a first pass, we are running this process semi-manually; if the results warrant full automation, this can easily be accomplished. Here is what was done (more technical details can be found on GitHub):

  1. Bundle IDs and other information was retrieved from request log for the last few days (really, requests for all bids in May as of end of May 6), using a query to Athena. This comes to a total of:
    1. 1,144,377 bids
    2. 613,241 unique bundle IDs.
    3. 685,221 bundle-SSP combinations
  2. The result set, exported as CSV file is loaded into SQLite DB. The same SQLite DB is also used for caching of results.
  3. Go through the list of bundles with a Python script, appadsparser.py.While ads.txt standard provides the specification of how app stores should provide publisher information, only Google Play Store currently follows it. For Apple’s App Store, we crawl its search API’s lookup service, and, if not found there, directly the App Store’s page (though this appears to be a violation of robots.txt).
  4. The result is a semi-structured log. The summary is below.

Results Notes

Result codes

The following are result codes and their meaning:

  • OK – The SSP we got the traffic on is authorized by the bundle publisher’s app-ads.txt for the publisher ID specified in request
  • OK_GOOGLE – Google is authorized by the bundle publisher’s app-ads.txt. See below on explanation of why Google is special.
  • PROBLEMS — Mismatch found between authorized SSPs and/or publisher IDs in app-ads.txt.
  • NO_APPADS_TXT – Publisher’s website has no app-ads.txt file
  • NOT_FOUND_IN_PLAY_STORE – bundle not found in Google Play Store. NOT_FOUND_IN_ITUNES – same for Apple App Store.
  • NO_URL_IN_PLAY_STORE – cannot find publisher’s URL in Google Play Store. NO_URL_IN_ITUNES – same for Apple App Store.
  • FACEBOOK_URL – publisher’s URL points to a Facebook page (this happens often enough that it warrants its own status) BAD_DEV_URL – publisher’s URL in an app store is invalid
  • BAD_BUNDLE_ID – store URL (from the OpenRTB request) is invalid, and cannot be determined from the bundle ID either

    Note on Google

    NOTE: At the moment, there is no way to check the publisher ID for Google due to an internal issue. In other words, we cannot verify the following part of the spec:

    Field #2 - Publisher’s Account ID - This must contain the same value used in transactions (i.e. OpenRTB bid requests) in the field specified by the SSP/exchange. Typically, in OpenRTB, this is publisher.id.
    

    Given Google’s aggressive anti-fraud enforcement, we can for now stipulate that it would not run unauthorized inventory. There is still the possibility of fraud, of course. But in the below table we distinguish between bundles served via Google (where we do not check for publisher ID, just the presence of Google in app-ads.txt) and those served via other SSPs, where we cross-reference the publisher ID.

    As a corollary of the above you will not see Google inventory under “PROBLEMS” status.

    This gives a good sample:

    Result code Count
    OK 5,693
    OK_GOOGLE 29,330
    PROBLEMS 361
    NO_APPADS_TXT 7,653
    NOT_FOUND_IN_PLAY_STORE 4,417
    NOT_FOUND_IN_ITUNES 134
    NO_URL_IN_PLAY_STORE 20,112
    NO_URL_IN_ITUNES 16,168
    FACEBOOK_URL 1,378
    BAD_DEV_URL 4

    Summary of findings

    It appears that due to not very high adoption of this standard at present (developer URLs not present in App Stores or app-ads.txt file not present on the developer domain), there is not much utility to it at the moment. However, as the adoption rate is increasing (see below), this is worth revisiting again. Also, consider that for this exercise we only used bids information – that is, not the sample of full traffic, but just what we have bid on. This may not be representative of the entire traffic also, and may be interesting to explore.

    Consider also that the developer may well have the app-ads.txt file on the website, but if the website is not properly listed in the app store, we have no way of getting to it (yet SSPs may include those sites in their overall numbers, see, e.g., MoPub below).

    What does the industry say?

    • Google Play Store is reported to have only ~8% adoption. Worth quoting here is this section of the report: 


      Who Are the Top ‘Direct’ Ad Partners Inside the App-Ads.txt in Google Play Apps?
      Direct ad partners are those which have been granted direct permission by app developers to sell app ad space. That is, they are explicitly listed on the app’s “App-ads.txt” file. Google.com is listed as a direct ad partner on 95.87% of all “App-Ads.txt” files for Google Play apps. This makes it the most frequently mentioned direct ad partner for apps available on the Google Play.

    • Pixalate’s 2019 app-ads.txt trends report is interesting: 
      • Doesn’t seem that app-ads.txt makes that much difference for IVT (invalid traffic) – apps with app-ads.txt have 18.7% of IVT vs 21.1% for those without (pg. 6), despite Pixalate dramatizing this 2.4 percentage point increase as a 13% increase (2.4/18.7 – lies, damn lies and statistics).
      • It lists way higher numbers of adoption for Google Play Store apps than above but that is across top 1K apps.
      • Increasing rates of adoption (~65% rise in Q4 2019 – pg.13)
      • Unity, Ironsource, MoPub, Applovin, Chartboost – in that order – are in the top direct ad partners for Android apps (pg.18); MoPub, Unity, IronSource – in top direct AND resellers for iOS (pg.20)
    • MoPub claims that “app-ads.txt file adoption exceeds 80% for managed MoPub publishers”. It’s unclear what the qualifier “managed” means. Sampling our data, we have issues with app-ads.txt on MoPub about 48% in total, breaking down as follows: 
      • No app-ads.txt: 11%
      • No developer URL found in store: 22.5%
      • Not found in app store: 13.7%

    Not found?

    Additionally, it is worth looking at the bundles that are flagged as NOT_FOUND_IN_PLAY_STORE or NOT_FOUND_IN_ITUNES – how come those cannot be found?

    This does not have to be something nefarious, for example, based on spot-checking, it can be due to:

    • Case-sensitivity. Play Store bundles are case-sensitive but an SSP may normalize them to lower case when sending (e.g., com.GMA.Ball.Sort.Puzzle becomes com.gma.ball.sort.puzzle)
    • Some mishap like non-existent com.rlayr.girly_m_art_backgrounds being sent (but com.instaforall.girly_m_wallpapers exists)

    But even if the majority of the NOT_FOUND errors are due to such discrepancies, not fraud, it means that currently app-ads.txt mechanism itself is de facto not very reliable.

    Some other results

      • Q: Are there any apps that present as different publishers on the same SSP?
      • A: Not too many (about 8.6%), but even so for most important apps it is at most 2-3 – and even at long tail the max is 10 publisher IDs. This, though, can still account for some app-ads.txt PROBLEMs as seen above)

     

    Related documents, standards and initiatives

A post-mortem of a project: Wildboard

Once upon a time, I was working at Snaplogic, and at that time its office was in downtown San Mateo. Pretty much across the street from a great coffee shop, Kaffeehaus.

If you’ve been there, or even if you just looked at the website, you’d realize that the owner put quite an effort into it being a Viennese-style coffee house, with all the interior design decisions that go with it.

Now, a local coffee shop is often a place where people expect to post some local notices and ads (“lost dog”, “handyman available”, “local church choir concert”, etc). And here’s a conundrum. A simple cork bulletin board with a bunch of papers pinned to it just did not seem to fit the overall mood/interior/decor of the cafe:

Yet the cafe does want to serve local community and become an institution.

This being Silicon Valley, Val, the Kaffeehaus owner, had a vision — what about a virtual board, as a touch-screen.

The name was quickly chosen to be Wildboard — because it is, well, a bulletin board and in honor of the boar’s head that is prominently featured on the wall:

.

A multi-touch-based virtual bulletin board sounded interesting. Most touch-screen kiosks I’ve seen so far — in hotels and malls, for instance, or things like ImageSurge — only allow tap, not true multi-touch. (To be honest, multi-touch may or may not be useful — but see below and see also P.S. — but it is a very nice “pizzazz”).

And we — that is, myself and Vio — got to work. And in short order we had:

  • Fully multi-touch (with rotation, zoom, etc) web UI — as a Windows 8 CSS/Javascript app (source).
  • Wildboard “board server” — a Python app running on the same computer as the UI. It is responsible for polling the web server (below) and serving information to the UI (source).
  • Wildboard web server — a PHP app based on an existing web classified application(source). This allows users to submit ads (or they can do it via a mobile app, as below). It is also modified to automatically create QR codes based on user-provided information (map, contact, calendar, etc) and adds them to an ad.
  • Wildboard mobile app — PhoneGap/Cordova based app for both Android and iPhone (source)
    This app allows one to:

    • Post an ad
    • Scan an ad’s QR code
    • And, finally, for the “Wow!” effect during the demo, one can drag an ad from the screen into the phone. Here it is, in action:

  • Wildboard orchestrator — a Node.js app (source) designed to coordinate interactions between the mobile app and the board. It is the one that is determines which mobile app is near which board and orchestrates the fancy “drag” operation shown above.
  • For more information, check out spec and the writeup.

Charismatic Val somehow managed to get a big touch screen from Elo Touch. Here’s how it fit in the decor:

A network of such bulletin boards, allowing hyper-local advertising, seems like a good idea. Monetization can be done in a number of ways:

  • Charging for additional QR codes — e.g., map, contact, schedule.
  • Custom ad design (including interactive and advanced multimedia features — sound, animation, video).
  • A CPA (cost-per-acquisition) model, while tracking interaction via an app — per saved contact, per scheduled appointment, per phone call.
  • Premium section.

But… alas… This is as far as we got.

P.S. One notable exception is a touch-screen showing suggestions in Whole Foods in Redwood City.

OpenDSP – stay tuned

Every marketer, it seems, wants to participate in real-time bidding (RTB). But what is it that they really want?

They want an ability to price (price, not target!) a particular impression in real-time. Based on the secret-sauce business logic and data science. Fair enough.

But that secret sauce, part of their core competence, is just the tip of the iceberg — and the submerged part is all that is required to keep that tip above water. To wit:

  • Designing, developing, testing and maintaining actual code for the UI for targeting, the bidder for bidding, reporting, and data management
  • Scaling and deploying such code in some infrastructure (own data center,
    clouds like AWS, GCE, Azure), etc.
  • Integrating with all exchanges of interest, including the following steps:
    • Code: passing functional tests (understanding the exchange’s requirements for parsing request and sending response)
    • Infrastructure: ensuring the response is being sent to the exchange within the double-digit-millisecond limit
    • Scaling: As above, but under real load (hundreds of thousands of queries per second)
    • Business: Paperwork to ensure seat on the exchange, including credit agreements when necessary
  • Operations: Ongoing monitoring of the operations, including technical (increased latency) and business (low fill level, high disapproval level) concerns (whether these concerns are triggered by clients, exchange partners or,
    ideally, pro-actively addressed internally.

None of which is their core competence. We propose to address the underwater part. It’ll be exciting.

Enter OpenDSP. We got something cool coming up here. Stay tuned.

Watch this space, and by this space I mean this blog, this GitHub account.