A look at app-ads.txt

Introduction

App-ads.txt is a follow-up to IAB’s ads.txt initiative aimed at increasing transparency in the programmatic ad marketplace. Related to it are a number of other initiatives and standards such as supply chain, payment chain, sellers.json, various things coming out of TAG group, etc. See last section for relevant links.

TL;DR: This standard allows an ad inventory buyer (DSP) to verify whether the entity selling the inventory (SSP) is authorized by the publisher to sell it. This is done by looking up the the publisher’s URL for an app (bundle) in the app store, and examining app-ads.txt file at that URL. The file lists authorized sellers of the publisher’s inventory, along with an ID that this publisher should be identified by in the reseller’s system (publisher.id in OpenRTB).

Terminology

Here, the words “publisher” and “developer” may be used interchangeably; ditto for “app” and “bundle”.

First pass implementation

The algorithm is fairly simple: for each app – aka bundle – of interest, grab the app publisher’s domain from the appropriate app store, fetch app-ads.txt file from that domain and parse it. But of course, in theory there is no difference between theory and practice, but in practice there is. In reality, there are some deviations from the standard and exceptional cases that had to be taken care of in the process.

As a first pass, we are running this process semi-manually; if the results warrant full automation, this can easily be accomplished. Here is what was done (more technical details can be found on GitHub):

  1. Bundle IDs and other information was retrieved from request log for the last few days (really, requests for all bids in May as of end of May 6), using a query to Athena. This comes to a total of:
    1. 1,144,377 bids
    2. 613,241 unique bundle IDs.
    3. 685,221 bundle-SSP combinations
  2. The result set, exported as CSV file is loaded into SQLite DB. The same SQLite DB is also used for caching of results.
  3. Go through the list of bundles with a Python script, appadsparser.py.While ads.txt standard provides the specification of how app stores should provide publisher information, only Google Play Store currently follows it. For Apple’s App Store, we crawl its search API’s lookup service, and, if not found there, directly the App Store’s page (though this appears to be a violation of robots.txt).
  4. The result is a semi-structured log. The summary is below.

Results Notes

Result codes

The following are result codes and their meaning:

  • OK – The SSP we got the traffic on is authorized by the bundle publisher’s app-ads.txt for the publisher ID specified in request
  • OK_GOOGLE – Google is authorized by the bundle publisher’s app-ads.txt. See below on explanation of why Google is special.
  • PROBLEMS — Mismatch found between authorized SSPs and/or publisher IDs in app-ads.txt.
  • NO_APPADS_TXT – Publisher’s website has no app-ads.txt file
  • NOT_FOUND_IN_PLAY_STORE – bundle not found in Google Play Store. NOT_FOUND_IN_ITUNES – same for Apple App Store.
  • NO_URL_IN_PLAY_STORE – cannot find publisher’s URL in Google Play Store. NO_URL_IN_ITUNES – same for Apple App Store.
  • FACEBOOK_URL – publisher’s URL points to a Facebook page (this happens often enough that it warrants its own status) BAD_DEV_URL – publisher’s URL in an app store is invalid
  • BAD_BUNDLE_ID – store URL (from the OpenRTB request) is invalid, and cannot be determined from the bundle ID either

    Note on Google

    NOTE: At the moment, there is no way to check the publisher ID for Google due to an internal issue. In other words, we cannot verify the following part of the spec:

    Field #2 - Publisher’s Account ID - This must contain the same value used in transactions (i.e. OpenRTB bid requests) in the field specified by the SSP/exchange. Typically, in OpenRTB, this is publisher.id.
    

    Given Google’s aggressive anti-fraud enforcement, we can for now stipulate that it would not run unauthorized inventory. There is still the possibility of fraud, of course. But in the below table we distinguish between bundles served via Google (where we do not check for publisher ID, just the presence of Google in app-ads.txt) and those served via other SSPs, where we cross-reference the publisher ID.

    As a corollary of the above you will not see Google inventory under “PROBLEMS” status.

    This gives a good sample:

    Result code Count
    OK 5,693
    OK_GOOGLE 29,330
    PROBLEMS 361
    NO_APPADS_TXT 7,653
    NOT_FOUND_IN_PLAY_STORE 4,417
    NOT_FOUND_IN_ITUNES 134
    NO_URL_IN_PLAY_STORE 20,112
    NO_URL_IN_ITUNES 16,168
    FACEBOOK_URL 1,378
    BAD_DEV_URL 4

    Summary of findings

    It appears that due to not very high adoption of this standard at present (developer URLs not present in App Stores or app-ads.txt file not present on the developer domain), there is not much utility to it at the moment. However, as the adoption rate is increasing (see below), this is worth revisiting again. Also, consider that for this exercise we only used bids information – that is, not the sample of full traffic, but just what we have bid on. This may not be representative of the entire traffic also, and may be interesting to explore.

    Consider also that the developer may well have the app-ads.txt file on the website, but if the website is not properly listed in the app store, we have no way of getting to it (yet SSPs may include those sites in their overall numbers, see, e.g., MoPub below).

    What does the industry say?

    • Google Play Store is reported to have only ~8% adoption. Worth quoting here is this section of the report: 


      Who Are the Top ‘Direct’ Ad Partners Inside the App-Ads.txt in Google Play Apps?
      Direct ad partners are those which have been granted direct permission by app developers to sell app ad space. That is, they are explicitly listed on the app’s “App-ads.txt” file. Google.com is listed as a direct ad partner on 95.87% of all “App-Ads.txt” files for Google Play apps. This makes it the most frequently mentioned direct ad partner for apps available on the Google Play.

    • Pixalate’s 2019 app-ads.txt trends report is interesting: 
      • Doesn’t seem that app-ads.txt makes that much difference for IVT (invalid traffic) – apps with app-ads.txt have 18.7% of IVT vs 21.1% for those without (pg. 6), despite Pixalate dramatizing this 2.4 percentage point increase as a 13% increase (2.4/18.7 – lies, damn lies and statistics).
      • It lists way higher numbers of adoption for Google Play Store apps than above but that is across top 1K apps.
      • Increasing rates of adoption (~65% rise in Q4 2019 – pg.13)
      • Unity, Ironsource, MoPub, Applovin, Chartboost – in that order – are in the top direct ad partners for Android apps (pg.18); MoPub, Unity, IronSource – in top direct AND resellers for iOS (pg.20)
    • MoPub claims that “app-ads.txt file adoption exceeds 80% for managed MoPub publishers”. It’s unclear what the qualifier “managed” means. Sampling our data, we have issues with app-ads.txt on MoPub about 48% in total, breaking down as follows: 
      • No app-ads.txt: 11%
      • No developer URL found in store: 22.5%
      • Not found in app store: 13.7%

    Not found?

    Additionally, it is worth looking at the bundles that are flagged as NOT_FOUND_IN_PLAY_STORE or NOT_FOUND_IN_ITUNES – how come those cannot be found?

    This does not have to be something nefarious, for example, based on spot-checking, it can be due to:

    • Case-sensitivity. Play Store bundles are case-sensitive but an SSP may normalize them to lower case when sending (e.g., com.GMA.Ball.Sort.Puzzle becomes com.gma.ball.sort.puzzle)
    • Some mishap like non-existent com.rlayr.girly_m_art_backgrounds being sent (but com.instaforall.girly_m_wallpapers exists)

    But even if the majority of the NOT_FOUND errors are due to such discrepancies, not fraud, it means that currently app-ads.txt mechanism itself is de facto not very reliable.

    Some other results

      • Q: Are there any apps that present as different publishers on the same SSP?
      • A: Not too many (about 8.6%), but even so for most important apps it is at most 2-3 – and even at long tail the max is 10 publisher IDs. This, though, can still account for some app-ads.txt PROBLEMs as seen above)

     

    Related documents, standards and initiatives