FinOps

Some time ago, a discussion started about the CIO vs. the CMO when it comes to ad tech, and as far as I can see, it is still going on. As a technical professional in the ad tech space, I have followed it with interest.

As I was building ad tech in the cloud (which usually involves large scale: think many millions of QPS), the business naturally became quite cost-conscious. It was then that, meditating on the CIO-CMO dichotomy above, I thought that perhaps the next thing is the CIO (or the CTO) vs. the CFO, or rather together with the CFO.

What if the decision whether to commit cloud resources (and what kind of resources) to a given business problem is dictated not purely by technology but by financial analysis? E.g., a report is worth it if we can produce it mostly on spot instances; if it goes beyond a certain cost, it is not worth it. And so on.

These are all very abstract and vague thoughts, but why not?

Recently I learned of an effort that seems to more or less agree with that thought, the FinOps Foundation, so I am currently checking it out.

Sounds interesting and promising so far.

And nice badge too.

[Image: FinOps Foundation Community Member badge]

Ad-hoc querying on AWS: Connecting BI tools to Athena

In a previous post, we discussed using Lambda, Glue and Athena to set up queries of events that are logged by our real-time bidding system. Here, we will build on that foundation, and show how to make this even friendlier to business users by connecting BI tools to this setup.

Luckily, Athena supports both JDBC and ODBC, and, thus, any BI tool that uses either of these connection methods can use Athena!

First, we need to create an IAM user. The minimum policies required are:

  • AmazonAthenaFullAccess
  • Writing to a bucket for Athena query output (use an existing one or create a new one). For the sake of example, let’s call it s3://athena.out
  • Reading from our s3://logbucket, which is where the logs are stored

Now we’ll need the access key and the secret key for that user, to use with various tools.

JDBC

The JDBC driver (com.simba.athena.jdbc.Driver) can be downloaded here.

The JDBC URL is constructed as follows:

jdbc:awsathena://AwsRegion=<aws-region>;User=<aws-access-key>;Password=<aws-secret-key>;S3OutputLocation=s3://athena.out;

Here’s a sample Java program that shows it in action:

import java.sql.Connection;
import java.sql.DriverManager;

public class Main {
  public static void main(String[] args) throws Throwable {
    // Load the Simba Athena JDBC driver class.
    Class.forName("com.simba.athena.jdbc.Driver");
    String accessKey = "...";
    String secretKey = "...";
    String bucket = "athena.out";
    String url = "jdbc:awsathena://AwsRegion=us-east-1;User=" + accessKey + ";Password=" + secretKey
        + ";S3OutputLocation=s3://" + bucket + ";";
    // try-with-resources closes the connection when we are done.
    try (Connection connection = DriverManager.getConnection(url)) {
      // Note: the URL contains the secret key, so print it only in a quick local test.
      System.out.println("Successfully connected to\n\t" + url);
    }
  }
}
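
The string concatenation above gets fiddly once more properties are added. As a minimal sketch, the URL can be assembled from a map of properties, with the keys read from environment variables rather than pasted into the source. The class name AthenaJdbcUrl is hypothetical (it is not part of the driver), and the environment variable names are assumptions; the property names are the ones already used in the URL above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the Simba driver): assembles the
// "key=value;" pairs that follow the jdbc:awsathena:// prefix.
class AthenaJdbcUrl {
  static String build(String region, String accessKey, String secretKey, String outputLocation) {
    // LinkedHashMap keeps the properties in insertion order.
    Map<String, String> props = new LinkedHashMap<>();
    props.put("AwsRegion", region);
    props.put("User", accessKey);
    props.put("Password", secretKey);
    props.put("S3OutputLocation", outputLocation);

    StringBuilder url = new StringBuilder("jdbc:awsathena://");
    for (Map.Entry<String, String> e : props.entrySet()) {
      url.append(e.getKey()).append('=').append(e.getValue()).append(';');
    }
    return url.toString();
  }

  public static void main(String[] args) {
    // Assumed environment variable names; adjust to your setup.
    String url = build("us-east-1",
        System.getenv("AWS_ACCESS_KEY_ID"),
        System.getenv("AWS_SECRET_ACCESS_KEY"),
        "s3://athena.out");
    // The URL contains the secret key, so only its length is printed here.
    System.out.println("URL built, length " + url.length());
  }
}
```

The resulting string can be passed to DriverManager.getConnection exactly as in the sample program.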

Example using JDBC: DbVisualizer

  1. If you haven’t already, download the JDBC driver to some folder.
  2. Open Driver Manager (Tools > Driver Manager).
  3. Press the green + to create a new driver.
  4. Press the folder icon on the right …
  5. … and browse to the folder where you saved the JDBC driver and select it.
  6. Leave the URL Format field blank and pick com.simba.athena.jdbc.Driver for Driver Class.
  7. Close the Driver Manager, and let’s create a Connection.
  8. We’ll use the “No Wizard” option. Pick Athena from the dropdown in the Driver (JDBC) field and enter the JDBC URL from above in the Database URL field.
  9. Press “Connect” and observe DbVisualizer read the metadata information from Athena (well, Glue, really), including tables and views.

ODBC (on OSX)

  1. Download and run the ODBC driver installer.
  2. Create or edit /Library/ODBC/odbcinst.ini to add the following information:
    [ODBC Drivers]
    Simba Athena ODBC Driver=Installed
    [Simba Athena ODBC Driver]
    Driver = /Library/simba/athenaodbc/lib/libathenaodbc_sbu.dylib

    If the odbcinst.ini file already has entries, put new entries into the appropriate sections; e.g., if it was
    [ODBC Drivers]
    PostgreSQL Unicode = Installed
    [PostgreSQL Unicode]
    Description = PostgreSQL ODBC driver
    Driver = /usr/local/lib/psqlodbcw.so

    Then it becomes
    [ODBC Drivers]
    PostgreSQL Unicode = Installed
    Simba Athena ODBC Driver=Installed
    [PostgreSQL Unicode]
    Description = PostgreSQL ODBC driver
    Driver = /usr/local/lib/psqlodbcw.so
    [Simba Athena ODBC Driver]
    Driver = /Library/simba/athenaodbc/lib/libathenaodbc_sbu.dylib
  3. Create or edit, in a similar fashion, /Library/ODBC/odbc.ini to include the following information:
    [AthenaDSN]
    Driver=/Library/simba/athenaodbc/lib/libathenaodbc_sbu.dylib
    AwsRegion=us-east-1
    S3OutputLocation=s3://athena.out
    AuthenticationType=IAM Credentials
    UID=AWS_ACCESS_KEY
    PWD=AWS_SECRET_KEY
  4. If you wish to test, download and run ODBC Manager. You should see that it successfully recognizes the DSN.
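
A missing or misplaced key in odbc.ini is an easy mistake when merging entries by hand. As a rough sketch, the file contents can be sanity-checked before opening any tool: the code below parses INI-style text and lists which of the keys from the [AthenaDSN] example are absent. OdbcIniCheck is a hypothetical name, not part of any ODBC tooling, and the parser is deliberately simplistic (key matching is case-sensitive, and it assumes plain key=value lines).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical checker (not part of any ODBC tooling): parses INI-style text
// and reports which expected keys are missing from a DSN section.
class OdbcIniCheck {
  // Keys taken from the [AthenaDSN] example above.
  static final List<String> EXPECTED = List.of(
      "Driver", "AwsRegion", "S3OutputLocation", "AuthenticationType", "UID", "PWD");

  // Parses "[section]" headers and "key = value" lines; skips blanks and comments.
  static Map<String, Map<String, String>> parse(String text) {
    Map<String, Map<String, String>> sections = new LinkedHashMap<>();
    Map<String, String> current = null;
    for (String raw : text.split("\\R")) {
      String line = raw.trim();
      if (line.isEmpty() || line.startsWith("#") || line.startsWith(";")) continue;
      if (line.startsWith("[") && line.endsWith("]")) {
        String name = line.substring(1, line.length() - 1).trim();
        current = sections.computeIfAbsent(name, k -> new LinkedHashMap<>());
      } else if (current != null && line.contains("=")) {
        int eq = line.indexOf('=');
        current.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
      }
    }
    return sections;
  }

  // Returns the expected keys that the given DSN section does not define.
  static List<String> missingKeys(String text, String dsn) {
    Map<String, String> section = parse(text).getOrDefault(dsn, Map.of());
    List<String> missing = new ArrayList<>();
    for (String key : EXPECTED) {
      if (!section.containsKey(key)) missing.add(key);
    }
    return missing;
  }
}
```

To check the real file, pass in the contents of /Library/ODBC/odbc.ini (for example, read with Files.readString on Java 11 or later) and "AthenaDSN" as the section name; an empty result means all six keys are present.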

Example using ODBC: Excel

  1. Switch to the Data tab, and under New Database Query select From Database.
  2. In the iODBC Data Source Chooser window, select the AthenaDSN we configured above and hit OK.
  3. Annoyingly, despite having configured it, you will be asked for credentials again. Enter the access key and the secret key.
  4. You should see a Microsoft Query window.

Success!

Helpful links