We often need to count field data at Volmanac, either from hardware or the web. For example, we may hire people to visit stores or count customers, call producers, or ingest third-party data from a source such as Live Ramp. We wanted to build a simple system that could ingest this data:

  1. Accurately
  2. Cost-effectively
  3. At scale

What we settled on was using AWS API Gateway to send data to a Lambda function and, from there, on to DynamoDB and potentially Redshift for additional analysis. We will walk through a very basic example of our setup below:

[Diagram: counter architecture]

Lambda and DynamoDB are extremely cheap, scale to our needs without provisioning servers, and DynamoDB supports atomic counters, which are perfect for a simple counter. (Atomic updates are not idempotent, but that is fine for our counts; we can tolerate a small margin of error.)

The DynamoDB documentation describes this well: you use the UpdateItem operation to increment or decrement the value of an existing attribute without interfering with other write requests (all write requests are applied in the order in which they were received). For example, a web application might want to maintain a counter per visitor to its site. In this case, the application would need to increment this counter regardless of its current value.
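As a concrete sketch, an atomic counter using the document client we set up later in this post might look like the following; the table name counters, key counterId, and attribute hits are illustrative, not our actual schema:

// Atomically add 1 to the "hits" attribute of a single item.
// Concurrent ADDs are applied in arrival order, so there is no
// read-modify-write race (though client retries can double-count).
dynamodb.updateItem({
    "TableName": "counters",
    "Key": { "counterId": "store-visits" },
    "UpdateExpression": "ADD hits :incr",
    "ExpressionAttributeValues": { ":incr": 1 }
}, function(err, data) {
    if (err) console.log("counter update failed: " + err);
    else console.log("counter incremented");
});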

Our total cost for ~2mm requests/month across API Gateway, Lambda, and DynamoDB, using 256MB in our Lambda function (I would not recommend using less), is roughly ~$15/month depending on how we provision our DynamoDB write capacity; Redshift costs a bit more.

The first step is to create an AWS Lambda function. Name your function, select a Node.js runtime and 256MB of memory, and attach an execution role with the permissions you need (you can change these later; the lambda_dynamo role template is a fine start). Copy the code below into the editor and test the function:

exports.handler = function(event, context) {
    context.done(null, "First Lambda Function");
}

We will amend the function later, but for now let's create the API using API Gateway. Create a new API, then create a new GET method (this tutorial works for POST requests as well). For integration type, select "Lambda Function" and enter the function you just created.

Next, under "Integration Request" => "Body Mapping Templates", add a mapping template with content type "application/json". Under "Generate template" select "Method Request Passthrough" and save.

[Screenshot: integration request body mapping template]
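With that template in place, the event your Lambda function receives should look roughly like this for a GET request with query params (trimmed; the exact contents depend on the request):

// Approximate passthrough event for GET /?param1=foo&param2=bar
var sampleEvent = {
    "body-json": {},
    "params": {
        "path": {},
        "querystring": { "param1": "foo", "param2": "bar" },
        "header": {}
    },
    "stage-variables": {},
    "context": {}
};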

Next, go back to your Lambda function, require a few libraries, and set a variable to capture the request params.

var doc = require('dynamodb-doc');
var AWS = require('aws-sdk');
var dynamodb = new doc.DynamoDB();
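// Firehose client, used later to stream a copy of each event toward Redshift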
var firehose = new AWS.Firehose({apiVersion: '2015-08-04'});

exports.handler = function(event, context) {
    //get variables
    var params = event.params;
    ...
}

Next, to deploy your API, you will need to create a stage, which you can name anything. Once you deploy, you will receive a URL which you can test. You should see the text "First Lambda Function" if you visit the page. You can now add params to the end of your GET request, e.g. ?param1=foo&param2=bar, and access these within the Lambda function with params["querystring"]["param1"].
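Putting the pieces so far together, a handler that simply echoes those two example params back might look like this:

exports.handler = function(event, context) {
    // With the Method Request Passthrough template, query string
    // values arrive under event.params.querystring
    var params = event.params;
    var param1 = params["querystring"]["param1"];
    var param2 = params["querystring"]["param2"];
    context.done(null, "Received: " + param1 + ", " + param2);
}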

Next, create a DynamoDB table and define a primary partition key. We can then add the code below to our Lambda function to write items to our Dynamo table. Note that you can add any additional attributes you wish (this is not a relational data store); you just have to pass the correct table name and include the primary key you defined.

...
//write to DynamoDB
dynamodb.putItem({
    "TableName": tableName,
    "Item" : {
        ...
        "param1": param1,
        "param2": param2,
        ...
    }
}, function(err, data) {
    if (err) {
        context.done('error putting item into DynamoDB: ' + err);
    }
    else {
        console.log("wrote to dynamo");
        context.done(null, 'Wrote to DynamoDB: ' + data);
    }
});
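For reference, a filled-in version of that call might look like the following; the table name visits, its partition key visitId, and the timeOfEvent attribute are hypothetical stand-ins for whatever you defined:

dynamodb.putItem({
    "TableName": "visits",
    "Item": {
        "visitId": context.awsRequestId,  // the partition key must always be present
        "param1": param1,
        "param2": param2,
        "timeOfEvent": new Date().toISOString()
    }
}, function(err, data) {
    if (err) context.done('error putting item into DynamoDB: ' + err);
    else context.done(null, 'Wrote to DynamoDB');
});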

You can also put records with additional data if you want to track individual events, such as whether a specific cookie performed a certain action. Additionally, you can save data to Redshift. First create a Redshift cluster and download the driver. Make sure you limit outside accessibility to only your IP address. Then create the table you require:

create table your_table_name (
    param_one varchar(100),
    param_two varchar(100),
    param_three varchar(50),
    time_of_event timestamp encode delta32k sortkey
);

After that, you need to set up a Kinesis Firehose delivery stream with the params that you want to save. You enter the column names as a comma-separated list, which must match the names in the table you just defined. Also make sure that when you send values from your Lambda function they are in the same order as that comma-separated list.

...
var f_params = {
    DeliveryStreamName: 'STREAM-NAME',
    Record: {
        Data: data1 + "," + data2 + "," + data3 + "," + data4 + "\n"
    }
};

firehose.putRecord(f_params, function(err, data) {
    if (err) console.log(err, err.stack);
    else console.log("write to firehose");
});
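One thing to keep in mind: Firehose buffers records by size or time interval and stages them in S3 before issuing a COPY into Redshift, so rows show up minutes after the event rather than immediately, which is fine for batch analysis.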

API Gateway, Lambda, Kinesis Firehose, and DynamoDB are amazing tools that save developers (and investors) significant time and money. This exercise only scratches the surface of what they can do, but it is an important case study in how the world is changing. Volmanac can monitor millions of events (and scale to billions) directly related to investments we are tracking, for virtually zero cost and developer time.