Working with data feeds

A data feed is the transfer of data via one or more files, from one system to another. The data is not real-time; it is sent on a regular schedule, such as every weekday morning. In this respect a data feed differs from an API, which passes specific pieces of data between systems ‘on the fly’ or on request.

This post outlines the basics of integrating a third-party data feed into an application, for business analysts and project managers.

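There are five basic components to taking in a data feed:

1. Commercials
2. Connectivity
3. Data mapping and integration
4. Testing
5. Go-live, support and monitoring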

People involved will be:

Some things to note about integrating data feeds:

Data feed integration is specialised work that requires a data developer. It should also be overseen by a technical specialist who understands the behaviour of data and processes in your application very well. Without that oversight, you risk integrating a feed that the application cannot make full use of.

Quite likely, your application will need a new release to support the data integration. That could be just a ‘server-side’ release (database and other background stuff) or a server-side and ‘client-side’ release (the application that your users use).

Testing needs to be done multiple times, in multiple environments – including in production.

So, let’s get started!

1: Commercials

In other words, securing agreement with the data feed vendor. The agreement needs to cover support levels, including what happens if there is a problem with the feed, and remediation for late or missing data.
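To make "late or missing data" concrete, here is a minimal sketch of how a delivery check might look. The 07:00 cutoff, the weekday-only schedule and the function names are all assumptions for illustration; the real values come from your agreement with the vendor.

```python
from datetime import date, datetime, time

# Hypothetical agreement: a file is due every weekday by 07:00 local time.
DELIVERY_CUTOFF = time(7, 0)


def is_delivery_day(day: date) -> bool:
    """Monday to Friday count as delivery days (public holidays are ignored here)."""
    return day.weekday() < 5


def feed_status(now: datetime, received_dates: set[date]) -> str:
    """Classify today's delivery as 'not due', 'received', 'pending' or 'late'."""
    today = now.date()
    if not is_delivery_day(today):
        return "not due"
    if today in received_dates:
        return "received"
    # Nothing has arrived yet: it is only late once the agreed cutoff has passed.
    return "late" if now.time() > DELIVERY_CUTOFF else "pending"
```

A status of "late" is the point at which the support and remediation terms in the agreement would apply.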

Tip: if your organisation is new to integrating and managing third-party data, choose standard feeds and agreements where possible, and avoid bespoke data and agreements unless you really, really need them.

2: Connectivity

The two sides (the data vendor and your organisation) agree on a method of connectivity, such as SFTP, and then set it up and test it. This needs to be done first between test environments, and then again each time you implement the feed in another environment: staging or pre-production if you have it, and finally production (go-live). The environments used will depend on what your organisation and your vendor have available, your test requirements, and time and cost trade-offs.
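As a rough illustration of repeating the connectivity test per environment, the sketch below keeps one endpoint per environment and checks that a network connection can be opened. The host names are made up, and the check only proves the port is reachable; it does not log in or transfer a file, which a real SFTP test would also need to do.

```python
import socket

# Hypothetical hosts; the real ones come from the vendor for each environment.
FEED_ENDPOINTS = {
    "test": ("sftp-test.vendor.example", 22),
    "staging": ("sftp-staging.vendor.example", 22),
    "production": ("sftp.vendor.example", 22),
}


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unknown host names.
        return False
```

Running `can_reach(*FEED_ENDPOINTS["test"])` from each of your own environments is a quick first check before the fuller file-transfer test.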

3: Data mapping and integration

This is by far the biggest part of the work, and the most time-consuming.

If your application already has a mature data structure then the work is to map the data feed into that structure. If your organisation is building or extending an application, then the application’s data structure needs to be built or extended as well.
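At its simplest, mapping a feed into an existing structure means deciding which feed field goes to which application field. The sketch below shows that idea only; the field names on both sides are invented for illustration, and a real mapping would also cover data types, formats and business rules.

```python
# Hypothetical mapping from vendor column names to the application's field names.
FIELD_MAP = {
    "CUST_ID": "customer_id",
    "CUST_NM": "customer_name",
    "BAL_AMT": "balance",
}


def map_record(feed_row: dict[str, str]) -> dict[str, str]:
    """Rename known feed fields; fail loudly on missing or unmapped fields."""
    missing = FIELD_MAP.keys() - feed_row.keys()
    unexpected = feed_row.keys() - FIELD_MAP.keys()
    if missing or unexpected:
        raise ValueError(f"missing={sorted(missing)} unexpected={sorted(unexpected)}")
    return {FIELD_MAP[name]: value for name, value in feed_row.items()}
```

Rejecting unexpected fields, rather than silently dropping them, is a design choice in this sketch: it surfaces vendor changes to the feed early instead of letting them pass unnoticed.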

Each data feed needs to be taken in and stored, essentially in a database that the data development team sets up.

This means there will be at least two data stores involved: one for the raw data coming in from the feed, and one that sits behind your organisation’s application and serves data to the application’s interface (screen).
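The two-store idea can be sketched with a small in-memory database. Everything here is an assumption for illustration: the table names, the columns and the sample feed content are invented, and two tables in one SQLite database stand in for what would normally be two separate stores.

```python
import csv
import io
import sqlite3

# Hypothetical feed content; a real feed would arrive as a file from the vendor.
FEED_FILE = "CUST_ID,BAL_AMT\n1,10.50\n2,not-a-number\n"


def load_feed(conn: sqlite3.Connection, feed_text: str) -> None:
    """Store raw rows untouched, then copy only valid rows to the application table."""
    conn.execute("CREATE TABLE raw_feed (cust_id TEXT, bal_amt TEXT)")
    conn.execute("CREATE TABLE app_balance (customer_id INTEGER, balance REAL)")
    rows = list(csv.DictReader(io.StringIO(feed_text)))
    # Raw store: keep every row exactly as received.
    conn.executemany("INSERT INTO raw_feed VALUES (:CUST_ID, :BAL_AMT)", rows)
    for row in rows:
        try:
            record = (int(row["CUST_ID"]), float(row["BAL_AMT"]))
        except ValueError:
            continue  # Bad rows stay in raw_feed for investigation.
        # Application store: only rows that converted cleanly.
        conn.execute("INSERT INTO app_balance VALUES (?, ?)", record)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_feed(conn, FEED_FILE)
```

Keeping the raw rows exactly as received means that when something looks wrong on screen, the team can check whether the problem came from the vendor's file or from the mapping.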

The high-level steps to map a data feed are:

4: Testing

5: Go-live, support and monitoring

Hooray! Testing has passed, and you all thoroughly understand the data feed and all the possible data you can expect to receive (or do you?!). So now it’s time to go live.

Here are the steps:

TLDR: Tips for working with data feeds