Polling data from the API

A common pattern we see our customers employing is to run a periodic (regular) process to retrieve the Long Energy data for their devices via the API, so that they can transfer it to their own data store for further analysis or provision through their own apps and APIs.

This support note outlines some common approaches (and potential pitfalls) for this kind of periodic process.

We are exploring options to enable a stream-based "push" API that provides significant benefits for this type of use case. If you have requirements like those described here, and would like to participate in early trials of the stream API, please contact the Wattwatchers team to discuss.

General principles

"Internet of Things" (IoT) applications are not like traditional computer systems in terms of data delivery. They are susceptible to communications outages and other issues that may delay the transmission of data from the device to the Wattwatchers infrastructure, and thus affect what is available via the API at any point in time.

Similarly, while we work very hard to ensure that our systems work flawlessly, 24/7, there are occasions where delays may be introduced within the Wattwatchers system, for example a short lag in message processing during a system maintenance window or unexpected server outage.

In such cases, data may be delayed in being processed and available via our REST API.

It is important that you consider this potential in the design of your application.

An example

For example, let's say a WiFi device is connected to a network that goes down for two weeks while a homeowner is on holiday. The device still has power and will continue to collect data to its local data store over that two-week period, but will not be able to communicate that data to the Wattwatchers system.

When the homeowner returns, they reboot their WiFi network or internet connection, the fault is corrected, and the device resumes sending the data it has collected to the Wattwatchers system.

Note that this "catch-up" process may take some time to complete, due to the nature of the Wattwatchers communications protocols. How long it takes depends on how long the device was offline, and how reliable the communications channel from the device is.

In this example, assuming a solid connection is re-established between the device and Wattwatchers, it may take up to 6 hours for the device to completely catch up under normal operating conditions.

Polling for data

In the following sections we outline some of the approaches (design patterns) we've seen our customers employing, and review some of the potential pros and cons of each.

Polling the last X mins

In this pattern, a script or process runs periodically to retrieve the data for a previous period on a rolling basis.

For example, at 10 minutes past the hour, the script requests data for the previous hour.

This is a very simple and straightforward approach, and may be suitable for a quick "proof of concept" or prototype application, but it is not robust to common issues with data synchronisation between devices and the Wattwatchers system.

That is to say, if a device experiences a delay in reporting data, the API call is quite likely to return incomplete data from your devices, and you won't have any mechanism to detect the error.
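
The time calculation behind this pattern can be sketched as follows. This is a minimal illustration rather than official client code: it assumes "the previous hour" means the last full clock hour, that the API accepts Unix timestamps in seconds, and that the end of the window is passed in a parameter alongside fromTs. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def previous_hour_window(now: datetime) -> tuple[int, int]:
    """Return (from_ts, to_ts) in Unix seconds for the last full clock hour before `now`."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    start = top_of_hour - timedelta(hours=1)
    return int(start.timestamp()), int(top_of_hour.timestamp())


# A run at 10:10 UTC asks for 09:00 to 10:00.
run_time = datetime(2024, 1, 1, 10, 10, tzinfo=timezone.utc)
from_ts, to_ts = previous_hour_window(run_time)
```

Because the window depends only on the clock, nothing tells the script whether the data inside it was complete when it was fetched.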

Pros

Straightforward to implement
No need to store state (can be based purely on time calculations)

Cons

Highly likely to "miss" data if a comms error occurs

Rolling the last X hours

A variation on the "Polling the last X mins" approach is to have the rolling period extend for multiple hours. For example, at 10 minutes past the hour, requesting data for the previous 24 hours. Each hour, the time window would roll forward.

This approach is slightly more robust, and is likely to catch devices that have a temporary outage that come back online within the time window specified. It also enables detection of changes when you are working in granularities other than 5 minutes. For example, if you are retrieving data on a 15 minute granularity, and a device goes offline for a short period, when it comes back online the data reported for a previous 15 minute period may change (see note "Gaps" in a device's reported data below for more on this).

However, it still has the potential to miss data from previous periods if the device doesn't come back online within the specified period (e.g. the 24 hour window).
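
Because successive windows overlap, each poll returns periods you have already stored. One way to handle this, sketched below, is to upsert entries by timestamp and note which ones are new or different. The `timestamp` field name and the `merge_window` helper are assumptions for illustration; `duration` is the property described later in this note.

```python
def merge_window(store: dict[int, dict], entries: list[dict]) -> list[int]:
    """Upsert entries keyed by timestamp; return the timestamps that were new or changed."""
    touched = []
    for entry in entries:
        ts = entry["timestamp"]
        if store.get(ts) != entry:
            store[ts] = entry
            touched.append(ts)
    return touched


store: dict[int, dict] = {}
ts = 1704104100  # hypothetical period timestamp

# First poll: the 15 minute period only contains 5 minutes of data.
first = merge_window(store, [{"timestamp": ts, "duration": 300}])
# A later poll over an overlapping window: the device has caught up.
second = merge_window(store, [{"timestamp": ts, "duration": 900}])
# Polling again with no change touches nothing.
third = merge_window(store, [{"timestamp": ts, "duration": 900}])
```

In a real system the dictionary would be replaced by your own data store, with the returned list driving any downstream recalculation.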

Pros

Straightforward to implement
Provides some robustness to pick up "missing" data
No need to store state (can be based purely on time calculations)

Cons

Still has strong potential to miss previous data for extended outages
Requires your system to detect and handle "changes" to previously retrieved periods when working at a granularity coarser than 5m

Maintaining a "last Long Energy" state for device

The most robust model is for your system to maintain some record of the last energy data you have received from your device(s).

Your periodic process is then responsible for retrieving data from the point where you last received data from a device to present time.

This process will need to take into consideration the maximum time period applicable to the specific API call(s) you are making.

Using the example above, while the device is online, you are retrieving the data with a fromTs that matches the timestamp of the last received Long Energy data from the device.

When the device goes offline, the last received Long Energy timestamp doesn't increase, as no new data is being reported to the system.

When the device comes back online again, the last received Long Energy timestamp increments, and your system will start to receive the data incoming from the device, as it becomes available during the "catch-up" process.
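
A minimal sketch of this model is shown below. It splits the range from the stored timestamp to the present into windows no longer than the per-call maximum, and advances the stored timestamp only when data actually arrives. The 7 day limit, the `timestamp` field name, and the `fetch(from_ts, to_ts)` callable (standing in for your own API client) are all assumptions; check the maximum period for the specific call you are making.

```python
from typing import Callable, Iterator

MAX_SPAN_SECONDS = 7 * 24 * 3600  # assumed per-call limit, not confirmed by this note


def windows(from_ts: int, to_ts: int, max_span: int = MAX_SPAN_SECONDS) -> Iterator[tuple[int, int]]:
    """Yield consecutive (start, end) windows covering from_ts..to_ts."""
    start = from_ts
    while start < to_ts:
        end = min(start + max_span, to_ts)
        yield start, end
        start = end


def catch_up(
    last_ts: int,
    now_ts: int,
    fetch: Callable[[int, int], list[dict]],
    max_span: int = MAX_SPAN_SECONDS,
) -> tuple[list[dict], int]:
    """Fetch everything newer than last_ts; return (entries, new_last_ts)."""
    by_ts: dict[int, dict] = {}
    for start, end in windows(last_ts, now_ts, max_span):
        for entry in fetch(start, end):
            # Skip the entry we already hold and de-duplicate window boundaries.
            if entry["timestamp"] > last_ts:
                by_ts[entry["timestamp"]] = entry
    entries = [by_ts[ts] for ts in sorted(by_ts)]
    new_last_ts = entries[-1]["timestamp"] if entries else last_ts
    return entries, new_last_ts
```

If the device is offline, `catch_up` returns an empty list and the unchanged timestamp, so the next run simply asks again from the same point.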

Pros

Most robust model
Will pick up all data from a device
Provides ability to detect "changes" when working at a granularity coarser than 5m

Cons

More complicated to implement
Requires storing of the device state within your system (e.g. in a database)

"Gaps" in a device's reported data

Communications delays are inevitable. Wattwatchers devices maintain a local data store of at least one month under normal operation. When they come back online, the devices will send this data from earliest to latest.

Variable 'duration' value

If you request data for a device from the API and there is only partial data available for that period, the duration property will reflect this.

For example, if a device goes offline at 10:05am (after reporting the last 5 mins data), and you request 15 minute granularity for the period 10:00am to 10:15am, the duration for the returned data will be 300 instead of 900.

If the device comes back online later, and catches up the data it collected during the period it was offline, calling the API again for the same period will result in 900 being returned as the duration, provided there were 3 full Long Energy entries for that period.
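
The check described above can be expressed in a few lines. This is a sketch only: the helper names are hypothetical, and it assumes each returned entry is a dictionary whose `duration` is given in seconds, as in the 300 and 900 example.

```python
def is_complete(entry: dict, granularity_seconds: int) -> bool:
    """True when the entry covers the whole requested period."""
    return entry["duration"] >= granularity_seconds


def missing_seconds(entry: dict, granularity_seconds: int) -> int:
    """Seconds of the requested period that the entry does not yet cover."""
    return max(granularity_seconds - entry["duration"], 0)


partial = {"duration": 300}  # device went offline after the first 5 minutes
caught_up = {"duration": 900}  # same period, requested again after catch-up
```

Here `is_complete(partial, 900)` is false with 600 seconds missing, while `is_complete(caught_up, 900)` is true.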

Handling gaps when polling

Sometimes, however, a device may not report for a specific 5 minute period. For example, during a reboot or a system update, or if the power to the entire site goes down for a period.

In these cases, there will be "gaps" in the 5m data, as the device does not have data to report for the period. That is to say, if you retrieve data at 5 minute granularity, you will see this as a "missing" entry for the period. If you retrieve data at a coarser granularity (e.g. 15m) you will see this reflected in a reduced value of the duration property.

Under normal operation, such gaps will not be present in the data if the device has collected data for the period and successfully reported the period (and/or subsequent periods) to the Wattwatchers system.
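
To find such gaps in 5 minute data you have already stored, you can compare consecutive timestamps. The sketch below assumes timestamps are Unix seconds aligned to 5 minute boundaries; the `find_gaps` name is hypothetical.

```python
def find_gaps(timestamps: list[int], step: int = 300) -> list[tuple[int, int]]:
    """Return (before_ts, after_ts) pairs where consecutive entries are more than `step` apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


# The entry for 600 is missing, so one gap is reported between 300 and 900.
gaps = find_gaps([0, 300, 900, 1200])
```

A gap that sits before later, successfully reported data is likely to be a genuine gap rather than data still in transit.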

Handling empty data returned

Sometimes the outage will be long enough that you get an empty data set [] for the period you are requesting. For example, if the device loses power for more than 7 days.

If you are relying solely on the 'last Long Energy' value based on the contents of the returned data, and the gap between the 'last Long Energy' value and the next valid data entry for a device is more than 7 days, you will not get the latest data using this method.

In addition to storing the 'last Long Energy' state, when you poll you should also check the /long-energy/{device-id}/latest endpoint to confirm you have all of the data collected for the device.
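
One way to combine the two checks is sketched below. It decides where the next poll should start, given the entries returned for a window and the timestamp reported by the latest endpoint. The function name, its parameters, and the `timestamp` field are assumptions for illustration.

```python
def next_from_ts(last_ts: int, entries: list[dict], window_end: int, latest_ts: int) -> int:
    """Return the fromTs to use on the next poll."""
    if entries:
        # Normal case: continue from the newest entry received.
        return max(entry["timestamp"] for entry in entries)
    if latest_ts > window_end:
        # Empty window, but newer data exists beyond it: step over the gap.
        return window_end
    # Empty window and nothing newer: the device has not reported, so stay put.
    return last_ts
```

Stepping over an empty window relies on devices sending stored data from earliest to latest: if the latest value is already beyond the window, anything the device had for that window should already have arrived.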

Recommendations

When transferring data between one system and another, we recommend:

Use the "last Long Energy" state method for your application
Use 5 minute data, and calculate rollups (e.g. to 15m or coarser granularity) within your own system, only after you are confident you have all data in your system (e.g. when subsequent records for later periods have been detected)
If using a granularity coarser than 5m (e.g. 15m, 30m etc.), design your application to support changes in previous period data
Use the /long-energy/{device-id}/latest value to determine the end date for importing, and step from your first timestamp to the latest value in regular intervals, rather than relying on the returned timestamp in the /long-energy/{device-id} endpoint alone.
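
The rollup recommendation can be sketched as follows. This is an illustration under stated assumptions, not a description of how the API aggregates data: it assumes `timestamp` marks the start of each 5 minute period (adjust the bucketing if it marks the end), that `duration` is in seconds, and it uses a hypothetical `energy` field as the value to sum.

```python
from collections import defaultdict

FIFTEEN_MIN = 900


def rollup_15m(entries: list[dict], value_key: str = "energy") -> dict[int, float]:
    """Sum 5m entries into 15m buckets, keeping only buckets that are full and closed."""
    buckets: dict[int, list[dict]] = defaultdict(list)
    for entry in entries:
        # Assumption: `timestamp` is the START of the 5 minute period.
        bucket_start = entry["timestamp"] - entry["timestamp"] % FIFTEEN_MIN
        buckets[bucket_start].append(entry)

    newest_bucket = max(buckets, default=None)
    result: dict[int, float] = {}
    for bucket_start, members in buckets.items():
        covers_full_period = sum(m["duration"] for m in members) == FIFTEEN_MIN
        later_data_seen = bucket_start < newest_bucket
        if covers_full_period and later_data_seen:
            result[bucket_start] = sum(m[value_key] for m in members)
    return result
```

The newest bucket is always held back until data for a later period has been seen. A bucket containing a genuine gap is never emitted by this sketch; you may prefer to emit it with a flag once you are confident the gap is permanent.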