Managing Updates on Large Fleets of Remote Devices

by Andy Slote - Director of Customer Success for ObjectSpectrum

Dec 01 2023


Device management presents many challenges, and which ones matter most depends on a long list of factors. Managing firmware, software, and configuration changes for devices in the field often requires detailed knowledge and expertly designed software.

Many devices lack the processing power, memory, or other resources to be updated by any means other than in close proximity to the device, using NFC or Bluetooth, or through a direct connection such as USB or a custom serial cable. Even some wireless connectivity technologies provide only limited update capability or none at all. Our blog audience knows the extent of our work with the LoRaWAN standard, which supports “over-the-air” updating with the right management software. However, relatively few LoRaWAN devices currently accommodate it. This article focuses on environments where a remote option is technically feasible and widely supported, which usually means wireless connectivity over WiFi, cellular, or satellite networks.

Devices

The characteristics of the devices under management weigh heavily on how extensively they can be updated remotely. There is a broad range here, from relatively simple “low end” devices to those with powerful processors and extensive memory. If remote management and updates are needed, the solution design must treat these attributes as a primary factor.

For example, the available memory on a device may be sufficient to store more than one software version. In this situation, it may make sense to download all the files before triggering the update, which minimizes downtime because the device keeps running on the current version until the new one is complete. Having enough memory to store two copies of the content also reduces risk, allowing automatic fallback to the previous content in case of corruption. When memory is limited, a “wipe and replace” approach may be the only option, which could take the device out of service for longer and risks “bricking” the device.
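The two-copy approach can be sketched in a few lines. This is a minimal illustration, not a real bootloader: the `DualSlotDevice` class, its slot names, and the `health_check` callback are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class DualSlotDevice:
    """Hypothetical device with two storage slots, "A" and "B"."""
    slots: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"A": "1.0", "B": None})
    active: str = "A"

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def apply_update(self, version: str,
                     health_check: Callable[[str], bool]) -> bool:
        """Stage the new version in the inactive slot, switch to it,
        and fall back to the previous slot if the health check fails."""
        previous = self.active
        target = self.inactive()
        self.slots[target] = version   # staged while the old version keeps running
        self.active = target           # the switch is the only step causing downtime
        if health_check(version):
            return True
        self.active = previous         # automatic fallback; the old slot was never touched
        return False
```

A device with only one slot has no `previous` to return to, which is why the "wipe and replace" path carries more risk.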

Connectivity

Firmware, configuration, or software updates can be challenging over wireless connections.

The available bandwidth may be limited, and operating effectively within this constraint is an element of a well-designed solution. Doing so may require downloading one file at a time, or even “chunks” of larger files, so that the update never saturates the connection. If the same wireless network is supporting other traffic at the same time, this type of metering becomes even more critical.

When there are “concentrations” of devices in the same location, an option to consider is enabling devices to obtain files from peers on the same local network. File downloads may be slow initially but will increase speed as more devices share files using the LAN bandwidth rather than relying solely on the site’s Internet connection capacity.
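A toy model shows why peer sharing accelerates over time. It assumes, purely for illustration, that the Internet link can serve a fixed number of devices per round and that every device already holding the file can serve one peer per round; `rounds_to_distribute` and its parameters are hypothetical.

```python
def rounds_to_distribute(devices: int, cloud_slots: int, use_peers: bool) -> int:
    """Count rounds until every device at a site holds the file.

    Each round, the Internet link serves `cloud_slots` devices. With
    `use_peers`, each device that already has the file also serves one peer.
    """
    if cloud_slots < 1:
        raise ValueError("cloud_slots must be at least 1")
    have = 0
    rounds = 0
    while have < devices:
        served = cloud_slots + (have if use_peers else 0)
        have = min(devices, have + served)
        rounds += 1
    return rounds
```

Under these assumptions, 100 devices behind a link that serves one device per round need 100 rounds without sharing and 7 rounds with it, because the number of local sources roughly doubles each round.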

Another essential aspect of good connectivity management is detecting interruptions and accurately tracking the progress of every download or file share, so that each process can restart at the correct point once a failed connection is restored.
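Chunked transfer and restart-at-the-correct-point fit together naturally, as the sketch below suggests. The `fetch_chunk(offset, length)` callback is an assumption standing in for whatever transport a real system uses (an HTTP range request, for instance); the function name and default chunk size are also illustrative.

```python
from typing import Callable


def download_resumable(
    fetch_chunk: Callable[[int, int], bytes],
    total_size: int,
    buffer: bytearray,
    chunk_size: int = 4,
) -> bool:
    """Fetch the remaining bytes chunk by chunk; return True when complete.

    `buffer` holds everything received so far, so its length is the
    offset to resume from after an interruption.
    """
    while len(buffer) < total_size:
        offset = len(buffer)
        length = min(chunk_size, total_size - offset)
        try:
            data = fetch_chunk(offset, length)
        except ConnectionError:
            return False   # progress is preserved in `buffer`; call again later
        if not data:
            return False   # source returned nothing; avoid looping forever
        buffer.extend(data)
    return True
```

Calling the function again with the same `buffer` after a failure continues from the last byte received rather than starting over.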

Timing

Timing is another consideration, usually important in environments where devices are in use for a portion of the day, or all day, and taking them down would have a significant impact (in a retail environment during business hours, for example). Updates must typically occur outside of this window, but having the ability to apply them immediately in critical situations is also an important feature.
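As a rough sketch, the scheduling rule reduces to a window check with a critical override. The function name `may_update_now` and its parameters are hypothetical, and a real system would also need to account for time zones per site.

```python
from datetime import time


def may_update_now(now: time, open_at: time, close_at: time,
                   critical: bool = False) -> bool:
    """Return True if an update may start at `now`.

    Business hours run from `open_at` (inclusive) to `close_at`
    (exclusive) and may cross midnight. Critical updates bypass the window.
    """
    if critical:
        return True
    if open_at <= close_at:
        in_business_hours = open_at <= now < close_at
    else:  # hours cross midnight, e.g. 18:00 to 02:00
        in_business_hours = now >= open_at or now < close_at
    return not in_business_hours
```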

Another potential timing requirement is the need for all devices in a location to “flip” to a new software load simultaneously for consistency. A pricing change in a restaurant is one example: if devices in the same store show different prices, the result is confusion and potential customer issues.
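One simple way to coordinate a simultaneous flip, assuming each device reports the version it has fully staged, is to hold the switch until every device at the site reports the target version. The `ready_to_flip` helper and the shape of the `staged` mapping are illustrative assumptions.

```python
from typing import Dict


def ready_to_flip(staged: Dict[str, str], target: str) -> bool:
    """Return True only when every device at the site has staged `target`.

    `staged` maps a device ID to the version it has fully downloaded.
    An empty site is never considered ready.
    """
    return bool(staged) and all(v == target for v in staged.values())
```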

Implementation

An effective solution for updates and supporting capabilities would include two primary components:

  1. Control and monitoring logic in the cloud
  2. A device-level “Agent”

The update process is driven by a list of the files to be downloaded, with attributes such as file type, size, and integrity checks like cryptographic hashes. When possible, the objective is to download some of the specified files to a small set of devices to minimize Internet bandwidth use, and then use local file sharing between devices to help distribute all the files. The control logic manages this process until it is complete.
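A file list entry and its integrity check might look like the following. The `ManifestEntry` fields mirror the attributes named above, but the class itself, the field names, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    """One hypothetical entry in the list of files to distribute."""
    name: str
    file_type: str
    size: int      # expected size in bytes
    sha256: str    # expected hex digest of the file contents


def verify(entry: ManifestEntry, data: bytes) -> bool:
    """Accept a downloaded file only if both size and hash match."""
    return (len(data) == entry.size
            and hashlib.sha256(data).hexdigest() == entry.sha256)
```

Because the hash is checked on the receiving device, a file is verified the same way whether it came from the cloud or from a peer on the local network.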

Remote management and updates are more difficult on most connected devices because network limitations and typical firewall rules almost always require the device, rather than the cloud, to establish the connection, and that connection must be robust and reliable while using minimal bandwidth. The Agent solves this dilemma. It is a simple, efficient program running on each device that recognizes which files are already on the device and which are still needed, all based on the previously mentioned list. It requests files from the cloud-based control logic, which determines the source: either the cloud or a locally available resource. The Agent also performs “clean up” by identifying and removing unnecessary files and initiating a restart or reboot when required.
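The Agent's comparison of the file list against local storage can be sketched as a simple diff. Here both sides are represented as a mapping from file name to content hash; `plan_sync` and that representation are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def plan_sync(manifest: Dict[str, str],
              local: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Compare the file list with what is on the device.

    Both arguments map a file name to its content hash. Returns
    (files to request, files to remove during clean-up).
    """
    # Missing files and files whose hash differs must be fetched.
    needed = sorted(n for n, h in manifest.items() if local.get(n) != h)
    # Anything on the device that the list no longer mentions is obsolete.
    obsolete = sorted(n for n in local if n not in manifest)
    return needed, obsolete
```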

For greater efficiency, the solution maintains a site-specific “awareness” of available Internet and LAN bandwidth, with two parameters that can be set to work within the constraints of a location:

  1. Maximum number of simultaneous downloads
  2. Maximum number of simultaneous local file shares
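These two limits could be enforced by the control logic with a pair of counters per site, as in the sketch below. The `SiteLimiter` class and its policy of preferring a local peer over the Internet link are assumptions, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteLimiter:
    """Hypothetical per-site limits on concurrent transfers."""
    max_downloads: int       # simultaneous downloads over the Internet link
    max_local_shares: int    # simultaneous device-to-device transfers
    active_downloads: int = 0
    active_shares: int = 0

    def acquire(self, peer_has_file: bool) -> Optional[str]:
        """Return "peer", "cloud", or None if the device should wait."""
        if peer_has_file and self.active_shares < self.max_local_shares:
            self.active_shares += 1
            return "peer"
        if self.active_downloads < self.max_downloads:
            self.active_downloads += 1
            return "cloud"
        return None

    def release(self, source: str) -> None:
        """Free the slot once a transfer finishes or fails."""
        if source == "peer":
            self.active_shares -= 1
        elif source == "cloud":
            self.active_downloads -= 1
```

A device that receives `None` simply asks again later, which keeps both the Internet link and the local network within the limits set for the location.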

This solution also includes other essential functions:

  • Easy and efficient rollback to the previous version
  • Power status and battery level monitoring
  • “Heartbeat” monitoring for devices
  • The ability to create and run scripts for individual devices or groups, both on-demand and scheduled
  • System and device log management
  • Encryption to secure the cloud, connectivity, and devices
  • A robust user interface including statuses, progress reports, alerts, etc.
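Of these functions, heartbeat monitoring is the easiest to illustrate. Assuming the cloud records the time of each device's most recent check-in, flagging silent devices is a single comparison; the `stale_devices` helper and its timestamp format (seconds) are hypothetical.

```python
from typing import Dict, List


def stale_devices(last_seen: Dict[str, float], now: float,
                  timeout: float) -> List[str]:
    """Return the IDs of devices whose last heartbeat is older than `timeout`.

    `last_seen` maps a device ID to the timestamp (in seconds) of its
    most recent heartbeat.
    """
    return sorted(d for d, t in last_seen.items() if now - t > timeout)
```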

A close look at these requirements confirms the need for a comprehensive solution. Ignoring key aspects can lead to poor performance and an implementation that is difficult, if not impossible, to manage.
