Strategic considerations on the implementation of telemetry
How to use telemetry to configure multiple devices at once?
Konstantin Hristozov
05.06.2024

The purpose of this article is to explain strategies for configuring multiple devices using telemetry and to point out the challenges and important considerations for their implementation.

Device configuration can be categorized into two types: local and remote. Local configuration involves using a direct interface on the device, such as USB, Bluetooth, LoRaWAN, or local TCP/UDP-based protocols. Remote configuration involves accessing the device over the Internet. This article will focus on strategies and challenges for implementing a telemetry service for remote configuration.

Device Reachability: Permanent or Sporadic?

The first consideration for remote configuration is whether the device needs to be reachable most of the time or only sporadically. This decision impacts the infrastructure cost and the device's memory requirements. Devices that need to be reachable most or all of the time require an infrastructure capable of maintaining permanent connections to multiple devices simultaneously. This infrastructure needs substantial server memory, as each connection requires a dedicated memory block. If TLS (Transport Layer Security) is used, memory requirements increase further. An example of such a device is a public charging station for electric vehicles, which needs to authenticate users remotely and provide charging updates to their smartphones.

To reduce infrastructure costs, consider using non-permanent connections for remote configuration. A hybrid model involves using a local node, which acts as a gateway with a permanent connection to the remote service. This gateway then provides configuration to multiple devices directly connected to it. While this model can reduce costs, it introduces complexity in maintenance and development.

Securing Configuration Data: Sensitive or Non-sensitive?

Ideally, all data should be protected, regardless of its sensitivity. However, small microcontrollers might lack the bandwidth to establish or the resources to maintain TLS: a single TLS record can reach up to 16 kB, which can easily amount to 10% of the available memory. In such cases, the configuration data should include a signature so the device can verify its authenticity. If a full signature scheme is too costly, at least a MAC (message authentication code) must be included. If the configuration contains sensitive information, it must also be encrypted.
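
As a minimal sketch of the MAC option, assume the device and the configuration service share a secret provisioned at production time and that the last 32 bytes of the payload carry an HMAC-SHA256 over the configuration data (both are illustrative assumptions):

```python
import hmac
import hashlib

# Hypothetical layout: the last 32 bytes of the payload are an HMAC-SHA256
# over the configuration data, keyed with a secret provisioned at production.
SHARED_SECRET = b"provisioned-at-production"  # placeholder, not a real key

def verify_config(payload: bytes) -> bytes:
    config, received_mac = payload[:-32], payload[-32:]
    expected_mac = hmac.new(SHARED_SECRET, config, hashlib.sha256).digest()
    # compare_digest avoids leaking timing information during the comparison
    if not hmac.compare_digest(received_mac, expected_mac):
        raise ValueError("configuration rejected: MAC mismatch")
    return config
```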

For any SoC-based device, TLS (or DTLS for datagram transports) is the state-of-the-art method for securely transporting configuration data. Before implementing the secure transport of the data, consider the following:

  • Server Trustworthiness: Ensure you trust the hosting provider of your service. Devices should establish TLS connections only to trusted servers. However, relying solely on server trust is insufficient. To prevent attacks in which devices connect to fake servers, your configuration client should only trust server certificates issued by a PKI (Public Key Infrastructure) that you control. Avoid redirections. For the most paranoid among us, add a signature to your configuration data to further enhance its authenticity. A minimal client-side sketch follows this list.
  • Updating Certificates: Every server must renew its certificate on a regular basis. If the server hosting your update service fails to renew its certificate, it poses a security risk that needs immediate attention. Devices receiving regular updates should also update their set of trusted CA certificates used to verify server authenticity. Because every certificate has a limited validity period, devices that no longer receive updates will stop trusting your server once the CA certificates expire. Ensure your devices can receive updates in the field to maintain trust.
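
The sketch below illustrates the client side of these points, assuming a hypothetical HTTPS configuration endpoint and a locally stored CA bundle of your own PKI (both placeholders); redirections are rejected rather than followed:

```python
import requests

# Hypothetical endpoint and CA bundle; both are assumptions for illustration.
CONFIG_URL = "https://config.example.com/api/v1/devices/dev-0001/config"
OWN_CA_BUNDLE = "/etc/ssl/own-pki-ca.pem"  # CA certificates of the PKI you control

def fetch_config():
    response = requests.get(
        CONFIG_URL,
        verify=OWN_CA_BUNDLE,    # only trust certificates issued by your own PKI
        allow_redirects=False,   # never follow redirections to other hosts
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```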

By addressing these considerations, you can effectively use telemetry to configure multiple devices remotely, ensuring both secure and efficient operations.

Transport protocols for remote configuration

MQTT is a good choice for configuring a small number of devices. For configurations involving fewer than 100 devices, an MQTT-based strategy, or the more generic AMQP (Advanced Message Queuing Protocol), works well. An MQTT configuration service typically uses an MQTT broker with a list of configuration topics that each device subscribes to. However, as the number of devices increases, MQTT becomes less effective: managing a large fleet through MQTT gets complicated once you have to deal with multiple software versions or configure only a subset of devices. Additionally, hosting expenses for the MQTT broker grow steeply with the number of devices.
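
As an illustration, a device-side subscriber might look like the following sketch; it assumes a paho-mqtt 1.x style client, a TLS-enabled broker, and a per-device configuration topic, all placeholders rather than a prescribed layout:

```python
import json
import paho.mqtt.client as mqtt

BROKER = "broker.example.com"              # placeholder broker
CONFIG_TOPIC = "devices/dev-0001/config"   # hypothetical per-device topic

def on_connect(client, userdata, flags, rc):
    # Subscribe with QoS 1 so a configuration published while the device
    # was offline is delivered on reconnect (requires a persistent session).
    client.subscribe(CONFIG_TOPIC, qos=1)

def on_message(client, userdata, msg):
    config = json.loads(msg.payload)
    print("received configuration:", config)  # apply and persist it here

client = mqtt.Client(client_id="dev-0001", clean_session=False)
client.on_connect = on_connect
client.on_message = on_message
client.tls_set(ca_certs="/etc/ssl/own-pki-ca.pem")  # trust only your own PKI
client.connect(BROKER, 8883)
client.loop_forever()
```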

WebSocket is an excellent choice for bi-directional communication and telemetry, but it comes at a price. WebSocket is still one of the best performing technologies for transferring arbitrary data with low overhead and optional compression. Many telemetry protocols, such as OCPP (Open Charge Point Protocol) and WAMP, use WebSocket because of its ease of integration with web servers and simple development. Even MQTT can run over WebSocket, and there are implementations of Modbus over WebSocket. The downside is that server infrastructure must scale proportionally to the number of devices. Each persistent connection requires its own set of server resources, so adding more devices increases infrastructure costs linearly. For example, if each client connection requires 500 kB of RAM on the server, handling 1 million devices requires somewhere around 476 GB of RAM for the connections alone. Scaling your product fleet therefore requires scaling your infrastructure at the same pace.
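
On the device side, a sketch using the third-party Python websockets package against a hypothetical configuration endpoint (URL, message format, and CA path are assumptions) could look like this:

```python
import asyncio
import json
import ssl
import websockets  # third-party "websockets" package

async def receive_config():
    # Pin the connection to your own PKI instead of the system trust store.
    ssl_ctx = ssl.create_default_context(cafile="/etc/ssl/own-pki-ca.pem")
    async with websockets.connect("wss://config.example.com/ws", ssl=ssl_ctx) as ws:
        # Announce the device, then wait for configuration pushes.
        await ws.send(json.dumps({"type": "hello", "device_id": "dev-0001"}))
        async for message in ws:
            config = json.loads(message)
            print("received configuration:", config)  # apply and persist it here

asyncio.run(receive_config())
```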

HTTP-based telemetry is probably the cheapest choice for SoC-based devices. HTTP-based RESTful APIs are simpler to implement than WebSocket but less efficient, because a connection must be established more often. HTTP's stateless nature and compatibility with existing web infrastructure make it easy to integrate telemetry systems with web services and databases. HTTP traffic is generally allowed through most network security measures, providing broad compatibility. However, HTTP/1.1 is not optimized for low-latency or high-frequency data transmission, which can be a drawback for real-time telemetry applications. HTTP/2 improves efficiency with multiplexed streams and header compression, and HTTP/3, based on UDP, offers even better performance. HTTP-based protocols generally require fewer infrastructure resources than WebSocket or MQTT because devices communicate with the server only infrequently.
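
A common pattern is to poll a configuration resource at low frequency and let the server answer 304 Not Modified when nothing has changed; the endpoint, polling interval, and header usage below are illustrative assumptions:

```python
import time
import requests

CONFIG_URL = "https://config.example.com/api/v1/devices/dev-0001/config"  # placeholder
OWN_CA_BUNDLE = "/etc/ssl/own-pki-ca.pem"

def apply_config(config):
    print("new configuration:", config)  # apply and persist it here

def poll_config(interval_s=900):
    etag = None
    while True:
        # Conditional request: the server only sends a body when the
        # configuration has changed since the last delivered ETag.
        headers = {"If-None-Match": etag} if etag else {}
        response = requests.get(CONFIG_URL, headers=headers,
                                verify=OWN_CA_BUNDLE, timeout=30)
        if response.status_code == 200:
            etag = response.headers.get("ETag")
            apply_config(response.json())
        # 304 Not Modified: configuration unchanged, nothing to do
        time.sleep(interval_s)
```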

CoAP (Constrained Application Protocol) is ideal for embedded devices with limited resources. CoAP uses minimal overhead, making it suitable for sending sensor data to remote servers. CoAP clients, typically sensors, send telemetry data to a CoAP server using requests like GET, PUT, POST, or DELETE. CoAP's use of UDP reduces communication overhead compared to TCP-based protocols like HTTP/1.1 and HTTP/2. Its simplicity and RESTful architecture make it easy to implement and integrate with web technologies. However, UDP's lack of guaranteed delivery, ordering, or duplicate protection introduces potential reliability issues. CoAP mitigates this with optional features like message retransmission and acknowledgments. Despite its lightweight nature, CoAP can be secured with DTLS, providing security comparable to HTTPS. CoAP's efficiency and adaptability make it a popular choice for low power consumption and minimal data transmission in IoT systems. For the configuration use case, the infrastructure cost of CoAP is as low as that of HTTP, but the initial development of the server infrastructure can be somewhat more expensive. When a newer product generation is no longer constrained by limited resources and communicates with the server only infrequently, you can eventually transition to HTTP.
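
With the third-party aiocoap library, fetching a configuration resource could look like the sketch below; the URI and resource path are assumptions, and DTLS is omitted for brevity:

```python
import asyncio
import json
from aiocoap import Context, Message, GET

async def fetch_config():
    # Create a CoAP client context (UDP transport by default).
    context = await Context.create_client_context()
    request = Message(code=GET,
                      uri="coap://config.example.com/devices/dev-0001/config")
    response = await context.request(request).response
    return json.loads(response.payload)

config = asyncio.run(fetch_config())
print("configuration:", config)
```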

Understand your configuration format

Transferring configuration data from a telemetry service to a device is one thing, but knowing how to read and store the configuration is another. When developing a product, you typically have a specific configuration format in mind, such as a JSON file or a key-value database like LMDB. If your product has a user interface, you will likely have many configuration parameters - some user-changeable, some remotely changeable, and others for external configurations. Parameters for your application can be stored in your file or database, but system configurations need special handling.

For instance, if you want to remotely provision a new set of authorized SSH public keys, they usually need to be deployed to the “.ssh” directory of a user. If your file system is writable, you can simply copy them. However, on a hardened embedded device with a read-only file system, overwriting a file is not possible. A cheap method is to use a symbolic link to a writable partition, but a better method is to use an overlay file system.
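
As a sketch, assuming the “.ssh” directory is backed by a writable overlay (or symlinked into a writable partition), the provisioned keys can be written atomically so that an interrupted update never leaves a half-written file behind:

```python
import os
import tempfile

# Hypothetical target path, assumed to be writable via an overlay or symlink.
AUTHORIZED_KEYS = "/home/operator/.ssh/authorized_keys"

def deploy_ssh_keys(public_keys):
    """Atomically replace authorized_keys with the provisioned key set."""
    directory = os.path.dirname(AUTHORIZED_KEYS)
    os.makedirs(directory, mode=0o700, exist_ok=True)
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a power loss cannot leave a truncated key file.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write("\n".join(public_keys) + "\n")
        os.chmod(tmp_path, 0o600)
        os.replace(tmp_path, AUTHORIZED_KEYS)
    except Exception:
        os.unlink(tmp_path)
        raise
```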

Another challenge is maintaining consistent configurations across different software versions. Over time, your product may have devices with various software versions in the field. New software versions may rely on different configuration parameters. A well-designed service should ensure that older devices can store new configuration parameters that will be recognized by newer software versions after updates. Deprecating parameters in new versions also requires configuration versioning to avoid issues when downgrading software.

A best practice to overcome this challenge is to use a well-designed database for storing configurations. A mature database management system offers stability: it keeps configuration parameters consistent when they are changed through multiple sources, such as a local GUI, the telemetry service, or sensor-based automation. Configuration parameters can be provisioned or updated by one application while being used by multiple others, making databases a reliable choice for storing them.
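
A minimal sketch of such a configuration store, here using SQLite with an illustrative schema (table and column names are assumptions, not a prescribed layout), might look like this:

```python
import sqlite3

def open_config_db(path="/var/lib/config/config.db"):
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS config (
               key     TEXT PRIMARY KEY,
               value   TEXT NOT NULL,
               source  TEXT NOT NULL,   -- e.g. 'gui', 'telemetry', 'automation'
               version INTEGER NOT NULL -- configuration schema version
           )"""
    )
    return db

def set_parameter(db, key, value, source, version):
    # The transaction keeps writers from multiple applications consistent.
    with db:
        db.execute(
            "INSERT OR REPLACE INTO config(key, value, source, version) "
            "VALUES (?, ?, ?, ?)",
            (key, value, source, version),
        )

def get_parameter(db, key, default=None):
    row = db.execute("SELECT value FROM config WHERE key = ?", (key,)).fetchone()
    return row[0] if row else default
```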

Handling Factory Reset

Factory reset is often neglected when configuring devices via telemetry. Consider a scenario where you have a product with software version 1, which uses three configuration parameters: A, B, and C. You produce and deliver multiple units of this product. Later, you release software version 2 that requires six configuration parameters: A, B, C, X, Y, and Z. If you provide all six parameters through a remote configuration service to devices running software version 1, will they store and use them after upgrading to version 2? Additionally, how do you manage software downgrades? Should you retain or remove the X, Y, and Z parameters after downgrading from version 2 to version 1 and performing a factory reset?

These issues become more complex when distinguishing between a reset to the working initial configuration and a reset intended for transferring product ownership. A well-designed system must address these scenarios.
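
To make the distinction concrete, a reset routine could look like the following sketch; the reset types, parameter names, and defaults are purely illustrative:

```python
# Purely illustrative factory-reset sketch; parameter names, defaults and the
# distinction between reset types are assumptions, not a prescribed design.
FACTORY_DEFAULTS = {"A": "1", "B": "on", "C": "eu"}   # shipped with software v1
KNOWN_PARAMETERS = {"A", "B", "C", "X", "Y", "Z"}     # superset across versions

def factory_reset(config: dict, ownership_transfer: bool) -> dict:
    if ownership_transfer:
        # Transfer of ownership: wipe everything, including parameters that
        # only newer software versions understand (X, Y, Z).
        return dict(FACTORY_DEFAULTS)
    # Reset to a working initial configuration: restore the defaults, but keep
    # parameters meant for newer software so an upgrade still finds them.
    kept = {k: v for k, v in config.items()
            if k in KNOWN_PARAMETERS and k not in FACTORY_DEFAULTS}
    return {**FACTORY_DEFAULTS, **kept}
```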

Remote Configuration of Boot Parameters

So far, we've discussed the remote configuration of application and system parameters. The boot loader, however, is a special case. Exchanging configuration parameters with the boot loader requires a reliable mechanism to transport the configuration to a space accessible by the boot loader. This is typically done with a state framework, which allows applications to modify the boot loader environment and direct it to the correct resources for the next boot.

A common use case for remote configuration of boot parameters is provisioning a different splash image displayed during the early boot stage. This requires ensuring that the new image is correctly transferred and stored in a location where the boot loader can access it.
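
As an illustration, if the boot loader environment is accessible from Linux through U-Boot's fw_setenv/fw_printenv tools (an assumption; a dedicated state framework would come with its own tooling), the configuration agent could update a boot parameter like this:

```python
import subprocess

# Minimal sketch; the variable name "splashimage_file" and the image path
# are illustrative placeholders.
def set_boot_parameter(name, value):
    # fw_setenv writes the variable into the boot loader environment partition.
    subprocess.run(["fw_setenv", name, value], check=True)

def get_boot_parameter(name):
    result = subprocess.run(
        ["fw_printenv", "-n", name], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()

# Point the boot loader at a newly provisioned splash image.
set_boot_parameter("splashimage_file", "/data/splash/new-splash.bmp")
```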

Key Considerations for Effective Remote Configuration

  1. Scalability: Ensure your system can handle an increasing number of devices without exponential cost increases. Consider using a hybrid model or optimizing protocols for your specific use case.
  2. Security: Use secure transport mechanisms like TLS/DTLS and regularly update CA certificates of your devices to maintain trust. Encrypt sensitive configuration data and use signatures or at least HMAC to verify authenticity.
  3. Consistency: Maintain consistent configuration formats across different software versions. Use a database management system to keep configurations stable and consistent, regardless of the update source.
  4. Flexibility: Design your system for both permanent and non-permanent device connections to balance cost and resource requirements.
  5. Resilience: Implement mechanisms to handle factory resets and software downgrades gracefully. Ensure that configuration data is managed appropriately during these processes.

By addressing these considerations, you can create a robust and efficient telemetry system for configuring multiple devices remotely.

How Tempo2Market Handles Device Configuration

The heart of the Tempo2Market software suite is the ttmdaemon agent application. This application manages telemetry connectivity and provides a RESTful API and a D-Bus interface to local applications, making configuration transparent and offering other applications running on the same system a robust set of features for telemetry communication.

The infrastructure component of Tempo2Market is the IoT platform fleetwarden.de, which offers a variety of telemetry services. Among the most prominent services are remote device configuration and over-the-air updates.

Tempo2Market addresses scalability and flexibility challenges with a hybrid approach. It uses an HTTP-based REST API for low-frequency data transfer and WebSocket-based communication for specific use cases on individual devices. This hybrid model helps keep infrastructure costs low while providing a cost-efficient entry-level telemetry option for clients. The ttmdaemon leverages HTTP/2 features to balance the costs of establishing connections with the timely transfer of configuration changes.

A guiding principle of Tempo2Market is “security first”. All data, whether sensitive or non-sensitive, is transferred through a secure connection. The platform employs mutual TLS and its own PKI (Public Key Infrastructure) to ensure secure communications. Credentials are stored securely to protect against unauthorized access.

Consistency and resilience are maintained through proper partitioning and the use of an overlay file system, paired with a well-designed state framework. This ensures that configuration parameters remain consistent and resilient across various updates and system changes, even in complex scenarios involving software downgrades.