How do I store data on an IoT device with only occasional access to the internet?

Question

We are in the planning phase for a telemetry IoT device with only occasional access to the internet.

I found a lot of information online on how to store IoT data in the cloud, which databases to use, how to calculate space requirements, etc. What I'm missing is:

How do I store the data locally on the client before sending it to the cloud?

The newest data is the most interesting for us, but for a subset of metrics we want to keep all data points on the device until they have been transferred
Minimizing device storage is not a top priority
Battery life is not crucial; our device will be connected to power as long as it is collecting data

We need to know a lot more about your device to be able to help with this, e.g. is it a microcontroller or a system running something like Linux? How much storage does the device have (MB/GB of flash/RAM)? Does it need to survive power outages? — hardillb, Feb 04 '19 at 16:05
Also, what is the data collection rate and how fast is the internet uplink when it's present? — hardillb, Feb 04 '19 at 16:06
It is a Linux-based system. There is no hardware specification yet, so I cannot give details about the size of flash/RAM, but we are talking about GBs, not MBs. Yes, it needs to survive a power outage. It will be mounted in a race car. The internet uplink will be cellular or Wi-Fi, so depending on where the race is, anything from 4G to no connection at all. The data collection rate is not fixed yet, but we are talking about 30 sensors and about 100 data points; the highest rate is one reading every ~30 seconds, other metrics every 5 minutes — seedie, Feb 04 '19 at 16:09
I would cache it on microSD cards before uploading in batches. — dandavis, Feb 05 '19 at 22:20
@jsotola to be honest, SQLite and flat files were the two things that came to mind. I will try to find out how best to store our data in SQLite and how to calculate the space usage. Mawg, maybe I wasn't clear enough in my question, but it helps me to know that SQLite is used in IoT devices. I guess if the sensor data is written to the SQLite DB, I can read from it to push data to the cloud and delete what isn't needed anymore — seedie, Feb 06 '19 at 09:38
If I added another sensor, I'd have to change the SQLite DB schema, is this correct? Sorry, I know nothing about setting up a DB structure — seedie, Feb 06 '19 at 09:48
I know spreadsheets, so you mean I could just add another column? Or another table if needed? — seedie, Feb 07 '19 at 11:20
This question is missing any constraints. It just says 'I want to do a thing'. Anything relevant from the comments could be elevated into the question; that might make it more useful, since this is an interesting topic. — Sean Houlihane, Feb 14 '19 at 14:24
@SeanHoulihane you are right; while reading through the comments and answer I realized how bad my question was. I wanted to be open to new input, and that's why I was as unspecific as possible, which I realized was bad for the question and not appropriate for this forum. There are multiple parts to the question: where to store the data physically, what kind of data storage to use (relational DB, JSON in a DB, flat file, etc.), and the data model. I haven't updated it because I wasn't sure whether it would be better to split it up into multiple questions — seedie, Feb 14 '19 at 17:04
Generally multiple questions is better. Is it possible to narrow this question to keep the answer relevant, then ask the other aspects? Particularly on a beta site, it helps to have more questions so there is no harm adding questions where you think you already have a good answer. — Sean Houlihane, Feb 15 '19 at 12:33

score 1 · Accepted Answer · answered Feb 05 '19 at 14:09

Perhaps you could use the Round Robin Database Tool? It certainly operates along the lines that you want.

RRDtool (round-robin database tool) aims to handle time series data such as network bandwidth, temperatures or CPU load. The data is stored in a circular buffer based database, thus the system storage footprint remains constant over time.

It also includes tools to extract round-robin data in a graphical format, for which it was originally intended. Bindings exist for several programming languages, e.g. Perl, Python, Ruby, Tcl, PHP and Lua. There is an independent full Java implementation called rrd4j.

If you can’t find a port or build it yourself, then the same principles are demonstrated with MySQL at Round Robin Database with MySQL.

It’s what we do on every project I have worked on as an embedded developer for the last few decades (although we roll our own):
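
To illustrate the principle, here is a minimal sketch of a round-robin table. It uses SQLite (via Python's standard `sqlite3` module) rather than MySQL, and it is not how RRDtool itself is implemented. The table name `ring`, the capacity `SLOTS` and the helper functions are assumptions made for this example only. Each new reading is written to slot `seq % SLOTS`, so once the table is full the oldest row gets overwritten and the row count stays constant.

```python
import sqlite3

SLOTS = 5  # hypothetical capacity; a real device would size this at design time


def open_ring(path=":memory:"):
    """Open (or create) a database holding one fixed-size round-robin table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ring ("
        " slot INTEGER PRIMARY KEY,"   # position in the ring, 0..SLOTS-1
        " seq INTEGER NOT NULL,"       # ever-increasing sequence number
        " ts REAL NOT NULL,"
        " value REAL NOT NULL)"
    )
    return conn


def append(conn, ts, value):
    """Store a reading, overwriting the oldest one once the ring is full."""
    seq = conn.execute("SELECT COALESCE(MAX(seq), -1) + 1 FROM ring").fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO ring (slot, seq, ts, value) VALUES (?, ?, ?, ?)",
        (seq % SLOTS, seq, ts, value),
    )
    conn.commit()
    return seq


def read_all(conn):
    """Return the retained readings, oldest first."""
    return conn.execute("SELECT seq, ts, value FROM ring ORDER BY seq").fetchall()


conn = open_ring()
for i in range(8):
    append(conn, ts=float(i), value=i * 1.5)
rows = read_all(conn)  # only the newest SLOTS readings remain
```

Passing a file path instead of `":memory:"` would keep the ring on flash so it survives a power outage.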

Some data are important/vital, and we do whatever we can to retain them all (not always possible, if there is a long communications break)
Some we need the last X entries for (and can dimension that to a fixed size at design time)
Some we need the last X minutes’ worth of data (“round robin”, also size fixed at design time)
Some are “nice to have”, but we can ditch them if the space is needed (not so common, though)
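
Two of these categories, "last X entries" and "last X minutes", can be sketched in a few lines of Python. This is only an illustration of the idea, not the code we actually ship; the class names `LastN` and `LastWindow` are made up for the example. Note that the time-window variant only has a fixed size if the sampling rate is fixed, in which case it is equivalent to `LastN` with `n = window / sampling period`.

```python
from collections import deque


class LastN:
    """Keep only the newest n readings (fixed size, oldest dropped first)."""

    def __init__(self, n):
        self.items = deque(maxlen=n)

    def add(self, ts, value):
        self.items.append((ts, value))


class LastWindow:
    """Keep only readings no older than `seconds` relative to the newest one."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.items = deque()

    def add(self, ts, value):
        self.items.append((ts, value))
        # Drop everything that has fallen out of the time window.
        while self.items and self.items[0][0] < ts - self.seconds:
            self.items.popleft()


last3 = LastN(3)
window = LastWindow(60)
for ts in (0, 30, 60, 90, 120):
    last3.add(ts, ts * 0.1)
    window.add(ts, ts * 0.1)

kept_last3 = [ts for ts, _ in last3.items]
kept_window = [ts for ts, _ in window.items]
```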

Plus, of course, there is the question of which to store in NVM. This is all really a question for your project’s software architect - unless that’s you, in which case it’s a question for us :-)

There are other things that can help, such as compressing the data, or Run Length Encoding (e.g. don't save the values for 1,000 consecutive readings with the same value; just save the value once, with a repeat counter), etc.
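
A minimal sketch of run-length encoding, assuming the readings are exactly repeated values (noisy floating-point sensor data would first need rounding or quantizing, which is not shown here). The function names are chosen for this example only.

```python
def rle_encode(values):
    """Collapse consecutive equal values into (value, repeat_count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # same value again: bump the counter
        else:
            runs.append([v, 1])       # new value: start a new run
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Expand (value, repeat_count) pairs back into the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


readings = [21.5] * 1000 + [21.6, 21.6, 21.7]
encoded = rle_encode(readings)   # three runs instead of 1,003 values
decoded = rle_decode(encoded)
```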

Do you mean using RRDtool as a second database for the data that has to stay on the device, or as the primary database where all metrics will be moved to? If some of the data has already been transferred to the cloud, how do I know what has been transferred already? I thought RRDtool was not meant to have data deleted from it, but I only know it as the backend of MRTG. Oh, and I'm not the software architect :) — seedie, Feb 06 '19 at 09:19
We generally have round-robin data on the device (in NVM), but not a database (YMMV). Some gets removed as it is transmitted and receipt is acknowledged. Some stays there for service technicians. I advise keeping all entries of the same type to a fixed size for simple implementation. — Mawg says reinstate Monica, Feb 06 '19 at 10:22

score 1 · Answer 2 · answered Feb 19 '19 at 13:04

Since, as per your spec, this would be a Linux-based system, you could store the data locally in an SQLite database.

Further, check out SQLITE-SYNC. With this framework, your application can supposedly work completely offline and then perform an automated bidirectional synchronization when an internet connection becomes available.

So your app does not need to maintain its own routines to sync forward and back.

Ref: https://ampliapps.com/sqlite-sync/
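
If you would rather roll the store-and-forward part yourself, the following is a rough sketch using only Python's standard `sqlite3` module. It does not use the SQLite-Sync API; the table layout, the function names and the `upload` stub are all assumptions made for illustration. Readings go into one narrow table (`sensor`, `ts`, `value`), so adding a sensor needs no schema change, and rows are deleted only after the upload has been acknowledged.

```python
import sqlite3


def open_buffer(path=":memory:"):
    """Open (or create) the local buffer; use a file path on a real device."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " sensor TEXT NOT NULL,"
        " ts REAL NOT NULL,"
        " value REAL NOT NULL)"
    )
    return conn


def record(conn, sensor, ts, value):
    """Persist one reading; `with conn` commits the transaction."""
    with conn:
        conn.execute(
            "INSERT INTO readings (sensor, ts, value) VALUES (?, ?, ?)",
            (sensor, ts, value),
        )


def next_batch(conn, limit=100):
    """Fetch the oldest unsent readings."""
    return conn.execute(
        "SELECT id, sensor, ts, value FROM readings ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()


def acknowledge(conn, last_id):
    """Delete everything up to and including the last acknowledged row."""
    with conn:
        conn.execute("DELETE FROM readings WHERE id <= ?", (last_id,))


def upload(batch):
    """Hypothetical stand-in for the real network call; True means acknowledged."""
    return True


conn = open_buffer()
record(conn, "oil_temp", 0.0, 92.5)
record(conn, "oil_temp", 30.0, 93.1)
record(conn, "tyre_pressure_fl", 0.0, 1.9)

batch = next_batch(conn, limit=2)
if batch and upload(batch):
    acknowledge(conn, batch[-1][0])
remaining = next_batch(conn)  # only the reading that has not been sent yet
```

If `upload` fails or the link drops mid-transfer, nothing is deleted and the same batch is simply retried later, so the cloud side should tolerate receiving a reading twice.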
