I'm looking for a database for the following data storage and query pattern:

Data consists of dictionaries: mappings from string to string.

Queries are also dictionaries. The result of a query is a list of all matching dictionaries, where matching is the "obvious" relationship (dictionary d matches query q if every key k in q is also in d and d[k] == q[k]).

For clarity, here is an example. Let's say you have the following data stored:

data = [{'a': '1'},
        {'a': '1', 'b': '1'},
        {'c': '2', 'b': '1'}]

Then

query({'a': '1', 'b': '1'}) == [{'a': '1', 'b': '1'}]

query({'b': '1'}) == [{'a': '1', 'b': '1'},
                      {'c': '2', 'b': '1'}]
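
To pin down the semantics, here is a minimal reference sketch of the matching rule as a plain linear scan. It is only meant to define the expected results, not to be fast; the helper name `matches` and the `records` parameter are made up for illustration.

```python
# Sample data from the example above.
data = [
    {'a': '1'},
    {'a': '1', 'b': '1'},
    {'c': '2', 'b': '1'},
]


def matches(d, q):
    """Return True if every key in q is present in d with the same value."""
    return all(k in d and d[k] == v for k, v in q.items())


def query(q, records=data):
    """Linear scan: O(len(records) * len(q)). Keeps the storage order."""
    return [d for d in records if matches(d, q)]
```

With this definition an empty query matches every stored dictionary, which seems like the natural edge case.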

While I think that this can be implemented using many databases, I'm looking for something that would be optimized for this kind of pattern. That is, it would easily handle thousands of different keys (across all the dictionaries), and tens of millions of entries.

psarka
  • Is the query always the leading prefix? Sounds like a job for a hierarchical database or a KV store that allows index range queries. What programming language are you looking for? – eckes Jun 02 '19 at 09:39
  • @eckes No, query can have arbitrary keys. Python would be best, but it is not too important. – psarka Jun 02 '19 at 09:46
  • Not sure if there is a library that allows that; you would have to store the terms separately, query all of them, and intersect the results. It's a typical operation for full-text engines like Lucene. – eckes Jun 15 '19 at 15:40
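
Storing the terms separately and combining the per-term results amounts to an inverted index. The following is a rough in-memory sketch of that idea, not a recommendation of a specific database: each (key, value) pair maps to the set of record ids containing it, and a query intersects those sets, since a record has to contain every pair in the query. The class name `DictIndex` and its methods are hypothetical, and whether this fits tens of millions of entries in memory is an assumption that would need measuring.

```python
from collections import defaultdict


class DictIndex:
    """Toy inverted index over string-to-string dictionaries."""

    def __init__(self):
        self._records = []                  # record id -> dictionary
        self._postings = defaultdict(set)   # (key, value) -> set of record ids

    def add(self, record):
        rid = len(self._records)
        self._records.append(record)
        for item in record.items():
            self._postings[item].add(rid)

    def query(self, q):
        if not q:
            # An empty query matches everything.
            return list(self._records)
        sets = []
        for item in q.items():
            ids = self._postings.get(item)  # .get avoids creating empty entries
            if not ids:
                return []                   # some pair occurs in no record
            sets.append(ids)
        # Intersect starting from the smallest posting set.
        sets.sort(key=len)
        hits = set.intersection(*sets)
        # Sort ids so results come back in insertion order.
        return [self._records[i] for i in sorted(hits)]


index = DictIndex()
for d in [{'a': '1'}, {'a': '1', 'b': '1'}, {'c': '2', 'b': '1'}]:
    index.add(d)
```

The cost of a query is then driven by the sizes of the posting sets involved rather than by the total number of stored dictionaries.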
