How to create a subrange of a BTreeSet<(String, String, String)>? How to turn a tuple of bounds into a bound of a tuple?

Issue

I am trying to use a BTreeSet<(String, String, String)> as a way to create a simple in-memory ‘triple store‘.

To be precise:

type Entity = String;
type Attribute = String;
type Value = String;
type EAV = (Entity, Attribute, Value);
type EAVSet = BTreeSet<EAV>;


pub fn example_db() -> EAVSet {
    let mut example: EAVSet = BTreeSet::new();
    insert_strtup(&mut example, ("1", "type", "user"));
    insert_strtup(&mut example, ("1", "user/name", "Arthur Dent"));
    insert_strtup(&mut example, ("1", "user/age", "33"));

    insert_strtup(&mut example, ("2", "type", "user"));
    insert_strtup(&mut example, ("2", "user/name", "Ford Prefect"));
    insert_strtup(&mut example, ("2", "user/age", "42"));

    return example;
}


fn insert_strtup(db: &mut EAVSet, val: (&str, &str, &str)) -> () {
    db.insert((val.0.to_string(), val.1.to_string(), val.2.to_string()));
}

pub fn example()  {
    let db = example_db();

    // How to customize this?
    let range: (Bound<EAV>, Bound<EAV>) = (Bound::Unbounded, Bound::Unbounded);

    for elem in eavt.range(range) {
        println!("{:?}", elem);
    }
}

The problem I am facing, is that I want people to be able to iterate over a subrange of values in the set. However, a simple usage of std::ops::Bound is not possible because we store a tuples with multiple fields.

I’d like to be able to build range-queries for all of the following:

  • all entities;
  • all entities with an ID in range x..y;
  • all fields of entity 1;
  • the current value of entity 1‘s "user/age" field).

The only idea which came to mind so far, is to use a string key which we know for a fact compares lower resp. higher than what we’re looking for for the ‘placeholder’ fields. But this feels very hackish/error-prone and like reinventing the wheel.

Is there maybe a way to turn a (Bound<String>, Bound<String>, Bound<String>) into a Bound<(String, String, String)> maybe?
Or is there another approach to take here?

EDIT: Filtering/querying a multi-key btree index in Rust shows one solution by wrapping all values in an ordered Enum (Min, Exact(String), Max), but this solution requires altering what kind of values are stored inside the BTreeSet. It also feels like adding a memory overhead as we’re in actuality never storing anything other than Exact(some_string) inside. Is there another approach that does not require altering the type of values stored in the BTreeSet?

Solution

Since Borrow always returns a reference (grrrrrrrr), and Borrowed isn’t necessarily Copy, you might be able to rely on a sentinel memory address?

Note that since associated static items aren’t allowed, you probably need a copy of this code for each type you want to use.

use std::borrow::Borrow;
use std::cmp::Ordering;

#[repr(transparent)]
pub struct StringWithMinMaxSentinel(String);

// must be static, not const, to ensure a constant memory address
pub static STRING_MIN_SENTINEL: StringWithMinMaxSentinel = StringWithMinMaxSentinel(String::new());
pub static STRING_MAX_SENTINEL: StringWithMinMaxSentinel = StringWithMinMaxSentinel(String::new());

impl Borrow<StringWithMinMaxSentinel> for String {
    fn borrow(self: &String) -> &StringWithMinMaxSentinel {
        unsafe { &*(self as *const String as *const StringWithMinMaxSentinel) }
    }
}

impl PartialEq for StringWithMinMaxSentinel {
    fn eq(&self, other: &Self) -> bool {
        std::ptr::eq(self, other) || (!std::ptr::eq(self, &STRING_MIN_SENTINEL) && !std::ptr::eq(other, &STRING_MAX_SENTINEL) && !std::ptr::eq(other, &STRING_MIN_SENTINEL) && !std::ptr::eq(self, &STRING_MAX_SENTINEL) && self.0.eq(&other.0))
    }
}

impl Eq for StringWithMinMaxSentinel {}

impl PartialOrd for StringWithMinMaxSentinel {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for StringWithMinMaxSentinel {
    fn cmp(&self, other: &Self) -> Ordering {
        if std::ptr::eq(self, other) {
            Ordering::Equal
        } else if std::ptr::eq(self, &STRING_MIN_SENTINEL) || std::ptr::eq(other, &STRING_MAX_SENTINEL) {
            Ordering::Less
        } else if std::ptr::eq(self, &STRING_MAX_SENTINEL) || std::ptr::eq(other, &STRING_MIN_SENTINEL) {
            Ordering::Greater
        } else {
            self.0.cmp(&other.0)
        }
    }
}

Answered By – Solomon Ucko

Answer Checked By – Terry (AngularFixing Volunteer)

Leave a Reply

Your email address will not be published.