Exact function of storage.useTombstones config setting

For ODB3, I want to confirm my understanding of storage.useTombstones. The default value is ‘false’. Does this apply to the space in the cluster that is actually occupied by the deleted record’s data or to the tombstone that is used to mark the deleted record’s RID as “unusable”? In other words, it’s my understanding that a small tombstone is always used to prevent reuse of RIDs and that this config setting applies only to the deleted record’s data. Is this correct?

Is there any way to eliminate the use of “RID tombstones”?
How many bytes is a “RID tombstone”?

@luigidellaquila - Can you shed some light on this? I have certain tables that hold very temporary information (for seconds or minutes), so records are constantly being created and deleted. Off-line compaction isn’t an option, so I’m trying to determine if deleted records are being tombstoned, causing the disk space for these classes to slowly grow forever.

I think @laa can help here

Thanks

Luigi

@laa - Do you have any details on this?

Hello, we use tombstones to prevent RIDs from being repeated when we create new records. Right now there is no way to eliminate the use of tombstones. We are discussing some ways to implement a tombstone-less model of record generation, but that is not a priority for now. The size of a single tombstone is 13 bytes. So right now, once you remove a record, you are left with a tombstone that consumes 13 bytes on your disk.

@laa - Thanks for the quick reply. That is my current understanding, but two questions:

  1. Why is it necessary to not reuse RIDs? In other words, could a simple way to avoid tombstones be to allow a class-level option to reuse RIDs?
  2. If tombstones are always used, what is the purpose of the setting storage.useTombstones?

Currently, for our “high flux” classes (where we may see millions of records deleted per day), we don’t delete records. Instead, we maintain an index that lets us reuse them. It works, and it lets us avoid using tombstones, but it does add a fair amount of extra complexity. That’s why I’m interested in this topic.
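For readers unfamiliar with the pattern, the reuse approach described above could be sketched roughly as follows. This is a minimal in-memory illustration in Python, not OrientDB code: `RecordPool`, `acquire`, and `release` are hypothetical names, list positions stand in for RIDs, and the `free` list stands in for the index of reusable records.

```python
class RecordPool:
    """Toy model of a 'release and reuse' scheme (hypothetical, not an OrientDB API)."""

    def __init__(self):
        self.records = []  # position in this list stands in for a RID
        self.free = []     # stands in for the index of reusable records

    def acquire(self, payload):
        """Store payload, reusing a released slot when one is available."""
        if self.free:
            slot = self.free.pop()
            self.records[slot] = payload
        else:
            slot = len(self.records)
            self.records.append(payload)
        return slot

    def release(self, slot):
        """Mark a slot reusable instead of deleting it, so nothing is tombstoned."""
        self.records[slot] = None
        self.free.append(slot)


pool = RecordPool()
a = pool.acquire({"session": "s1"})
b = pool.acquire({"session": "s2"})
pool.release(a)
c = pool.acquire({"session": "s3"})  # reuses the slot released above
```

The extra complexity mentioned above shows up even in this toy version: every reader has to treat a released slot as "not a record", and the free list has to stay consistent with the stored data.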

@eric24 we should not reuse RIDs because there are cases where users work with RIDs directly: they can hold a reference to a record that has already been deleted and then handle it incorrectly. That is the disadvantage of exposing internals to the users :) . This setting was introduced a long time ago because in the initial design we did reuse RIDs, but that caused trouble for users, so we gave up on the idea.
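The hazard described here can be illustrated with a small sketch. It is a toy model under stated assumptions, not OrientDB's storage engine: `ReusingStore`, `TombstoneStore`, and `stale_read` are hypothetical names, and integer list positions stand in for RIDs.

```python
class ReusingStore:
    """Toy store that hands a deleted record's id to the next insert (hypothetical)."""

    def __init__(self):
        self.slots = []
        self.free = []

    def insert(self, value):
        if self.free:
            rid = self.free.pop()
            self.slots[rid] = value
        else:
            rid = len(self.slots)
            self.slots.append(value)
        return rid

    def delete(self, rid):
        self.slots[rid] = None
        self.free.append(rid)  # the id becomes available again

    def read(self, rid):
        return self.slots[rid]


class TombstoneStore(ReusingStore):
    """Same toy store, but a deleted id is never handed out again."""

    def delete(self, rid):
        self.slots[rid] = None  # the empty slot acts as the tombstone


def stale_read(store):
    """A client saves an id, the record is deleted, and another record is inserted."""
    saved_rid = store.insert("alice's order")
    store.delete(saved_rid)
    store.insert("bob's order")
    return store.read(saved_rid)


reused = stale_read(ReusingStore())        # silently resolves to the wrong record
tombstoned = stale_read(TombstoneStore())  # resolves to nothing
```

With reuse, the saved id resolves to an unrelated record without any error; with a tombstone, the same lookup finds nothing, which a client can detect.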

@laa - Understood. Of course users should not use RIDs directly, but that’s another story.

In the meantime, tombstones remain an area of concern for us. With our daily average of 2 million deleted records, they would consume 26 MB of disk space every day, which can currently be recovered only through off-line compaction. That isn’t an option for us, hence our “record re-use” hack, where we don’t actually delete anything. Is there any plan for this on the roadmap? Is it possible to enable the storage.useTombstones property again? Like many database settings, the default of true would leave the behavior as-is, but it would still allow “advanced” users (who should know what they are doing) to set it to false.
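The growth estimate follows directly from the two figures quoted in this thread (13 bytes per tombstone, 2 million deletions per day). A quick sketch of the arithmetic, where the function name `tombstone_growth_bytes` is made up for illustration and the one-year projection simply assumes the daily rate stays constant:

```python
TOMBSTONE_BYTES = 13         # per-tombstone size quoted in this thread
DELETES_PER_DAY = 2_000_000  # daily average quoted in this thread


def tombstone_growth_bytes(deletes_per_day, days, tombstone_bytes=TOMBSTONE_BYTES):
    """Disk space held by tombstones after `days` of deletions at a constant rate."""
    return deletes_per_day * days * tombstone_bytes


per_day = tombstone_growth_bytes(DELETES_PER_DAY, 1)
per_year = tombstone_growth_bytes(DELETES_PER_DAY, 365)

print(f"{per_day / 1e6:.0f} MB per day")    # 26 MB per day
print(f"{per_year / 1e9:.2f} GB per year")  # 9.49 GB per year
```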

@eric24 I will think about it, but this change will not be delivered in the very short term.