One source of pessimism is database limits. MongoDB has a useful page that describes its limits. It limits index keys to less than 1025 bytes. But this is a great example of the value of pessimism. The documentation states that values (MongoDB documents) are not added to the index when the index key is too large. An optimist might assume that an insert or update statement fails when the index key is too large, but that is not the specified behavior.
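A pessimistic client can check the size of a value before assuming it will be indexed. Below is a minimal sketch in Python using the bson module that ships with pymongo; the 1024-byte threshold and the way the key size is approximated are assumptions for illustration, since the server's internal key encoding is not the same as a standalone BSON document.

    import bson  # the BSON codec that ships with pymongo

    # Rough, pessimistic guard: approximate the index key size by encoding the
    # indexed field as its own BSON document. The server encodes index keys
    # differently, so treat the threshold as a conservative estimate.
    # Newer pymongo releases spell this bson.encode() instead of BSON.encode().
    MAX_INDEX_KEY_BYTES = 1024

    def fits_in_index(field_name, value, limit=MAX_INDEX_KEY_BYTES):
        approx_key_size = len(bson.BSON.encode({field_name: value}))
        return approx_key_size <= limit

    value = "x" * 2000
    if not fits_in_index("val", value):
        # Decide explicitly what to do instead of letting the server silently
        # skip index maintenance: reject, truncate, or hash the value.
        raise ValueError("value too large to be indexed reliably")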
As far as I can tell, prior to MongoDB 2.5.5 the behavior was to not add the document to the index when the indexed column exceeded 1024 bytes. The insert or update would succeed but the index maintenance would fail. Queries that used the index afterwards could return incorrect results.
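The symptom is easy to sketch from the client side. The following is a hypothetical pymongo repro against a pre-2.5.5 server; the collection and field names are made up, and newer servers behave differently (the new behavior fails the insert or update instead).

    from pymongo import MongoClient, ASCENDING

    client = MongoClient("localhost", 27017)
    coll = client.test.big_keys
    coll.drop()
    coll.create_index([("val", ASCENDING)])

    big = "x" * 2000  # well past the 1024-byte index key limit

    # On pre-2.5.5 servers this insert succeeds even though the entry in the
    # secondary index on "val" is silently skipped.
    coll.insert_one({"_id": 1, "val": big})

    print(coll.find_one({"_id": 1}) is not None)    # True: found via the _id index
    print(coll.find_one({"val": big}) is not None)  # Can be False: the plan uses the
                                                    # index on "val", which never got
                                                    # an entry for this document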
A quick search of the interwebs shows that people were aware of the problem in 2011. I can reproduce the problem on my Ubuntu 12.10 VM. Why do we tolerate problems like this? Maybe this isn't a big deal, and the real problem is that new risks (from a system I don't know much about) feel worse than risks in software I have been using for years. But corruption (either permanent or via incorrect query results) has always been a stop-the-world bug for me -- as in, you do nothing else until the problem has been fixed. Why have MongoDB users tolerated this problem for years?
While it is great that the bug appears to have been fixed, database vendors should understand that FUD takes much more time to go away. See all of the stories about transactions and MySQL that lived on long after InnoDB became viable. And note that some of the MySQL FUD was self-inflicted -- see the section of the MySQL manual on Atomic Operations.
I found this code in MongoDB 2.4.9 that shows how the key-too-large case is handled. A careful reader might figure out that index maintenance isn't done. Nice optimization. I spoke to one user who doesn't like the old behavior but doesn't want to break apps with the new behavior that fails inserts and updates with too-large keys. Indexes on a prefix of a key would help in that case -- see the sketch after the code.
template< class V >
void BtreeBucket<V>::twoStepInsert(DiskLoc thisLoc,
                                   IndexInsertionContinuationImpl<V> &c,
                                   bool dupsAllowed) const
{
    if ( c.key.dataSize() > this->KeyMax ) {
        problem() << "ERROR: key too large len:" << c.key.dataSize()
                  << " max:" << this->KeyMax << ' '
                  << c.key.dataSize() << ' '
                  << c.idx.indexNamespace() << endl;
        return; // op=Nothing
    }
    insertStepOne(thisLoc, c, dupsAllowed);
}
/** todo: meaning of return code unclear clean up */
template< class V >
int BtreeBucket<V>::bt_insert(const DiskLoc thisLoc, const DiskLoc recordLoc,
                              const BSONObj& _key, const Ordering &order,
                              bool dupsAllowed,
                              IndexDetails& idx, bool toplevel) const
{
    guessIncreasing = _key.firstElementType() == jstOID && idx.isIdIndex();
    KeyOwned key(_key);
    dassert(toplevel);
    if ( toplevel ) {
        if ( key.dataSize() > this->KeyMax ) {
            problem() << "Btree::insert: key too large to index, skipping "
                      << idx.indexNamespace() << ' ' << key.dataSize()
                      << ' ' << key.toString() << endl;
            return 3;
        }
    }
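MongoDB did not have prefix indexes, but an application can approximate one. This is a minimal pymongo sketch; the collection name, the field names, and the 512-character prefix are assumptions for illustration, not anything MongoDB provides.

    from pymongo import MongoClient, ASCENDING

    client = MongoClient("localhost", 27017)
    coll = client.test.big_keys
    coll.create_index([("val_prefix", ASCENDING)])  # only the short field is indexed

    def save_doc(doc_id, val):
        # The indexed field stays well under the 1024-byte key limit; the full
        # value is stored in the document but kept out of the index. This assumes
        # roughly one byte per character; hash the value instead if that does not hold.
        coll.insert_one({"_id": doc_id, "val": val, "val_prefix": val[:512]})

    def find_by_val(val):
        # The prefix narrows the search through the index; the equality match on
        # the full value filters out any prefix collisions.
        return coll.find_one({"val_prefix": val[:512], "val": val})

    save_doc(1, "x" * 2000)
    print(find_by_val("x" * 2000) is not None)  # True, without an oversized index key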