3

I'm inserting a batch of 100 records, each containing a dictonary containing arbitrarily long HTML strings, and by god, it's slow. On the iphone, the runloop is blocking for several seconds during this transaction. Is my only recourse to use another thread? I'm already using several for acquiring data from HTTP servers, and the sqlite documentation explicitly discourages threading with the database, even though it's supposed to be thread-safe... Is there something I'm doing extremely wrong that if fixed, would drastically reduce the time it takes to complete the whole operation?

 NSString* statement; statement = @"BEGIN EXCLUSIVE TRANSACTION"; sqlite3_stmt *beginStatement; if (sqlite3_prepare_v2(database, [statement UTF8String], -1, &beginStatement, NULL) != SQLITE_OK) { printf("db error: %s\n", sqlite3_errmsg(database)); return; } if (sqlite3_step(beginStatement) != SQLITE_DONE) { sqlite3_finalize(beginStatement); printf("db error: %s\n", sqlite3_errmsg(database)); return; } NSTimeInterval timestampB = [[NSDate date] timeIntervalSince1970]; statement = @"INSERT OR REPLACE INTO item (hash, tag, owner, timestamp, dictionary) VALUES (?, ?, ?, ?, ?)"; sqlite3_stmt *compiledStatement; if(sqlite3_prepare_v2(database, [statement UTF8String], -1, &compiledStatement, NULL) == SQLITE_OK) { for(int i = 0; i < [items count]; i++){ NSMutableDictionary* item = [items objectAtIndex:i]; NSString* tag = [item objectForKey:@"id"]; NSInteger hash = [[NSString stringWithFormat:@"%@%@", tag, ownerID] hash]; NSInteger timestamp = [[item objectForKey:@"updated"] intValue]; NSData *dictionary = [NSKeyedArchiver archivedDataWithRootObject:item]; sqlite3_bind_int( compiledStatement, 1, hash); sqlite3_bind_text( compiledStatement, 2, [tag UTF8String], -1, SQLITE_TRANSIENT); sqlite3_bind_text( compiledStatement, 3, [ownerID UTF8String], -1, SQLITE_TRANSIENT); sqlite3_bind_int( compiledStatement, 4, timestamp); sqlite3_bind_blob( compiledStatement, 5, [dictionary bytes], [dictionary length], SQLITE_TRANSIENT); while(YES){ NSInteger result = sqlite3_step(compiledStatement); if(result == SQLITE_DONE){ break; } else if(result != SQLITE_BUSY){ printf("db error: %s\n", sqlite3_errmsg(database)); break; } } sqlite3_reset(compiledStatement); } timestampB = [[NSDate date] timeIntervalSince1970] - timestampB; NSLog(@"Insert Time Taken: %f",timestampB); // COMMIT statement = @"COMMIT TRANSACTION"; sqlite3_stmt *commitStatement; if (sqlite3_prepare_v2(database, [statement UTF8String], -1, &commitStatement, NULL) != SQLITE_OK) { printf("db error: %s\n", sqlite3_errmsg(database)); } if (sqlite3_step(commitStatement) != SQLITE_DONE) { printf("db error: %s\n", sqlite3_errmsg(database)); } sqlite3_finalize(beginStatement); sqlite3_finalize(compiledStatement); sqlite3_finalize(commitStatement); 
5
  • You code looks OK to me; you might need to profile your code to get more information on where the bottleneck is e.g. is it performing the sqlite statement or is it archiving the item? Commented Mar 15, 2010 at 9:53
  • (Though the while(YES) smells a bit - why not just do the request and see what result is - why do you need the while at all?) Commented Mar 15, 2010 at 9:54
  • I was under the impression that the interface might be stuck in the SQLITE_BUSY state for awhile, and that I shouldn't simply proceed until the transaction is actually over. It could possibly be archiving, but I have another transaction that runs a delete query instead, and thats slow too. I'll try running the inserts without encoding the dictionary data and see how it goes. Commented Mar 15, 2010 at 10:04
  • Even without the archiving, a hundred inserts are taking up to 5 seconds to complete. I don't know if this is normal, but at this rate, I'll have no choice but to stuff the whole database module in a thread... Does anyone have a better solution for this? It seems inconceivable that transactions could be taking this long... And sqlite seems to be the only way I can manage large amounts (40+MB) of data without keeping it all in memory at the same time. Core Data doesn't allow me to get rid of objects without loading it into memory first, and uses sqlite as a base anyway. Commented Mar 15, 2010 at 10:12
  • I've suddenly found that I have an even worse problem on my hand. sqlite3_bind_blob is apparently failing for some reason, silently. When I store the blob, the NSData object's length is in the thousands (KB range). When I get it back, it's been truncated to 10 bytes. When I check the DB through SQLite Browser, most of the data is gone (I can recognize the keys in the record if I store the dictionary as NSPropertyListSerialization, but the values are gone). This happens regardless of whether I use NSPropertyListSerialization or NSKeyedArchiver to serialize my data. Commented Mar 16, 2010 at 3:53

4 Answers 4

4

The thing that you need to be aware of is that the SQLite documentation warns you away from accessing/writing to the database from multiple threads. As long as you access the database from a single thread, you'll be fine. It doesn't matter if that thread is your program's main thread or some other thread.

Keep in mind that compiled version of SQLite on the iPhone has its threading mode set to "multi-thread" which, according to the documentation, "disables mutexing on database connection and prepared statement objects. The application is responsible for serializing access to database connections and prepared statements but other mutexes are enabled so that SQLite will be safe to use in a multi-threaded environment as long as no two threads attempt to use the same database connection at the same time." So, if you do decide to put this transaction on another thread, be careful of what else you try to do with the database.

That being said, I'd first follow Yonel's advice and switch to "BEGIN" AND "COMMIT". If that doesn't help, move the transaction to another thread. Working with "blobs" can be pretty slow, from what I've heard.

Sign up to request clarification or add additional context in comments.

6 Comments

1) If I remove the long HTML string from the dictionary blob, and place it into a text column instead (the blob would be much smaller in this case, all the other data in the dictionary are fairly short strings/arrays of strings), do you think this would help significantly? 2) While putting all db transactions in a thread would keep my UI thread from blocking, it would not actually help shorten the actual transaction time (in fact, it would probably get worse, since I'd need to manage and pass messages between threads). This is something I want to avoid if at all possible.
I do think that it would be faster to put the HTML strings in a text column. I've done the same thing in a program I've been working on, and the insert completes much faster than yours does. Give it a try. I'd love to hear how it works out for you. You are right to say that putting the work in a separate thread won't speed up the process, but it will certainly improve the responsiveness of the UI, which is incredibly important.
I intend to do that, thanks. I plan to a) shrink my dictionary by discarding data I don't need to store, and b) split my dictionary into text/varchars at storage time and reassemble it when i need it... ill still need for some blobs for a couple of arrays contained within my dictionary, but theyre far far smaller than the huge html string that occupies most of the dictionary's space. hopefully, this will improve things.
Would calling NSInvocationOperation be okay so long as I open a new connection each time I want to carry out a transaction? Particularly, I'm probably going to carry out deletes and inserts/updates with NSInvocationOperation, fire and forget. Selects...are another matter I'll have to deal with seperately, since I actually need the results after the operation is finished.
It should work, but be careful not to access the database more than once from different threads. If you do try to create two or more connections to the database at the same time, your application will crash.
|
1

Did you try the same as your code but with "BEGIN" and "COMMIT" instead of "BEGIN EXCLUSIVE TRANSACTION" and "COMMIT TRANSACTION" ?

I'm simply using BEGIN and COMMIT and it's pretty much faster than committing for each transaction so I guess it's working with those keywords.

http://www.sqlite.org/lang_transaction.html

4 Comments

I just tried this. I can discern no noticable difference in performance. The "EXCLUSIVE" keyword might make a difference, but isn't BEGIN TRANSACTION just an alias for BEGIN?
:( Hum, I couldn't say if the two keywords are the same.
Hrmm, supposing the effect is different, the operation is still pretty heavy all the same. You said it was much faster... how many records are you inserting into the table at a given time?
It was between 1500/2000, I don't have the numbers in mind, but as far as I remember the ratio with/without the begin/commit was quite high.
0

I see a lot of cases where developers new to the iPhone believe the code is slow when it is simply a case of asking lightweight hardware to do to much processing. Processing several hundred (thousands?) of "arbitrarily long HTML strings" might to heavy a task for the iPhone to carry out in a timely fashion.

Remember that the iPhone isn't a very powerful piece of hardware. It pulls off all the nifty graphics with dedicated hardware whose computational power you can't access for other task. Even if you optimize the code it maybe way slower than you would intuitively expect based on your experience with full fledged laptops and desktops.

Instead of guessing where the bottleneck is, I suggest that you profile the code with Instruments (or even just using NSLog with timestamps) to see exactly where the code is spending most of its time.

2 Comments

I have used timestamps... via NSTimeIntervals. There are other tasks being done that are fairly weighty as well and can take up to several seconds, but all of those are either asynchronous or packed into another thread. I assure you, when I say 5-17 seconds, I am not pulling those numbers out of air. I am getting those from calling NSLog on my timestamps. I have also tried Instruments the task that consumes the most time on the main thread are my inserts. I specifically put them into this thread because the documentation discourages multithreading with the DB.
It sounds like you may indeed be overloading the hardware. I would try testing with lighter weight data i.e. the same number of records but with significantly smaller blobs. If it speeds up significantly then your just trying to do to much.
0

A better approach to avoid blocking problems is to use an asynchronous callbacks. Try using Enorm EGO sqlite wrapper https://github.com/jdp-global/egodatabase

Have a look at my Readme section for EGODatabaseRequest - asynchronous requests /inserts to db.

2) Add the requestDidSucceed /requestDidFail callback methods.

 -(void)requestDidSucceed:(EGODatabaseRequest*)request withResult:(EGODatabaseResult*)result idx++ if ([items count]<idx) [self insertRow]; } -(void)requestDidFail:(EGODatabaseRequest*)request withError:(NSError*)error{ NSLog(@"WARNING requestDidFail"); } -(void)insertRow{ NSMutableDictionary* item = [items objectAtIndex:idx]; NSInteger hash = [[NSString stringWithFormat:@"%@%@", tag, ownerID] hash]; NSInteger timestamp = [[item objectForKey:@"updated"] intValue]; NSData *dictionary = [NSKeyedArchiver archivedDataWithRootObject:item]; NSString *qry = [NSString stringWithFormat:@"INSERT OR REPLACE INTO item (hash, tag, owner, timestamp, dictionary) VALUES (%@, %@, %@, %@, %@);",NUMBER(hash),[tag UTF8String],[ownerID UTF8String],NUMBER(timestamp),dictionary]; // be sure to use NSNumbers not NSIntegers EGODatabaseRequest* request = [[EGODatabaseRequest alloc] initWithQuery:qry parameters:nil]; request.delegate = self; request.database = appDelegate.database; request.requestKind = EGODatabaseUpdateRequest; // use update not select [request fire]; [request release]; } 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.