Comments (47)
There are likely some low-hanging performance improvements to be made right now, especially around row iteration. I've just been optimizing the interface, first. Can we isolate a couple factors, though?
- Is this a Release build, or Debug?
- Performance vs. raw query. Let's find out how much more expensive the type-safe iteration is. You can run
pdfDb!.trace(println)
before all the above queries to set up a SQL query logger, from here, take the queries that spit out and test them with raw iteration (for _ in db.prepare(query) {}
). Run both queries at least twice to account for warm-up.
If you have runnable code you can share, please email it and it should make any performance testing a bit easier.
from sqlite.swift.
I have changed the code to trace raw iteration, and it is much faster
func getPdfWithCategory(language:Language) -> CategoriesForLanguage {
let categoriesForLanguage = CategoriesForLanguage()
pdfDb!.db!.trace(println)
let PDFs = pdfDb!.pdfsTb.table
let categories = pdfDb!.categoriesTb.table
start = NSDate()
let query = "SELECT p.* , c.id as categoryid, c.name as categoryname, c.thumb as categorythumb, c.orderid as categorysortid FROM pdf p inner join categories c on c.id=p.categoryid WHERE p.languageId='\(language.id)'"
let timeInterval:NSTimeInterval = start!.timeIntervalSinceNow
println("query \(timeInterval)")
start = NSDate()
let stmt = pdfDb!.db!.prepare(query)
for row in stmt {
}
let timeInterval2:NSTimeInterval = start!.timeIntervalSinceNow
println("for row in query \(timeInterval2)")
return categoriesForLanguage
}
query -0.00111699104309082
SELECT p.* , c.id as categoryid, c.name as categoryname, c.thumb as categorythumb, c.orderid as categorysortid FROM pdf p inner join categories c on c.id=p.categoryid WHERE p.languageId='4'
for row in query -0.623710989952087
compared with the original code output:
query -0.00927197933197021
SELECT count() FROM pdf INNER JOIN categories ON (categories.id = pdf.categoryId) WHERE (pdf.languageId = '4')
query count = 88
SELECT pdf., categories.id, categories.name, categories.thumb, categories.orderid FROM pdf INNER JOIN categories ON (categories.id = pdf.categoryId) WHERE (pdf.languageId = '4')
for row in query -2.36225700378418
regarding sharing my code, I need to create a new project without all sensitive data and will email it to you.
from sqlite.swift.
Great, thanks! I have a few ideas already, but would be helpful to have a baseline to work with.
from sqlite.swift.
I can also confirm it is really slow. To get 1000 rows from my database it takes over 5 seconds. When I use db.prepare then the performance is much better.
from sqlite.swift.
@TomasLinhart Can you give actual benchmarks? I'm under the impression that the difference is linear and about 4x slower at the moment, but any specifics to the contrary (with examples) would be helpful.
from sqlite.swift.
@stephencelis Sure, I can provide you something. But it is not so hard to test. Just create a database with over 1000 rows and at least 15 columns and try to load everything.
from sqlite.swift.
@TomasLinhart Sure, but less time for me to have a reproducible problem the better. My personal use cases aren't dealing with this much data all at once, so it's been more of a back burner fix for me. (In fact, if you give me your use case that would be helpful knowledge, as well.) The easier you make it for me to take it off the back burner, the more likely fixes will come sooner when I have personal time to spare and allot to this part of the project. I'm also happy to take pull requests if you'd like to dig in yourself and spend time contributing to the project.
from sqlite.swift.
@stephencelis Sure, I will prepare an example project. Currently I just switched into db.prepare but in future I might need more so I could dive in and do some pull requests but for now I just want to finish my hobby project.
My use case is I have data set that I want to have in memory so I can manipulate it easily.
from sqlite.swift.
I have created a project with benchmark https://github.com/TomasLinhart/SQLitePerformance Just download it and run it and you will see difference in the performance.
from sqlite.swift.
Thanks! I'll take a look at this oneโand @violabg's, aboveโwhen I have the chance.
from sqlite.swift.
I have updated the benchmark with C API example which is 17x times faster than db.prepare and 225x faster than typed API.
from sqlite.swift.
@TomasLinhart db.prepare
is fairly lightweight, but there is implicit type conversion happening there that isn't happening in your native example. Assuming your benchmarks are from a release build, we may hit the point where Swift becomes the bottleneck (and will hopefully be optimized automatically in the future). I'll check for optimizations that can be made there, as well, though.
from sqlite.swift.
@stephencelis You are right, I checked your code for db.prepare and it is looking good. I tried to remove the logic for generator and then it is almost same speed as my C API example. So I guess biggest bottleneck in db.prepare is row.append but it is not possible to do much about it...
from sqlite.swift.
@TomasLinhart Were you running things in debug mode? My results:
using typed API took 6.32480198144913 seconds
using db.prepare took 0.377619981765747 seconds
using C native API took 0.0522429943084717 seconds
If I change your iteration logic to use sqlite3_column_type
and sqlite3_column_{type}
methods to determine the types and yield arrays of rows, there is negligible overhead:
using db.prepare took 0.388479948043823 seconds
using C native API took 0.382124006748199 seconds
var results = [[Any?]]()
if (sqlite3_prepare_v2(db, query.cStringUsingEncoding(NSUTF8StringEncoding)!, -1, &statement, nil)
== SQLITE_OK) {
let columnCount = sqlite3_column_count(statement)
while (sqlite3_step(statement) == SQLITE_ROW) {
var row = [Any?]()
for idx in 0..<columnCount {
switch sqlite3_column_type(statement, Int32(idx)) {
case SQLITE_BLOB:
let bytes = sqlite3_column_blob(statement, Int32(idx))
let length = sqlite3_column_bytes(statement, Int32(idx))
row.append(Blob(bytes: bytes, length: Int(length)))
case SQLITE_FLOAT:
row.append(Double(sqlite3_column_double(statement, Int32(idx))))
case SQLITE_INTEGER:
let int = Int(sqlite3_column_int64(statement, Int32(idx)))
var bool = false
if let type = String.fromCString(sqlite3_column_decltype(statement, Int32(idx))) {
bool = type.hasPrefix("BOOL")
}
row.append(bool ? int != 0 : int)
case SQLITE_NULL:
row.append(nil)
case SQLITE_TEXT:
row.append(String.fromCString(UnsafePointer<CChar>(sqlite3_column_text(statement, Int32(idx))))!)
case let type:
assertionFailure("unsupported column type: \(type)")
}
}
results.append(row)
}
sqlite3_finalize(statement)
}
Assuming you add logic to your C API example that handles the underlying types at all, I imagine there is little to no performance hit using SQLite.swift.
Given that, the typed interface is still weighing in at about 21x less performant in this case, which is better than 225x, but slow enough to warrant investigation and cleanup when I can.
Meanwhile, you should be able to batch/stream the results into your interface to avoid such big delays and still be able to use the typed interface.
from sqlite.swift.
Yeah, I was, sorry about that. So db.prepare is fast enough, check my last comment, biggest overhead is the append but there is not much we can do about that.
So only problem is only the typed interface. ๐
from sqlite.swift.
@TomasLinhart Ah, good to hear you're seeing the same results as me now :)
In the end I think the typed interface will become more performant as Swift improves its own performance, but I do believe in making optimizations in SQLite.swift wherever possible, so I'll try to dig in sometime soon!
from sqlite.swift.
@TomasLinhart Actually, the point about append
is a good one. I've seen cases where things like reduce
and map
are slower, but map
actually improves the performance here quite a bit:
using db.prepare took 0.273506999015808 seconds
using C native API took 0.397485971450806 seconds
I'll commit this optimization (and more) when I have more time to dig in. Thanks!
from sqlite.swift.
Also quickly cached columnNames
in the QueryGenerator
and we're down to 3.4x the slowdown of the original db.prepare
and 5x with the new improvements.
using typed API took 1.3848859667778 seconds
using db.prepare (old) took 0.388479948043823 seconds
using db.prepare took 0.279254972934723 seconds
This may be the easiest win for now. I'm not sure if you want to dig in for any more optimizations.
from sqlite.swift.
Hehe, I was also digging down and discovered same problem ๐
from sqlite.swift.
One last improvement is an internal restructuring of how Row
stores its data:
using typed API took 0.305689036846161 seconds
using db.prepare (new) took 0.307142019271851 seconds
Iteration should basically be identically performant, now.
OK! I'll push these fixes but then I really need to get back to work ๐จ
from sqlite.swift.
Good job, I am glad the problem was discovered. Looking forward to your commit! ๐
from sqlite.swift.
Closed by:
from sqlite.swift.
Thanks for your help, @TomasLinhart!
from sqlite.swift.
Hi Stephen,
I don't know if you had any chances to look at my sample code, but the last commit did not improve much for me.
Also when I try this:
let PDFs = pdfDb!.pdfsTb.table
let categories = pdfDb!.categoriesTb.table
let query = PDFs.select(PDFs[*], categories[pdfDb!.categoriesTb.id], categories[pdfDb!.categoriesTb.name], categories[pdfDb!.categoriesTb.thumb], categories[pdfDb!.categoriesTb.orderid])
.join(categories, on: categories[pdfDb!.categoriesTb.id] == PDFs[pdfDb!.pdfsTb.categoryId])
.filter(PDFs[pdfDb!.pdfsTb.languageId] == language.id)
it breaks saying :
fatal error: no such table: "pdf": file /workspace/apple/ios8/SQLitePerformance-master/SQLite.swift/SQLite Common/Database.swift, line 335
with it didn't happen before
from sqlite.swift.
@violabg That sounds like an unrelated issue. What does pdfDb!.pdfsTb.table
look like? If you run pdfDb!.run("select count(*) from pdfs")
(or whatever your PDFs table name is) you should get the same error. It sounds like the table doesn't exist yet.
from sqlite.swift.
let PDFs = pdfDb!.pdfsTb.table
let categories = pdfDb!.categoriesTb.table
println(PDFs.count) //prints 467
println(categories.count) //prints 13
let query = PDFs.select(PDFs[*], categories[pdfDb!.categoriesTb.id], categories[pdfDb!.categoriesTb.name], categories[pdfDb!.categoriesTb.thumb], categories[pdfDb!.categoriesTb.orderid]).join(categories, on: categories[pdfDb!.categoriesTb.id] == PDFs[pdfDb!.pdfsTb.categoryId]).filter(PDFs[pdfDb!.pdfsTb.languageId] == language.id)
var categoriesDictionary = [Int: PdfCategory]()
var currentCategory:PdfCategory
for row in query {
here I get the error, when it try to enter the for in loop
fatal error: no such table: "pdf": file /workspace/apple/ios8/SQLitePerformance-master/SQLite.swift/SQLite Common/Database.swift, line 335
wich is table:
let PDFs = pdfDb!.pdfsTb.table
but it works here
println(PDFs.count) //prints 467
println(categories.count) //prints 13
from sqlite.swift.
Very strange. The error is coming from SQLite, so the query must be getting malformed somehow. Can you email me the latest version of your code so that I can better troubleshoot?
from sqlite.swift.
Aside: just wanted to throw in another sample from my personal project.
+65% longer for a flat-out table scan when using typed versus raw (8.54s vs 5.16s).
FWIW, I have been using flat-out binary block dumps for large arrays (10k) of structs/fixed-data for speed (like a column-based DBMS), but I am migrating to sqlite for ease-of-maintenance and flexibility.
/// typed: db_read -8.54326200485229
let tbl = GGGTable.CardAnnotation
for r in tbl {
if var s = notes[ r[e_cid] ]
{
s.unixtime = r[e_show_me_on]
} else {
notes[ r[e_cid] ] = CardAnnotation( unixtime: r[e_show_me_on] )
}
}
/// raw: db_read -5.16345697641373
let db = GGGDatabase.db
let stmt = db.prepare( "SELECT * FROM cardannotation" )
for r in stmt {
let idx = r[0] as Int
let ut = r[1] as Int
if var s = notes[ idx ]
{
s.unixtime = ut
} else {
notes[ idx ] = CardAnnotation( unixtime: ut )
}
}
from sqlite.swift.
@kfmfe04 And this is in a release build, not debug?
Regardless, the difference isn't too bad, considering. Certain optimizations are going to come with time as Swift improves. In the meantime there will be a tradeoff in speed vs. safety and code clarity.
SQLite.swift's type-safe interface should be pretty speedy when you use SQLite itself for querying and scoping the data you're working with, but if you're loading a large dataset into memory, you're probably going to want to optimize as you did, dipping down into the raw API with fewer generics.
I'm hoping that Swift's performance improves over time and the gap you're seeing narrows quite a bit.
I'm also open to pull requests that continue to improve performance with the type-safe interface, so feel free to dig in if you have ideas! I'll also keep performance in mind as I work with things.
from sqlite.swift.
@violabg Looks like the issue you've stumbled upon is a table quoting bug. I'll try to fix it soon!
from sqlite.swift.
thanks Stephen,
once you fix it, could you test the performance of my code and see if they match your results?
from sqlite.swift.
@violabg Fix pushed here: 1f070f7
I'm not sure how to check the performance difference, though. Please let me know how it's improved!
from sqlite.swift.
Thanks Stephen, the problem is fixed now and this are my test results:
func getPdfWithCategory(language:Language) -> CategoriesForLanguage {
let categoriesForLanguage = CategoriesForLanguage()
categoriesForLanguage.laguangeId = language.id
let PDFs = pdfDb!.pdfsTb.table
let categories = pdfDb!.categoriesTb.table
self.start = NSDate()
let query = PDFs.select(PDFs[*], categories[pdfDb!.categoriesTb.id], categories[pdfDb!.categoriesTb.name], categories[pdfDb!.categoriesTb.thumb], categories[pdfDb!.categoriesTb.orderid]).join(categories, on: categories[pdfDb!.categoriesTb.id] == PDFs[pdfDb!.pdfsTb.categoryId]).filter(PDFs[pdfDb!.pdfsTb.languageId] == language.id)
let timeInterval:NSTimeInterval = start!.timeIntervalSinceNow
println("query \(timeInterval)")
println("query.count \(query.count)")
var categoriesDictionary = [Int: PdfCategory]()
var currentCategory:PdfCategory
self.start = NSDate()
for row in query {
//
}
let timeInterval2:NSTimeInterval = start!.timeIntervalSinceNow
println("for \(timeInterval2)")
return categoriesForLanguage
}
on mac:
query -0.00147002935409546
query.count 87
for -0.021390974521637
timeInterval -0.0214470028877258
on ipad 4:
query -0.0244329571723938
query.count 87
for -0.350589990615845
timeInterval -0.350932002067566
iteration over 87 records fills a little slow, I have ported this project from Objective C and is much faster.
Probably as you said, Swift is the bottleneck.
from sqlite.swift.
that was a debug build - ok - I have a more pressing issue
I do a table scan something like
let sentences = db[ "sentences" ]
let id = Expression<Int>("id")
let entry = Expression<String>("entry")
let address = Expression<String>("address")
let tokens = Expression<String>("tokens")
for r in sentences
{
// ....
}
for about 8,000 records, this takes 3 minutes on an iPod touch. Parsing a pipe-delimited ASCII file equivalent takes 25 seconds. Is this within range of expectations or should a table-scan be faster?
Maybe I should try passing in a "SELECT" statement instead and see how fast it is?
from sqlite.swift.
@kfmfe04 What are you doing with the data as you iterate? If you're doing empty iteration, it's not a fair comparison because SQLite.swift has type-handling code built in. If you're working with such a large dataset all at once, you may need to optimize by ditching the type-safe interface for something lower-level. Or better yet, limit your queries to only the data you need, and use aggregate functions for calculations across the dataset.
from sqlite.swift.
so slow ....
using
for about 2000rows
from sqlite.swift.
@androidcn Please try to constructively contribute to the conversation and read this thread thoroughly (check my questions to others throughout, including whether you've compiled for "Release" or not, and what device you're running on).
In short:
- The basic iteration API should be as fast as interfacing with libsqlite3 directly, while the typed API is as fast as adding type-checking logic for each row. You're likely hitting a Swift or SQLite bottleneck.
- In the end, iterating over 2,000 rows seems a bit heavy-handed when SQLite provides aggregation APIs and when you can scope/batch your queries.
from sqlite.swift.
@stephencelis but i want to use typed API
from sqlite.swift.
@androidcn The typed API is as fast as manually-typed, so please do. It will improve with Swift. Scope your queries for performance.
from sqlite.swift.
An update to those listening: Swift 1.2 performs 2x+ faster for me. If you end up re-running your benchmarks (off the swift-1-2
branch), please share your experiences.
from sqlite.swift.
i will check it out later
On Sat, 14 Feb, 2015 at 9:30 am Stephen Celis [email protected]
wrote:
An update to those listening: Swift 1.2 performs 2x+ faster for me. If you
end up re-running your benchmarks (off the swift-1-2 branch), please
share your experiences.Reply to this email directly or view it on GitHub
#23 (comment)
.
from sqlite.swift.
I have tested my code with branche Swift 1.2.
I get 1,3 sec.
with Swift 1.1 I get 6,9 sec.
using raw sql I get 0,28 sec. with Swift 1.2
and 0,98 sec. with Swift 1.1
from sqlite.swift.
Thanks @violabg! Hopefully things will continue to improve steadily.
from sqlite.swift.
A little bit OT. When will you merge the Swift 1.2 branch into master, @stephencelis ?
from sqlite.swift.
@TomasLinhart per #62:
This PR is open to track changes against the Swift 1.2 beta. It won't be merged till Swift 1.2 goes GM, but will be continuously rebased onto
master
.
from sqlite.swift.
Is the limit set to 1000? Or is there a way to do more than 1000 (8000 to be specific)
from sqlite.swift.
@sehgalsaheel Have you tried it? Have you had problems? You should have the same ability, limit-wise, as you'd have with raw sqlite3. If the device can't handle 8,000, you should be batching.
from sqlite.swift.
Related Issues (20)
- Duplicate symbol in LinkServices.framework HOT 1
- Feature Request: Way to exclude an index when doing a query
- privacy manifest HOT 2
- Join between tables that have some columns with the same name
- Symbol not found: _$s6SQLite10ConnectionC8LocationO8inMemoryyA2EmFWC
- Feature Request: Add support for sqlite-vss extension
- Feature Request: Make Connection's userVersion property non optional
- Integrate SQLite code directly into project - Did a few renames - one last issue not sure what is causing it. - Help HOT 1
- SqlCipher support for Swift PM, Here's a solution
- Feature: Request visionOS Support HOT 2
- Windows support HOT 3
- can not get column info with join
- Subscrip to changes
- Problem with CSQLite HOT 1
- bug in optional support for decoding HOT 1
- Version 0.15.0 Not Applied in CocoaPods HOT 10
- Swift Compiler Error when project depend SQLite.swift and sqlite3_flutter_libs HOT 1
- ncorrect checksum for freed object HOT 1
- Double quotes when using ULID HOT 1
- Unable to decode optional custom type HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. ๐๐๐
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google โค๏ธ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from sqlite.swift.