
Thread: SQLite, suitable for large amounts of data? Connectivity?


Replies: 10 - Last Post: Oct 6, 2014 6:45 PM - Last Post By: Wayne Niddery
Arthur Hoornweg

Posts: 414
Registered: 6/2/98
SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Sep 30, 2014 2:46 AM
Hello all,

for a future project, I need to implement a database server that stores enormous amounts of floating-point readings. The requirements are "impossible". One requirement is that the database must be virtually "unbreakable" by design: even in the case of an HDD failure or power outage, the damage should be as limited as possible. The database will accumulate something like 1 GB of data per project per day, and we expect to run some 20 projects simultaneously.

I am reluctant to implement something like this in a "black box" like MSSQL because I simply don't know how badly the database could break and how quickly it can be restored. In this special case I'd much rather have individual database files per project. The advantage would be that they are much easier to back up, restore and defragment. It would also enable me to easily move individual projects out of the database to a storage medium and back again. If, for some reason, one database file "breaks", all other projects would still be accessible.

I was thinking about using SQLite for this. What I don't know is: can it handle such large amounts of data, is it 100% ACID, and what's a good (stable, future-proof) connectivity option for Delphi? I don't care about bells & whistles, but it must be rock-steady and support Unicode.
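
Just to sketch what I have in mind on the Delphi side (purely illustrative - the file name, table layout and journal setting are placeholders, nothing is decided yet): one SQLite file per project, opened through something like FireDAC, would look roughly like this:

uses
  FireDAC.Comp.Client, FireDAC.Phys.SQLite;

procedure OpenProjectDb(const AFileName: string);
var
  Conn: TFDConnection;
begin
  Conn := TFDConnection.Create(nil);
  try
    Conn.DriverName := 'SQLite';
    Conn.Params.Add('Database=' + AFileName);  // one .db file per project
    Conn.Params.Add('JournalMode=WAL');        // write-ahead log, limits damage on power loss
    Conn.Open;
    // one floating-point reading per second, keyed by timestamp
    Conn.ExecSQL('CREATE TABLE IF NOT EXISTS readings ' +
                 '(ts REAL NOT NULL PRIMARY KEY, value REAL NOT NULL)');
  finally
    Conn.Free;
  end;
end;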

Ian Barker

Posts: 98
Registered: 5/18/07
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Sep 30, 2014 5:54 AM   in response to: Arthur Hoornweg
20GB per day is quite a lot to store :)

Personally I don't think SQLite would be up to the job.

If I was being asked to implement this I would use MySQL. It's free,
supports Unicode and is ACID-compliant if you are using the InnoDB
format (which is the default table type and built-in).

You don't mention what type of data components you have access to, but
FireDAC supports MySQL, although personally I use the MyDAC components,
which you'd need to buy if you do not already have them.
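
As a rough sketch only (the server, credentials and table below are assumptions on my part, not recommendations), connecting and writing to an InnoDB table from Delphi with FireDAC would look something like this:

uses
  FireDAC.Comp.Client, FireDAC.Phys.MySQL;

procedure WriteReading(AProjectId: Integer; AValue: Double);
var
  Conn: TFDConnection;
begin
  Conn := TFDConnection.Create(nil);
  try
    Conn.DriverName := 'MySQL';
    Conn.Params.Add('Server=127.0.0.1');
    Conn.Params.Add('Database=readings');
    Conn.Params.Add('User_Name=app');
    Conn.Params.Add('Password=secret');
    Conn.Open;
    Conn.StartTransaction;
    try
      // with InnoDB (the default engine) this insert is fully transactional
      Conn.ExecSQL('INSERT INTO readings (project_id, ts, value) ' +
                   'VALUES (:p, NOW(), :v)', [AProjectId, AValue]);
      Conn.Commit;
    except
      Conn.Rollback;
      raise;
    end;
  finally
    Conn.Free;
  end;
end;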

With regard to DB stability you really should be thinking about having
a fail-over setup for the databases - whichever DB technology you go
for - so that you have a definite way of maintaining database integrity
in the event of a system failure.

I work on a number of 'large' systems which use MySQL as their
database, all written with Delphi, widely distributed to several
hundred customers of varying technical skill and even more varied
reliability of equipment, such as servers and mains power, and in the
last ten years we have never had anyone lose data in the databases.
This is either luck or, more likely, inherent reliability. :)

IanB.
http://about.me/IanBarker

Arthur Hoornweg wrote:

Hello all,

for a future project, I need to implement a database server that
stores enormous amounts of floating-point readings. The requirements
are "impossible". One requirement is that the database must be
virtually "unbreakable" by design: even in the case of an HDD failure
or power outage, the damage should be as limited as possible. The
database will accumulate something like 1 GB of data per project per
day, and we expect to run some 20 projects simultaneously.
quinn wildman

Posts: 856
Registered: 12/2/99
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Sep 30, 2014 10:00 AM   in response to: Ian Barker
I am no MySQL licensing expert, but it is my understanding that the
conditions under which MySQL is free are limited. So, if free is
important, you'll want to make sure to have a look at MySQL licensing.

If free is not important, then we'd love for you to have a look at
InterBase, as I am sure it would meet all the requirements you have
stated. If you have XE7, you could start with IBLite for free, but I do
not think IBLite will meet your requirements.

Ian Barker wrote:
If I was being asked to implement this I would use MySQL. It's free,...
Ian Barker

Posts: 98
Registered: 5/18/07
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Oct 1, 2014 3:01 PM   in response to: quinn wildman
quinn wildman wrote:

I am no MySQL licensing expert, but it is my understanding that the
conditions under which MySQL is free are limited. So, if free is
important, you'll want to make sure to have a look at MySQL licensing.

MySQL Community edition is fully GPL.

(Nothing against InterBase either - great DB, I just use MySQL a lot).

--
IanB.
http://about.me/IanBarker
Aage Johansen

Posts: 23
Registered: 9/27/00
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Sep 30, 2014 12:21 PM   in response to: Arthur Hoornweg
On 30.09.2014 11:46, Arthur Hoornweg wrote:
Hello all,

for a future project, I need to implement a database server that stores enormous amounts of floating-point readings. The requirements are "impossible". One requirement is that the database must be virtually "unbreakable" by design: even in the case of an HDD failure or power outage, the damage should be as limited as possible. The database will accumulate something like 1 GB of data per project per day, and we expect to run some 20 projects simultaneously.

I am reluctant to implement something like this in a "black box" like MSSQL because I simply don't know how badly the database could break and how quickly it can be restored. In this special case I'd much rather have individual database files per project. The advantage would be that they are much easier to back up, restore and defragment. It would also enable me to easily move individual projects out of the database to a storage medium and back again. If, for some reason, one database file "breaks", all other projects would still be accessible.

I was thinking about using SQLite for this. What I don't know is: can it handle such large amounts of data, is it 100% ACID, and what's a good (stable, future-proof) connectivity option for Delphi? I don't care about bells & whistles, but it must be rock-steady and support Unicode.

The least expensive options may be Firebird and PostgreSQL.
I haven't used PG, but Fb has proven to be very good - but then, we don't
have your list of requirements!
Firebird's insert speed is good, and its architecture ensures few
locking problems.
We are using Firebird with Delphi and components from IBObjects.

--
Aage J.

Arthur Hoornweg

Posts: 414
Registered: 6/2/98
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Sep 30, 2014 11:58 PM   in response to: Aage Johansen
Thank you all for your valuable input! You have convinced me that a failsafe solution with a replicated SQL server would help to minimize downtime in the event of a failure. I think I'm going that way, be it MySQL, InterBase, Firebird, PostgreSQL or MS-SQL.

Two more little questions...

Background info: This server is going to be a web service that implements the WITSML 1.4.1.1 API, an XML-based oil industry standard for storing oil well drilling data. Scalability is highly desirable. The database will contain many tables of floating-point data sampled at 1 Hz, each table being indexed by project and timestamp. Project duration is normally between two weeks and half a year (typically 1-2 months). Each table has maybe 10-20 columns. However, it is quite possible that a query performs an "alter table", creates a new data column in an existing table and then inserts data into it.

Question one: Since the tables are so large, I'm worried about significant delays at this point because the DB will have to restructure the table. Are any of the aforementioned databases particularly good at this, and will the transaction isolation level (I plan to use either snapshot/serializable or read committed) make a huge difference in performance?

Question two: Currently I'm using ADO for connectivity, and one of its features that's extremely handy for me is the Recordset.UpdateBatch method that's available for client-side recordsets. This way I can download an interval of data from the database with a single query, make any modifications to the resultset, even append records to it, and then upload the changes back, all in one single transaction. If I want such a feature, what are my connectivity options for the aforementioned databases?
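
For illustration, the ADO pattern I mean is roughly the following (the table and field names are of course made up):

uses
  ADODB;

procedure EditInterval(AConn: TADOConnection; const T0, T1: TDateTime);
var
  DS: TADODataSet;
begin
  DS := TADODataSet.Create(nil);
  try
    DS.Connection := AConn;
    DS.CursorLocation := clUseClient;   // client-side recordset
    DS.LockType := ltBatchOptimistic;   // hold changes locally until UpdateBatch
    DS.CommandText :=
      'SELECT ts, value FROM readings WHERE ts BETWEEN :t0 AND :t1';
    DS.Parameters.ParamByName('t0').Value := T0;
    DS.Parameters.ParamByName('t1').Value := T1;
    DS.Open;

    // modify or append records locally...
    DS.Append;
    DS.FieldByName('ts').AsDateTime := T1;
    DS.FieldByName('value').AsFloat := 3.14;
    DS.Post;

    DS.UpdateBatch(arAll);              // ...then upload all changes in one go
  finally
    DS.Free;
  end;
end;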

Thanks for your patience and time,
Arthur

Ian Barker

Posts: 98
Registered: 5/18/07
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Oct 1, 2014 3:11 PM   in response to: Arthur Hoornweg
Arthur Hoornweg wrote:


Question one: Since the tables are so large, I'm worried about
significant delays at this point because the DB will have to
restructure the table. Are any of the aforementioned databases
particularly good at this, and will the transaction isolation
level (I plan to use either snapshot/serializable or read committed)
make a huge difference in performance?

Not sure what you mean by "the DB will have to restructure the table" -
it sounds like your biggest problem is going to be replication (for
failsafeness <--- new word) and choosing indices carefully to ensure
that speed is maintained.

Question two: Currently I'm using ADO for connectivity and one of
its features that's extremely handy for me is the
"recordset.Updatebatch" command that's available for client-side
recordsets. This way I can download an interval of data from the
database with a single query, make any modifications to the
resultset, even append records to it, and then upload the changes
back, all in one single transaction. If I want such
a feature, what are my connectivity options for the aforementioned
databases?


ADO is not going to cut it for the majority of the suggested databases.
What you're describing sounds a bit like transaction handling? Am I
right? In which case most will deal with it - MS SQL, MySQL and modern
SQL DBs pretty much need it. Try googling "BEGIN TRANSACTION" and the
name of your favorite chosen DB from the suggested options. :)

I could be wrong since you mention transaction isolation so I apologise
if I've got the wrong end of the stick.

One thing to avoid - in my opinion - is using TTable-type components if
at all possible. They add all sorts of overhead, plus in many cases
you're going to want to be using stored procedures or read/update
queries to manage the data for speed, since they can be optimized by the
DBMS.

--
IanB.
http://about.me/IanBarker
Arthur Hoornweg

Posts: 414
Registered: 6/2/98
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Oct 2, 2014 8:46 AM   in response to: Ian Barker
Ian Barker wrote:

Not sure what you mean by "the DB will have to restructure the table"

Assume you have a table consisting of 20 columns and a million records. What happens if a query creates an additional column in the table? (I mean something like "alter table mytable add somecolumn float".) The layout of the table changes in this case: it goes from 20 columns wide to 21. I need the database to respond snappily to this situation; a longer delay would be fatal.

ADO is not going to cut it for the majority of the suggested databases.

What's the problem? I can use ADO to connect to any database that has either an OLE DB or ODBC driver, and have done so for the past 10 years. It is mature and stable technology.

What you're describing sounds a bit like transaction handling? Am I right?

No, that's not exactly what I mean. Transactions are just a way of making things "atomic": everything between BeginTrans and CommitTrans either succeeds fully or leaves no trace. I use them all the time.

What I mean is that databases offer up to 4 different transactional MODELS. The selected model determines exactly how the database handles concurrent transactions (Google "set transaction isolation level"). The 4 models are "read committed", "read uncommitted", "repeatable read" and "serializable" (aka "snapshot" or "versioning"), and they all behave differently.
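
In ADO, for instance, the model is picked on the connection before the transaction starts (just an illustration; the table and statement are placeholders):

uses
  ADODB;

procedure DoAtomicWork(AConn: TADOConnection);
begin
  // choose the transactional model first...
  AConn.IsolationLevel := ilReadCommitted;  // or ilSerializable for snapshot/versioning
  // ...then make the work atomic
  AConn.BeginTrans;
  try
    AConn.Execute('UPDATE readings SET value = 0 WHERE value IS NULL');
    AConn.CommitTrans;    // either everything succeeds...
  except
    AConn.RollbackTrans;  // ...or it leaves no trace
    raise;
  end;
end;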

One thing to avoid - in my opinion - is using TTable type components if
at all possible.

I agree. I never use TTable. I run my queries directly against TADOConnection and it returns me a reference-counted recordset interface. The overhead is extremely small this way. If I do need visual components, I use TADODataSet and specify a very concise query that returns only the fields and records that I need.
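
I.e. something along these lines (the query is only an example):

uses
  ADODB, ADOInt;

function SumValues(AConn: TADOConnection): Double;
var
  RS: _Recordset;  // reference-counted COM interface, released automatically
begin
  Result := 0;
  RS := AConn.Execute('SELECT value FROM readings WHERE project_id = 42');
  while not RS.EOF do
  begin
    Result := Result + RS.Fields['value'].Value;
    RS.MoveNext;
  end;
end;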

you're going to want to be using stored procedures or read/update
queries to manage the data for speed since they can be optimized by the
DBMS.

Maybe in the future. But for the time being, I try to remain as SQL-99 compliant as possible, simply because it hasn't been decided yet which database to use. Stored procs and triggers are highly proprietary. If speed is an issue, I can gain a lot by running the web service (= the middleware) on the same machine as the SQL database itself. This eliminates the bandwidth and latency of the NIC.
Ian Barker

Posts: 98
Registered: 5/18/07
Re: SQLite, suitable for large amounts of data? Connectivity?
  Posted: Oct 3, 2014 10:14 AM   in response to: Arthur Hoornweg
Arthur Hoornweg wrote:


ADO is not going to cut it for the majority of the suggested
databases.

What's the problem? I can use ADO to connect to any database that has
either an OleDB or ODBC driver and have done so for the past 10
years. It is mature and stable technology.

Because it adds a layer over the top of the native protocol which can
slow things down dramatically - Google "ADO speed problem" and you'll
see lots of questions about that along with code tricks to mitigate it
in several different programming languages not just Delphi.

It looks like you're well ahead of the curve though so perhaps it's not
an issue for you?


What I mean is, databases offer up to 4 different transactional
MODELS.

Ah. MySQL has two native table types: InnoDB and MyISAM - you might
want to read through their descriptions to see which transaction and
locking strategy works best for you.

MySQL is more than capable of running with millions (plural) of
records. There are optimizations and configurations available to suit
your particular tables and schema - try googling "mysql million records
performance" and you'll see people describing 300 million, 70 million
and 6 million row table optimizations, along with one insane person
attempting to use 192 trillion records and a petabyte of data!!! Pro
tip: this will not work unless you have an unlimited lifespan and use
the NSA's computers.

Restructuring the database is the thing I think is going to be a problem
for any database - adding columns could be slow, especially if those
columns are on tables with multiple indices and you add new indices
too. No DBMS is going to deal with that well when the table data is in
the gigabyte range.

You might want to test it and see if you think under your scenario this
is going to work out acceptably.

Here are a few links to test scripts to insert 100,000 - 3,000,000
records into a MySQL database along with some performance analysing
techniques:

http://stackoverflow.com/questions/7367687/creating-a-very-large-mysql-database-from-php-script

https://www.digitalocean.com/community/tutorials/how-to-measure-mysql-query-performance-with-mysqlslap

http://www.moredevs.ro/mysql-vs-mongodb-performance-benchmark/

...and the MySQL sample database, which contains 4 million records for
you to play with....

http://dev.mysql.com/doc/employee/en/index.html

Good luck!

--
IanB.
http://about.me/IanBarker
Arthur Hoornweg

Posts: 414
Registered: 6/2/98
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Oct 6, 2014 7:50 AM   in response to: Ian Barker
Ian Barker wrote:

Because it adds a layer over the top of the native protocol which can
slow things down dramatically - Google "ADO speed problem" and you'll
see lots of questions about that along with code tricks to mitigate it
in several different programming languages not just Delphi.
It looks like you're well ahead of the curve though so perhaps it's not
an issue for you?

ADO has never been a bottleneck of any kind to me. I believe that most "slowness" is simply due to the fact that people query much more data than they need, and then apply a filter that discards 99% of the output. They flush gigabytes through the network cable just to find a phone number and then complain it's slow! It would be much better to let the database handle the "heavy work" by specifying precise queries that return exactly the results one needs.

MySQL is more than capable of running with millions (plural) of
records.

Millions of records can be handled by any database nowadays. The amount of data that is written if a record is appended to a table with a billion records is tiny.

However, "Alter table" is a vastly different beast. If I decide that a table needs another column, the database core has to make some intelligent decisions. Copying the whole table into a new structure would be a very bad idea.


You might want to test it and see if you think under your scenario this
is going to work out acceptably.

Here are a few links to test scripts to insert 100,000 - 3,000,000
records into a MySQL database along with some performance analysing
techniques:

http://stackoverflow.com/questions/7367687/creating-a-very-large-mysql-database-from-php-script

https://www.digitalocean.com/community/tutorials/how-to-measure-mysql-query-performance-with-mysqlslap

http://www.moredevs.ro/mysql-vs-mongodb-performance-benchmark/

...and the MySQL sample database, which contains 4 million records for
you to play with....

http://dev.mysql.com/doc/employee/en/index.html

Good luck!

Thank you!

Regards,
Arthur

Wayne Niddery

Posts: 791
Registered: 4/14/98
Re: SQLite, suitable for large amounts of data? Connectivity?  
  Posted: Oct 6, 2014 6:45 PM   in response to: Ian Barker
"Ian Barker" <ian dot barker at gmail dot com> wrote in message
news:694898 at forums dot embarcadero dot com...

Because it adds a layer over the top of the native protocol which can
slow things down dramatically - Google "ADO speed problem" and you'll
see lots of questions about that along with code tricks to mitigate it
in several different programming languages not just Delphi.

That depends very much on the drivers involved. For MSSQL the latest
native drivers have actually gone back to the ODBC protocol, and they are
readily accessed via ADO. In this case ADO is actually a very thin
wrapper, giving you its interface over the very fast native ODBC drivers.
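
For example (ADOConnection1 is a TADOConnection; the driver name, server and database are just placeholders for whatever you have installed):

// SQL Server's ODBC driver, reached through ADO's ODBC provider (MSDASQL)
ADOConnection1.ConnectionString :=
  'Provider=MSDASQL;Driver={ODBC Driver 11 for SQL Server};' +
  'Server=MYSERVER;Database=mydb;Trusted_Connection=Yes';
ADOConnection1.LoginPrompt := False;
ADOConnection1.Open;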

--
Wayne Niddery
"You know what they call alternative medicine that has been proven to work?
Medicine." - Tim Minchin