INSERT vs UPDATE: MySQL, 7 million rows

basickarl

For those arguing that this is not a programming question: it ultimately is, since I have to 1) write the code and 2) consider the number of comparisons needed for each row in option #1. I am not asking how to write queries or how my database should be set up.

I have run into a bit of a pickle. I have a database which follows Google GTFS specs and I am now writing an automatic update program to service this database.

The database gets an overhaul once every 3 months. The smallest table has between 1 and 10 rows and the largest table contains 7 million rows. The rest have between 10,000 and 80,000.

The files my program downloads are .txt files, each of which maps to a table. In other words: stops.txt = database.stops. The tables use the InnoDB engine.

I have come up with 2 solutions.

  • 1) Compare every row id in the .txt files against what is currently in the database: if nothing has changed, do nothing; if something has changed, do an UPDATE.

  • 2) INSERT the downloaded files into their own tables (basically mirroring the live tables) and then switch table names. Example: database.stop_new and database.stop switch names.
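The two options above can be sketched roughly as follows. This is only an illustration under stated assumptions: `load_rows`, `diff_rows` and `swap_statement` are hypothetical helper names, the files are assumed to be comma-separated with a header row (as GTFS files are), and the `_old` suffix for the retired table is an assumption, not something from the question. The second option relies on the fact that MySQL performs a multi-table `RENAME TABLE` as one atomic operation.

```python
import csv
import io


def load_rows(text, key):
    """Parse a GTFS-style comma-separated .txt file into {id: row_dict}."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key]: row for row in reader}


def diff_rows(current, incoming):
    """Option 1: compare rows by id and report which ids need work."""
    to_insert = [k for k in incoming if k not in current]
    to_update = [k for k in incoming if k in current and incoming[k] != current[k]]
    to_delete = [k for k in current if k not in incoming]
    return to_insert, to_update, to_delete


def swap_statement(table):
    """Option 2: build one RENAME TABLE statement that swaps <table>_new in."""
    # The "_old" suffix is an assumption made for this sketch.
    return (
        f"RENAME TABLE `{table}` TO `{table}_old`, "
        f"`{table}_new` TO `{table}`"
    )


if __name__ == "__main__":
    live = load_rows("stop_id,stop_name\n1,Central\n2,Harbour\n", "stop_id")
    new = load_rows("stop_id,stop_name\n1,Central\n2,Harbor\n3,Airport\n", "stop_id")
    print(diff_rows(live, new))
    print(swap_statement("stop"))
```

With option 1, the cost grows with the number of rows compared (7 million for the largest table), while option 2 loads the new data off to the side and only touches the live table name at the moment of the swap.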

Another twist: the overhaul will be done on a date defined in one of the .txt files. This means it could happen on a Saturday during a peak holiday, perhaps, so users could be sending queries at any time.

Question: Which approach should I go for to ensure that nothing breaks and that the user experience is upheld? (I am leaning towards option 2...)

Hituptony

According to MySQL documentation...

MySQL uses table-level locking for MyISAM, MEMORY and MERGE tables, page-level locking for BDB tables, and row-level locking for InnoDB tables.

Since your tables are InnoDB, you will be able to lock individual rows instead of rendering the entire table unusable...

Table updates are given higher priority than table retrievals.

Updates take priority over selects, and the locking is also based on key values, so I think this would be a much better option.

> Advantages of row-level locking:

Fewer lock conflicts when different sessions access different rows

Fewer changes for rollbacks

Possible to lock a single row for a long time

> Disadvantages of row-level locking:

Requires more memory than page-level or table-level locks

Slower than page-level or table-level locks when used on a large part of the table because you must acquire many more locks

Slower than other locks if you often do GROUP BY operations on a large part of the data or if you must scan the entire table frequently

However, according to the MySQL documentation, table locks are superior to row-level locks in certain cases, for example when most statements for the table are reads...

Another option...

Instead of using row-level locks, you can employ application-level locks, such as those provided by GET_LOCK() and RELEASE_LOCK() in MySQL. These are advisory locks, so they work only with applications that cooperate with each other. See Section 12.14, “Miscellaneous Functions”.
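A minimal sketch of how the update program could wrap its work in such an advisory lock is shown below. It assumes a DB-API cursor from a MySQL driver that uses the `%s` parameter style (PyMySQL and mysqlclient do); `advisory_lock` and the lock name `gtfs_update` are hypothetical names chosen for this example. `GET_LOCK()` returns 1 when the lock is obtained, 0 on timeout and NULL on error.

```python
from contextlib import contextmanager


@contextmanager
def advisory_lock(cursor, name, timeout_seconds=10):
    """Hold the MySQL advisory lock `name` for the duration of the block."""
    cursor.execute("SELECT GET_LOCK(%s, %s)", (name, timeout_seconds))
    (acquired,) = cursor.fetchone()
    if acquired != 1:
        # 0 means the wait timed out, NULL (None) means an error occurred.
        raise TimeoutError(f"could not acquire advisory lock {name!r}")
    try:
        yield
    finally:
        # Release the lock even if the body raised an exception.
        cursor.execute("SELECT RELEASE_LOCK(%s)", (name,))
        cursor.fetchone()


# Usage, given an open cursor from the driver of your choice:
#
#     with advisory_lock(cursor, "gtfs_update", timeout_seconds=5):
#         ...  # run the overhaul here
```

Because the lock is advisory, it only protects against other code that asks for the same lock name; ordinary user queries are not blocked by it.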
