7

I have a table like this

CREATE TABLE IF NOT EXISTS `dnddata` (
  `numbers` varchar(10) NOT NULL,
  `opstype` char(1) NOT NULL,
  PRIMARY KEY (`numbers`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
/*!50100 PARTITION BY KEY (numbers)
PARTITIONS 25 */;

i have to insert 300 million records. i am inserting 10 million records each time using load data in file from csv file.

To insert 10 million records taking nearly 5 min first time. Time is increasing each time gradually. after 30 million records it stops inserting and memory using 100% server not responding.

below my my.cnf file setting

bulk_insert_buffer_size = 100M
key_buffer = 100M
sort_buffer_size = 50M
read_buffer = 50M

i am using cpu with 2 G memory.

details for 30 million records

    Space usage
Type    Usage
Data    545.3   MiB
Index   694.8   MiB
Total   1,240.1 MiB

MySQL client version: 5.5.14

with out index it is inserting fine 10 million in 50 sec.

Please tell me what kind of setting need to change.

Edit based on user answers

I have changed my.cnf setting to below

key_buffer_size = 1G
sort_buffer_size = 4M
read_buffer_size = 1M
read_rnd_buffer_size = 1M
join_buffer_size = 2M
bulk_insert_buffer_size = 32M
myisam_sort_buffer_size = 256M

no use. issue is not resolved.

I have tried below mentioned methods to load data

set autocommit = 0; //for innodb
load data infile into …
COMMIT;

START TRANSACTION; 
load data infile into …
COMMIT;

ALTER TABLE dnddata DISABLE KEYS;
load data infile into …
ALTER TABLE dnddata ENABLE KEYS;

No luck..

sankar.suda
  • 223
  • 1
  • 3
  • 6
  • thank you every body, along with feedback from users. we have also increase my ram size to 4gb. it is working fine now.. – sankar.suda Jul 26 '12 at 07:32
  • Is numbers being loaded in order? Or randomly? This will make a big difference in the building of PRIMARY KEY(numbers). – Rick James Jul 01 '18 at 17:25

4 Answers4

6

You should load such a huge file in chunk for faster loading of data, Here it is mentioned problems loading huge Load local data file and solution how to solve it and make it faster.

http://www.mysqlperformanceblog.com/2008/07/03/how-to-load-large-files-safely-into-innodb-with-load-data-infile/

  • MyISAM tables you should follow this steps to fasten loading data:

With some extra work, it is possible to make LOAD DATA INFILE run even faster for a MyISAM table when the table has many indexes.

  1. Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.

  2. Use myisamchk --keys-used=0 -rq /path/to/db/tbl_name to remove all use of indexes for the table.

  3. Insert data into the table with LOAD DATA INFILE. This does not update any indexes and therefore is very fast.

  4. Re-create the indexes with myisamchk -rq /path/to/db/tbl_name. This creates the index tree in memory before writing it to disk, which is much faster that updating the index during LOAD DATA INFILE because it avoids lots of disk seeks. The resulting index tree is also perfectly balanced.

  5. Execute a FLUSH TABLES statement or a mysqladmin flush-tables command.

LOAD DATA INFILE performs the preceding optimization automatically if the MyISAM table into which you insert data is empty. The main difference between automatic optimization and using the procedure explicitly is that you can let myisamchk allocate much more temporary memory for the index creation than you might want the server to allocate for index re-creation when it executes the LOAD DATA INFILE statement.

You can also disable or enable the nonunique indexes for a MyISAM table by using the following statements rather than myisamchk. If you use these statements, you can skip the FLUSH TABLE operations:

ALTER TABLE tbl_name DISABLE KEYS;
ALTER TABLE tbl_name ENABLE KEYS;

For more information refer these:

http://dev.mysql.com/doc/refman/5.5/en/optimizing-myisam-bulk-data-loading.html

https://wiki.rice.edu/confluence/display/~as43/Make+LOAD+DATA+INFILE+run+even+faster+for+a+MyISAM+table

Mahesh Patil
  • 3,056
  • 2
  • 16
  • 23
2

I didn't see any indexes in your CREATE TABLE command (other than the primary key). If you just left those out of the question and they really do exist, then you can try disabling indexes prior to the LOAD DATA command and rebuild them after it's complete:

ALTER TABLE dnddata DISABLE KEYS;

LOAD DATA INFILE ... ;

ALTER TABLE dnddata ENABLE KEYS;

As your table grows, the time to write indexes will also increase.

Derek Downey
  • 23,440
  • 11
  • 78
  • 104
1

Since you are dealing with a MyISAM table

CREATE TABLE dndtemp ENGINE=MyISAM SELECT * FROM dnddata WHERE 1=2;
LOAD DATA INFILE ... INTO TABLE dndtemp ... ;
INSERT INTO dnddata SELECT * FROM dndtemp;
DROP TABLE dndtemp;

The temp table has no indexes, so the LOAD DATA INFILE should be faster

DISABLE KEYS / ENABLE KEYS does not help in this instance because there are no secondary indexes.

Your key buffer can be resized to 768M for now because the sum of your MyISAM indexes is 694.8 MiB and MyISAM only caches index pages.

Give it a Try !!!

UPDATE 2012-07-25 10:06 EDT

You should increase your bulk_insert_buffer_size to 512M because it accommodates bulk loading of MyISAM tables. According to the MySQL Documentation on bulk_insert_buffer_size:

MyISAM uses a special tree-like cache to make bulk inserts faster for INSERT ... SELECT, INSERT ... VALUES (...), (...), ..., and LOAD DATA INFILE when adding data to nonempty tables. This variable limits the size of the cache tree in bytes per thread. Setting it to 0 disables this optimization. The default value is 8MB.

In light of this and of you only having 2GB of RAM (how does DB even survive now?), here are my recommended settings

[mysqld]
bulk_insert_buffer_size=512M
key_buffer_size=512M
RolandoMySQLDBA
  • 182,700
  • 33
  • 317
  • 520
  • hi RolandoMySQLDBA, I have already tried this method. this is taking very long time than load data infile. – sankar.suda Jul 25 '12 at 04:07
  • The PRIMARY KEY in MyISAM is essentially a "secondary" key. Hence, I disagree with your bold sentence. – Rick James Jul 01 '18 at 17:27
  • Don't increase both the key_buffer and the bulk_insert_buffer if you stay with only 2GB of RAM. Swapping will be worse than having smaller buffers. – Rick James Jul 01 '18 at 17:29
  • @RickJames a PRIMARY KEY is not a Secondary Index at all. Why ??? 1) C.J. Date taught that a PRIMARY KEY is Unique and arbitrary default unique index amongst other Unique Keys (See https://en.wikipedia.org/wiki/Candidate_key, https://en.wikipedia.org/wiki/Primary_key, https://dba.stackexchange.com/a/1366/877), 2) MySQL Documentation (https://dev.mysql.com/doc/refman/8.0/en/innodb-index-types.html) separates Clustered Index (which a PRIMARY KEY is) from Secondary Index : All indexes other than the clustered index are known as secondary indexes.. Therefore, you disagee with MySQL Doc. – RolandoMySQLDBA Jul 02 '18 at 16:54
  • @RickJames I know what I just stated clearly applies to InnoDB, but this is a MyISAM question. is there documentation that clearly states that a PRIMARY KEY is just like a Secondary Index ??? If that is indeed the case, then (in theory) DISABLE KEYS should work on a PRIMARY KEY and all other unique indexes. By design in MyISAM, DISABLE KEYS should bypass disabling PRIMARY KEY and Unique Keys. This was the point of original statement DISABLE KEYS / ENABLE KEYS does not help in this instance because there are no secondary indexes. – RolandoMySQLDBA Jul 02 '18 at 17:00
  • @RolandoMySQLDBA C.J.Date did not implement MyISAM. MyISAM's PK is a implemented as a BTree in exactly the same way as all other secondary keys. Plus it has the attributes of being a Uniqueness constraint (a la UNIQUE INDEX) and there can be only one PK. You seem to be mixing up MyISAM and InnoDB -- they are very different in their implementation of PK. All MyISAM indexes contain a pointer to the data -- either a row number (for FIXED) or a byte offset into the .MYD. – Rick James Jul 02 '18 at 17:02
  • Ouch - "The MyISAM storage engine already has support for unique columns without an index (it uses a hash-based system), but the mechanism isn't exposed at the SQL level yet." -- WRONG. That was from High Performance MySQL (ver 1?) – Rick James Jul 02 '18 at 17:06
  • The next paragraph is correct: "With MyISAM tables, the indexes are kept in a completely separate file that contains a list of primary (and possibly secondary) keys and a value that represents the byte offset for the record. These ensure MySQL can find and then quickly skip to that point within the database to locate the record. MySQL has to store the indexes this way because the records are stored in essentially random order." – Rick James Jul 02 '18 at 17:08
  • " In MyISAM, the primary index and secondary index (Secondary key) there is no difference in the structure, is the main index for key is the only, and the auxiliary index key can be repeated." -- https://www.programering.com/a/MTMwAzMwATM.html – Rick James Jul 02 '18 at 17:09
  • @RickJames BTree design is definitely not the issue for this question. Whoever implemented indexing in MyISAM defined the PRIMARY KEY and Unique Keys to be bypassed with DISABLE KEYS. Thus, DISABLE KEYS / ENABLE KEYS does not help in this instance because there are no secondary indexes.. This is absolutely true regardless of the storage layer because the storage engine layer has the final say. – RolandoMySQLDBA Jul 02 '18 at 17:10
  • Please note even the quote you gave me : With MyISAM tables, the indexes are kept in a completely separate file that contains a list of primary (and possibly secondary) keys and a value that represents the byte offset for the record.. Note that this sentence separates primary and secondary indexes just like the gen_clust_index page I quoted. Thus, it goes back to the question. I answered this question by stating the ineffectiveness of DISABLE KEYS. – RolandoMySQLDBA Jul 02 '18 at 17:29
1

IMHO, 2GB of RAM is not enough for MySQL to efficiently deal with 30 million rows.

Nonetheless, your key_buffer is waaaay too small. If this box is a dedicated database box, and all you have is 2G RAM, you could safely up that to 1G and not swap to disk as often.

key_buffer = 1GB 
randomx
  • 3,934
  • 4
  • 30
  • 43