I have developed my first application with VB.NET at work (junior developer) and basically it reads several thousands of lines from X number of files, and inserts them into a mysql database.
The application consists of several steps:
1) Dump all lines with all fields into a temp table.
a. This works fine. Optimal speed for the hardware we have.
2) Query temp table to obtain certain. I query by a unique ID. I insert all unique ID (isn field). If the ID matches my interests, I insert it into an ArrayList.
a. Query: SELECT COUNT(SUBSTR(ISN,2)), SUBSTR(isn, 2) FROM SMS.TEMPDATADUMP WHERE error = 0 GROUP BY SUBSTR(ISN,2) HAVING COUNT(SUBSTR(ISN,2)) = 4
b. The isn is not unique per line, but per data (sms)
c. Once I have all isn on an arraylist, I do the following query:
i. SELECT select, my, fields,of,interest FROM SMS.TEMPDATADUMP WHERE SUBSTR(ISN, 2) = '" & isn & "' ORDER BY type LIMIT 1
d. To obtain some data. During the process I query around 10 times other table per ISN.
e. Here is the problem. If I have a few files to process (around 3000-4000 lines in total, small array) this steps work fine, good speed. But If I have big files or a lot of files (more than 10000 lines in total, big array), this steps are incredibly slow. Queries and inserts are too slow. Meaning, one-two inserts per second, while the other case inserts are around 800 per second.
Our hardware is not optimized for database server, but I don?t have other choice. It is mostly a desktop computer
Intel core i5, windows 32 bits, 3GB RAM, one disk 500GB 7200rpm.
I have tried some optimizations commented in mysqlperformance blog without success.
This site manages and broadcasts several email lists pertaining to Lasso Programming and technologies related and used by Lasso developers. Sign up today!