Вопрос. Почему производительность MySQL снижается для запросов, соединяющих почти пустые таблицы при параллельном выполнении?
Ниже приводится более подробное объяснение проблемы, с которой я столкнулся. У меня две таблицы в MySQL
CREATE TABLE first (
num int(10) NOT NULL,
UNIQUE KEY key_num (num)
) ENGINE=InnoDB
CREATE TABLE second (
num int(10) NOT NULL,
num2 int(10) NOT NULL,
UNIQUE KEY key_num (num, num2)
) ENGINE=InnoDB
Первый содержит около тысячи записей. Второй - пустой или содержит очень мало записей. Он также содержит двойной индекс, который каким-то образом связан с проблемой: проблема исчезает для одного индекса. Теперь я пытаюсь сделать много одинаковых запросов к этим таблицам параллельно. Каждый запрос выглядит так:
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = -1 # non-existent key
LEFT JOIN second AS second_2 ON second_2.num = -2 # non-existent key
LEFT JOIN second AS second_3 ON second_3.num = -3 # non-existent key
LEFT JOIN second AS second_4 ON second_4.num = -4 # non-existent key
LEFT JOIN second AS second_5 ON second_5.num = -5 # non-existent key
LEFT JOIN second AS second_6 ON second_6.num = -6 # non-existent key
WHERE second_1.num IS NULL
AND second_2.num IS NULL
AND second_3.num IS NULL
AND second_4.num IS NULL
AND second_5.num IS NULL
AND second_6.num IS NULL
Проблема, которую я получаю, заключается в том, что вместо того, чтобы иметь почти линейное повышение производительности на 8-ядерном компьютере, у меня действительно есть падение. А именно, имея один процесс, типичное количество запросов в секунду у меня составляет около 200. Имея два процесса вместо ожидаемого увеличения до 300 - 400 запросов в секунду, у меня фактически есть падение до 150. Для 10 процессов у меня есть только 70 запросов в секундах. Код Perl, который я использую для тестирования, показан ниже:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Parallel::Benchmark;
use SQL::Abstract;
use SQL::Abstract::Plugin::InsertMulti;
my $children_dbh;
foreach my $second_table_row_count (0, 1, 1000) {
print '#' x 80, "\nsecond_table_row_count = $second_table_row_count\n";
create_and_fill_tables(1000, $second_table_row_count);
foreach my $concurrency (1, 2, 3, 4, 6, 8, 10, 20) {
my $bm = Parallel::Benchmark->new(
'benchmark' => sub {
_run_sql();
return 1;
},
'concurrency' => $concurrency,
'time' => 3,
);
my $result = $bm->run();
}
}
sub create_and_fill_tables {
my ($first_table_row_count, $second_table_row_count) = @_;
my $dbh = dbi_connect();
{
$dbh->do(q{DROP TABLE IF EXISTS first});
$dbh->do(q{
CREATE TABLE first (
num int(10) NOT NULL,
UNIQUE KEY key_num (num)
) ENGINE=InnoDB
});
if ($first_table_row_count) {
my ($stmt, @bind) = SQL::Abstract->new()->insert_multi(
'first',
['num'],
[map {[$_]} 1 .. $first_table_row_count],
);
$dbh->do($stmt, undef, @bind);
}
}
{
$dbh->do(q{DROP TABLE IF EXISTS second});
$dbh->do(q{
CREATE TABLE second (
num int(10) NOT NULL,
num2 int(10) NOT NULL,
UNIQUE KEY key_num (num, num2)
) ENGINE=InnoDB
});
if ($second_table_row_count) {
my ($stmt, @bind) = SQL::Abstract->new()->insert_multi(
'second',
['num'],
[map {[$_]} 1 .. $second_table_row_count],
);
$dbh->do($stmt, undef, @bind);
}
}
}
sub _run_sql {
$children_dbh ||= dbi_connect();
$children_dbh->selectall_arrayref(q{
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = -1
LEFT JOIN second AS second_2 ON second_2.num = -2
LEFT JOIN second AS second_3 ON second_3.num = -3
LEFT JOIN second AS second_4 ON second_4.num = -4
LEFT JOIN second AS second_5 ON second_5.num = -5
LEFT JOIN second AS second_6 ON second_6.num = -6
WHERE second_1.num IS NULL
AND second_2.num IS NULL
AND second_3.num IS NULL
AND second_4.num IS NULL
AND second_5.num IS NULL
AND second_6.num IS NULL
});
}
sub dbi_connect {
return DBI->connect(
'dbi:mysql:'
. 'database=tmp'
. ';host=localhost'
. ';port=3306',
'root',
'',
);
}
И для сравнения таких запросов, которые выполняются одновременно с увеличением производительности:
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = 1 # existent key
LEFT JOIN second AS second_2 ON second_2.num = 2 # existent key
LEFT JOIN second AS second_3 ON second_3.num = 3 # existent key
LEFT JOIN second AS second_4 ON second_4.num = 4 # existent key
LEFT JOIN second AS second_5 ON second_5.num = 5 # existent key
LEFT JOIN second AS second_6 ON second_6.num = 6 # existent key
WHERE second_1.num IS NOT NULL
AND second_2.num IS NOT NULL
AND second_3.num IS NOT NULL
AND second_4.num IS NOT NULL
AND second_5.num IS NOT NULL
AND second_6.num IS NOT NULL
Результаты тестирования, измерения использования процессора и диска приведены здесь:
* table `first` have 1000 rows
* table `second` have 6 rows: `[1,1],[2,2],..[6,6]`
For query:
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = -1 # non-existent key
LEFT JOIN second AS second_2 ON second_2.num = -2 # non-existent key
LEFT JOIN second AS second_3 ON second_3.num = -3 # non-existent key
LEFT JOIN second AS second_4 ON second_4.num = -4 # non-existent key
LEFT JOIN second AS second_5 ON second_5.num = -5 # non-existent key
LEFT JOIN second AS second_6 ON second_6.num = -6 # non-existent key
WHERE second_1.num IS NULL
AND second_2.num IS NULL
AND second_3.num IS NULL
AND second_4.num IS NULL
AND second_5.num IS NULL
AND second_6.num IS NULL
Results:
concurrency: 1, speed: 162.910 / sec
concurrency: 2, speed: 137.818 / sec
concurrency: 3, speed: 130.728 / sec
concurrency: 4, speed: 107.387 / sec
concurrency: 6, speed: 90.513 / sec
concurrency: 8, speed: 80.445 / sec
concurrency: 10, speed: 80.381 / sec
concurrency: 20, speed: 84.069 / sec
System usage after for last 60 minutes of running query in 6 processes:
$ iostat -cdkx 60
avg-cpu: %user %nice %system %iowait %steal %idle
74.82 0.00 0.08 0.00 0.08 25.02
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 0.12 0.00 0.80 13.71 0.00 1.43 1.43 0.02
sdf10 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf4 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 30.00 15.00 0.05
sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf8 0.00 0.00 0.00 0.37 0.00 1.24 6.77 0.00 5.00 3.18 0.12
sdf6 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf9 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf3 0.00 0.00 0.00 0.08 0.00 1.33 32.00 0.00 4.00 4.00 0.03
sdf2 0.00 0.00 0.00 0.17 0.00 1.37 16.50 0.00 3.00 3.00 0.05
sdf15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf1 0.00 0.00 0.00 0.05 0.00 0.40 16.00 0.00 0.00 0.00 0.00
sdf13 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf5 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 50.00 25.00 0.08
sdm2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdm1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf12 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf11 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf7 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
md0 0.00 0.00 0.00 0.97 0.00 13.95 28.86 0.00 0.00 0.00 0.00
################################################################################
For query:
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = 1 # existent key
LEFT JOIN second AS second_2 ON second_2.num = 2 # existent key
LEFT JOIN second AS second_3 ON second_3.num = 3 # existent key
LEFT JOIN second AS second_4 ON second_4.num = 4 # existent key
LEFT JOIN second AS second_5 ON second_5.num = 5 # existent key
LEFT JOIN second AS second_6 ON second_6.num = 6 # existent key
WHERE second_1.num IS NOT NULL
AND second_2.num IS NOT NULL
AND second_3.num IS NOT NULL
AND second_4.num IS NOT NULL
AND second_5.num IS NOT NULL
AND second_6.num IS NOT NULL
Results:
concurrency: 1, speed: 875.973 / sec
concurrency: 2, speed: 944.986 / sec
concurrency: 3, speed: 1256.072 / sec
concurrency: 4, speed: 1401.657 / sec
concurrency: 6, speed: 1354.351 / sec
concurrency: 8, speed: 1110.100 / sec
concurrency: 10, speed: 1145.251 / sec
concurrency: 20, speed: 1142.514 / sec
System usage after for last 60 minutes of running query in 6 processes:
$ iostat -cdkx 60
avg-cpu: %user %nice %system %iowait %steal %idle
74.40 0.00 0.53 0.00 0.06 25.01
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 0.02 0.00 0.13 16.00 0.00 0.00 0.00 0.00
sdf10 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf4 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf8 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf6 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 0.00 0.00 0.00
sdf9 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf3 0.00 0.00 0.00 0.13 0.00 2.67 40.00 0.00 3.75 2.50 0.03
sdf2 0.00 0.00 0.00 0.23 0.00 2.72 23.29 0.00 2.14 1.43 0.03
sdf15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf14 0.00 0.00 0.00 0.98 0.00 0.54 1.10 0.00 2.71 2.71 0.27
sdf1 0.00 0.00 0.00 0.08 0.00 1.47 35.20 0.00 8.00 6.00 0.05
sdf13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf5 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
sdm2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdm1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf12 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf11 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 0.00 0.00 0.00
sdf7 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02
md0 0.00 0.00 0.00 1.70 0.00 15.92 18.74 0.00 0.00 0.00 0.00
################################################################################
And this server has lots of free memory. Example of top:
top - 19:02:59 up 4:23, 4 users, load average: 4.43, 3.03, 2.01
Tasks: 218 total, 1 running, 217 sleeping, 0 stopped, 0 zombie
Cpu(s): 72.8%us, 0.7%sy, 0.0%ni, 26.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.1%st
Mem: 71701416k total, 22183980k used, 49517436k free, 284k buffers
Swap: 0k total, 0k used, 0k free, 1282768k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2506 mysql 20 0 51.7g 17g 5920 S 590 25.8 213:15.12 mysqld
9348 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.45 perl
9349 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.44 perl
9350 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.45 perl
9351 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl
9352 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl
9353 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl
9346 topadver 20 0 19340 1504 1064 R 0 0.0 0:01.89 top
Есть ли у кого-нибудь идея, почему производительность снизилась для запроса с несуществующими ключами?