Dear MySQL Users,
MySQL Cluster is the distributed, shared-nothing variant of MySQL.
This storage engine provides:
- In-Memory storage - Real-time performance (with optional
checkpointing to disk)
- Transparent Auto-Sharding - Read & write scalability
- Active-Active/Multi-Master geographic replication
- 99.999% High Availability with no single point of failure
and on-line maintenance
- NoSQL and SQL APIs (including C++, Java, http, Memcached
and JavaScript/Node.js)
MySQL Cluster 7.6.4-dmr has been released and can be downloaded from
http://www.mysql.com/downloads/cluster/
where you will also find Quick Start guides to help you get your
first MySQL Cluster database up and running.
The release notes are available from
http://dev.mysql.com/doc/relnotes/mysql-cluster/7.6/en/index.html
MySQL Cluster enables users to meet the database challenges of next
generation web, cloud, and communications services with uncompromising
scalability, uptime and agility.
More details can be found at
http://www.mysql.com/products/cluster/
Enjoy !
Changes in MySQL NDB Cluster 7.6.4 (5.7.20-ndb-7.6.4) (2018-01-31,
Development Milestone 4)
MySQL NDB Cluster 7.6.4 is a new release of NDB 7.6, based on
MySQL Server 5.7 and including features in version 7.6 of the
NDB storage engine, as well as fixing recently discovered
bugs in previous NDB Cluster releases.
Obtaining NDB Cluster 7.6. NDB Cluster 7.6 source code and
binaries can be obtained from
http://dev.mysql.com/downloads/cluster/.
For an overview of changes made in NDB Cluster 7.6, see What
is New in NDB Cluster 7.6
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-what-is-new-7-6.html).
This release also incorporates all bug fixes and changes made
in previous NDB Cluster releases, as well as all bug fixes
and feature changes which were added in mainline MySQL 5.7
through MySQL 5.7.20 (see Changes in MySQL 5.7.20
(2017-10-16, General Availability)
(http://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-20.html)).
* Functionality Added or Changed
* Bugs Fixed
Functionality Added or Changed
* Incompatible Change; NDB Disk Data: Due to changes in
disk file formats, it is necessary to perform an
--initial restart of each data node when upgrading to or
downgrading from this release.
* Important Change; NDB Disk Data: NDB Cluster has improved
node restart times and overall performance with larger
data sets by implementing partial local checkpoints.
Prior to this release, an LCP always made a copy of the
entire database.
NDB now supports LCPs that write individual records, so
it is no longer strictly necessary for an LCP to write
the entire database. Since, at recovery, it remains
necessary to restore the database fully, the strategy is
to save one fourth of all records at each LCP, as well as
to write the records that have changed since the last
LCP.
Two data node configuration parameters relating to this
change are introduced in this release: EnablePartialLcp
(default true, or enabled) enables partial LCPs. When
partial LCPs are enabled, RecoveryWork controls the
percentage of space given over to LCPs; it increases with
the amount of work which must be performed on LCPs during
restarts as opposed to that performed during normal
operations. Raising this value causes LCPs during normal
operations to require writing fewer records and so
decreases the usual workload. Raising this value also
means that restarts can take longer.
Important
Upgrading disk data tables to NDB 7.6.4 or downgrading
them from this release requires an initial restart of
each data node. An initial node restart still requires a
complete LCP; a partial LCP is not used for this purpose.
This release also deprecates the data node configuration
parameters BackupDataBufferSize, BackupWriteSize, and
BackupMaxWriteSize; these are now subject to removal in a
future NDB Cluster release.
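As a rough illustration of the partial LCP strategy described
above, the following Python sketch models a database as a set
of row IDs divided into four parts. It is a toy model, not
NDB's implementation: the function name, the modulo-based
partitioning, and the fixed four-part rotation are assumptions
made for this example. Each simulated LCP writes one part in
full plus any rows changed since the previous LCP, so four
consecutive LCPs between them contain every row.

```python
# Toy model of partial local checkpoints (LCPs); not NDB's actual algorithm.
# Assumption: rows are split into 4 parts by row_id % 4, and LCP number n
# writes part n % 4 in full plus every row changed since the previous LCP.

PARTS = 4  # "one fourth of all records at each LCP"


def rows_written_by_lcp(lcp_number, all_rows, changed_rows):
    """Return the set of row IDs that a simulated partial LCP writes."""
    full_part = {r for r in all_rows if r % PARTS == lcp_number % PARTS}
    return full_part | (set(changed_rows) & set(all_rows))


all_rows = set(range(20))
# Hypothetical workload: rows changed before each of four consecutive LCPs.
changes = [{1, 2}, {7}, set(), {2, 19}]

covered = set()
for n, changed in enumerate(changes):
    written = rows_written_by_lcp(n, all_rows, changed)
    covered |= written
    print(f"LCP {n}: wrote {len(written)} of {len(all_rows)} rows")

# After four consecutive LCPs every row has been written at least once.
print("all rows covered:", covered == all_rows)
```

In this model each LCP writes far fewer rows than a full
checkpoint would, while a restore that reads the last four
LCPs still sees every row.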
* Important Change: Added the ndb_perror utility for
obtaining information about NDB Cluster error codes. This
tool replaces perror --ndb; the --ndb option for perror
is now deprecated and raises a warning when used; the
option is subject to removal in a future NDB release.
See ndb_perror --- Obtain NDB error message information
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-programs-ndb-perror.html),
for more information. (Bug
#81703, Bug #81704, Bug #23523869, Bug #23523926)
References: See also: Bug #26966826, Bug #88086.
* NDB Client Programs: NDB Cluster Auto-Installer node
configuration parameters as supported in the UI and
accompanying documentation were in some cases hard coded
to an arbitrary value, or were missing altogether.
Configuration parameters, their default values, and the
documentation have been better aligned with those found
in release versions of the NDB Cluster software.
One necessary addition to this task was implementing the
mechanism which the Auto-Installer now provides for
setting parameters that take discrete values. For
example, the value of the data node parameter Arbitration
must now be one of Default, Disabled, or WaitExternal.
The Auto-Installer also now gets and uses the amount of
disk space available to NDB on each host for deriving
reasonable default values for configuration parameters
which depend on this value.
See The NDB Cluster Auto-Installer
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-install-auto.html),
for more information.
* NDB Client Programs: Secure connection support in the
MySQL NDB Cluster Auto-Installer has been updated or
improved in this release as follows:
+ Added a mechanism for setting SSH membership on a
per-host basis.
+ Updated the Paramiko Python module to the most
recent available version (2.6.1).
+ Provided a place in the GUI for encrypted private
key passwords, and discontinued use of hardcoded
passwords.
Related enhancements implemented in the current release
include the following:
+ Discontinued use of cookies as a persistent store
for NDB Cluster configuration information; these
were not secure and came with a hard upper limit on
storage. Now the Auto-Installer uses an encrypted
file for this purpose.
+ In order to secure data transfer between the web
browser front end and the back end web server, the
default communications protocol has been switched
from HTTP to HTTPS.
See The NDB Cluster Auto-Installer
(http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-install-auto.html),
for more information.
* It is now possible to specify a set of cores to be used
for I/O threads performing offline multithreaded builds
of ordered indexes, as opposed to normal I/O duties such
as file I/O, compression, or decompression. "Offline" in
this context refers to building of ordered indexes
performed when the parent table is not being written to;
such building takes place when an NDB cluster performs a
node or system restart, or as part of restoring a cluster
from backup using ndb_restore --rebuild-indexes.
In addition, the default behaviour for offline index
build work is modified to use all cores available to
ndbmtd, rather than limiting itself to the core reserved
for the I/O thread. Doing so can improve restart and
restore times as well as performance, availability, and
the user experience.
This enhancement is implemented as follows:
1. The default value for BuildIndexThreads is changed
from 0 to 128. This means that offline ordered index
builds are now multithreaded by default.
2. The default value for TwoPassInitialNodeRestartCopy
is changed from false to true. This means that an
initial node restart first copies all data from a
"live" node to the one that is starting (without
creating any indexes), then builds ordered indexes
offline, and then synchronizes its data with the
live node again; that is, it synchronizes twice and
builds indexes offline between the two
synchronizations. This causes an initial node restart
to behave more like the normal restart of a node,
and reduces the time required for building indexes.
3. A new thread type (idxbld) is defined for the
ThreadConfig configuration parameter, to allow
locking of offline index build threads to specific
CPUs.
In addition, NDB now distinguishes the thread types that
are accessible to "ThreadConfig" by the following two
criteria:
1. Whether the thread is an execution thread. Threads
of types main, ldm, recv, rep, tc, and send are
execution threads; thread types io, watchdog, and
idxbld are not.
2. Whether the allocation of the thread to a given task
is permanent or temporary. Currently all thread
types except idxbld are permanent.
For additional information, see the descriptions of the
parameters in the Manual. (Bug #25835748, Bug #26928111)
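The two criteria just listed can be summarized as a small
lookup table. The Python sketch below is only an illustration:
the thread type names and their classification come from the
text above, while the set names and the describe() helper are
invented for this example and are not part of NDB.

```python
# Thread types accessible to ThreadConfig, classified by the two criteria
# listed above. The set names and the helper function are illustrative only.

EXECUTION_THREADS = {"main", "ldm", "recv", "rep", "tc", "send"}
NON_EXECUTION_THREADS = {"io", "watchdog", "idxbld"}
TEMPORARY_THREADS = {"idxbld"}  # all other thread types are permanent


def describe(thread_type):
    """Return (is_execution_thread, allocation) for a ThreadConfig type."""
    if thread_type not in EXECUTION_THREADS | NON_EXECUTION_THREADS:
        raise ValueError(f"unknown thread type: {thread_type}")
    allocation = "temporary" if thread_type in TEMPORARY_THREADS else "permanent"
    return thread_type in EXECUTION_THREADS, allocation


for t in sorted(EXECUTION_THREADS | NON_EXECUTION_THREADS):
    is_exec, allocation = describe(t)
    print(f"{t:8} execution={is_exec!s:5} allocation={allocation}")
```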
* Added the ODirectSyncFlag configuration parameter for
data nodes. When enabled, the data node treats all
completed filesystem writes to the redo log as though
they had been performed using fsync.
Note
This parameter has no effect if at least one of the
following conditions is true:
+ ODirect is not enabled.
+ InitFragmentLogFiles is set to SPARSE.
(Bug #25428560)
* Added the ndbinfo.error_messages table, which provides
information about NDB Cluster errors, including error
codes, status types, brief descriptions, and
classifications. This makes it possible to obtain error
information using SQL in the mysql client (or other MySQL
client program), like this:
mysql> SELECT * FROM ndbinfo.error_messages WHERE error_code='321';
+------------+----------------------+-----------------+----------------------+
| error_code | error_description    | error_status    | error_classification |
+------------+----------------------+-----------------+----------------------+
|        321 | Invalid nodegroup id | Permanent error | Application error    |
+------------+----------------------+-----------------+----------------------+
1 row in set (0.00 sec)
The query just shown provides equivalent information to
that obtained by issuing ndb_perror 321 or (now
deprecated) perror --ndb 321 on the command line. (Bug
#86295, Bug #26048272)
* When executing a scan as a pushed join, all instances of
DBSPJ were involved in the execution of a single query;
some of these received multiple requests from the same
query. This situation is improved by enabling a single
SPJ request to handle a set of root fragments to be
scanned, such that only a single SPJ request is sent to
each DBSPJ instance on each node. Because batch sizes are
allocated per fragment, the multi-fragment scan can
obtain a larger total batch size, allowing for some
scheduling optimizations to be done within DBSPJ, which
can scan a single fragment at a time (giving it the total
batch size allocation), scan all fragments in parallel
using smaller sub-batches, or use some combination of the
two.
Since the effect of this change is generally to require
fewer SPJ requests and instances, performance of
pushed-down joins should be improved in many cases.
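The scheduling choices described above can be pictured with a
small Python sketch. This is a simplified model under stated
assumptions, not DBSPJ's scheduler: the plan_rounds() function
and its parallelism argument are hypothetical, and the total
batch size is assumed to be the per-fragment batch size
multiplied by the number of fragments in the request.

```python
# Toy model of batch allocation for a multi-fragment scan request.
# Assumption: total batch = per-fragment batch * number of fragments, and
# fragments are scanned `parallelism` at a time, sharing the total equally.


def plan_rounds(fragments, per_fragment_batch, parallelism):
    """Return a list of rounds; each round maps fragment -> batch size."""
    if not 1 <= parallelism <= len(fragments):
        raise ValueError("parallelism must be between 1 and len(fragments)")
    total = per_fragment_batch * len(fragments)
    sub_batch = total // parallelism
    rounds = []
    for start in range(0, len(fragments), parallelism):
        group = fragments[start:start + parallelism]
        rounds.append({frag: sub_batch for frag in group})
    return rounds


frags = [0, 1, 2, 3]
print(plan_rounds(frags, 16, 1))  # one fragment at a time, full allocation
print(plan_rounds(frags, 16, 4))  # all fragments in parallel, sub-batches
print(plan_rounds(frags, 16, 2))  # a combination of the two
```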
* As part of work ongoing to optimize bulk DDL performance
by ndbmtd, it is now possible to obtain performance
improvements by increasing the batch size for the bulk
data parts of DDL operations which process all of the
data in a fragment or set of fragments using a scan.
Batch sizes are now made configurable for unique index
builds, foreign key builds, and online reorganization, by
setting the respective data node configuration parameters
listed here:
+ MaxFKBuildBatchSize: Maximum scan batch size used
for building foreign keys.
+ MaxReorgBuildBatchSize: Maximum scan batch size used
for reorganization of table partitions.
+ MaxUIBuildBatchSize: Maximum scan batch size used
for building unique keys.
For each of the parameters just listed, the default value
is 64, the minimum is 16, and the maximum is 512.
Increasing the appropriate batch size or sizes can help
amortize inter-thread and inter-node latencies and make
use of more parallel resources (local and remote) to help
scale DDL performance.
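When experimenting with these parameters, it can help to
check candidate values against the documented limits before
editing config.ini. The following Python sketch does this and
prints a fragment for the [ndbd default] section. The limits
are those given above; the function and variable names are
illustrative and do not belong to any NDB tool.

```python
# Sketch: validate the three DDL batch-size parameters and render them as a
# config.ini fragment. The limits come from the text above; the function and
# variable names are illustrative and not part of any NDB tool.

BATCH_PARAMS = (
    "MaxFKBuildBatchSize",
    "MaxReorgBuildBatchSize",
    "MaxUIBuildBatchSize",
)
DEFAULT, MINIMUM, MAXIMUM = 64, 16, 512


def render_ndbd_default(overrides):
    """Return an [ndbd default] section with validated batch sizes."""
    unknown = set(overrides) - set(BATCH_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    lines = ["[ndbd default]"]
    for name in BATCH_PARAMS:
        value = overrides.get(name, DEFAULT)
        if not MINIMUM <= value <= MAXIMUM:
            raise ValueError(f"{name}={value} is outside {MINIMUM}..{MAXIMUM}")
        lines.append(f"{name}={value}")
    return "\n".join(lines)


print(render_ndbd_default({"MaxUIBuildBatchSize": 256}))
```

Parameters that are not overridden are written with the
default of 64, which makes the fragment explicit about the
values in effect.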
* Formerly, the data node LGMAN kernel block processed undo
log records serially; now this is done in parallel. The
rep thread, which hands off undo records to local data
handler (LDM) threads, waited for an LDM to finish
applying a record before fetching the next one; now the
rep thread no longer waits, but proceeds immediately to
the next record and LDM.
There are no user-visible changes in functionality
directly associated with this work; this performance
enhancement is part of the work being done in NDB 7.6 to
improve undo log handling for partial local checkpoints.
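To see why removing the wait helps, consider a simplified
timing model. The Python sketch below is an illustration
under stated assumptions, not NDB code: it treats each undo
record as an (LDM, apply time) pair, ignores the cost of
fetching and handing off records, and compares the elapsed
time when the rep thread waits for each record with the time
taken when each LDM works through its own records
independently.

```python
from collections import defaultdict

# Toy timing model for applying undo log records; all names are hypothetical.
# Each record is (ldm_id, apply_cost). Hand-off cost is ignored.


def serial_elapsed(records):
    """The rep thread waits for each record to be applied before the next."""
    return sum(cost for _ldm, cost in records)


def parallel_elapsed(records):
    """The rep thread does not wait; each LDM applies its own queue in order."""
    per_ldm = defaultdict(int)
    for ldm, cost in records:
        per_ldm[ldm] += cost
    return max(per_ldm.values(), default=0)


records = [(0, 3), (1, 2), (0, 1), (2, 4), (1, 2)]
print("serial:", serial_elapsed(records))
print("parallel:", parallel_elapsed(records))
```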
* When applying an undo log the table ID and fragment ID
are obtained from the page ID. This was done by reading
the page from PGMAN using an extra PGMAN worker thread,
but when applying the undo log it was necessary to read
the page again.
This became very inefficient when using O_DIRECT (see
ODirect) since the page was not cached in the OS kernel.
Mapping from page ID to table ID and fragment ID is now
done using information the extent header contains about
the table IDs and fragment IDs of the pages used in a
given extent. Since the extent pages are always present
in the page cache, no extra disk reads are required to
perform the mapping, and the information can be read
using existing TSMAN data structures.
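The idea behind the new mapping can be sketched in a few
lines of Python. This is a simplified model, not the TSMAN
data structures themselves: the number of pages per extent,
the dictionary standing in for cached extent headers, and the
assumption that every page in an extent shares one table ID
and fragment ID are all made up for this example.

```python
# Toy model: resolve (table_id, fragment_id) for a page from its extent
# header instead of reading the page itself. All values are hypothetical.

PAGES_PER_EXTENT = 32

# Stand-in for extent headers that are already held in the page cache:
# extent number -> (table_id, fragment_id)
extent_headers = {
    0: (10, 0),
    1: (10, 1),
    2: (11, 0),
}


def table_and_fragment(page_id):
    """Look up the owning table and fragment without touching the page."""
    extent_no = page_id // PAGES_PER_EXTENT
    return extent_headers[extent_no]


print(table_and_fragment(33))
```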
* Added the NODELOG DEBUG command in the ndb_mgm client to
provide runtime control over data node debug logging.
NODELOG DEBUG ON causes a data node to write extra
debugging information to its node log, the same as if the
node had been started with --verbose. NODELOG DEBUG OFF
disables the extra logging.
* Added the LocationDomainId configuration parameter for
management, data, and API nodes. When using NDB Cluster
in a cloud environment, you can set this parameter to
assign a node to a given availability domain or
availability zone. This can improve performance in the
following ways:
+ If requested data is not found on the same node,
reads can be directed to another node in the same
availability domain.
+ Communication between nodes in different
availability domains is guaranteed to use NDB
transporters' WAN support without any further manual
intervention.
+ The transporter's group number can be based on which
availability domain is used, such that SQL and other
API nodes also communicate with local data nodes in
the same availability domain whenever possible.
+ The arbitrator can be selected from an availability
domain in which no data nodes are present, or, if no
such availability domain can be found, from a third
availability domain.
This parameter takes an integer value between 0 and 16,
with 0 being the default; using 0 is the same as leaving
LocationDomainId unset.
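The first of the effects listed above, preferring a node in
the same availability domain for reads, can be pictured with
a short Python sketch. It is a hypothetical model only: the
node IDs, the function names, and the fallback to the first
candidate are assumptions for this example, and NDB's actual
node selection takes other factors into account. The value
range and the meaning of 0 are taken from the text.

```python
# Toy model of domain-aware read routing; names and node IDs are hypothetical.


def validate_domain(value):
    """LocationDomainId is an integer from 0 to 16; 0 means unset."""
    if not isinstance(value, int) or not 0 <= value <= 16:
        raise ValueError(f"invalid LocationDomainId: {value!r}")
    return value


def pick_read_node(local_domain, replicas):
    """replicas: list of (node_id, location_domain_id) holding the data.

    Prefer a replica in the caller's own domain; otherwise (or when the
    caller's domain is unset) fall back to the first candidate.
    """
    validate_domain(local_domain)
    if local_domain != 0:
        for node_id, domain in replicas:
            if validate_domain(domain) == local_domain:
                return node_id
    return replicas[0][0]


replicas = [(1, 1), (2, 2)]
print(pick_read_node(2, replicas))  # same-domain replica is preferred
print(pick_read_node(0, replicas))  # unset domain: no preference
```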
Bugs Fixed
* Important Change: The --passwd option for ndb_top is now
deprecated, and thus subject to removal in a future
release of NDB Cluster. (Bug #88236, Bug #20733646)
References: See also: Bug #86615, Bug #26236320.
* Replication: With GTIDs generated for incident log
events, MySQL error code 1590 (ER_SLAVE_INCIDENT) could
not be skipped using the --slave-skip-errors=1590 startup
option on a replication slave. (Bug #26266758)
* NDB Disk Data: An ALTER TABLE that switched the table
storage format between MEMORY and DISK was always
performed in place for all columns. This is not correct
in the case of a column whose storage format is inherited
from the table; the column's storage type is not changed.
For example, this statement creates a table t1 whose
column c2 uses in-memory storage since the table does so
implicitly:
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT) ENGINE NDB;
The ALTER TABLE statement shown here is expected to cause
c2 to be stored on disk, but failed to do so:
ALTER TABLE t1 STORAGE DISK TABLESPACE ts1;
Similarly, an on-disk column that inherited its storage
format from the table to which it belonged did not have
the format changed by ALTER TABLE ... STORAGE MEMORY.
These two cases are now performed as a copying alter, and
the storage format of the affected column is now changed.
(Bug #26764270)
* NDB Replication: On an SQL node not being used for a
replication channel, with sql_log_bin=0, it was possible
after creating and populating an NDB table for a table
map event to be written to the binary log for the created
table with no corresponding row events. This led to
problems when this log was later used by a slave cluster
replicating from the mysqld where this table was created.
Fixed this by adding support for maintaining a cumulative
any_value bitmap for global checkpoint event operations
that represents bits set consistently for all rows of a
specific table in a given epoch, and by adding a check to
determine whether all operations (rows) for a specific
table are all marked as NOLOGGING, to prevent the
addition of this table to the Table_map held by the
binlog injector.
As part of this fix, the NDB API adds a new
getNextEventOpInEpoch3() method which provides
information about any AnyValue received by making it
possible to retrieve the cumulative any_value bitmap.
(Bug #26333981)
* ndbinfo Information Database: Counts of committed rows
and committed operations per fragment used by some tables
in ndbinfo were taken from the DBACC block, but due to
the fact that commit signals can arrive out of order,
transient counter values could be negative. This could
happen if, for example, a transaction contained several
interleaved insert and delete operations on the same row;
in such cases, commit signals for delete operations could
arrive before those for the corresponding insert
operations, leading to a failure in DBACC.
This issue is fixed by using the counts of committed rows
which are kept in DBTUP, which do not have this problem.
(Bug #88087, Bug #26968613)
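The transient negative values are easy to reproduce in a
small model. The Python sketch below is only an illustration
of the arithmetic, not of DBACC or DBTUP internals: it keeps a
running row count that is adjusted as commit signals arrive,
and the signal lists are invented for this example.

```python
# Toy model: a per-fragment row counter updated as commit signals arrive.
# Signal order is the only thing that differs between the two runs.


def running_counts(commit_signals):
    """Return the counter value after each commit signal is processed."""
    count = 0
    history = []
    for op in commit_signals:
        count += 1 if op == "insert" else -1
        history.append(count)
    return history


in_order = ["insert", "delete", "insert", "delete"]
out_of_order = ["delete", "insert", "delete", "insert"]

print(running_counts(in_order))
print(running_counts(out_of_order))
```

Both runs end with the same final count, but the second one
passes through a negative value on the way, which is the
situation described above.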
* Errors in parsing NDB_TABLE modifiers could cause memory
leaks. (Bug #26724559)
* Added DUMP code 7027 to facilitate testing of issues
relating to local checkpoints. For more information, see
DUMP 7027
(http://dev.mysql.com/doc/ndb-internals/en/ndb-internals-dump-command-7027.html).
(Bug #26661468)
* A previous fix intended to improve logging of node
failure handling in the transaction coordinator included
logging of transactions that could occur in normal
operation, which made the resulting logs needlessly
verbose. Such normal transactions are no longer written
to the log in such cases. (Bug #26568782)
References: This issue is a regression of: Bug #26364729.
* Due to a configuration file error, CPU locking capability
was not available on builds for Linux platforms. (Bug
#26378589)
* Some DUMP codes used for the LGMAN kernel block were
incorrectly assigned numbers in the range used for codes
belonging to DBTUX. These have now been assigned symbolic
constants and numbers in the proper range (10001, 10002,
and 10003). (Bug #26365433)
* Node failure handling in the DBTC kernel block consists
of a number of tasks which execute concurrently, and all
of which must complete before TC node failure handling is
complete. This fix extends logging coverage to record
when each task completes and which tasks remain, and
includes the following improvements:
+ Handling interactions between GCP and node failure
handling, in which TC takeover causes a GCP
participant stall at the master TC to allow it
to extend the current GCI with any transactions that
were taken over; the stall can begin and end in
different GCP protocol states. Logging coverage is
extended to cover all scenarios. Debug logging is
now more consistent and understandable to users.
+ Logging done by the QMGR block as it monitors the
duration of node failure handling is done more
frequently. A warning log is now generated
every 30 seconds (instead of 1 minute), and this now
includes DBDIH block debug information (formerly
this was written separately, and less often).
+ To reduce space used, DBTC instance number: is
shortened to DBTC number:.
+ A new error code is added to assist testing.
(Bug #26364729)
* During a restart, DBLQH loads redo log part metadata for
each redo log part it manages, from one or more redo log
files. Since each file has a limited capacity for
metadata, the number of files which must be consulted
depends on the size of the redo log part. These files are
opened, read, and closed sequentially, but the closing of
one file occurs concurrently with the opening of the
next.
In cases where closing of the file was slow, it was
possible for more than 4 files per redo log part to be
open concurrently; since these files were opened using
the OM_WRITE_BUFFER option, more than 4 chunks of write
buffer were allocated per part in such cases. The write
buffer pool is not unlimited; if all redo log parts were
in a similar state, the pool was exhausted, causing the
data node to shut down.
This issue is resolved by avoiding the use of
OM_WRITE_BUFFER during metadata reload, so that any
transient opening of more than 4 redo log files per log
file part no longer leads to failure of the data node.
(Bug #25965370)
* A join entirely within the materialized part of a
semi-join was not pushed even if it could have been. In
addition, EXPLAIN provided no information about why the
join was not pushed. (Bug #88224, Bug #27022925)
References: See also: Bug #27067538.
* When the duplicate weedout algorithm was used for
evaluating a semi-join, the result had missing rows. (Bug
#88117, Bug #26984919)
References: See also: Bug #87992, Bug #26926666.
* A table used in a loose scan could be used as a child in
a pushed join query, leading to possibly incorrect
results. (Bug #87992, Bug #26926666)
* When representing a materialized semi-join in the query
plan, the MySQL Optimizer inserted extra QEP_TAB and
JOIN_TAB objects to represent access to the materialized
subquery result. The join pushdown analyzer did not
properly set up its internal data structures for these,
leaving them uninitialized instead. This meant that later
usage of any item objects referencing the materialized
semi-join accessed an uninitialized tableno column when
accessing a 64-bit tableno bitmask, possibly referring to
a point beyond its end, leading to an unplanned shutdown
of the SQL node. (Bug #87971, Bug #26919289)
* In some cases, a SCAN_FRAGCONF signal was received after
a SCAN_FRAGREQ with a close flag had already been sent,
clearing the timer. When this occurred, the next
SCAN_FRAGREF to arrive caused time tracking to fail. Now
in such cases, a check for a cleared timer is performed
prior to processing the SCAN_FRAGREF message. (Bug
#87942, Bug #26908347)
* While deleting an element in Dbacc, or moving it during
hash table expansion or reduction, the method used
(getLastAndRemove()) could return a reference to a
removed element on a released page, which could later be
referenced from the functions calling it. This was due to
a change brought about by the implementation of dynamic
index memory in NDB 7.6.2; previously, the page had
always belonged to a single Dbacc instance, so accessing
it was safe. This was no longer the case following the
change; a page released in Dbacc could be placed directly
into the global page pool where any other thread could
then allocate it.
Now we make sure that newly released pages in Dbacc are
kept within the current Dbacc instance and not given over
directly to the global page pool. In addition, the
reference to a released page has been removed; the
affected internal method now returns the last element by
value, rather than by reference. (Bug #87932, Bug
#26906640)
References: See also: Bug #87987, Bug #26925595.
* The DBTC kernel block could receive a TCRELEASEREQ signal
in a state for which it was unprepared. Now in such cases
it responds with a TCRELEASECONF message, and
subsequently behaves just as if the API connection had
failed. (Bug #87838, Bug #26847666)
References: See also: Bug #20981491.
* When a data node was configured for locking threads to
CPUs, it failed during startup with Failed to lock tid.
This was a side effect of a fix for a previous issue,
which disabled CPU locking based on the version of the
available glibc. The specific glibc issue being guarded
against is encountered only in response to an internal
NDB API call (Ndb_UnlockCPU()) not used by data nodes
(and which can be accessed only through internal API
calls). The current fix enables CPU locking for data
nodes and disables it only for the relevant API calls
when an affected glibc version is used. (Bug #87683, Bug
#26758939)
References: This issue is a regression of: Bug #86892,
Bug #26378589.
* ndb_top failed to build on platforms where the ncurses
library did not define stdscr. Now these platforms
require the tinfo library to be included. (Bug #87185,
Bug #26524441)
* On completion of a local checkpoint, every node sends a
LCP_COMPLETE_REP signal to every other node in the
cluster; a node does not consider the LCP complete until
it has been notified that all other nodes have sent this
signal. Due to a minor flaw in the LCP protocol, if this
message was delayed from a node other than the
master, it was possible to start the next LCP before one
or more nodes had completed the one ongoing; this caused
problems with LCP_COMPLETE_REP signals from previous LCPs
becoming mixed up with such signals from the current LCP,
which in turn led to node failures.
To fix this problem, we now ensure that the previous LCP
is complete before responding to any TCGETOPSIZEREQ
signal initiating a new LCP. (Bug #87184, Bug #26524096)
* NDB Cluster did not compile successfully when the build
used WITH_UNIT_TESTS=OFF. (Bug #86881, Bug #26375985)
* Recent improvements in local checkpoint handling that use
OM_CREATE to open files did not work correctly on Windows
platforms, where the system tried to create a new file
and failed if it already existed. (Bug #86776, Bug
#26321303)
* A potential hundredfold signal fan-out when sending a
START_FRAG_REQ signal could lead to a node failure due to
a job buffer full error in start phase 5 while trying to
perform a local checkpoint during a restart. (Bug #86675,
Bug #26263397)
References: See also: Bug #26288247, Bug #26279522.
* Compilation of NDB Cluster failed when using
-DWITHOUT_SERVER=1 to build only the client libraries.
(Bug #85524, Bug #25741111)
* The NDBFS block's OM_SYNC flag is intended to make sure
that all FSWRITEREQ signals used for a given file are
synchronized, but was ignored by platforms that do not
support O_SYNC, meaning that this feature did not behave
properly on those platforms. Now the synchronization flag
is used on those platforms that do not support O_SYNC.
(Bug #76975, Bug #21049554)
On Behalf of Oracle/MySQL Release Engineering Team
Balasubramanian Kandasamy
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql