MySQL Cluster 7.2.19 has been released

Jan 26, 2015; 02:47

Kent Boortz

Dear MySQL Users,

MySQL Cluster is the distributed, shared-nothing variant of MySQL.
This storage engine provides:

- In-Memory storage - Real-time performance (with optional
checkpointing to disk)
- Transparent Auto-Sharding - Read & write scalability
- Active-Active/Multi-Master geographic replication
- 99.999% High Availability with no single point of failure
and on-line maintenance
- NoSQL and SQL APIs (including C++, Java, HTTP and Memcached)

MySQL Cluster 7.2.19 has been released and can be downloaded from

http://www.mysql.com/downloads/cluster/

where you will also find Quick Start guides to help you get your
first MySQL Cluster database up and running.

The release notes are available from

http://dev.mysql.com/doc/relnotes/mysql-cluster/7.2/en/index.html

MySQL Cluster enables users to meet the database challenges of next
generation web, cloud, and communications services with uncompromising
scalability, uptime and agility.

More details can be found at

http://www.mysql.com/products/cluster/

Enjoy!

Changes in MySQL Cluster NDB 7.2.19 (5.5.41-ndb-7.2.19) (2015-01-25)

MySQL Cluster NDB 7.2.19 is a new release of MySQL Cluster,
incorporating new features in the NDB storage engine, and
fixing recently discovered bugs in previous MySQL Cluster NDB
7.2 development releases.

Obtaining MySQL Cluster NDB 7.2: MySQL Cluster NDB 7.2
source code and binaries can be obtained from
http://dev.mysql.com/downloads/cluster/.

This release also incorporates all bugfixes and changes made
in previous MySQL Cluster releases, as well as all bugfixes
and feature changes which were added in mainline MySQL 5.5
through MySQL 5.5.41 (see Changes in MySQL 5.5.41 2014-11-28
http://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-41.html).

Bundled SSL Update (Commercial Releases)

* Starting with this release, commercial distributions of
MySQL Cluster NDB 7.2 are built using OpenSSL 1.0.1i.

Bugs Fixed

* The global checkpoint commit and save protocols can be
delayed by various causes, including slow disk I/O. The
DIH master node monitors the progress of both of these
protocols, and can enforce a maximum lag time during
which the protocols are stalled by killing the node
responsible for the lag when it reaches this maximum.

This DIH master GCP monitor mechanism did not perform its
task more than once per master node; that is, it failed
to continue monitoring after detecting and handling a GCP
stop. (Bug #20128256)
References: See also Bug #19858151.
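
The corrected behavior (a monitor that keeps watching after it
has handled a stall) can be illustrated with a small model.
This is only a sketch, not NDB kernel code: the GcpMonitor
class, its method names, and the integer clock are hypothetical
and exist purely to show the re-arming step.

```python
class GcpMonitor:
    """Toy lag monitor (hypothetical model, not the DIH implementation)."""

    def __init__(self, max_lag):
        self.max_lag = max_lag      # maximum tolerated time without progress
        self.last_progress = 0      # time of the last observed progress
        self.stops_handled = 0      # number of stalls detected and handled

    def report_progress(self, now):
        self.last_progress = now

    def check(self, now):
        """Return True if a stall was detected and handled at time `now`."""
        if now - self.last_progress <= self.max_lag:
            return False
        self.stops_handled += 1
        # Re-arm: restart the lag measurement so that monitoring continues
        # after this stall instead of stopping for good.
        self.last_progress = now
        return True


monitor = GcpMonitor(max_lag=5)
monitor.report_progress(0)
results = [monitor.check(t) for t in (3, 6, 8, 12)]
print(results, monitor.stops_handled)
```

In this model the second stall (at time 12) is only detected
because check() resets last_progress after handling the first.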

* A number of problems relating to the fired triggers pool
have been fixed, including the following issues:

+ When the fired triggers pool was exhausted, NDB
returned Error 218 (Out of LongMessageBuffer). A new
error code 221 is added to cover this case.

+ An additional, separate case in which Error 218 was
wrongly reported now returns the correct error.

+ Setting low values for MaxNoOfFiredTriggers led to
an error when no memory was allocated if there was
only one hash bucket.

+ An aborted transaction now releases any fired
trigger records it held. Previously, these records
were held until its ApiConnectRecord was reused by
another transaction.

+ In addition, for the Fired Triggers pool in the
internal ndbinfo.ndb$pools table, the high value
always equalled the total, due to the fact that all
records were momentarily seized when initializing
them. Now the high value shows the maximum following
completion of initialization.

(Bug #19976428)
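
The high-value behavior described in the last item above can be
illustrated with a small model. This is a sketch only, not NDB
code: the RecordPool class and its methods are hypothetical
names, and it assumes that initialization seizes and then
releases every record once.

```python
class RecordPool:
    """Toy fixed-size pool with a high-water mark (hypothetical model)."""

    def __init__(self, total):
        self.total = total
        self.used = 0
        self.high = 0
        # Initialization momentarily seizes every record.
        for _ in range(total):
            self.seize()
        for _ in range(total):
            self.release()
        # Reset the mark so it reflects only use after initialization;
        # without this line, high would always equal total.
        self.high = 0

    def seize(self):
        if self.used >= self.total:
            raise MemoryError("pool exhausted")
        self.used += 1
        self.high = max(self.high, self.used)

    def release(self):
        if self.used == 0:
            raise ValueError("nothing to release")
        self.used -= 1


pool = RecordPool(total=8)
pool.seize()
pool.seize()
pool.release()
print(pool.high, pool.total)
```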

* Online reorganization when using ndbmtd data nodes and
with binary logging by mysqld enabled could sometimes
lead to failures in the TRIX and DBLQH kernel blocks, or
in silent data corruption. (Bug #19903481)
References: See also Bug #19912988.

* The local checkpoint ScanFrag watchdog and the global
checkpoint monitor can each exclude a node when it is too
slow when participating in their respective protocols.
This exclusion was implemented by simply asking the
failing node to shut down. If the shutdown was delayed
(for whatever reason), this could prolong the duration of
the GCP or LCP stall for other, unaffected nodes.

To minimize this time, an isolation mechanism has been
added to both protocols whereby any other live nodes
forcibly disconnect the failing node after a
predetermined amount of time. This allows the failing
node the opportunity to shut down gracefully (after
logging debugging and other information) if possible, but
limits the time that other nodes must wait for this to
occur. Now, once the remaining live nodes have processed
the disconnection of any failing nodes, they can commence
failure handling and restart the related protocol or
protocols, even if the failed node takes an excessively
long time to shut down. (Bug #19858151)
References: See also Bug #20128256.

* A watchdog failure resulted from a hang while freeing a
disk page in TUP_COMMITREQ, due to use of an
uninitialized block variable. (Bug #19815044, Bug #74380)

* Multiple threads crashing led to multiple sets of trace
files being printed and possibly to deadlocks.
(Bug #19724313)

* When a client retried against a new master a schema
transaction that had failed earlier against the previous
master while the latter was restarting, the lock obtained
by this transaction on the new master prevented the
previous master from progressing past start phase 3 until
the client was terminated and the resources held by it
were cleaned up. (Bug #19712569, Bug #74154)

* When a new data node started, API nodes were allowed to
attempt to register themselves with the data node for
executing transactions before the data node was ready.
This forced the API node to wait an extra heartbeat
interval before trying again.

To address this issue, a number of HA_ERR_NO_CONNECTION
errors (Error 4009) that could be issued during this time
have been changed to Cluster temporarily unavailable
errors (Error 4035), which should allow API nodes to use
new data nodes more quickly than before. As part of this
fix, some errors which were incorrectly categorized have
been moved into the correct categories, and some errors
which are no longer used have been removed.
(Bug #19524096, Bug #73758)
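
From an application's point of view, the practical difference
between the two error categories is whether a retry is
worthwhile. The following is a minimal sketch of that idea and
does not use the NDB API: ClusterError, run_with_retry and the
TEMPORARY_ERRORS set are hypothetical, and the set contains only
the one temporary code named above.

```python
import time

# Assumption: only the temporary error code named in the text is listed.
TEMPORARY_ERRORS = {4035}


class ClusterError(Exception):
    """Hypothetical exception carrying a numeric cluster error code."""

    def __init__(self, code, message=""):
        super().__init__(f"Error {code}: {message}")
        self.code = code


def run_with_retry(operation, attempts=3, delay=0.0):
    """Call operation(), retrying only when the error is temporary."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ClusterError as err:
            # Errors outside the temporary set, and the final failed
            # attempt, are passed on to the caller unchanged.
            if err.code not in TEMPORARY_ERRORS or attempt == attempts:
                raise
            time.sleep(delay)


calls = []


def flaky_operation():
    # Fails twice with a temporary error, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise ClusterError(4035, "Cluster temporarily unavailable")
    return "done"


outcome = run_with_retry(flaky_operation)
print(outcome, len(calls))
```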

* When executing very large pushdown joins involving one or
more indexes each defined over several columns, it was
possible in some cases for the DBSPJ block
(see The DBSPJ Block
http://dev.mysql.com/doc/ndbapi/en/ndb-internals-kernel-blocks-dbspj.html)
in the NDB kernel to generate SCAN_FRAGREQ signals that
were excessively large. This caused data nodes to fail
when these could not be handled correctly, due to a hard
limit in the kernel on the size of such signals (32K).

This fix bypasses that limitation by breaking up
SCAN_FRAGREQ data that is too large for one such signal,
and sending the SCAN_FRAGREQ as a chunked or fragmented
signal instead. (Bug #19390895)
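
The general idea of splitting an oversized payload into pieces
that each fit under a fixed limit, and joining them again on the
receiving side, can be sketched as follows. This is an
illustration only: the fragment() and reassemble() helpers are
hypothetical, the payload is treated as plain bytes, and the
headers and bookkeeping that real fragmented signals carry are
ignored.

```python
MAX_SIGNAL_BYTES = 32 * 1024  # the 32K limit mentioned above


def fragment(payload, limit=MAX_SIGNAL_BYTES):
    """Split payload into consecutive chunks of at most `limit` bytes."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [payload[i:i + limit] for i in range(0, len(payload), limit)]


def reassemble(fragments):
    """Join fragments back together in the order they were sent."""
    return b"".join(fragments)


payload = bytes(80 * 1024)          # 80K of data, too large for one signal
parts = fragment(payload)
print(len(parts), [len(p) for p in parts])
print(reassemble(parts) == payload)
```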

* ndb_index_stat sometimes failed when used against a table
containing unique indexes. (Bug #18715165)

* Queries against tables containing CHAR(0) columns
failed with ERROR 1296 (HY000): Got error 4547
'RecordSpecification has overlapping offsets' from
NDBCLUSTER. (Bug #14798022)

* ndb_restore failed while restoring a table which
contained both a built-in conversion on the primary key
and a staging conversion on a TEXT column.

During staging, a BLOB table is created with a primary
key column of the target type. However, a conversion
function was not provided to convert the primary key
values before loading them into the staging blob table,
which resulted in corrupted primary key values in the
staging BLOB table. While moving data from the staging
table to the target table, the BLOB read failed because
it could not find the primary key in the BLOB table.

Now all BLOB tables are checked to see whether there are
conversions on primary keys of their main tables. This
check is done after all the main tables are processed, so
that conversion functions and parameters have already
been set for the main tables. Any conversion functions
and parameters used for the primary key in the main table
are now duplicated in the BLOB table.
(Bug #73966, Bug #19642978)

* Corrupted messages to data nodes sometimes went
undetected, causing a bad signal to be delivered to a
block which aborted the data node. This failure in
combination with disconnecting nodes could in turn cause
the entire cluster to shut down.

To keep this from happening, additional checks are now
made when unpacking signals received over TCP, including
checks for byte order, compression flag (which must not
be used), and the length of the next message in the
receive buffer (if there is one).

Whenever two consecutive unpacked messages fail the
checks just described, the current message is assumed to
be corrupted. In this case, the transporter is marked as
having bad data and no more unpacking of messages occurs
until the transporter is reconnected. In addition, an
entry is written to the cluster log containing the error
as well as a hex dump of the corrupted message.
(Bug #73843, Bug #19582925)
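
The rule described above (two consecutive failed checks mark the
transporter as bad, and unpacking stops until reconnection) can
be modeled roughly as follows. This is a sketch under stated
assumptions, not the transporter code: the Message fields, the
MAX_MESSAGE_WORDS bound and the Transporter class are all
hypothetical. The text does not say what happens to a single
message that fails the checks, so this model simply does not
deliver it.

```python
from dataclasses import dataclass

MAX_MESSAGE_WORDS = 8192  # hypothetical upper bound used by this sketch


@dataclass
class Message:
    byte_order_ok: bool   # hypothetical flag: byte order is as expected
    compressed: bool      # compression flag, which must not be set
    length_words: int     # declared length of the message


def passes_checks(msg):
    return (msg.byte_order_ok
            and not msg.compressed
            and 0 < msg.length_words <= MAX_MESSAGE_WORDS)


class Transporter:
    def __init__(self):
        self.bad_data = False
        self.delivered = []
        self._consecutive_failures = 0

    def unpack(self, messages):
        for msg in messages:
            if self.bad_data:
                break  # no more unpacking until reconnect()
            if passes_checks(msg):
                self._consecutive_failures = 0
                self.delivered.append(msg)
                continue
            self._consecutive_failures += 1
            if self._consecutive_failures == 2:
                self.bad_data = True  # treat the stream as corrupted

    def reconnect(self):
        self.bad_data = False
        self._consecutive_failures = 0


good = Message(True, False, 10)
bad = Message(True, True, 10)   # compression flag set, so the check fails

transporter = Transporter()
transporter.unpack([good, bad, good, bad, bad, good])
print(transporter.bad_data, len(transporter.delivered))
```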

* Transporter send buffers were not updated properly
following a failed send. (Bug #45043, Bug #20113145)

* ndb_restore --print_data truncated TEXT and BLOB column
values to 240 bytes rather than 256 bytes.

* Disk Data: When a node acting as a DICT master fails, the
arbitrator selects another node to take over in place of
the failed node. During the takeover procedure, which
includes cleaning up any schema transactions which are
still open when the master failed, the disposition of the
uncommitted schema transaction is decided. Normally this
transaction is rolled back, but if it has completed a
sufficient portion of a commit request, the new master
finishes processing the commit. Until the fate of the
transaction has been decided, no new TRANS_END_REQ
messages from clients can be processed. In addition,
since multiple concurrent schema transactions are not
supported, takeover cleanup must be completed before any
new transactions can be started.

A similar restriction applies to any schema operations
which are performed in the scope of an open schema
transaction. The counter used to coordinate schema
operations across all nodes is employed both during
takeover processing and when executing any non-local
schema operations. This means that starting a schema
operation while its schema transaction is in the takeover
phase causes this counter to be overwritten by concurrent
uses, with unpredictable results.

The scenarios just described were handled previously
using a pseudo-random delay when recovering from a node
failure. Now we check before the new master has rolled
forward or backwards any schema transactions remaining
after the failure of the previous master and avoid
starting new schema transactions or performing operations
using old transactions until takeover processing has
cleaned up after the abandoned transaction.
(Bug #19874809, Bug #74503)

* Disk Data: When a node acting as DICT master fails, it is
still possible to request that any open schema
transaction be either committed or aborted by sending
this request to the new DICT master. In this event, the
new master takes over the schema transaction and reports
back on whether the commit or abort request succeeded. In
certain cases, it was possible for the new master to be
misidentified---that is, the request was sent to the
wrong node, which responded with an error that was
interpreted by the client application as an aborted
schema transaction, even in cases where the transaction
could have been successfully committed, had the correct
node been contacted. (Bug #74521, Bug #19880747)

* Cluster Replication: It was possible using wildcards to
set up conflict resolution for an exceptions table (that
is, a table named using the suffix $EX), which should not
be allowed. Now when a replication conflict function is
defined using wildcard expressions, these are checked for
possible matches so that, in the event that the function
would cover an exceptions table, it is not set up for
this table. (Bug #19267720)
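
The kind of check described here can be sketched as follows.
This is an illustration, not the server's implementation: it
assumes SQL LIKE-style wildcards (% for any run of characters,
_ for a single character), and the like_to_regex() and
tables_for_conflict_function() helpers are hypothetical names.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE-style pattern (% and _) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def tables_for_conflict_function(pattern, tables):
    """Return the tables matched by pattern, skipping exceptions tables."""
    regex = like_to_regex(pattern)
    return [
        name for name in tables
        if regex.fullmatch(name) and not name.endswith("$EX")
    ]


tables = ["t1", "t1$EX", "t2", "orders"]
selected = tables_for_conflict_function("t%", tables)
print(selected)
```

Here t1$EX matches the pattern t% but is left out because of its
$EX suffix.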

* Cluster API: The buffer allocated by an NdbScanOperation
for receiving scanned rows was not released until the
NdbTransaction owning the scan operation was closed. This
could lead to excessive memory usage in an application
where multiple scans were created within the same
transaction, even if these scans were closed at the end
of their lifecycle, unless NdbScanOperation::close() was
invoked with the releaseOp argument equal to true. Now
the buffer is released whenever the cursor navigating the
result set is closed with NdbScanOperation::close(),
regardless of the value of this argument.
(Bug #75128, Bug #20166585)

* ClusterJ: The following errors were logged at the SEVERE
level; they are now logged at the NORMAL level, as they
should be:

+ Duplicate primary key
+ Duplicate unique key
+ Foreign key constraint error: key does not exist
+ Foreign key constraint error: key exists

(Bug #20045455)

* ClusterJ: The com.mysql.clusterj.tie class gave off a
logging message at the INFO logging level for every
single query, which was unnecessary and was affecting the
performance of applications that used ClusterJ.
(Bug #20017292)

* ClusterJ: ClusterJ reported a segmentation violation when
an application closed a session factory while some
sessions were still active. This was because MySQL
Cluster allowed an Ndb_cluster_connection object to be
deleted while some Ndb instances were still active, which
might result in the usage of null pointers by ClusterJ.

This fix stops that happening by preventing ClusterJ from
closing a session factory when any of its sessions are
still active. (Bug #19846392)
References: See also Bug #19999242.

On behalf of the Oracle/MySQL RE Team,

Kent Boortz

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql
