How non-relational database technologies free up data to create value

The proliferation of non-relational databases is transforming the data management landscape. Instead of having to force structures onto their data, organizations can now choose NoSQL database architectures that fit their emerging data needs, and combine these new technologies with conventional relational databases to drive new value from their information.

Until recently, data’s potential as a source of rich business insight was limited by the structures imposed upon it. Without access to the new database technologies now available, standard back-end design practice was to force data into rigid architectures, regardless of variations in the structure of the actual data.

Inherently inflexible, these legacy architectures have prevented organizations from developing new use cases for exploiting structured and unstructured information.

The ongoing proliferation of non-relational database architectures marks a watershed in data management. What is emerging is a new world of horizontally scaling, unstructured databases that are better at solving some problems, along with traditional relational databases that remain relevant for others.

Technology has evolved to the extent that organizations need no longer be constrained by a lack of choice in database architectures. As front-runners have moved to identify the database options that match their specific data needs, we saw three key changes becoming increasingly prevalent during 2012:

  1. A rebalancing of the database landscape, as data architects began to embrace the fact that their architecture and design toolkit has evolved from being relational database-centric to also including a varied and maturing set of non-relational options (NoSQL database systems).
  2. The increasing pervasiveness of hybrid data ecosystems powered by disruptive technologies and techniques (such as the Apache Hadoop software framework for cost-effective processing of data at extreme scale).
  3. The emergence of more responsive data management ecosystems to provide the flexibility needed to undertake prototyping-enabled delivery (test-prove-industrialize) at lower cost and at scale.

From now on, savvy analytical leaders will be seeking to crystallize the use cases to which platforms are best suited. Instead of becoming overly focused on the availability of new technologies, they will identify the “sweet spots” where relational and non-relational databases can be combined to extract value from information above and beyond its original purpose.

By taking advantage of the new world of choice in data architectures, more organizations will be equipped to identify and exploit breakthrough opportunities for data monetization.

Just as communications operators have created valuable B2B revenue streams from the wealth of customer data at their disposal, so better usage of their existing data will empower other companies to build potent new business models.

Rethinking how data is stored, processed and enriched means re-evaluating the traditional world of data management. Until now, data has been viewed as a structured asset and a cost center that must be maintained.

The availability of new database architectures means that this mindset will change forever. Data management in a services-led world will require IT leaders to think about how the business can most easily take advantage of the data they have and the data they may previously have been unable to harness.

Agile data services architecture

As more architecture options become available, data lifecycles will shrink and become more agile. Rather than seeking to “over control” data, approaches to data management will become much less rigid. One key aim will be to open up new possibilities by encouraging and facilitating data sharing. Amazon stands out as a pioneer in this field. By building a service-oriented platform with an agile data services architecture, the company has been able to offer new services around cloud storage and data management – as well as giving itself the flexibility needed to cope with future demand for as yet unknown services.

Unprecedented access to non-relational databases is reinvigorating the role of conventional architectures and “traditional” data management disciplines. From now on, analytics leaders will increasingly adopt hybrid architectures that combine the best of both worlds to derive fresh insights from the surging volumes of structured and unstructured information that are now the norm. In summary, there has never been a more exciting time to be a data management professional.

IBM releases Hadoop box and database technology for quicker data insights

Summary:

IBM announced a new PureData appliance for Hadoop and technology for speeding up analytic databases. The announcements come at a good time, with data sets growing and enterprises hankering for easy and fast analysis capability.

IBM talked up the latest ways in which it has sped up databases and introduced a Hadoop appliance at a press and analyst event in San Jose, Calif., on Wednesday. The developments aim to bring enterprises closer to running analytics on more types and greater quantities of data as close to real time as possible — a higher and higher priority as big-data projects proliferate.

In the long run, as more and more data piles up and in greater varieties, IBM wants to help to prevent its customers from drowning in the deluge of data and instead give them tools to get better results, such as more revenue, said Bob Picciano, general manager of information management at IBM Software. That’s why tools have to be fast, capable of working on huge data sets and easy to use.

Toward that end, IBM announced BLU Acceleration. When a user of an IBM database such as DB2 runs a query, BLU quickly slims down a big set of data to the amount needed for analysis and spreads tiny workloads across all available compute cores to produce a result. One BLU feature, data skipping, essentially fast-forwards over the data that’s not needed and homes in on the small area that is. And with BLU, data can stay compressed for almost the entire duration of the analysis. IBM claimed that in some tests BLU produced results a thousand times faster than a previous version of DB2 without BLU.
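IBM has not detailed BLU’s internals in this announcement, but data skipping is a well-known technique often described as “zone maps”: keep min/max metadata for each block of a column, and skip whole blocks whose range cannot satisfy a query’s predicate. The sketch below illustrates that general idea; it is not IBM’s implementation.

```python
# Minimal sketch of "data skipping": keep min/max metadata per block of a
# column, and skip whole blocks whose value range cannot match the predicate.
# Illustrates the general zone-map technique, not IBM's BLU internals.

BLOCK_SIZE = 4

def build_blocks(values):
    """Split a column into fixed-size blocks, recording min/max per block."""
    blocks = []
    for i in range(0, len(values), BLOCK_SIZE):
        chunk = values[i:i + BLOCK_SIZE]
        blocks.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return blocks

def scan_greater_than(blocks, threshold):
    """Return rows > threshold, skipping blocks whose max rules them out."""
    matches, blocks_read = [], 0
    for block in blocks:
        if block["max"] <= threshold:   # entire block can be skipped
            continue
        blocks_read += 1                # only now do we touch the row data
        matches.extend(v for v in block["rows"] if v > threshold)
    return matches, blocks_read

column = [3, 1, 2, 4,  9, 8, 7, 6,  15, 12, 14, 11]
blocks = build_blocks(column)
matches, blocks_read = scan_greater_than(blocks, 10)
print(matches, blocks_read)   # -> [15, 12, 14, 11] 1
```

Only one of the three blocks is ever read: the metadata alone proves the first two blocks contain no value above 10.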

IBM also unveiled another IBM PureData box tailored for big-data purposes, this time around Hadoop. Previous boxes in the line include the PureData System for Analytics. The IBM PureData System for Hadoop appliance will become available later this year. It enables customers to start loading data within 90 minutes, compared with the two or three weeks it can take to stand up a company’s own Hadoop instance in a data center, said Phil Francisco, vice president of Netezza product management and product marketing at IBM Software. The appliance can also store the data processed in Hadoop right in the box, a perk for companies facing retention requirements.

Look for IBM to offer more big-data hardware and software. The company has spent $16 billion on big-data and analytics acquisitions, and it wants to “spend as much organically as well as inorganically to figure out what clients need in this space,” said Inhi Cho Suh, vice president of information management product strategy at IBM Software. Meanwhile, Picciano said IBM will soon come out with a way to do for many servers what the BLU Acceleration does with the processors inside a single server.

The new IBM products sound like they could speed up analytics. If enterprises don’t believe the need is there now, they will as data gets bigger.

MySQL Cluster 7.3 Enables Faster and Simpler Development of New Web and Mobile Services

News Summary

With the accelerated pace of innovation in Web, cloud, social and mobile services, the new GA release of MySQL Cluster 7.3 makes it simpler and faster than ever for developers to enrich their applications with a highly available and scalable, fault tolerant, real-time database.

News Facts

  Oracle today announced the general availability of MySQL Cluster 7.3.
  With a new NoSQL JavaScript connector for node.js, MySQL Cluster 7.3 makes it simpler and faster to build services deployed across clusters of commodity hardware, with minimum development and operational effort.
  The new release features enhanced capabilities including native support for foreign keys, a browser-based auto-installer and new connection thread scalability, further enabling users to meet the high availability database challenges of next generation Web, Cloud, and communications services.
  Additionally, native integration with the MySQL 5.6 Server enables developers to combine the InnoDB and MySQL Cluster storage engines within a single MySQL 5.6-based application.
  MySQL Cluster is an open source, auto-sharded, real-time, ACID-compliant transactional database with no single point of failure, designed for next generation web, cloud, social and mobile applications.
  MySQL Cluster 7.3 is available for download here. Terms, conditions and restrictions apply.

Easy to Use plus Uncompromised Availability and Scalability

  MySQL Cluster 7.3 GA builds upon a series of Development Milestone Releases that have enabled users to preview, test and provide feedback during the development process. The latest enhancements include:
  NoSQL JavaScript Connector for node.js: Simplifies development by re-using JavaScript from the client to the server, all the way through to the database. Provides node.js with a native, asynchronous JavaScript interface that can be used to both query and receive result sets directly from MySQL Cluster, without transformations to SQL, ensuring low latency for simple queries. The JavaScript Connector for node.js joins a growing portfolio of NoSQL interfaces for MySQL Cluster, which already includes Memcached, Java, JPA and HTTP/REST, enabling NoSQL and SQL access to the same data set.
  Foreign Key support: Simplifies application logic and strengthens data models by automatically enforcing referential integrity between different tables located on different shards, different nodes, or in different data centers. Combines advanced RDBMS features with a highly scalable, real-time distributed database that enforces Foreign Keys across a shared-nothing cluster, while maintaining ACID guarantees and the ability to run cross-shard JOINs to support both high volume OLTP and real-time analytics. Foreign Keys are enforced across applications using both SQL and NoSQL connectors into the Cluster. Because the implementation is modelled on InnoDB’s Foreign Keys, developers can re-use existing MySQL skills.
  MySQL Cluster Auto-Installer: Enables DevOps teams to graphically configure and provision a production-grade cluster in minutes, automatically tuned for their workload and environment, directly from their browser.
  Integration with MySQL Server 5.6: The SQL layer is now based on the latest MySQL 5.6 GA release, enabling DevOps teams to take advantage of the enhanced query throughput and replication robustness of the release. Developers can combine the InnoDB and MySQL Cluster storage engines side by side within a single application using the latest MySQL 5.6 release.
  Connection Thread Scalability: Delivers 1.5x to 7.5x higher throughput per connection to the MySQL Cluster data nodes, increasing the overall capacity and scalability of the cluster. The improvement is achieved by splitting mutexes within internal connection APIs. It is completely transparent to applications, which will benefit from higher throughput by simply upgrading to MySQL Cluster 7.3, which itself is an online operation.
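The mutex-splitting idea behind the connection-thread improvement is a general concurrency technique, often called lock striping: replace one coarse lock that serialises every operation with several finer locks, so unrelated operations stop contending. The release notes do not show MySQL Cluster’s actual code, so the sketch below is a generic, hypothetical illustration of the technique.

```python
# Generic sketch of lock striping, the idea behind splitting one coarse
# mutex into several finer ones: state is partitioned across N locks, so
# operations on different partitions no longer contend with each other.
# (Illustrative only; not MySQL Cluster's internal connection API code.)

import threading

class StripedCounterMap:
    """A map of counters protected by N lock stripes instead of one lock."""

    def __init__(self, stripes=8):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._maps = [{} for _ in range(stripes)]

    def _stripe(self, key):
        # Pick a stripe deterministically from the key.
        return hash(key) % len(self._locks)

    def increment(self, key):
        i = self._stripe(key)
        with self._locks[i]:        # contention is limited to one stripe
            self._maps[i][key] = self._maps[i].get(key, 0) + 1

    def get(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._maps[i].get(key, 0)

# Four threads hammer four keys concurrently; counts stay consistent.
counters = StripedCounterMap()
def worker():
    for n in range(100):
        counters.increment("conn-%d" % (n % 4))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(sum(counters.get("conn-%d" % n) for n in range(4)))  # -> 400
```

Because the striping is internal to the class, callers see the same interface as before, which mirrors the point made above: applications get the throughput benefit transparently.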

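What native Foreign Key support buys developers can be shown in a few lines. The sketch below uses SQLite purely as a stand-in engine to demonstrate what automatic referential-integrity enforcement means; in MySQL Cluster 7.3 the same constraint semantics are enforced across shards and through the NoSQL connectors as well.

```python
# What "native Foreign Key support" buys you, sketched with SQLite standing
# in for MySQL Cluster: the database itself rejects rows that would break
# referential integrity, so application code no longer has to check.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite requires opting in

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE orders (
                  id INTEGER PRIMARY KEY,
                  customer_id INTEGER NOT NULL REFERENCES customers(id))""")

conn.execute("INSERT INTO customers VALUES (1, 'Big Fish')")
conn.execute("INSERT INTO orders VALUES (10, 1)")       # OK: parent exists

try:
    conn.execute("INSERT INTO orders VALUES (11, 99)")  # no such customer
    print("inserted")
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # -> rejected: FOREIGN KEY constraint failed
```

The table names and schema here are hypothetical; the point is only that the constraint is declared once and the database enforces it on every write.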
Supporting Quotes

  “The latest MySQL Cluster 7.3 GA release blends the agility, performance and scale demanded by Web, mobile and emerging application workloads, with the data integrity and high availability only offered by RDBMS platforms,” said Tomas Ulin, vice president, MySQL Engineering. “It is a winning combination, reflecting the priorities of our largest developers and users.”
  “A key driver for our original selection of MySQL Cluster for our real-time Web recommendations platform was the ability to service not just today’s workloads, but also meet our future needs,” said Sean Chighizola, senior director of Database Administration, Big Fish. “MySQL Cluster 7.3 looks an exciting upgrade – we are evaluating its new features with a view to extending MySQL Cluster to more of our services.”