Cheap, Easy and Fast Data Warehousing – Our Implementation of Amazon Redshift

Conventional methods proved either slow or expensive, or both, in providing a complete analytics data warehousing and reporting solution to one of our clients, a provider of worldwide IPTV services. After preliminary proof-of-concept (POC) testing on Redshift, we decided to scrap plans for a conventional data warehouse.
 
At roughly one-tenth of the cost, with high performance, easier manageability and a move away from traditional in-house IT management and support, the extensively scalable cloud-based solution seemed a perfect fit. Redshift's ability to work in sync with Hadoop and its integration with Amazon Elastic MapReduce was another strong motivating factor in selecting the technology.
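As a rough sketch of how such an integration can look (not our client's actual pipeline), the Python snippet below uses psycopg2 to issue a Redshift COPY that bulk-loads files an EMR/Hadoop job has staged in S3. The cluster endpoint, database, bucket, table name and IAM role are all placeholders we made up for illustration.

```python
# Hypothetical sketch: load EMR/Hadoop output staged in S3 into a Redshift table.
# Connection details, bucket, table and IAM role are placeholders, not real values.
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="analytics",
    user="etl_user",
    password="...",  # placeholder
)

copy_sql = """
    COPY viewing_events
    FROM 's3://example-bucket/emr-output/viewing_events/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-copy-role'
    DELIMITER '|' GZIP;
"""

with conn, conn.cursor() as cur:
    cur.execute(copy_sql)  # Redshift pulls the S3 files in parallel across slices
```

The point of the sketch is simply that the Hadoop side can keep writing flat files to S3 while Redshift ingests them with a single bulk command, rather than row-by-row inserts.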
 
There were never any worries about the capacity Redshift could handle, or about other multi-tenancy or scalability factors. The other primary consideration was choosing a feasible, flexible and easy-to-use reporting system that would work with Redshift and could also deliver ad-hoc reporting features with multi-tenancy support.
 
The database is expected to grow to tens or even hundreds of terabytes across multiple tenants, all residing in a single Redshift cluster that scales out with additional compute nodes. The ease of management without indexes, with control only over the sort and distribution keys, was a significant time saver in data management. Within a few seconds the system can deliver reports at various time granularities, sifting through several years of data.
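To make the sort-key and distribution-key point concrete, here is a hypothetical table definition of the kind we mean; the table, column names, key choices and connection details are illustrative only, not the client's actual schema.

```python
# Hypothetical DDL sketch: no indexes to tune, only DISTKEY and SORTKEY choices.
# Table, column names and connection details are illustrative placeholders.
import psycopg2

ddl = """
    CREATE TABLE IF NOT EXISTS viewing_events (
        tenant_id    INTEGER   NOT NULL,
        event_time   TIMESTAMP NOT NULL,
        channel_id   INTEGER,
        duration_sec INTEGER
    )
    DISTSTYLE KEY
    DISTKEY (tenant_id)
    SORTKEY (tenant_id, event_time);
"""

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="etl_user", password="...",  # placeholders
)
with conn, conn.cursor() as cur:
    # DISTKEY co-locates each tenant's rows on one slice; the compound SORTKEY
    # lets Redshift skip blocks when reports filter on tenant and date range.
    cur.execute(ddl)
```

Choosing the tenant column as the distribution key keeps each tenant's data together, and sorting on the event timestamp is what makes date-range reports over years of data come back in seconds.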

We at Applayatech provide Redshift consulting services, including a full Redshift data warehouse service, at competitive Amazon Redshift pricing. Contact us today to learn more.
 

How non-relational database technologies free up data to create value

The proliferation of multiple non-relational databases is transforming the data management landscape. Instead of having to force structures onto their data, organisations can now choose NoSQL database architectures that fit their emerging data needs, as well as combine these new technologies with conventional relational databases to drive new value from their information.

Until recently, data’s potential as a source of rich business insight has been limited by the structures that have been imposed upon it. Without access to the new database technologies now available, standard back-end design practice has been to force data into rigid architectures (regardless of variations in the structure of the actual data).

Inherently inflexible, these legacy architectures have prevented organisations from developing new use cases for the exploitation of structured and unstructured information.

The ongoing proliferation of non-relational database architectures marks a watershed in data management. What is emerging is a new world of horizontally scaling, unstructured databases that are better at solving some problems, along with traditional relational databases that remain relevant for others.

Technology has evolved to the extent that organizations need no longer be constrained by a lack of choice in database architectures. As front-runners have moved to identify the database options that match their specific data needs, we saw three key changes becoming increasingly prevalent during 2012:

  1. A rebalancing of the database landscape, as data architects began to embrace the fact that their architecture and design toolkit has evolved from being relational database-centric to also including a varied and maturing set of non-relational options (NoSQL database systems).
  2. The increasing pervasiveness of hybrid data ecosystems powered by disruptive technologies and techniques (such as the Apache Hadoop software framework for cost-effective processing of data at extreme scale).
  3. The emergence of more responsive data management ecosystems to provide the flexibility needed to undertake prototyping-enabled delivery (test-prove-industrialize) at lower cost and at scale.

From now on, savvy analytical leaders will be seeking to crystallize the use cases to which platforms are best suited. Instead of becoming overly focused on the availability of new technologies, they will identify the “sweet spots” where relational and non-relational databases can be combined to create value for information above and beyond its original purpose.

By taking advantage of the new world of choice in data architectures, more organizations will be equipped to identify and exploit breakthrough opportunities for data monetization.

Just as communications operators have created valuable B2B revenue streams from the wealth of customer data at their disposal, so better usage of their existing data will empower other companies to build potent new business models.

Implementing a rethink of how data is stored, processed and enriched means re-evaluating the traditional world of data management. Until now, data has been viewed as a structured asset and a cost centre that must be maintained.

The availability of new database architectures means that this mindset will change forever. Data management in a services-led world will require IT leaders to think about how the business can most easily take advantage of the data they have and the data they may previously have been unable to harness.

Agile data services architecture

As more architecture options become available, data lifecycles will shrink and become more agile. Rather than seeking to “over control” data, approaches to data management will become much less rigid. One key aim will be to open up new possibilities by encouraging and facilitating data sharing. Amazon stands out as a pioneer in this field. By building a service-oriented platform with an agile data services architecture, the company has been able to offer new services around cloud storage and data management – as well as giving itself the flexibility needed to cope with future demand for as yet unknown services.

Unprecedented accessibility to non-relational databases is reinvigorating the role of conventional architectures and “traditional” data management disciplines. From now on, analytics leaders will increasingly adopt hybrid architectures that combine the best of both worlds, drawing fresh insights from the surging volumes of structured and unstructured information that are now the norm. In summary, there has never been a more exciting time to be a data management professional.

IBM releases Hadoop box and database technology for quicker data insights

Summary:

IBM announced a new PureData appliance for Hadoop and technology for speeding up analytic databases. The announcements come at a good time, with data sets growing and enterprises hankering for easy and fast analysis capability.

IBM talked up the latest ways in which it has sped up databases and introduced a Hadoop appliance at a press and analyst event in San Jose, Calif., on Wednesday. The developments aim to bring enterprises closer to running analytics on more types and greater quantities of data as close to real time as possible — a higher and higher priority as big-data projects proliferate.

In the long run, as more and more data piles up and in greater varieties, IBM wants to help to prevent its customers from drowning in the deluge of data and instead give them tools to get better results, such as more revenue, said Bob Picciano, general manager of information management at IBM Software. That’s why tools have to be fast, capable of working on huge data sets and easy to use.

Toward that end, IBM announced BLU Acceleration. When a user of an IBM database such as DB2 runs a query, BLU quickly slims a big data set down to the amount needed for analysis and spreads tiny workloads across all available compute cores to produce a result. One feature of BLU, data skipping, essentially fast-forwards over the data that is not needed and homes in on the small area that is. And with BLU, data can stay compressed for almost the entire duration of the analysis. IBM claimed that in some tests BLU produced results a thousand times faster than a previous version of DB2 running without BLU.
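IBM did not detail BLU's internals at the event, but the data-skipping idea itself is easy to picture: keep a small min/max synopsis for each block of stored values and scan only the blocks whose range could possibly satisfy the query predicate. The toy Python sketch below illustrates that general idea only; it is not IBM's implementation.

```python
# Toy illustration of the data-skipping idea (not IBM's BLU implementation):
# keep min/max metadata per block and scan only blocks that can match the predicate.
from dataclasses import dataclass
from typing import List

BLOCK_SIZE = 4  # tiny block size, just for the example

@dataclass
class Block:
    lo: int             # minimum value stored in the block
    hi: int             # maximum value stored in the block
    values: List[int]

def build_blocks(column: List[int]) -> List[Block]:
    """Split a column into fixed-size blocks and record each block's min/max."""
    blocks = []
    for i in range(0, len(column), BLOCK_SIZE):
        chunk = column[i:i + BLOCK_SIZE]
        blocks.append(Block(min(chunk), max(chunk), chunk))
    return blocks

def scan_between(blocks: List[Block], low: int, high: int) -> List[int]:
    """Return values in [low, high], skipping blocks whose min/max rule them out."""
    hits = []
    for block in blocks:
        if block.hi < low or block.lo > high:
            continue                      # skipped: no value in this block can match
        hits.extend(v for v in block.values if low <= v <= high)
    return hits

# Example: values roughly clustered by range, so most blocks are never touched.
blocks = build_blocks([1, 2, 3, 4, 10, 11, 12, 13, 20, 21, 22, 23])
print(scan_between(blocks, 10, 13))  # -> [10, 11, 12, 13]; first and last blocks skipped
```

The payoff is the same in spirit as what IBM describes: the better the data clusters by the filtered column, the more blocks the scan can ignore outright.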

IBM also unveiled another IBM PureData box tailored for big-data purposes, this time built around Hadoop. Previous boxes in the line include the PureData System for Analytics. The IBM PureData System for Hadoop appliance will become available later this year. It lets customers start loading data within 90 minutes, compared with the two or three weeks a company's own Hadoop instance in a data center can take, said Phil Francisco, vice president of Netezza product management and product marketing at IBM Software. The appliance can also store data processed in Hadoop right in the box, a perk for companies facing retention requirements.

Look for IBM to offer more big-data hardware and software. The company has spent $16 billion on big-data and analytics acquisitions, and it wants to “spend as much organically as well as inorganically to figure out what clients need in this space,” said Inhi Cho Suh, vice president of information management product strategy at IBM Software. Meanwhile, Picciano said IBM will soon come out with a way to do for many servers what the BLU Acceleration does with the processors inside a single server.

The new IBM products sound like they could speed up analytics. If enterprises don’t believe the need is there now, they will as data gets bigger.