Introduction
Every five to six years a new technology wave arrives, and if you are able to catch it, it will take you a long way. Throughout my career, I’ve ridden several of these waves. MPP data warehouses brought us incredible speed for analytics and a few headaches for data integration. We’re seeing in-memory analytics reduce disk latency. Hadoop-based technologies are opening up new solutions every day for storage and compute workloads, while our source systems are still generating data of varying velocity, volume, and variety. As a traditional ETL developer, I would usually try to figure out the
USE CASE
Leverage the Talend random data generator to spawn records into a CUSTOMER table residing on the Netezza emulator. Additionally, leverage Talend’s Netezza-specific functions for loading to Netezza.
BUSINESS CASE
Recently, a client wanted to compare some of the leading data integration tools on the market for their ability to integrate with Netezza. Because the client was coming from an Oracle solution, most of their existing code was written for singleton transactions, and they needed to take advantage of MPP’s ability to bulk load and bulk transform their data sets. Since Talend’s data integration tool contains a library of Netezza-specific
Over the last few years there has been a lot of industry buzz about the future of the enterprise data warehouse (EDW). Maybe we should give the classic EDW acronym a new meaning: Extended Data Warehouse.
When you first read this title you probably think these are instructions for someone else—but not you—right? But read on—see where you’re at when it comes to the issues of talking, listening, moderating and overall meeting etiquette.
The dimensional model is a thing of beauty when it is done right. It is designed independently of vendor and architecture. It is built with fact tables for every process, conformed dimensions, hierarchies, slowly changing dimensions, bridges, aggregates and so on. It is a work of art you print on a plotter and hang on your office wall for business users to gaze upon in total awe. It is the map you use to trace data lineage and figure out join paths. It is the reference guide for enhancement impact analysis.
If you have any doubts about the data flood that is covering the globe, here are a few amazing stats. Around the world, in just one minute…
When I received the email notice from the TDWI Dallas Chapter about an upcoming Big Data event, I was interested. The meeting was at 8:00 a.m. on a Friday and the traffic wouldn’t be ideal, but it sounded like this might be a good opportunity.
Why would it be a good opportunity? Bill Inmon was in town!
Wayne Eckerson, business intelligence guru, analyst, and author, and I had great times when we worked together at The Data Warehousing Institute (TDWI). Although we have both moved on to other ventures, we remain in touch, and I still like reading his books and articles.
I particularly enjoyed the first chapter of his most recent book, Secrets of Analytical Leaders: Insights from Information Insiders, where he talks about the concept of “purple people.” But before I explain that, consider this concept from Wayne’s book.
There have been several advancements within the Hadoop world that have positioned Hadoop closer to the data warehousing community than ever before. With a series of Hadoop 2.0 releases starting in October 2013, Hadoop is now much closer to being a platform for a data warehouse.
Every once in a while I brush aside all the stacks of paper on my desk and tell myself to spend a couple of hours deep-diving into some topic to see where it takes me. If browsers could heat up from overuse, there would be smoke coming out of my office for those two hours.
This time I focused on an IDC report that came out towards the end of 2013. I had seen this phrase a few times before, but for some reason it really caught my eye this time:
The Third Platform.