If you happen to be in the Bay Area on Thursday, 9 November, come check out the NoCOUG Fall Conference at California State University in downtown Oakland, CA.
Gluent is delivering a Hadoop for Database Professionals class as a separate track there (with myself and Michael Rainey as speakers) where we’ll explain the basics & concepts of modern distributed data processing platforms and then show a bunch of Hadoop demos too (mostly SQL-on-Hadoop stuff that database folks care about).
This should be a useful class to attend if you are wondering what all the buzz about Hadoop and various distributed “Big Data” computing technologies is about – and where these technologies work well (and where they don’t) in a traditional enterprise. All explained using database professionals’ terminology. And there’s a networking event at the end!
You can check out the event agenda here and RSVP at http://nocoug.org/rsvp.html. If you aren’t already a NoCOUG member, you can still attend for free as a Gluent Guest using the secret code… “GLUENT” :-)
See you soon!
NoCOUG Conference – Hadoop Workshop
Here’s some more free stuff by Gluent!
We are running another half-day course together with Cloudera, this time in St. Louis on 7 September 2017.
Drawing on our database background, we will explain in database professionals’ terminology why “new world” technologies like Hadoop will take over some parts of enterprise IT, why these platforms are so much better for advanced analytics over big datasets, and how to pick the right tool from the Hadoop ecosystem for the right problem.
More information below. See you there!
Hadoop for Database Professionals – St. Louis
Also, Michael Rainey will deliver a SQL-on-Hadoop overview session in Portland, OR on 6 September 2017.
NWOUG Portland Training Day 2017
We are running a “Gluent New World training month” this July and have scheduled 3 webinars on the following Wednesdays:
The first webinar, with Michael Rainey, will cover modern alternatives to the traditional old-school “ETL on an RDBMS” approach for data integration and sharing. The next Wednesday, I will demonstrate some of the Apache Impala SQL engine’s internals, with commentary from an Oracle database geek’s angle (I plan to get pretty deep and technical). And at the end of the month, Gluent customer Vistra Energy will talk about their journey towards a modern analytics platform.
Altogether this should give a good overview of the architectural opportunities that modern enterprise data platforms provide, with some technical Apache Impala hacking thrills too!
Offload, Transform & Present – The New World of Data Integration
Apache Impala Internals with Tanel Poder
- Speaker: Tanel Poder, Gluent
- Wednesday, July 19 @ 12 PM CDT
Building an Analytics Platform with Oracle & Hadoop
- Speakers: Gerry Moore & Suresh Irukulapati, Vistra Energy
- Wednesday, July 26 @ 9 AM CDT
You can see the abstracts and register for the webinars here.
We plan to run more technical sessions about different modern platform components and more customer case studies in the future too. See you soon!
In case you are interested in the “New World” and happen to be in the Bay Area this week (19 & 21 Jan 2017), there are two interesting events you might want to attend (I’ll speak at one and attend the other):
Advanced Spark and TensorFlow Meetup
I’m speaking at the advanced Apache Spark meetup, showing different ways to profile applications, with the main focus on CPU efficiency. This is a free meetup in San Francisco hosted at AdRoll.
Putting Deep Learning into Production Workshop
This 1-day workshop is about the practical aspects of putting deep learning models into production use in enterprises. It’s a very interesting topic for me, as enterprise-grade, production-ready machine learning requires much more than just developing a model (just like putting any software into production requires much more than just writing it). “Boring” things like reliability, performance, making input data available to the engine, and presenting the results to the rest of the enterprise come to mind first (the last parts are where Gluent operates :)
Anyway, the speaker list is impressive and I signed up! I told the organizers that I’d promote the event and they even offered a 25% discount code (use GLUENT as the discount code ;-)
This will be fun!
It’s time to announce the next webinar in the Gluent New World series. This time I will deliver it myself (and let’s have some fun :-)
GNW05 – Extending Databases With the Full Power of Hadoop: How Gluent Does It
Mark Rittman has been publishing his podcast series (Drill to Detail) for a while now, and I sat down with him at the UKOUG Tech 2016 conference to discuss Gluent and its place in the new world.
This podcast episode is about 49 minutes long; it explains why I decided to build Gluent a couple of years ago and where I see the enterprise data world going in the future.
It’s worth listening to if you are interested in what we are up to at Gluent, and to hear Mark’s excellent commentary on what he sees going on in the modern enterprise world too!
Just letting people in DFW area know that I’m speaking at the DOUG Performance & Tuning and 12.2 New Features Technical Day!
- Thursday 20 October 2016 9:30am-5:30pm
- Courtyard & TownePlace Suites DFW Airport North/Grapevine, TX
2200 Bass Pro Court, Grapevine, TX 76051 [map]
Speakers (Seven Oracle ACE Directors!):
- I’ll speak about In-Memory Processing for Databases, where I plan to go pretty deep into the fundamentals of columnar data structures and CPU- and cache-efficient execution, and how Oracle’s In-Memory column store implements them.
- There will be plenty of Oracle performance talks and also Oracle Database 12.2 topics.
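To give a flavour of the columnar fundamentals the talk covers, here is a toy Python sketch (my own illustration, not Oracle’s implementation) of why a column store helps scan-heavy queries: a predicate on one column only touches that column’s contiguous array, instead of striding over whole rows. The table shape and threshold are made up for the example.

```python
from array import array

# Row-oriented layout: each row is a tuple (id, amount, region_code).
rows = [(i, i * 10, i % 3) for i in range(1000)]

def count_big_row_store(rows, threshold):
    # Must walk every full row just to inspect one field.
    return sum(1 for _, amount, _ in rows if amount > threshold)

# Column-oriented layout: one contiguous, densely packed array per column.
amounts = array("l", (r[1] for r in rows))

def count_big_column_store(amounts, threshold):
    # Scans a single cache-friendly array; this is also the data shape
    # that vectorized (SIMD) execution engines operate on.
    return sum(1 for a in amounts if a > threshold)

# Both layouts answer the query identically; only the memory
# access pattern differs.
assert count_big_row_store(rows, 5000) == count_big_column_store(amounts, 5000)
```

In a real column store the columns are additionally compressed (dictionary or run-length encoding), which shrinks the bytes scanned even further.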
Sign up & more details:
There will also be free beer at the end! ;-)
Update: The video recording is available in Vimeo (see below)
Other relevant reading: James’s Sane SAN 2010 whitepaper and his legendary book!
It’s time to announce the 4th episode of the Gluent New World webinar series, featuring James Morle! James is a database/storage visionary who has been actively contributing to the Oracle database scene for over 20 years, including with his unique book Scaling Oracle 8i, which gave a full-stack overview of how the different layers of your database platform worked and performed together.
The topic for this webinar is:
When the Rules Change: Next Generation Oracle Database Architectures using Super-Fast Storage
- James Morle has been working in the high-performance database market for 25 years, most of which has been spent working with the Oracle database. After 15 years running Scale Abilities in the UK, he now leads Oracle Solutions at DSSD/EMC in Menlo Park.
- Tue, Jun 21, 2016 12:00 PM – 1:15 PM CDT
- When enabled with revolutionary storage performance capabilities, it becomes possible to think differently about physical database architecture. Massive consolidation, simplified data architectures, more data agility and reduced management overhead. This presentation, based on the DSSD D5 platform, includes performance and cost comparison with other platforms and shows how extreme performance is not only for extreme workloads.
This should be fun! As usual, I’ll be asking some questions myself and the audience can ask questions too. See you soon!
Update: Added links to video recording and slides below.
It’s time to announce the 3rd episode of Gluent New World webinar series! This time Gwen Shapira will talk about Kafka as a key data infrastructure component of a modern enterprise. And I will ask questions from an old database guy’s viewpoint :)
Apache Kafka and Real Time Stream Processing
Video recording & slides:
- Gwen Shapira (Confluent)
- Gwen is a system architect at Confluent helping customers achieve success with their Apache Kafka implementation. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specializes in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle ACE Director, an author of “Hadoop Application Architectures”, and a frequent presenter at industry conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects. When Gwen isn’t coding or building data pipelines, you can find her pedaling on her bike exploring the roads and trails of California, and beyond.
- Tue, May 24, 2016 12:00 PM – 1:15 PM CDT
- Modern businesses have data at their core, and this data is changing continuously. How can we harness this torrent of continuously changing data in real time? The answer is stream processing, and one system that has become a core hub for streaming data is Apache Kafka. This presentation will give a brief introduction to Apache Kafka and describe its usage as a platform for streaming data. It will explain how Kafka serves as a foundation for both streaming data pipelines and applications that consume and process real-time data streams. It will introduce some of the newer components of Kafka that help make this possible, including Kafka Connect, a framework for capturing continuous data streams, and Kafka Streams, a lightweight stream processing library. Finally, it will describe some of our favorite use cases of stream processing and how they solved interesting data scalability challenges.
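To make the stream-processing idea in the abstract concrete, here is a minimal, library-free Python sketch (not the Kafka API itself): events arrive continuously and we maintain a running aggregate per key, the same pattern Kafka Streams expresses with its KStream/KTable abstractions. The event names and values below are hypothetical.

```python
from collections import defaultdict

def process_stream(events):
    """Consume a (potentially unbounded) iterable of (key, value) events
    and yield the updated running sum for that key after each event."""
    totals = defaultdict(int)
    for key, value in events:
        totals[key] += value      # update per-key state as each event arrives
        yield key, totals[key]    # emit the new aggregate downstream

# Hypothetical events; in a real pipeline these would stream in from a
# Kafka topic via a consumer, not sit in an in-memory list.
events = [("mobile", 3), ("web", 5), ("mobile", 2)]
results = list(process_stream(events))
# results == [("mobile", 3), ("web", 5), ("mobile", 5)]
```

The key point is that results are produced incrementally as data flows in, rather than by re-querying a static dataset.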
See you soon!
Update: The video recording of this session is here:
Slides are here.
Other videos are available at Gluent video collection.
It’s time to announce the 2nd episode of the Gluent New World webinar series!
The Gluent New World webinar series is about modern data management: architectural trends in enterprise IT and technical fundamentals behind them.
GNW02: SQL-on-Hadoop : A bit of History, Current State-of-the-Art, and Looking towards the Future
- This GNW episode is presented by none other than Mark Rittman, the co-founder & CTO of Rittman Mead and an all-around guru of enterprise BI!
- Tue, Apr 19, 2016 12:00 PM – 1:15 PM CDT
Hadoop and NoSQL platforms initially focused on Java developers and slow but massively-scalable MapReduce jobs as an alternative to high-end but limited-scale analytics RDBMS engines. Apache Hive opened up Hadoop to non-programmers by adding a SQL query engine and relational-style metadata layered over raw HDFS storage, and since then open-source initiatives such as Hive Stinger, Cloudera Impala and Apache Drill, along with proprietary solutions from closed-source vendors, have extended SQL-on-Hadoop’s capabilities into areas such as low-latency ad-hoc queries, ACID-compliant transactions and schema-less data discovery – at massive scale and with compelling economics.
In this session we’ll focus on the technical foundations of SQL-on-Hadoop. We’ll first review the basic platform Apache Hive provides, then look in more detail at how ad-hoc querying, ACID-compliant transactions and data discovery engines work, along with the more specialised underlying storage formats that each now works best with. Finally, we’ll look to the future to see how SQL querying, data integration and analytics are likely to come together over the next five years to make Hadoop the default platform for running mixed old-world/new-world analytics workloads.
If you missed the last GNW01: In-Memory Processing for Databases session, here are the video recordings and slides!
See you soon!