In case you are interested in the “New World” and happen to be in the Bay Area this week (19 & 21 Jan 2017), there are two interesting events you might want to attend (I’ll speak at one and attend the other):
Advanced Spark and TensorFlow Meetup
I’m speaking at the advanced Apache Spark meetup, showing different ways of profiling applications with a main focus on CPU efficiency. This is a free Meetup in San Francisco, hosted at AdRoll.
Putting Deep Learning into Production Workshop
This 1-day workshop is about the practical aspects of putting deep learning models into production use in enterprises. It’s a very interesting topic for me, as enterprise-grade, production-ready machine learning requires much more than just developing a model (just like putting any software into production requires much more than just writing it). “Boring” things like reliability, performance, making input data available to the engine and presenting the results to the rest of the enterprise come to mind first (the last parts are where Gluent operates :)
Anyway, the speaker list is impressive and I signed up! I told the organizers that I’d promote the event and they even offered a 25% discount code (use GLUENT as the discount code ;-)
This will be fun!
In case you missed this webinar, here’s a 1.5h holiday video about how Gluent “turbocharges” your databases with the power of Hadoop – all this without rewriting your applications :-)
Also, you can already sign up for the next webinar here:
- GNW06 – Modernizing Enterprise Data Architecture with Gluent, Cloud and Hadoop
- January 17 @ 12:00pm-1:00pm CST
- Register here.
See you soon!
It’s time to announce the next webinar in the Gluent New World series. This time I will deliver it myself (and let’s have some fun :-)
GNW05 – Extending Databases With the Full Power of Hadoop: How Gluent Does It
Mark Rittman has been publishing his podcast series (Drill to Detail) for a while now, and I sat down with him at the UKOUG Tech 2016 conference to discuss Gluent and its place in the new world.
This podcast episode is about 49 minutes long and explains why I decided to build Gluent a couple of years ago and where I see the enterprise data world going in the future.
It’s worth listening to if you are interested in what we are up to at Gluent – and in hearing Mark’s excellent comments about what he sees going on in the modern enterprise world too!
Just letting people in the DFW area know that I’m speaking at the DOUG Performance & Tuning and 12.2 New Features Technical Day!
- Thursday 20 October 2016 9:30am-5:30pm
- Courtyard & TownePlace Suites DFW Airport North/Grapevine, TX
2200 Bass Pro Court, Grapevine, TX 76051 [map]
Speakers (Seven Oracle ACE Directors!):
- I’ll speak about In-Memory Processing for Databases, where I plan to go pretty deep into the fundamentals of columnar data structures, CPU- and cache-efficient execution, and how Oracle’s In-Memory column store implements these.
- There will be plenty of Oracle performance talks and also Oracle Database 12.2 topics.
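As a teaser for the columnar-fundamentals part of my talk, here’s a minimal toy sketch (my own illustration, not material from the talk itself) of why scanning a single attribute is cheaper in a columnar layout than in a row layout – the column store keeps one attribute tightly packed in memory, while the row store drags every record’s full payload through the CPU caches:

```python
# Toy comparison of row-oriented vs column-oriented layouts.
# All names and numbers here are made up for illustration.

# "Row store": each record is a tuple of (id, price, quantity)
rows = [(i, float(i % 100), i % 7) for i in range(10_000)]

# "Column store": one contiguous list per attribute
ids = [r[0] for r in rows]
prices = [r[1] for r in rows]
quantities = [r[2] for r in rows]

def sum_prices_rowstore(rows):
    # Must visit every tuple, pulling the unused columns into cache too
    return sum(r[1] for r in rows)

def sum_prices_columnstore(prices):
    # Scans one densely packed array: sequential access, fewer cache
    # misses, and amenable to SIMD processing in a real engine
    return sum(prices)

assert sum_prices_rowstore(rows) == sum_prices_columnstore(prices)
```

Both functions compute the same answer; the difference in a real engine is how much memory traffic each one generates per useful byte.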
Sign up & more details:
There will also be free beer at the end! ;-)
Update: The video recording is available in Vimeo (see below)
Other relevant reading is James’s Sane SAN 2010 whitepaper and his legendary book!
It’s time to announce the 4th episode of the Gluent New World webinar series, by James Morle! James is a database/storage visionary and has been actively contributing to the Oracle database scene for over 20 years – including with his unique book Scaling Oracle 8i, which gave a full-stack overview of how the different layers of your database platform work and perform together.
The topic for this webinar is:
When the Rules Change: Next Generation Oracle Database Architectures using Super-Fast Storage
- James Morle has been working in the high-performance database market for 25 years, most of which has been spent working with the Oracle database. After 15 years running Scale Abilities in the UK, he now leads Oracle Solutions at DSSD/EMC in Menlo Park.
- Tue, Jun 21, 2016 12:00 PM – 1:15 PM CDT
- When enabled by revolutionary storage performance capabilities, it becomes possible to think differently about physical database architecture: massive consolidation, simplified data architectures, more data agility and reduced management overhead. This presentation, based on the DSSD D5 platform, includes performance and cost comparisons with other platforms and shows how extreme performance is not only for extreme workloads.
This should be fun! As usual, I’ll be asking some questions myself and the audience can ask questions too. See you soon!
Update: Added links to video recording and slides below.
It’s time to announce the 3rd episode of Gluent New World webinar series! This time Gwen Shapira will talk about Kafka as a key data infrastructure component of a modern enterprise. And I will ask questions from an old database guy’s viewpoint :)
Apache Kafka and Real Time Stream Processing
Video recording & slides:
- Gwen Shapira (Confluent)
- Gwen is a system architect at Confluent, helping customers achieve success with their Apache Kafka implementations. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specializes in building real-time, reliable data processing pipelines using Apache Kafka. Gwen is an Oracle ACE Director, an author of “Hadoop Application Architectures”, and a frequent presenter at industry conferences. She is also a committer on the Apache Kafka and Apache Sqoop projects. When Gwen isn’t coding or building data pipelines, you can find her pedaling her bike, exploring the roads and trails of California and beyond.
- Tue, May 24, 2016 12:00 PM – 1:15 PM CDT
- Modern businesses have data at their core, and this data is changing continuously. How can we harness this torrent of continuously changing data in real time? The answer is stream processing, and one system that has become a core hub for streaming data is Apache Kafka. This presentation will give a brief introduction to Apache Kafka and describe its usage as a platform for streaming data. It will explain how Kafka serves as a foundation for both streaming data pipelines and applications that consume and process real-time data streams. It will introduce some of the newer components of Kafka that help make this possible, including Kafka Connect, a framework for capturing continuous data streams, and Kafka Streams, a lightweight stream processing library. Finally, it will describe some of our favorite use cases of stream processing and how they solved interesting data scalability challenges.
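To make the “streaming data pipeline” idea from the abstract a bit more concrete, here’s a toy in-memory sketch of the consume-process-produce loop that stream processors like Kafka Streams wrap into a library. This is an illustration only, not the actual Kafka API – the topic names and events are made up:

```python
from collections import deque, Counter

# Toy in-memory "topics" standing in for Kafka topics (illustration only)
clicks_topic = deque(["/home", "/pricing", "/home", "/docs", "/home"])
counts_topic = deque()

# The minimal consume-process-produce loop: read an event, update some
# local state, and emit a derived record downstream. Stream processing
# frameworks add partitioning, fault tolerance and state management on
# top of this basic shape.
page_counts = Counter()
while clicks_topic:
    page = clicks_topic.popleft()                    # consume
    page_counts[page] += 1                           # stateful processing
    counts_topic.append((page, page_counts[page]))   # produce

print(counts_topic[-1])  # → ('/home', 3)
```

Each input click produces an updated running count for that page, so downstream consumers always see the latest aggregate without re-reading history.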
See you soon!
Update: The video recording of this session is here:
Slides are here.
Other videos are available at Gluent video collection.
It’s time to announce the 2nd episode of the Gluent New World webinar series!
The Gluent New World webinar series is about modern data management: architectural trends in enterprise IT and technical fundamentals behind them.
GNW02: SQL-on-Hadoop : A bit of History, Current State-of-the-Art, and Looking towards the Future
- This GNW episode is presented by none other than Mark Rittman, the co-founder & CTO of Rittman Mead and an all-around guru of enterprise BI!
- Tue, Apr 19, 2016 12:00 PM – 1:15 PM CDT
Hadoop and NoSQL platforms initially focused on Java developers and slow but massively scalable MapReduce jobs as an alternative to high-end but limited-scale analytics RDBMS engines. Apache Hive opened up Hadoop to non-programmers by adding a SQL query engine and relational-style metadata layered over raw HDFS storage, and since then open-source initiatives such as Hive Stinger, Cloudera Impala and Apache Drill, along with proprietary solutions from closed-source vendors, have extended SQL-on-Hadoop’s capabilities into areas such as low-latency ad-hoc queries, ACID-compliant transactions and schema-less data discovery – at massive scale and with compelling economics.
In this session we’ll focus on the technical foundations of SQL-on-Hadoop, first reviewing the basic platform Apache Hive provides and then looking in more detail at how ad-hoc querying, ACID-compliant transactions and data discovery engines work, along with the more specialised underlying storage that each now works best with. We’ll also take a look to the future to see how SQL querying, data integration and analytics are likely to come together over the next five years to make Hadoop the default platform for running mixed old-world/new-world analytics workloads.
If you missed the last GNW01: In-Memory Processing for Databases session, here are the video recordings and slides!
See you soon!
Although we are still in stealth mode (kind of), due to the overwhelming requests for information, we decided to publish a video about what we do :)
It’s a short 5-minute video, just click on the image below or go straight to http://gluent.com:
And this, by the way, is just the beginning.
Gluent is getting close to 20 people now, with distributed teams in the US and UK – and we are still hiring!
This Gluent New World webinar is based on my “RAM is the new disk and how to measure its performance” article series:
I’m using the Oracle Database In-Memory option as an example here, but the same rules apply to other row & column stores as well.
You can also subscribe to our new Vimeo channel here – I will announce the next event with another great speaker soon ;-)
A few comments:
- Slides are here
- I’ll figure out a good way to deal with offline follow-up Q&A later on, after we’ve done a few of these events
If you like this stuff, please share it too – let’s make this series totally awesome!