Here are notes I live tweeted at Gluecon 2016.
I took the following notes during various sessions at Gluecon 2016 at the Omni Interlocken in Broomfield, Colorado. Notes were live tweeted during the event on @windley using Kevin Marks' Noter Live tool.
starting off the day with how cloud accelerates innovation in software development
Three waves of cloud tools: Colocation, virtualized data centers, and 3rd wave: actual, global, flexible cloud
Goal is NoOps: auto everything. No need to manage or spin up servers. Write code, rather than manage servers
Kubernetes manages containers, supports multiple envs & container runtimes, 100% open source
Speaking of developers: "We keep raising the bar on ourselves"
Goal is to let developers focus on code. PaaS (e.g. App Engine) needs to evolve
Now: PaaS is a walled garden; Future: choice of tools, more complex apps, global scale
31% of time spent troubleshooting; often in time-critical situations; need better tools: trace, error reporting, prod debug
Up next: building apps on the blockchain
Hyperledger is new project from Linux Foundation.
Business is increasingly interested in permissioned blockchains rather than promiscuous or permissionless blockchains
Governance of blockchain will be incredibly important if we're going to bet on this technology
Requirements for blockchains vary greatly across different use cases
"This is too important to be owned by a single entity" speaking of distributed ledger tech
Hyperledger has ~50 members, >$6M in funding, 2300 membership requests
Brian Behlendorf is now executive director
Showing how to deploy a blockchain application with Cloudsoft AMP
Key concepts: shared ledger, smart contract, consensus network, membership, events, management, wallet, integration
live demo of asset transfer. #brave Challenge is speed at the moment. Proof of asset ownership controls transfer
want to be diverse? Start by diversifying your twitter feed.
Crash course in electrical engineering at start of #IoT session
Done with signals; moving on to components. "kind of like legos except sometimes they catch on fire"
bread boards, perf boards, jumpers, resistors, LEDs, push button, capacitor, servo motors
The recipe: put components on bread board, Arduino Uno converts component signals, feeds to Raspberry Pi over USB
Johnny-Five is a JS library for #IoT on Arduino Uno
Firebase is a real-time database; "data that is best served fresh"
Where's the Bus? as an example of real-time data. I care where the bus is now, not yesterday.
Collaborative drawing is another example where real-time matters. Much less interesting with several second lag
When you're working on your Arduino always do it unplugged or you'll be sad.
After getting the button and LED hooked up: we have a thing, but no Internet yet. Let's add Firebase
why live code when you can live copy/paste
Firebase shows button presses in FB console. "Now, let's go through the Internet to change the LED status"
Connects button to her slides. Button adds rick-roll to the slide.
Celebrate the first time something catches on fire
Slides are here: http://mimming.com/presos/internet-of-nodebots/index.html#/
Multi-tier applications with Kubernetes
speaking on how Intuit is breaking up the monolith
Intuit moving everything to AWS; shutting down local data centers
TurboTax is a $2 Billion business; product managers don't want to touch it (other than updating tax logic)
our vision is to make tax prep obsolete; this makes TurboTax irrelevant
20yo tech stack; terrible, horrible, monolith. Written by tax specialists who became programmers.
Going beyond the interview to personal experience. Why ask a 20yo barista in NYC if they get California RR Retirement?
Can't replace TurboTax by creating something complex to replace it. #FAIL'd twice already. #gallslaw
2nd problem: trying to create better TurboTax instead of creating product to kill TurboTax
Breaking up the TurboTax monolith: everything as a service; quickly create frictionless experiences
Teams work at their own speed; teams are decoupled; services built for other teams.
path forward to create a pirate ship. Everyone wants to be a pirate.
pirate ship means "this was not a sanctioned project"
took the narrowest part of TurboTax: vehicle registration, 3 screens; TurboTax interview has 53K screens
hardest problem to solve in TurboTax: what does back button do.
back button takes you back a screen; should it save the data or not?
Built vehicle registration in 4 weeks and pushed to production; Sr leadership then sanctioned project;
Old stack: changing user experience took 3 months. New stack: 1 week.
Now 14 most common topics in TurboTax are running on new stack
In a world with 50K interview screens, you can't build them manually. Intuit has a "tax player" for tax content
Intuit's ability to enter markets on new devices skyrocketed
Old product had 6 different "beaconing" libraries
In three years Intuit will have eliminated every line of code from the monolith and be completely service based
1. Everything as a service 2. Attack the monolith; 3. Build common application fabric (prescriptive on standards)
Time is Hard: Doing Meaningful Things with Data
Doing meaningful things with fast data
fast data continually reflects changing state; enables real time decision making
Time data often implies big data
Ex: sentiment analysis on Twitter; seismic sensor networks; data fusing from distributed sensors (phones in cars)
individual records are small; all have timestamps; repeated measurements yield time series
Value of time series data diminishes over time; 2 strategies: store nothing & store everything
tiered storage: recent data at high fidelity; older data at low fidelity; store analysis not raw sensor data
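The tiered storage idea in the note above can be sketched roughly as follows. This is a minimal illustration, not anything shown in the talk: the function names, the 60-second bucket width, and the sample readings are all assumptions made for the example.

```python
from statistics import mean

def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = {}
    for ts, value in points:
        start = ts - (ts % bucket_seconds)  # start of the bucket this point falls in
        buckets.setdefault(start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

def tier(points, now, recent_window, bucket_seconds):
    """Keep recent points at full fidelity; replace older points with bucket averages."""
    cutoff = now - recent_window
    recent = [p for p in points if p[0] >= cutoff]
    older = [p for p in points if p[0] < cutoff]
    return downsample(older, bucket_seconds) + recent

# Hypothetical sensor readings: (seconds, value)
readings = [(0, 10.0), (30, 20.0), (60, 30.0), (90, 50.0), (120, 7.0), (150, 9.0)]
tiered = tier(readings, now=180, recent_window=60, bucket_seconds=60)
```

Here the last 60 seconds stay raw while everything older collapses to one averaged point per minute, which is one way of storing analysis rather than raw sensor data.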
Batch processing is common, but reduces responsiveness
Alternative is stream processing; stream processor is stateful and incremental; typically using O(1) algorithm
stream processing: read-once, write-once. No do overs.
stream processing is not only far more timely, but also more efficient than batch processing.
BUT: you lose ability to rewind and do a redo.
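To make "stateful and incremental" concrete, here is a small sketch of a stream processor that keeps a running mean and variance with O(1) work and memory per record. It uses Welford's online algorithm, which is my choice for illustration; the talk did not name a specific algorithm, and the class name and sample values are assumptions.

```python
class RunningStats:
    """Running mean and population variance, updated one record at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value):
        # Each record is read once and folded into the state; it is never stored.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        return self._m2 / self.count if self.count else 0.0

stats = RunningStats()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(reading)
```

Because the raw readings are discarded after `update`, there is no way to go back and recompute a different statistic later, which is the "no do overs" tradeoff.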
Good news: simple primitives take you a long way; bad news: dealing with time is hard
merging streams by timestamp; skewed, irregular, bursty, laggy, jittery, lossy
skew: data from different time series arrive at different timestamps
irregular: aperiodic or unsteady periodicity
bursty: no activity for a while, then all arrive at once
laggy: difference between generation and receipt
skew happens all the time. must logically align data within each period; requires understanding data
skew requires matching the alignment window to the periodicity the data is arriving at
Burstiness and lag require deciding how long to wait
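A rough sketch of merging streams by timestamp and then logically aligning skewed data within each period. The 10-second period, the record layout, and the "last value in the period wins" rule are assumptions for the example; picking them in practice is where understanding the data comes in.

```python
import heapq
from collections import defaultdict

def merge_by_timestamp(*streams):
    """Merge already-sorted (timestamp, source, value) streams into one sorted stream."""
    return heapq.merge(*streams, key=lambda record: record[0])

def align(records, period):
    """Group records into periods so skewed sources line up on the same row."""
    rows = defaultdict(dict)
    for ts, source, value in records:
        # Assumption: the last value seen within a period represents that period.
        rows[ts - (ts % period)][source] = value
    return dict(sorted(rows.items()))

# Two hypothetical sensors reporting every ~10 seconds, but never at the same instant
temperature = [(1, "temp", 20.5), (11, "temp", 20.7), (21, "temp", 21.0)]
humidity = [(4, "hum", 40), (13, "hum", 41), (26, "hum", 43)]

aligned = align(merge_by_timestamp(temperature, humidity), period=10)
```

Each key of `aligned` is the start of a period, and each value holds one reading per source, so downstream analysis can treat the two series as if they had been sampled together.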
The longer you wait the more likely data will appear; but computation is less timely
wait time must be bounded in some way because of finite resources
Types of clocks: measurement time, receipt time, analytic/processing time
deadlines exclude data; two schemes: static guarantees timeliness, while dynamic adapts to changing conditions
dynamic deadlines can be set based on how much data is being excluded by deadline
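One way a dynamic deadline might work, sketched under assumptions: measure what fraction of records the current deadline excludes, then lengthen or shorten the wait toward a target. The 5% target, the 1.25 step, and the bounds are invented for illustration and were not given in the talk.

```python
def fraction_late(lags, deadline):
    """Share of records whose lag (receipt time minus measurement time) exceeds the deadline."""
    if not lags:
        return 0.0
    return sum(1 for lag in lags if lag > deadline) / len(lags)

def adjust_deadline(deadline, excluded, target=0.05, step=1.25,
                    min_deadline=0.1, max_deadline=30.0):
    """Wait longer when too much data is excluded, less when almost none is."""
    if excluded > target:
        deadline *= step
    else:
        deadline /= step
    # The wait stays bounded because resources are finite.
    return max(min_deadline, min(max_deadline, deadline))

# Hypothetical lags (seconds) observed during the last window
lags = [0.2, 0.4, 0.5, 1.8, 2.5, 0.3, 0.1, 3.0, 0.6, 0.2]
deadline = 1.0
excluded = fraction_late(lags, deadline)
deadline = adjust_deadline(deadline, excluded)
```

A static scheme would simply skip `adjust_deadline` and keep the same value forever, trading adaptability for a guaranteed response time.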
Morning starts off with cognitive computing
Arthur Samuel wrote the first game-playing program (checkers) on the IBM701
IBM's Deep Blue (chess) was 10M times faster than the IBM701; massively parallel with specialized chess playing ASICs
In 1997 Deep Blue was in top 500 supercomputers. Today the same compute is available on a $400 graphics card
IBM Watson was a 3-5 year project; team of 15 people. Within a year, it could regularly beat some champions.
Problem domain: broad domain; complex language; high precision, accurate response
Real Jeopardy champions buzz in 80% of the time & answer correctly 80% of the time. That's incredible performance
Can't be solved with lookup table. In 20,000 questions there are over 2500 types; biggest bucket is 3%
Cognitive has evolved from systems that play games to multi-modal understanding of speech, emotions; all API driven
Cognitive system is a partnership between humans & computers
Cognitive computing depends on understanding, reasoning, and learning.
Cognitive systems are trained, not programmed. They work with humans to develop their capabilities.
Turning cognitive computing loose on Internet leads to interesting results. For example, it learned dogs are people.
The problem is that there's not "one truth". In some contexts people equate their dogs with people. But dogs aren't people
Now, understanding human speech is an API call away
Three REST calls: speech understanding -> translation -> text-to-speech yields a speech translator
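The three-call chain can be sketched as simple function composition. The three callables below are placeholders that I made up so the example runs offline; in a real system each would wrap a REST call to a speech, translation, or synthesis service, and none of these names come from an actual API.

```python
from typing import Callable

def make_speech_translator(
    speech_to_text: Callable[[bytes], str],
    translate: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> Callable[[bytes], bytes]:
    """Chain three services: audio in -> text -> translated text -> audio out."""
    def translate_speech(audio: bytes) -> bytes:
        text = speech_to_text(audio)
        translated = translate(text)
        return text_to_speech(translated)
    return translate_speech

# Stand-in stubs (assumptions): real versions would each make one REST request.
def fake_speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")

def fake_translate(text: str) -> str:
    return {"hello": "hola"}.get(text, text)

def fake_text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")

translator = make_speech_translator(fake_speech_to_text, fake_translate, fake_text_to_speech)
result = translator(b"hello")
```

Passing the services in as parameters keeps the pipeline itself independent of any particular vendor.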
thanks Kim, Brian, and rest of the staff. They make the conference work
Introspection: find out what went wrong; Insight is finding out things you didn't know
Introspection requirements: specification, status, events, attribution
Audit requirements: transparency, immutability, verification, restrictions & limits, automation (APIs)
Insight requirements: dynamic organization, interactive exploration, visualization
At first blush, IaaS checks a lot of these boxes, but not all of them (eg. immutability, monitoring APIs are wrong)
Containers are the right API object, can be immutable, and can be verifiable
Cluster management has organization, introspection, specification, status, events
Demo of KSQL for querying Kubernetes. Here's the repo: https://github.com/brendandburns/ksql
Kubernetes API server enables immutability by limiting actions on containers (eg ensure code checked in)
ABAC in cluster management allows policies to control actions (eg access, deployment)
Policy ex: Only allow certain people to create containers that come from specific registries/repos
Admission control policies can control resource use (eg auto-approve if resource overage explained in issue ticket)
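The registry policy example above might look something like this as an attribute-based check. This is a plain-Python sketch of the idea only; it is not the Kubernetes ABAC policy format or admission controller API, and the user names and registry host are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    user: str
    action: str
    image: str  # e.g. "registry.example.com/team/app:1.0"

# Hypothetical policy data
ALLOWED_CREATORS = {"alice", "bob"}
TRUSTED_REGISTRIES = {"registry.example.com"}

def registry_of(image: str) -> str:
    """Return the registry host portion of an image reference."""
    return image.split("/", 1)[0]

def is_allowed(request: Request) -> bool:
    """Only certain people may create containers, and only from trusted registries."""
    if request.action != "create":
        return True  # this sketch restricts container creation only
    return (request.user in ALLOWED_CREATORS
            and registry_of(request.image) in TRUSTED_REGISTRIES)

ok = is_allowed(Request("alice", "create", "registry.example.com/team/app:1.0"))
blocked = is_allowed(Request("alice", "create", "random.example.org/app:latest"))
```

The decision depends on attributes of the request (who, what action, which image) rather than on a fixed role, which is the essence of ABAC.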
Mosquito checks for things that haven't changed in 3 months. Finds dead resources.
Or find services or machines that restart the most.
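Both kinds of insight query can be sketched over a simple inventory. The records, field names, and the 90-day threshold standing in for "3 months" are assumptions; this is not the KSQL demo or the Kubernetes API, just the shape of the questions being asked.

```python
from datetime import datetime, timedelta

def stale_resources(resources, now, max_age=timedelta(days=90)):
    """Names of resources that have not changed within max_age (possibly dead)."""
    return [r["name"] for r in resources if now - r["last_changed"] > max_age]

def most_restarted(resources, top=3):
    """Names of the resources with the highest restart counts."""
    ranked = sorted(resources, key=lambda r: r["restarts"], reverse=True)
    return [r["name"] for r in ranked[:top]]

# Hypothetical inventory
inventory = [
    {"name": "old-batch", "last_changed": datetime(2016, 1, 10), "restarts": 0},
    {"name": "web", "last_changed": datetime(2016, 5, 20), "restarts": 12},
    {"name": "cache", "last_changed": datetime(2016, 5, 1), "restarts": 3},
]
now = datetime(2016, 5, 26)

dead = stale_resources(inventory, now)
flaky = most_restarted(inventory, top=2)
```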
VMs didn't change things; containers and cluster management did.
Robots are lots cheaper; robots are safer;
A collaborative robot or cobot
KRL is Kuka Robot Language (or Kynetx Rule Language)
A 3D printer is a blind robot. Having vision is a good thing.
Big breakthrough is cloud robotics. Off load processing to the cloud. Ex: self driving cars
Big advantages: all robots learn from experiences of the others
Cloud robotics provides designing freedom, collaborative learning and application development
Manufacturing is still an area for robotics; only 10% of manufacturing is automated. Barrier: robots are hard to use
Research is another. 90% of research projects not repeatable. And the pipetting by hand for hours isn't fun.
Another: 12M people require full time care. Eg. attach robot arm to wheelchair
I made the term "production identity" up
Google systems have largely been in production for > 10yrs & are highly integrated
GOOG doesn't have all the answers, but they do have all the problems.
Solutions aren't as important as understanding how to break down and frame the problem
Question: how do we identify production services
Trend: Manual -> Automation
We have more things we're dealing with and they change more often than they have in the past.
Evolving security: (1) network problem; lock down network (2) application security both operations and code analysis
Micro segmentation: surround any piece of hardware with its own policy. chroot for your network
Does reachability imply authorization? Doesn't sound very secure
Microservices -> many connections between components. When a micro service has 100 connections, reachability doesn't cut it
Devops is therapy for large organizations
identity is a lower-level function than authn or authz
We can come up with a one-size-fits-all solution for production ID; for authn and authz, not so much
Many applications have their own idea of a user. So secret stores become key translators
GOOG has LOAF: stuff in production has an identity that is transported ambiently
SPIFFE: Secure Production Identity Framework for Everyone
SPIFFE is dial tone for identity
SPIFFE ID: urn:spiffe:example.com:alpaca-service
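A small sketch of parsing an ID of the shape shown in the note above. It assumes the ID is exactly `urn:spiffe:<trust domain>:<service>` as tweeted; the field names and the validation rules are my own, and a published SPIFFE specification may define the format differently.

```python
from typing import NamedTuple

class SpiffeId(NamedTuple):
    trust_domain: str  # hypothetical field name for the "example.com" part
    service: str       # hypothetical field name for the "alpaca-service" part

def parse_spiffe_id(value: str) -> SpiffeId:
    """Split an ID of the form urn:spiffe:<trust domain>:<service>."""
    parts = value.split(":")
    if len(parts) != 4 or parts[0] != "urn" or parts[1] != "spiffe":
        raise ValueError(f"not an ID in urn:spiffe form: {value!r}")
    _, _, trust_domain, service = parts
    if not trust_domain or not service:
        raise ValueError(f"empty component in ID: {value!r}")
    return SpiffeId(trust_domain, service)

parsed = parse_spiffe_id("urn:spiffe:example.com:alpaca-service")
```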
Developer experience: get SPIFFE ID, give it key pair with certificate chain & root certs to trust
Cert usage: TLS verification and message signing
When we talk about message signing, think JWT
SPIFFE could be integrated in micro service and RPC frameworks, in smart "side car" proxies, & off-the-shelf systems
Future directions for SPIFFE: federation, authorization, delegation, capability tokens
See https://spiffe.io for more information
speaking on pattern languages for microservices
Successful software development depends on architecture, process & organization
Organization should be small, autonomous teams
Process should be agile
There's no silver bullet for architecture (reference to Fred Brooks)
Architecture patterns are a reusable solution to a problem in a particular context
Patterns force you to consider tradeoffs: benefits, drawbacks, issues to resolve
Patterns force you to consider other patterns: alternative solutions and solutions to problems introduced by the pattern
Microservices pattern available at http://microservices.io
Infrastructure patterns include deployment and communication patterns
Core patterns include cross-cutting concerns
Application patterns include database architectures and data consistency
Monolithic architectures are relatively simple to develop, test, deploy, & scale (in certain contexts)
Problem is that successful applications keep growing; adding code day-after-day; you end up with a "ball of mud"
Monolithic architectures break process goals of agile and continuous delivery & the org goal of autonomous teams
Microservices architecture functionally decomposes app into many services intermediated by API gateways
Microservice architecture drawbacks: complexity, IPC, partial failure, TxN span multiple services; testing is hard
Issues: deployment; communication, partitioning; distributed data management
Shared databases lead to tight coupling between services; each service needs its own data store
Data store per service -> services communicating via API only
Event-driven, eventually consistent architecture is solution to data store per service downsides
dual write problems traditionally solved using TxNs. Instead must reliably publish events; use Event sourcing
2nd problem: queries are no longer easy across several services; pattern is CQRS and materialized views
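A toy sketch of the two patterns together: state changes are recorded as events (event sourcing), and a separate query-side view is kept current by subscribing to them (CQRS with a materialized view). The account example and all names are invented; a real system would publish through a durable broker and the view would lag slightly, which is the eventual consistency mentioned above.

```python
from collections import defaultdict

class EventStore:
    """Append-only log of events; appending also publishes to subscribers."""

    def __init__(self):
        self.events = []
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def append(self, event):
        # One write (the event) replaces the dual write to a database and a message bus.
        self.events.append(event)
        for handler in self.subscribers:
            handler(event)

def signed_amount(event):
    return event["amount"] if event["type"] == "deposited" else -event["amount"]

def account_balance(events, account):
    """Event sourcing: rebuild current state by replaying the log."""
    return sum(signed_amount(e) for e in events if e["account"] == account)

class BalanceView:
    """CQRS query side: a materialized view updated from events."""

    def __init__(self):
        self.balances = defaultdict(int)

    def apply(self, event):
        self.balances[event["account"]] += signed_amount(event)

store = EventStore()
view = BalanceView()
store.subscribe(view.apply)
store.append({"type": "deposited", "account": "a1", "amount": 100})
store.append({"type": "withdrawn", "account": "a1", "amount": 30})
store.append({"type": "deposited", "account": "a2", "amount": 5})
```

Queries read `view.balances` directly instead of joining across services, while the log remains the source of truth.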
There are many more patterns for deployment, communication, etc.
Connect and control #IoT devices in minutes using voice commands
Architecture uses MQTT to publish & subscribe data from device; processing in cloud; connecting HomeKit & TI devs
Learned: make devices more self-describable; allows generic UIs that devices plug into and work
Voice is the last mile in device interaction
Doing demos: monitor and control a device with voice commands
Demo using services from Bluemix services catalog
"ambient computing at your disposal"
Node-RED used to get commands from speech application; program processes keywords; sends JSON using MQTT to robot
Using an iPod touch as the HomeKit gateway; using another iPod touch as gateway for Bluetooth Spheros
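The keyword-to-JSON step might look roughly like this. The keyword table, the topic name, and the payload fields are all assumptions made up for the example; the demo's actual flow and message format were not shown in these notes. The sketch stops at building the message and leaves publishing to whatever MQTT client is in use.

```python
import json

# Hypothetical mapping from spoken keywords to robot commands
COMMANDS = {
    "forward": {"cmd": "move", "direction": "forward"},
    "back": {"cmd": "move", "direction": "back"},
    "stop": {"cmd": "stop"},
    "red": {"cmd": "color", "value": "red"},
}
TOPIC = "demo/robot/commands"  # hypothetical MQTT topic

def to_message(transcript):
    """Return (topic, JSON payload) for the first recognized keyword, else None."""
    for word in transcript.lower().split():
        word = word.strip(".,!?")
        if word in COMMANDS:
            return TOPIC, json.dumps(COMMANDS[word], sort_keys=True)
    return None

message = to_message("Please move forward now!")
```

The returned topic and payload are what would be handed to an MQTT client's publish call; unrecognized speech yields `None` so nothing is sent.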
Created composite applications from multiple device types and the IoT foundation
Capabilities unfortunately change when manufacturers send firmware updates
APIs can be great, but not always... API Ops is the answer
APIs go down; have unversioned changes; API Ops to the rescue
API Ops is like DevOps for APIs
API Ops should build, test, and deploy APIs more reliably.
API Ops and Dev Ops are similar, but different in subtle ways
We're seeing more and more stories about API failures.
API Ops: design, build, test & release APIs more rapidly, frequently, & reliably
Elephant in the room is micro services; DevOps necessary for managing all these services.
Use of API specification has exploded. So has the number of API tools
Why all the tools? The API Lifecycle. 1st gen focused on operation. 2nd gen focused on the rest
API Lifecycle: requirements, design, development, test, deployment, and operations
DevOps is about looking at entire lifecycle. API Ops is similarly focused on entire lifecycle
Going meta: APIs for API Ops
The entire API lifecycle can be controlled with APIs