In the late 1990s, the new owners of the Oakland Athletics had to cut payroll. So GM Billy Beane used statistics to discover undervalued players.
It turned out that certain players were good at getting walks or stealing bases. While that wasn't flashy, it correlated with winning games. From 2000 to 2003, Oakland reached the playoffs, and had the most wins per payroll dollar of any team.
Just a few years ago, critical decisions were made by experts whose wisdom was hard won over years of experience. Now, decisions informed by analyzing vast amounts of data are proving more effective. Increasingly, the depth, breadth, agility, and speed of data analytics are a strategic differentiator and a critical success factor.
We have the data. We have the computing resources. Yet too few of us are breaking out the champagne. What's wrong?
Our analytic engines are letting us down. They make it too hard to load new streams of data. They impose a high cost for changing modeling decisions. They are inflexible, costly, imperialistic, sluggish, and frankly, they have too much attitude.
Not that it's entirely their fault. Analytic optimizers were designed in the 1980s. They're not aware of the capabilities of your modern infrastructure. They don't leverage SIMD vector instructions. They don't treat SSDs any differently than spinning disk. They couldn't care less about CPU cache latencies or capacities.
So to conform to the limitations our engines impose on response time and resource provisioning, we're forced into hacks and shortcuts. Often we have to subset our data, running the risk of losing valuable cross-subset correlations. What a mess!
Physics Speed is an analysis engine that unlocks the power of your existing hardware to achieve performance as fast as the laws of physics permit. It takes a new approach to query processing that focuses on streaming data flows.
Where other analytic engines start with a formalism like relational algebra and optimize from there, Physics Speed starts with the actual hardware characteristics and systematically eliminates processing bottlenecks. So analysis runs as fast as the hardware allows.
How fast is that? Across several industry-standard benchmarks, Physics Speed was consistently more than ten times faster, on hardware that was much less expensive. The table at the top shows some example results on 100GB of data. That's tiny, of course, but we got tired of waiting for the unaccelerated engines to answer questions on as little as 1TB of data.
The summary table on the bottom compares price and performance across simple queries ("scan and aggregate") and complex queries. The important point is that you can dial the knob between high performance and lower cost to fit your fancy. Run on equivalent hardware and go 50 times faster. Or dial your costs down while maintaining the same performance. Or strike a balance in between.
| Example Queries (times in seconds) | Physics Speed | Greenplum | Redshift | Spark 2.0 |
|---|---|---|---|---|
| TPC-DS 28, SF100 | 3.2 | 141 | 56.5 | |
| TPC-DS 34, SF100 | 1.3 | 31.7 | 12.2 | 30 |
| TPC-DS 53, SF100 | 1.0 | 29.4 | 26.1 | 21 |
| TPC-DS 63, SF100 | 1.0 | 28.6 | 17.2 | 15 |
| TPC-DS 90, SF100 | 0.3 | 14.2 | 10.8 | |
| Physics Speed vs. | Greenplum | Redshift | Vertica | Netezza | Spark 2.0 |
|---|---|---|---|---|---|
| hardware cost | 9x cheaper | 2x cheaper | 3x cheaper | 100x cheaper | 8x cheaper |
| Scan & Agg | 40x faster | 30x faster | 80x faster | 3x faster | |
| Complex Joins | 16x faster | 10x faster | | | 20x faster |
| total advantage | 250x better | 40x better | 240x better | 300x better | 160x better |
Physics Speed was created to deliver the "AHA!" feeling that comes with sudden insight, to give flight to your curiosity, to rise above the dust and clouds, so you can see all that can be seen.
We want to sweep away the drudgery and clear the runway, because nothing should come between your ideas and your data.
So we built an ANALYTIC ENGINE that runs your experiments as fast as physics allows, and makes them as easy as pie.
Foster founded Netezza, introducing data warehouse appliances to the market. He has a B.S. and Masters in Engineering from Cornell University, an MBA from Harvard Business School, and a long track record of success.
Craig co-founded Ontologic, the first object database company to use C++ as its DML. He then created PulseTrak, which delivered a sentiment-extraction service. Craig teamed with Foster at Netezza and was previously an architect at Oracle. He has a B.A. from Dartmouth College.
Richard has long experience in algorithm and data structure optimizations, and deep hardware knowledge. He worked with Craig at Oracle. Richard has a B.A. in Math from Cambridge University and a M.Sc. in Computer Science from Edinburgh University.
What could you learn mashing up ten times the amount of data, with five times more analytic functions?
Projects finish faster, and not just because the queries run faster: you spend less time loading, tuning, and rewriting.
Unleash the power of your existing hardware, with multi-threading, AVX instructions, storage tiering and more.
Native performance means less time futzing with indexes, tablespaces, and other arcana.
Plug your own proprietary algorithms right into the data flow. We bring the data to you.
Connect with your existing tools. Access your existing data.
When you can't afford to wait any longer
Use Physics Speed and take back all the time you need
You've seen it all, and you still need more. More speed, more power, more "right now" and less "get back to you tomorrow". Ninety-nine percent of the world's analytic needs are served by systems that are good enough. Lucky you. Welcome to the 1% of people who need something 10 times faster. Not that you wouldn't take more.
A large semiconductor fabricator reaps millions of dollars in profit for every 0.1% improvement in yield per month. So figuring out what's going wrong when spinning up a new design is part of their business DNA. But with hundreds of steps, machines, and environmental factors, the retrospective cohort analysis operates over huge amounts of data and requires days of processing time for each experiment. By accelerating their pattern analysis, Physics Speed shortens their time to market.
A major telecommunications company carries most of the telephone traffic across Europe and North Africa. But they don't do it alone. Each call flows through a series of carriers, across a number of switches, with a number of segments. Physics Speed helps them stitch together the segments of a phone call to ensure that each carrier gets revenue for the portion of the call that it carries.
Your organization wants the flexibility and economics that come with operating in the cloud. You just want to avoid the fee fie foe fum. Existing applications should run the way they do now. You would even sacrifice some speed in exchange for low cost reliability, and easy dev ops.
A young customer marketing business had a whale of a client that they served on some expensive hardware. As they grew, they needed a lower-cost way to support smaller clients, without bifurcating their core product.
A mobile marketing business was already in the cloud, designed that way from day one. But as their revenues grew, their operating costs grew even faster. They had chosen a popular MPP column store for their analytics but keeping it all balanced and online was becoming prohibitively expensive. They could shut it down, but starting it up was taking longer. They needed a solution that scaled, but wanted sub-linear costs.
You like your existing ride just fine. But the data has grown faster than anyone guessed, and you're flat outta gas. You need to find a home for your older data, your newer unvetted experimental data, and the data of your smaller customers. The solution has to fit within your budget, and everything has to keep working more or less the way it does now, preferably without funding lifetime employment for an outsourced maintenance team.
A cable operator happily ran all their analytics on a Netezza TwinFin(tm) appliance. Until they ran out of space. Their budget didn't have room for another rack, and they couldn't throw away their data. They tried a Hadoop-based solution, but six months later, they were still trying to get their applications to run on the new platform. And the performance? It felt like their transmission was shot: a jerk forward, then a stall.
An investment bank had a team of quants happily using a pricey MPP appliance. Then their CIO ramped it up a notch, but they were two years away from another MPP buy. They had plenty of servers available, but no plug-compatible way to scale out while incorporating their special investment juju.
An analytic query engine accelerator that plugs into an existing DBMS, to unleash the performance potential of your hardware, so your experiments run as fast as the laws of physics permit. Of course your existing applications don't change at all, except for running faster.
The core component of Physics Speed is a shared library that plugs into an existing DBMS, including open source databases like Postgres (single-node) or Greenplum (multi-node). Physics Speed accelerates analytic queries by generating C++ code to execute portions of the query. This code is compiled and dynamically linked into the host DBMS as a "user-defined function".
Physics Speed understands how to accelerate basic operations like scans, joins, sorts, aggregations, and set operations. If more complex constructs are found, like text processing or windowed aggregations, Physics Speed falls back to the host DBMS for processing. In most cases, Physics Speed handles the whole query.
The diagram to the right shows where Physics Speed's shared library plugs into the host DBMS query stack. The generated code can be compiled with either clang or g++. The resulting shared libraries are cached so that similar queries can reuse previously compiled code.
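The generate-compile-cache pipeline can be sketched in miniature. The real engine generates C++, compiles it with clang or g++, and links it into the host DBMS as a user-defined function; the Python sketch below (with a hypothetical `compile_query` helper and an in-process cache) only illustrates the core idea: generate a specialized function per query shape, compile it once, and reuse it for similar queries.

```python
# Illustrative sketch only: the real engine emits C++ and dynamically links
# it into the host DBMS. Here we generate Python source instead.

_plan_cache = {}  # hypothetical cache keyed by the query "shape"

def compile_query(column, predicate):
    """Generate and compile a specialized scan-and-sum for one predicate."""
    key = (column, predicate)
    if key in _plan_cache:          # similar queries reuse compiled code
        return _plan_cache[key]
    src = (
        f"def q(rows, t):\n"
        f"    acc = 0\n"
        f"    for r in rows:\n"
        f"        if r['{column}'] {predicate} t:\n"
        f"            acc += r['{column}']\n"
        f"    return acc\n"
    )
    ns = {}
    exec(compile(src, "<generated>", "exec"), ns)
    _plan_cache[key] = ns["q"]
    return ns["q"]

rows = [{"price": p} for p in (5, 12, 7, 20)]
q = compile_query("price", ">")
print(q(rows, 10))   # sums the prices above the threshold: 12 + 20 = 32
```

Note how the generated function is specialized to one column and one predicate, so the hot loop contains no per-row interpretation of the query plan.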
Sometimes the compilation process can take a second or two, so Physics Speed is not a good choice for applications that need sub-second response times for "find the needle in the haystack" queries, or for very small databases.
Most database vendors are able to access other vendors' databases through a "federation" layer. For example, this layer allows an IBM DB2 system to query a Hadoop/Spark system through the DB2 SQL API. In a federated system, the remote DBMS is "wrapped" in a way that tells the host DBMS which pieces of data it contains. The host DBMS then divides up its plan for answering a question, pushing a portion of the plan to the remote system.
Physics Speed plugs into a DBMS as a virtual remote database. It claims to contain data that directly answers a portion (or the whole) of the original query. It doesn't actually contain that data. Instead, it quickly generates a C++ program that produces the results to the query, and returns it as if it had the data stored on disk all along.
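The push-down arrangement can be shown with a toy federation model. All class and method names below (`RemoteWrapper`, `Host`, `can_handle`, `execute`) are hypothetical, invented for illustration; the sketch only shows the shape of the protocol: the host asks each wrapped remote which data it claims to hold, then pushes the matching fragment of the plan down to it.

```python
# Toy federation sketch (all names hypothetical): the host DBMS pushes a
# query fragment down to whichever "remote" claims to hold the data.
class RemoteWrapper:
    def __init__(self, name, tables):
        self.name, self.tables = name, tables
    def can_handle(self, table):
        return table in self.tables
    def execute(self, table):
        return self.tables[table]

class Host:
    def __init__(self, wrappers):
        self.wrappers = wrappers
    def query(self, table):
        for w in self.wrappers:
            if w.can_handle(table):
                return w.execute(table)   # push-down to the remote system
        raise KeyError(table)

remote = RemoteWrapper("accelerator", {"clicks": [1, 2, 3]})
host = Host([remote])
print(host.query("clicks"))  # [1, 2, 3]
```

In Physics Speed's case, the "remote" is virtual: instead of fetching stored rows, `execute` would generate and run a C++ program that produces the fragment's results on the fly.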
Where does it get the actual data? In some cases, from the host DBMS itself. For systems with slower, row-oriented storage, Physics Speed also offers a highly compressed column store. Because Physics Speed operates directly against its compressed data, this typically results in an additional 10x performance boost.
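Operating directly on compressed data is the key trick. As a minimal sketch (run-length encoding is just one of several compression schemes a column store might use), note that the sum below does one multiply per run rather than one add per row, so the work shrinks with the compressed size, not the raw size:

```python
# Run-length-encode a column, then aggregate WITHOUT decompressing it.
def rle_encode(values):
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new [value, count] run
    return runs

def rle_sum(runs):
    # one multiply-add per run, not one add per row
    return sum(v * n for v, n in runs)

col = [3] * 1000 + [7] * 500          # 1500 rows...
runs = rle_encode(col)
print(len(runs), rle_sum(runs))       # ...but only 2 runs; sum is 6500
```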
The Performance Engine parallelizes your analytics over the cores in a processor, and across servers. As your data and processing needs expand and contract, you can add or remove servers to maintain acceptable cost and performance. The engine takes care of moving code and data across the network to balance resource usage. So you get the linear scalability you expect from an MPP appliance with the flexible elasticity you have with a map-reduce architecture.
If you already have a system, such as Greenplum, which provides parallel query processing, then the Physics Speed Performance Engine plugs right in and uses the existing system's data shuffling mechanisms.
With the Performance Engine, you don't have to guess your peak load years in advance, paying for more resources than you need most of the time, or risk falling short just when the going gets tough. Instead, you can pay for what you're using right now. And then ramp up or down as workloads change.
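The parallelization pattern behind all of this is partition, aggregate partials, then merge. The sketch below uses Python threads purely to show the shape of that pattern (the real engine spreads partitions across CPU cores and servers, which threads in CPython do not meaningfully do for CPU-bound work):

```python
# Partition-aggregate-merge sketch of intra-query parallelism.
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, workers=4):
    """Partition the data, aggregate each chunk, then merge the partials."""
    chunk = (len(data) + workers - 1) // workers
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        partials = list(ex.map(sum, parts))   # one partial per partition
    return sum(partials)                      # merge step

print(parallel_sum(list(range(1, 101))))      # 5050
```

Because each partial is independent, adding or removing workers (or servers) changes only the partitioning, not the answer, which is what makes elastic scaling possible.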
Some types of analytics don't fit easily into an existing DBMS. They might incorporate an element of chance, or have inputs from a complex event stream. An example might be a Monte Carlo simulation of a stochastic process involving geometric Brownian Motion, where a state at a given time involves both deterministic and probabilistic factors. These kinds of stochastic processes arise when assessing risk, and have applications in project management, banking, option value assessment, insurance and warranty claims.
PathScope uses the underlying capabilities of the Physics Speed Performance Engine to operate on streaming data flows, with:
It combines these with stochastic features, including efficient, parallelized generation of random numbers conforming to a given probability distribution.
PathScope can generate Monte Carlo paths in parallel and evaluate multiple grouped aggregators in parallel at 210 million timesteps per second on a 16-core processor. That's about 1 trillion time/risk points in 1.3 hours on a single processor.
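For readers unfamiliar with the workload, here is what one of these simulations computes, in miniature. The sketch below (not PathScope code) simulates geometric Brownian motion paths with the standard discretization S(t+dt) = S(t) * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z); each path is independent, which is why the engine can farm them out across cores:

```python
# Minimal Monte Carlo sketch of geometric Brownian motion (illustrative;
# parameter values below are arbitrary examples).
import math
import random

def gbm_finals(s0, mu, sigma, dt, steps, n_paths, seed=42):
    """Simulate n_paths GBM paths and return their terminal values."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        s = s0
        for _ in range(steps):
            z = rng.gauss(0.0, 1.0)   # the probabilistic factor
            # deterministic drift plus random diffusion, per timestep
            s *= math.exp((mu - 0.5 * sigma ** 2) * dt
                          + sigma * math.sqrt(dt) * z)
        finals.append(s)
    return finals

finals = gbm_finals(100.0, 0.05, 0.2, 1 / 252, 252, 2000)
est = sum(finals) / len(finals)
# For T = 1 year, theory says E[S_T] = s0 * exp(mu) ~ 105.13; the sample
# mean `est` should land close to that.
```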
Most database optimizers are written without regard for modern hardware. With few exceptions, they ignore SIMD vector processing, and the 200X difference between L1 cache and main memory. Only a few DBs use multiple cores for a single query. These techniques, along with compressed columnar storage and code generation, are "table stakes" for analytic processing.
Physics Speed goes beyond these basics with new techniques for primitive query operations, and dynamically changing execution plans as processing occurs. But the main thing is a different way of thinking. Instead of asking how to do things faster, we start with the theoretically optimal, speed-of-light approach.
Code Generation and Compilation
Stream through L1 Cache
Multicore Intra-query parallelism
For large scans and aggregations
price × performance for TPC-H-like queries
based on TwinFin list pricing
Plugs right in
Modern CPUs support SIMD processing, which executes a single instruction simultaneously across all elements in a vector register. The latest Intel CPUs allow as many as 8 (AVX2/Haswell) or 16 (AVX-512/Skylake) 32-bit elements in a vector. This means arithmetic can be performed on many numeric values simultaneously providing as much as a 16x performance boost ... BUT ONLY when processing vectors of attribute values.
In terms of analytic database processing, this means there's a huge penalty to operating on whole records, and a huge advantage to operating on column attributes across multiple records.
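The layout difference is easy to picture. The pure-Python sketch below computes the same total both ways; in Python the two loops run at similar speed, but in C++ the column-oriented version is the one an auto-vectorizer can turn into SIMD code, because it streams over homogeneous, contiguous arrays:

```python
# Record-oriented vs column-oriented layout for the same computation.
records = [{"qty": q, "price": p}
           for q, p in zip(range(4), (10, 20, 30, 40))]

# Row-oriented: every iteration touches a whole heterogeneous record.
row_total = sum(r["qty"] * r["price"] for r in records)

# Column-oriented: tight loops over homogeneous arrays of one attribute,
# the shape SIMD hardware wants.
qty = [0, 1, 2, 3]
price = [10, 20, 30, 40]
col_total = sum(q * p for q, p in zip(qty, price))

print(row_total, col_total)   # both 200
```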
Most analytic systems support columnar storage these days. But too often, they recombine columns into records too early, as data flows from storage into RAM. This fills a cache line with information that might be relevant to the very next instruction but is irrelevant to the current one, and so loses vector parallelism.
Even systems that are careful about staying columnar can lose the potential for vector processing. Most analytic engines interpret a tree of analysis operators. The top level operator asks its children for one row of input, whereupon the children ask their children for one row of input, and on down the line, until some kind of scan operator reads one record from a file system buffer. Then each operator transforms its input into one row of output, resulting in the next query result. This is called "demand pull."
The advantage of demand pull systems is that they can avoid needless work. But as you can imagine, with all the context switching between operators, demand-pull systems lose the opportunity for processing multiple vector elements at the same time.
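The two execution styles can be contrasted directly. Below, the first pipeline is the classic demand-pull operator tree (each operator pulls one row at a time from its child), and the second processes a whole batch per call; both produce the same answer, but the batch version keeps its inner loop running over a contiguous chunk, which is where vector parallelism lives:

```python
# Demand-pull: a tree of operators, each yielding ONE row per request.
def scan(rows):
    for r in rows:
        yield r

def filt(child, pred):
    for r in child:
        if pred(r):
            yield r

def agg_sum(child):
    acc = 0
    for r in child:
        acc += r
    return acc

# Batch-at-a-time: operators exchange whole chunks, so the inner loop
# runs over a contiguous vector instead of context-switching per row.
def batch_sum(column, pred, batch=1024):
    acc = 0
    for i in range(0, len(column), batch):
        chunk = column[i:i + batch]
        acc += sum(v for v in chunk if pred(v))
    return acc

data = list(range(10))
is_even = lambda v: v % 2 == 0
print(agg_sum(filt(scan(data), is_even)), batch_sum(data, is_even))  # 20 20
```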
Up until 2000 or so, analytic query processing was relatively simple. Computers consisted of CPUs, RAM and spinning disk. RAM was expensive and spinning disk was about 10,000 times slower than anything else, so optimizing analytic performance was almost entirely about optimizing getting information off of disk.
But things are different now! SSDs are larger, cheaper, and capable of using the faster PCIe bus at over 2GB/sec bandwidth. Soon, non-volatile 3D XPoint memory in NVDIMMs will allow even higher bandwidth with sub-microsecond latency. Cloud object storage offers enhanced durability and availability at low cost, but with high access latency. With these different tiers of storage with widely varying cost, latency, and throughput, the analytic query engine has many new opportunities to balance query performance and system cost by managing the flow of data between storage tiers.
And it doesn't stop there: CPU caches are growing in size, with lower latencies, but there's still more than a 10x difference between L2 cache and RAM, and a 5x difference between L3 cache and RAM.
All of this means that the location of data matters, up and down the stack. Hash tables and bucket sorts that fit in L2 or L3 cache are much faster than those in DRAM. Within a record, some columns may be accessed frequently and others rarely. Even within a column, values for some records (e.g. recent records) may be accessed more frequently than for other records. Physics Speed supports locating data at the optimum location in the storage hierarchy to provide the best balance between performance and cost.
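One concrete consequence: a grouped aggregation can be partitioned so that each partial hash table stays small. The sketch below (a hypothetical `partitioned_group_sum` helper, not engine code) routes each key to one of several small tables; in a real engine the partition count would be chosen so each table fits in L2 or L3 cache:

```python
# Partition a grouped aggregation so each hash table stays small enough
# (in a real engine: small enough to fit in CPU cache).
def partitioned_group_sum(keys, vals, n_parts=4):
    parts = [dict() for _ in range(n_parts)]
    for k, v in zip(keys, vals):
        t = parts[hash(k) % n_parts]    # route each key to one small table
        t[k] = t.get(k, 0) + v
    merged = {}
    for t in parts:
        merged.update(t)                # partitions are disjoint by key
    return merged

print(partitioned_group_sum(["a", "b", "a", "c"], [1, 2, 3, 4]))
```

Because a given key always hashes to the same partition, the merge step is a simple disjoint union, with no cross-partition collision handling.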
With long experience helping deliver analytic solutions, we can help in many ways.
It all starts with you. Your goals, your data, and your infrastructure. We can provide tools, advice on resource sizing, and a path to success.
Physics Speed can be deployed on premise or in the cloud. Either way, it plugs into your database and works with your data. We can help you hook it all together.
Specify which data you want to accelerate. We'll analyze a sample and suggest how it could be optimally compressed and partitioned.
Is cleansing, extracting and loading data really the best use of your time? You can give us the grunt work so that you can focus on creating value.
With Physics Speed, you can plug your existing algorithms directly into the data flow, processing it as it streams by. We're happy to connect it up for you.
We can help you track query performance, adjust compression and storage tiering, and administer resources.
Tell us how to get in touch with you, and a little about your current environment.