Good introductory video. Not too many technical details, though.
Are compute node SQL Server instances running the same code as the coordinator? Doesn't sound like they need to.
Is data auto partitioning going to be supported?
How does Madison compare to Oracle's Exadata?
What kind of storage (row oriented, column oriented) is used for compute nodes?
The coordinator still seems like a potential bottleneck: if 150 compute nodes start streaming back to the coordinator on a poorly scoped query, there is still a good chance of flooding it with data. Are there any provisions for scaling out the coordinator, or is it vertical scaling for now?
Really looking forward to more videos on Madison (with a bit more detail on the internals).
Is there anything that can be used when binaries come neither from the internal dev team nor from a major vendor like Microsoft, and no .pdbs are immediately available? Is it possible to blindly search for a sequence of machine code instructions (naive signature matching)? Or is the /GS-injected code in that case "optimized out beyond recognition"?
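For what it's worth, the naive signature matching mentioned above is easy to sketch: scan the raw bytes of the image for a known instruction sequence, with wildcard positions for addresses and immediates that vary between builds. The pattern below is a generic x86 function prologue used purely as a placeholder, not an actual /GS signature:

```python
# Naive signature matching: scan raw binary data for a byte pattern.
# None entries in the pattern act as wildcards (relocated addresses,
# immediates, etc.). The prologue pattern below is just an illustration.

def find_signature(data: bytes, pattern: list) -> list:
    """Return all offsets where `pattern` matches `data`."""
    hits = []
    plen = len(pattern)
    for i in range(len(data) - plen + 1):
        if all(p is None or data[i + j] == p for j, p in enumerate(pattern)):
            hits.append(i)
    return hits

# Example: push ebp / mov ebp, esp followed by four wildcard bytes.
prologue = [0x55, 0x8B, 0xEC, None, None, None, None]
blob = bytes([0x90, 0x90, 0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x20, 0xC3])
print(find_signature(blob, prologue))  # -> [2]
```

Of course, once the optimizer reorders or inlines code, fixed-sequence matching gets fragile fast, which is exactly the concern raised above.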
Good video. It's been quite a while since I had to write anything in C/C++, so excuse my ignorance.
But is there a utility to quickly check whether an executable or a DLL contains modules that were compiled with the /GS (or /GS++) flag? In other words, to do some sort of static analysis of program binaries, to have at least some level of confidence that they were hardened against buffer overflows?
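One rough heuristic (an assumption on my part, not a proper tool): the MSVC runtime initializes `__security_cookie` to a fixed default value (0xBB40E64E on 32-bit builds), and that constant ends up in the image's data section. Finding it suggests the /GS cookie machinery is present; not finding it proves nothing. A minimal sketch:

```python
import struct

# Heuristic /GS detection: look for the default 32-bit MSVC
# __security_cookie initializer (0xBB40E64E, little-endian) anywhere
# in the image. This is a crude heuristic, not real static analysis --
# a match only hints that the /GS cookie machinery was linked in.

DEFAULT_COOKIE_X86 = struct.pack("<I", 0xBB40E64E)

def looks_gs_protected(path: str) -> bool:
    with open(path, "rb") as f:
        return DEFAULT_COOKIE_X86 in f.read()
```

A real answer would parse the PE sections properly and check per-function, which is presumably what a dedicated analysis tool would do.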
Charles, close to the end Andrew mentioned that the SQL team is working on putting more CLR into SQL Server.
Is there a chance to interview anyone on the SQL team about that? Are we to expect T-SQL (finally) executing on top of the CLR? After all, what's the point of having two VMs on a database server doing essentially the same type of work? And are we (finally) to see a decent programming language that can be used for rich data processing inside the database server (code near), instead of the (frankly, utterly outdated and feature-deprived) T-SQL? Something like F# with LINQ going directly to the relational engine (bypassing SQL altogether) would be a blessing; after all, a relational engine does not have to be SQL.
Regarding lazily evaluated methods, I guess what you are really asking for is an object pipeline at the VM or even OS level that would be thread safe and allow multiple producers and consumers to plug into it. It would presumably address quite a few scenarios of massive and/or parallel data processing. And yes, it would be nice if all the interfaces that can potentially return lots and lots of data were "pipeline"-aware.
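A user-mode approximation of such a pipeline is straightforward to sketch: a bounded thread-safe queue with several producers and consumers attached. The bound gives backpressure, so a runaway producer can't flood memory with data. All names here are made up for illustration:

```python
import threading, queue

# Thread-safe object pipeline sketch: multiple producers push into a
# bounded queue; multiple consumers drain it. One sentinel per consumer
# signals shutdown. The bounded size provides backpressure: put() blocks
# when the queue is full, throttling producers.

SENTINEL = object()

def producer(q, items):
    for item in items:
        q.put(item)                # blocks when full (backpressure)

def consumer(q, out, lock):
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        with lock:
            out.append(item * 2)   # stand-in for real processing

def run_pipeline(batches, n_consumers=2, maxsize=4):
    q = queue.Queue(maxsize=maxsize)
    out, lock = [], threading.Lock()
    consumers = [threading.Thread(target=consumer, args=(q, out, lock))
                 for _ in range(n_consumers)]
    producers = [threading.Thread(target=producer, args=(q, batch))
                 for batch in batches]
    for t in consumers + producers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        q.put(SENTINEL)            # one sentinel per consumer
    for t in consumers:
        t.join()
    return out

print(sorted(run_pipeline([[1, 2, 3], [4, 5]])))  # -> [2, 4, 6, 8, 10]
```

Making such a pipeline first-class at the VM or OS level, as suggested above, would mean every data-heavy interface could stream through it instead of materializing results.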
Too bad it's called SQL Services. If it's not SQL Server (which is presumably a good thing), it would make sense to drop the SQL moniker altogether and name it something like Cloud Database Services. If it does offer a SQL query interface, that's fine, but that doesn't mean SQL should be part of the name. If it offered a solid relational-calculus DSL and no SQL support, no one would complain.
Since you guys are building a new database service anyway, why not take it a bit further and allow defining entities and the relations between them as predicates in a semi-natural language? Relvars are predicates, according to relational database theory. Why not preserve a relvar's original meaning in the database metadata? That would be a competitive advantage.
PS: What's that S-LINK thing, anyway? Is it a subset of LINQ or a superset thereof?