Chris Anderson - Talking shop about Avalon

This isn't the CLR. In our world, we compile the entire MSIL for the kernel into x86 instructions at installation time. There is no libc at the bottom.
However, we do have some assembly code around. Like a kernel written in C, our C# kernel needs assembly code to handle the lowest part of the interrupt dispatch on the x86. But once the assembly code has finished, it dispatches directly into compiled C# (no C). BTW, there is some C code in the system, primarily for the debugger stub.
Beer28 wrote:comment withdrawn.
Beer28 wrote:I just finished it, at the end Charles commented on the OS having a webserver asking if it "parsed html and stuff." (52:10)
A web server reads the file off the disk (the document named in the HTTP request line), optionally hands it to registered interpreter functions loaded as modules in its process address space (like mod_php, mod_perl, or even mod_mono), or does CGI piping (the older style) to an interpreter process, then takes that output and sends it back down the TCP connection with send(socket, ...);
An HTTP server doesn't do any kind of document processing on its own; it's the browser on the client side that parses the document and sets it up for drawing to the client area. That's why the guy came back right away and said "http".
Manip wrote:It was just lots of edits as he watched the video, very messy, no loss.
Manip wrote:You know you come across a lot better in person than you do when posting online.
Manip wrote:hmm ironic you edited the above post, what were you saying about living with what you post again?
Ahh, but I defended my ego by calling you names and changed the meaning of the post. All just by adding text. So a very fine line indeed.
PS - I don't think you're a jackass.
Charles wrote:
Manip wrote: It was just lots of edits as he watched the video, very messy, no loss.
I don't like the practice of removing what you post because you decide you don't want it to be there any more. Live with what you say. Have some onions and accept it when you look like a fool or whatever. Hey, I do!
C
Beer28 wrote:
if he does have a copy he's welcome to put it back. I don't care. I deleted it so nobody would get confused by my using the post as a whiteboard.
From what I gather from the video, they used a cut-down version of the CLR, except for the low-level assembly stuff.
They have the GC and strong typing at the kernel level. Jim said that he did away with the stuff they didn't need, like the different ways to compare strings in the different languages (is there a difference between comparing strings in different languages? a string is a string??). There is also no JIT, so it would be like running NGEN on the OS.
Java had a similar go at creating a Java OS desktop some time ago. Anyone know how that went?
I like the idea of this OS, but it won't replace Windows anytime soon. It could replace things like Windows CE / SmartPhone, etc. (aren't those all OSes?)
EDIT: NGEN pre-JITs your code base
Beer28 wrote:
Buzza wrote: They have the GC and strong typing at the kernel level. Jim said that he did away with the stuff they didn't need, like the different ways to compare strings in the different languages (is there a difference between comparing strings in different languages? a string is a string??). There is also no JIT, so it would be like running NGEN on the OS.
EDIT: NGEN pre-JITs your code base
galenh wrote: This isn't the CLR. In our world, we compile entire MSIL for the kernel into x86 instructions at installation time. There is no libc at the bottom.
He says there's no CLR. The CLR would need a real kernel and libc anyway. It sounds like they're compiling the C# code to native instructions the way a C compiler would, with a library it uses to do I/O and memory access.
What do you think Buzza?
staceyw wrote:There are 2 large differences. You can compile C code without importing anything, thus making it independent. The 2nd difference is that you can statically compile in both the C and C++ runtimes.
I see no practical difference here. If you can do it in C, you can do it in any language, as long as you have the compiler support. The language itself does not matter. You could do it in Perl, as long as you had the support to convert the MSIL into native code to run on the bare metal (which is what Bartok does, AFAICT).
Beer28 wrote:The huge difference between C, C++, Perl, and java || C# is that the latter 2 have memory management through a garbage collector and cannot directly access memory pointers.
Perl5 uses reference counting. Perl6 will use Parrot.
If the compiler links all the runtime support as one huge linkable monolith binary, then fine; that's what gcj does, except its runtime is shared and not usually statically linked. In a kernel, though, you'd have to link it statically.
Case in point: one of the people in the video responded and wrote that in libc you have functions written in C, but in their kernel you have the libc-equivalent functions written in C#.
How are you going to write a malloc routine in C#?
You have to allocate memory at some point in a kernel; you have to have some I/O and communication with devices through shared memory. You have to use memory addresses, and you have to load the IVT with the entry points of your interrupt handlers.
C# won't let you manipulate memory pointers. There has to be something else there. That big runtime Bartok links your C# to must be full of C code.
C# can't do these things because of the language, so yes, it differs from C/C++ and Perl.
EDIT: Or say you get an interrupt and, god forbid, your x86-compiled C# handler must pull values out of registers to handle it. C# has no such functionality, and you can't drop to asm.
Beer28 wrote:
How can you swap memory for I/O with a device, for instance, if C# won't let you out of the sandbox to handle memory? The video says there is no underlying Win32 API, so there's no P/Invoke there to do that.
Man... I can't decide whether I want to drool over Xbox 360 or over Singularity. I'm speechless.
I wonder if this sort of thing might create incentive to make more .NET hardware? I'm no doubt pushing it here, but it would be amazing to see, say, a processor that takes in IL as its machine language. I doubt there'd be very much of an advantage to such a thing (it'd be ridiculous to manufacture cost-effectively I'm sure), but it's something to daydream about.
And yeah, so what if a lot of the code actually turns out to be C or machine code? So what if it's 90% managed rather than the 99% that they say? What does it prove? What are you trying to prove?
reinux wrote:Man... I can't decide whether I want to drool over Xbox 360 or over Singularity. I'm speechless.
I wonder if this sort of thing might create incentive to make more .NET hardware? I'm no doubt pushing it here, but it would be amazing to see, say, a processor that takes in IL as its machine language. I doubt there'd be very much of an advantage to such a thing (it'd be ridiculous to manufacture cost-effectively I'm sure), but it's something to daydream about.
And yeah, so what if a lot of the code actually turns out to be C or machine code? So what if it's 90% managed rather than the 99% that they say? What does it prove? What are you trying to prove?
Beer28 wrote:
Wouldn't that mean that the bulk of the kernel is that dependency-free mini-CLR runtime binary that Bartok links the code to? What's that written in?
Beer28 wrote:
As for the stack, the stack is more than just an area of memory for pushing and popping; it works with the CPU's registers for instructions like push and pop, and call and ret.
Beer28 wrote:
Buzza wrote: registers / stack / etc are so low level - last time I touched them was when i was doing control electronics on a 68HC11.
I have a Motorola 68HC11 test board I want to get rid of. If you're feeling nostalgic and you want a good clean one for twenty bucks, let me know.
Beer28 wrote:
If the CLR runtime that the kernel is linked to when Bartok compiles the C# code is written IN C#, what code is managing the GC of the CLR, if the CLR runtime itself is C#?
What is the CLR linked to as a runtime?
C# code can't exist on its own, because of its dependency on memory management and its inability to reach into memory. It's not like you can compile C#, within the confines of the language, for an 8051 chip and load it the way you can with C and SDCC.
The language itself requires a runtime for GC and memory management. That's like saying you're going to write an 8051 (or any MCU/CPU) OS in Visual Basic when the msvbvm is also written in VB. It doesn't compute.
You've basically described it. Basically all of the runtime is written in C#. The GC itself is written in C# using the "unsafe" extensions.
The GC gets all of its memory from a very simple page manager at the very bottom of the system.
The page manager and GC are written carefully so that they don't require any GC'd memory.
Beer28 wrote:
So why doesn't visual studio .NET have x86/64 as a C# compile target?
Beer28 wrote:
Are they dependency-free, like the Bartok-compiled C# binaries they're doing the kernel with, if you don't ever use anything but stack-local vars?
galenh wrote:Like the OS, the compiler is a research prototype. We use it to try out new ideas.
Beer28, AFAIK, only Bartok does that.
Beer28 wrote:so there's no heap when bartok compiles C# to x86?
it's all .bss and .data type reserved memory?
The kernel has its own C library with a few functions, but they have no GC dependency or other dependencies.
Even x86-compiled C# from NGEN has the GC dependency and the runtime dependencies. It's still linked to all the CLR imports.
Maybe it compiles the C# to x86 and does all allocations as hard-allocated initialized or uninitialized reserved data, instead of using a heap and reclaiming memory after use?
What I think you need to do is forget about malloc and free and look at what these functions do.
New is part of the framework, and free is implemented via the GC. As for what the functions do, they are extremely simple and most likely implemented in assembly; remember, they said they dropped down to assembly for parts.
I found this site about another OO language called Oberon:
http://www.oberon.ethz.ch/native/
There they discuss memory management in another kind of OS, Oberon, which is also built on an OO language.
http://www.oberon.ethz.ch/native/WebHeap.html
They discuss memory allocation. This is most likely what these guys did.
Beer28 wrote:
Like C++ or Java, new is part of the language; it's a keyword. malloc is not. Heap allocation and freeing are actually embedded in the language specs.
Beer28 wrote:
Why not "new" the memory for the map table?
Beer28 wrote:Say you are writing the createpagetable() function and you call new something(); what's going to happen?
Where's it going to allocate the object?
Again, I don't think new would be implemented in C#, but probably in low-level assembler, same as the GC?
Beer28 wrote:
As a matter of fact, the implementation of anything in the C library is not defined at all. It doesn't matter how it's implemented, as long as it conforms to the standard and those functions are linkable from C code. That's why C is portable and works on everything.
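Exactly: the standard only pins down malloc's contract, not its mechanism. As one illustration (not how any particular kernel actually does it), a freestanding system could satisfy that contract with a crude first-fit allocator over a static arena; my_malloc and my_free are invented names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A minimal first-fit allocator over a static arena: one of many
 * ways a freestanding kernel could implement the malloc contract. */
#define ARENA_SIZE 65536

typedef struct block {
    size_t size;
    int free;
    struct block *next;
} block_t;

/* long long backing store just to guarantee alignment */
static long long arena[ARENA_SIZE / sizeof(long long)];
static block_t *head;

static void *my_malloc(size_t size)
{
    if (!head) {                       /* lazy init: one big free block */
        head = (block_t *)arena;
        head->size = sizeof arena - sizeof(block_t);
        head->free = 1;
        head->next = NULL;
    }
    for (block_t *b = head; b; b = b->next) {
        if (b->free && b->size >= size) {
            b->free = 0;               /* first fit; no splitting, for brevity */
            return b + 1;              /* payload starts after the header */
        }
    }
    return NULL;
}

static void my_free(void *p)
{
    if (p) ((block_t *)p - 1)->free = 1;
}

int main(void)
{
    void *a = my_malloc(100);
    my_free(a);
    void *b = my_malloc(50);           /* first fit reuses the freed block */
    assert(a && b == a);
    puts("ok");
    return 0;
}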
I could be wrong, but I believe new and relying on auto-destruction are part of C# and not the CLI.
unsafe static void ZeroSomeRam()
{
    int* pX = (int*)0x0000000A; // C# requires an explicit cast from integer to pointer
    for (int i = 0; i < 100; i++)
    {
        *pX = 0;
        pX++;
    }
}
Beer28 wrote:
There is the C language, then there's the C library.
Could you please post more information about your project? The information posted on
https://research.microsoft.com/os/singularity/ is not very much.
I agree. This could be by design, but the paper and PPT seem to target high-level people, not technical people interested in the design.
Here, maybe this helps. Code in C#:
using System;
class Class1
{
static void Main()
{
Console.Write('.');
Class1 c1 = new Class1();
Console.Write('.');
}
}
Compiled into IL:
.method private hidebysig static void Main() cil managed
{
.entrypoint
// Code Size: 21 byte(s)
.maxstack 1
.locals (
Test167.Class1 class1)
L_0000: ldc.i4.s 46
L_0002: call void [mscorlib]System.Console::Write(char)
L_0007: newobj instance void Test167.Class1::.ctor()
L_000c: stloc.0
L_000d: ldc.i4.s 46
L_000f: call void [mscorlib]System.Console::Write(char)
L_0014: ret
}
And finally, x86 instructions as disassembled by the VS.NET debugger:
static void Main()
{
Console.Write('.');
push ebp
mov ebp,esp
push eax
push edi
push esi
xor edi,edi
mov ecx,2Eh
call dword ptr ds:[79C566A8h]
Class1 c1 = new Class1();
mov ecx,0AD5098h
call FDBE1FC0
mov esi,eax
mov ecx,esi
call dword ptr ds:[00AD50D4h]
mov edi,esi
Console.Write('.');
mov ecx,2Eh
call dword ptr ds:[79C566A8h]
}
nop
pop esi
pop edi
mov esp,ebp
pop ebp
ret
I think if you look at byte 7 of the IL, you can see that it's neither C# nor the IL that decides how objects are created. Since for Singularity they're using their own .NET runtime, making "newobj" do something other than what the normal .NET JITter would do is probably as easy as reimplementing malloc in C. At least, I don't see why it wouldn't be.
Manip wrote:It might not be a speed demon but it is what I would like running on an ATM machine I am using or on the hospital monitor equipment.
Beer28 wrote:
Basically, are those allocation functions imported from the CLR?
Beer28 wrote:
I think the compiler handles it after all this. I think the compiler, Bartok, does special heap allocations as .bss and .data (initialized and uninitialized reserved memory) for the C# that the page manager and similar code compile from, which cannot be GC'd, because I can't figure out how else it could be done.
Beer28 wrote:int* pX = (int*)0x0000000A;
*pX = 0;
Doesn't that defeat the purpose of managed code?
<quote>If it has data structures that need memory reserved, the programmer simply decides where to put them and hey presto it's done - that memory is allocated.</quote>
You mean like the .bss and .data segments of ELF or PE. Yeah, I've been writing that over and over again in this whole thread. You do the same on embedded MCUs.
But as long as you're doing this, doing stack-only programming and allocating the memory yourself in raw RAM, wouldn't it be better to do it in C, where there's no OO and no huge class data structures to push onto the stack and into reserved memory?
Wouldn't you rather put task structs into a linked list than a managed array?
I mean, think about how often a scheduler goes through a linked task-struct list. Would you really want the compiler to generate all the extra type-safety instructions for critical operations that execute hundreds of times a second?
Beer28 wrote:
Wouldn't you rather put task structs into a linked list than a managed array?
I mean, think about how often a scheduler goes through a linked task struct list. Would you really want the compiler to generate all the extra type safe instructions for critical operations that execute hundreds of times a second?
Sorry to jump in so late, but I was traveling and only got around to reading these messages.
This thread has a lot of different issues intertwined. Let me try to clear up some of the most common confusion. Galen and I would be happy to answer questions about Singularity.
1. When are source/binary/more papers going to be available?
We'll put papers on our website (https://research.microsoft.com/os/singularity/) as we finish the final versions of them. It's a quaint academic tradition to ship no paper before its time.
Code and/or a running system is further in the future. We thought about it, but there are a lot of reasons why we're not ready (not the least of which is that the system is still very barebones, not useful to anyone but us, and in a state of rapid flux). Releasing code entails a lot of work on our part and at least a commitment to answer questions, so it isn't something we'll do until we are good and ready.
2. What compiler do you use?
As several people noted, we use the Bartok compiler and runtime from the ACT group in MSR (https://research.microsoft.com/act/). It is a highly optimizing compiler that compiles MSIL down to x86 code.
It comes with a runtime system written entirely in C#, though parts of it, most notably the garbage collector (GC), are unsafe C#. (It is an open research challenge to write a real GC in a type-safe language.)
Bartok is a very high quality compiler that produces good code, but it is a research prototype. It doesn't handle exactly the same language as MS's product compilers (e.g. no reflection) and isn't ready for widespread use. Don't ask when it will be shipped, since it isn't going to be. If you wonder why, say "research prototype" 10 times fast and you'll have the reason.
3. How do you do xxx in C#?
A couple things to note. Everything in Singularity is written in safe managed code (C#), except the kernel. This includes device drivers, system components, applications, etc. The kernel, since it implements the memory system and scheduler and manages devices, is pretty low-level code and is primarily written in safe C#, though there are parts written in unsafe C# and a HAL written in C++.
Also note that we own the compiler and can control the code that it generates. Using an off-the-shelf compiler would introduce a lot of difficulties in predicting exactly what code would be generated in different situations. This is not fundamental, but rather a big convenience.
And yes, you too can write a good part of your run-time system in safe code. Look at a library sometime. Most of it is pretty simple data manipulation that can be written in any language. There are a few tricky parts where the unsafe subset of C#, or its equivalent, is essential. The key is to factor your system so these parts live in the kernel, with a safe interface, or are inserted by your compiler.
4. Didn't JavaOS do this?
Not really. JavaOS is just a simple run-time system between the JVM and bare hardware, which provided a bare minimum of services on the hardware to run the JVM.
Singularity is much closer to the JX project in many respects. You might want to take a look at their paper to understand some of what we are doing:
Golm, M., Felser, M., Wawersich, C. and Kleinoeder, J. The JX Operating System. in Proceedings of the USENIX 2002 Annual Conference, Monterey, CA, 2002, 45-58.
Galen and I would be happy to answer questions on Singularity. It would be a lot easier to reply if there was a thread for each topic of discussion, rather than messages containing a collection of unrelated questions.
Thanks a lot for your interest in Singularity!
larus wrote:Sorry to jump in so late, but I was traveling and only got around to reading these messages.
This thread has a lot of different issues intertwined. Let me try to clear up some of the most common confusion. Galen and I would be happy to answer questions about Singularity.
...
Galen and I would be happy to answer questions on Singularity. It would be a lot easier to reply if there was a thread for each topic of discussion, rather than messages containing a collection of unrelated questions.
Thanks a lot for your interest in Singularity!
If you'd like a threaded way to communicate, there is the MS forum at betanews.microsoft.com, or you could open one on Google Groups.
Great project, I hope you produce something that we can install and play with (perhaps shared source initiative). Also can you envision a device-driver oriented language?
AndyC wrote:C9 isn't about PR and marketroid stuff.
Beer28 wrote:
larus wrote: an unsound assumption that a program isn't violating language semantics with dirty tricks, such as converting an integer to a pointer.
I never thought of using pointers as integers as a dirty trick...
On x86 they're the same width, so in the compiled code and asm it doesn't really matter which type is which. I guess the same is true on 64-bit with long.
Do you have any plans for embedded systems: 8051/2, ARM?
What about just ARM? I don't foresee this going to 8 or 16 bit any time soon.
Beer28 wrote:
Buzza wrote:
I class it as a dirty trick because a pointer is just that: it points to data, but it is not the actual data; the data is what sits at the pointed-to memory location. An integer is data, not a pointer. Keep the 2 separate.
In assembly it doesn't matter; C is shorthand for assembly.
It's 32 bits either way. All those types in windef.h are typedefs; they are not compiler standard types: WORD, DWORD, WPARAM, all that stuff.
A pointer to a type and an int are compiler types; they happen to be the same size on x86, and long and * are the same size on x86_64.
That's not cheating, that's programming. If you ever do small MCUs with limited address space, you will quickly see that you need addressability.
Managed code like Java is a whole other story. You can't say C or C++ code is dirty because it doesn't behave like Java. It's not supposed to.
Beer28 wrote:
Also, you can write out a function at runtime, like a JIT, and call that new entry point. Say, on Windows, you VirtualAlloc pages with +rwx and write out some opcodes; you can then cast that address to a function pointer with the right params and return type, then call funcptr(arg1, arg2);
So in other words, you can do dynamic code and call it after marking the heap pages executable.
Beer28 wrote:
There are just tons of great uses for addressable memory. C# and Java are pretty limited in my opinion. It sounds like the Microsoft guys may be using some dirty tricks with that "bartok" compiler though
an unsound assumption that a program isn't violating language semantics with dirty tricks, such as converting an integer to a pointer.
I never thought of using pointers as integers as a dirty trick...
On x86 they're the same width, so in the compiled code and asm it doesn't really matter which type is which. I guess the same is true on 64-bit with long.
Do you have any plans for embedded systems: 8051/2, ARM?
What about just ARM? I don't foresee this going to 8 or 16 bit any time soon.
These are good questions, and the answers probably aren't clear from the video.
1. We don't have DLLs, in the sense of dynamically loaded libraries, but we do have libraries of code that can be reused in various applications.
2. I'm not sure what you mean by "app/drive statically linked against the code"? Code in two processes is only related by the channels between them, which have contracts expressing the data that is transferred and the legal message patterns. Beyond those, there is no linking, and the code in each process is entirely independent.
3. We don't have a JIT. We precompile everything before executing it. If you don't have dynamic code loading, you don't need a JIT.
/Jim
How do you extend the libraries of code to allow reuse by various applications? For example, adding a SOAP library.
What I meant by the static linking question is: How do you a) build the app, and b) load it into the system such that it can call the kernel library functions? This also builds on the previous question of extending the kernel library...
I'm confused how you get by without a JIT, and how this precompiling process works. Especially how you create a SIP, and put code into it. Is that not dynamic code loading?
Any Singularity alpha that we can download just to play around? Any .PPT explaining the architecture?
Have you guys ever considered making an ARM port of the OS to use for robotics control and such? It'd be cool to have a .NET-based OS that can run without having to go through WinCE/Linux and Compact Framework/Mono.
There's DotNetCPU, but... is it gone?
chris31 wrote:They mentioned that the current NT kernel can't really be changed that much and I'm wondering why? Can't you just change the nt kernel and just make sure the interface between kernel and userland is the same? As long as the kernel interface is the same the app wouldn't care how the kernel did something. I would think that the app wouldn't even have to know it changed?
Beer28 wrote:[...]
I'm guessing it's linked to undisclosed low-level C or ASM libraries, since it's not libc. You can't write low-level code in C#; it's impossible because it won't let you break free into the instruction set you need to reach the BIOS or service interrupts, such as faults, device interrupts, or system interrupts of any kind.
[...]
In the second video, I saw that you are doing message passing over channels and run everything in its own SIP.
I did a lot of programming for µnOS, which is an OO OS written in C++, including a GC.
It also uses channels for message passing (without message contracts) just like QNX does.
So if every driver runs in its own SIP, can I write a SIP myself that sends messages to the NIC SIP to tell it to send raw Ethernet packets to the net?
Or has the TCP/IP SIP exclusive rights to communicate to the NIC SIP?
In µnOS we decided to compile device drivers to DLLs.
So the network service, which is a separate process in user mode, creates an instance of the device driver object(s) in its own process space and uses a well-defined interface to access these driver objects.
Only the network service has a channel over which the other applications can create and use network connections.
How about modularity and the extension of a certain system service?
Let's say your TCP/IP SIP currently supports IP, ICMP, ARP, TCP and UDP.
How would I extend it to support SCTP as well?
Can I write an extension module (e.g. in the form of a C# class) that is loaded by the TCP/IP SIP at startup to support SCTP?
Or do I have to write the extension in form of another SIP?
I also saw that you are doing process creation, channel management and security inside the microkernel.
Did you think of implementing a SIP in user mode for doing stuff like that?
E.g. the QNX process manager does some of these things in user mode.
All in all, you did a great job with Singularity!
I hope Microsoft stays tuned with it!