Scripting and Code Analysis using Roslyn – And a first look at Roslyn CTP


ANOOP MADHUSUDANAN

Microsoft recently released the Roslyn CTP, which previews the upcoming features of C# and VB.NET. The Roslyn CTP is a pretty exciting release, and it opens up a lot of possibilities for C# and VB.NET programmers. Roslyn provides language services and APIs on top of .NET’s compiler services, which will let .NET developers do a lot more: use C# and VB.NET as scripting languages, use the compiler as a service in their own applications for code-related tasks, develop better language and IDE extensions, and so on.

Roslyn also makes it possible for developers to write code analysis and manipulation tools around the Visual Studio IDE, and provides APIs that’ll let you develop Visual Studio enhancements pretty quickly. The Roslyn APIs will have full fidelity with C# and VB for syntax, semantic binding, code emission, etc., so you’ll soon see a lot of language bending around C# and VB.NET, and possibly new DSLs and a lot of metaprogramming ideas.

Roslyn has four main API layers.

  • Scripting APIs
    • Provides a runtime execution context for C# and VB.NET, so you can now use C#/VB.NET as a scripting language in your own applications.
  • Compiler APIs
    • For accessing the Syntax and Semantic model of your code.
  • Workspace APIs
    • Provides an object model to aggregate the code model across projects in a solution. Mainly for code analysis and refactoring around IDEs like Visual Studio, though the APIs are not dependent on Visual Studio.
  • Services APIs
    • Provides a layer on top of the Visual Studio SDK (VSSDK) for features like IntelliSense, code formatting, etc.

In this post, we’ll take a sneak peek at the Scripting APIs and Compiler APIs.

Scripting APIs

Some time back, I posted about how to use the Mono C# Compiler as a Service in your .NET applications to leverage C# as a scripting language. Thanks to Roslyn, .NET will soon have a first-class scripting API.

The Roslyn.Scripting.* namespaces provide types for implementing your own scripting sessions using C# and VB.NET. You can create a new scripting session using the Session.Create(..) method. Session.Create can also accept a host object, and the methods of the host object will be directly available to the runtime context.

To execute some code, create an instance of ScriptEngine by providing the required assembly references and namespace imports, and invoke its Execute method. To demonstrate how this works, let us create a simple ScriptingHost class which wraps a scripting session and an engine. We’ll also create an OurDog class, so that we can create dogs and train them through the scripting environment. Once you install Roslyn, create a new Console application; here is the code for a simple scripting environment.

Update: The code examples below are based on Roslyn CTP2. Since then, the Roslyn September 2012 CTP (CTP3) has been released, and it has some breaking changes. I suggest you read this post about the new September 2012 CTP.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;
using System.Text;
using Roslyn.Compilers;
using Roslyn.Compilers.CSharp;
using Roslyn.Scripting;
using Roslyn.Scripting.CSharp;

namespace ScriptingRoslyn
{
    //Our Dog class so that we can train dogs later
    public class OurDog
    {
        private string _name = string.Empty;
        public OurDog(string name)
        {
            _name = name;
        }

        public string Name
        {
            get { return _name; }
        }

        public void Byte(OurDog other)
        {
            Console.WriteLine("{0} is byting the tail of {1}", _name, other.Name);
        }

        public void Walk()
        {
            Console.WriteLine("{0} is Walking", _name);
        }

        public void Eat()
        {
            Console.WriteLine("{0} is Eating", _name);
        }
    }


    //Let us create a Host object, where we'll wrap our session and engine
    //The methods in the host class are available directly to the environment
    public class ScriptingHost
    {
        private ScriptEngine engine;
        private Session session;

        //Methods in the Host object can be called directly from the 
        //environment
        public OurDog CreateDog(string name)
        {
            return new OurDog(name);
        }

        public ScriptingHost()
        {
            //Create a session
            session = Session.Create(this);

            //Create the engine, just pass the assemblies and namespaces
            engine = new ScriptEngine(new Assembly[]
                                {
                                    typeof(Console).Assembly,
                                    typeof(ScriptingHost).Assembly,
                                    typeof(IEnumerable<>).Assembly,
                                    typeof(IQueryable).Assembly
                                },
                                new string[] 
                                { 
                                    "System", "System.Linq", 
                                    "System.Collections",
                                    "System.Collections.Generic"
                                }
                            );

        }

        //Pass the code to the engine, nothing much here
        public object Execute(string code)
        {
            return engine.Execute(code, session);
        }

        public T Execute<T>(string code)
        {
            return engine.Execute<T>(code, session);
        }

    }

    //Main driver

    class Program
    {
        static void Main(string[] args)
        {
            var host = new ScriptingHost();
            Console.WriteLine("Hello Dog Trainer!! Type your code.\n\n");


            string codeLine = string.Empty;
            Console.Write(">");
            while ((codeLine = Console.ReadLine()) != "Exit();")
            {
                try
                {
                    //Execute the code
                    var res = host.Execute(codeLine);

                    //Write the result back to console
                    if (res != null)
                        Console.WriteLine(" = " + res.ToString());
                }
                catch (Exception e)
                {
                    Console.WriteLine(" !! " + e.Message);
                }

                Console.Write(">");
            }
        }
    }
}

I assume the code is self-explanatory: we are creating a host class that wraps the session and executes the code using the ScriptEngine in that session. It is a very simple REPL example. So, now let us go and train the dogs.

image

You can see that we are invoking the CreateDog method of our host class to create the dogs, and then we are training them, though I’m not sure if we really need to train them to bite each other. Anyway, I hope the idea is clear.

Compiler APIs

The Compiler APIs provide object models for accessing the syntax and semantic models of your code. Using the Compiler APIs, you can obtain and manipulate syntax trees. The common syntax APIs are found in the Roslyn.Compilers and Roslyn.Compilers.Common namespaces, while the language-specific syntax APIs are found in Roslyn.Compilers.CSharp and Roslyn.Compilers.VisualBasic.

‘Syntax’ is the grammatical structure whereas ‘Semantics’ refers to the meaning of the vocabulary symbols arranged with that structure. If you consider English, "Dogs Are Cats" is grammatically correct, but semantically it is nonsense.
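The same distinction can be sketched in C# itself. Note the assumption: the rest of this post targets the Roslyn CTP, but this sketch uses the later released Microsoft.CodeAnalysis.CSharp NuGet package, so names such as CSharpSyntaxTree and CSharpCompilation are not the CTP names used elsewhere here. The Zoo class and the helper methods are made up for illustration. The source text is grammatically fine, so the parser should have nothing to complain about, while binding it against real types should flag the assignment of a string to an int.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class SyntaxVersusSemantics
{
    // Grammatically valid C# ("Dogs Are Cats"): the statement has the right
    // shape, but assigning a string to an int has no valid meaning.
    public const string Source =
        @"class Zoo { void Feed() { int dogs = ""cats""; } }";

    // Counts the errors reported by the parser alone (grammar only).
    public static int CountSyntaxErrors(SyntaxTree tree)
    {
        return tree.GetDiagnostics()
                   .Count(d => d.Severity == DiagnosticSeverity.Error);
    }

    // Counts the errors reported once the tree is bound against real types.
    public static int CountSemanticErrors(SyntaxTree tree)
    {
        // Build as a library so that a missing Main method is not an error.
        var compilation = CSharpCompilation.Create(
            "SyntaxVersusSemantics",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        return compilation.GetSemanticModel(tree)
                          .GetDiagnostics()
                          .Count(d => d.Severity == DiagnosticSeverity.Error);
    }

    public static void Main()
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(Source);
        Console.WriteLine("Syntax errors:   " + CountSyntaxErrors(tree));
        Console.WriteLine("Semantic errors: " + CountSemanticErrors(tree));
    }
}
```

Only errors are counted here; warnings (for example, about an unused variable) are deliberately filtered out so the comparison stays focused on grammar versus meaning.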

Building Syntax Trees and Accessing Semantic Model

Roslyn provides APIs for building syntax trees, and also for semantic analysis. So let us write some code that parses a small class with one method into a syntax tree. It also shows how to get the method symbol from the semantic model of our syntax tree.

            //Parse some code
            SyntaxTree tree = SyntaxTree.ParseCompilationUnit
                    (@"class Bar { 
                        void Foo() { Console.WriteLine(""foo""); }
                          }");

            //Find the first method declaration inside the first class declaration
            MethodDeclarationSyntax methodDecl = tree.Root
                .DescendentNodes()
                .OfType<ClassDeclarationSyntax>()
                .First().ChildNodes().OfType<MethodDeclarationSyntax>().First();

            //Create a compilation that contains our syntax tree
            Compilation compilation = Compilation.Create("SimpleMethod").AddSyntaxTrees(tree);


            //Get the associated semantic model of our syntax tree
            var model = compilation.GetSemanticModel(tree);

            //Find the symbol of our Foo method
            Symbol methodSymbol = model.GetDeclaredSymbol(methodDecl);


            //Get the name of the method symbol
            Console.WriteLine(methodSymbol.Name);  

In the above example, you can easily see how we parse the code into a SyntaxTree, get the semantic model associated with that syntax tree, and then look up information in the semantic model. In this case, we are getting the method symbol that corresponds to the first method declaration within the first class declaration in the syntax tree. That is, DescendentNodes().OfType<ClassDeclarationSyntax>().First() on the root node gives us the first class declaration node, and ChildNodes().OfType<MethodDeclarationSyntax>().First() gives us the first method declaration node within that class declaration. In short, traversing the syntax tree using the object model is pretty easy.

The symbols we obtain from the semantic model can be used for a wide range of scenarios, including code analysis.

A bit more about Syntax Trees

Instead of parsing an entire compilation unit into a syntax tree, you can also parse code directly into individual syntax nodes. For example, you can parse a statement into a StatementSyntax.

StatementSyntax statement = Syntax.ParseStatement("for (int i = 0; i < 10; i++) { }");

 

Syntax trees hold the entire source information in full fidelity, which means they contain every piece of the source text. They can also be round-tripped: you can get the actual source code back from a syntax tree or a node. Syntax trees and nodes are immutable and thread safe.
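Both properties can be sketched in a few lines. As an assumption, this sketch again uses the later released Microsoft.CodeAnalysis.CSharp NuGet package rather than the CTP, so SyntaxFactory.ParseStatement and ToFullString are not the CTP names used in this post. The RoundTripSketch class and its methods are hypothetical. The source string has odd spacing and a comment on purpose, to show that such details survive parsing, and the "edit" shows that changing a node produces a new node while the original stays as it was.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class RoundTripSketch
{
    // Irregular whitespace and a comment, to check that nothing gets lost.
    public const string Source =
        "for (int i = 0;   i < 10; i++)  { /* nothing */ }";

    // Parses one statement and renders it back to text.
    public static string RoundTrip(string source)
    {
        StatementSyntax statement = SyntaxFactory.ParseStatement(source);
        return statement.ToFullString();
    }

    // Swaps the loop condition and returns both versions as text.
    // The original node is never modified; ReplaceNode builds a new one.
    public static (string Original, string Edited) ReplaceCondition(string source)
    {
        var loop = (ForStatementSyntax)SyntaxFactory.ParseStatement(source);
        ExpressionSyntax oldCondition = loop.Condition
            ?? throw new InvalidOperationException("The loop has no condition.");
        ExpressionSyntax newCondition = SyntaxFactory.ParseExpression("i < 20");

        ForStatementSyntax edited = loop.ReplaceNode(oldCondition, newCondition);
        return (loop.ToFullString(), edited.ToFullString());
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(Source) == Source
            ? "Round trip preserved the source text."
            : "Round trip changed the source text.");

        var result = ReplaceCondition(Source);
        Console.WriteLine("Original: " + result.Original);
        Console.WriteLine("Edited:   " + result.Edited);
    }
}
```

Because every change yields a fresh node, the same tree can be handed to several threads or analyses without copying or locking.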

However, you can still replace a node in the current syntax tree: the ReplaceNode function creates a new tree in which the old node is replaced with the new node.

For now, that is it. Enjoy coding, and follow me on Twitter.

© 2012. All Rights Reserved. Amazedsaint.com