Bending Your Code Like Anders With C# Roslyn APIs - More About Syntax Trees

By Anoop Madhusudanan


In my previous post, Introduction to C# Roslyn CTP, I gave a quick introduction to the C# Roslyn CTP and outlined how you can use C# as a scripting language with Roslyn. I also briefly covered parsing code with the Roslyn APIs. I recommend reading that post before starting this one.

On a side note, you can also read a few of my related C# vNext posts here.

In this post, we’ll take another look at a few Roslyn features, based on the June 2012 CTP release, so that you can start leveraging Roslyn to write your own build tasks and preprocessors. Also, note that you can compile the Syntax Debugger Visualizer sample that comes with the Roslyn CTP; you will find it in the Shared folder of the CTP installation. Open the SyntaxDebuggerVisualizer project, compile the libraries, and place them in your Documents\<VisualStudioFolder>\Visualizers folder so that you can visualize syntax trees during debugging.

Update: The Roslyn CTP September 2012 has now been released, and there are some breaking changes. I suggest you read this post about the new September 2012 CTP.

Parsing Syntax Trees

So, let us start by parsing some code. Create a new Roslyn CTP Console project in Visual Studio, and try parsing the code below. Note that the June 2012 CTP adds support for query expressions, anonymous types, iterators, indexers, switch statements and more. A full list of the features added since the October 2011 CTP is available here.

using System;
using System.Collections.Generic;
using System.Linq;
using Roslyn.Compilers;
using Roslyn.Compilers.CSharp;

namespace Roslyn.Console
{
    class Program
    {
        static void Main(string[] args)
        {
            //The source text to parse. It is only parsed here, never compiled or run
            string code = @"class SimpleClass { 
                                public void SimpleMethod()
                                    {
                                        var list = new List<string>();
                                        list.Add(""first"");
                                        list.Add(""second"");
                                        var result = from item in list
                                                     where item == ""first""
                                                     select item;
                                    }
                        }";

            //Parse the source text into a syntax tree
            var tree = SyntaxTree.ParseCompilationUnit(code);
        }
    }
}

Now, put a breakpoint on the closing brace of our Main method, and bring up the Syntax Visualizer, assuming you have copied the Visualizer libraries as described above.

Then check out the syntax tree formed by Roslyn. The most interesting part is how the LINQ query expression is parsed. This should provide a useful way to load query expressions dynamically (for implementing dynamic filters, for example) once the full version is available.

Walking The Syntax Tree

Now, let us quickly explore how you can walk the syntax tree. You can simply derive your own class from SyntaxWalker in the Roslyn API, which internally implements the Visitor pattern to visit all nodes in the tree. Let us write a quick ConsoleDumpWalker that dumps the above syntax tree to the console, along with a small extension method so that we can use the walker to dump any syntax tree.

public static class SyntaxTreeExtensions
{
    public static void Dump(this SyntaxTree tree)
    {
        var writer = new ConsoleDumpWalker();
        writer.Visit(tree.GetRoot());
    }

    class ConsoleDumpWalker : SyntaxWalker
    {
        public override void Visit(SyntaxNode node)
        {
            //Indent each node by its depth in the tree
            int padding = node.Ancestors().Count();
            //To identify leaf nodes vs nodes with children
            string prepend = node.ChildNodes().Any() ? "[-]" : "[.]";
            //Get the type of the node
            string line = new String(' ', padding) + prepend +
                          " " + node.GetType().ToString();
            //Write the line
            System.Console.WriteLine(line);
            base.Visit(node);
        }
    }
}
And now, you can dump any syntax tree with the Dump extension method. For example, dumping the code we parsed earlier should show the same syntax tree we saw in the debugger's Syntax Visualizer.
  var tree = SyntaxTree.ParseCompilationUnit(code);
  tree.Dump();
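
To see what the walker is doing without installing the CTP, here is a minimal, Roslyn-free sketch of the same idea. Note that Node, Walker and DumpWalker are hypothetical stand-ins and not Roslyn types, and that this sketch tracks the depth in a counter where the ConsoleDumpWalker above counts ancestors.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a syntax node; not a Roslyn type.
class Node
{
    public string Kind { get; private set; }
    public List<Node> Children { get; private set; }

    public Node(string kind, params Node[] children)
    {
        Kind = kind;
        Children = new List<Node>(children);
    }
}

// Hypothetical base walker: Visit recurses into every child, depth first.
class Walker
{
    protected int Depth { get; private set; }

    public virtual void Visit(Node node)
    {
        Depth++;
        foreach (var child in node.Children)
            Visit(child); // virtual call, so overrides see every node
        Depth--;
    }
}

// Records one line per node, then lets the base class continue the walk.
class DumpWalker : Walker
{
    public readonly List<string> Lines = new List<string>();

    public override void Visit(Node node)
    {
        string marker = node.Children.Count > 0 ? "[-]" : "[.]";
        Lines.Add(new string(' ', Depth) + marker + " " + node.Kind);
        base.Visit(node);
    }
}

static class Demo
{
    // Walks the tree and returns the dump as a list of lines.
    public static List<string> Dump(Node root)
    {
        var walker = new DumpWalker();
        walker.Visit(root);
        return walker.Lines;
    }

    static void Main()
    {
        var tree = new Node("CompilationUnit",
            new Node("ClassDeclaration",
                new Node("MethodDeclaration")));

        foreach (var line in Dump(tree))
            Console.WriteLine(line);
    }
}
```

The important bit is the call to base.Visit at the end of the override: without it, the walk stops at the first node.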


Modifying the Syntax Tree

Modifying syntax trees is equally easy. You can implement your own syntax rewriters by inheriting from the SyntaxRewriter class. Here is a pretty naïve SyntaxRewriter that prepends an ‘I’ to the name of every interface whose name doesn’t already start with an I. In a real scenario, you should also update the related implementations and references, but this is just an example. So, here is our InterfaceRenameRewriter.

//Our rewriter
public class InterfaceRenameRewriter : SyntaxRewriter
{
    public override SyntaxToken VisitToken(SyntaxToken token)
    {
        //If the token is an identifier, and its
        //parent is an interface declaration
        if (token.Kind == SyntaxKind.IdentifierToken &&
            token.Parent.Kind == SyntaxKind.InterfaceDeclaration)
        {
            //If the name doesn't start with I - bluntly fix it.
            if (!token.GetText().StartsWith("I"))
            {
                return Syntax.Identifier("I" + token.GetText());
            }
        }

        return base.VisitToken(token);
    }
}
And you can test our InterfaceRenameRewriter on some sample code.

            string code = @"interface SomeInterface { 
                                //Some Simple Method
                                void SomeMethod();
                        }";

            //Parse the sample, then run the rewriter over the root node
            var root = SyntaxTree.ParseCompilationUnit(code).GetRoot();
            var renameRewriter = new InterfaceRenameRewriter();
            var newRoot = renameRewriter.Visit(root);

 

And if you inspect newRoot, you’ll find that the interface name got renamed as intended.

Now that you understand how to modify syntax trees, you can experiment with the other overrides available in SyntaxRewriter. Another way to modify a syntax tree is to directly replace an old node in the root with a new node, which is quite simple.
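
Roslyn syntax trees are immutable, so replacing a node gives you a new root and leaves the old tree as it was. Here is a minimal, Roslyn-free sketch of that idea. Note that the Node class and its Replace method are hypothetical stand-ins for illustration and are not the Roslyn API.

```csharp
using System;
using System.Linq;

// Hypothetical node that is never modified after construction; not a Roslyn type.
sealed class Node
{
    public readonly string Text;
    public readonly Node[] Children;

    public Node(string text, params Node[] children)
    {
        Text = text;
        Children = children;
    }

    // Returns a tree in which oldNode (matched by reference) is swapped for newNode.
    // Subtrees that do not contain oldNode are reused rather than copied.
    public Node Replace(Node oldNode, Node newNode)
    {
        if (ReferenceEquals(this, oldNode))
            return newNode;

        var rebuilt = Children.Select(c => c.Replace(oldNode, newNode)).ToArray();
        bool changed = !rebuilt.SequenceEqual(Children);
        return changed ? new Node(Text, rebuilt) : this;
    }

    public override string ToString()
    {
        if (Children.Length == 0)
            return Text;
        return Text + "(" + string.Join(", ", Children.Select(c => c.ToString())) + ")";
    }
}

static class Demo
{
    static void Main()
    {
        var name = new Node("SomeInterface");
        var root = new Node("interface", name, new Node("SomeMethod"));

        var newRoot = root.Replace(name, new Node("ISomeInterface"));

        Console.WriteLine(root);    // the original tree is unchanged
        Console.WriteLine(newRoot); // the new tree carries the replacement
    }
}
```

Because nothing is mutated, anything still holding the old root keeps seeing the old tree, and the untouched SomeMethod node is shared between both trees.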

So, in this post we explored a bit about Roslyn syntax trees, and we saw how to start bending your code to write your own build tasks and preprocessors. Oh yes, all of us will take a little bit of time and practice to really bend it like Anders and Eric, but you just kicked the ball. Happy coding. Check out my other C# posts as well.

© 2012. All Rights Reserved. Amazedsaint.com