
Jonathan Pryor's web log

HackWeek V

Last week was HackWeek V, during which I had small goals, yet had most of the time eaten by unexpected "roadblocks."

The week started with my mis-remembering OptionSet behavior. I had thought that there was a bug with passing options containing DOS paths, as I thought the path would be overly split:

string path = null;
var o = new OptionSet () {
	{ "path=", v => path = v },
};
o.Parse (new[]{"-path=C:\path"});

Fortunately, my memory was wrong: this works as expected. Yay.

What fails is if the option supports multiple values:

string key = null, value = null;
var o = new OptionSet () {
	{ "D=", (k, v) => {key = k; value = v;} },
};
o.Parse (new[]{"-DFOO=C:\path"});

The above fails with an OptionException, because the DOS path is split, so OptionSet attempts to send 3 arguments to an option expecting 2 arguments. This isn't allowed.
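
To see where the three arguments come from, here's an illustrative sketch (not OptionSet's actual parsing code) of what splitting a key=value argument on '=' and ':' produces, and the bounded split that the fix presumably relies on:

string arg = @"FOO=C:\path";

// Splitting on both '=' and ':' treats the drive letter's colon as another
// separator, yielding three pieces for an option expecting only two:
string[] unbounded = arg.Split ('=', ':');
// unbounded is { "FOO", "C", @"\path" }

// With a maximum count of 2, the remainder is left intact:
string[] bounded = arg.Split (new[]{ '=', ':' }, 2);
// bounded is { "FOO", @"C:\path" }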

The patch to fix the above is trivial (most of that patch is for tests). However, the fix didn't work at first.

Enter roadblock #1: String.Split() can return too many substrings. Oops.

So I fixed it. That only killed a day...

Next up, I had been sent an email showing that OptionSet had some bugs when removing by index. I couldn't let that happen...and being in a TDD mood, I first wrote some unit tests to describe what the IList<T> semantics should be. Being in an over-engineering mood, I wrote a set of "contract" tests for IList<T> in Cadenza, fixed some Cadenza bugs so that Cadenza would pass the new ListContract, then merged ListContract with the OptionSet tests.

Then I hit roadblock #2 when KeyedCollection<TKey, TItem> wouldn't pass my ListContract tests, as it wasn't exception safe. Not willing to give up on ListContract, I fixed KeyedCollection so it would now pass my ListContract tests, improving compatibility with .NET in the process, which allowed me to finally fix the OptionSet bugs.

I was then able to fix a mdoc export-html bug in which index files wouldn't always be updated, before starting to investigate mdoc assemble wanting gobs of memory.

While pondering how to figure out why mdoc assemble wanted 400MB of memory, I asked the folks on ##csharp on freenode if there were any Mono bugs preventing their SpikeLite bot from working under Mono. They kindly directed me toward a bug in which AppDomain.ProcessExit was being fired at the wrong time. This proved easier than I feared (I feared it would be beyond me).

Which left me with pondering a memory "leak." It obviously couldn't be a leak with a GC and no unmanaged memory to speak of, but what was causing so much memory to be used? Thus proceeded lots of Console.WriteLine(GC.GetTotalMemory(false)) calls and reading the output to see where the memory use was jumping (as, alas, I found Mono's memory profiler to be less than useful for me, and mono's profiler was far slower than a normal run). This eventually directed me to the problem:

I needed, at most, two XmlNode values from an XmlDocument. An XmlDocument loaded from a file that could be very small or large-ish (0.5MB). Thousands of such files. At once.

That's when it dawned on me that storing XmlNodes in a Dictionary loaded from thousands of XmlDocuments might not be such a good idea, as each XmlNode retains a reference to the XmlDocument it came from, so I was basically copying the entire documentation set into memory, when I only needed a fraction of it. Doh!

The fix was straightforward: keep a temporary XmlDocument around and call XmlDocument.ImportNode to preserve just the data I needed.
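
A minimal sketch of that approach (a hypothetical helper, not the actual mdoc code):

using System.Xml;

class NodeCache {
    // Long-lived document that owns only the imported copies.
    readonly XmlDocument cache = new XmlDocument ();

    // Load a (possibly large) file, but keep only the one node we care about.
    public XmlNode Extract (string file, string xpath)
    {
        var source = new XmlDocument ();
        source.Load (file);
        XmlNode found = source.SelectSingleNode (xpath);
        // ImportNode deep-copies the node into `cache`, so `source` (and the
        // rest of its document) can be garbage collected.
        return found == null ? null : cache.ImportNode (found, true);
    }
}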

Memory use plummeted to less than one tenth what was previously required.

Along the way I ran across and reported an xbuild bug (since fixed), and filed a regression in gmcs which prevented Cadenza from building.

Overall, a productive week, but not at all what I had originally intended.

Posted on 15 Jun 2010 | Path: /development/mono/ | Permalink

Defending XML-based Build Systems

Justin Etheredge recently suggested that we Say Goodbye to NAnt and MSBuild for .NET Builds With IronRuby. Why? Because they're based on XML.

He goes on to mention several problems with XML-based build systems, principally:

His solution: use Ruby to describe your build process.

My reaction? No, no, for the love of $deity NO!

Why? Three reasons: GNU Autotools, Paul E. McKenney's excellent parallel programming series, and SQL.

Wait, what? What do those have to do with build systems? Everything, and nothing.

The truly fundamental problem is this: "To a man with a hammer, everything looks like a nail" (reportedly a quote from Mark Twain, but that's neither here nor there). In this case, the "hammer" is "writing code." But it's more than that: it's writing imperative code, specifically Ruby code (though the particular language isn't the problem I have, rather the imperative aspect).

Which is, to me, the fundamental problem: it's a terrible base for any form of higher-level functionality. Suppose you want to build your software in parallel (which is where Paul McKenney's series comes in). Well, you can't, because your entire build system is based on imperative code, and unless all the libraries you're using were written with that in mind...well, you're screwed. The imperative code needs to run, and potentially generate any side effects, and without a "higher-level" description of what those side effects entail it can't sanely work.

Want to add a new file to your build (a fairly common thing to do in an IDE, along with renaming files)? Your IDE needs to be able to understand the imperative code. If it doesn't, it just broke your build script. Fun!

OK, what about packaging? Well, in order to know what the generated files are (and where they're located), you'll have to run the entire script and (somehow) track what files were created.

Want to write an external tool that does something hitherto unknown? (As a terrible example, parse all C# code for #if HAVE_XXX blocks so that a set of feature tests can be automatically extracted.) Well, tough -- you have to embed an IronRuby interpreter, and figure out how to query the interpreter for the information you want (e.g. all the source files).

etc., etc.

My problem with imperative languages is that they're not high-level enough. McKenney asks what the good multicore programming languages are; the answer is SQL, because it's dedicated ~solely to letting you describe the question while leaving the implementation of the answer up to the SQL database. It's not imperative, it's declarative (at least until you hit esoteric features such as cursors, but in principle you can generally stick to a declarative subset).

OK, so I want a higher-level language that describes targets and dependencies and supports faster builds. To a large degree, make(1) supports all that, and it's the basis of Autotools. Surely I like that, right?

The problem with Autotools is that it's a mixture of declarative and imperative code, with Unix shell scripts forming the backbone of the imperative code (aka the target rules), and these are inherently Unix specific. (Possibly Linux specific, much to my consternation.) Plus, the format is virtually unreadable by anything other than make(1), what with all the language extensions...

So why XML?

Because it's not code, it's data, which (somewhat) lowers the barrier of entry for writing external tools which can parse the format and Do New Things without needing to support some language which might not even run on the platform you're using.

Because it's easily parseable AND verifiable, it's (somewhat) safer for external automated tools to manipulate the file without screwing you over "accidentally" -- e.g. adding and removing files from the build via an IDE.

Because custom rules are limited, there is a smaller "grammar" for external tools to understand, making it simpler to write and maintain them. It also encourages moving "non-target targets" out of the build system, simplifying file contents (and facilitating interaction with e.g. IDEs).
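
As a rough illustration of that point, a small external tool can list the source files in an MSBuild-style project without understanding any build logic. This is a sketch under my assumptions (the conventional <Compile Include="..."/> items and MSBuild 2003 namespace), not a complete treatment of the format:

using System;
using System.Xml.Linq;

class ListSources {
    public static void Main (string[] args)
    {
        // Assumes an MSBuild-style project file passed as the first argument.
        XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";
        var project = XDocument.Load (args [0]);
        // Every <Compile Include="..."/> item names a source file; no build
        // logic needs to be executed or understood to enumerate them.
        foreach (var compile in project.Descendants (ns + "Compile"))
            Console.WriteLine ((string) compile.Attribute ("Include"));
    }
}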

Am I arguing that XML-based build systems are perfect? Far from it. I'm instead arguing that small, purpose-specific languages can (and often are) Good Things™, particularly if they permit interoperability between a variety of tools and people. XML allows this, if imperfectly. An IronRuby-based build system does not.

Posted on 26 Apr 2010 | Path: /development/ | Permalink

mdoc Repository Format History

Time to wrap up this overly long series on mdoc. We covered:

To close out this series, where did the mdoc repository format come from? It mostly came from Microsoft, actually.

Taking a step back, "in the beginning," as it were, the Mono project saw the need for documentation in January 2002. I wasn't involved then, but perusing the archives we can see that csc /doc output was discarded early because it wouldn't support translation into multiple languages. NDoc was similarly discarded because it relied on csc /doc documentation. I'm sure a related problem at the time was that Mono's C# compiler didn't support the /doc compiler option (and wouldn't begin to support /doc until April 2004), so there would be no mechanism to extract any inline documentation anyway.

By April 2003 ECMA standardization of the Common Language Infrastructure was apparently in full force, and the standardization effort included actual class library documentation. The ECMA documentation is available within ECMA-335.zip. The ECMA-335 documentation also included a DTD for the documentation contained therein, and it was a superset of the normal C# XML documentation. The additional XML elements provided what XML documentation lacked: information available from the assembly, such as actual parameter types, return types, base class types, etc. There was one problem with ECMA-335 XML, though: it was gigantic, throwing everything into a single 7MB+ XML file.

To make this format more version-control friendly (can you imagine maintaining and viewing diffs on a 7+MB XML file?), Mono "extended" the ECMA-335 documentation format by splitting it into one file per type. This forms the fundamental basis of the mdoc repository format (and is why I say that the repository format came from Microsoft, as Microsoft provided the documentation XML and DTD to ECMA). This is also why tools such as mdoc assemble refer to the format as ecma. The remainder of the Mono extensions were added in order to fix various documentation bugs (e.g. to distinguish between ref vs. out parameters, to better support generics), etc.

In closing this series, I would like to thank everyone who has ever worked on Monodoc and the surrounding tools and infrastructure. It wouldn't be anywhere near as useful without them.

Posted on 20 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Assembly Versioning with mdoc

Previously, we mentioned as an aside that the Type.xml files within an mdoc repository contained //AssemblyVersion elements. Today we will discuss what they're for.

The //AssemblyVersion element records exactly one thing: which assembly versions a type or member was found in. (The assembly version is specified via the AssemblyVersionAttribute.) With a normal assembly versioning policy, this allows monodoc to show two things: which version added the type/member, and (by inference) which version(s) removed the member.

For example, consider the NetworkStream.Close method. This override was present in .NET 1.1, where it overrode Stream.Close. However, in .NET 2.0 the override was removed entirely.

The //AssemblyVersion element allows the mdoc repository to track such versioning changes; for example, consider the mdoc-generated NetworkStream.xml file. The //Member[@MemberName='Close']/AssemblyInfo/AssemblyVersion elements contain only an entry for 1.0.5000.0 (corresponding to .NET 1.1) on line 536. Compare to the //Member[@MemberName='CanWrite']/AssemblyInfo/AssemblyVersion elements (for the NetworkStream.CanWrite property), which contain entries for both 1.0.5000.0 and 2.0.0.0. From this, we can deduce that NetworkStream.Close was present in .NET 1.1, but was removed in .NET 2.0.

When viewing type and member documentation, monodoc and the ASP.NET front end will show the assembly versions that have the member:

NetworkStream.Close -- notice only 1.0.5000.0 is a listed assembly version.

There are two limitations with the version tracking:

  1. It only tracks types and members. For example, attributes, base classes, and interfaces may be added or removed across versions; these are not currently tracked.
  2. It uses the assembly version to fill the <AssemblyVersion> element.

The second point may sound like a feature (isn't it the point?), but it has one downfall: auto-generated assembly versions. You can specify an auto-generated assembly version by using the * for some components in the AssemblyVersionAttribute constructor:

[assembly: AssemblyVersion("1.0.*.*")]

If you do this, every time you rebuild the assembly the compiler will dutifully generate a different assembly version number. For example, the first time you might get an assembly version of 1.0.3666.19295, while on the second recompilation the compiler will generate 1.0.3666.19375. Since mdoc assigns no meaning to the version numbers, it will create //AssemblyVersion elements for each distinct version found.

The "advantage" is that you can know on which build a member was added. (If you actually care...)

The disadvantage is a major bloating of the mdoc repository, as you add at least 52*(1+M) bytes to each file in the mdoc repository for each unique assembly version (where M is the number of members within the file, as each member is separately tracked). It will also make viewing the documentation distracting; imagine seeing 10 different version numbers for a member, which all differ in the build number. That much noise would make the feature ~useless.

As such, if you're going to use mdoc, I highly suggest not using auto-generated assembly version numbers.
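
For example, pinning the version explicitly keeps the repository's //AssemblyVersion elements stable across rebuilds; bump it deliberately when you actually ship a new version:

[assembly: AssemblyVersion("1.0.0.0")]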

Next time, we'll wrap up this series with a history of the mdoc repository format.

Posted on 19 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Caching mdoc's ASP.NET-generated HTML

Last time we discussed configuring the ASP.NET front-end to display monodoc documentation. The display of extension methods within monodoc and the ASP.NET front-end is fully dynamic. This has its pros and cons.

On the pro side, if/when you install additional assembled documentation sources, those sources will be searched for extension methods and they will be shown on all matching types. This is very cool.

On the con side, searching for the extension methods and converting them into HTML takes time -- there is a noticeable delay when viewing all members of a type if there are lots of extension methods. On heavily loaded servers, this may be detrimental to overall performance.

If you're running the ASP.NET front-end, you're not regularly adding documentation, and you have Mono 2.6, you can use the mdoc export-html-webdoc command to pre-render the HTML files and cache the results. This will speed up future rendering.

For example, consider the url http://localhost:8080/index.aspx?link=T:System.Collections.Generic.List`1/* (which shows all of the List<T> members). This is a frameset, and the important frame here is http://localhost:8080/monodoc.ashx?link=T:System.Collections.Generic.List`1/* which contains the member listing (which includes extension methods). On my machine, it takes ~2.0s to download this page:

$ time curl -s \
	'http://localhost:8080/monodoc.ashx?link=T:System.Collections.Generic.List`1/*' \
	> /dev/null

real	0m2.021s
user	0m0.003s
sys	0m0.002s

In a world where links need to take less than 0.1 seconds to be responsive, this is...pretty bad.

After running mdoc export-html-webdoc netdocs.zip (which contains the List<T> docs):

$ time curl -s \
	'http://localhost:8080/monodoc.ashx?link=T:System.Collections.Generic.List`1/*' \
	> /dev/null

real	0m0.051s
user	0m0.003s
sys	0m0.006s

That's nearly 40x faster, and within the 0.1s guideline.

Cache Generation: to generate the cache files, run mdoc export-html-webdoc ASSEMBLED-FILES. ASSEMBLED-FILES consists of the .tree or .zip files which are generated by mdoc assemble and have been installed into $prefix/lib/monodoc/sources:

$ mdoc export-html-webdoc $prefix/lib/monodoc/sources/Demo.zip

(Where $prefix is your Mono installation prefix; the full path would then be e.g. /usr/lib/monodoc/sources/Demo.zip.)

This will create a directory tree within $prefix/lib/monodoc/sources/cache/Demo. Restarting the ASP.NET front-end will allow it to use the cache.

If you want to generate the cache in another directory, use the -o=PREFIX option. This is useful if you're updating an existing cache on a live server and you don't want to overwrite/replace the existing cache (it's a live server!) -- generate the cache elsewhere, then move the files when the server is offline.

If you have lots of time on your hands, you could process all assembled documentation with:

$ mdoc export-html-webdoc $prefix/lib/monodoc/sources/*.zip

Limitations: It should be noted that this is full of limitations, so you should only use it if performance is really important. Limitations include:

Next time, we'll cover mdoc's support for assembly versioning.

Posted on 15 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Configuring the ASP.NET front-end for mdoc

Last time, we assembled our documentation and installed it for use with monodoc. This is a prerequisite for ASP.NET support (as they both use the same system-wide documentation directory).

Once the documentation is installed (assuming a Linux distro or OSX with the relevant command-line tools installed), you can trivially host a web server which will display the documentation:

$ svn co http://anonsvn.mono-project.com/source/branches/mono-2-4/mono-tools/webdoc/
# output omitted...
$ cd webdoc
$ xsp2

You will need to change the svn co command to use the same version of Mono that is present on your system. For example, if you have Mono 2.6 installed, change the mono-2-4 to mono-2-6.

Once xsp2 is running, you can point your web browser to http://localhost:8080 to view documentation. This will show the same documentation as monodoc did last time:

System.Array extension methods -- notice With() is listed

For "real" use, setting up using Apache with mod_mono may be preferred (or any of the other options listed at Mono's ASP.NET support page). Configuring mod_mono or anything other than xsp2 is beyond my meager abilities.

Next time, we'll discuss improving the ASP.NET front-end's page rendering performance.

Posted on 14 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Assembling Documentation with mdoc

We previously discussed exporting the mdoc repository into static HTML files using mdoc export-html and into a Microsoft XML Documentation file with mdoc export-msxdoc. Today, we'll discuss exporting documentation with mdoc assemble.

mdoc assemble is used to assemble documentation for use with the monodoc Documentation browser and the ASP.NET front-end. This involves the following steps:

  1. Running mdoc assemble.
  2. Writing a .source file.
  3. Installing the files.

Unfortunately we're taking a diversion from the Windows world, as the monodoc browser and the ASP.NET front-end won't run under Windows (due to limitations in the monodoc infrastructure). I will attempt to fix these limitations in the future.

Running mdoc assemble: mdoc assemble has three arguments of interest:

For our current documentation, we would run:

$ mdoc assemble -o Demo Documentation/en

This will create the files Demo.tree and Demo.zip in the current working directory.

The .source file is used to tell the documentation browser where in the tree the documentation should be inserted. It's an XML file that contains two things: a (set of) /monodoc/node elements describing where in the tree the documentation should be inserted, and /monodoc/source elements which specify the files to use. For example:

<?xml version="1.0"?>
<monodoc>
  <node label="Demo Library" name="Demo-lib" parent="libraries" />
  <source provider="ecma" basefile="Demo" path="Demo-lib"/>
</monodoc>

The /monodoc/node element describes where in the monodoc tree the documentation should be placed. It has three attributes, two of which are required:

The /monodoc/source element describes what file basename to use when looking for the .tree and .zip files. (By convention the .source, .tree, and .zip files share the same basename, but this is not required. The .tree and .zip files must share the same basename, but the .source basename may differ, and will differ if e.g. one .source file pulls in several .tree/.zip pairs.) It has three attributes, all of which are required:

Installing the files. Files need to be installed into $prefix/lib/monodoc/sources. You can obtain this directory with pkg-config(1):

$ cp Demo.source Demo.tree Demo.zip \
    `pkg-config monodoc --variable=sourcesdir`

Now when we run monodoc, we can navigate to the documentation that was just installed:

ObjectCoda.With() documentation inside monodoc.

Additionally, those paying attention on January 10 will have noticed that the With() method we documented is an extension method. Monodoc supports displaying extension methods on the relevant type documentation. In this case, With() is an extension on TSource, which is, for all intents and purposes, System.Object. Thus, if we view the System.Object docs within our local monodoc browser, we will see the With() extension method:

System.Object extension methods -- notice With() is listed.

In fact, we will see With() listed as an extension method on all types (which is arguably a bug, as static types can't have instance methods...).

Furthermore, mdoc export-html will also list extension methods. However, mdoc export-html is far more limited: it will only look for extension methods within the mdoc repositories being processed, and it will only list those methods as extension methods on types within the mdoc repository. Consequently, mdoc export-html will not list e.g. IEnumerable<T> extension methods on types that implement IEnumerable<T>. (It simply lacks the information to do so.)

Examples of mdoc export-html listings of extension methods can be found in the mdoc unit tests and the Cadenza.Collections.CachedSequence<T> docs (which lists a million extension methods because Cadenza.Collections.EnumerableCoda contains a million extension methods on IEnumerable<T>).

Next time, we'll discuss setting up the ASP.NET front end under Linux.

Posted on 13 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Exporting mdoc Repositories to Microsoft XML Documentation

Previously, we discussed how to write documentation and get it into the documentation repository. We also discussed exporting the documentation into static HTML files using mdoc export-html. Today, we'll discuss mdoc export-msxdoc.

mdoc export-msxdoc is used to export the documentation within the mdoc repository into a .xml file that conforms to the same schema as csc /doc. This allows you, if you so choose, to go entirely to externally managed documentation (instead of inline XML) and still be able to produce your Assembly.xml file so that Visual Studio/etc. can provide code completion against your assembly.

There are two ways to invoke it:

$ mdoc export-msxdoc Documentation/en
$ mdoc export-msxdoc -o Demo.xml Documentation/en

The primary difference between these is what files are generated. Within each Type.xml file of the mdoc repository (e.g. ObjectCoda.xml) is a /Type/AssemblyInfo/AssemblyName element.

The first command (lacking -o Demo.xml) will generate a set of .xml files, where the filenames are based on the values of the /Type/AssemblyInfo/AssemblyName element values, in this case Demo.xml. Additionally, a NamespaceSummaries.xml file is generated, containing documentation for any namespaces that were documented (which come from the ns-*.xml files, e.g. ns-Cadenza.xml).

The second command (which specifies -o Demo.xml) will only generate the specified file (in this case Demo.xml).

For this mdoc repository, there is no actual difference between the commands (as only one assembly was documented within the repository), except for the generation of the NamespaceSummaries.xml file. However, if you place documentation from multiple assemblies into the same mdoc repository, the first command will properly generate .xml files for each assembly, while the latter will generate only a single .xml file containing the documentation from all assemblies.

Next time, we'll cover mdoc assemble.

Posted on 12 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Customizing mdoc's Static HTML Output

Last time, we wrote documentation for our Demo.dll assembly. What if we want to improve the looks of those docs, e.g. to change the colors or add additional navigation links for site consistency purposes?

mdoc export-html uses three mechanisms to control output:

The XSLT needs to consume an XML document that has the following structure:

<Page>
    <CollectionTitle>Collection Title</CollectionTitle>
    <PageTitle>Page Title</PageTitle>
    <Summary>Page Summary</Summary>
    <Signature>Type Declaration</Signature>
    <Remarks>Type Remarks</Remarks>
    <Members>Type Members</Members>
    <Copyright>Documentation Copyright</Copyright>
</Page>

The contents of each of the //Page/* elements are HTML or plain-text nodes. Specifically:

/Page/CollectionTitle: Contains the Assembly and Namespace name links.
/Page/PageTitle: Contains the type name/description.
/Page/Summary: Contains the type <summary/> documentation.
/Page/Signature: Contains the type signature, e.g. whether it's a struct or class, implemented interfaces, etc.
/Page/Remarks: Contains type-level <remarks/>.
/Page/Members: Contains the documentation for all of the members of the type, including a table for all of the members.
/Page/Copyright: Contains copyright information taken from the mdoc repository, specifically from index.xml's /Overview/Copyright element.

By providing a custom --template XSLT and/or by providing an additional CSS file, you have some degree of control over the resulting documentation.

I'll be the first to admit that this isn't a whole lot of flexibility; there is no control over what CSS class names are used, nor is there any control over what is generated within the /Page//* elements. What this model does allow is for controlling the basic page layout, e.g. to add a site-wide menu system, allowing documentation to be consistent with the rest of the site.

For example, my site uses custom templates to provide a uniform look-and-feel with the rest of their respective sites for the Mono.Fuse and NDesk.Options documentation.

Next time, we'll cover mdoc export-msxdoc.

Posted on 11 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

mdoc XML Schema

Previously, I mentioned that you could manually edit the XML files within the mdoc repository.

What I neglected to mention is that there are only parts of the XML files that you should edit, and that there is an XML Schema file available for all docs.

The mdoc(5) man page lays out which files within the repository (and which parts of those files) are editable. In summary, all ns-*.xml files and the //Docs nodes of all other .xml files are editable, and they should contain ye normal XML documentation elements (which are also documented within the mdoc(5) man page).

The XML Schema can be found in Mono's SVN, at http://anonsvn.mono-project.com/source/trunk/mcs/tools/mdoc/Resources/monodoc-ecma.xsd.

Posted on 10 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Writing Documentation for mdoc

Last time, we created an assembly and used mdoc to generate a documentation repository containing stubs. Stubs have some utility -- you can view the types, members, and parameter types that are currently present -- but they're far from ideal. We want actual documentation.

Unfortunately, mdoc isn't an AI, and can't write documentation for you. It manages documentation; it doesn't create it.

How do we get actual documentation into the respository? There are three ways:

  1. Manually edit the XML files within the repository directory (if following from last time, this would be all .xml files within the Documentation/en directory).
  2. Use monodoc --edit Documentation/en.
  3. We can continue writing XML documentation within our source code.

Manually editing the files should be self-explanatory; it's not exactly ideal, but it works, and is how I write most of my documentation.

When using monodoc --edit Documentation/en, the contents of Documentation/en will be shown sorted in the tree view by its assembly name, e.g. in the Mono Documentation → Demo node. When viewing documentation, there are [Edit] links that, when clicked, will allow editing the node (which directly edits the files within Documentation/en).

However, I can't recommend monodoc as an actual editor. Its usability is terrible, and it has one major flaw: when editing method overloads, most of the documentation will be the same (or similar enough that you'll want to copy everything anyway), e.g. <summary/>, <param/>, etc. The monodoc editor doesn't allow copying all of this at once, only each element individually. It makes for a very slow experience.

Which brings us to inline XML documentation. mdoc update supports importing XML documentation as produced by csc /doc. So let's edit our source code to add inline documentation:

using System;

namespace Cadenza {
    /// <summary>
    ///  Extension methods on <see cref="T:System.Object" />.
    /// </summary>
    public static class ObjectCoda {
        /// <typeparam name="TSource">The type to operate on.</typeparam>
        /// <typeparam name="TResult">The type to return.</typeparam>
        /// <param name="self">
        ///   A <typeparamref name="TSource" /> containing the value to manipulate.
        ///   This value may be <see langword="null" /> (unlike most other
        ///   extension methods).
        /// </param>
        /// <param name="selector">
        ///   A <see cref="T:System.Func{TSource,TResult}" /> which will be
        ///   invoked with <paramref name="self" /> as a parameter.
        /// </param>
        /// <summary>
        ///   Supports chaining otherwise temporary values.
        /// </summary>
        /// <returns>
        ///   The value of type <typeparamref name="TResult" /> returned by
        ///   <paramref name="selector" />.
        /// </returns>
        /// <remarks>
        ///   <para>
        ///     <c>With</c> is useful for easily using an intermediate value within
        ///     an expression "chain" without requiring an explicit variable
        ///     declaration (which is useful for reducing in-scope variables, as no
        ///     variable is explicitly declared).
        ///   </para>
        ///   <code lang="C#" src="../../example.cs#With" />
        /// </remarks>
        /// <exception cref="T:System.ArgumentNullException">
        ///   <paramref name="selector" /> is <see langword="null" />.
        /// </exception>
        public static TResult With<TSource, TResult>(
                this TSource self, 
                Func<TSource, TResult> selector)
        {
            if (selector == null)
                throw new ArgumentNullException ("selector");
            return selector (self);
        }
    }
}

(As an aside, notice that our file ballooned from 14 lines to 45 lines because of all the documentation. This is why I prefer to keep my documentation external to the source code, as it really bloats the source. Certainly, the IDE can hide comments, but I find that this defeats the purpose of having comments in the first place.)

Compile it into an assembly (use csc if running on Windows), specifying the /doc parameter to extract XML documentation comments:

$ gmcs /t:library /out:Demo.dll /doc:Demo.xml demo.cs

Update our documentation repository, this time importing Demo.xml:

$ mdoc update -o Documentation/en -i Demo.xml Demo.dll --exceptions=added
Updating: Cadenza.ObjectCoda
Members Added: 0, Members Deleted: 0

(No members were added or deleted as we're only changing the documentation, and didn't add any types or members to the assembly.)

Now when we view ObjectCoda.xml, we can see the documentation that was present in the source code.

However, notice one other change. In the documentation we wrote, we had:

        ///   <code lang="C#" src="../../example.cs#With" />

Yet, within ObjectCoda.xml, we have:

          <code lang="C#" src="../../example.cs#With">Console.WriteLine(
    args.OrderBy(v => v)
    .With(c => c.ElementAt (c.Count()/2)));
</code>

What's going on here? What's going on is that mdoc will search for all <code/> elements. If they contain a //code/@src attribute, the specified file is read in and inserted as the //code element's value. The filename specified in the //code/@src attribute is relative to the documentation repository root. A further extension is that, for C# code, if the filename has an "anchor", a #region block of the same name is searched for within the source code.

The ../../example.cs file referenced in the //code/@src value has the contents:

using System;
using System.Linq;
using Cadenza;

class Demo {
    public static void Main (string[] args)
    {
        #region With
        Console.WriteLine(
            args.OrderBy(v => v)
            .With(c => c.ElementAt (c.Count()/2)));
        #endregion
    }
}

This makes it trivial to keep documentation examples actually compiling. For example, I'll have documentation refer to my unit tests, e.g.

<code lang="C#" src="../../Test/Cadenza/ObjectTest.cs#With" />

One final point worth mentioning: you can import documentation as often as you want. The imported documentation will always overwrite whatever is already present within the documentation repository. Consequently, if you want to use mdoc for display purposes but want to continue using inline XML documentation, always import the compiler-generated .xml file.

Now, we can update our HTML documentation:

$ mdoc export-html -o html Documentation/en
Cadenza.ObjectCoda

The current Demo.dll documentation.

Next time, we'll cover customizing the static HTML output.

Posted on 10 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Using mdoc

As mentioned last time, mdoc is an assembly-based documentation management system. Thus, before you can use mdoc you need an assembly to document. Let's write some C# source:

using System;

namespace Cadenza {
    public static class ObjectCoda {
        public static TResult With<TSource, TResult>(
                this TSource self, 
                Func<TSource, TResult> selector)
        {
            if (selector == null)
                throw new ArgumentNullException ("selector");
            return selector (self);
        }
    }
}

Compile it into an assembly (use csc if running on Windows):

$ gmcs /t:library /out:Demo.dll demo.cs

Now that we have an assembly, we can create the mdoc repository for the Demo.dll assembly, which will contain documentation stubs for all publically visible types and members in the assembly:

$ mdoc update -o Documentation/en Demo.dll --exceptions=added
New Type: Cadenza.ObjectCoda
Member Added: public static TResult With<TSource,TResult> (this TSource self, Func<TSource,TResult> selector);
Namespace Directory Created: Cadenza
New Namespace File: Cadenza
Members Added: 1, Members Deleted: 0

mdoc update is the command for synchronizing the documentation repository with the assembly; it can be run multiple times. The -o option specifies where to write the documentation repository. Demo.dll is the assembly to process; any number of assemblies can be specified. The --exceptions argument analyzes the IL to statically determine which exception types can be generated from a member. (It is not without some limitations; see the "--exceptions" documentation section.) The added argument to --exceptions tells mdoc to add <exception/> elements only for types and members that have been added to the repository, not to all types and members in the assembly. This is useful for when you've removed <exception/> documentation and don't want mdoc to re-add them.

We choose Documentation/en as the documentation repository location so that we can easily support localizing the documentation into multiple languages: each directory underneath Documentation would be named after an ISO 639-1 code, e.g. en is for English. This is only a convention, and is not required; any directory name can be used.

Notice that, since mdoc is processing assemblies, it will be able to work with any language that can generate assemblies, such as Visual Basic.NET and F#. It does not require specialized support for each language.

Now we have a documentation repository containing XML files; a particularly relevant file is ObjectCoda.xml, which contains the documentation stubs for our added type. I won't show the output here, but if you view it there are three important things to note:

  1. The XML is full of type information, e.g. the /Type/Members/Member/Parameters/Parameter/@Type attribute value.
  2. The XML contains additional non-documentation information, such as the //AssemblyVersion elements. This will be discussed in a future blog posting.
  3. The //Docs elements are a container for the usual C# XML documentation elements.

Of course, a documentation repository isn't very useful on its own. We want to view it! mdoc provides three ways to view documentation:

  1. mdoc export-html: This command generates a set of static HTML files for all types and members found within the documentation repository.
  2. mdoc assemble: This command "assembles" the documentation repository into a .zip and .tree file for use with the monodoc Documentation browser and the ASP.NET front-end (which powers http://www.go-mono.com/docs).
  3. mdoc export-msxdoc: This generates the "traditional" XML file which contains only member documentation. This is for use with IDEs like Visual Studio, so that the IDE can show summary documentation while editing.

We will cover mdoc assemble and mdoc export-msxdoc in future installments. For now, to generate static HTML:

$ mdoc export-html -o html Documentation/en
Cadenza.ObjectCoda

The current Demo.dll documentation.

Next time we will cover how to write actual documentation instead of just documentation stubs.

Posted on 09 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

TekPub's Mastering LINQ Challenge

Justin Etheredge has posted TekPub's Mastering LINQ Challenge, in which he lays out a "little LINQ challenge." The rules:

  1. You have to blog about a single LINQ query which starts with Enumerable.Range(1,n) and produces a list of prime numbers from the range. Thus, this blog posting. (Otherwise I'd rely on my twitter response.)
  2. You can't cheat. This is determined by me, and includes hardcoding values in the results. You'll know if you cheated. Part of me wonders if just being me qualifies as cheating, but that might imply that my computer self has too large an ego </ob-##csharp-meme>.
  3. Uses no custom LINQ methods. Here I ponder what constitutes a "custom LINQ method." Is any extension method a custom LINQ method? Any utility code?
  4. Will return all of the prime numbers of the sequence. It doesn't have to be super optimal, but it has to be correct. Boy is it not super optimal (it's a one liner!), but some improvements could make it better (e.g. Memoization, hence the prior question about whether extension methods constitute a "custom LINQ method").
  5. Be one of the first 5 people to blog a correct answer and then tweet this "I just solved the @tekpub LINQ challenge: <link to post>" will get any single TekPub screencast. The time of your solution will be based on your tweet! So be prompt!

    As far as timeliness goes, I'm writing this blog entry over four hours after my tweet, so, uh, so much for timeliness.

  6. You must link to both TekPub's website and this post in your blog post.

    Done, and done.

So, the quick and dirty, not at all efficient answer (with longer identifiers, as I certainly have more than 140 characters to play with):

Enumerable.Range(1, n).Where(value => 
    value <= 3
        ? true
        : Enumerable.Range(2, value - 2)
          .All(divisor => value % divisor != 0))

In English, we take all integers between 1 and n. Given a value from that sequence, if the value is less than or equal to 3, it's treated as prime. If it's greater than three, take all numbers from 2 through value-1 and see if any of them divides value with no remainder. If none of them do, value is prime.

We need to use value-2 in the nested Enumerable.Range call so that we skip the value itself (since we're starting at 2).

Now, we can improve upon this in a fairly straightforward fashion if we can use additional code. For example, if we use Bart de Smet's Memoize extension method on System.Func<T, TResult>, we can skip the repeated nested Enumerable.Range call on every value, as prime numbers don't change (and thus are prime candidates for caching ;-):

Func<int, bool> isPrime = value => 
    value <= 3
        ? true
        : Enumerable.Range(2, value - 2)
          .All(divisor => value % divisor != 0);
isPrime = isPrime.Memoize();
Enumerable.Range(1, n).Where(value => isPrime(value));

Whether this latter answer matches the rules depends upon the definition of "single LINQ query" (does the definition of isPrime need to be part of the LINQ query, or just its use?) and whether Bart's Memoize extension method qualifies as a "custom LINQ method" (I don't think it is...). The downside to the memoization is that it's basically a memory leak in disguise, so I still wouldn't call it "optimal," just that it likely has better performance characteristics than my original query...
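
For reference, a memoizing wrapper along those lines can be sketched as follows. This is a minimal illustration under my assumptions, not Bart's actual implementation, and it isn't thread-safe:

using System;
using System.Collections.Generic;

static class FuncExtensions {
    // Cache each argument's result; repeated calls with the same argument
    // return the cached value instead of re-running the (pure) function.
    public static Func<T, TResult> Memoize<T, TResult> (this Func<T, TResult> func)
    {
        var cache = new Dictionary<T, TResult> ();
        return arg => {
            TResult result;
            if (!cache.TryGetValue (arg, out result))
                cache [arg] = result = func (arg);
            return result;
        };
    }
}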

Posted on 08 Jan 2010 | Path: /development/ | Permalink

What is mdoc?

mdoc is an assembly-based documentation management system, which recently added support for .NET.

I say "assembly based" because an alternative is source-based, which is what "normal" C# XML documentation, JavaDoc, and perlpod provide. Unlike these source-based systems, in mdoc documentation for public types and members are not present within source code. Instead, documentation is stored externally (to the source), in a directory of XML files (hereafter refered to as the mdoc repository).

Furthermore, mdoc provides commands to:

Why the mdoc repository?

Why have a directory of XML files as the mdoc repository? The mdoc repository comes from the need to satisfy two goals:

  1. The compiler-generated /doc XML contains no type information.
  2. Having types is very useful for HTML output/etc., so the type information must come from somewhere.

Said "somewhere" could be the actual assemblies being documented, but this has other downsides (e.g. it would complicate supporting different versions of the same assembly). mdoc uses the repository to contain both documentation and full type information, so that the source assemblies are only needed to update the repository (and nothing else).

Why use mdoc?

Which provides enough background to get to the point: why use mdoc?

You would primarily want to use mdoc if you want to view your documentation outside of an IDE, e.g. within a web browser or stand-alone documentation browser. Most mdoc functionality is geared toward making documentation viewable (e.g. mdoc export-html and mdoc assemble), and making the documentation that is viewed more useful (such as the full type information provided by mdoc update and the generation of <exception/> elements for documentation provided by mdoc update --exceptions).

Next time, we'll discuss how to use mdoc.

Posted on 08 Jan 2010 | Path: /development/mono/mdoc/ | Permalink

Re-Introducing mdoc

Many moons ago, Jon Skeet announced Noda Time. In it he asked:

How should documentation be created and distributed?

Thus I pondered, "how well does mdoc support Windows users?"

The answer: not very well, particularly in an interop scenario.

So, lots of bugfixing and a false-start later, and I'd like to announce mdoc for .NET. All the power of mdoc, cross-platform.

Note that these changes did not make it into Mono 2.6, and won't be part of a formal Mono release until Mono 2.8. Consequently, if you want to run things under .NET, you should use the above ZIP archive. (You can, of course, install Mono on Windows and then use mdoc via Mono, you just won't be able to run mdoc under .NET.)

The changes made since Mono 2.6 include:

Next time, we'll cover what mdoc is, and why you'd want to use it.

Posted on 07 Jan 2010 | Path: /development/mono/mdoc/ | Permalink