Saturday, January 29, 2011

Automatically publishing files provisioned with Sandboxed Solutions

A couple of days ago I saw this posting on Waldek Mastykarz’s blog (a great blog, by the way), and now you are reading my posting with the very same name, which treats the very same issue but provides a different solution to it. So, what exactly is the issue with file provisioning in sandboxed solutions? In short – the problem occurs when you use “Module” elements in a feature to provision files to SharePoint document libraries. And not to just any document library, but to a library configured with one or more of the following:

  1. Check out for files is required
  2. Content approval is enabled
  3. Minor versions are enabled

The files will of course be provisioned to the target library, but they will be either checked out and/or in a draft or pending state, depending on which of the three settings above are enabled on the library and in what combination. This means that the files will be present but inaccessible to the regular users of the site. This behavior is obviously intended and by design – I suppose the main reason for it is to give site and site collection administrators the ability to review and then approve or reject the content provisioned by the sandboxed solution.

Having said that, I should mention that I have known about this issue for quite some time myself and even tackled one isolated case of it in a previous posting of mine – it is about provisioning publishing pages with sandboxed solutions. The problem with publishing pages was even more serious – not only do the pages end up in a draft state and checked out, but the web parts defined in the “Module” manifest file never get provisioned (check the posting itself for more details). The solution for publishing pages was pretty neat and simple (it doesn’t need a feature receiver and code with it), but unfortunately it is applicable only to publishing pages.

As for the solution itself – it is known that the SPFile.CheckIn, SPFile.Publish and SPFile.Approve methods are available in sandboxed solutions too, so with a feature receiver and several lines of code the solution would be pretty trivial. But obviously the best solution here would be a reusable feature receiver that you write just once and can then use everywhere without changing or adjusting its code. Such a universal feature receiver will save you a lot of time and effort, since “Module” features are used quite frequently in SharePoint. The main technical challenge for this universal feature receiver is that its code should be “aware” of exactly which files its feature provisions to the target library (or libraries). This type of “awareness” is provided out of the box by the built-in SPFeatureDefinition.GetElementDefinitions method, but unfortunately this method is not available in the sandbox subset of the object model. So we need a different solution, and I will start by briefly explaining the solution from Waldek’s blog – it is pretty simple and requires very little additional effort: you just add an extra “Property” element to every “File” element in your “Module” manifest file, like this:
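To make the “trivial” variant concrete, here is a minimal sketch of such a non-reusable receiver. Note that the class name and the file URLs are hypothetical and would have to be hard-coded anew for every feature – which is exactly the maintenance burden the reusable approaches below avoid:

```csharp
using System;
using Microsoft.SharePoint;

// A minimal, non-reusable sketch: the URLs below are hypothetical examples
// and would need adjusting for every feature that uses this receiver.
public class HardcodedCheckinReceiver : SPFeatureReceiver
{
    public override void FeatureActivated(SPFeatureReceiverProperties properties)
    {
        SPWeb web = properties.Feature.Parent as SPWeb;
        if (web == null) return;

        foreach (string url in new[] { "Style Library/Sample.txt", "Style Library/Sample2.txt" })
        {
            SPFile file = web.GetFile(url);
            // check in first, then publish or approve depending on the library settings
            if (file.Level == SPFileLevel.Checkout) file.CheckIn("", SPCheckinType.MajorCheckIn);
            if (file.Level == SPFileLevel.Draft)
            {
                if (file.DocumentLibrary.EnableModeration) file.Approve("");
                else file.Publish("");
            }
        }
    }
}
```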

<File Path="Images\someimage.png" Url="someimage.png" Type="GhostableInLibrary">

  <Property Name="FeatureId" Value="$SharePoint.Feature.Id$" Type="string"/>

</File>

The idea is as follows – since your target library won’t have a column named “FeatureId”, the value of this property element will be saved in the property bag of the underlying SPFile instance of the provisioned file. And the value of the property element is none other than the Guid of the parent feature (note the smart use of the Visual Studio token syntax). So by adding this extra “Property” element to all “File” elements in the manifest file, you literally “mark” all files that your feature provisions. The logic of the feature receiver is then to iterate over all files in the target library and, after finding all marked files, to check them in, approve them or publish them as appropriate. Additionally, you have to provide the name(s) of the target library (or libraries) to the feature receiver somehow, which is possible by using feature properties in the feature.xml file of your feature.
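A sketch of what such a receiver might look like – the “TargetLibrary” feature property name is my assumption for this example (any name would do, as long as the feature.xml and the receiver agree on it), and the string comparison of the Guid may need adjusting to the exact format the token resolves to:

```csharp
using System;
using Microsoft.SharePoint;

// Sketch of the "marker property" approach: iterate the items of the target
// library and process only the files stamped with this feature's Guid.
// The 'TargetLibrary' feature property name is an assumption for this example.
public class MarkedFilesReceiver : SPFeatureReceiver
{
    public override void FeatureActivated(SPFeatureReceiverProperties properties)
    {
        SPWeb web = properties.Feature.Parent as SPWeb;
        if (web == null) return;

        // the library name is passed via a feature property in feature.xml
        string libraryName = properties.Feature.Properties["TargetLibrary"].Value;
        SPList library = web.Lists[libraryName];
        string featureId = properties.Feature.DefinitionId.ToString();

        foreach (SPListItem item in library.Items)
        {
            SPFile file = item.File;
            if (file == null) continue;

            // check the marker saved in the file's property bag
            object marker = file.Properties["FeatureId"];
            if (marker == null ||
                !string.Equals(marker.ToString(), featureId, StringComparison.OrdinalIgnoreCase)) continue;

            if (file.Level == SPFileLevel.Checkout) file.CheckIn("", SPCheckinType.MajorCheckIn);
            if (file.Level == SPFileLevel.Draft)
            {
                if (file.DocumentLibrary.EnableModeration) file.Approve("");
                else file.Publish("");
            }
        }
    }
}
```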

And now to the reasons why I started this article and thought of a different solution – the solution above is pretty neat and straightforward and involves very little extra implementation effort, but what I didn’t like about it was the iteration over all files in the target library (or libraries), which may happen to contain thousands of files (which is in fact very improbable, but this thought always gives me the creeps when I think of certain aspects of SharePoint performance). Of course, there are several ways to optimize the iteration – for instance, to iterate the SPList.Items collection instead of the SPFolder/SPFile hierarchy (the former is generally faster) and use the contents of the internal system MetaInfo list column, which contains the serialized property bag of the underlying SPFile instance (unfortunately you cannot use CAML filtering on this column). Further, a very simple but effective optimization would be to use CAML filtering on the “Created”, “_Level” and “_ModerationStatus” fields, so that you fetch only items that were created recently (say, with a margin of 5 or 10 minutes in the past; note that even if a file already exists in the library, its “Created” date gets updated when the feature provisions it again) and that are either checked out or in a draft or pending state.
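A rough sketch of such a CAML filter follows. The field names are the standard SharePoint internal fields, but the exact condition nesting, the “_Level” values and the date value (a placeholder here, to be computed at runtime as “now minus a few minutes”) would need verifying against your environment:

```xml
<Where>
  <And>
    <Geq>
      <FieldRef Name='Created' />
      <Value Type='DateTime' IncludeTimeValue='TRUE'>2011-01-29T12:50:00Z</Value>
    </Geq>
    <Or>
      <!-- _Level: 1 = published; anything else means draft or checked out -->
      <Neq>
        <FieldRef Name='_Level' />
        <Value Type='Integer'>1</Value>
      </Neq>
      <Eq>
        <FieldRef Name='_ModerationStatus' />
        <Value Type='ModStat'>Pending</Value>
      </Eq>
    </Or>
  </And>
</Where>
```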

So far so good, but I still wanted to find a solution as close as possible to the “self-awareness” approach of the SPFeatureDefinition.GetElementDefinitions method, so that the feature receiver knows exactly which files its feature provisions without having to traverse the target library (or libraries) searching for them. And what occurred to me was this: if we need the contents of the manifest file in the feature receiver, why can’t we simply provision it to the target site (it is available in the feature definition as an element file, and there is nothing wrong with also referencing it in a “File” element in the manifest file), then read its contents from the feature receiver, and finally, once the files are published, let the feature receiver safely delete it. Let me show you a sample “Module” manifest file so that you can get a better idea of the trick I just explained:

<Elements xmlns="http://schemas.microsoft.com/sharepoint/">

  <Module Name="TestModule" Url="Style Library">

    <File Path="TestModule\Sample.txt" Url="Sample.txt" Type="GhostableInLibrary" />

    <File Path="TestModule\Sample2.txt" Url="Sample2.txt" Type="GhostableInLibrary" />

    <File Path="TestModule\Sample3.txt" Url="test/Sample3.txt" />

  </Module>

  <Module Name="TestModule2">

    <File Path="TestModule\Elements.xml" Url="Elements_$SharePoint.Feature.Id$.xml" Type="Ghostable" />

  </Module>

</Elements>

You can see that the manifest file (whose name is the commonplace “elements.xml”) contains two “Module” elements. The first one provisions several files to the standard “Style Library” (a pretty recurring task in SharePoint development). Now look carefully at the second one – this is the only extra bit in the manifest file that you need (the other bit is the reusable feature receiver): it is a “Module” element that provisions the manifest file itself. Note two things. First, the manifest file gets provisioned to the root folder of the target site – we don’t want to put it in a document library, because there it would be directly visible to the site users. Secondly, check the URL of the file – it contains the Guid of the feature, and the feature receiver uses exactly this to locate the file when it executes. The pattern of the URL should be: [any number of characters that are unique among the “Module” manifest files in the parent feature]_$SharePoint.Feature.Id$.xml (the part from the underscore character on should always be the same) – basically we want a unique target name for every “Module” manifest file in the feature. And here is the code of the feature receiver itself:

    using System;
    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;
    using System.Xml.Linq;
    using Microsoft.SharePoint;

    public class TestFeatureEventReceiver : SPFeatureReceiver
    {
        // The SharePoint elements file namespace
        private static readonly XNamespace WS = "http://schemas.microsoft.com/sharepoint/";

        public override void FeatureActivated(SPFeatureReceiverProperties properties)
        {
            // make it work for both 'Site' and 'Web' scoped features
            SPWeb web = properties.Feature.Parent as SPWeb;
            if (web == null && properties.Feature.Parent is SPSite) web = ((SPSite)properties.Feature.Parent).RootWeb;
            if (web != null) CheckinFiles(web, properties.Feature.DefinitionId);
        }

        private void CheckinFiles(SPWeb web, Guid featureID)
        {
            // create a regular expression pattern for the manifest files
            string pattern = string.Format(@"^.+_{0}\.xml$", featureID);
            Regex fileNameRE = new Regex(pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);

            // get the manifest files from the root folder of the site
            SPFile[] manifestFiles = web.RootFolder.Files.Cast<SPFile>().Where(f => fileNameRE.IsMatch(f.Name)).ToArray();
            try
            {
                // iterate the manifest files
                foreach (SPFile manifestFile in manifestFiles)
                {
                    // load the contents of the manifest file into an XDocument
                    XDocument manifestDoc;
                    using (StreamReader reader = new StreamReader(new MemoryStream(manifestFile.OpenBinary()), true))
                    {
                        manifestDoc = XDocument.Load(reader, LoadOptions.None);
                    }

                    // iterate over the 'Module' and 'File' elements in the XDocument, concatenating their Url attributes in a smart way so that we get the site relative file Url-s
                    string[] fileUrls = manifestDoc.Root.Elements(WS + "Module")
                        .SelectMany(me => me.Elements(WS + "File"), (me, fe) => string.Join("/", new XAttribute[] { me.Attribute("Url"), fe.Attribute("Url") }.Select(attr => attr != null ? attr.Value : null).Where(val => !string.IsNullOrEmpty(val)).ToArray()))
                        .ToArray();

                    // iterate the file url-s
                    foreach (string fileUrl in fileUrls)
                    {
                        // get the file
                        SPFile file = web.GetFile(fileUrl);
                        // depending on the settings of the parent document library we may need to check in and/or (publish or approve) the file
                        if (file.Level == SPFileLevel.Checkout) file.CheckIn("", SPCheckinType.MajorCheckIn);
                        if (file.Level == SPFileLevel.Draft)
                        {
                            if (file.DocumentLibrary.EnableModeration) file.Approve("");
                            else file.Publish("");
                        }
                    }
                }
            }
            finally
            {
                // finally, delete the manifest files from the site root folder
                foreach (SPFile manifestFile in manifestFiles) manifestFile.Delete();
            }
        }
    }

As you can see, the receiver’s code is quite small and straightforward (check also the comments inside the code). In short, it does the following: it first iterates the files in the site root folder, matching their names against the name pattern mentioned above; then it loads the manifest files one by one into an XDocument object and extracts the URLs of the files from every manifest file. After that, the provisioned files are checked in, published or approved as necessary (depending on their state and the settings of the containing document library). The final step is to delete the manifest file(s) from the site root folder, since they are no longer needed.
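Since the Url-concatenating LINQ query is the densest part of the receiver, here is the same logic isolated into a small self-contained program (no SharePoint dependencies), run against an in-memory copy of the sample manifest above, so you can experiment with it outside SharePoint:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ManifestUrlDemo
{
    static readonly XNamespace WS = "http://schemas.microsoft.com/sharepoint/";

    // Same logic as in the receiver: join the Module Url (if present) with each File Url
    public static string[] ExtractFileUrls(string manifestXml)
    {
        XDocument doc = XDocument.Parse(manifestXml);
        return doc.Root.Elements(WS + "Module")
            .SelectMany(me => me.Elements(WS + "File"),
                (me, fe) => string.Join("/",
                    new XAttribute[] { me.Attribute("Url"), fe.Attribute("Url") }
                        .Select(attr => attr != null ? attr.Value : null)
                        .Where(val => !string.IsNullOrEmpty(val))
                        .ToArray()))
            .ToArray();
    }

    public static void Main()
    {
        string xml =
            "<Elements xmlns='http://schemas.microsoft.com/sharepoint/'>" +
            "<Module Name='TestModule' Url='Style Library'>" +
            "<File Path='TestModule\\Sample.txt' Url='Sample.txt' Type='GhostableInLibrary' />" +
            "<File Path='TestModule\\Sample3.txt' Url='test/Sample3.txt' />" +
            "</Module>" +
            "<Module Name='TestModule2'>" +
            "<File Path='TestModule\\Elements.xml' Url='Elements_guid.xml' Type='Ghostable' />" +
            "</Module>" +
            "</Elements>";

        foreach (string url in ExtractFileUrls(xml)) Console.WriteLine(url);
        // prints:
        // Style Library/Sample.txt
        // Style Library/test/Sample3.txt
        // Elements_guid.xml
    }
}
```

Note how a “File” element in a “Module” without a Url attribute yields a site-relative URL on its own, which is exactly why the self-provisioned manifest file lands in the site root folder.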