Saturday, January 16, 2016

Playing Fetch with XML

Motivation

Importing data into Microsoft Dynamics CRM is pretty convenient. However, the data we have to import is usually not perfect, so we often end up importing and deleting the same data many times before we are satisfied. Both of these processes are quite slow and require a bit of manual work. This method also means that we can't start using the data until it is imported correctly, because relations to it will be broken. Even in situations where the data is perfect, like when we move data from a development or testing system to a production system, we first have to export it, manually change the format, and manually import it again.

Our next project is to make data import easier. We cannot cover everything within this area, so this is a running project, which we will develop and expand over time.

In this post, we lay the foundation for a neat and minimal library for querying the CRM system. The most common way for users to query the system is the Advanced Find functionality. Advanced Find provides an interface where we can choose which attributes to display and set up conditions on which records to retrieve. Behind the scenes this is represented as XML, more specifically FetchXML. FetchXML is the way we are going to query the system. XML is not the nicest interface for humans, so we are going to abstract away the syntax and start introducing types.

Remember: this is a technical post; a usage/tutorial post will follow next week.

Prerequisites

In order to build an F# library to abstract away XML we need to be familiar with the following concepts:

Utilities

For this library, we don't need a lot of new utilities. Actually, we only need one new function:

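module StringUtils =
  // Upper-case the first character; null or empty strings are returned unchanged.
  let capitalize_first (str : string) = 
    if System.String.IsNullOrEmpty str then str
    else string (System.Char.ToUpper str.[0]) + str.Substring(1)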

Unsafe FetchXml

In the first version of the library we focus only on the basic functionality. Afterwards we consider how to improve usability by introducing types to help catch bugs. The library consists of five functions for: initialization, setting how many records to retrieve, setting up conditions on the lookup, choosing which attributes to retrieve, and finally generating the finished FetchXML.

As mentioned earlier on this blog: I like chain-calling. So like last time we have a value – the 'needle' – that is 'threaded' through all the calls. This needle is the last argument to every function, and every function returns it, or a variation of it.
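To make the threading idea concrete, here is a minimal sketch on a toy record. The `greeting` type and its functions are hypothetical and exist only for illustration; they are not part of the library built in this post.

```fsharp
// Hypothetical toy record, used only to illustrate the threading pattern.
type greeting = { name : string; punctuation : string }

// Creates the initial needle.
let init name = { name = name; punctuation = "." }

// The needle `g` is the last argument, and a variation of it is returned.
let set_punctuation p g = { g with punctuation = p }

// The final call turns the needle into the finished value.
let render g = "Hello, " + g.name + g.punctuation

// Because the needle comes last, the calls chain with the pipe operator.
let text =
  init "CRM"
  |> set_punctuation "!"
  |> render

printfn "%s" text
```

Putting the needle last is what makes partial application line up with `|>`: each step is a function that is only missing its final argument.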

Initialization

For now, we are only interested in basic functionality, so we only need to store:

  • which entity we want to retrieve from
  • how many records to retrieve
  • which attributes to retrieve
  • and the conditions on the records.

The final value is a string representing XML, so many of these can be represented as strings. However, for the conditions there are some advantages to using a custom datatype. First, it is difficult for users to remember which conditions are possible and how to write them: their format and keyword. If the conditions are represented by a custom datatype, these problems are solved by code completion and custom code, respectively. Second, if the conditions were strings and we ever wanted to extend the library, we might have to parse them, which we would rather not.

Note: I have only chosen a few central types of condition, but the datatype can easily be extended with more.

module FetchXml =
  type condition =
    | Equals of string
    | In of string list
    | ContainsData
  type fetchxml =
    { entity : string;
      count : int option;
      attributes : string list;
      conditions : (string * condition) list }
  let init ent =
    { entity = ent;
      count = None;
      attributes = [];
      conditions = []; }

By default, we retrieve all attributes and all records.

Setting how many Records to Retrieve

Limiting how many records to retrieve is as easy as updating count:

  let set_count count fxml =
    { fxml with count = Some count }

Choosing which Attributes to Retrieve

Similarly, if we want to limit which attributes to retrieve we simply add them, one at a time:

  let add_attribute attr fxml =
    { fxml with attributes = attr :: fxml.attributes }

Setting up Conditions on the Lookup

Finally, to limit which records to retrieve we simply add attribute-condition pairs, one at a time:

  let add_condition attr cond fxml =
    { fxml with conditions = (attr, cond) :: fxml.conditions }
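As a quick sketch of how these builder functions chain together, the block below restates the definitions from this section (without the enclosing module, so that it runs on its own) and builds one query. The query itself, including the attribute names, is only an illustrative assumption.

```fsharp
type condition =
  | Equals of string
  | In of string list
  | ContainsData

type fetchxml =
  { entity : string
    count : int option
    attributes : string list
    conditions : (string * condition) list }

let init ent = { entity = ent; count = None; attributes = []; conditions = [] }
let set_count count fxml = { fxml with count = Some count }
let add_attribute attr fxml = { fxml with attributes = attr :: fxml.attributes }
let add_condition attr cond fxml = { fxml with conditions = (attr, cond) :: fxml.conditions }

// Illustrative query: at most 10 users that have a full name.
let query =
  init "systemuser"
  |> set_count 10
  |> add_attribute "fullname"
  |> add_attribute "domainname"
  |> add_condition "fullname" ContainsData

// Each call conses onto the front of the list, so the lists end up
// in the reverse order of the calls.
printfn "%A" query.attributes
```

The reversed order is harmless here, since the filter combines conditions with "and" and the attributes are simply a set of columns to retrieve.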

Generating the Finished FetchXML

Given a fetchxml-record, it is straightforward to generate the XML. One thing to note is that a raw ampersand cannot appear in values in FetchXML, so it has to be escaped as &amp;.

  let generate fxml =
    let e = fxml.entity in
    "<fetch mapping=\"logical\"" + 
    (match fxml.count with
      | None -> ""
      | Some c -> " count=\"" + string c + "\"") +
    " version=\"1.0\">" + 
    "<entity name=\"" + e + "\">" +
    List.foldBack (fun a acc -> "<attribute name=\"" + a + "\" />" + acc) fxml.attributes "" +
    "<filter type=\"and\">" +
    List.foldBack (fun (a, c) acc -> 
      match c with
        | Equals s ->
          "<condition attribute=\"" + a + "\" operator=\"eq\" value=\"" + s.Replace("&", "&amp;") + "\" />" + acc
        | In vs ->
          "<condition attribute=\"" + a + "\" operator=\"in\">" +
          List.foldBack (fun v acc -> "<value>" + v + "</value>" + acc) vs "" +
          "</condition>" + acc
        | ContainsData ->
          "<condition attribute=\"" + a + "\" operator=\"not-null\" />" + acc) fxml.conditions "" +
    "</filter>" +
    "</entity>" +
    "</fetch>"

This concludes the core library. This is fine if we want to use it from other code, but people are not great with strings. People make spelling mistakes, forget which entities have which attributes, or, more subtly, forget to update their code when the CRM system changes.
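On the ampersand remark above: XML reserves a few more characters than the ampersand, and values inside an `In` condition can contain them too. A minimal sketch of a more general helper could look like this. The name `escape_xml` is hypothetical and not part of the library above.

```fsharp
// Hypothetical helper: escape the five characters that XML reserves.
// The ampersand must be handled first; otherwise the ampersands
// introduced by the later replacements would be escaped again.
let escape_xml (s : string) =
  [ "&", "&amp;"; "<", "&lt;"; ">", "&gt;"; "\"", "&quot;"; "'", "&apos;" ]
  |> List.fold (fun (acc : string) (raw : string, esc : string) -> acc.Replace(raw, esc)) s

printfn "%s" (escape_xml "R&D <b>")
```

Such a helper could be applied to every value that ends up inside an attribute or a `<value>` element, though whether that is needed depends on the data being queried.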

Invent datatypes

One way to mitigate these problems is to use types. We need a way to connect entities and attributes with types, but where all entities still have 'similar' types. For this, we use polymorphism. Here is a toy example of how this would look:

module Attribute =
  module systemuser =
    type attribute = Name | Fullname
    let string_of_attribute = function
      | Name -> "name"
      | Fullname -> "fullname"
module Entity =
  type 'a entity = private { logical_name : string; string_of : 'a -> string }
  let string_of_entity e = e.logical_name
  let string_of_attribute e a = e.string_of a
  let systemuser = 
    { logical_name = "systemuser"; 
      string_of = Attribute.systemuser.string_of_attribute }

Notice that because the entity record is private we cannot accidentally make an invalid entity.
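To see what the type parameter buys us, here is a sketch that extends the toy example with a second entity. The `account` module and its attributes are assumptions added only for illustration.

```fsharp
module Attribute =
  module systemuser =
    type attribute = Name | Fullname
    let string_of_attribute = function
      | Name -> "name"
      | Fullname -> "fullname"
  // Hypothetical second entity, added for illustration.
  module account =
    type attribute = Name | Telephone1
    let string_of_attribute = function
      | Name -> "name"
      | Telephone1 -> "telephone1"

module Entity =
  type 'a entity = private { logical_name : string; string_of : 'a -> string }
  let string_of_entity e = e.logical_name
  let string_of_attribute e a = e.string_of a
  let systemuser =
    { logical_name = "systemuser"
      string_of = Attribute.systemuser.string_of_attribute }
  let account =
    { logical_name = "account"
      string_of = Attribute.account.string_of_attribute }

// The attribute type has to match the entity's type parameter.
let ok = Entity.string_of_attribute Entity.systemuser Attribute.systemuser.Fullname

// Mixing entities is rejected by the type checker; this line would not compile:
// Entity.string_of_attribute Entity.systemuser Attribute.account.Telephone1

printfn "%s" ok
```

Both entities share the same record shape, so functions such as `string_of_entity` work on either, while the type parameter keeps their attributes apart.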

Generate datatypes

Writing the datatypes for all attributes and entities by hand is unmanageable. Moreover, even if we could, the result would be static, so we would need to update it every time something changed in the CRM system. We need a way to generate it:

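open System.IO                    // File
open Microsoft.Xrm.Sdk.Messages   // RetrieveAllEntitiesRequest, RetrieveAllEntitiesResponse
open Microsoft.Xrm.Sdk.Metadata   // EntityFilters

// xrm and cfg are not defined in this post; they must already be in scope.
let meta = 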
  let er = RetrieveAllEntitiesRequest () in
  er.EntityFilters <- EntityFilters.Attributes;
  xrm.OrganizationService.Execute(er) :?> RetrieveAllEntitiesResponse in
meta.EntityMetadata
|> Array.fold (fun (attrs, ents) i -> 
  let attributes = i.Attributes |> Array.fold (fun acc a -> acc + " | " + StringUtils.capitalize_first (a.LogicalName)) "" in
  let string_of = i.Attributes |> Array.fold (fun acc a -> acc + "      | " + StringUtils.capitalize_first (a.LogicalName) + "-> \"" + a.LogicalName + "\"\n") "" in
  (attrs + 
   "  module " + i.LogicalName + " = \n" +
   "    type attribute = " + attributes + "\n" +
   "    let string_of_attribute = function \n" + string_of,
   ents +
   "  let " + i.LogicalName + "= \n" +
   "    { logical_name = \"" + i.LogicalName + "\"; \n" +
   "      string_of = Attributes." + i.LogicalName + ".string_of_attribute } \n")
   ) ("module Attributes = \n", 
      "module Entities = \n" +
      "  type 'a entity = private { logical_name : string; string_of : 'a -> string }\n" +
      "  let string_of_entity e = e.logical_name\n" +
      "  let string_of_attribute e a = e.string_of a\n")
|> fun (attrs, ents) -> 
  File.WriteAllText (cfg.rootFolder + "/Crm.fsx", attrs + ents);
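To see what the generator produces without a CRM connection, here is a rough sketch of the attribute half of the fold, run on a hard-coded stand-in for the metadata. The `sample` list and the function name `attribute_module` are assumptions made for illustration; the real code reads the names from `meta.EntityMetadata`.

```fsharp
// Hypothetical stand-in for the CRM metadata:
// (entity logical name, attribute logical names).
let sample = [ "systemuser", [ "name"; "fullname" ] ]

let capitalize_first (s : string) =
  if System.String.IsNullOrEmpty s then s
  else string (System.Char.ToUpper s.[0]) + s.Substring 1

// Builds the source text of one nested attribute module.
let attribute_module (ent : string, attrs : string list) =
  // One union case per attribute, e.g. " | Name | Fullname".
  let cases =
    attrs |> List.map (fun a -> " | " + capitalize_first a) |> String.concat ""
  // One match arm per attribute, mapping the case back to its logical name.
  let arms =
    attrs
    |> List.map (fun a -> "      | " + capitalize_first a + " -> \"" + a + "\"\n")
    |> String.concat ""
  "  module " + ent + " =\n" +
  "    type attribute =" + cases + "\n" +
  "    let string_of_attribute = function\n" + arms

let generated =
  "module Attributes =\n" +
  (sample |> List.map attribute_module |> String.concat "")

printfn "%s" generated
```

The printed text has the same shape as the hand-written toy example earlier: a nested module per entity, a union type with one case per attribute, and a function mapping each case back to its logical name.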

Safe FetchXml

With all these datatypes, we can modify the fetchxml-record to use these instead of strings:

type 'a fetchxml =
  { entity : 'a Crm.Entities.entity;
    count : int option;
    attributes : 'a list;
    conditions : ('a * condition) list }

Of course we also need to change generate accordingly, but that is straightforward.
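For reference, one way the typed `generate` could look is sketched below. It is a simplified assumption rather than the final code: `count` and the `In` condition are left out, and a hand-written `systemuser` entity stands in for the generated `Crm.Entities` module.

```fsharp
module Entities =
  type 'a entity = private { logical_name : string; string_of : 'a -> string }
  let string_of_entity e = e.logical_name
  let string_of_attribute e a = e.string_of a
  // Hypothetical hand-written entity standing in for the generated ones.
  type systemuser_attribute = Name | Fullname
  let systemuser =
    { logical_name = "systemuser"
      string_of = function Name -> "name" | Fullname -> "fullname" }

type condition =
  | Equals of string
  | ContainsData

type 'a fetchxml =
  { entity : 'a Entities.entity
    attributes : 'a list
    conditions : ('a * condition) list }

let init ent = { entity = ent; attributes = []; conditions = [] }
let add_attribute attr fxml = { fxml with attributes = attr :: fxml.attributes }
let add_condition attr cond fxml = { fxml with conditions = (attr, cond) :: fxml.conditions }

let generate (fxml : 'a fetchxml) =
  // The only real change: attributes are turned into strings via the entity.
  let name a = Entities.string_of_attribute fxml.entity a
  let attrs =
    fxml.attributes
    |> List.map (fun a -> "<attribute name=\"" + name a + "\" />")
    |> String.concat ""
  let conds =
    fxml.conditions
    |> List.map (fun (a, c) ->
      match c with
      | Equals s ->
        "<condition attribute=\"" + name a + "\" operator=\"eq\" value=\"" + s.Replace("&", "&amp;") + "\" />"
      | ContainsData ->
        "<condition attribute=\"" + name a + "\" operator=\"not-null\" />")
    |> String.concat ""
  "<fetch mapping=\"logical\" version=\"1.0\">" +
  "<entity name=\"" + Entities.string_of_entity fxml.entity + "\">" +
  attrs + "<filter type=\"and\">" + conds + "</filter></entity></fetch>"

let xml =
  init Entities.systemuser
  |> add_attribute Entities.Fullname
  |> add_condition Entities.Name (Equals "R&D")
  |> generate

printfn "%s" xml
```

Once the entity is fixed by `init`, the type parameter is fixed as well, so every later `add_attribute` and `add_condition` call only accepts attributes of that entity.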

As mentioned earlier, strings actually work better with other code. If we still want this option, we need to add this "entity-generator" to the Entities-module:

  let unsafe lname = { logical_name = lname; string_of = id }

Quality Control

As this library does not modify any data, we don't need to be as critical of it.

A Note on Performance

My focus is usually more on correctness and aesthetics; for once, we will consider the performance. The problem is that we are generating an 80 kB file. This file needs to be read, parsed, and type checked, which turns out to be very slow.

One solution to this problem is to split up the entities into separate files, but then we have to forgo the advantage of the private record, namely that we cannot obtain an illegal entity. That is aesthetically less pleasing, but in practice it may be the right choice.
