Showing posts with label Data Manipulation. Show all posts

Monday, February 29, 2016

Import-ant Data

Previously on Dr. Lambda's blog:

Importing data into Microsoft Dynamics CRM is convenient; however, the data we have to import is usually not perfect, so we often end up importing and deleting the same data many times before we are satisfied. Both of these processes are quite slow and require a bit of manual work. Another caveat of this method is that we cannot start using the data until it is imported correctly, because relations to it will be broken.

import : string -> unit

Now, the continuation...

Off the Shelf Usage

Our tool can import CSV files into CRM. To do this, however, we must follow strict rules.

The first rule is that the name of our CSV file should be the logical name of the entity we want the records to have. Thus, if we want to import contacts, the file should be called contact.csv.

Further, the first line in the file should contain the headers, which also follow strict rules.

Field types

Normal Values

For string (or two option) values the header should be the logical name of the attribute.

Examples:

name
donotemail
drl_custom_attribute

Local Option Sets

For values which are labels of a local option set we add a : after the logical name of the attribute.

Examples:

paymenttermscode:
drl_customerpricegroup:

Global Option Sets

For values which are labels of a global option set we further add the logical name of the option set after the :.

Examples:

budgetstatus:budgetstatus
drl_attribute:drl_option_set

Lookup Fields

For values which refer to other entities, we add a :, then the logical name of the target entity, then a ., and then the logical name of the field to look up based on.

Examples:

transactioncurrencyid:transactioncurrency.isocurrencycode
primarycontactid:contact.emailaddress1

Additional

Primary Keys

We can mark a number of fields as primary. The tool will then try to update records before it creates new ones: if a record exists that matches the one we would create on all the primary fields, we update that record instead.

We annotate a field as primary by adding a * immediately after its logical name.

Examples:

name*
primarycontactid*:contact.emailaddress1

Ignore

We can also have fields in our CSV files which contain data we don’t want in CRM. If we prefix a header with # the column will be ignored.

Examples:

# secret data
# password

Putting it together

Here is a toy example of a complete CSV file, ready for import.

contact.csv:

firstname ; lastname ; emailaddress1* ; parentcustomerid:account.name ; preferredcontactmethodcode: ; # age
Jacob ; Goldberg ; first@bing.com ; Microsoft ; Phone ; 43
John ; Doe ; untitled@gmail.com ; Google ; Email ; 28
Michael ; Lincoln ; michael82@hotmail.com ; Starbucks ; Mail ; 23
Christian ; Gosling ; real_gosling@live.com ; Blizzard ; Any ; 35
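Assuming the script from the "Import-ant Tool" post below is loaded and connected to CRM, importing this file is then a single call. The argument is the entity's logical name, i.e. the file name minus .csv:

```fsharp
// Reads cfg.rootFolder + "/contact.csv"; rows matching an existing record
// on emailaddress1 (the primary field) are updated, the rest are created.
import_data "contact"
```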

Monday, February 15, 2016

Import-ant Tool

Motivation

Importing data into Microsoft Dynamics CRM is convenient; however, the data we have to import is usually not perfect, so we often end up importing and deleting the same data many times before we are satisfied. Both of these processes are quite slow and require a bit of manual work. Another caveat of this method is that we cannot start using the data until it is imported correctly, because relations to it will be broken.

In this post, we implement a small F# script to import data into Microsoft Dynamics CRM. If the data is already in the CRM system, the script corrects the existing record with the new data, thereby preserving relations already in place. This means we can start using our data immediately, without fear of loss.

Remember: this is a technical post; a usage/tutorial post will follow next week.

Prerequisites

For once, we are not working with XML, so we only need:

Utilities

Apart from the FetchXml library we made last week, we only need this simple function:

let rec assoc key map = 
  match map with
    | [] -> None
    | (k, v) :: map' ->
      if k = key
      then Some v
      else assoc key map'
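For example, assoc walks an association list and returns the first value whose key matches, if any:

```fsharp
assoc "Email" [("Phone", 1); ("Email", 2)]  // Some 2
assoc "Fax"   [("Phone", 1); ("Email", 2)]  // None
```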

The Library

Our goal is to load data from a CSV-file (comma-separated values) into CRM.

It is straightforward to load strings; however, it is a different story for option sets and lookups. Option sets are easier to load if we have their values rather than their labels, although values make the data much less readable and harder to work with.

Another goal is to keep the data as human readable as possible. Readable data is easier to work with, and less error prone. It also allows less technical users to work with the data, e.g. in Excel.

We also want the library to be able to update records that are already in CRM, so we need some way to specify which records to update.

My favorite solution to these problems is to use sophisticated headers. Each header should specify both what type of data the column holds and supply any additional information necessary to load it. The headers should also specify which attributes uniquely identify a record; this could be an account number or an email address. We call these fields primary.

Initialization

There are many types of data in CRM. In this prototype, we only address a few very common types: strings, booleans, local option sets, global option sets, and lookups.

For simplicity, I have decided that you can only set lookups based on one attribute of the target entity. We also represent both local and global option sets simply as a mapping from labels to values. We also allow columns to be ignored.

type mapping =
  | Simple
  | Ignore
  | OptionSet of (string * int) list
  | Lookup of string * string
type field =
  { name : string;
    typ : mapping;
    primary : bool; }

In order to initialize the mapping we need to define some syntax:

  • If a field starts with # the column will be ignored
  • Otherwise the header should be the logical name of the attribute in CRM
  • If the name is followed by a * this column is primary
  • If the name (and possibly *) is followed by only a :, then the field is the label of a local option set. (Example (Account): paymenttermscode:)
  • If the name (and possibly *) is followed by a : and the logical name of a global option set, then the field is the label of that global option set. (Example (Opportunity): budgetstatus:budgetstatus)
  • If the name (and possibly *) is followed by a : and the logical name of an entity, then a . and the logical name of an attribute of that entity, then the field should look up a record of that entity based on that attribute. (Example (Account): primarycontactid:contact.emailaddress1)

let init_mapping ent (s : string) =
  s.Split [|';'|]
  |> Array.map (fun f ->
    let f = f.Trim () in
    let f = f.Split(':') in
    let primary = f.[0].EndsWith("*") in
    let name = 
      if primary
      then f.[0].Substring(0, f.[0].Length - 1)
      else f.[0] in
    if name.StartsWith("#")
    then { name = name; typ = Ignore; primary = primary }
    else if f.Length = 1
    then { name = name; typ = Simple; primary = primary }
    else 
      let f = f.[1].Split('.') in
      if f.Length = 1
      then 
        let options =
          if f.[0].Length = 0
          then 
            let req = RetrieveAttributeRequest () in
            req.EntityLogicalName <- ent;
            req.LogicalName <- name;
            let resp = xrm.OrganizationService.Execute req :?> RetrieveAttributeResponse in
            let pList = resp.AttributeMetadata :?> PicklistAttributeMetadata in
            pList.OptionSet.Options
            |> Seq.map (fun o -> (o.Label.UserLocalizedLabel.Label, o.Value.Value))
            |> Seq.toList 
          else
            let req = RetrieveOptionSetRequest () in
            req.Name <- f.[0];
            let resp = xrm.OrganizationService.Execute req :?> RetrieveOptionSetResponse in
            let oSetMeta = resp.OptionSetMetadata :?> OptionSetMetadata in
            oSetMeta.Options
            |> Seq.map (fun o -> (o.Label.UserLocalizedLabel.Label, o.Value.Value))
            |> Seq.toList in
        { name = name; typ = OptionSet options; primary = primary }
      else
        let entity = f.[0] in
        let field = f.[1] in
        { name = name; typ = Lookup (entity, field); primary = primary }
    )

Setting Data

Now setting a value based on the mapping and a string is straightforward.

Notice: lookup is from last week’s post.

let set_value (e : Entity) m (v : string) =
  if v.Length = 0
  then ()
  else
    match m.typ with
      | Ignore -> ()
      | Simple -> 
        if v = "true" then
          e.Attributes.Add(m.name, true)
        else if v = "false" then
          e.Attributes.Add(m.name, false)
        else
          e.Attributes.Add(m.name, v)
      | OptionSet map ->
        (match assoc v map with
          | None -> printfn "Warning: %s not found in option set" v
          | Some i -> e.Attributes.Add(m.name, OptionSetValue i))
      | Lookup (ent, f) ->
        e.Attributes.Add(m.name, EntityReference (ent, lookup (Crm.Entities.unsafe ent) f v))
  

Getting Data

As mentioned above, we sometimes want to update existing records instead of creating new ones. We decide as follows: if a record exists that matches the one we would create on all the primary fields, we update that record instead. To test this, we again use our FetchXml library.

Notice: we require that the record be unique. If we did not do this, we could not guarantee that we consistently update the same record.

let get_entity ent ms vals =
  let fxml = 
    (ms, vals)
    ||> Array.fold2 (fun st m v -> 
      if not m.primary
      then st
      else 
        match m.typ with
          | Ignore -> st
          | Simple -> FetchXml.add_condition m.name (FetchXml.Equals v) st
          | OptionSet os -> FetchXml.add_condition m.name (FetchXml.Equals (string (assoc v os).Value)) st
          | Lookup (ent, f) -> FetchXml.add_condition m.name (FetchXml.Equals ((lookup (Crm.Entities.unsafe ent) f v).ToString("B"))) st
      ) (FetchXml.init (Crm.Entities.unsafe ent) |> FetchXml.set_count 2)
    |> FetchXml.generate in
  let ec = xrm.OrganizationService.RetrieveMultiple (FetchExpression fxml) in
  let e = Entity () in
  e.LogicalName <- ent;
  if ec.Entities.Count = 0 then
    e
  else if ec.Entities.Count = 1 then
    e.Id <- (ec.Entities.Item 0).Id;
    e
  else
    printfn "Fxml: %s" fxml;
    failwithf "Lookup failed: multiple %s records found." ent

Importing/Repairing Data

With these helpers, the main function simply loads the file line by line and calls the helpers. Currently, the library assumes the default CSV export format from Excel, which means unquoted strings separated by semicolons.

let action act (e : Entity) =
  if e.Attributes.Contains("name")
  then printfn "%s: %s" act (string (e.Attributes.["name"]))
  else printfn "%s entity" act
let import_data entity =
  let lines = File.ReadAllLines (cfg.rootFolder + "/" + entity + ".csv") in
  let mapping = init_mapping entity lines.[0] in
  lines
  |> Array.skip 1
  |> Array.iter (fun valS ->
    try
      let vals = valS.Split [|';'|] |> Array.map (fun v -> v.Trim ()) in
      if vals.Length <> mapping.Length
      then failwithf "Not enough values"
      else
        let e = get_entity entity mapping vals in
        (mapping, vals)
        ||> Array.iter2 (set_value e);
      if e.Id = System.Guid.Empty
      then (action "Created" e; xrm.OrganizationService.Create e |> ignore)
      else (action "Updated" e; xrm.OrganizationService.Update e)
    with
      e -> printfn "Error: %s; %s" (e.Message) valS
    )

Quality Control

As I am devoted to high quality software, we should take a step back and examine how solid this library is.

First, this library is only as good as the data it gets. We can feed it destructive data: data that overwrites good data, contains invalid references, or is otherwise wrong. In practice, though, bad data is most commonly rejected by CRM, in which case the tool skips the line and prints the error along with the faulty data, similar to the built-in import tool in CRM.

Another potential problem is data that contains semicolons, in which case the input line is split incorrectly. There is no way to protect against this with the format we chose; if it happens, the line will be skipped and printed.

Because of the primary fields, the tool never creates duplicates. This also means that we can run it multiple times without any additional risk. Further, because every error is printed along with the triggering data, we get an easy overview of what has been skipped, and usually why. We can then find the flaw, correct it, and run the import tool again.

Monday, February 1, 2016

Using the FetchXML Library

Previously on Dr. Lambda's blog:

The most common way for users to query a Microsoft Dynamics CRM system is the Advanced Search functionality. Advanced Search provides an interface where we can choose which attributes to display and set up conditions on which records to retrieve. Behind the scenes, this is represented as XML, more specifically FetchXML, which is how we are going to query the system. XML is not the nicest interface for humans, so we abstract away the syntax and start introducing types.

module FetchXml =
  type condition
  type 'a fetchxml
  let init : 'a Crm.Entities.entity -> 'a fetchxml
  let set_count : int -> 'a fetchxml -> 'a fetchxml
  let add_condition : 'a -> condition -> 'a fetchxml -> 'a fetchxml
  let add_attribute : 'a -> 'a fetchxml -> 'a fetchxml
  let generate : 'a fetchxml -> string

Now, the continuation...

Off the Shelf Usage

This library is itself mostly a utility for other tools we want to implement. However, let's look at some examples.

Example 1

In our first example, we want to retrieve a specific user's GUID. This is not possible through the online interface.

FetchXml.init (Crm.Entities.systemuser)
|> FetchXml.set_count 1
|> FetchXml.add_condition Crm.Attributes.systemuser.Fullname (FetchXml.Equals "CRM System")
|> FetchXml.generate
|> fun f -> xrm.OrganizationService.RetrieveMultiple (FetchExpression f)
|> fun ec -> (ec.Entities.Item 0).Id

Returns:

1594eb93-de4c-4f11-bcf8-c27446d96378

Example 2

Let's do a slightly more complicated example: a generalized version of the built-in "My Open Opportunities". It is generalized because we can supply it with any user's GUID, and it retrieves all records, not limited by the usual maximum of 250.

FetchXml.init (Crm.Entities.opportunity)
|> FetchXml.add_condition Crm.Attributes.opportunity.Ownerid (FetchXml.Equals "{1594eb93-de4c-4f11-bcf8-c27446d96378}")
|> FetchXml.add_condition Crm.Attributes.opportunity.Statecode (FetchXml.Equals "0")
|> FetchXml.generate
|> fun f -> xrm.OrganizationService.RetrieveMultiple (FetchExpression f)
|> fun ec -> ec.Entities

Going Beyond

In the examples above, we notice that the two final steps are quite similar. Further, the steps exhibit the |> fun x -> pattern, which I don't find very attractive. Thus, we can abstract these steps into a function.

let fetch fxml = 
  xrm.OrganizationService.RetrieveMultiple (FetchExpression fxml)
  |> fun ec -> ec.Entities

Another useful addition to the library is a lookup function, which finds a unique record based on one field's value.

let lookup ent f v =
  let es =
    FetchXml.init ent
    |> FetchXml.set_count 2
    |> FetchXml.add_condition f (FetchXml.Equals v)
    |> FetchXml.generate
    |> fetch in
  if es.Count = 0 then
    failwithf "Lookup failed: no %A records found with %A = %s" ent f v
  else if es.Count > 1 then
    failwithf "Lookup failed: multiple %A records found with %A = %s" ent f v
  else
    (es.Item 0).Id
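As an illustration, resolving a contact's GUID from an email address (the address itself is made up) looks like this, using the unsafe entity constructor from the FetchXml library:

```fsharp
// Find the unique contact whose emailaddress1 is the given address.
// Fails with an error if zero, or more than one, matching record exists.
lookup (Crm.Entities.unsafe "contact") "emailaddress1" "untitled@gmail.com"
// returns that contact's Guid
```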

Example: Audit History

I was recently in a situation where I needed to retrieve a lot of audit data. Currently, there is no convenient way to do this in CRM, so it is a perfect opportunity to demonstrate the usefulness of our new library. We are going to find the owner – before the last update – of all open opportunities currently owned by a specific user.

FetchXml.init (Crm.Entities.opportunity)
|> FetchXml.add_condition Crm.Attributes.opportunity.Ownerid (FetchXml.Equals "{1594eb93-de4c-4f11-bcf8-c27446d96378}")
|> FetchXml.add_condition Crm.Attributes.opportunity.Statecode (FetchXml.Equals "0")
|> FetchXml.generate
|> fetch
|> Seq.map (fun o ->
  let req = RetrieveRecordChangeHistoryRequest () in
  req.Target <- EntityReference("opportunity", o.Id);
  let resp = xrm.OrganizationService.Execute(req) :?> RetrieveRecordChangeHistoryResponse in
  let oldVal = (resp.AuditDetailCollection.Item 0 :?> AttributeAuditDetail).OldValue in
  if oldVal <> null && oldVal.Attributes.ContainsKey("ownerid")
  then
    string (o.Attributes.["name"]) + "; " +
    (oldVal.Attributes.["ownerid"] :?> EntityReference).Name
  else
    "")

Saturday, January 16, 2016

Playing Fetch with XML

Motivation

Importing data into Microsoft Dynamics CRM is pretty convenient; however, the data we have to import is usually not perfect, so we often end up importing and deleting the same data many times before we are satisfied. Both of these processes are quite slow and require a bit of manual work. This method also means that we can't start using the data until it is imported correctly, because relations to it will be broken. Even in situations where the data is perfect, like when we move data from a development or testing system to a production system, we first have to export it, manually change the format, and manually import it again.

Our next project is to make data import easier. We cannot cover everything within this area, so this is a running project, which we will develop and expand over time.

In this post, we make the foundation for a neat and minimal library for querying the CRM system. The most common way for users to query the system is the Advanced Search functionality. Advanced Search provides an interface where we can choose which attributes to display and set up conditions on which records to retrieve. Behind the scenes, this is represented as XML, more specifically FetchXML, which is how we are going to query the system. XML is not the nicest interface for humans, so we are going to abstract away the syntax and start introducing types.

Remember: this is a technical post; a usage/tutorial post will follow next week.

Prerequisites

In order to build an F# library to abstract away XML we need to be familiar with the following concepts:

Utilities

For this library, we don't need a lot of new utilities. Actually, we only need one new function:

module StringUtils =
  let capitalize_first (str : string) = 
    string (System.Char.ToUpper str.[0]) + str.Substring(1)

Unsafe FetchXml

In the first version of the library we focus only on the basic functionality. Afterwards we consider how to improve usability by introducing types to help catch bugs. The library consists of five functions for: initialization, setting how many records to retrieve, setting up conditions on the lookup, choosing which attributes to retrieve, and finally generating the finished FetchXML.

As mentioned earlier on this blog: I like chain-calling. So like last time we have a value – the 'needle' – that is 'threaded' through all the calls. This needle is the last argument to every function, and every function returns it, or a variation of it.

Initialization

For now, we are only interested in basic functionality, so we only need to store:

  • which entity we want to retrieve from
  • how many records to retrieve
  • which attributes to retrieve
  • and the conditions on the records.

The final value is a string representing XML, so many of these can be represented as strings. However, for the conditions there are some advantages to using a custom datatype. First, it is difficult for users to remember which conditions are possible and how to write them – their format and keywords. If the conditions are represented by a custom datatype, these problems are solved by code completion and custom code, respectively. Second, if we ever want to extend the library, string conditions might have to be parsed, which we would rather avoid.

Note: I have only chosen a few central types of condition, but it should be easily extended with more types.

module FetchXml =
  type condition =
    | Equals of string
    | In of string list
    | ContainsData
  type fetchxml =
    { entity : string;
      count : int option;
      attributes : string list;
      conditions : (string * condition) list }
  let init ent =
    { entity = ent;
      count = None;
      attributes = [];
      conditions = []; }

By default, we retrieve all attributes and all records.

Setting how many Records to Retrieve

Limiting how many records to retrieve is as easy as updating count:

  let set_count count fxml =
    { fxml with count = Some count }

Choosing which Attributes to Retrieve

Similarly, if we want to limit which attributes to retrieve we simply add them, one at a time:

  let add_attribute attr fxml =
    { fxml with attributes = attr :: fxml.attributes }

Setting up Conditions on the Lookup

Finally, to limit which records to retrieve, we simply add attribute-condition pairs, one at a time:

  let add_condition attr cond fxml =
    { fxml with conditions = (attr, cond) :: fxml.conditions }

Generating the Finished FetchXML

Having a fetchxml-record, it is straightforward to generate the XML. One thing to note is that we cannot use a raw ampersand in values in FetchXML, so we must escape it.

  let generate fxml =
    let e = fxml.entity in
    "<fetch mapping=\"logical\"" + 
    (match fxml.count with
      | None -> ""
      | Some c -> " count=\"" + string c + "\"") +
    " version=\"1.0\">" + 
    "<entity name=\"" + e + "\">" +
    List.foldBack (fun a acc -> "<attribute name=\"" + a + "\" />" + acc) fxml.attributes "" +
    "<filter type=\"and\">" +
    List.foldBack (fun (a, c) acc -> 
      match c with
        | Equals s ->
          "<condition attribute=\"" + a + "\" operator=\"eq\" value=\"" + s.Replace("&", "&amp;") + "\" />" + acc
        | In vs ->
          "<condition attribute=\"" + a + "\" operator=\"in\">" +
          List.foldBack (fun v acc -> "<value>" + v + "</value>" + acc) vs "" +
          "</condition>" + acc
        | ContainsData ->
          "<condition attribute=\"" + a + "\" operator=\"not-null\" />" + acc) fxml.conditions "" +
    "</filter>" +
    "</entity>" +
    "</fetch>"
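As a quick sanity check, tracing generate on a small query shows exactly the string it produces (line breaks added here for readability; the real output is a single line):

```fsharp
FetchXml.init "systemuser"
|> FetchXml.set_count 1
|> FetchXml.add_condition "fullname" (FetchXml.Equals "CRM System")
|> FetchXml.generate
// <fetch mapping="logical" count="1" version="1.0">
//   <entity name="systemuser">
//     <filter type="and">
//       <condition attribute="fullname" operator="eq" value="CRM System" />
//     </filter>
//   </entity>
// </fetch>
```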

This concludes the core library. This is fine if we want to use it from other code, but people are not great with strings: people make spelling mistakes, forget which entities have which attributes, or, even more subtly, forget to correct code when something changes.

Invent datatypes

One way to mitigate these problems is to use types. We need a way to connect entities and attributes with types, but where all entities still have 'similar' types. For this, we use polymorphism. Here is a toy example of how this would look:

module Attribute =
  module systemuser =
    type attribute = Name | Fullname
    let string_of_attribute = function
      | Name -> "name"
      | Fullname -> "fullname"
module Entity =
  type 'a entity = private { logical_name : string; string_of : 'a -> string }
  let string_of_entity e = e.logical_name
  let string_of_attribute e a = e.string_of a
  let systemuser = 
    { logical_name = "systemuser"; 
      string_of = Attribute.systemuser.string_of_attribute }

Notice that because the entity record is private we cannot accidentally make an invalid entity.
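A quick use of the toy modules shows the payoff. The first line works; the second (which refers to a hypothetical account attribute module) would be rejected at compile time, because the attribute type does not match the entity's type parameter:

```fsharp
Entity.string_of_attribute Entity.systemuser Attribute.systemuser.Fullname
// "fullname"

// Entity.string_of_attribute Entity.systemuser Attribute.account.Name
// does not compile: an account attribute is not a systemuser attribute
```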

Generate datatypes

Writing the datatypes for all attributes and entities by hand is unmanageable. Even if we could, the result would be static, so we would need to update it every time something changed in the CRM system. Instead, we generate it:

let meta = 
  let er = RetrieveAllEntitiesRequest () in
  er.EntityFilters <- EntityFilters.Attributes;
  xrm.OrganizationService.Execute(er) :?> RetrieveAllEntitiesResponse in
meta.EntityMetadata
|> Array.fold (fun (attrs, ents) i -> 
  let attributes = i.Attributes |> Array.fold (fun acc a -> acc + " | " + StringUtils.capitalize_first (a.LogicalName)) "" in
  let string_of = i.Attributes |> Array.fold (fun acc a -> acc + "      | " + StringUtils.capitalize_first (a.LogicalName) + " -> \"" + a.LogicalName + "\"\n") "" in
  (attrs + 
   "  module " + i.LogicalName + " = \n" +
   "    type attribute = " + attributes + "\n" +
   "    let string_of_attribute = function \n" + string_of,
   ents +
   "  let " + i.LogicalName + " = \n" +
   "    { logical_name = \"" + i.LogicalName + "\"; \n" +
   "      string_of = Attributes." + i.LogicalName + ".string_of_attribute } \n")
   ) ("module Attributes = \n", 
      "module Entities = \n" +
      "  type 'a entity = private { logical_name : string; string_of : 'a -> string }\n" +
      "  let string_of_entity e = e.logical_name\n" +
      "  let string_of_attribute e a = e.string_of a\n")
|> fun (attrs, ents) -> 
  File.WriteAllText (cfg.rootFolder + "/Crm.fsx", attrs + ents);

Safe FetchXml

With all these datatypes, we can modify the fetchxml-record to use these instead of strings:

type 'a fetchxml =
  { entity : 'a Crm.Entities.entity;
    count : int option;
    attributes : 'a list;
    conditions : ('a * condition) list }

Of course we also need to change generate accordingly, but that is straightforward.

As mentioned earlier, strings actually work better with other code, so if we still want this option we need to add this "entity-generator" to the Entities-module.

  let unsafe lname = { logical_name = lname; string_of = id }

Quality Control

As this library does not modify any data, we don't need to be as critical of it.

A Note on Performance

My focus is usually more on correctness and aesthetics; for once, we will consider performance. The problem is that we are generating an 80 kB file. This file needs to be read, parsed, and type checked, which turns out to be very slow.

One solution to this problem is to split the entities into separate files, but then we have to forgo the advantage of the private record, namely that we cannot construct an invalid entity. This is aesthetically less pleasing, but in practice it may be the right choice.