C# Code snippets for generating XML and sitemaps

I explained sitemaps in an earlier blog post, "Sitemaps an intro!". In this blog post we will look at sample code snippets for generating XML sitemaps. Two approaches are shown. Read the comments in the code.

I might consider creating a small library for generating XML sitemaps and open sourcing the library.

A valid XML document typically begins with an XML declaration, i.e. the <?xml … ?> line. After that there must be exactly one root node, and the root node contains all the other child nodes.

An XML sitemap has the XML declaration followed by the <urlset> root node. The <urlset> root node has one or more <url> child nodes, up to a maximum of 50,000. Each <url> node must have exactly one <loc> node, which is mandatory. The <lastmod>, <changefreq> and <priority> nodes are optional, and each of them can appear at most once per <url> node.

Business logic validations would be:

  1. The number of <url> nodes must be between 1 and 50,000.
  2. URLs must be unique.
  3. <loc> is mandatory and the value must be a valid http or https URL.
  4. If present, <lastmod> must be in valid W3C Datetime format; there can be 0 or 1 <lastmod> per <url> node.
  5. <changefreq> can be 0 or 1 per <url> node and must be one of the following:
    • always
    • hourly
    • daily
    • weekly
    • monthly
    • yearly
    • never
  6. <priority> can be 0 or 1 per <url> node and must be a valid number between 0.0 and 1.0 inclusive, such as 0.1 or 0.2.
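
The validations above could be sketched roughly as follows. This is a minimal sketch and not a complete validator: the SitemapRules class and its method names are hypothetical, and as an assumption only two W3C Datetime shapes are accepted for <lastmod> (a plain date, or a date and time with seconds and a time zone).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper class; the names are illustrative and not from any library.
public static class SitemapRules
{
    public const int MaxUrls = 50000;

    private static readonly HashSet<string> ChangeFreqs = new HashSet<string>(StringComparer.Ordinal)
    {
        "always", "hourly", "daily", "weekly", "monthly", "yearly", "never"
    };

    // Assumption: only these two W3C Datetime shapes are accepted in this sketch.
    private static readonly string[] LastModFormats =
    {
        "yyyy-MM-dd",
        "yyyy-MM-dd'T'HH:mm:ssK"
    };

    // Rule 3: <loc> must be an absolute http or https URL.
    public static bool IsValidLoc(string loc)
    {
        return Uri.TryCreate(loc, UriKind.Absolute, out var uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Rule 4: <lastmod> must match one of the accepted date formats.
    public static bool IsValidLastMod(string lastMod)
    {
        return DateTime.TryParseExact(lastMod, LastModFormats, CultureInfo.InvariantCulture,
            DateTimeStyles.RoundtripKind, out _);
    }

    // Rule 5: <changefreq> must be one of the listed values (case-sensitive).
    public static bool IsValidChangeFreq(string changeFreq)
    {
        return changeFreq != null && ChangeFreqs.Contains(changeFreq);
    }

    // Rule 6: <priority> must be between 0.0 and 1.0 inclusive.
    public static bool IsValidPriority(double priority)
    {
        return priority >= 0.0 && priority <= 1.0;
    }

    // Rules 1 to 3 for a whole set of URLs. Returns a list of problems;
    // an empty list means the URLs passed these checks.
    public static List<string> ValidateLocs(IReadOnlyCollection<string> locs)
    {
        var problems = new List<string>();
        if (locs.Count < 1 || locs.Count > MaxUrls)
        {
            problems.Add($"Expected 1 to {MaxUrls} URLs but got {locs.Count}.");
        }

        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (string loc in locs)
        {
            if (!IsValidLoc(loc))
            {
                problems.Add($"Not a valid http or https URL: {loc}");
            }
            else if (!seen.Add(loc))
            {
                problems.Add($"Duplicate URL: {loc}");
            }
        }
        return problems;
    }
}
```

Returning a list of problems instead of throwing on the first failure makes it possible to report every invalid entry in one pass. Whether that suits a given project is a design choice.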

Approach – 1:

The following code snippet does not deal with the business logic. It is purely for creating the XML document.

using System;
using System.Xml;

// Every element must be created in the sitemap namespace,
// otherwise the child nodes are written with an empty xmlns="" attribute.
const string sitemapNs = "http://www.sitemaps.org/schemas/sitemap/0.9";

XmlDocument xmlDoc = new XmlDocument();
var declaration = xmlDoc.CreateXmlDeclaration("1.0", "UTF-8", "yes");
xmlDoc.AppendChild(declaration);

// The <urlset> root node
XmlElement urlSet = xmlDoc.CreateElement("urlset", sitemapNs);

// One <url> node with the mandatory <loc> child
var url = xmlDoc.CreateElement("url", sitemapNs);
var loc = xmlDoc.CreateElement("loc", sitemapNs);
loc.InnerText = "https://www.google.com";
url.AppendChild(loc);

// Optional <lastmod> in W3C Datetime format (UTC)
var lastMod = xmlDoc.CreateElement("lastmod", sitemapNs);
lastMod.InnerText = $"{DateTime.UtcNow:s}Z";
url.AppendChild(lastMod);

urlSet.AppendChild(url);
xmlDoc.AppendChild(urlSet);

// Append more <url> nodes.
// Move the logic of generating url nodes into a separate method, call the method repeatedly, apply business logic etc...

xmlDoc.Save(@"C:\temp\sitemap.xml");
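
The comment about moving the url node logic into a separate method could look something like the sketch below. SitemapXml, CreateUrlElement and Build are hypothetical names chosen for illustration, and the business logic validations are still left out.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

// Hypothetical helper class; the names are illustrative only.
public static class SitemapXml
{
    public const string Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Creates one <url> node with the mandatory <loc> and an optional <lastmod>.
    public static XmlElement CreateUrlElement(XmlDocument doc, string loc, DateTime? lastModUtc = null)
    {
        XmlElement url = doc.CreateElement("url", Ns);

        XmlElement locElement = doc.CreateElement("loc", Ns);
        locElement.InnerText = loc;
        url.AppendChild(locElement);

        if (lastModUtc.HasValue)
        {
            XmlElement lastMod = doc.CreateElement("lastmod", Ns);
            // The caller is expected to pass a UTC value; it is written in W3C Datetime format.
            lastMod.InnerText = lastModUtc.Value.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
            url.AppendChild(lastMod);
        }

        return url;
    }

    // Builds the whole document by calling CreateUrlElement once per URL.
    public static XmlDocument Build(IEnumerable<string> locs)
    {
        var doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", "UTF-8", null));

        XmlElement urlSet = doc.CreateElement("urlset", Ns);
        doc.AppendChild(urlSet);

        foreach (string loc in locs)
        {
            urlSet.AppendChild(CreateUrlElement(doc, loc));
        }

        return doc;
    }
}
```

A call such as SitemapXml.Build(new[] { "https://www.sample.com" }).Save(path) would then write the file, where path is whatever location the sitemap should be saved to.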

Approach – 2:

This approach also does not apply the business logic. It is a sample code snippet that uses XmlSerializer with attribute-annotated classes.

UrlSet.cs

using System.Collections.Generic;
using System.Xml.Serialization;

[XmlRoot(ElementName = "urlset", Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9")]
public class Urlset
{
    [XmlElement(ElementName = "url", Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9")]
    public List<Url> Url { get; set; }

    // No separate xmlns property is needed: the namespace declaration
    // comes from the XmlRoot attribute and the XmlSerializerNamespaces used below.
    public Urlset()
    {
        Url = new List<Url>();
    }
}

Url.cs

using System;
using System.Xml.Serialization;

public class Url
{
    [XmlElement(ElementName = "loc", Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9")]
    public string Loc { get; set; }
    [XmlElement(ElementName = "lastmod", Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9")]
    public string Lastmod { get; set; }
    [XmlElement(ElementName = "changefreq", Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9")]
    public string Changefreq { get; set; }
    [XmlElement(ElementName = "priority", Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9")]
    public double Priority { get; set; }

    public Url()
    {
        // W3C Datetime format in UTC, for example 2024-06-30T10:15:00Z
        Lastmod = $"{DateTime.UtcNow:s}Z";
        Priority = 0.5;
        // The changefreq values are lowercase
        Changefreq = "never";
    }
}

// Generation

using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

XmlSerializerNamespaces ns = new XmlSerializerNamespaces();

// Map the empty prefix to the sitemap namespace. This leaves out the xsi/xsd
// declarations and writes the elements without a generated prefix.
ns.Add("", "http://www.sitemaps.org/schemas/sitemap/0.9");

var urlSet = new Urlset();
urlSet.Url.Add(new Url { Loc = "https://www.sample.com" });
urlSet.Url.Add(new Url { Loc = "https://www.sample.com/page1" });
XmlSerializer serializer = new XmlSerializer(typeof(Urlset));
string utf8;
using (StringWriter writer = new Utf8StringWriter())
{
    serializer.Serialize(writer, urlSet, ns);
    utf8 = writer.ToString();
    Console.WriteLine(utf8);
}

// DeSerialization - useful for modifications and deletions
// Here utf8 is a string with the xml, refer to the overloads of Deserialize for other ways of passing XML
var deserializedUrlSet = (Urlset)serializer.Deserialize(new StringReader(utf8));

// Utf8StringWriter is not a framework type. A minimal definition is given here:
// StringWriter reports UTF-16 by default, this subclass makes the XML declaration say UTF-8.
public class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}
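
As a rough illustration of why deserialization is useful for modifications and deletions, the sketch below loads a sitemap string, removes one entry, updates the <lastmod> of another and serializes the result again. SitemapDoc, SitemapEntry and SitemapEditor are hypothetical names; the first two are compact stand-ins for the Urlset and Url classes, trimmed to <loc> and <lastmod> so that the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml.Serialization;

// Compact stand-in for Urlset (hypothetical name).
[XmlRoot(ElementName = "urlset", Namespace = SitemapEditor.Ns)]
public class SitemapDoc
{
    [XmlElement(ElementName = "url", Namespace = SitemapEditor.Ns)]
    public List<SitemapEntry> Entries { get; set; } = new List<SitemapEntry>();
}

// Compact stand-in for Url (hypothetical name).
public class SitemapEntry
{
    [XmlElement(ElementName = "loc", Namespace = SitemapEditor.Ns)]
    public string Loc { get; set; }

    [XmlElement(ElementName = "lastmod", Namespace = SitemapEditor.Ns)]
    public string Lastmod { get; set; }
}

public static class SitemapEditor
{
    public const string Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Deserializes a sitemap from an XML string.
    public static SitemapDoc Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(SitemapDoc));
        using (var reader = new StringReader(xml))
        {
            return (SitemapDoc)serializer.Deserialize(reader);
        }
    }

    // Deletion: removes every entry with the given <loc>; returns how many were removed.
    public static int Remove(SitemapDoc doc, string loc)
    {
        return doc.Entries.RemoveAll(e => string.Equals(e.Loc, loc, StringComparison.Ordinal));
    }

    // Modification: sets <lastmod> of the matching entry; returns false if the URL is not present.
    public static bool Touch(SitemapDoc doc, string loc, DateTime lastModUtc)
    {
        SitemapEntry entry = doc.Entries.Find(e => string.Equals(e.Loc, loc, StringComparison.Ordinal));
        if (entry == null)
        {
            return false;
        }
        entry.Lastmod = lastModUtc.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
        return true;
    }

    // Serializes the sitemap back to a string.
    // Note: a plain StringWriter makes the XML declaration say utf-16.
    public static string Save(SitemapDoc doc)
    {
        var ns = new XmlSerializerNamespaces();
        ns.Add("", Ns);
        var serializer = new XmlSerializer(typeof(SitemapDoc));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, doc, ns);
            return writer.ToString();
        }
    }
}
```

The lookups here use a linear search over the list, which should be fine for a sitemap of at most 50,000 entries. For frequent edits a dictionary keyed by URL may be a better fit.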

Mr. Kanti Kalyan Arumilli
