Is there a simpler way to re-format this date as a short date in c#?

Given this code:

// Decode the text string
string test = "Version 21.1.0 - 2021 Edition (22nd March 2021)";
string[] textitems = test.Split(' ');

// The text should split down like this:

// [0] Version
// [1] 21.1.0
// [2] -
// [3] 2021
// [4] Edition
// [5] (22nd
// [6] March
// [7] 2021)

I have created an enum to use:

enum UpdateInfo
{
    Version = 1,
    Edition = 3,
    Day = 5,
    Month = 6,
    Year = 7
}

The information I am interested in is:

  • Version Number: 21.1.0
  • Edition: 2021
  • Date: (22nd March 2021)

Version and Edition are straightforward:

writer.WriteAttributeString("Version", textitems[(int)UpdateInfo.Version]);
writer.WriteAttributeString("Edition", textitems[(int)UpdateInfo.Edition]);

But the date is not. I found out that I can't parse a string such as:

(22nd March 2021)

I want the short date so I have come up with the following code after doing research:

// Rebuild date as short date

// Day - strip off "(" and "st", "nd", "rd" or "th"
string day = string.Empty;
for (int i = 0; i < textitems[(int)UpdateInfo.Day].Length; i++)
{
    if (Char.IsDigit(textitems[(int)UpdateInfo.Day][i]))
        day += textitems[(int)UpdateInfo.Day][i];
}

// Rebuilt long date
string datetest = day + " " + textitems[(int)UpdateInfo.Month] + " " + textitems[(int)UpdateInfo.Year];

// Remove trailing ")"
datetest = datetest.Trim(')');

// Now we can parse the long date string
DateTime date = DateTime.ParseExact(datetest, "d MMMM yyyy", CultureInfo.InstalledUICulture, DateTimeStyles.None);

if (date != null)
    writer.WriteAttributeString("Date", date.ToShortDateString());

Is there a simpler way to achieve the same result without bloating the code?


Note:

  • The dates will always be in English.
  • The source data is related to this question. Eg:
<p class="rvps2">
    <img alt="New Version Icon" 
         style="vertical-align: middle; padding : 1px; margin : 0px 5px;"
         src="lib/IMG_NewVersion.png">
    <span class="rvts16">Version 21.1.0 - 2021 Edition</span>
    <span class="rvts15"> (22nd March 2021)</span>
</p>

So I actually have an HtmlNode (the p element).

9 months ago · Santiago Trujillo
3 answers

I would not split by spaces, because there are too many of them. I would split by "-" and then use a regex to extract the date part. After that it's easy with TryParseExact and a format such as d'nd' MMMM yyyy:

string[] textitems = test.Split('-');
string version = textitems[0].Trim();
string edition = textitems[1].Substring(0, textitems[1].IndexOf("(")).Trim();
string dateStr = Regex.Match(textitems[1], @"\(([^)]*)\)").Groups[1].Value;

string[] formats = { "d'st' MMMM yyyy", "d'nd' MMMM yyyy", "d'rd' MMMM yyyy", "d'th' MMMM yyyy" };
bool validDate = DateTime.TryParseExact(dateStr, formats, CultureInfo.InvariantCulture, DateTimeStyles.None, out DateTime date );

I have also added d'st' MMMM yyyy, since I can imagine that this would be your next issue; the rd and th suffixes need their own formats in the same way. Another option is to include the brackets in the format: "'('d'nd' MMMM yyyy')'".

You might want to add some code to validate the input first; I have omitted that.
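The validation mentioned above could look roughly like the following. This is only a sketch under assumptions: the class name `UpdateTextParser` and the method `TryParse` are made up for illustration, and it assumes the text always has the shape "Version x - yyyy Edition (date)". It returns false instead of throwing when any part is missing.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class UpdateTextParser
{
    // Ordinal suffixes are quoted as literals; "d" accepts one- or two-digit days.
    private static readonly string[] Formats =
    {
        "d'st' MMMM yyyy", "d'nd' MMMM yyyy", "d'rd' MMMM yyyy", "d'th' MMMM yyyy"
    };

    public static bool TryParse(string text, out string version, out string edition, out DateTime date)
    {
        version = string.Empty;
        edition = string.Empty;
        date = default(DateTime);

        if (string.IsNullOrWhiteSpace(text)) return false;

        // Split on the first " - " only, so we get at most two parts.
        string[] parts = text.Split(new[] { " - " }, 2, StringSplitOptions.None);
        if (parts.Length != 2) return false;

        // The date is whatever sits between the first pair of brackets.
        Match m = Regex.Match(parts[1], @"\(([^)]*)\)");
        if (!m.Success) return false;

        DateTime parsed;
        if (!DateTime.TryParseExact(m.Groups[1].Value.Trim(), Formats,
                CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed))
        {
            return false;
        }

        // Only assign the outputs once every check has passed.
        version = parts[0].Trim();
        edition = parts[1].Substring(0, m.Index).Trim();
        date = parsed;
        return true;
    }
}
```

A caller would then write the attributes only when `UpdateTextParser.TryParse(test, out var version, out var edition, out var date)` returns true. As in the answer, `version` and `edition` still include the words "Version" and "Edition".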

9 months ago · Santiago Trujillo

For this I wouldn't even bother with splitting the text. You can do it with a regular expression and named groups.

string test = "Version 21.1.0 - 2021 Edition (22nd August 2021)";
var regex = new Regex(@"Version (?'version'[\d.]+) - (?'edition'\d+) Edition \((?'date'[^)]+)", RegexOptions.None);
var matches = regex.Matches(test);

var version = matches[0].Groups["version"].Value;
var edition = matches[0].Groups["edition"].Value;
var dateString = matches[0].Groups["date"].Value;

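// Remove the ordinal suffix (st, nd, rd, th) before parsing
dateString = Regex.Replace(dateString, @"^(\d+)(st|nd|rd|th)", "$1");
// "d" accepts one- or two-digit days ("dd" would reject "1 August 2021");
// InvariantCulture gives English month names regardless of machine settings
var date = DateTime.ParseExact(dateString, "d MMMM yyyy", CultureInfo.InvariantCulture);

// Dump() is a LINQPad extension method; use Console.WriteLine elsewhere
date.ToShortDateString().Dump();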

Normally I would use TryParseExact and handle a failed parse properly, rather than letting ParseExact throw.
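A minimal sketch of that TryParseExact variant is shown here. The helper name `OrdinalDate.TryParse` is hypothetical, and it assumes the input is already the bare date text (for example "22nd August 2021") with English month names.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class OrdinalDate
{
    public static bool TryParse(string text, out DateTime date)
    {
        date = default(DateTime);
        if (text == null) return false;

        // Drop st/nd/rd/th directly after the leading day number.
        string cleaned = Regex.Replace(text.Trim(), @"^(\d+)(st|nd|rd|th)\b", "$1");

        // "d" accepts one- or two-digit days; no exception is thrown on failure.
        return DateTime.TryParseExact(cleaned, "d MMMM yyyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out date);
    }
}
```

With this, the caller can branch on the boolean result and skip writing the Date attribute when the text is not a recognizable date.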

You can get an explanation of the main regular expression here: https://regex101.com/r/Nzpa5h/1

9 months ago · Santiago Trujillo

I have come up with a solution that combines both approaches. Since the original data is actually an HtmlNode (as indicated at the bottom of the question) and is already split into two span elements, I decided to do it this way:

// The paragraph element should only have two "span" elements
var listSpan = itemParagraph.Descendants("span");
if(listSpan != null)
{
    if(listSpan.Count() == 2)
    {
        // The first "span" element should contain: Version 21.1.0 - 2021 Edition
        var regex = new Regex(@"Version (?'version'[\d.]+) - (?'edition'\d+) Edition", RegexOptions.None);
        var matches = regex.Matches(listSpan.ElementAt(0).InnerText.Trim());

        writer.WriteStartElement("Update");
        writer.WriteAttributeString("Version", matches[0].Groups["version"].Value);
        writer.WriteAttributeString("Edition", matches[0].Groups["edition"].Value);

        // The second "span" element should contain: eg. (22nd March 2021)
        string dateString = listSpan.ElementAt(1).InnerText.Trim(' ', '(', ')');

        string[] formats =
        {
            "d'st' MMMM yyyy",
            "d'nd' MMMM yyyy",
            "d'rd' MMMM yyyy",
            "d'th' MMMM yyyy"
        };

        // Month names are always English, so parse with InvariantCulture
        if (DateTime.TryParseExact(dateString,
            formats, CultureInfo.InvariantCulture, DateTimeStyles.None, out DateTime dateRevision))
        {
            writer.WriteAttributeString("Date", dateRevision.ToShortDateString());
        }
    }
}

I admit that I do not quite follow how this bit of code actually works:

var regex = new Regex(@"Version (?'version'[\d.]+) - (?'edition'\d+) Edition", RegexOptions.None);
var matches = regex.Matches(listSpan.ElementAt(0).InnerText.Trim());

The above code is modified from one of the supplied answers. But it works. :)

I decided to construct the date using the accepted answer approach as I understand what it is doing, as opposed to the regex suggestion.

@phuzi maybe you could add some explanations or pointers to flesh out your answer concerning the regex syntax?
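For what it's worth, the pattern can be read piece by piece. The sketch below spells out the same regex over several lines using RegexOptions.IgnorePatternWhitespace, which ignores unescaped whitespace in the pattern and treats "#" as the start of a comment (so literal spaces are written as "\ "). The class and method names are made up for illustration. In .NET, (?'name'...) is the same as (?<name>...): it captures whatever the inner expression matches and makes it available as Groups["name"].

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper, only here to show the pattern with comments.
public static class VersionPattern
{
    private static readonly Regex Pattern = new Regex(@"
        Version\                 # the literal text 'Version' followed by a space
        (?'version'[\d.]+)       # group 'version': one or more digits or dots, e.g. 21.1.0
        \ -\                     # a space, a hyphen, a space
        (?'edition'\d+)          # group 'edition': one or more digits, e.g. 2021
        \ Edition                # a space and the literal text 'Edition'
        ", RegexOptions.IgnorePatternWhitespace);

    public static bool TryRead(string text, out string version, out string edition)
    {
        // Match returns the first match; Success is false if the text does not fit.
        Match m = Pattern.Match(text ?? string.Empty);
        version = m.Success ? m.Groups["version"].Value : string.Empty;
        edition = m.Success ? m.Groups["edition"].Value : string.Empty;
        return m.Success;
    }
}
```

Checking `Success` (or the match count) before indexing avoids the exception that `matches[0]` would throw when the span text does not have the expected shape.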

9 months ago · Santiago Trujillo