Tech Off Thread

5 posts

Split and Join

  • Maybe I've always just missed the obvious, but is it just me who finds it extremely annoying that string.Split() and string.Join() expect different datatypes for the separator character? string.Join() expects a string value as the separator, and string.Split() expects either a char (via params), an array of chars or an array of strings. I'd have guessed that these two methods often operate on each other's output, and so defining a const separator value would be useful, but apparently not. The simplest way I can think of is defining the separator character as a const char, and then calling .ToString() on that char for the Join() method, but that's still a pain.

    What am I missing?
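
    A minimal sketch of the const char workaround described above. The class and member names (SeparatorDemo, Separator, SplitFields, JoinFields) are hypothetical and only there for illustration; the comma separator is an assumption too.

    ```csharp
    using System;

    public static class SeparatorDemo
    {
        // Single shared separator (hypothetical name, assumed to be a comma).
        public const char Separator = ',';

        public static string[] SplitFields(string line)
        {
            // Split takes the char directly through its params char[] overload.
            return line.Split(Separator);
        }

        public static string JoinFields(string[] fields)
        {
            // Join wants a string separator, so the char is converted first.
            return string.Join(Separator.ToString(), fields);
        }
    }
    ```

    With this, SeparatorDemo.JoinFields(SeparatorDemo.SplitFields("a,b,c")) round-trips back to "a,b,c", at the cost of the extra ToString() call.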

  • For Split(), I often resort to:

    ",".ToCharArray()   // just comma delimiter

    " ,.;".ToCharArray()   // space, comma, period, semi-colon delimiters

    but this helps you naught...

  • @Bas: I'd imagine it's because Join is concatenating strings and Split is going through character by character looking for the separator. Still, it probably would have made sense to put overloads in there.

  • If I remember correctly (this decision was a loooong time ago), the reason for this was that either .NET 1.0 or a pre-release of .NET had

    string[] String::Split(char ch);
    string[] String::Split(params char[] chs);
    string[] String::Split(IEnumerable<char> chs);

    The third overload was removed from .NET because string is IEnumerable<char> and this led to confusion:

    foreach(var x in "Hello World".Split("el")) Console.WriteLine(x);

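    would yield "H", "", "", "o Wor", "d" (splitting on 'e' and 'l' as separate characters) rather than "H", "lo World" (splitting on "el" as a whole), as most people would expect.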

    Therefore the decision was that the overload string[] String::Split(IEnumerable<char> chs) should be removed.

    Unfortunately you can't then add String::Split(string chs), since this means that you've just changed what "Foo".Split("Bar") means (it used to mean split by 'B', 'a' and 'r', since string is a collection of chars and matches IEnumerable<char> and now it means split by "Bar" since you have an overload of string).
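
    To make the two readings concrete, here is a rough sketch using only Split overloads that do exist: params char[] and (string[], StringSplitOptions). The class and method names (SplitMeanings, SplitOnAnyChar, SplitOnWholeString) are made up for this example and are not part of .NET.

    ```csharp
    using System;

    public static class SplitMeanings
    {
        // Reads 'separators' as a set of chars, each one a delimiter on its own.
        // This mimics how an IEnumerable<char> overload would have treated a string.
        public static string[] SplitOnAnyChar(string input, string separators)
        {
            return input.Split(separators.ToCharArray());
        }

        // Reads 'separator' as one whole delimiter.
        public static string[] SplitOnWholeString(string input, string separator)
        {
            return input.Split(new string[] { separator }, StringSplitOptions.None);
        }
    }
    ```

    SplitOnAnyChar("Hello World", "el") gives five pieces ("H", "", "", "o Wor", "d"), while SplitOnWholeString("Hello World", "el") gives two ("H", "lo World"). Same call shape, different result, which is the breaking change described above.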

    Long story short, a lot of this nastiness is there for frankly pretty old reasons. Adding overloads in future probably isn't a bad idea, but since most people have been coping (x.Split(new string[] { y }, StringSplitOptions.None) does what most people expect), I think this has been pretty low down the list of priorities for .NET. You need a good reason to change the base library once it's used by millions of customers, and I'm not sure this is a good enough reason to change it.

  • That makes sense. I knew there must've been a reason for it, I just couldn't figure it out. I hadn't thought about the string/character array thing.

    I wasn't really expecting a change, it just struck me as some annoying holdover and I wanted to know why it was there. Now I know.

    What strikes me though: why couldn't they have simply added an overload to Join that takes a char? At least that way I can simply use the same values for both methods. It feels.. cleaner. Ah well.

    Thanks for enlightening me!
