Tech Off Thread

17 posts

[C#] Parsing string to char

  • TadejK

    Hey!

    I've got a problem with parsing strings to chars .. to be more specific - with \t and similar.
    I'm using the char.Parse(string) method, and it works .. if I give it something like char.Parse("a");
    But, if I try char.Parse("\t"), it fails because it's more than one char in length.
    So .. any ideas on how to parse "\t" to '\t'?

    Regards,
    Tadej

  • odujosh

    '\t' will fail because it's written as a char literal, not a string, and char.Parse expects a string. "\t" does work. I tested it below, both inside a longer string and called directly as you suggest. Both work as expected. Enjoy.

    using System;

    namespace ConsoleApplication3
    {
        class Program
        {
            static void Main(string[] args)
            {
                string mytest = "my\n\t\r\ntest\"";

                // Round-trip every character of the string through char.Parse.
                foreach (char c in mytest)
                {
                    char fromString = char.Parse(c.ToString());
                    Console.WriteLine("({0})", fromString);
                }

                // A one-character string holding a tab parses fine.
                char rawTest = char.Parse("\t");
                Console.WriteLine("\n\nThis disproves your thesis ({0})", rawTest);
            }
        }
    }

  • Maddus Mattus

    You have to take a look at encoding of your string.

    It seems like you currently have no encoding. 

  • TadejK

    @odujosh:
    This is very weird ..

    char rawTest = char.Parse("\t");
    works ..
    but
    char rawTest = char.Parse(args[0]);
    doesn't work .. where args[0] is "\t" .. that is actually my problem - I need to read \t as an argument ..

    Any ideas as to why it wouldn't work this way?

    @Maddus Mattus: would a different encoding really matter?

  • Sven Groot

    Are the contents of args[0] actually a tab character, or a backslash and a t? If args[0].Length == 2, of course it won't work.
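
    A minimal sketch of that check, assuming the argument reaches the program as the two characters backslash and t. The verbatim string @"\t" stands in for what the command line passes, and CanParseAsChar is a made-up helper name used only for illustration.

```csharp
using System;

public static class ArgLengthDemo
{
    // Returns true only when the text is exactly one character long,
    // which is the condition char.Parse requires.
    public static bool CanParseAsChar(string text)
    {
        char ignored;
        return char.TryParse(text, out ignored);
    }

    public static void Main()
    {
        string literal = "\t";   // the compiler turns the escape into one tab character
        string fromArgs = @"\t"; // stand-in for args[0]: a backslash followed by 't'

        Console.WriteLine(literal.Length);           // 1
        Console.WriteLine(fromArgs.Length);          // 2
        Console.WriteLine(CanParseAsChar(literal));  // True
        Console.WriteLine(CanParseAsChar(fromArgs)); // False

        try
        {
            char.Parse(fromArgs); // throws: the string is two characters long
        }
        catch (FormatException)
        {
            Console.WriteLine("FormatException");
        }
    }
}
```

    So the escape is only translated by the compiler in source code; text typed on the command line arrives untouched.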

  • TadejK

    Of course .. you're right, I just assumed somehow that \t would be converted to a tab .. hmm.
    Well, that solves why this doesn't work .. but I've still got the same problem - how to read a tab from the command line using arguments? Or a new line character, for that matter..

    Any ideas?

    Regards,
    Tadej

  • odujosh

    Umm, you would have to convert the string made up of "\" and "t" to a tab character, or whatever you want to use:

    using System;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                char c = (char)9; // 9 is the character code for tab

                // "\\t" in source is the two characters backslash and t.
                string replaceTest = "My Test(\\t)";
                Console.WriteLine("Before: ({0})", replaceTest);
                Console.WriteLine("After: ({0})", replaceTest.Replace("\\t", c.ToString()));
            }
        }
    }

  • TadejK

    Yeah, that's one way .. but I was looking for something more .. erm .. elegant?
    Well, thanks anyway, I guess it'll have to do.

    Regards,
    Tadej

  • odujosh

    The big technical issue is whitespace and detecting it. I am interested in seeing what you are doing that requires this. Maybe this would help me to understand how to direct you to a more elegant solution. Maybe we are addressing the wrong issue here.

  • TadejK

    What I'm doing is very simple: a command line utility which reads a csv file (using the supplied filename as an argument), splits each line using a delimiter (again, it's supplied as an argument .. it can be \t, ;, or similar), then processes the data, and saves it to a file (using another filename, supplied as an argument). So, in its basic form, it requires 3 arguments to be supplied.
    In reality, it's a bit more complex, but that doesn't really matter. The point is, that the delimiter has to be supplied as an argument, and that works fine, if the delimiter is ; .. but if it's \t .. well, you see my problem

    Of course, there are possible workarounds (like using a config file, instead of supplying everything in the form of arguments) .. but, that's not the point ..

    Regards,
    Tadej

  • odujosh

    Are you reading the values from the CSV into a database? If so, what's the platform?

  • TadejK

    Nope, I'm simply processing the data, then saving it back.
    A more detailed response would be .. It involves lots of sample data, and the application works in such a way that it looks at a column, figures out what the unique values are, then selects the same amount of data from each unique value (aka class).
    The application works perfectly, and it's complete .. the only thing missing is the problem I'm having with parsing \t ..

    Don't get me wrong - I know enough to write a workaround or two .. but I'd rather not do that - partly because I believe that there has to be a more elegant (and correct?) way to do this (although my belief is fading quickly).

    Regards,
    Tadej

  • Matthew van Eerde

    TadejK wrote:
    In reality, it's a bit more complex, but that doesn't really matter. The point is, that the delimiter has to be supplied as an argument, and that works fine, if the delimiter is ; .. but if it's \t .. well, you see my problem


    So what you want to do is take a supplied argument of \t... that is, in C-ish, "\\t"... and turn it into a literal tab... that is, in C-ish, a "\t"?

    EDIT:

    If so, I think your best bet is to clearly define which C escapes you want to support, and implement them individually...

    delimiter = delimiter
       .Replace("\\n", "\n")
       .Replace("\\\\", "\\")
       .Replace("\\t", "\t")
       ...
    ;
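
    One thing to watch with chained Replace calls is ordering: an argument of two backslashes followed by t is first collapsed to backslash-t and then turned into a tab, which is probably not what was typed. A single left-to-right pass avoids that. The following is only a sketch under that assumption; EscapeSketch and Unescape are names invented here, and the set of supported escapes is an arbitrary choice.

```csharp
using System;
using System.Text;

public static class EscapeSketch
{
    // Translates \t, \n, \r and \\ in one left-to-right pass.
    // Anything else after a backslash is copied through unchanged.
    public static string Unescape(string input)
    {
        StringBuilder result = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c != '\\' || i == input.Length - 1)
            {
                result.Append(c); // ordinary character, or a lone trailing backslash
                continue;
            }

            char next = input[++i]; // consume the character after the backslash
            switch (next)
            {
                case 't': result.Append('\t'); break;
                case 'n': result.Append('\n'); break;
                case 'r': result.Append('\r'); break;
                case '\\': result.Append('\\'); break;
                default: result.Append('\\').Append(next); break; // unknown escape: keep as typed
            }
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Unescape(@"\t") == "\t");   // True
        Console.WriteLine(Unescape(@"\\t") == @"\t"); // True: literal backslash, then 't'
        Console.WriteLine(Unescape(";") == ";");      // True
    }
}
```

    Because each backslash is consumed together with the character after it, an escaped backslash can never be reinterpreted as the start of another escape.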

  • odujosh

    Here is a CSV parser I made.

    "Header1",Header2,Header3
    "Row1V1" ,"Row1V2","Row3V3"
    "Row2V2","Row2V2","Row2V2"


    is the contents of a valid CSV. Notice the commas. It is highly suggested that you use the double quotes too: they make detecting cells even easier and protect you against values like '1,000'. If you keep the headers standard, then you can get away with something like this. Of course, I am sure you can come up with improvements; this is my V1. Bet on bugs. Fun little afternoon project.

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.IO;
    using System.Text;

    namespace ConsoleApplication1
    {
        class CSVTableHelper
        {
            private StringReader m_CSVFile;
            private DataTable m_CSVData;

            // Takes the CSV text itself (not a file name) and parses it line by line.
            public CSVTableHelper(string csvFile)
            {
                m_CSVFile = new StringReader(csvFile);
                string line = m_CSVFile.ReadLine();
                while (line != null)
                {
                    ParseLine(line);
                    line = m_CSVFile.ReadLine();
                }
            }

            // Read-only: no set accessor
            public DataTable CSVData
            {
                get { return m_CSVData; }
            }

            private void ParseLine(string line)
            {
                if (m_CSVData == null) // first row holds the headers
                {
                    m_CSVData = new DataTable();
                    GetHeaders(line);
                }
                else
                {
                    GetValues(line);
                }
            }

            private void GetHeaders(string line)
            {
                // Assumes header cells are always formatted "h1","h2","h3"
                string[] headers = line.Replace("\"", string.Empty).Split(',');
                foreach (string header in headers)
                {
                    m_CSVData.Columns.Add(new DataColumn(header));
                }
            }

            private void GetValues(string line)
            {
                // Blank lines are skipped.
                if (!string.IsNullOrEmpty(line))
                {
                    Insert(AdvancedParse(line.Trim()));
                }
            }

            // Splits a line on commas, ignoring commas inside double quotes.
            // The quote characters are dropped from the cell values;
            // doubled quotes ("") inside a cell are not supported.
            private string[] AdvancedParse(string line)
            {
                List<string> cells = new List<string>();
                StringBuilder cell = new StringBuilder();
                bool inQuotes = false;

                foreach (char c in line)
                {
                    if (c == '"')
                    {
                        inQuotes = !inQuotes; // every quote opens or closes a quoted run
                    }
                    else if (c == ',' && !inQuotes)
                    {
                        cells.Add(cell.ToString());
                        cell.Length = 0; // start the next cell
                    }
                    else
                    {
                        cell.Append(c);
                    }
                }

                if (inQuotes)
                {
                    throw new Exception("Invalid Cell Format in CSV: Starting but no ending Quote");
                }

                cells.Add(cell.ToString());
                return cells.ToArray();
            }

            private void Insert(string[] cells)
            {
                if (cells.Length != m_CSVData.Columns.Count)
                {
                    throw new Exception("Line has invalid amount of cells");
                }

                DataRow dr = m_CSVData.NewRow();
                for (int i = 0; i < cells.Length; i++)
                {
                    dr[i] = cells[i].Trim();
                }
                m_CSVData.Rows.Add(dr);
            }
        }
    }

  • odujosh

    And the test project (works well with the helper class):

    using System;
    using System.Data;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                string teststring = "\"H1\",\"H2\",\"H3\"\n\"cell0\",cell1,cell2\n\"cell01\",\"cell11\",cell21\n\"cell02\",cell12,cell22\n\"cell012\",\"cell113\",cell213";

                CSVTableHelper helper = new CSVTableHelper(teststring);
                DataTable dt = helper.CSVData;

                // Print each row with its cells separated by tabs.
                foreach (DataRow dr in dt.Rows)
                {
                    foreach (object o in dr.ItemArray)
                    {
                        Console.Write(o.ToString());
                        Console.Write("\t");
                    }
                    Console.Write("\n");
                }
            }
        }
    }

  • TadejK

    odujosh wrote:
     .. code ..


    Sorry, been away the last few days ..
    Thanks for the code, but I'm not sure if it'll come in handy regarding my current problem .. perhaps in the future.
    It certainly looks interesting, though.

    Well, it seems there's only one real solution to my problem .. I'm gonna have to manually translate the input string to the char .. "a" becomes 'a', "\t" becomes '\t' .. nothing simpler.

    Thanks for all the help, guys.

    Regards,
    Tadej

  • SkinnedKnuckles

    Could you use the verbatim literal marker, @"\t"?
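
    For what it's worth, a verbatim literal only changes how the compiler reads source code: @"\t" is the same two-character string that already arrives in args, so it doesn't convert anything. One ready-made option, offered as a sketch rather than an answer anyone in the thread settled on, is Regex.Unescape from System.Text.RegularExpressions, which turns escape sequences such as \t and \n into the characters they name. ParseDelimiter is a hypothetical helper name, and the sketch assumes the delimiter is always a single character once unescaped.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterSketch
{
    // Hypothetical helper: turns a command-line argument such as \t or ; into one char.
    public static char ParseDelimiter(string argument)
    {
        // Backslash followed by 't' (two chars) becomes a tab (one char);
        // text without escapes, such as ";", comes back unchanged.
        string unescaped = Regex.Unescape(argument);

        // char.Parse still throws FormatException if more than one character remains.
        return char.Parse(unescaped);
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(ParseDelimiter(@"\t") == '\t'); // True
        Console.WriteLine(ParseDelimiter(";") == ';');    // True
    }
}
```

    Regex.Unescape understands the regular-expression escape set, which is broader than just \t and \n, and it can reject malformed input such as a lone trailing backslash, so a hand-written translation may still be the better fit if only a few escapes should be accepted.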
