Hey!
I've got a problem with parsing strings to chars .. to be more specific - with \t and similar.
I'm using the char.Parse(string) method, and it works .. if I give it something like char.Parse("a");
But, if I try char.Parse("\t"), it fails because it's more than one char in length.
So .. any ideas on how to parse "\t" to '\t' ?
Regards,
Tadej
-
char.Parse('\t') will fail because '\t' is a char literal and char.Parse expects a string. char.Parse("\t") does work. I tested it below, both by looping over a longer string and by calling it the way you suggest. Both work as expected. Enjoy.
using System;
namespace ConsoleApplication3
{
    class Program
    {
        static void Main(string[] args)
        {
            // Every char of the string, escapes included, round-trips through char.Parse.
            string mytest = "my\n\t\r\ntest\"";
            foreach (char c in mytest)
            {
                char fromString = char.Parse(c.ToString());
                Console.WriteLine("({0})", fromString);
            }
            // In source code, "\t" is already a single tab character.
            char rawTest = char.Parse("\t");
            Console.WriteLine("\n\nThis disproves your thesis ({0})", rawTest);
        }
    }
}
-
You have to take a look at encoding of your string.
It seems like you currently have no encoding.
-
@odujosh:
This is very weird ..
char rawTest = char.Parse("\t");
works ..
but
char rawTest = char.Parse(args[0]);
doesn't work .. where args[0] is "\t" .. that is actually my problem - I need to read \t as an argument ..
Any ideas as to why it wouldn't work this way?
@Maddus Mattus: would a different encoding really matter?
-
Are the contents of args[0] actually a tab character, or a backslash and a t? If args[0].Length == 2, of course it won't work.
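To make the distinction concrete, here is a small sketch (the class and method names are made up for illustration). In C# source the compiler converts the "\t" escape into one tab character, but text typed at a command prompt typically reaches args unchanged, so \t arrives as two characters, which is what "\\t" denotes in source.

```csharp
using System;

static class EscapeLengthDemo
{
    // Hypothetical helper: reports what a non-empty string really contains.
    public static string Describe(string s)
    {
        return string.Format("Length = {0}, first char code = {1}", s.Length, (int)s[0]);
    }

    static void Main()
    {
        string fromSource = "\t";  // the compiler turns the escape into one tab (code 9)
        string fromShell = "\\t";  // backslash + 't', as an argument typed as \t usually arrives

        Console.WriteLine(Describe(fromSource)); // Length = 1, first char code = 9
        Console.WriteLine(Describe(fromShell));  // Length = 2, first char code = 92

        char tab = char.Parse(fromSource);       // fine: exactly one character
        try
        {
            char.Parse(fromShell);               // two characters
        }
        catch (FormatException)
        {
            Console.WriteLine("char.Parse rejects the two-character string");
        }
    }
}
```

Checking args[0].Length before calling char.Parse is a quick way to see which of the two cases you actually have.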
-
Of course .. you're right, I just assumed somehow that \t would be converted to a tab .. hmm.
Well, that solves why this doesn't work .. but I've still got the same problem - how to read a tab from the command line using arguments? Or a new line character, for that matter..
Any ideas?
Regards,
Tadej
-
Umm, you would have to convert the string made up of "\" and "t" to a tab character or whatever you want to use:
using System;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            char c = (char)9; // 9 is the character code for tab
            string replaceTest = "My Test(\\t)"; // "\\t" is a backslash followed by 't'
            Console.WriteLine("Before: ({0})", replaceTest);
            Console.WriteLine("After: ({0})", replaceTest.Replace("\\t", c.ToString()));
        }
    }
}
-
Yeah, that's one way .. but I was looking for something more .. erm .. elegant?
Well, thanks anyway, I guess it'll have to do
Regards,
Tadej
-
The big technical issue is whitespace and detecting it. I am interested in seeing what you are doing that requires this. Maybe this would help me to understand how to direct you to a more elegant solution. Maybe we are addressing the wrong issue here.
-
What I'm doing is very simple - a command line utility which reads a csv file (using the supplied filename as an argument), splits each line using a delimiter (again, it's supplied as an argument .. it can be \t, ;, or similar), then processes the data, and saves it to a file (using another filename, supplied as an argument). So, in its basic form, it requires 3 arguments to be supplied.
In reality, it's a bit more complex, but that doesn't really matter. The point is, that the delimiter has to be supplied as an argument, and that works fine, if the delimiter is ; .. but if it's \t .. well, you see my problem
Of course, there are possible workarounds (like using a config file, instead of supplying everything in the form of arguments) .. but, that's not the point ..
Regards,
Tadej
-
Are you reading the values from the CSV into a database? If so, what's the platform?
-
Nope, I'm simply processing the data, then saving it back.
A more detailed response would be .. It involves lots of sample data, and the application works in such a way that it looks at a column, figures out what the unique values are, then selects the same amount of data from each unique value (aka class).
The application works perfectly, and it's complete .. the only thing missing is the problem I'm having with parsing \t ..
Don't get me wrong - I know enough to write a workaround or two.. but I'd rather not do that - partly because I believe that there has to be a more elegant (and correct?) way to do this (although my belief is fading quickly).
Regards,
Tadej
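The balancing step described in this post (find the unique values in a column, then keep the same number of rows per value) might look roughly like the sketch below. The names BalancedSampler and Sample are invented, and it assumes that "the same amount" means the size of the smallest class and that the first rows of each class are kept; neither is stated in the post.

```csharp
using System;
using System.Collections.Generic;

static class BalancedSampler
{
    // Groups rows by the value in classColumn, then keeps the same number of
    // rows from every group. Assumption: the per-class count is the size of
    // the smallest group, and the first rows of each group are kept.
    public static List<string[]> Sample(List<string[]> rows, int classColumn)
    {
        Dictionary<string, List<string[]>> groups = new Dictionary<string, List<string[]>>();
        List<string> order = new List<string>(); // classes in first-seen order
        foreach (string[] row in rows)
        {
            string key = row[classColumn];
            List<string[]> group;
            if (!groups.TryGetValue(key, out group))
            {
                group = new List<string[]>();
                groups.Add(key, group);
                order.Add(key);
            }
            group.Add(row);
        }

        // The smallest class decides how many rows every class contributes.
        int perClass = int.MaxValue;
        foreach (List<string[]> group in groups.Values)
        {
            perClass = Math.Min(perClass, group.Count);
        }

        List<string[]> result = new List<string[]>();
        foreach (string key in order)
        {
            result.AddRange(groups[key].GetRange(0, perClass));
        }
        return result;
    }

    static void Main()
    {
        char delimiter = '\t';
        string[] lines = { "1\tred", "2\tblue", "3\tred", "4\tred", "5\tblue" };
        List<string[]> rows = new List<string[]>();
        foreach (string line in lines)
        {
            rows.Add(line.Split(delimiter));
        }
        // Keeps two "red" rows and two "blue" rows.
        foreach (string[] row in Sample(rows, 1))
        {
            Console.WriteLine(string.Join(delimiter.ToString(), row));
        }
    }
}
```

Once the delimiter is available as a single char, it feeds straight into line.Split, which is why the argument parsing is the only missing piece.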
-
TadejK wrote:In reality, it's a bit more complex, but that doesn't really matter. The point is, that the delimiter has to be supplied as an argument, and that works fine, if the delimiter is ; .. but if it's \t .. well, you see my problem

So what you want to do is take a supplied argument of \t... that is, in C-ish, "\\t"... and turn it into a literal tab... that is, in C-ish, a "\t"?
EDIT:
If so, I think your best bet is to clearly define which C escapes you want to support, and implement them individually...
delimiter = delimiter
    .Replace("\\n", "\n")
    .Replace("\\\\", "\\")
    .Replace("\\t", "\t")
    ...
    ;
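One caveat with chaining Replace calls: each call rescans the output of the previous one, so an argument such as \\t (an escaped backslash followed by a t) can end up translated twice. A minimal sketch of a single left-to-right pass that avoids this, assuming only \t, \n, \r and \\ need to be supported. The names DelimiterParser, Unescape and ParseDelimiter are made up for illustration.

```csharp
using System;
using System.Text;

static class DelimiterParser
{
    // Translates \t, \n, \r and \\ in one left-to-right pass.
    // Unknown escapes are kept unchanged (an assumption; throwing would also be reasonable).
    public static string Unescape(string input)
    {
        StringBuilder result = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c != '\\' || i == input.Length - 1)
            {
                result.Append(c); // ordinary character, or a lone trailing backslash
                continue;
            }
            char next = input[++i]; // consume the character after the backslash
            switch (next)
            {
                case 't': result.Append('\t'); break;
                case 'n': result.Append('\n'); break;
                case 'r': result.Append('\r'); break;
                case '\\': result.Append('\\'); break;
                default: result.Append('\\').Append(next); break;
            }
        }
        return result.ToString();
    }

    // Converts an argument such as ";" or backslash + t into a single char.
    // char.Parse throws FormatException if the result is not exactly one character.
    public static char ParseDelimiter(string argument)
    {
        return char.Parse(Unescape(argument));
    }

    static void Main(string[] args)
    {
        string raw = args.Length > 0 ? args[0] : "\\t";
        char delimiter = ParseDelimiter(raw);
        Console.WriteLine("Delimiter code: {0}", (int)delimiter);
    }
}
```

With this in place, char delimiter = DelimiterParser.ParseDelimiter(args[1]); would accept both ; and \t, assuming the delimiter is the second argument.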
-
Here is a CSV parser I made.
"Header1",Header2,Header3
"Row1V1" ,"Row1V2","Row3V3"
"Row2V2","Row2V2","Row2V2"
is the content of a valid CSV. Notice the commas. It is highly suggested that you use the double quotes too: that makes detecting cells even easier and protects you against values like '1,000'. If you keep the headers standard, then you can get away with something like this. Of course, I am sure you can come up with improvements; this is my V1.
Bet on bugs. Fun little afternoon project.
using System;
using System.IO;
using System.Data;
using System.Collections.Generic;
using System.Text;
namespace ConsoleApplication1
{
    class CSVTableHelper
    {
        // csvFile holds the CSV text itself, not a file path.
        public CSVTableHelper(string csvFile)
        {
            m_CSVFile = new StringReader(csvFile);
            string line = m_CSVFile.ReadLine();
            while (line != null)
            {
                ParseLine(line);
                line = m_CSVFile.ReadLine();
            }
        }
        private void ParseLine(string line)
        {
            if (m_CSVData == null) // first row: create the table and its columns
            {
                m_CSVData = new DataTable();
                GetHeaders(line);
            }
            else
            {
                GetValues(line);
            }
        }
        private void GetHeaders(string line)
        {
            // Assumes header cells are always formatted "h1","h2","h4"
            string[] headers = line.Replace("\"", string.Empty).Split(',');
            foreach (string header in headers)
            {
                DataColumn dc = new DataColumn(header);
                m_CSVData.Columns.Add(dc);
            }
        }
        private void GetValues(string line)
        {
            if (!string.IsNullOrEmpty(line))
            {
                line = line.Trim();
                Insert(AdvancedParse(line));
            }
        }
        // Splits a line into cells. Commas inside double quotes do not end a cell,
        // so a value like "1,000" stays whole. Doubled quotes ("") are not supported.
        private string[] AdvancedParse(string line)
        {
            List<string> cells = new List<string>();
            StringBuilder cell = new StringBuilder();
            bool inQuotes = false;
            foreach (char c in line)
            {
                if (c == '\"')
                {
                    // Toggle quoted mode; the quote itself is not part of the value.
                    inQuotes = !inQuotes;
                }
                else if (c == ',' && !inQuotes)
                {
                    // An unquoted comma ends the current cell.
                    cells.Add(cell.ToString());
                    cell.Length = 0;
                }
                else
                {
                    cell.Append(c);
                }
            }
            cells.Add(cell.ToString());
            return cells.ToArray();
        }
        private void Insert(string[] cells)
        {
            if (cells.Length != m_CSVData.Columns.Count)
            {
                throw new Exception("Line has invalid amount of cells");
            }
            else
            {
                DataRow dr = m_CSVData.NewRow();
                int cellcount = 0;
                foreach (string cell in cells)
                {
                    // Strings are immutable, so use the value TrimQuotes returns.
                    dr[cellcount] = TrimQuotes(cell.Trim());
                    cellcount++;
                }
                m_CSVData.Rows.Add(dr);
            }
        }
        private string TrimQuotes(string line)
        {
            if (string.IsNullOrEmpty(line))
            {
                return line; // nothing to trim in an empty cell
            }
            bool starting = false;
            if (line[0] == '\"')
            {
                line = line.Remove(0, 1); // Remove returns a new string
                starting = true;
            }
            if (line.Length > 0 && line[line.Length - 1] == '\"')
            {
                if (!starting)
                {
                    throw new Exception("Invalid Cell Format in CSV: Ending but no starting Quote");
                }
                line = line.Remove(line.Length - 1);
            }
            else if (starting)
            {
                throw new Exception("Invalid Cell Format in CSV: Starting but no ending Quote");
            }
            return line;
        }
        private int CountChar(char character, string line)
        {
            int count = 0;
            foreach (char c in line)
            {
                if (c == character)
                {
                    count++;
                }
            }
            return count;
        }
        private StringReader m_CSVFile;
        private DataTable m_CSVData;
        public DataTable CSVData
        {
            // Read-only: no set accessor
            get
            {
                return m_CSVData;
            }
        }
    }
}
-
And the test project (works well with the helper class):
using System;
using System.Data;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // One header row followed by four data rows, mixing quoted and unquoted cells.
            string teststring = "\"H1\",\"H2\",\"H3\"\n\"cell0\",cell1,cell2\n\"cell01\",\"cell11\",cell21\n\"cell02\",cell12,cell22\n\"cell012\",\"cell113\",cell213";
            CSVTableHelper helper = new CSVTableHelper(teststring);
            DataTable dt = helper.CSVData;
            // Print each row with its cells separated by tabs.
            foreach (DataRow dr in dt.Rows)
            {
                foreach (object o in dr.ItemArray)
                {
                    Console.Write(o.ToString());
                    Console.Write("\t");
                }
                Console.Write("\n");
            }
        }
    }
}
-
odujosh wrote: .. code ..
Sorry, been away the last few days ..
Thanks for the code, but I'm not sure if it'll come in handy regarding my current problem .. perhaps in the future
It certainly looks interesting, though.
Well, it seems there's only one real solution to my problem .. I'm gonna have to manually translate the input string to the char .. "a" becomes 'a', "\t" becomes '\t' .. nothing simpler
Thanks for all the help, guys
Regards,
Tadej
-
Could you use the verbatim literal marker, @"\t"?
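For reference, a verbatim literal goes the other way: @"\t" switches escape processing off, so it is the two-character string backslash + t, which is what the argument already contains. It also only affects literals in source code, not strings received at run time. A quick check (the class and method names are hypothetical):

```csharp
using System;

static class VerbatimDemo
{
    // True because @"\t" and "\\t" are both a backslash followed by 't'.
    public static bool VerbatimEqualsEscapedBackslash()
    {
        return @"\t" == "\\t";
    }

    static void Main()
    {
        Console.WriteLine(@"\t".Length); // 2: backslash and 't'
        Console.WriteLine("\t".Length);  // 1: a single tab
        Console.WriteLine(VerbatimEqualsEscapedBackslash()); // True
    }
}
```

So the verbatim marker is handy for writing the test input in source, but the translation from backslash + t to a tab still has to happen in code.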