webmonkey
How am i supposed to code with theeeeeese ?

My app doesn't need to accept HTML as HTML, only HTML as plain text, but something that looks like markup may be entered.

Therefore, unless I'm very mistaken, I need to turn off page validation, and to stop HTML from being rendered I need to HTMLEncode somewhere.

Some data may be exported as CSV for use in non-HTML-rendering applications.

My plan was to HTMLEncode all input text either before being passed to the BLL or in the BLL itself, so HTMLEncoded strings are stored in the database.

I would then need to decode only when the data is being externally exported.

Is this the correct way? Or should I not encode any data until it is being pulled out of the database?

stevo_
Casablanca != Manchester
Either way.. the argument for storing pure text in the database is that it's 'right'.. given that the database may be read by another system that doesn't need HTML encoding..

The good thing about storing HTML in a database is that you only need to encode at the time you enter the data.. and you don't need to worry about encoding at view time..

The choice is up to you: weigh up whether it's even worth doing one over the other, and don't go over the top when thinking about future implications..

Developers have a tendency to go to the extremes of their knowledge, and this may make them attempt to future-proof an app to death..

While it's important to think ahead a little.. it should only be to ensure you don't go down wrong routes.. as soon as you start having to do a ton of extra work in an attempt to future-proof yourself.. you're really not helping yourself, and in sets feature creep, yadda yadda..

Most website databases are solely for the website, so personally I don't see any problem with pre-encoding data..
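
For what it's worth, here is a minimal sketch of the pre-encoding route, assuming System.Net.WebUtility is available (.NET 4.0 or later). The names PreEncodedStore, Save and ToCsvField are made up for illustration, and a List<string> stands in for the real database. It encodes once on the way in and decodes only for the CSV export:

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical stand-in for the BLL/database: encodes once on the way in.
static class PreEncodedStore
{
    static readonly List<string> rows = new List<string>();

    // Encode at write time, so the stored value can be dropped straight into HTML.
    public static void Save(string userInput)
    {
        rows.Add(WebUtility.HtmlEncode(userInput));
    }

    public static IEnumerable<string> Rows { get { return rows; } }

    // Decode for non-HTML consumers, then apply CSV quoting (double any quotes).
    public static string ToCsvField(string stored)
    {
        string plain = WebUtility.HtmlDecode(stored);
        return "\"" + plain.Replace("\"", "\"\"") + "\"";
    }
}

static class Program
{
    static void Main()
    {
        PreEncodedStore.Save("<b>5 < 6</b>");
        foreach (string row in PreEncodedStore.Rows)
        {
            Console.WriteLine(row);                             // as sent to the page
            Console.WriteLine(PreEncodedStore.ToCsvField(row)); // as exported to CSV
        }
    }
}
```

The catch is that every non-HTML consumer has to remember the decode step.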
W3bbo
The Master of Baiters
stevo_ wrote:
The good thing about storing HTML in a database is that you only need to encode at the time you enter the data.. and you don't need to worry about encoding at view time..


You do both.

Your table should have PreText and PostText columns: PreText contains the raw stuff, PostText contains the HTML-encoded version.

In a CMS I wrote (which supported Markdown) PreText stores the Markdown source and PostText contains the markdown-rendered output.

Whilst it isn't a normalized database, Markdown rendering is expensive, so it's better to do it once and then cache it. The only real expense is that the database now consumes roughly twice the space it needs to, but it's just text and can be compressed easily.
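
Roughly, the two-column idea could look like the sketch below. This is not the actual CMS code: the Post class and its Save method are hypothetical, and WebUtility.HtmlEncode stands in for whatever renderer (Markdown or otherwise) produces PostText.

```csharp
using System;
using System.Net;

// Hypothetical row shape: PreText is the raw source, PostText the cached render.
class Post
{
    public string PreText { get; private set; }
    public string PostText { get; private set; }

    // Set both columns in one place so they cannot drift apart.
    public void Save(string rawInput)
    {
        PreText = rawInput;
        PostText = WebUtility.HtmlEncode(rawInput); // render once, at write time
    }
}

static class Program
{
    static void Main()
    {
        var post = new Post();
        post.Save("Tom & Jerry <script>");
        Console.WriteLine(post.PreText);  // for editing or export
        Console.WriteLine(post.PostText); // for the HTML page
    }
}
```

Keeping the write path in a single method is what stops the two columns from disagreeing.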
Matthew van Eerde
AKA Maurits
It depends on your application.  For some applications it is entirely appropriate to turn off validation (suppressing the XSS heuristics) and to store the raw input in the database... and then HtmlEncode when writing it out to the page.
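
As an illustration only, storing raw input and encoding at output time might be sketched like this. RenderCell is a hypothetical helper, a List<string> stands in for the database table, and WebUtility.HtmlEncode is assumed to be available (.NET 4.0 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Net;

static class Program
{
    // Stand-in for the database table: holds exactly what the user typed.
    static readonly List<string> rawRows = new List<string>();

    // Hypothetical render helper: the only place where encoding happens.
    static string RenderCell(string raw)
    {
        return "<td>" + WebUtility.HtmlEncode(raw) + "</td>";
    }

    static void Main()
    {
        rawRows.Add("<script>alert('hi')</script>"); // stored untouched
        foreach (string raw in rawRows)
        {
            Console.WriteLine(RenderCell(raw)); // encoded only on the way out
        }
    }
}
```

A CSV export can then read the same rows directly, with no decode step.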
W3bbo
The Master of Baiters
Matthew van Eerde wrote:
entirely appropriate to turn off validation (suppressing the XSS heuristics)


Calling it "heuristics" is a stretch ;) Validation practically amounts to:

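// Tongue-in-cheek approximation, not the actual ASP.NET implementation
String request = Request.QueryString.ToString() + new StreamReader(Request.InputStream).ReadToEnd();
if( request.IndexOf('<') != -1 ) throw new HttpRequestValidationException();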

It's like calling a system where the highest earners get fired an "algorithm" :)

BTW Matt, been in a C9 video yet?
Matthew van Eerde
AKA Maurits
W3bbo wrote:
BTW Matt, been in a C9 video yet?


Nope, but you can hear my voice on the Audio Fidelity Test videos here:
http://www.microsoft.com/whdc/winlogo/wlk/default.mspx
stevo_
Casablanca != Manchester

Storing both is an interesting idea, but I'm not sure what it solves; it may actually make things more complicated..

When I sum up things like this.. I have to think about the app itself.. how many times does it write data vs read the data?

I have to think about whether it's critical that this database interops with other clients.. such as a desktop management app that needs an HTML-decoded view.. and how often that app is used vs the web admin..

Given this is a single 'client' database.. and the client is always HTML-based, I would lean towards encoding at write time to save having to encode on every view..

In which case, covering both bases with encoded and decoded columns wouldn't gain me anything..

Given that this would be better suited to targeting both HTML clients and 'other' clients.. the contract imposed by this structure may cause more problems than it's worth for that insignificant perf increase..

I'd need to ensure that all the management clients write to both columns.. that they provide both the encoded and non-encoded values.. and that none of them screws up..

The good thing about encoding on output is that you are ensuring you don't ever allow unencoded outputs.. the worst you could do is output a double encoded value..
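
To show what that worst case looks like, here is a small sketch, assuming System.Net.WebUtility (.NET 4.0 or later). Encoding an already-encoded value just produces visible entity text in the browser rather than live markup:

```csharp
using System;
using System.Net;

static class Program
{
    static void Main()
    {
        string input = "a < b & c";
        string once = WebUtility.HtmlEncode(input);  // a &lt; b &amp; c
        string twice = WebUtility.HtmlEncode(once);  // a &amp;lt; b &amp;amp; c

        Console.WriteLine(once);
        Console.WriteLine(twice);

        // Decoding the double-encoded value once gives back the single-encoded text.
        Console.WriteLine(WebUtility.HtmlDecode(twice) == once); // True
    }
}
```

So the user sees a stray "&lt;" on the page, which is ugly but not a security hole.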

But like everything.. there's no silver bullet.. and it's all about factoring in what's best for this solution..

W3bbo
The Master of Baiters
Matthew van Eerde wrote:
W3bbo wrote:
BTW Matt, been in a C9 video yet?


Nope, but you can hear my voice on the Audio Fidelity Test videos here:
http://www.microsoft.com/whdc/winlogo/wlk/default.mspx


That's a great claim-to-fame you've got there :)
webmonkey wrote:
Also, I noticed you can set individual GridView BoundFields to HtmlEncode. Can you do the same with textboxes within a FormView/DetailsView/DataRepeater?

You can with a FormView and DetailsView if you use the Fields collection without custom templated columns.  Custom templated columns and the Repeater (which only uses templates) don't support it out of the box, but you can encode HTML either in code, with Literal WebControls, or both.