The Sandbox Thread

Content management and tag expansion: Reinventing the wheel?

  • kenfine

    I'll share my current work in progress, which may represent an impressive re-invention of the wheel.

    I've been thinking really, really hard about content management issues, and the best ways to allow web pages to be programmatically rendered while still allowing for flexible layouts.

    My solution involves programmatic expansion of custom tags in several stages. Tags describe high-level page structure, and they can also be embedded by non-technical staff using a web-based inline text editor.
    I'm coding this in ASP/VBScript even though I understand that ASP.NET richly supports the development of user tags.

    I'm not formally trained in CS or web development -- my background is in visual art, and I'm sort of a colleague-less one-man shop in my job. One reason I'm posting this here is that I'm hoping others with experience in advanced content management may have perspectives on how my design can be improved.

    A Visio diagram that elaborates the sequence of how my pages will be built appears below.

    I'll follow up this message with a message I wrote earlier that describes some of the rationales for how this thing will work, as well as a challenge for how to make my separation of content and design more "pure".

  • MasterPi

    Interesting. Can you provide examples of what each file would look like (the XML, the TempXML, etc.)? I'm curious to see this.


    I'm in the process of building a content management system as well. Unfortunately, I'm very much limited (classic ASP :( ). If you are interested, I can post my diagram (keep in mind, I'm only 15, so my diagramming skills are a bit awkward :D).

  • kenfine

    I can post some code later on, probably tomorrow.

    What I described expands on some ideas and code originally written up in a (IMHO) classic and insightful article on diado.com. The author describes a method of using regex parsing to extract custom tags and attributes from a content stream and expanding them programmatically. This allows users of an inline CMS to embed custom tags that call more advanced programming. I implemented his idea on one of my websites: we have a smart person working for us who doesn't know HTML but who wants to embed "for more information" link boxes with each article. Using custom tags, she can supply a title and a URL and the machine takes care of generating a nicely formatted set of links in a tabular layout.
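
To make the regex idea above concrete, here is a minimal sketch. It is written in Python rather than the ASP/VBScript used in this thread, and it is not the code from the article mentioned above. The `cms:moreinfo` tag, its `title` and `url` attributes, and the function names are all made up for illustration.

```python
import re
from html import escape

# Hypothetical tag syntax: <cms:moreinfo title="..." url="..." />
TAG_RE = re.compile(r'<cms:(\w+)((?:\s+\w+="[^"]*")*)\s*/>')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def render_moreinfo(attrs):
    """Render one 'for more information' link inside a small table."""
    title = escape(attrs.get("title", ""))
    url = escape(attrs.get("url", ""), quote=True)
    return (f'<table class="moreinfo"><tr><td>'
            f'<a href="{url}">{title}</a></td></tr></table>')


# Map tag names to the functions that expand them.
HANDLERS = {"moreinfo": render_moreinfo}


def expand_tags(content):
    """Replace each known custom tag with its generated HTML."""
    def replace(match):
        name, raw_attrs = match.group(1), match.group(2)
        handler = HANDLERS.get(name)
        if handler is None:
            return match.group(0)  # leave unknown tags untouched
        return handler(dict(ATTR_RE.findall(raw_attrs)))
    return TAG_RE.sub(replace, content)


if __name__ == "__main__":
    text = 'Read on. <cms:moreinfo title="Library hours" url="/hours" />'
    print(expand_tags(text))
```

The point of the handler table is that a non-technical editor only ever types the short tag, and new tag types can be added by registering another function.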

    My plan is a little more ambitious in that it calls for this tag expansion to happen in several iterations: tags can define content areas, and tags can also describe the entire layout of the page. My sense is that all of this parsing is not efficient server-side, but if I'm caching pages programmatically, who really cares.

    ASP.NET supports custom tags and user controls, but I don't think it supports the sort of two-level tag expansion I've described. Since ASP/VBScript is what I know well, I think it's what I'm going to use until ASP.NET 2.0's "master pages" are out the door.

    I sometimes share your inferiority complex over using boring old ASP, but the truth is that you can use it in fairly sophisticated ways: it supports a minimal implementation of classes and as soon as you start playing around with the class stuff, the world will suddenly seem quite a bit bigger.

    There really aren't very many great books or articles on advanced content management. Just keep plugging away at it and try to solve the hard problems. I like to think it'll all snap into focus sooner or later.

    -KF

  • MasterPi

    Well, for my system, I was trying to imitate the MS CMS (I have absolutely no clue how they made their CMS, and no one else seems to know either).

    Here's my diagram. The CMS basically uses templating, and I have to actually go out and make the templates on my own. It's not very useful right now since I haven't implemented an editing system (I can't actually have my users learn XML coding; I need something more graphical). Your idea looks far more sophisticated than mine, as you have probably already noticed. ;)

     

    Oh btw...ignore the little wigglies. :) They are indeed hilarious...but should not take away from the seriousness of the project. ;)

  • kenfine

    Cool stuff, mvpStar. Wish I had been fooling around with XML at your age (of course, if I had been, I'd have been the inventor :) ).

    Some considerations that will inform your CMS design: 

    + whether you're designing for a single site or many sites;
    + how fancy-pants your layouts are, and how many controls your users might want to embed in their layouts;
    + the sophistication, or lack thereof, of the CMS's users;
    + your understanding and implementation of relational database concepts, which is key, and the sophistication of your data display.

    Many of the solutions that have been devised assume a very flat data model, few controls or customizations made by users,  and management of only a few sites.

    My work involves many sites and dozens of templates that more or less use the same modes of data display. Many of the renderings on these pages are complicated. The act of developing, organizing, and maintaining even the templates becomes a chore. The solution I'm leaning toward seeks to abstract the process of making templates. If I didn't have so much to manage and there was someone else to help with this work, I'd probably lean toward a less complicated model.

    One thing I notice is that for all the hoo-ha about separating content from presentation, many/most of the solutions out there don't really strike at the heart of the problem in the sense of making it easy to repackage and repurpose content in alternate forms. That's what I'm trying to do.

    Enough soliloquy. Let me ask you a question: why are you inclined toward keeping your data store in XML versus in a database? Databases solve many problems. SQL Server is good. :)

    Do you have a good understanding of the ASP FSO?

    Google for "ASP caching class". Someone has written a spectacular class for Classic ASP that will dynamically generate an in-memory and/or file-based cache on the server. You drop it on your page and forget about it. You can have the benefits of a database's dynamism with the performance of an auto-generated cache. Sort of a poor-man's ASP.NET caching object.
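
For a sense of what such a cache does, here is a rough Python sketch. It is not the Classic ASP class mentioned above, and `PageCache`, `get`, and `invalidate` are made-up names. It keeps rendered pages in memory and rebuilds one only when its entry has expired.

```python
import time


class PageCache:
    """Tiny in-memory cache: rebuild a page only when its entry has expired."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, html)

    def get(self, key, build):
        """Return the cached page for key, calling build() on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        html = build()
        self._entries[key] = (now + self.ttl, html)
        return html

    def invalidate(self, key):
        """Drop an entry, e.g. after an editor saves the page."""
        self._entries.pop(key, None)
```

A page handler would call something like `cache.get("home", render_home)`, where `render_home` is whatever function does the expensive database and template work.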

    If you want your users to have the ability to easily edit the textual aspects of their page (what I call the "contentarea"), you can embed an inline text editor on an administrative page. yusasp.com makes a cheap one that will do the job; telerik.com makes an ASP.NET-based editor that will dazzle and amaze you. Any of these editors will give your users great power to edit their own files, but remember that it's only part of the content management problem. (Why MSFT doesn't build a world-class inline editing environment and ship it with ASP.NET 2.1 is beyond me. Go buy telerik.com and you'll rule the world, MSFT.)

    Thanks for the conversation, mvpStar. This is fun: there ain't many/any folks at my workplace that can relate to the issues I'm trying to work through, and it's nice to be able to talk it through with other folks.

    -KF

  • MasterPi

    kenfine wrote:
    Enough soliloquy. Let me ask you a question: why are you inclined toward keeping your data store in XML versus in a database? Databases solve many problems. SQL Server is good.  


    Well basically, the xml content is in a database. A cached version of the content is the xml file. Basically when a user requests the page, he requests the cached version or the newly generated cached version.

    I could have it come straight from the database, but I also want to expose the actual raw XML. It seems a bit harder to grab that XML from a database than from a raw file, although I could be completely mistaken.

    Basically exposing the xml exposes that other alternative to grabbing data which I think is the big plus for xml. So in essence, I now have two methods of grabbing the data: going directly to the .asp page and viewing the html content or grabbing the data directly from the xml.

    When I was talking a lot about this system to friends who work in this kind of business, they also questioned me about having data "stored" in an XML file. They reasoned: "XML is used to describe data, not to store it. That's what a database is for". I understand this completely; however, a question that's always been lurking in my mind is: what exactly do you put in the database if it's supposed to be used for storing data?

    Do you keep the HTML form of the data in the database and convert it to XML format, which then converts to HTML again using XSLT? Seems rather pointless.

    A friend who works at a company called TorchBox told me of their CMS.  Basically in the databases, the raw xml is stored. Then they use abstraction to take the xml bits and plug them into templates. Lastly the final page is cached.

    So, I decided to adopt their method of placing the raw xml in the database.

    Regarding SQL Server, I don't actually have access to anything like SQL Server.  I end up using MS Access.  This CMS I'm building is basically for my school's library. 

    Now that I think about it, given what I was trying to accomplish (making it easier for the librarians to manage the site), leaving everything in HTML and plugging in an editor and an administrative area would probably have been much easier than the horror I've created. ;)

    Then again, I lose the benefits of xml, making content universal.

    All that I really need to do now is make an editor that will take user input and describe it.

    My xml content is basically something like:


    <body>
      <standardModule>
        <section>
          <title>Highlights</title>
          <separator />
          <reference xlinkHref="reviewssubmissions">
            <label>Submit a Book Review!</label>
          </reference>
          <para>Find out more about how you can get your voice heard...</para>
        </section>
      </standardModule>
    </body>


    Which then becomes:


    <div id="standardM">
      <br/>
      <h4>Highlights</h4>
      <hr/>
      <a href="reviewssubmissions">Submit a Book Review!</a>
      <p>
        Find out more about how you can get your voice heard...
      </p>
    </div>



    My concept uses xslt transformations which occur at the server.
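
As a rough illustration of that mapping, here is a Python sketch using only the standard library. It is a stand-in, not XSLT and not the stylesheet used in this system: it walks the XML with `xml.etree.ElementTree` and emits the same kind of HTML. The function names `render_element` and `render_body` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

SOURCE = """<body>
  <standardModule>
    <section>
      <title>Highlights</title>
      <separator />
      <reference xlinkHref="reviewssubmissions">
        <label>Submit a Book Review!</label>
      </reference>
      <para>Find out more about how you can get your voice heard...</para>
    </section>
  </standardModule>
</body>"""


def render_element(el):
    """Map one content element to its HTML; unknown elements give ''."""
    if el.tag == "title":
        return f"<h4>{escape(el.text or '')}</h4>"
    if el.tag == "separator":
        return "<hr/>"
    if el.tag == "reference":
        href = escape(el.get("xlinkHref", ""), quote=True)
        label = escape(el.findtext("label", default=""))
        return f'<a href="{href}">{label}</a>'
    if el.tag == "para":
        return f"<p>{escape(el.text or '')}</p>"
    return ""


def render_body(xml_text):
    """Render every standardModule in the document as a div."""
    root = ET.fromstring(xml_text)
    parts = []
    for module in root.iter("standardModule"):
        parts.append('<div id="standardM">')
        parts.append("<br/>")
        for section in module.findall("section"):
            for child in section:
                html = render_element(child)
                if html:  # skip elements this sketch does not know
                    parts.append(html)
        parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_body(SOURCE))
```

An XSLT stylesheet would express the same element-by-element rules declaratively; the sketch only shows the shape of the mapping.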


    Bah, I was gonna write some more but I gotta go to lunch. :P I'm starved.

    Catch ya later!

    mVPstar

  • kenfine

    MVPstar,

    Write Microsoft, tell them you're a 15 year old genius, and to comp you a copy of SQL Server. Barring that, you can probably get an academic discounted version of the standard edition of SQL Server for ~300. Make some noise and someone may/should help you out.

    Among a billion other worthwhile things that SQL Server can do, it can render relational database data as XML, either in flat form or according to whatever schema you desire. After you score the free copy of SQL Server, read Wrox Press's "SQL Server 2000 DTS" and "SQL Server 2000 Professional."

    So the answer to your question about what you store in the DB: you store data, and you store relationships about that data, and if you need to render it as XML or in any other structured format, SQL Server and ADO/ADO.NET make it super-easy to translate it into a kludgier form.

    The following Classic ASP code will return XML directly from a SQL Server database, the operative bits being the FOR XML RAW directive:

    ConnString = Whatever_String
    Set objStream = Server.CreateObject("ADODB.Stream")
    Set objConn = Server.CreateObject("ADODB.Connection")
    Set objComm = Server.CreateObject("ADODB.Command")
    objConn.Open ConnString

    Set objComm.ActiveConnection = objConn 'Set reuses the open connection
    objComm.CommandType = 1 'adCmdText
    objComm.CommandText = "SELECT blah blah blah ... FOR XML RAW"
    objStream.Open 'open the stream before handing it to the command
    objComm.Properties("Output Stream") = objStream
    objComm.Execute , , 1024 'adExecuteStream
    objStream.Position = 0 'rewind before reading
    Response.Write objStream.ReadText


    I'm not a developer genius, but my own experience is that XML is a kludgy, clumsy mechanism for data storage in many circumstances. In my opinion, it's an excellent medium for data exchange, but a database engine can be applied a lot more flexibly. 

    I'm also proposing to store XML in a database, but what the XML really is in my case is a kind of encoding or shorthand for a more expansive template. 

    Maybe this helps. Your library project sounds great. Stick with it.


  • kenfine

    Attached, a vastly simplified map of everything I discussed in my design earlier.

  • MasterPi

    kenfine wrote:
    Write Microsoft, tell them you're a 15 year old genius, and to comp you a copy of SQL Server. Barring that, you can probably get an academic discounted version of the standard edition of SQL Server for ~300. Make some noise and someone may/should help you out.


    *Ahem* Any one of you MS guys out there reading this...Charles... ;)



    I think I got what you are doing. Very cool. :P Just curious: one of my problems is mixing server-side coding (like page-specific scripts, etc.) with the final rendered page. How did you work around this? Sometimes it would be nice to pull stuff from a DB here and there, yet rendering from an XML file makes only static information possible (unless you call an update every time).

    Is it perhaps one of the custom tags or something?


    Now as I write this, I think I figured out how to better use the DB to do what I wanted to do.

    Thanks so much! As I talk to you, I'm learning so much more about CMS than I could dream of!

    Thanks again!

    mVPstar

  • kenfine


    If Microsoft doesn't give you a copy of SQL Server, as it should, remember that MSFT generously allows you to install a complete and fully functional version for three or four months on trial. Four years ago when I started developing my first clumsy web applications, I learned on a trial version of SQL Server, and that gave me enough time to convince my bosses that it was worth the expenditure. Generally one good (or even bad) app is all it takes.

    The deal with XML or a database or any other structured data form is that instead of dumping the whole contents of an XML file, you can pick and choose fields or nodes and intersperse the dynamic stuff with static, boilerplate textual elements.

    If you've never used DB with ASP, go buy FriendofEd's "Foundation Dreamweaver MX" and work through their database examples. It's pretty clear.

    If you're asking how to piece pieces together: as you work with this stuff more and more, you try to figure out ways to modularize the code into distinct, self-contained content "units" which can be pieced together lego-style. Custom tags are a mechanism to call these units.

    Does that answer your Q? After you read the friendly manual a few million times, it will all start making sense, and you'll have confidence in your chosen solutions.

  • MasterPi

    Don't get me wrong, I use databases all the time.  Well, MS Access at least, but I use them for simple stuff like storing items in a shopping list (well not really storing items in a shopping list but you get the idea). It's a little hard for me to visualize in this project.

    The thing is deciding what part of the content goes where in the database.

    Right now I have a DB that has one table that sort of acts like a TOC of the website and tells where the cached XML is located, when the files were published, etc. The other table contains all the header/content info for each page; the content is in the same XML form as the cached XML file. The header isn't in XML form and is stored in fields (which are then assembled with the content into the cached XML file).


    Switching over to classes and OOP actually helped me a lot in making the viewing part of the system. I could simply do something like:

    If objXMLcached.Cached = True Then
        xmlCached = objXMLcached.CachedPage
    End If

    mVPstar

  • kenfine

    My tendency for a project like yours would be to structure the changeable page data in a database table:

    Contentitems
    -----------
    ContentitemID (PK)
    ConCategory (if you're doing this right, FK to a join table)
    ConTitle
    ConDescription
    ConPubDate
    ConKillDate
    ConStatus
    ConBody

    Any inline structuring of the text -- say, HTML links -- should happen in ConBody, the body or contentarea of the page. 

    Items like the header and the footer, which aren't subject to change, could be rendered as static include files and/or stored in XML. 

    If you have media (say, pictures), the beginner tendency is to include a field in the table "ConPhoto", or to include inline HTML markup that points to a photo file. These are quite limiting approaches from a CMS perspective. Better is to have a relational link in your DB from the Contentitem to one or more Photos via a "join" table.

    For smaller projects, changeable data can be stored in a database, and static stuff can go on the filesystem.

    Try to granularize your data into the smallest useful units, and programmatically join them if necessary. That allows you to, say, easily summarize your data, or get a list of only the last names of people, and so on.

    Just a few ideas. You may know all this already.

  • MasterPi

    Yeah that's pretty much how I have it right now (except the image):

    cID
    cTitle
    cPageTitle
    cAuthor
    cPageHref
    cDatePublished
    cDateUpdated
    cDescription
    cContent
    cCachedPage
    cPageTemplate

    I've never implemented a Join table so I'm not sure what that is exactly.

  • kenfine

    "Join table" is a sloppy way of describing a table that permits so-called "one-to-many" and "many to many" relationships between items in different tables in the database. For instance, one article can be related to many photos.

    Beginning web devs/90% of the web apps out there tend to implement "flat" spreadsheet-style data models that only allow, say, one photo or two photos to be associated with an article (e.g. "cPhoto1" and "cPhoto2" as fields).

    In your design above, you're limited to one cPageTemplate, one Author, and one cached page. For many webapps or data storage applications, that's going to be quite limiting (what do you do if there are several authors?) 

    A so-called "join table" allows you to reference an object on one table -- say, an article -- to a bunch of things on another -- say, photos. This is a fundamental tenet of relational database design, and it's one thing that makes the relational model powerful. 
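
Here is a small sketch of what that looks like in practice. It uses Python with SQLite as a stand-in for Access or SQL Server, and the `Photos` and `ContentitemPhotos` tables, their columns, and the sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Contentitems (
    ContentitemID INTEGER PRIMARY KEY,
    ConTitle      TEXT NOT NULL
);
CREATE TABLE Photos (
    PhotoID  INTEGER PRIMARY KEY,
    FileName TEXT NOT NULL
);
-- The join table: one row per (article, photo) pairing.
CREATE TABLE ContentitemPhotos (
    ContentitemID INTEGER NOT NULL REFERENCES Contentitems(ContentitemID),
    PhotoID       INTEGER NOT NULL REFERENCES Photos(PhotoID),
    PRIMARY KEY (ContentitemID, PhotoID)
);
INSERT INTO Contentitems VALUES (1, 'Book fair'), (2, 'New arrivals');
INSERT INTO Photos VALUES (10, 'fair1.jpg'), (11, 'fair2.jpg'), (12, 'shelf.jpg');
INSERT INTO ContentitemPhotos VALUES (1, 10), (1, 11), (2, 12), (2, 10);
""")


def photos_for(content_id):
    """Return the file names of every photo linked to one content item."""
    rows = conn.execute(
        """SELECT p.FileName
           FROM Photos AS p
           JOIN ContentitemPhotos AS cp ON cp.PhotoID = p.PhotoID
           WHERE cp.ContentitemID = ?
           ORDER BY p.PhotoID""",
        (content_id,),
    )
    return [name for (name,) in rows]
```

Because the pairing lives in its own table, an article can have any number of photos and the same photo can be reused by several articles, with no `cPhoto1`/`cPhoto2` columns needed.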

    Have a look at this excellent article on classes: 
    http://www.asp101.com/articles/richard/ooasp/default.asp 

    Quinn is talking about class implementations in Classic ASP, but he also demonstrates a good clean database design.

    Database design is the most valuable thing I've learned in my work. You will be able to do great things if you learn to use DBs well.

  • kenfine

    The text below elaborates on some of what's discussed in the message above and in the Visio diagram.

    The message was written in an earlier stage of my thinking, so some of what's described may be out of sorts with the Visio piece.

    ----------------------------------------------

    This message describes an advanced method of page templating and layout management. It allows administrators to easily and significantly modify page layouts by modifying simple tag-based descriptions. The system also allows novice users to extensively modify layouts within editable subsections of pages and (unlike many page templating strategies) to call advanced presentational functionality via custom tags that they can write into their content as plaintext.

    Behind the scenes, the system works by "expanding" XML tag information in a succession of stages. At the highest and most abstract level, a page layout is described as a simple combination of custom XML tags embedded in a tabular HTML layout.

    The contentarea +within+ a layout can be modified by a user in a given page instance. The user can opt to include custom tags that call advanced functionalities within this contentarea, using a web-based administration tool.

    So, in Stage 1, the administrator can describe pages as a set of abstract tags: header, contentarea, footer, and so on. In Stage 2, the Stage 1 tags are "expanded" into a more complete layout along with feeding in database content for the contentarea of a given page instance. The contentarea corresponding to a page instance may include custom tags written in by a user via a web-based editor. In Stage 3, the user's custom tags are expanded into an HTML stream and combined with the output of Stage 1 to render a finished webpage. Succinctly:

    1) XML layout description with tabular layout -> 2) Template expanded into page instance with contents of contentarea from DB -> 3) custom controls/parameters embedded in the content area expanded
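
The staged expansion summarized above could be sketched roughly as follows. This is Python rather than ASP/VBScript, and it makes several assumptions: the `user:link` tag, the `LAYOUT_PARTS` table, and the function names are hypothetical, the page content is passed in as a string instead of being read from a database, and no escaping or caching is attempted.

```python
import re

# Stage 1: abstract layout, as an administrator might describe it.
LAYOUT = "<layout:header /><layout:contentarea /><layout:footer />"

# Hypothetical expansions for the layout tags.
LAYOUT_PARTS = {
    "header": "<div>Site header</div>",
    "footer": "<div>Site footer</div>",
}

LAYOUT_TAG = re.compile(r"<layout:(\w+)\s*/>")
USER_TAG = re.compile(r'<user:link\s+title="([^"]*)"\s+url="([^"]*)"\s*/>')


def stage2(layout, page_content):
    """Expand layout tags; contentarea is filled with the page's stored content."""
    def expand(match):
        name = match.group(1)
        if name == "contentarea":
            return page_content
        return LAYOUT_PARTS.get(name, "")
    return LAYOUT_TAG.sub(expand, layout)


def stage3(html):
    """Expand custom tags that a user typed into the content area."""
    return USER_TAG.sub(r'<a href="\2">\1</a>', html)


def render(layout, page_content):
    """Run stage 2 and then stage 3 to produce the finished page."""
    return stage3(stage2(layout, page_content))
```

Keeping the stages as separate functions means the output of each one can be cached or inspected on its own.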

    This model is powerful because the conceptual view of a page is separated from the details of its rendering. The high-level description of the page remains a "pure" and uncomplicated XML description. Lower levels can contain the details of rendering, can "branch" versions to accommodate alternate browser types, etc.

    In this model, both page layouts and content layouts can be managed via web-based CMS, and parameters can be fed to the tags to modify the basic conditions of the layout. If the controls that drive the page parts are sufficiently encapsulated, the layout descriptions are extremely flexible.

    Unresolved issues: I may be suffering from a case of "creeping elegance" (the evil twin of "creeping featurism"), but it would be nice if you could somehow have a total separation of the XML tags that describe the template elements, and the mechanism of describing where a given element should be placed. In the model I describe above, HTML table structures are used to position the custom tags. You could conceivably use CSS to describe placement, but it still intermingles some specific presentational details with abstract layouts.

    So right now, you're looking at...

    <table>
    <tr> <td colspan='3'> <layout:header /> </td> </tr>
    <tr> <td> <layout:navbar /> </td> <td> <layout:contentarea /> </td> <td> <layout:sidebar /> </td> </tr>
    <tr> <td colspan='3'> <layout:footer /> </td> </tr>
    </table>

    ...for a given page layout type, which is looking pretty good in terms of simplicity. But really, it would be nice if you could somehow express this in a way that's independent of HTML or CSS or any presentational technology, but could somehow notate the spatial relationships of the layout so that it could be rebuilt independent of any presentational tech.
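
One possible direction, offered only as a sketch: describe the layout as plain rows of region names and leave the choice of HTML tables, CSS, or anything else to a separate renderer. The Python below is an assumption-laden illustration, not a worked-out answer. The `LAYOUT` structure and `to_html_table` are made-up names, and a table renderer is just one of the renderers that could consume the same description.

```python
# Layout as rows of region names; a row with fewer regions than the
# widest row has its last cell stretched to fill the remaining columns.
LAYOUT = [
    ["header"],
    ["navbar", "contentarea", "sidebar"],
    ["footer"],
]


def to_html_table(layout):
    """One possible renderer: emit the grid as an HTML table of layout tags."""
    width = max(len(row) for row in layout)
    lines = ["<table>"]
    for row in layout:
        cells = []
        for i, region in enumerate(row):
            # Let the last cell in a short row absorb the remaining columns.
            span = width - len(row) + 1 if i == len(row) - 1 else 1
            attr = f" colspan='{span}'" if span > 1 else ""
            cells.append(f"<td{attr}><layout:{region} /></td>")
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_html_table(LAYOUT))
```

The row-of-names description still encodes a grid, so it does not fully escape presentational assumptions, but it keeps HTML and CSS out of the stored layout.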

    Ideas? Is the "right" solution here just to call it a day and make templates as necessary to suit the presentational tech du jour? Or is there a better way?

  • MasterPi

    Thanks for the link! I'll check it out when I find some time.

    I think I might go for a good revision of my system. 

    Just one more question, at what level does your client editor edit btw? Does it edit at the custom tag level? 

    mVPstar

  • kenfine

    mVPstar wrote:

    Just one more question, at what level does your client editor edit btw? Does it edit at the custom tag level? 



    Most of the inline content editors are similar to the one on C9 when you enter a message: they have a "Design" view, and an HTML/code view.

    Custom tags can be written in in the HTML view of any of these editors. Really good ones like telerik's provide facilities for a minimal representation of the custom tag in the HTML view -- it may be a gray labeled blob, but at least you know it's there.

    You will find that 99% of the inline editors out there have serious issues with mangling/rewriting tags, and custom tags aren't immune to this problem.

    HTH.

    -KF

  • MasterPi

    EDIT: nvm..what I was going to ask was redundant. 

    Still waiting for that SQL Server. :)
