Tech Off Post

Single Post Permalink

View Thread: Database design/performance question
  • LostIn​Tangent

    I have a group of about 6 datatables that collectively CAN contain millions of records (it is based on the user). Users have the ability to create a personal version of the data contained in the 6 tables so that they can modify/add/delete/whatever they want to the data to perform "what-if" scenarios and not affect the live data. The design originally placed a column in all 6 tables called VersionID that was part of the respective primary keys, allowing each record in all 6 tables to be replicated with a modified version identifier. So when a user created a new version, it would copy all data in each table, apply a new VersionID value to the data and re-insert it into each table.

    This solution seems to be pretty crappy in terms of performance/design.

    1) The tables get cluttered with tons of data that is user created.
    2) The tables record counts get blown out of whack if excessive versions are made, which in turns kills performance.
    3) In order to use these tables with any sort of effeciency, an index needed to be put on the VersionID column to allow for selecting of records by it, but adding that index affected the other non-version related operations on each table.
    4) Other issues...

    My question is: what would be a better approach to this solution? Would it be better to create a copy of each of the tables for each version? So that the data would be seperated and data operations would perform better since each version would be isolated. Plus when the version is deleted later, I could easily perform DROP TABLEs on each copy which would be basically instant and wouldn't jack the index of the live tables with deleting millions of records.

    Any input is appreciated.