Alpha Software Mobile Development Tools:   Alpha Anywhere    |   Alpha TransForm subscribe to our YouTube Channel  Follow Us on LinkedIn  Follow Us on Twitter  Follow Us on Facebook

Announcement

Collapse

The Alpha Software Forum Participation Guidelines

The Alpha Software Forum is a free forum created for Alpha Software Developer Community to ask for help, exchange ideas, and share solutions. Alpha Software strives to create an environment where all members of the community can feel safe to participate. In order to ensure the Alpha Software Forum is a place where all feel welcome, forum participants are expected to behave as follows:
  • Be professional in your conduct
  • Be kind to others
  • Be constructive when giving feedback
  • Be open to new ideas and suggestions
  • Stay on topic


Be sure all comments and threads you post are respectful. Posts that contain any of the following content will be considered a violation of your agreement as a member of the Alpha Software Forum Community and will be moderated:
  • Spam.
  • Vulgar language.
  • Quotes from private conversations without permission, including pricing and other sales related discussions.
  • Personal attacks, insults, or subtle put-downs.
  • Harassment, bullying, threatening, mocking, shaming, or deriding anyone.
  • Sexist, racist, homophobic, transphobic, ableist, or otherwise discriminatory jokes and language.
  • Sexually explicit or violent material, links, or language.
  • Pirated, hacked, or copyright-infringing material.
  • Encouraging of others to engage in the above behaviors.


If a thread or post is found to contain any of the content outlined above, a moderator may choose to take one of the following actions:
  • Remove the Post or Thread - the content is removed from the forum.
  • Place the User in Moderation - all posts and new threads must be approved by a moderator before they are posted.
  • Temporarily Ban the User - user is banned from forum for a period of time.
  • Permanently Ban the User - user is permanently banned from the forum.


Moderators may also rename posts and threads if they are too generic or do not property reflect the content.

Moderators may move threads if they have been posted in the incorrect forum.

Threads/Posts questioning specific moderator decisions or actions (such as "why was a user banned?") are not allowed and will be removed.

The owners of Alpha Software Corporation (Forum Owner) reserve the right to remove, edit, move, or close any thread for any reason; or ban any forum member without notice, reason, or explanation.

Community members are encouraged to click the "Report Post" icon in the lower left of a given post if they feel the post is in violation of the rules. This will alert the Moderators to take a look.

Alpha Software Corporation may amend the guidelines from time to time and may also vary the procedures it sets out where appropriate in a particular case. Your agreement to comply with the guidelines will be deemed agreement to any changes to it.



Bonus TIPS for Successful Posting

Try a Search First
It is highly recommended that a Search be done on your topic before posting, as many questions have been answered in prior posts. As with any search engine, the shorter the search term, the more "hits" will be returned, but the more specific the search term is, the greater the relevance of those "hits". Searching for "table" might well return every message on the board while "tablesum" would greatly restrict the number of messages returned.

When you do post
First, make sure you are posting your question in the correct forum. For example, if you post an issue regarding Desktop applications on the Mobile & Browser Applications board , not only will your question not be seen by the appropriate audience, it may also be removed or relocated.

The more detail you provide about your problem or question, the more likely someone is to understand your request and be able to help. A sample database with a minimum of records (and its support files, zipped together) will make it much easier to diagnose issues with your application. Screen shots of error messages are especially helpful.

When explaining how to reproduce your problem, please be as detailed as possible. Describe every step, click-by-click and keypress-by-keypress. Otherwise when others try to duplicate your problem, they may do something slightly different and end up with different results.

A note about attachments
You may only attach one file to each message. Attachment file size is limited to 2MB. If you need to include several files, you may do so by zipping them into a single archive.

If you forgot to attach your files to your post, please do NOT create a new thread. Instead, reply to your original message and attach the file there.

When attaching screen shots, it is best to attach an image file (.BMP, .JPG, .GIF, .PNG, etc.) or a zip file of several images, as opposed to a Word document containing the screen shots. Because Word documents are prone to viruses, many message board users will not open your Word file, therefore limiting their ability to help you.

Similarly, if you are uploading a zipped archive, you should simply create a .ZIP file and not a self-extracting .EXE as many users will not run your EXE file.
See more
See less

Normalization Limitations

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

    Normalization Limitations

    OK, we've been wrestling with a normalization issue, with me being a sort of rabid normalizer and my boss being the voice of reason, so I thought I would put the question up here, broadly:

    What are the limitations of normalization in A5?

    The specific situation that arises can be seen by this example: We have a table that contains information about "projects". Each project is associated with a "contact". The contact's name is vital to the project, and I managed to make the name appear by using lookups, even though I'm storing only the contact code.

    So, if "Sam Jones" has an ID of "0001", projects associated with Mr. Jones have only the "0001" code. Through the magic of lookups, this appears as "Sam Jones" in all the forms and browses based on the project table (even though, again, only the "0001" is stored in the project table).

    This is wonderful. Normalized. What I was working to achieve. You can even do a right-click-sort on "Sam Jones" and the column will come out in alphabetic order (sorted by "Sam Jones" rather than "0001"). But there are problems.

    All or most of the underlying DB functions (except the right-click-sort) mentioned above seem to function on the "0001" instead of the "Sam Jones". In particular, the built-in "Find" function operates inconsistently (from the user's standpoint).

    We've thought of a few ways around some of these problems. I suggested, for example, that we build the index for the field based on a function that does a lookup--so that "0001" really does become "Sam Jones"--but initial attempts at this seemed to indicate that using a function in an index was a bad idea. ("Bad" as in "unreliable" and "prone to breakage".)

    We could also roll our own Find routine and do some sleight-of-hand, though this becomes a lot of work, and presents us with considerable work to duplicate the built-in feature.

    Or we could just scrap normalization and store "Sam Jones" alongside the "0001". Then, whenever the contact database is changed (say, "Sam Jones" becomes "Sam J. Jones"), we go through and update the projects table as well as the contacts table.

    This last makes me shudder. Normalization--3rd normal form--is a Good Thing. But if it can't be done reliably, it can't be done. So, how do y'all handle this?

    Thanks in advance for any feedback.

    #2
    RE: Normalization Limitations

    Blake

    I think there are some cases where you might save some data like "Sam Jones" alongside the "0001". It does violate some rules, but I think rules are arbitrary. If you have a need for an index on name, it isn't going to work very well if all you have saved is the id number. Yes, you can build a query on a set that might do what you need, but that may have speed issues and doesn't work well in all cases. It does make keeping related data updated a pain however. In one case where I was almost forced to save such "extra" data for indexing reasons, I have a routine that checks that field value in field rules for the table. If the value changes, it searches and updates all tables that link to that table with id and also save that field value. A great incentive to avoid the practice at all costs, or at least limit duplication to a couple tables.

    Another approach to find issues is to "roll your own" find routine. For example, if you save a customer name in the customer table and link it to invoices with a id, you can't find the name directly in the set. Instead I do a "find" on the child table, which will give me the id and then use that id to find related records in the set.

    I think sometimes we get caried away with normalization and rules and don't always use it appropriately. In the end, the question is does the solution we use do what we want? If it does, does it really matter if it meets all "rules"? Sometimes duplicate information just can't be avoided.

    Jerry

    Comment


      #3
      RE: Normalization Limitations

      Hello Blake,

      Someone posted a saying not long ago something like:

      "Normalize till it hurts, denormalize until it works"

      There's quite a bit of truth to this. First off, like Jerry, it wouldn't bother me to put the name in the table. However, I would probably look at a simple one-to-one link to provide the name without storing it in the table. I don't know what context you are using the lookup function, but in general I would stay away from this approach. If you are using a layout calculated field, each record has to query the table to find the appropriate name. On a browse, this could cause a fair amount of overhead. If however, you are doing this in a physical table field, defined as a calculated field, Then the lookup function shouldn't continue to fire unless the actual record is changed.

      Good luck,
      Jim

      Comment


        #4
        RE: Normalization Limitations

        Jerry,

        Philosophical points first:

        I think sometimes we get caried away with normalization and rules and don't always use it appropriately. In the end, the question is does the solution we use do what we want? If it does, does it really matter if it meets all "rules"? Sometimes duplicate information just can't be avoided.

        Relational theory and the principles normalization are, in my opinion, pretty solid responses to the difficulties that arise in storing data and creating DBMSes. They're elegant and simple solutions to complex problems. (Contrast this with any programming paradigm, like object-orientation, which tend to be very useful but also wanting in many ways.)

        They can be misapplied, of course. Sometimes one needs a hierarchical DBMS and R is not appropriate. Normalization can be misapplied by mistaking structure for significance. (For example, a lot of DBMSes don't support arrays because they think it violates 1st Normal Form, but an array can be a convenient way to store similar data that has different significances.)

        Anyway, the problem here is not "getting carried away by normalization" but a failure of the DBMS to provide a way to normalize properly. This isn't really a shot at A5: I don't believe the late Dr. Codd ever met a DBMS that satisfied all his rules in the eighteen years since he wrote them.

        What might be a legitimate shot is that A5 seems to allow normalization here--through lookups and indexes based on functions--but those features are unreliable. As a result, I have a lot more work to do.

        I think there are some cases where you might save some data like "Sam Jones" alongside the "0001". It does violate some rules, but I think rules are arbitrary.

        Some rules are arbitrary; normalization rules are not. Storing the "Sam Jones" in more than one place compromises data integrity. It makes what should be a simple change vastly more complex and prone to failure.

        If you have a need for an index on name, it isn't going to work very well if all you have saved is the id number.

        If I understand the way indices normally work in dBase, I actually don't see why it should be a problem. (As I understand it, indices are usually kept with their key values, making updating a much faster process. That might have been a cs. 1990 Foxpro advance, but I think it was widely adopted.)

        Yes, you can build a query on a set that might do what you need, but that may have speed issues and doesn't work well in all cases.

        I was thinking more in terms of a function that returned the value of one field--well, really, just a call to "lookup" would do it. "lookupc" would even use an index, right?

        It does make keeping related data updated a pain however. In one case where I was almost forced to save such "extra" data for indexing reasons, I have a routine that checks that field value in field rules for the table. If the value changes, it searches and updates all tables that link to that table with id and also save that field value. A great incentive to avoid the practice at all costs, or at least limit duplication to a couple tables.

        So, really, I think we agree as far as the "rules" of normalization. Duplication is bad. Breaking normalization should be done minimally, if at all.

        Another approach to find issues is to "roll your own" find routine. For example, if you save a customer name in the customer table and link it to invoices with a id, you can't find the name directly in the set. Instead I do a "find" on the child table, which will give me the id and then use that id to find related records in the set.

        Yes, this was one of the possibilities we've been looking at. Problem is, it makes the UI inconsistent. We either end up doing it for everything, or end up with two different find features.

        Appreciate the feedback, Jerry. It seems that we understand our options at this point. It's just a matter of making that tough decision.

        Comment


          #5
          RE: Normalization Limitations

          Hey, Jim,

          "Normalize till it hurts, denormalize until it works"

          Indeed, and many years ago when R was first finding popularity, I got into more than one heated debate about "how much normalization" is good. I had a C/S RDBMS at the time and discovered that if I fully normalized my data, it ballooned to about 200 times its non-normalized size, took minutes to perform a moderately simple request, and made some things downright impossible.

          In the subsequent decade, though, tools got better and certainly my understanding got better. I admit I've gotten spoiled: I know how much work non-normalization is, and what a nightmare it is if data integrity is compromised. I hate to break it.

          When I came to this project, my first goal was to normalize everything, and I certainly misapplied normalization in a few instances. (For example, in some cases, we need an actual copy of the data because we don't want the data to be changeable.) But this a good, necessary application, and the problem only arises because the linked-to-field-data is of primary importance to the user.

          There's quite a bit of truth to this. First off, like Jerry, it wouldn't bother me to put the name in the table.

          Gah. Actually, I think it does bother Jerry. Heh. He's just come to accept the realities of what can and can't be done. (I generally put off reality till the last second.)

          However, I would probably look at a simple one-to-one link to provide the name without storing it in the table.

          That's fine, except we need to sort on the name. We need the user to be able to "Find By" that name. I've considered the possibility of using some sleight-of-hand to flip the active table over to the contacts, but I think even if that were possible, I'd end up presenting the user with a jarring inconsistency in the UI.

          I don't know what context you are using the lookup function, but in general I would stay away from this approach. If you are using a layout calculated field, each record has to query the table to find the appropriate name. On a browse, this could cause a fair amount of overhead. If however, you are doing this in a physical table field, defined as a calculated field, Then the lookup function shouldn't continue to fire unless the actual record is changed.

          A5 lets you create a field that uses a lookup table for its values. You can store "X" in the field, but A5 will display "Y" from a completely different table, both in dropdown selection lists and in browses and on forms. It's a normalization dream come true.

          It's just not "complete", in the sense that this doesn't follow through to all of A5's features. You can sort on the abovementioned "Y" (which is great!) but the built-in Find feature reverts to the underlying "X" (which is bad).

          I don't know about the performance issues yet, but it's a cinch that we'll need calculated layout fields or subforms or something of that sort. We may "have" to put in the name, but there's no way we're duplicating all the contact info.

          Thanks for the feedback!

          Comment


            #6
            RE: Normalization Limitations

            In this case, since the incidence would be so slight, (after all, how often do you change a company name,) why not just query and update tables - its very fast.

            If you need to display the company name, why not link it 1 to 1, thus only the id is stored and "normalization" is preserved.
            Cole Custom Programming - Terrell, Texas
            972 524 8714
            [email protected]

            ____________________
            "A young man who is not liberal has no heart, but an old man who is not conservative has no mind." GB Shaw

            Comment


              #7
              RE: Normalization Limitations

              In this case, since the incidence would be so slight, (after all, how often do you change a company name,) why not just query and update tables - its very fast.

              'cause it's wrong? :-)

              Now, if we have to, we have to. The quesiton is "Do we have to?" and to a lesser extent "Why do we have to?"

              If you need to display the company name, why not link it 1 to 1, thus only the id is stored and "normalization" is preserved.

              That's the way it is now. The problem arises in using the built-in "Find" function and trying to index on this referenced field. I actually have a simple example of the situation, but I figured 'most everyone must've run into this at some point so I didn't post it.

              Comment


                #8
                RE: Normalization Limitations

                Blake,
                It might not be a bad idea to post your simple example and see if anyone wants to take a shot at trying to do finds and sorts while the data is normalized.
                Certainly an important question in your particular case is the potential size of the table - miilions? thousands? or what.
                John

                Comment


                  #9
                  RE: Normalization Limitations

                  There are those who would say this discussion is esoteric and overly technical. Well, they are probably right. Many users of Alpha Five have never heard of normalization or have no real clue what it means. I am not a programmer by trade and have not studied programming concepts and conventions a great deal. I suspect many users here have a similar level of knowledge and experience. When I took computer programming many many years ago, the big concern was how to use as few lines as possible since creating punch cards was a pain.

                  However, this is a very valuable discussion since many users don't consider the consequences of their methods. It seems some developers want to build forms, write code, and write reports and spend little time building the underlaying structures. Some of the table and database structures I have seen suggest that there is minimal thought about how to design the structures in any consistent or logical pattern. The lack of any normalization is just one of the issues.

                  My main concern is maintaining data integrity and storing the same information in more than one place just seems wrong and an invitation to disaster. I have spent some time trying to understand normalization rules and what I know suggests they are very good rules. But rules are arbitrary contructs. They just don't work in every situation.

                  For example, they work extremely well for such things as invoicing since you can link everything with customer ids, part numbers, etc. Each identifier has a lot of information attached. This is where normalization works very well. Save the "fixed" identifier and you can link a large amount of changable data.

                  On the other hand, it is easy to get carried away. I have a situation where a lookup table has a single field that can has a value that can change. I could assign a "fixed" identifier and link each instance the value is needed on the identifier. But now I am still saving the same amount of data, just a different field. If I need the actual value, I either have to link the table in a set or use a lookup function. Both add complication, and lookup functions have a speed hit. Or,I can just forget normalization issues and save the actual value. If the value does not change often, updates across the various tables are infrequent and not difficult to control.

                  Every decision should consider the tradeoffs. Rules should exist to help us make decisions, not force us to make bad decisions. I try very hard to normalize and don't like to compromise. At the same time, I am not afaid to break a "rule" if the result is just as secure and actually reduces complication. A rule is there to help. Don't ignore it, but if it doesn't work, chuck it.

                  Jerry

                  Comment


                    #10
                    RE: Normalization Limitations

                    Hello Blake,

                    >>That's fine, except we need to sort on the name. We need the user to be able to "Find By" that name. I've considered the possibility of using some sleight-of-hand

                    Comment


                      #11
                      RE: Normalization Limitations

                      Jim,
                      Good demo.The only problem with something like that is that it negates the possibility of optimization.Therefore as the table gets larger and larger,the performance will suffer.
                      Actually the normalizing thing can apply to keeping summary information(for instance mtd sales - ytd sales - lyr sales on a customer table ), even though the info exists in the invoice details.Very easy for an end user to do a query by form and find all customers whose current sales are zero and last year's sales more than $1,000.
                      Redundancy has its place.
                      John

                      Comment


                        #12
                        RE: Normalization Limitations

                        Hello John,

                        >>The only problem with something like that is that it negates the possibility of optimization.Therefore as the table gets larger and larger,the performance will suffer.

                        Comment


                          #13
                          RE: Normalization Limitations

                          The slowdown would be that Alpha will build a query every time you open the form.On a table with 400,000 records or such that will take a bit.
                          If you had a form with a filter accompanying your order expression that said just give me the names starting with "A", then I also believe there is no optimization.It is my understanding that Alpha will have to read thru every record to get the "A".

                          Comment


                            #14
                            RE: Normalization Limitations

                            Hello John,

                            >>The slowdown would be that Alpha will build a query every time you open the form.

                            Comment


                              #15
                              RE: Normalization Limitations

                              "True, but would not Alpha be able to take advantage of LQO and the index on the one-to-one child table (in my example the index on lastname in the Customer table) ?"

                              Jim,
                              No.At least that is what I always thought and it was backed up by testing.
                              Try it on some large file.Test the time difference between ordering on an indexed field in the parent record and an indexed field in the child record.
                              By putting the order in the way you did, Alpha in essence builds a temporary index (query) and once that is built you can do the "find by key".It works, but is questionable with huge tables.
                              John

                              Comment

                              Working...
                              X