Alpha Software Mobile Development Tools:   Alpha Anywhere    |   Alpha TransForm subscribe to our YouTube Channel  Follow Us on LinkedIn  Follow Us on Twitter  Follow Us on Facebook

Announcement

Collapse

The Alpha Software Forum Participation Guidelines

The Alpha Software Forum is a free forum created for Alpha Software Developer Community to ask for help, exchange ideas, and share solutions. Alpha Software strives to create an environment where all members of the community can feel safe to participate. In order to ensure the Alpha Software Forum is a place where all feel welcome, forum participants are expected to behave as follows:
  • Be professional in your conduct
  • Be kind to others
  • Be constructive when giving feedback
  • Be open to new ideas and suggestions
  • Stay on topic


Be sure all comments and threads you post are respectful. Posts that contain any of the following content will be considered a violation of your agreement as a member of the Alpha Software Forum Community and will be moderated:
  • Spam.
  • Vulgar language.
  • Quotes from private conversations without permission, including pricing and other sales related discussions.
  • Personal attacks, insults, or subtle put-downs.
  • Harassment, bullying, threatening, mocking, shaming, or deriding anyone.
  • Sexist, racist, homophobic, transphobic, ableist, or otherwise discriminatory jokes and language.
  • Sexually explicit or violent material, links, or language.
  • Pirated, hacked, or copyright-infringing material.
  • Encouraging of others to engage in the above behaviors.


If a thread or post is found to contain any of the content outlined above, a moderator may choose to take one of the following actions:
  • Remove the Post or Thread - the content is removed from the forum.
  • Place the User in Moderation - all posts and new threads must be approved by a moderator before they are posted.
  • Temporarily Ban the User - user is banned from forum for a period of time.
  • Permanently Ban the User - user is permanently banned from the forum.


Moderators may also rename posts and threads if they are too generic or do not property reflect the content.

Moderators may move threads if they have been posted in the incorrect forum.

Threads/Posts questioning specific moderator decisions or actions (such as "why was a user banned?") are not allowed and will be removed.

The owners of Alpha Software Corporation (Forum Owner) reserve the right to remove, edit, move, or close any thread for any reason; or ban any forum member without notice, reason, or explanation.

Community members are encouraged to click the "Report Post" icon in the lower left of a given post if they feel the post is in violation of the rules. This will alert the Moderators to take a look.

Alpha Software Corporation may amend the guidelines from time to time and may also vary the procedures it sets out where appropriate in a particular case. Your agreement to comply with the guidelines will be deemed agreement to any changes to it.



Bonus TIPS for Successful Posting

Try a Search First
It is highly recommended that a Search be done on your topic before posting, as many questions have been answered in prior posts. As with any search engine, the shorter the search term, the more "hits" will be returned, but the more specific the search term is, the greater the relevance of those "hits". Searching for "table" might well return every message on the board while "tablesum" would greatly restrict the number of messages returned.

When you do post
First, make sure you are posting your question in the correct forum. For example, if you post an issue regarding Desktop applications on the Mobile & Browser Applications board , not only will your question not be seen by the appropriate audience, it may also be removed or relocated.

The more detail you provide about your problem or question, the more likely someone is to understand your request and be able to help. A sample database with a minimum of records (and its support files, zipped together) will make it much easier to diagnose issues with your application. Screen shots of error messages are especially helpful.

When explaining how to reproduce your problem, please be as detailed as possible. Describe every step, click-by-click and keypress-by-keypress. Otherwise when others try to duplicate your problem, they may do something slightly different and end up with different results.

A note about attachments
You may only attach one file to each message. Attachment file size is limited to 2MB. If you need to include several files, you may do so by zipping them into a single archive.

If you forgot to attach your files to your post, please do NOT create a new thread. Instead, reply to your original message and attach the file there.

When attaching screen shots, it is best to attach an image file (.BMP, .JPG, .GIF, .PNG, etc.) or a zip file of several images, as opposed to a Word document containing the screen shots. Because Word documents are prone to viruses, many message board users will not open your Word file, therefore limiting their ability to help you.

Similarly, if you are uploading a zipped archive, you should simply create a .ZIP file and not a self-extracting .EXE as many users will not run your EXE file.
See more
See less

Can this routine be made quicker!

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

    Can this routine be made quicker!

    Hi,

    I have been looking at an old A5 app that we have been using for quite a few years. Its used to compare a match file (match.dbf, with partial zip/postcodes and other address info) against a sample file (process file.dbf) and then flag any records in the sample file that match.

    The address data is concatenated into a single field called address (one long string).

    We can have tens of thousands of records in the sample file and hundreds of records in the match file.

    This takes quite a while to chew through, is the routine below efficient! Is there a better way to do this and get a faster result?
    Code:
    t1=toseconds(time())
    
    m_tbl=table.open("match",FILE_RW_EXCLUSIVE)
    total_records=m_tbl.records_get()
    update.fields = 1
    update.field1 = "match_cnt"
    update.expr1  = "0"
    m_tbl.update()
    
    p_tbl=table.open("process file",FILE_RW_EXCLUSIVE)
    
    m_tbl.fetch_first()
    p_tbl.fetch_first()
    
    p_tbl.batch_begin()
    
    While .not. m_tbl.fetch_eof()
     While .not. p_tbl.fetch_eof()
      If rtrim(upper(match->match1))$upper(process file->address).and.rtrim(upper(match->match2))$upper(process file->address)
       p_tbl.change_begin()
       p_tbl.area      =m_tbl.area
       p_tbl.area_name =m_tbl.area_name
       p_tbl.std       =m_tbl.std
       p_tbl.change_end(.t.) 
       m_tbl.change_begin()
       m_tbl.match_cnt =m_tbl.match_cnt+1
       m_tbl.change_end(.t.) 
      End If
     p_tbl.fetch_next()
     End While
     m_tbl.fetch_next()
     p_tbl.fetch_first()
     statusbar.percent(m_tbl.recno(),total_records)
    End While
    p_tbl.batch_end()
    
    p_tbl.close()
    m_tbl.close()
    
    t2=toseconds(time())
    XMsg=ui_msg_Box("","Process Finished"+chr(10)+ltrim(str(t2-t1))+" seconds",ui_ok)
    Last edited by AaronBBrown; 11-22-2006, 06:05 PM. Reason: code tag
    Chris Tanti
    Technical Support

    Nuance & Fathom Ltd - The data-driven marketing agency

    #2
    Re: Can this routine be made quicker!

    two quick comments:

    1) this code would be much much easier to read if it were formatted. The message board removes all formatting unless the block of code is delimited with CODE delimiters, available in the advanced message reply editor

    2) ideas we come up with could be tested and validated before replying if specimen tables were supplied, with their indexes.

    -- tom

    Comment


      #3
      Re: Can this routine be made quicker!

      Chris,

      Yes it can be made faster. Currently, you are doing the match table # of records times the process table # of record fetches. Instead, you need to create an index in one of the tables (the larger one probably is best), then use Key_Exist() and LQO queries to quickly return the records to process in the inner WHILE.

      There also might be ways to use the Summarize, Intersect and/or Crosstab operations to effectively create a table indicating those that match, and that would probably be even quicker.
      Regards,

      Ira J. Perlow
      Computer Systems Design


      CSDA A5 Products
      New - Free CSDA DiagInfo - v1.39, 30 Apr 2013
      CSDA Barcode Functions

      CSDA Code Utility
      CSDA Screen Capture


      Comment


        #4
        Re: Can this routine be made quicker!

        There's also the Subtract Records operation.

        Lenny Forziati
        Vice President, Internet Products and Technical Services
        Alpha Software Corporation

        Comment


          #5
          Re: Can this routine be made quicker!

          Chris,

          is there any possibility that more than one record in the process table will match the same match file record? If not, your script could exit the inner loop whenever a match is found. There'd be no reason to check the rest of the process table records if you see what I mean.

          -- tom

          Comment


            #6
            Re: Can this routine be made quicker!

            I dunno if Ira's suggestion will work. I started to write the same response, but realized that the $ function that you need to make this work isn't compatible with LQO and indexing methods. Lenny and Tom make suggestions that seem worth trying.
            -Steve
            sigpic

            Comment


              #7
              Re: Can this routine be made quicker!

              Hi everyone,

              Not sure if an index will help, the routine looks for a string in the long address line field. The match could appear anywhere within this field, and its usaually based on a partial string match.
              eg. the process file could have TOWN 1,CF11 1TH and the Match file will have TOWN 1 , CF11 1 and consider that a close enough match.
              These two string may or may not appear together.

              And the match file could refer to many (all or any) of the records in the sample process file, its even possible that the match record could match several times to the same record.

              Just had a look at subtract and I think it works with matching keys, so once again I dont think it comes in to play when hunting for partial strings within a field

              It takes a heck of a long time to crunch through records, and I remember we had an old DOS DBASE5 app that did a similar job way quicker (but not terribly flexible or friendly to use)

              Have attached a sample DB.
              Chris Tanti
              Technical Support

              Nuance & Fathom Ltd - The data-driven marketing agency

              Comment


                #8
                Re: Can this routine be made quicker!

                Chris,

                If your match record could match several process table records then your counting system is going to be off. The same match table record would be changed each time a qualifying process table record is found. Is this significant to your process, or do you really only need to know if AT LEAST one match is found?

                Later:

                I took a look. Cannot find your script. Please describe the results you need to achieve as specifically as you can. (a) I've been assuming records "match" when they match exactly. It now appears you have something else in mind. (b) of what significance is the running count of matched records (c) how should multiple "matches" be handled?

                -- tom
                Last edited by Tom Cone Jr; 11-22-2006, 04:52 PM.

                Comment


                  #9
                  Re: Can this routine be made quicker!

                  Hi Tom,

                  Yep, just an example, its basically down to the person running the process to create a good match file with no overlaps (if thats what they want), there are a few occasions when we may want to have one match take priority over another and so they may key them in order of priority in those cases (if theres the posibility of a multi match).
                  Chris Tanti
                  Technical Support

                  Nuance & Fathom Ltd - The data-driven marketing agency

                  Comment


                    #10
                    Re: Can this routine be made quicker!

                    Sorry, about the zip, you need to add a table called tbl_menu, the code is on a form called menu!! Don't know why the table just got dropped (it should be in the original zip)!

                    Have re-attached a hopefully corect version of the zip file with the form visible
                    Chris Tanti
                    Technical Support

                    Nuance & Fathom Ltd - The data-driven marketing agency

                    Comment


                      #11
                      Re: Can this routine be made quicker!

                      If a record is found that matched, can it be excluded from further consideration?

                      If so, add a flag field to the table-- let's assume it's a logical field named "Found".

                      Set up an index, but filtered for "found = .f."

                      Do your searching through all records in the index, which will initially be all records in the table.

                      For each match that is found, change Found to .t.

                      Continue subsequent searches in the filtered index, which will mean fewer and fewer records searched as the process continues.

                      There's a bit of overhead involved to change the value of Found when a match is located, but I think it would pay off quite well.
                      -Steve
                      sigpic

                      Comment


                        #12
                        Re: Can this routine be made quicker!

                        Chris,

                        Whether or not an LQO is possible, nevertheless, rather than testing each record in the inner loop, evaluate the outerloop's table values for comparison, then do a query of the field's in the inner loop. The query will go much faster than record by record fetching. After the query, process the record's found (with a global update which is faster than individual changes of records (at least for query count's probably bigger than 3), then move to the next record in the outer loop. It will still take time, but should be faster by, I'll guess a factor of 100 or more.
                        Regards,

                        Ira J. Perlow
                        Computer Systems Design


                        CSDA A5 Products
                        New - Free CSDA DiagInfo - v1.39, 30 Apr 2013
                        CSDA Barcode Functions

                        CSDA Code Utility
                        CSDA Screen Capture


                        Comment


                          #13
                          Re: Can this routine be made quicker!

                          Chris,

                          If one assumes the format of the data will remain unchanged it's possible to custom craft a very fast routine. Using your sample define a new index called "byTownPostal" using this expression:
                          Code:
                          REMSPECIAL(WORD(ADDRESS,4," ",3))
                          . Build the index.

                          Then run this code to find the matches:
                          Code:
                          t1=toseconds(time())
                          
                          m_tbl=table.open("match",FILE_RW_EXCLUSIVE)
                          total_records=m_tbl.records_get()
                          update.fields = 1
                          update.field1 = "match_cnt"
                          update.expr1 = "0"
                          m_tbl.update()
                          
                          'xbasic_wait_for_idle()
                          p_tbl=table.open("process_file",FILE_RW_EXCLUSIVE)
                          p_tbl.index_primary_put("ByTownPostal")
                          
                          m_tbl.fetch_first()
                          p_tbl.fetch_first()
                          
                          p_tbl.batch_begin()
                          
                          While .not. m_tbl.fetch_eof()
                          	'load keys once
                          	'vc_m1 = trim(Match->match1)
                          	'vc_m2 = trim(Match->match2)	
                          	srchKey = remspecial(alltrim(Match->match1))+alltrim(match->match2)
                          	'trace.writeln(srchKey)
                          	result = P_tbl.fetch_find(srchKey)
                          	if result > 0 then 'find successful
                          		matchcounter = 0
                          		While (srchKey = remspecial(word(p_tbl.address,4," ",3))) .and. .not. p_tbl.fetch_eof()
                          			p_tbl.change_begin()
                          			p_tbl.area =m_tbl.area
                          			p_tbl.area_name =m_tbl.area_name
                          			p_tbl.std =m_tbl.std
                          			p_tbl.change_end(.t.)
                          			matchcounter = matchcounter + 1
                          			p_tbl.fetch_next()
                          		End While
                          		m_tbl.change_begin()
                          		m_tbl.match_cnt = matchcounter   'write last counter value, instead of each counter value
                          		m_tbl.change_end(.t.)		
                          	end if	
                          	m_tbl.fetch_next()
                          	p_tbl.fetch_first()
                          	statusbar.percent(m_tbl.recno(),total_records)
                          End While
                          p_tbl.batch_end()
                          
                          p_tbl.close()
                          m_tbl.close()
                          
                          t2=toseconds(time())
                          XMsg=ui_msg_Box("","Process Finished"+chr(10)+ltrim(str(t2-t1))+" seconds",ui_ok)

                          Comment


                            #14
                            Re: Can this routine be made quicker!

                            Steve / Ira / Tom,

                            Thanks a lot guys, I am going to try all your suggestions, I think it would be a good idea to filter out the matched records as suggested, occurences of match files with possible overlaps is extremely rare.

                            I have tried Tom version of the code, and its incredibly quick (I initially thought it didnt work!) Only prob is that they key values may not always be in that sequence, they could literally be matching on any part of the string (there could be sub areas, or even partial street names). If I can find a situation where the match is based on a fixed format then I think this is an incredibly efficient way of handling the matching. As it stands I think I will first try filtering out the matched records to reduce the sample size.
                            Chris Tanti
                            Technical Support

                            Nuance & Fathom Ltd - The data-driven marketing agency

                            Comment


                              #15
                              Re: Can this routine be made quicker!

                              they could literally be matching on any part of the string (there could be sub areas, or even partial street names)
                              This wasn't clear to me from the top. The approach I built will work only if the search key and data format remain fixed.

                              What you're seeing is the same difference which existed back in DOS dBase between a FIND (using an index) and a LOCATE. The indexed FIND approach was always much faster, but required rigid structure. The LOCATE approach requires no structure but is much slower.

                              Perhaps if you tell us more about the application, and describe the permitted ways users can populate the match table, maybe we can suggest other ideas for you. It would help to know if the structure and content of the Process_file table is fixed or not, too.

                              -- tom

                              Comment

                              Working...
                              X