Framework IX adds three new parameters to the @dbasefilter()
FRED function. These parameters give Framework two new methods
for instant access to very large disk-based databases.
@dbasefilter("c:\database directory\my_database.dbf", ; 1st. the disk-based database file path name
@and(DATE >= {2012/08/26}, DATE <= {2012/09/29}), ; 2nd. a FRED logical expression (filter) (DATE is a field name)
1, ; 3rd. starting record number to evaluate and load
1, ; 4th. ending record number to evaluate and load
40, ; 5th. max number of records to load at a time for browsing in table view
#FALSE, ; 6th. show record number in table view in read only mode in first col.
#FALSE, ; 7th. show delete flag in table view as 2nd col.
, ; 8th. Optional MDX/NDX index expression. Not used with the new parameters
1, ; 9th. File access mode. 1 write, 0 - read only.
, ; 10th. Optional big data storage expression. Not used with the 12th parameter
, ; 11th. Optional big data expression. Not used with the 12th parameter
"AND,FIRST_NAME,'JOHN',LAST_NAME,'SMITH'" ; 12th. On the fly scan index for instant access pre-filter (case insensitive)
; FIRST_NAME and LAST_NAME are field names
) ; end of @dbasefilter
The first method, controlled by the first two new parameters
(the 10th and 11th), handles "big data" instant interactive
processing and complex encoding such as DNA analysis with
specialized hardware, custom software, and consulting; it is
not the subject of this article.
The second method, which works with existing .DBF files, is
one of the subjects this article covers. It uses the new 12th
parameter to provide virtually instant access to very large
disk-based databases without the need to maintain indexing,
alleviating the need for MDX and NDX index files and an 8th
parameter index expression. It does this by creating
"on-the-fly" custom memory-based indexes for the fields
specified in the 12th parameter logical expression, so a FRED
expression in the 2nd parameter evaluates only those records
that conform to the 12th parameter logical expression.
The @dbasefilter 12th parameter is a single string
expression that provides instant access to dBase files (.DBF)
by preloading specified database fields to memory and scanning
them for specified data in order to create a subset of the
records, which the logical FRED expression in the 2nd parameter
then evaluates to determine which records will be loaded to the
database frame and which will be discarded. It saves time
since FRED evaluates only the data of interest. It completely
eliminates the need for NDX or MDX indexes, so the 8th
parameter index expression can be left empty. Preloading the
specified fields to memory is just as fast as indexing, or
faster, and does not require updating an index file when the
database is written to.
"AND,FIRST_NAME,'JOHN',LAST_NAME,'SMITH'"
How do you use this new capability? The 12th parameter is a
string enclosed in quotes that starts with one of two possible
keywords: OR or AND. It continues with an unlimited number of
pairs, each consisting of a field name and the data to look for
in it. The referenced fields are loaded to memory and scanned
for the specified data, taking into account the OR or AND
keyword. Only records that contain the specified data and
satisfy the OR or AND condition are made available to the 2nd
parameter FRED expression for evaluation. Since the default
returned value of the 2nd parameter FRED expression is TRUE,
leaving the 2nd parameter empty will load all the records that
satisfy the 12th parameter conditions.
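Framework performs this scan internally in native code; as a
rough illustration of the logic only, here is a simplified
sketch in Python (the records and field names are hypothetical,
and the sketch omits details of the real implementation, such
as the handling of repeated pairs naming the same field):

```python
# Illustrative sketch of the 12th-parameter scan logic (not actual
# Framework code): parse "AND,FIELD,'DATA',..." and keep only the
# records whose named fields contain the quoted data, ignoring case.
def scan_filter(records, spec):
    parts = [p.strip() for p in spec.split(",")]
    keyword, rest = parts[0].upper(), parts[1:]
    # Group the remainder into (field name, quoted data) pairs.
    pairs = [(rest[i], rest[i + 1].strip("'"))
             for i in range(0, len(rest), 2)]
    combine = all if keyword == "AND" else any
    return [r for r in records
            if combine(data.upper() in str(r.get(field, "")).upper()
                       for field, data in pairs)]

records = [
    {"FIRST_NAME": "John", "LAST_NAME": "Smith"},
    {"FIRST_NAME": "Jane", "LAST_NAME": "Smith"},
    {"FIRST_NAME": "John", "LAST_NAME": "Doe"},
]
subset = scan_filter(records, "AND,FIRST_NAME,'JOHN',LAST_NAME,'SMITH'")
# Only the John Smith record survives the pre-filter.
```

With the OR keyword the same spec would instead keep every
record matching at least one pair.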
In many cases a 12th parameter can replace the 2nd parameter
FRED expression, but when a more complex filter requires
additional evaluations, such as date or time periods or the
referencing of data outside the database, the 2nd parameter
FRED expression can be used to evaluate those records that
satisfied the 12th parameter conditions. The 2nd parameter
FRED expression may of course reference all the database
fields, including fields that are not specified in the 12th
parameter, as well as frames, spreadsheets, and other
databases, even other disk-based databases. It can perform all
FRED operations regardless of whether the evaluated record ends
up loaded and appended to the database frame or discarded. The
2nd parameter FRED expression is a program that can be as long
as any FRED program and can call other FRED programs. Its
final returned value (its result) must be a logical value,
which determines whether the evaluated record is loaded or
discarded. A returned value of #TRUE keeps the record in the
database after it was evaluated and #FALSE discards it.
Suppose we want to load a single record that contains the
name John Smith from a disk-based database (a .DBF file) that
has the fields FIRST_NAME and LAST_NAME in it. The 12th
parameter will look like this:
"AND, FIRST_NAME,'JOHN', LAST_NAME,'SMITH'"
A syntactic template of it looks like this:
"CONDITION, FIELD-NAME,'DATA' [, FIELD-NAME,'DATA' [, ...]]"
 \ fields /  \     1st      /  \  optional 2nd   /  \optional/
  condition   \ field-data  /   \  field-data   /    \ pairs /
 (conjunction) \   pair    /     \    pair     /
  (OR/AND)
The fields condition keyword and the first field-data pair are
required. The fields condition applies to all field-data pairs.
Any number of additional field-data pairs is allowed.
Field names and data are case insensitive.
Note the use of single quotes around the data but not around
the field names. In the John Smith example above, the
expression consists of a single AND keyword followed by two
field-data pairs.
Suppose we want to load the same set of records as above,
but only if they have a date field named DATE containing a date
between August 26, 2012 and September 29, 2012. We keep the
same expression in the 12th parameter and put this in the 2nd
parameter FRED expression:
@and(DATE >= {2012/08/26}, DATE <= {2012/09/29})
The 2nd parameter FRED expression will evaluate only the
records that satisfy the 12th parameter conditions. If it
returns FALSE, the evaluated record is discarded. If TRUE, the
record is appended to the database frame. Since the 2nd
parameter's default returned value is TRUE, with no 2nd
parameter all the records that satisfy the 12th parameter
conditions will be appended to the database frame.
Note that if the number of records to be loaded is larger
than the specified database frame height (the 5th parameter),
Framework makes it possible to browse through those records by
pressing (or performing with a FRED function or a macro) PgDn
on the bottom record or PgUp on the top record in the database
frame.
If you suspect that the number of records that the 2nd
parameter FRED expression will be evaluating is quite high,
say, for example, that a few hundred records belong to John
Smith, you can optimize the filter by adding more conditions to
the 12th parameter, e.g.
"AND,FIRST_NAME,'JOHN',LAST_NAME,'SMITH',DATE,'2012'"
so only 2012 records will be evaluated, or even better:
"AND,FIRST_NAME,'JOHN',LAST_NAME,'SMITH',DATE,'201208',DATE,'201209'"
This makes sure that FRED will only evaluate the records
with August 2012 or September 2012 in the DATE field.
Depending on the case, this can sometimes make the 2nd
parameter FRED expression unnecessary.
The 2nd parameter FRED expression may include references to
any fields in the database, including, of course, fields that
are not referenced in the 12th parameter:
@and(DATE >= {2012/08/26}, DATE <= {2012/09/29}, PRODUCT = "FWSDK", SERIAL_NO = "48731")
Note that the 12th parameter does not "evaluate" the fields by
value; instead it scans the field content's ASCII characters,
which is faster. If we were to use the 12th parameter to look
for the product and serial number, the syntax would be:
"AND,FIRST_NAME,'JOHN',LAST_NAME,'SMITH',DATE,'201208',DATE,'201209',PRODUCT,'FWSDK',SERIAL_NO,'48731'"
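The difference between scanning characters and evaluating
values can be illustrated with a short sketch (Python here
purely for illustration; the stored DATE text is hypothetical):

```python
# Illustrative contrast, not Framework code: the 12th-parameter scan
# does a raw character match on the stored field text, while the
# 2nd-parameter FRED expression compares typed values.
from datetime import date

stored = "20120826"            # a DATE field as text in the .DBF

# Scan-style test: a plain substring match on the field characters.
scan_hit = "201208" in stored

# Evaluation-style test: convert to a real date, compare by value.
value = date(int(stored[:4]), int(stored[4:6]), int(stored[6:8]))
eval_hit = date(2012, 8, 26) <= value <= date(2012, 9, 29)
```

Both tests accept this record, but the scan never pays the cost
of converting the text to a date, which is why it is faster.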
The single quotes are a new addition to the FRED language
syntax; depending on the specific function, they may signify
literal strings that will not be evaluated. In regular strings
single quotes are considered normal text characters, but in
specific database functions, such as the SQL interface
functions in FW10, single quotes may be used to prevent
malicious code injection.
And now to the main course: database normalization. As you
are about to see, it can provide a considerable beneficial
capability. Normalization is achieved when the database fields
and tables that contain the data of, say, a company, a project,
or an accounting system are structured as related tables linked
by unique keys and do not contain redundant information. Why
is that good? Because it prevents operational errors. It does
not merely help to prevent errors; it actually eliminates the
possibility of the most common errors in data management, such
as double billing, duplicate payments, multiple shipments, and
the loss of records. How is it possible that just following a
particular database structure, with no additional code, can do
that? Read on to find out.
Consider a database system that keeps records of customers,
invoices, and shipping bills. To achieve normalization we
store the data in three corresponding tables with no data
redundancies. This means that the customers' names are not
stored in the invoice and shipping bill tables. Instead, a
unique customer ID is given to each customer when a customer
record is created. The unique ID number is recorded in the
customer table (a separate .DBF database file in this case).
It is also entered into the invoices table to indicate to which
customer each invoice record was issued. When we display an
invoice we in fact build the invoice form from fields in two
separate files, the customer .DBF file and the invoices .DBF
file. We can have as many invoices as we need per customer, or
none at all, but a customer name is entered only once
regardless of how many invoices were created for that customer.
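The structure just described can be sketched with hypothetical
in-memory tables (Python dictionaries standing in for the .DBF
files; the IDs, names, and amounts are made up):

```python
# Hypothetical normalized tables: the customer name is stored once,
# and invoices refer to it only through the unique customer ID.
customers = {
    1001: {"FIRST_NAME": "John", "LAST_NAME": "Smith"},
}
invoices = {
    "INV-5001": {"CUSTOMER_ID": 1001, "AMOUNT": 250.00},
    "INV-5002": {"CUSTOMER_ID": 1001, "AMOUNT": 975.00},
}

def invoice_form(invoice_no):
    """Build an invoice view from two tables joined on the customer ID."""
    inv = invoices[invoice_no]
    cust = customers[inv["CUSTOMER_ID"]]
    return {"invoice": invoice_no,
            "customer": f"{cust['FIRST_NAME']} {cust['LAST_NAME']}",
            "amount": inv["AMOUNT"]}
```

Renaming the customer means changing one record; every invoice
view picks up the change automatically through the key.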
The same type of relation, a one-to-many relation, also
describes the relation between the invoice table (the
INVOICES.DBF file) and the shipping bills table (the
SHIPPING.DBF file). We can have many shipping bills per single
invoice. The invoice information is not recorded in the
shipping bill database. Instead, a unique invoice number that
is assigned and recorded when a new invoice is created is used
to associate a single invoice with possibly many shipping
bills.
The unique customer and invoice numbers are referred to in
database terminology as "keys," but a more appropriate name for
them would have been simply "unique IDs." As a rule they
should appear in any printed or electronic reference to the
data as a "customer number," an "invoice number," and a
"shipping bill number" to allow the unmistakable identification
of stored data and to allow or disallow related actions. Code
that performs specific actions such as payments and deliveries
should check the tables to see if those actions should be
allowed or refused. How well modern software follows that rule
is quite questionable.
As I am writing these lines I am looking at a check issued
by a major university as payment for one of our products. The
check amount and our invoice number, printed on its attached
page, are the same as those associated with a wire transfer
which arrived at our bank a few weeks ago. It is a duplicate
payment. It was probably issued automatically after the
product's arrival was reported and the arrival date was entered
into the system, even though a wire transfer for the same
invoice had already been sent. Normalization can effectively
be used to prevent the possibility of such operational errors.
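The duplicate-payment scenario suggests the kind of check a
normalized system makes trivial: before recording a payment,
look up the unique invoice number in the payments table. This
is a hypothetical sketch, not code from any actual accounting
system:

```python
# Hypothetical guard: with payments keyed by the unique invoice
# number, a second payment against the same invoice is refused.
payments = {}  # invoice number -> payment record

def record_payment(invoice_no, amount, method):
    if invoice_no in payments:
        raise ValueError(f"invoice {invoice_no} already paid "
                         f"by {payments[invoice_no]['method']}")
    payments[invoice_no] = {"amount": amount, "method": method}

record_payment("INV-5001", 975.00, "wire transfer")
# A later attempt to pay the same invoice by check is refused:
try:
    record_payment("INV-5001", 975.00, "check")
except ValueError as duplicate:
    refused = str(duplicate)
```

No comparison of names, amounts, or dates is needed; the unique
key alone rules out the double payment.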
Now to the actual implementation of a form that presents
data from, and allows the updating of, multiple DBF files. The
function @execute can be used to insert and execute formulas in
frames as well as in linked databases. We will need to prepare
a frame containing the three linked databases with a FRED
program, possibly in the container or in some frames inside it,
that may call a few @inputline-based functions such as
@inputdate or @inputnumber to gather information from a user,
verify it, and build the linked database formulas and execute
them.
The customer ID, an invoice number, or a shipping bill
number, for example, can be used to build the appropriate
formulas, which @execute then runs in the linked databases.
Additional code should then be used to verify the related data
and allow or refuse related operations. If needed while
interacting with a customer, the 12th parameter makes it easy
to find a customer number by entering the whole or a part of
the customer name. There are no practical limits on the number
of linked databases, nor on how many linked databases can be
linked to a single DBF file, nor on the fields that can be in a
linked database frame. The first field name in a linked frame
that matches an existing field in the linked DBF file is used
as the data source. To save time, however, ask us for sample
programs and do not hesitate to ask for suggestions and for
solutions to your specific needs.
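The field-matching rule mentioned above, where the first frame
field whose name matches a field in the linked .DBF file
becomes the data source, can be sketched roughly as follows
(an illustrative sketch with hypothetical field names, not
Framework's actual lookup code):

```python
# Rough sketch of the field-matching rule: pick the first field
# name in the linked frame that also exists in the linked .DBF.
def data_source(frame_fields, dbf_fields):
    dbf = {f.upper() for f in dbf_fields}   # field names match case-insensitively
    for name in frame_fields:
        if name.upper() in dbf:
            return name
    return None  # no frame field matches the .DBF

source = data_source(["Title", "CUSTOMER_ID", "Notes"],
                     ["CUSTOMER_ID", "FIRST_NAME", "LAST_NAME"])
# Here "Title" and "Notes" have no match, so CUSTOMER_ID is chosen.
```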