Downloading and Compiling Tips
gen lit|tax|seq [options]
gen lit|tax|seq -help       For more detailed list of options

lit: large (frequent) itemsets without taxonomies
Attribute     Value
~~~~~~~~~     ~~~~~
Salary        uniformly distributed from 20000 to 150000
Commission    if Salary >= 75000, Commission = 0
              else uniformly distributed from 10000 to 75000
Age           uniformly distributed from 20 to 80
Education     uniformly chosen from 0 to 4
Car           make of the car, uniformly chosen from 1 to 20
ZipCode       uniformly chosen from 9 available zipcodes
HouseValue    uniformly distributed from 0.5*k*100000 to 1.5*k*100000,
              where 0 <= k <= 9 and depends on the ZipCode
YearsOwned    uniformly distributed from 1 to 30
Loan          uniformly distributed from 0 to 500000

The attributes Education, Car, and ZipCode are categorical; the rest are numeric. The attribute values are randomly generated. There is also a derived attribute, called Equity, defined as follows:
if YearsOwned < 20
    Equity = 0
else
    Equity = 0.1 * ( YearsOwned - 20 )

We developed a series of classification functions of increasing complexity that used the above attributes to classify people into different groups. Tuples in the training set were assigned the group label by first generating the tuple and then applying the classification function on the tuple to determine the group to which the tuple belongs.
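The generation and labeling steps described above can be sketched in Python. This is only an illustration, not the Quest generator's code. The names generate_tuple and label_tuples are hypothetical. The sketch also makes assumptions the text does not state: the 9 zipcodes are encoded as 1..9, k is taken to be equal to that zipcode index, and Age and YearsOwned are drawn as integers while the other numeric attributes are drawn as reals.

```python
import random

def generate_tuple(rng):
    """Draw one tuple following the attribute table (hypothetical helper)."""
    salary = rng.uniform(20000, 150000)
    # Commission is 0 for salaries of 75000 or more, otherwise uniform.
    commission = 0.0 if salary >= 75000 else rng.uniform(10000, 75000)
    zipcode = rng.randint(1, 9)      # ASSUMPTION: 9 zipcodes encoded as 1..9
    k = zipcode                      # ASSUMPTION: k equals the zipcode index
    years_owned = rng.randint(1, 30) # ASSUMPTION: integer number of years
    # Derived attribute, exactly as defined in the text above.
    equity = 0.0 if years_owned < 20 else 0.1 * (years_owned - 20)
    return {
        "Salary": salary,
        "Commission": commission,
        "Age": rng.randint(20, 80),  # ASSUMPTION: integer age
        "Education": rng.randint(0, 4),
        "Car": rng.randint(1, 20),
        "ZipCode": zipcode,
        "HouseValue": rng.uniform(0.5 * k * 100000, 1.5 * k * 100000),
        "YearsOwned": years_owned,
        "Loan": rng.uniform(0, 500000),
        "Equity": equity,
    }

def label_tuples(tuples, classify):
    """Generate-then-label: apply a caller-supplied classification function."""
    return [(t, classify(t)) for t in tuples]
```

A caller would pass a seeded random.Random instance to generate_tuple for reproducible data, and supply its own classification function to label_tuples. The actual classification functions are not listed here, so none is assumed.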
It is rarely the case that the boundaries between the groups are very sharp. To model fuzzy boundaries, the data generation program takes a perturbation factor p as an additional argument. After determining the values of the different attributes of a tuple and assigning it a group label, the program perturbs the values of the non-categorical attributes. If the value of an attribute A for a tuple t is v and the range of values of A is a, then the value of A for t after perturbation becomes v + r*p*a, where r is a uniform random variable between -0.5 and +0.5.
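The perturbation formula v + r*p*a can be illustrated with a short Python sketch. The function name perturb is hypothetical. The sketch assumes that "range of values" means the width of the attribute's interval (for Salary, 150000 - 20000 = 130000), and it does not clamp the result back into the interval, since the text does not say whether the generator does so.

```python
import random

def perturb(v, a, p, rng):
    """Return v + r*p*a with r uniform in [-0.5, +0.5] (hypothetical helper).

    v: original value of a numeric attribute
    a: ASSUMPTION: width of the attribute's range (max - min)
    p: perturbation factor
    """
    r = rng.uniform(-0.5, 0.5)
    return v + r * p * a
```

Under these assumptions, a Salary of 50000 with p = 0.05 moves by at most 0.5 * 0.05 * 130000 = 3250 in either direction, and p = 0 leaves the value unchanged.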
pred [options]
pred -help       For more detailed list of options