# Dynamic arithmetic expressions

A couple of days ago I showed how you can use logical operators inside arithmetic expressions. The point of that post was that if you had three variables, say _var1, _var2 and _var3, you could build a fourth, _var4, that counts row by row how many of them are non-zero, like so:

gen _var4=(_var1!=0)+(_var2!=0)+(_var3!=0)
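To see why the sum works: in Stata a logical expression such as (_var1!=0) evaluates to 1 when true and 0 when false, so adding three of them gives a count between 0 and 3. Here is a minimal sketch on made-up data (the values and the name _var4_nomiss are mine, purely for illustration). Keep in mind that a missing value also satisfies !=0, so it gets counted as a non-zero instance. The second gen shows one possible way to leave missings out, assuming that is what you want.

```stata
// made-up data: three rows, three measurements
clear
set obs 3
gen _var1 = 0
gen _var2 = 0
gen _var3 = 0
replace _var1 = 2.5 in 1
replace _var2 = 1   in 1
replace _var3 = -4  in 2
replace _var3 = .   in 3   // a missing value, to show how it is treated

// each parenthesis evaluates to 1 if true and 0 if false
gen _var4 = (_var1!=0) + (_var2!=0) + (_var3!=0)

// assumption: you would rather not count missings as non-zero
gen _var4_nomiss = (_var1!=0 & !missing(_var1)) + (_var2!=0 & !missing(_var2)) + (_var3!=0 & !missing(_var3))

list
```

Rows 1 to 3 should show _var4 equal to 2, 1 and 1, and _var4_nomiss equal to 2, 1 and 0.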

This was actually spun off a larger project. I had several data sets that could each have all or any subset of a list of measurements saved as _var1 through _var17. I had to count the non-zero instances row-wise with a formula that worked across all of the data sets, regardless of which of these _var measurements each of them included. I also had to allow for the possibility that some data sets would have no such variables at all, and that some data sets added later would bring in additional measurements, labeled _var18 and onward. Clearly, a static formula like the one above won't do. Here's an alternative:

// 1. build a mock-up file to try this thing out

clear

set obs 4

gen id=_n

gen _var1=0

gen _var2=0

replace _var1=1.2 in 1

replace _var1=1.1 in 2

replace _var2=1.2 in 2

replace _var2=.5 in 3

// 2. build the formula

// 2.1. generate a list of the _var variables

// you have in this particular file, if any.

unab allvars: _all // this collects all the variable names

foreach varname in `allvars' {

if regexm("`varname'","^_var") { // the ^ anchors the match at the start of the name

local myvars `myvars' `varname' // this collects those that start with "_var"

}

}

local myvars_ct: list sizeof myvars // this counts them

// check it out:

di "`myvars'"

di `myvars_ct'

// 2.2. build the actual formula

if `myvars_ct'>0 { // if you have any _var variables, then proceed

forvalues i=1/`myvars_ct' {

local this_var: word `i' of `myvars'

local vars_sum "`vars_sum'(`this_var'!=0)+"

}

// check it out:

di "`vars_sum'" // ok, got an extra + sign.

// 2.3 clean up the formula

local vars_sum=substr("`vars_sum'",1,strlen("`vars_sum'")-1)

// looks good now:

di "`vars_sum'"

// 3. now put the formula to use

gen vars_sum=`vars_sum'

}

else {

di "No _var variables found in this file."

}

This will build your formula for any combination of the _var measurements, including none at all. One piece of code fits all data sets, and it's future-proof too. That's always nice.
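For what it's worth, here is a more compact sketch of the same idea. It is an alternative, not the code above. It assumes the measurement variables can be picked up with the wildcard _var*. It uses capture because unab throws an error when nothing matches, and it seeds the expression with 0 so there is no trailing + to trim. The mock-up data and the names myvars, v and vars_sum are only there for illustration.

```stata
// mock-up data, same shape as in step 1 above
clear
set obs 4
gen id = _n
gen _var1 = 0
gen _var2 = 0
replace _var1 = 1.2 in 1
replace _var1 = 1.1 in 2
replace _var2 = 1.2 in 2
replace _var2 = .5  in 3

// collect the variables whose names start with _var;
// capture keeps the do-file running if there are none
local myvars
capture unab myvars : _var*

if _rc == 0 {
    // start from 0 and append one "+(x!=0)" term per variable
    local vars_sum "0"
    foreach v of local myvars {
        local vars_sum "`vars_sum'+(`v'!=0)"
    }
    // check it out:
    di "`vars_sum'"
    gen vars_sum = `vars_sum'
}
else {
    di "No _var variables found in this file."
}
```

On the mock-up data this should give vars_sum equal to 1, 2, 1 and 0 for the four rows, the same result as the longer version.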