The easiest way to get your .csv (comma-separated values) text files into Stata is this:

insheet using myfile.csv, comma names

This assumes that your .csv file is well-behaved: names are on the first row, and any commas really are field separators.
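(If you are on Stata 13 or newer, note that insheet has since been superseded by import delimited; a rough equivalent of the command above would be:)

```stata
import delimited using myfile.csv, varnames(1) clear
```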

This is sometimes not the case. Recently I got a set of .csv files that were likely exported from Microsoft Excel and had four main problems. The first was that the first row was empty except for the first cell, which had some gibberish in it; the variable names were on the second row. Second, some of these names were things like "$100 & UP". Third, some of the files had an empty last variable, as if MS Excel had tacked an extra spreadsheet column onto the .csv file. Finally, all the variables entered Stata as strings, because Stata, seeing alphanumeric characters all over the second row, reasonably assumed that there were no numeric variables in this file.
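A hypothetical fragment of such a file (invented here purely for illustration) shows all four problems at once: gibberish in the first cell of row one, names on row two, a dollar-sign name, and a trailing empty column:

```
#JUNK,,,
STORE,"$100 & UP",REGION,
Alpha,12,North,
Beta,7,South,
```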

So I wrote a program that would fix these four problems in any set of .csv files that shared any of them, regardless of the actual number of variables, which were string and which were numeric, and what crazy characters were in the original variable names.

It went like this:

capture prog drop fixVarlist
prog def fixVarlist

// get rid of last var if empty
unab oldvars: _all
quietly describe
local varnum=r(k)
local last: word `varnum' of `oldvars'
capture confirm byte var `last'
if _rc==0 {
  quietly count if !missing(`last')
  if r(N)==0 {
    drop `last'
    unab oldvars: _all
    quietly describe
    local varnum=r(k)
  }
}
// collect names of vars from first row and fix them as needed
forvalues i=1/`varnum' {
  local oldvar: word `i' of `oldvars'
  // capture value from first row
  local newvar=`oldvar'[1]
  // turn $ signs to USD
  local newvar=regexr("`newvar'","^\\$","USD")
  // clean up blank spaces
  while regexm("`newvar'"," ") {
    local newvar=regexr("`newvar'"," ","")
  }
  // some more cleanup
  // -- these changes apply to only a subset of the vars
  if regexm("`newvar'","&UP") {
    local newvar=regexr("`newvar'","&UP","andUp")
  }
  // -- these apply to all vars
  local newvar=regexr("`newvar'","w/o","Without")
  local newvar=regexr("`newvar'","w/","With")
  // collect clean var names into a list
  local newvars `newvars' `newvar'
}
di "there are `varnum' valid variables in this file"
di "they are: `newvars'"
// now rename vars
drop in 1
local newvarnum: list sizeof newvars
forvalues i=1/`newvarnum' {
  local oldvar: word `i' of `oldvars'
  local newvar: word `i' of `newvars'
  rename `oldvar' `newvar'
}
// now de-string any numeric vars
unab newvars: _all
local newvarnum: list sizeof newvars
quietly count
local check=r(N)
forvalues i=1/`newvarnum' {
  local newvar: word `i' of `newvars'
  // use a tempvar so we never clash with an existing variable named x
  tempvar x
  quietly gen `x'=real(`newvar')
  quietly count if missing(`x')
  if r(N)==`check' {
    di "`newvar' is not numeric"
  }
  else {
    destring `newvar', replace force
  }
  drop `x'
}
end

You need to fix blank spaces inside that while loop because Stata's regexr() function replaces only the first blank space it finds and leaves the rest alone. You check whether any blank spaces are left to fix with regexm(), which returns 1 if there are and 0 otherwise.
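Here is a minimal sketch of that loop on a throwaway string (the value is made up):

```stata
local s "NEW YORK CITY"
while regexm("`s'", " ") {
    local s = regexr("`s'", " ", "")
}
display "`s'"   // displays NEWYORKCITY
```

If you don't care about regular expressions here, subinstr("`s'", " ", "", .) does the same job in one line; the final . tells subinstr() to replace all occurrences.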

Also, you need to escape the dollar sign, twice as it turns out, because in regular expressions $ by itself is the end-of-string marker. If you didn't escape it with a backslash (OK, two in this case) then regexr() would tack the string "USD" as a suffix onto all your variable names. Oh, and there's the ^ sign: in regular expressions it's the beginning-of-string character. Essentially, ^\\$ means "look for dollar signs at the beginning of this string".
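You can see the escaping at work in this small sketch. The dollar sign is built with char(36) (the ASCII code for $) so that it lives only in the data, never in the command line, just as it does when the program reads it out of the first row:

```stata
clear
quietly set obs 1
generate str12 raw = char(36) + "100&UP"    // raw now holds "$100&UP"
generate str12 fixed = regexr(raw, "^\\$", "USD")
assert fixed == "USD100&UP"
```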

Other than that, there's nothing exotic about this program. Its nice feature is that it keeps track of how many variables you have (by using the unab command to read in the unabbreviated variable list, and by appending each cleaned name to the local `newvars', one loop iteration at a time) and it does its own counting. Features like these aren't mandatory, but they make your code more reusable.
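For instance, unab plus the extended macro function `: list sizeof` gives you the variable count without hard-coding anything; a quick sketch on the shipped auto dataset (used here only as a stand-in):

```stata
sysuse auto, clear
unab allvars : _all
local n : list sizeof allvars
display "`n' variables: `allvars'"
```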