sysuse auto, clear
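browse                          // browse the data
list make price mpg in 1/20     // list the first 20 observations
describe                        // describe the data
describe, detail
d make price mpg
summarize                       // basic summary statistics
summarize, detail
sum make price mpg              // make is a string variable, so it reports 0 observations
codebook                        // detailed information on each variable
inspect                         // summarize the properties of each variable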
                                                       Closest to
Storage                                                0 without
type                   Minimum               Maximum   being 0     bytes
----------------------------------------------------------------------
byte                      -127                   100   +/-1        1
int                    -32,767                32,740   +/-1        2
long            -2,147,483,647         2,147,483,620   +/-1        4
float     -1.70141173319*10^38   1.70141173319*10^38   +/-10^-38   4
double    -8.9884656743*10^307   8.9884656743*10^307   +/-10^-323  8
----------------------------------------------------------------------
Precision for float is 3.795x10^-8.
Precision for double is 1.414x10^-16.
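The precision figures matter in practice: a float cannot hold 0.1 exactly, so comparing a float variable with a literal (which Stata evaluates in double precision) fails. A minimal sketch; the variable names x_f and x_d are illustrative and not part of the original notes.

```stata
clear
set obs 1
generate float  x_f = 0.1      // stored with float precision
generate double x_d = 0.1      // stored with double precision
count if x_f == 0.1            // 0: the literal 0.1 is evaluated as a double
count if x_f == float(0.1)     // 1: round the literal to float before comparing
count if x_d == 0.1            // 1
```

Wrapping the literal in float() is the usual workaround when the variable has to stay a float.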
String
storage       Maximum
type           length         Bytes
-----------------------------------------
str1                1             1
str2                2             2
 ...                .             .
 ...                .             .
 ...                .             .
str2045          2045          2045
strL       2000000000    2000000000
-----------------------------------------
HRF = human readable form; SIF = Stata internal form.
Date type           Examples of HRFs
--------------------------------------------
datetime            20jan2010 09:15:22.120
date                20jan2010, 20/01/2010, ...
weekly date         2010w3
monthly date        2010m1
quarterly date      2010q1
half-yearly date    2010h1
yearly date         2010
--------------------------------------------
SIF type            Examples in SIF      Units
-----------------------------------------------------------------
datetime/c          1,579,598,122,120    milliseconds since
                                         01jan1960 00:00:00.000,
                                         assuming 86,400 s/day
datetime/C          1,579,598,146,120    milliseconds since
                                         01jan1960 00:00:00.000,
                                         adjusted for leap seconds*
date                18,282               days since 01jan1960
                                         (01jan1960 = 0)
weekly date         2,601                weeks since 1960w1
monthly date        600                  months since 1960m1
quarterly date      200                  quarters since 1960q1
half-yearly date    100                  half-years since 1960h1
yearly date         2010                 years since 0000
-----------------------------------------------------------------
* SIF datetime/C is equivalent to coordinated universal time (UTC).
  In UTC, leap seconds are periodically inserted because the length of the mean solar day is slowly increasing.
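Several of the SIF examples in the table can be checked directly with the td(), tm(), tq(), th(), and tc() pseudofunctions, which turn a typed date literal into its SIF value. A quick sketch that needs no dataset in memory:

```stata
display td(20jan2010)                        // 18282 days since 01jan1960
display tm(2010m1)                           // 600
display tq(2010q1)                           // 200
display th(2010h1)                           // 100
display %15.0f tc(20jan2010 09:15:22.120)    // 1579598122120 milliseconds
display %td 18282                            // 20jan2010
```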
                    Function to convert
SIF type            HRF to SIF                      Note
--------------------------------------------------------------------
datetime/c          tc = clock(HRFstr, mask)        tc must be double
datetime/C          tC = Clock(HRFstr, mask)        tC must be double
date                td = date(HRFstr, mask)         td may be float or long
weekly date         tw = weekly(HRFstr, mask)       tw may be float or int
monthly date        tm = monthly(HRFstr, mask)      tm may be float or int
quarterly date      tq = quarterly(HRFstr, mask)    tq may be float or int
half-year date      th = halfyearly(HRFstr, mask)   th may be float or int
yearly date         ty = yearly(HRFstr, mask)       ty may be float or int
--------------------------------------------------------------------
Warning: To prevent loss of precision, datetime SIFs must be stored as doubles.
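To see why the warning matters, the same clock() result can be stored once as a double and once as a float. This is only a sketch: mystr, t_double, and t_float are illustrative names, and the string is the 2010.07.12 14:32 example used in these notes.

```stata
clear
set obs 1
generate mystr = "2010.07.12 14:32"
generate double t_double = clock(mystr, "YMDhm")   // exact number of milliseconds
generate float  t_float  = clock(mystr, "YMDhm")   // rounded to float precision
format t_double t_float %tc
list t_double t_float                              // the float copy no longer shows 14:32:00
```

A float keeps only about seven significant digits, while a datetime SIF in 2010 has thirteen, so the float copy drifts away from the true time.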
* Each line below is an alternative: choose the mask that matches the layout of mystr
gen double eventtime = clock(mystr, "YMDhm")
gen double eventtime = clock(mystr, "YMDhms")
gen double eventtime = clock(mystr, "MD20Yhm")
* "#" in the mask tells clock() to skip an extra element of the string
gen double eventtime = clock(mystr, "hm#MDY")
gen eventdate = date(mystr, "DMY")
gen double eventtime = clock(mystr, "DMY")
Example: 2010.07.12 14:32. Once the SIF value is given an HRF display format, it is shown in blue as 2010.07.12 14:32, much as if a value label had been attached.
                    Display format to
SIF type            present SIF in HRF
-----------------------------------
datetime/c          %tc
datetime/C          %tC
date                %td
weekly date         %tw
monthly date        %tm
quarterly date      %tq
half-yearly date    %th
yearly date         %ty
-----------------------------------
gen double eventtime = clock(mystr, "YMDhm")
format eventtime %tc
gen eventdate = date(mystr, "DMY")
format eventdate %td
format eventdate %tdCY.N.D
format eventdate %tdCY/N/D
format eventdate %tdCY-M-D
format eventdate %tdCY_M_D      // "_" displays a space
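A display format can be tried out without a dataset by passing it to display or string(). The sketch below uses the long-form codes CCYY, NN, and DD (four-digit year, two-digit month, two-digit day); whether they print exactly the same as the short codes used in the notes above is an assumption I have not checked here.

```stata
local d = td(20jan2010)
display %td `d'                          // 20jan2010 (default %td format)
display %tdCCYY.NN.DD `d'                // 2010.01.20
display string(`d', "%tdCCYY/NN/DD")     // 2010/01/20, returned as a string
```

A format changes only how the number is shown; the stored SIF value stays 18282.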
            |                        To:
From:       | datetime/c     datetime/C     date
------------+------------------------------------------
datetime/c  |                tC=Cofc(tc)    td=dofc(tc)
datetime/C  | tc=cofC(tC)                   td=dofC(tC)
date        | tc=cofd(td)    tC=Cofd(td)
weekly      |                               td=dofw(tw)
monthly     |                               td=dofm(tm)
quarterly   |                               td=dofq(tq)
half-yearly |                               td=dofh(th)
yearly      |                               td=dofy(ty)
-------------------------------------------------------
            |                        To:
From:       | weekly         monthly        quarterly
------------+------------------------------------------
date        | tw=wofd(td)    tm=mofd(td)    tq=qofd(td)
-------------------------------------------------------
            |                        To:
From:       | half-yearly    yearly
------------+------------------------------------------
date        | th=hofd(td)    ty=yofd(td)
-------------------------------------------------------
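The conversion functions in these tables can be tried on a single date. A short sketch using 20jan2010, the example date from the tables above; the local macro name td is illustrative.

```stata
local td = td(20jan2010)
display %tm mofd(`td')                           // 2010m1
display %tq qofd(`td')                           // 2010q1
display yofd(`td')                               // 2010
display %td dofm(tm(2010m1))                     // 01jan2010: first day of the month
display %td dofc(tc(20jan2010 09:15:22.120))     // 20jan2010: time of day is dropped
```

Going from a coarser unit back to a date always returns the first day of that period.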
Extract the numeric year, month, and day from a date string:
generate double timestamp = date(varname, "DMY")
gen year=year(timestamp)
gen month=month(timestamp)
gen day=day(timestamp)
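compress reduces the memory the data occupy without changing the data.
It does so by converting variables to smaller storage types:
| Original type | Type after conversion |
|---|---|
| double | long, int, or byte |
| float | int or byte |
| long | int or byte |
| int | byte |
| string | shorter string |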
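A small sketch of compress in action. It assumes the auto dataset that ships with Stata; price_d is a hypothetical copy of price, stored deliberately in an oversized type.

```stata
sysuse auto, clear
generate double price_d = price     // hypothetical copy stored as double
describe price_d                    // storage type: double
compress price_d
describe price_d                    // storage type: int (all values are integers below 32,740)
```

The values themselves are untouched; only the storage type shrinks.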
sysuse auto,clear
list gear_ratio in 1/5
d gear_ratio
recast int gear_ratio, force    // force applies the new type even though the decimals are lost
d gear_ratio
list gear_ratio in 1/5
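%[ - | ~ ][ 0 ] w . d [ e | f | g ][ c ]
"%" marks the start of a format specification
"-" left-aligns the output
"0" pads the number with leading zeros
"." is the decimal point, separating the total width w from the number of decimal places d
"e", "f", and "g" select scientific, fixed, or general notation; "c" adds thousands commas
. dis %7.2f 1000.12345
1000.12
. dis %9.2e 1000.12345
 1.00e+03
. dis %8.2fc 1000.12345
1,000.12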
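The pieces of the format syntax can be tried one at a time with display, or with string() when the formatted result is needed as text. A minimal sketch; the numbers are arbitrary.

```stata
display %7.2f  1000.12345               // 1000.12
display %-9.2f 1000.12345               // left-aligned in a field of width 9
display %07.2f 3.14159                  // 0003.14: padded with leading zeros
display %8.2fc 1000.12345               // 1,000.12
display string(1000.12345, "%9.2e")     // 1.00e+03
```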
. sysuse auto.dta, clear
. list price gear in 1/3
+------------------+
| price gear_r~o |
|------------------|
1. | 4,099 3.58 |
2. | 4,749 2.53 |
3. | 3,799 3.08 |
+------------------+
. format price %7.1fc   // keep one decimal place and the thousands comma in price
. format gear %6.3f     // keep three decimal places
. list price gear in 1/3
+--------------------+
| price gear_r~o |
|--------------------|
1. | 4,099.0 3.580 |
2. | 4,749.0 2.530 |
3. | 3,799.0 3.080 |
+--------------------+
. tabstat price mpg rep78 length turn foreign, ///
> stats(n mean sd min max) c(s) f(%9.2f)
    variable |         N      mean        sd       min       max
-------------+--------------------------------------------------
       price |     74.00   6165.26   2949.50   3291.00  15906.00
         mpg |     74.00     21.30      5.79     12.00     41.00
       rep78 |     69.00      3.41      0.99      1.00      5.00
      length |     74.00    187.93     22.27    142.00    233.00
        turn |     74.00     39.65      4.40     31.00     51.00
     foreign |     74.00      0.30      0.46      0.00      1.00
----------------------------------------------------------------
. qui reg price mpg
. est store reg1
. qui reg price mpg rep78
. est store reg2
* Output the estimation results (esttab is part of the community-contributed estout package)
. esttab reg1 reg2, b(%9.2f) se(%7.2f) sfmt(%7.3f) ///
>        star(* 0.1 ** 0.05 *** 0.01) mtitle(reg1 reg2) ///
> scalar(r2 r2_a N F) compress nogap
------------------------------------
                  (1)          (2)
                 reg1         reg2
------------------------------------
mpg           -238.89***   -271.64***
              (53.08)      (57.77)
rep78                       666.96*
                          (342.36)
_cons        11253.06***   9657.75***
            (1170.81)    (1346.54)
------------------------------------
N                  74           69
r2              0.220        0.251
r2_a            0.209        0.228
F              20.258       11.057
------------------------------------
Standard errors in parentheses
* p<0.1, ** p<0.05, *** p<0.01
#delimit ;
twoway (scatter price mpg) (lfit price mpg),
ylabel(, format(%9.1f) angle(0) nogrid)
xlabel(, format(%3.1f))
;
#delimit cr
*Syntax
generate [type] newvar[:lblname] =exp [if] [in] [, before(varname) | after(varname)]
replace oldvar =exp [if] [in] [, nopromote]
*Examples
webuse genxmpl3,clear
generate age2 = age^2
drop age2                       // remove it so the next line can create it again
generate int age2 = age^2       // same variable, stored as int
webuse genxmpl1, clear
replace age2 = age^2
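The nopromote option in the replace syntax is easier to follow once the default behavior is seen: without it, replace widens the storage type when a new value does not fit. A minimal sketch on a made-up three-observation dataset; x is an illustrative variable.

```stata
clear
set obs 3
generate byte x = _n          // byte holds integers from -127 to 100
replace x = 1000 in 1         // too large for byte, so replace promotes x to int
describe x                    // storage type: int
```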
*Syntax
rename old_varname new_varname
*Examples
sysuse auto,clear
d
rename rep78 repair78
d
*Syntax
label data ["label"]
*Examples
sysuse auto,clear
d
label data "1978 automobile price data"
d //note the data label in the output
*Syntax
label variable varname ["label"]
*Examples
sysuse auto,clear
label var price "Car price"
label var foreign "Car origin (0 domestic; 1 foreign)"
*Syntax
* Define value label
label define lblname # "label" [# "label" ...] [, add modify replace nofix]
* Assign value label to variables
label values varlist lblname [, nofix]
* Remove value labels
label values varlist [.]
* List names of value labels
label dir
* List names and contents of value labels
label list [lblname [lblname ...]]
* Copy value label
label copy lblname lblname [, replace]
* Drop value labels
label drop {lblname [lblname ...] | _all}
* Save value labels in do-file
label save [lblname [lblname...]] using filename [, replace]
*Examples
sysuse auto,clear
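* label define lblname
* label values varname lblname /* attach the label to the variable's values */
label define repair 1 "Good" 2 "Fairly good" 3 "Fair" 4 "Fairly poor"
label values rep78 repair
* Display value labels
label dir
label list
label list repair
* Add to and modify the value label
label def repair 5 "Poor", add          // add works only for values that have no label yet
label def repair 3 "Average", modify    // modify changes an existing entry
* Drop the value label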
label drop repair
label list
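Value labels can also be inspected programmatically. The sketch below relies on the value label origin that ships with the auto dataset (0 Domestic, 1 Foreign); origin_txt is a hypothetical new variable created only for illustration.

```stata
sysuse auto, clear
label list origin                       // 0 Domestic, 1 Foreign
display "`: value label foreign'"       // origin: the label attached to foreign
decode foreign, generate(origin_txt)    // copy the label text into a string variable
list foreign origin_txt in 1/3
```

decode is handy when the label text itself is needed, for example in string comparisons.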
Commands in this section:
browse, list, describe, summarize, codebook, inspect
compress, recast, format, display
gen, replace, rename, label