Dict.CN 在线词典, 英语学习, 在线翻译

sunnywang

smile forever.

 

三大统计软件:SAS、Stata与SPSS比较(转载)

三大统计软件:SAS、Stata与SPSS比较(转载)
2006-08-07 11:22

Strategically using General Purpose Statistics Packages:
A Look at Stata, SAS and SPSS

中文版(自英文版本翻译):
很多人曾问及SAS,Stata 和SPSS之间的不同,它们之中哪个是最好的。可以想到,每个软件都有自己独特的风格,有自己的优缺点。本文对此做了概述,但并不是一个综合的比较。人们时常会对自己所使用的统计软件有特别的偏好,希望大多数人都能认同这是对这些软件真实而公允的一个对比分析。

  SAS
  一般用法。SAS由于其功能强大而且可以编程,很受高级用户的欢迎。也正是基于此,它是最难掌握的软件之一。使用SAS时,你需要编写SAS程序来处理数据,进行分析。如果在一个程序中出现一个错误,找到并改正这个错误将是困难的。
  数据管理。在数据管理方面,SAS是非常强大的,能让你用任何可能的方式来处理你的数据。它包含SQL(结构化查询语言)过程,可以在SAS数据集中使用SQL查询。但是要学习并掌握SAS软件的数据管理需要很长的时间,在Stata或SPSS中,完成许多复杂数据管理工作所使用的命令要简单的多。然而,SAS可以同时处理多个数据文件,使这项工作变得容易。它可以处理的变量能够达到32,768个,以及你的硬盘空间所允许的最大数量的记录条数。
  统计分析。SAS能够进行大多数统计分析(回归分析,logistic回归,生存分析,方差分析,因子分析,多变量分析)。SAS的最优之处可能在于它的方差分析,混合模型分析和多变量分析,而它的劣势主要是有序和多元logistic回归(因为这些命令很难),以及稳健方法(它难以完成稳健回归和其他稳健方法)。尽管支持调查数据的分析,但与Stata比较仍然是相当有限的。
  绘图功能。在所有的统计软件中,SAS有最强大的绘图工具,由SAS/Graph模块提供。然而,SAS/Graph模块的学习也是非常专业而复杂,图形的制作主要使用程序语言。SAS 8虽然可以通过点击鼠标来交互式的绘图,但不象SPSS那样简单。
  总结。SAS适合高级用户使用。它的学习过程是艰苦的,最初的阶段会使人灰心丧气。然而它还是以强大的数据管理和同时处理大批数据文件的功能,得到高级用户的青睐。

  Stata
  一般用法。Stata以其简单易懂和功能强大受到初学者和高级用户的普遍欢迎。使用时可以每次只输入一个命令(适合初学者),也可以通过一个Stata程序一次输入多个命令(适合高级用户)。这样的话,即使发生错误,也较容易找出并加以修改。
  数据管理。尽管Stata的数据管理能力没有SAS那么强大,它仍然有很多功能较强且简单的数据管理命令,能够让复杂的操作变得容易。Stata主要用于每次对一个数据文件进行操作,难以同时处理多个文件。随着Stata/SE的推出,现在一个Stata数据文件中的变量可以达到32,768,但是当一个数据文件超越计算机内存所允许的范围时,你可能无法分析它。
  统计分析。Stata也能够进行大多数统计分析(回归分析,logistic回归,生存分析,方差分析,因子分析,以及一些多变量分析)。Stata最大的优势可能在于回归分析(它包含易于使用的回归分析特征工具),logistic回归(附加有解释logistic回归结果的程序,易用于有序和多元logistic回归)。Stata也有一系列很好的稳健方法,包括稳健回归,稳健标准误的回归,以及其他包含稳健标准误估计的命令。此外,在调查数据分析领域,Stata有着明显优势,能提供回归分析,logistic回归,泊松回归,概率回归等的调查数据分析。它的不足之处在于方差分析和传统的多变量方法(多变量方差分析,判别分析等)。
  绘图功能。正如SPSS,Stata能提供一些命令或鼠标点击的交互界面来绘图。与SPSS不同的是它没有图形编辑器。在三种软件中,它的绘图命令的句法是最简单的,功能却最强大。图形质量也很好,可以达到出版的要求。另外,这些图形很好的发挥了补充统计分析的功能,例如,许多命令可以简化回归判别过程中散点图的制作。
  总结。Stata较好地实现了使用简便和功能强大两者的结合。尽管其简单易学,它在数据管理和许多前沿统计方法中的功能还是非常强大的。用户可以很容易的下载到别人已有的程序,也可以自己去编写,并使之与Stata紧密结合。

  SPSS
  一般用法。SPSS非常容易使用,故最为初学者所接受。它有一个可以点击的交互界面,能够使用下拉菜单来选择所需要执行的命令。它也有一个通过拷贝和粘贴的方法来学习其“句法”语言,但是这些句法通常非常复杂而且不是很直观。
  数据管理。SPSS有一个类似于Excel的界面友好的数据编辑器,可以用来输入和定义数据(缺失值,数值标签等等)。它不是功能很强的数据管理工具(尽管SPS 11版增加了一些增大数据文件的命令,其效果有限)。SPSS也主要用于对一个文件进行操作,难以胜任同时处理多个文件。它的数据文件有4096个变量,记录的数量则是由你的磁盘空间来限定。
  统计分析。SPSS也能够进行大多数统计分析(回归分析,logistic回归,生存分析,方差分析,因子分析,多变量分析)。它的优势在于方差分析(SPSS能完成多种特殊效应的检验)和多变量分析(多元方差分析,因子分析,判别分析等),SPSS11.5版还新增了混合模型分析的功能。其缺点是没有稳健方法(无法完成稳健回归或得到稳健标准误),缺乏调查数据分析(SPSS12版增加了完成部分过程的模块)。
  绘图功能。SPSS绘图的交互界面非常简单,一旦你绘出图形,你可以根据需要通过点击来修改。这种图形质量极佳,还能粘贴到其他文件中(Word 文档或Powerpoint等)。SPSS也有用于绘图的编程语句,但是无法产生交互界面作图的一些效果。这种语句比Stata语句难,但比SAS语句简单(功能稍逊)。
  总结。SPSS致力于简便易行(其口号是“真正统计,确实简单”),并且取得了成功。但是如果你是高级用户,随着时间推移你会对它丧失兴趣。SPSS是制图方面的强手,由于缺少稳健和调查的方法,处理前沿的统计过程是其弱项。

  总体评价
  每个软件都有其独到之处,也难免有其软肋所在。总的来说,SAS,Stata和SPSS是能够用于多种统计分析的一组工具。通过Stat/Transfer可以在数秒或数分钟内实现不同数据文件的转换。因此,可以根据你所处理问题的性质来选择不同的软件。举例来说,如果你想通过混合模型来进行分析,你可以选择SAS;进行logistic回归则选择Stata;若是要进行方差分析,最佳的选择当然是SPSS。假如你经常从事统计分析,强烈建议您把上述软件收集到你的工具包以便于数据处理。


English Version:
SAS

General use. SAS is a package that many "power users" like because of its power and programmability. Because SAS is such a powerful package, it is also one of the most difficult to learn. To use SAS, you write SAS programs that manipulate your data and perform your data analyses. If you make a mistake in a SAS program, it can be hard to see where the error occurred or how to correct it.
Data Management. SAS is very powerful in the area of data management, allowing you to manipulate your data in just about any way possible. SAS includes proc sql that allows you to perform sql queries on your SAS data files. However, it can take a long time to learn and understand data management in SAS and many complex data management tasks can be done using simpler commands in Stata or SPSS. However, SAS can work with many data files at once easing tasks that involve working with multiple files at once. SAS can handle enormous data files up to 32,768 variables and the number of records is generally limited to the size of your hard disk.
Statistical Analysis. SAS performs most general statistical analyses (regression, logistic regression, survival analysis, analysis of variance, factor analysis, multivariate analysis). The greatest strengths of SAS are probably in its ANOVA, mixed model analysis and multivariate analysis, while it is probably weakest in ordinal and multinomial logistic regression (because these commands are especially difficult), robust methods (it is difficult to perform robust regression, or other kinds of robust methods). While there is some support for the analysis of survey data, it is quite limited as compared to Stata.
Graphics. SAS may have the most powerful graphic tools among all of the packages via SAS/Graph. However, SAS/Graph is also very technical and tricky to learn. The graphs are created largely using syntax language; however, SAS 8 does have a point and click interface for creating graphs but it is not as easy to use as SPSS.
Summary. SAS is a package geared towards power users. It has a steep learning curve and can be frustrating at first. However, power users enjoy the its powerful data management and ability to work with numerous data files at once.


Stata

General Use. Stata is a package that many beginners and power users like because it is both easy to learn and yet very powerful. Stata uses one line commands which can be entered one command at a time (a mode favored by beginners) or can be entered many at a time in a Stata program (a mode favored by power users). Even if you make a mistake in a Stata command, it is often easy to diagnose and correct the error.
Data Management. While the data management capabilities of Stata may not be quite as extensive as those of SAS, Stata has numerous powerful yet very simple data management commands that allows you to perform complex manipulations of your data with ease. However, Stata primarily works with one data file at a time so tasks that involve working with multiple files at once can be cumbersome. With the release of Stata/SE, you can now have up to 32,768 variables in a Stata data file but probably would not want to analyze a data file that exceeds the size of your computers memory.
Statistical Analysis . Stata performs most general statistical analyses (regression, logistic regression, survival analysis, analysis of variance, factor analysis, and some multivariate analysis). The greatest strengths of Stata are probably in regression (it has very easy to use regression diagnostic tools), logistic regression, (add on programs are available that greatly simplify the interpretation of logistic regression results, and ordinal logistic and multinomial logistic regressions are very easy to perform). Stata also has a very nice array of robust methods that are very easy to use, including robust regression, regression with robust standard errors, and many other estimation commands include robust standard errors as well. Stata also excels in the area of survey data analysis offering the ability to analyze survey data for regression, logistic regression, poisson regression, probit regression, etc...). The greatest weaknesses in this area would probably be in the area of analysis of variance and traditional mutivariate methods (e.g. manova, discriminant analysis, etc.).
Graphics. Like SPSS, Stata graphics can be created using Stata commands or using a point and click interface. Unlike SPSS, the graphs cannot be edited using a graph editor. The syntax of the graph commands is the easiest of the three packages and is also the most powerful. Stata graphs are high quality, publication quality graphs. In addition, Stata graphics are very functional for supplementing statistical analysis, for example there are numerous commands that simplify the creation of plots for regression diagnostics.
Summary. Stata offers a good combination of ease of use and power. While Stata is easy to learn, it also has very powerful tools for data management, many cutting edge statistical procedures, the ability to easily download programs developed by other users and the ability to create your own Stata programs that seamlessly become part of Stata.

SPSS

General use. SPSS is a package that many beginners enjoy because it is very easy to use. SPSS has a "point and click" interface that allows you to use pulldown menus to select commands that you wish to perform. SPSS does have a "syntax" language which you can learn by "pasting" the syntax from the point and click menus, but the syntax that is pasted is generally overly complicated and often unintuitive.
Data Management. SPSS has a friendly data editor that resembles Excel that allows you to enter your data and attributes of your data (missing values, value labels, etc.) However, SPSS does not have very strong data management tools (although SPSS version 11 added commands for reshaping data files from "wide" format to "long" format, and vice versa). SPSS primarily edits one data file at a time and is not very strong for tasks that involve working with multiple data files at once. SPSS data files can have 4096 variables and the number of records is limited only by your disk space.
Statistical Analysis. SPSS performs most general statistical analyses (regression, logistic regression, survival analysis, analysis of variance, factor analysis, and multivariate analysis). The greatest strengths of SPSS are in the area of analysis of variance (SPSS allows you to perform many kinds of tests of specific effects) and multivariate analysis (e.g. manova, factor analysis, discriminant analysis) and SPSS 11 has added some capabilities for analyzing mixed models. The greatest weakness of SPSS are probably in the absence of robust methods (we know of no abilities to perform robust regression or to obtain robust standard errors), the absence of survey data analysis (we know of no tools in this area).
Graphics. SPSS has a very simple point and click interface for creating graphs and once you create graphs they can be extensively customized via its point and click interface. The graphs are very high quality and can be pasted into other documents (e.g. word documents or powerpoint). SPSS does have a syntax language for creating graphs but many of the features in the point and click interface are not available via the syntax language. The syntax language is more complicated than the language provided by Stata, but probably simpler (but less powerful) than the SAS language.
Summary. SPSS focuses on ease of use (their motto is "real stats, real easy", and it succeeds in this area. But if you intend to use SPSS as a power user, you may outgrow it over time. SPSS is strong in the area of graphics, but weak in more cutting edge statistical procedures lacking in robust methods and survey methods.

Overall Summary

Each package offers its own unique strengths and weaknesses. As a whole, SAS, Stata and SPSS form a set of tools that can be used for a wide variety of statistical analyses. With Stat/Transfer it is very easy to convert data files from one package to another in just a matter of seconds or minutes. Therefore, there can be quite an advantage to switching from one analysis package to another depending on the nature of your problem. For example, if you were performing analyses using mixed models you might choose SAS, but if you were doing logistic regression you might choose Stata, and if you were doing analysis of variance you might choose SPSS. If you are frequently performing statistical analyses, we would strongly urge you to consider making each one of these packages part of your toolkit for data analysis.

http://www.biostat.cn/forum/viewthread.php?tid=92&extra=page%3D1

posted on 2012-06-21 11:22 sunnywang 阅读(713) 评论(0)  编辑 收藏 引用 所属分类: accumulation of knowledge

只有注册用户登录后才能发表评论。

导航

统计

公告

silence is gold.

常用链接

留言簿(3)

随笔分类

相册

收藏夹

BI

english

friends' blog

knowledge

PM

tools

universities

搜索

最新评论