SAS is a specialized data analytics programming language that has been around since 1976. That was 15 years before Python first appeared as a general-purpose programming language in 1991 and 32 years before Pandas was first released in 2008 and transformed Python into an open source data analytics powerhouse. While SAS is still widely respected and used across corporations because of its efficiency and support availability, Python is increasingly becoming the language of choice owing to its open source nature and the rapid development of machine learning and artificial intelligence libraries.

I have worked extensively with both languages and am often asked whether there is an easy guide to help people who know SAS transition to Python. My objective in writing this article is to develop such a guide. We will look at 10 of the most common data analytics tasks and how you do them in SAS and in Python. The procedures and functions that I will be using have a lot more capability than what I will be presenting. However, my objective here is not to go deep into a particular procedure or function but to give a flavor of how the same tasks could be done in both languages.

I will be using the famous Titanic training dataset from Kaggle. You can download and find details about this data here.

You will need both SAS and Python to follow along. Python is open source and therefore freely available to install. SAS is proprietary software but provides a free University Edition for academic and non-commercial use. Once you have access to SAS and Python, the last thing you need is to install pandas for Python. That can be done with ‘pip’ by running the command below.

pip install pandas

Once you have successfully installed pandas, you need to import it into your Python working session. You can do that by running the code below. Notice that I like to use an alias, ‘pd’, while working with Pandas.

import pandas as pd

Let’s get started with the 10 common data analytics tasks and how you do them in SAS and Python.

1. Importing data is usually the first step in any data analytics project. Delimited flat files are often used for moving data around. SAS provides a powerful procedure, PROC IMPORT, to do this. The code below shows how to import the comma-delimited titanic_train.csv file into a SAS dataset called ‘titanic’.

proc import datafile = "&path.titanic_train.csv" out = titanic dbms = csv;
run;

Pandas provides the read_csv() function to do this. Here, I am creating a Pandas DataFrame object by using read_csv() to import the titanic_train file. A DataFrame is the Pandas equivalent of a SAS dataset.

titanic=pd.read_csv('titanic_train.csv')

2. One of the first tasks after importing data is usually to look at summary statistics of numerical fields.
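
The Python side of the import step described above, followed by a first look at summary statistics, can be sketched roughly as follows. This is only an illustration under stated assumptions: the rows are hypothetical stand-ins held in memory (not records from the Kaggle file), the column names are a small assumed subset, and describe() is just one pandas way to summarize numerical fields.

```python
import io

import pandas as pd

# Hypothetical stand-in rows for illustration only; these are NOT records
# from the Kaggle Titanic file, and the columns are an assumed subset.
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Age,Fare\n"
    "1,0,3,22.0,7.25\n"
    "2,1,1,38.0,71.28\n"
    "3,1,3,26.0,7.93\n"
    "4,0,2,,13.00\n"
)

# read_csv() accepts a file path or a file-like object. With the real file
# on disk this would be pd.read_csv('titanic_train.csv').
titanic = pd.read_csv(sample_csv)

# describe() returns count, mean, std, min, quartiles and max
# for each numeric column; missing values are excluded from the count.
summary = titanic.describe()

print(titanic.shape)
print(summary)
```

With the real dataset, only the argument to read_csv() would change; the empty Age value in the last sample row shows that missing entries are read as NaN and left out of the statistics.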