As part of my New Year’s goals, I decided to share more of my work on data analytics and programming.
In my former role, I used SQL, R and Python much more than SAS. When you work on projects in teams, it is hard to use a program that few of your team members are familiar with, if you know what I mean. I recently took the SAS programming course on Coursera to brush up on my knowledge and just to learn, and I completed the two case studies that are part of the course. The complete course has 3 modules:
1. Getting Started with SAS Programming. A 6-week module
2. Doing More with SAS Programming. A 7-week module
3. Practical SAS Programming and Certification Review. A 6-week module
Upon completion of all 3 modules, you will be prepared to take the SAS Base Certification exam. For a complete beginner, finishing all 3 takes 6 to 12 months at 3 hours a day, but I bet that if you commit yourself it is possible in 7 days.
I knocked the first and third ones out in a single weekend. This was because I didn’t want to pay the $50/mo after my 7-day trial ended, and also because I already have a data analytics background.
The course was surprisingly well structured. Stacy and her team of instructors were great. The delivery was comprehensive, the programming activities and quizzes were pretty straightforward, and the instructions were easy to follow. I would totally recommend this course to anybody.
The Case Studies
The case studies were great. They provided real-life experience working with data. The 3rd week of “Practical SAS Programming and Certification Review” included:
- The TSA Travel Case Study
The TSA Claims 2002 to 2017 CSV file has 14 columns and over 220,000 rows. The Claim_Number column has a number for each claim. Some claims share a claim number but carry different information; those claims are considered valid for this case study. The Incident_Date and Date_Received columns hold the date the incident occurred and the date the claim was filed. Claim_Type holds the type of claim; there are 14 valid claim types. The Claim_Site column records where the claim occurred; there are 8 valid values for claim site. The Disposition column holds the final settlement of the claim, and the Close_Amount column holds the dollar amount of the settlement. The Item_Category column holds the type of items in the claim; the values in this column vary by year. The Airport_Code and Airport_Name columns hold the code and name of the airport where the incident occurred. The County, City, State, and Statename columns give the location of the airport. The State column has a two-letter state code and Statename has the full state name.
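The duplicate rule above can be illustrated with a small Python sketch (not the SAS code from the course). It assumes that only rows identical in every column should be dropped, while rows that merely share a Claim_Number are kept. The function name and the sample values are made up for illustration.

```python
def drop_exact_duplicates(rows):
    """Keep the first copy of each row; rows count as duplicates
    only when every column matches (an assumption for this sketch)."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up sample: (Claim_Number, Claim_Type, Claim_Site)
claims = [
    ("1001", "type A", "site X"),
    ("1001", "type A", "site X"),  # exact copy, dropped
    ("1001", "type B", "site Y"),  # same number, different info, kept
]

for row in drop_exact_duplicates(claims):
    print(row)
```

Comparing whole rows rather than just the claim number is what lets the third sample row survive.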
- The World Tourism Case Study:
The task was to create 3 tables:
1. The cleaned_tourism table
2. The final_tourism table
3. The nocountryfound table
In its original form, the tourism table consists of 23 columns and over 2,400 rows. The A column contains a numeric ID whenever a country name appears in the country column. The country column contains a variety of information: country names, the tourism type (inbound or outbound), and tourism categories such as the number of arrivals to or departures from a country, or expenditure in the country or in other countries in US dollars.
The series column contains the data collection method used by the country. Columns _1995 through _2014 contain scaled numeric data stored as text. The country column contains the information you need to properly convert this data to a numeric value. Values are US dollar amounts in millions for rows containing expenditure data, and passenger counts in thousands for arrivals and departures. For example, a scaled value of 21,719 for arrivals in thousands is converted by multiplying it by 1,000, giving 21,719,000. The country_info table contains two columns and 250 rows. The continent column contains an ID for each continent. For example, 1 is North America, 2 is South America, and so on. The case study document lists each value with the corresponding continent. The country column contains the name of each country.
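To make the scaling rule concrete, here is a minimal Python sketch (again, not the SAS code from the course). The function name, the "thousands"/"millions" labels, and the handling of commas and non-numeric text are assumptions made for illustration; the actual labels in the country column may differ.

```python
from typing import Optional

# Multipliers for the two scales described above.
MULTIPLIERS = {"thousands": 1_000, "millions": 1_000_000}


def to_number(text: str, unit: str) -> Optional[float]:
    """Convert a scaled value stored as text to its full numeric value.

    Returns None when the text is not a number (assumed here to mean
    a missing value).
    """
    cleaned = text.replace(",", "").strip()
    try:
        value = float(cleaned)
    except ValueError:
        return None
    return value * MULTIPLIERS[unit]


# The arrivals example from the text: 21,719 in thousands.
print(to_number("21,719", "thousands"))  # 21719000.0
```

Keeping the multipliers in one lookup table means the same function covers both the expenditure rows (millions) and the arrivals and departures rows (thousands).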
See the SAS code for both case studies.