Scenario:
We have flight data that has Flight Time upto Minute level. We also have weather data that has hourly weather parameters.
Requirement:
We need weather data for flight time with most nearer slots from weather data. For eg; if the flight time is between "1.30 - 2.30" it should come under 2'o clk slot.
Sample Data
Sample data (Flight data and weather data) is attached herewith.
Weather data
Flight data
Learning from the exercise
Approach 1
Step:1 To Achieve this, we need to separate the hours from the date field from both the datasets.
Considering the Weather dataset;
1. To Copy the Date column,
Right click on the Date column and Copy -> Column. Copy of new Date column will be generated.
2. Right click on the new Date column -> Transform -> More
3. Choose hour(date) from transform operation, Select the date field in the Argument and click OK.
This will create the hour values in the new column.
Transform Date to extract hours
Step:2 In Flight dataset, let us consider Dep_Time to separate the hours. By default, it has hours and minutes combined data. So we need to split them to get the departure hour.
1. Right click on the DepTime and Split
2. Split by length as 2 from Right and click OK
Hours and Mins columns will be created.
Split the Hours and Minutes from Flight Dataset
Step:3 After creating Hours column (DEP_TIME_hours) in the Flight dataset, JOIN with Weather dataset
1.Click Add Dataset and choose Weather dataset
2.Click JOIN (Equi JOIN) and choose columns as FL_DATE = Date, ORIGIN = SiteId, DEP_TIME_hours = Date_1 and click OK.
The weather data and flight data will be displayed in the final dataset.
Joining Condition to combine Weather and Flight Data
Step:4 Create a new column in the final dataset for creating the Departure Hours slot as below.
1.Right Click -> Add column -> Custom
2.Give the expression as
”ifCase( Weather_Hours == 23 && Flight_Dep_Hours == 23 && Flight_Dep_Minutes >= 30, Flight_Dep_Hours,ifCase( Flight_Dep_Minutes >= 30, Flight_Dep_Hours + 1,Flight_Dep_Hours))“ and click OK
The hour slot will be created as below.
This will be created in such a way that if the minutes is greater than 30, it will be considered in the next hour slot. If not in the same hour slot.
Custom column expression
Approach 2
Step:1 We need to separate the hours from the date field from both the datasets.
Considering the Weather dataset;
1. To Copy the Date column,
Right click on the Date column and Copy -> Column. Copy of new Date column will be generated.
2. Right click on the new Date column -> Transform -> More
3. Choose hour(date) from transform operation -> Select the date field in the Argument and click OK.
This will create the hour values in the new column.
Get hour from transform operation
Step:2 Since flight dataset has zero in the time part, we need to make the time part zero in the weather dataset too for joining.
1.Right click on the Date -> Transform -> Make timepart zero
Date will be generated without timestamp.
Make timepart zero
Step:3 In flight dataset, let us consider Dep_Time to extract the hours. By default, it has hours and minutes combined data. So we need to split them to get the departure hour.
1. Right click on the DepTime and Split
2. Split by length as 2 from Right and click OK
3. After this, Change the 24 hour to 0
Right click -> Unique values -> Edit 24 to 0 and click OK.
Split the Hours and Minutes from Flight Dataset
Change 24 hour to 0 from unique value edit
Step:4 Here (Flight dataset) we need to convert DEP_TIME_MINS to hours to append with DEP_TIME_HOUR.
1.Right click on DEP_TIME_MIN -> Transform.
2.Choose divide() as Operation and divide by value 60 -> click OK. This will give the mins data in hours.
3.Right click on the DEP_TIME_MIN again -> Transform
4.Choose the plus() Operation and add with DEP_TIME_HH. This is to add the mins (in hours) to the actual hours column.
5.Right click on DEP_TIME_MIN - > Transform -> Datatype (This is to convert the datatype to integer)
6.Choose INT as datatype and click OK
7.Right click on DEP_TIME_HH -> Transform to round the DEP_TIME_HH column.
8.Choose round() as Operation and click OK
Convert minutes to hour
Add minute column to hours column
Round hour value
Step:5 In Flight Dataset, need to edit the column in such a way that if the hour is 24, the date should be changed to the next day and change 24 to 0 as weather data has 0 value for the next day morning data.
1.Right click on the DEP_TIME_HH and Add column (FinalDate)
2.Give expression as “ifCase(DEP_TIME_HH == “24”, dateAdd(“d”, 1, FL_DATE), FL_DATE)”
3.Right click on the DEP_TIME_HH -> Unique values
4.Replace 24 to 0 and click OK
Change to next date if hour is 24
Change 24 hour to 0
Step:6 JOIN with Weather dataset
1.Click Add Dataset and choose Weather dataset
2.Click JOIN (Equi JOIN) and choose columns as FinalDate = Date, DEP_TIME_HH = HH, DEST = SiteId and click OK.
The date will be displayed as per the hour slots.
Join operation
Approach 3
Step:1 We will need Hour column in Weather dataset as well as Flight dataset.
In Weather dataset it can be done as given below;
1.Right click on Date column and go to Add column > Hour
Get hour from Add column operation
Step:2 We have to make the timepart 0 of Date column in Weather dataset as flight dataset Date column has timepart 0. This can be done as given below:
1) Right click on Date column and then go to Transform > Make Timepart zero.
Make timepart 0 in Date column in Weather dataset
Step:3 We will get Hour column in Flight dataset as given below:
1. Right click on DEP_TIME Column and go to Split > Split Column.
2. Select Length in Split By and also give length as 2.
3. Select Right in Split From option and hence you will get Hours and Minutes column as shown in the screenshot below.
Getting Hours using Split option
Step:4 In flight dataset, if minutes are greater than 30 then we can round off it to the next hour which can be done as given below:
1.Right click on Hours column and go to Add column and add a new column named Hours_to_be_added whose expression is given as ifCase( Minutes >=30, 1, 0)
2. Add the Hours_to_be added column to the Hours column by Right clicking on Hours column -> Transform -> More and select plus operation which will round off Hours having minutes > 30.
Hour to be added column expression
Adding Hours_to_be_added and Hours column
Step:5 If Hours is equal to 24 then the date should be changed to next date
1) This can be done by generating a new Date column whose expression is ifCase( Hours == 24, dateAdd( "d", 1, FL_DATE ), FL_DATE ).
NewDate Column expression
Step:6 Convert Hour 24 into Hour 0 as 12:00 AM cannot be represented as Hour 24 and Hour 0.
1. Right click on Hours Column and go to Transform > More.
2. Select mod operation and select Static in argument 2 and write 24. This will convert the 24th Hour to 0th Hour.
Converting Hour 24 into Hour 0
Step:7 Join the Flight Data and Weather Data
1. Open the Flight data and add the dataset Weather Data using Add Datasets option.
2. Perform Equi Join operation on Hours, Origin and NewDate columns of Flight dataset with Date_Hour, SiteId and Date columns of Weather dataset.
Join operation
Note: This article is based on Smarten Version 5.x. This may or may not be relevant to the Smarten version you may be using.
Was this article helpful?
That’s Great!
Thank you for your feedback
Sorry! We couldn't be helpful
Thank you for your feedback
Feedback sent
We appreciate your effort and will try to fix the article