    ord_no  purch_amt  ord_date    customer_id  salesman_id
0   70001   150.50     2012-10-05  3005         5002
1   70009   270.65     2012-09-10  3001         5005
2   70002   65.26      2012-10-05  3002         5001
3   70004   110.50     2012-08-17  3009         5003
4   70007   948.50     2012-09-10  3005         5002
5   70005   2400.60    2012-07-27  3007         5001
6   70008   5760.00    2012-09-10  3002         5001
7   70010   1983.43    2012-10-10  3004         5006
8   70003   2480.40    2012-10-10  3009         5003
9   70012   250.45     2012-06-27  3008         5002
10  70011   75.29      2012-08-17  3003         5007
11  70013   3045.60    2012-04-25  3002         5001

Create a pandas DataFrame to store the dataset above. Use ord_date as the row index, stored as a time series (datetime values, not character strings).

User Pokaboom
by
7.0k points

1 Answer

Final answer:

To create a pandas DataFrame from the given dataset, import the pandas library and pass the data to the pd.DataFrame constructor. Then convert the ord_date column to a datetime type and set it as the index.

Step-by-step explanation:

First import pandas and build the DataFrame from a dictionary that maps each column name to its list of values. Next, convert the ord_date column from strings to datetime values with pd.to_datetime, then make it the row index with the set_index method, which gives the DataFrame a DatetimeIndex. Here's the code:

import pandas as pd

data = {
'ord_no': [70001, 70009, 70002, 70004, 70007, 70005, 70008, 70010, 70003, 70012, 70011, 70013],
'purch_amt': [150.50, 270.65, 65.26, 110.50, 948.50, 2400.60, 5760.00, 1983.43, 2480.40, 250.45, 75.29, 3045.60],
'ord_date': ['2012-10-05', '2012-09-10', '2012-10-05', '2012-08-17', '2012-09-10', '2012-07-27', '2012-09-10', '2012-10-10', '2012-10-10', '2012-06-27', '2012-08-17', '2012-04-25'],
'customer_id': [3005, 3001, 3002, 3009, 3005, 3007, 3002, 3004, 3009, 3008, 3003, 3002],
'salesman_id': [5002, 5005, 5001, 5003, 5002, 5001, 5001, 5006, 5003, 5002, 5007, 5001]
}

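# Build the DataFrame from the dictionary of columns
df = pd.DataFrame(data)
# Parse the date strings into datetime64 values
df['ord_date'] = pd.to_datetime(df['ord_date'])
# Use the parsed dates as the row index (a DatetimeIndex)
df = df.set_index('ord_date')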
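
To confirm that the dates really ended up as a time series rather than strings, a quick check along these lines may help. This is only a sketch: it uses a three-row subset of the dataset to stay short, and the names is_datetime_index, df_sorted, and october are illustrative, not part of the original answer.

```python
import pandas as pd

# A three-row subset of the dataset above, just to keep the sketch short.
data = {
    'ord_no': [70001, 70009, 70002],
    'purch_amt': [150.50, 270.65, 65.26],
    'ord_date': ['2012-10-05', '2012-09-10', '2012-10-05'],
    'customer_id': [3005, 3001, 3002],
    'salesman_id': [5002, 5005, 5001],
}

df = pd.DataFrame(data)
df['ord_date'] = pd.to_datetime(df['ord_date'])
df = df.set_index('ord_date')

# The index should be a DatetimeIndex, not plain strings (object dtype).
is_datetime_index = isinstance(df.index, pd.DatetimeIndex)
print(is_datetime_index)

# Sort by date so that date-based selection behaves predictably.
df_sorted = df.sort_index()

# Partial-string indexing relies on the datetime index: select October 2012.
october = df_sorted.loc['2012-10']
print(october)
```

If the index had been left as strings, the isinstance check would be False and the month-level selection with '2012-10' would not match any label.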

User Sysyphus
by
8.1k points