Project IMUS

Dataset description:

What is an IMU?

An inertial measurement unit (IMU) measures and reports raw or filtered angular rate and specific force/acceleration experienced by the object it is attached to. Data outputs for an IMU are typically body-frame accelerations, angular rates, and (optionally) magnetic field measurements.

The columns of the dataset are:

Category: The category to which this IMU belongs.
Manufacturer: Name of the manufacturer.
part number: Discrete part number
sku: stock keeping unit
alt_sku: alternate stock keeping unit
Description: description of the product
Status: Whether it is in stock or not
Quantity: Quantity available
Price: Price of the product
sku_create_date: when was the product procured

Importing Related Resources

import pandas as pd
import numpy as np
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

file ='imus.xlsx'

df = pd.concat(pd.read_excel("imus.xlsx",sheet_name= None),ignore_index = True) # dedining df variable to file , merging of all sheet with concat method.

Pre-Processing

Intial Insepection

df.head()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	part_number	sku	alt_sku	description	status	price	sku_create_date
0	1539	MS-OPEN VALUE SUBSCRIPTION	021-09108	BA0011	NaN	1YR OFFICESTD ALNG LICSAPK OLV, D AP	No Stock	143.58	2020-08-17
1	1539	MS-OPEN VALUE SUBSCRIPTION	059-07014	BA0025	NaN	1YR WORD ALNG LICSAPK OLV D AP,	No Stock	56.09	2020-08-17
2	1539	MS-OPEN VALUE SUBSCRIPTION	065-06444	BA0039	NaN	1YR EXCEL ALNG LICSAPK OLV D AP,	No Stock	56.09	2020-08-17
3	1539	MS-OPEN VALUE SUBSCRIPTION	076-04355	BA0067	NaN	1YR PRJCT ALNG LICSAPK OLV D AP,	No Stock	217.60	2020-08-17
4	1582	MS-OPEN VALUE SUBSCRIPTION	077-05318	BA0080	NaN	1YR ACCESS ALNG LICSAPK OLV D, AP	No Stock	56.09	2020-08-17

print('Number of Rows and Columns in dataset',df.shape,'.')

Number of Rows and Columns in dataset (2627926, 10) .

print('Data Type of each column:')
df.dtypes

Data Type of each column:





category                    int64
manufacturer               object
part_number                object
sku                        object
alt_sku                    object
description                object
status                     object
qty                         int64
price                     float64
sku_create_date    datetime64[ns]
dtype: object

print('Four type of columns in dataset.')
df.dtypes.value_counts()

Four type of columns in dataset.





object            6
int64             2
float64           1
datetime64[ns]    1
dtype: int64

Concise Summary of a DataFrame

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2627926 entries, 0 to 2627925
Data columns (total 10 columns):
 #   Column           Dtype         
---  ------           -----         
 0   category         int64         
 1   manufacturer     object        
 2   part_number      object        
 3   sku              object        
 4   alt_sku          object        
 5   description      object        
 6   status           object        
 7   qty              int64         
 8   price            float64       
 9   sku_create_date  datetime64[ns]
dtypes: datetime64[ns](1), float64(1), int64(2), object(6)
memory usage: 200.5+ MB

Cleaning of dataset for further analysis

df = df.astype({"category":'category'}) # changing type of category.

Checking null type of data in dataframe

df.isnull().sum()

category                 0
manufacturer             0
part_number              0
sku                      0
alt_sku            2613442
description              0
status                   0
qty                      0
price                    0
sku_create_date          0
dtype: int64

lot of null value in alt_sku so better to drop it for further analysis.

Droping extra columns

df.drop(['sku','alt_sku','description'], axis=1, inplace=True)

df.head()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	part_number	status	price	sku_create_date
0	1539	MS-OPEN VALUE SUBSCRIPTION	021-09108	No Stock	143.58	2020-08-17
1	1539	MS-OPEN VALUE SUBSCRIPTION	059-07014	No Stock	56.09	2020-08-17
2	1539	MS-OPEN VALUE SUBSCRIPTION	065-06444	No Stock	56.09	2020-08-17
3	1539	MS-OPEN VALUE SUBSCRIPTION	076-04355	No Stock	217.60	2020-08-17
4	1582	MS-OPEN VALUE SUBSCRIPTION	077-05318	No Stock	56.09	2020-08-17

Renaming columns for convience

df = df.rename(columns={'sku_create_date': 'date', 'part_number': 'product'})

df.head()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	status	price	date
0	1539	MS-OPEN VALUE SUBSCRIPTION	021-09108	No Stock	143.58	2020-08-17
1	1539	MS-OPEN VALUE SUBSCRIPTION	059-07014	No Stock	56.09	2020-08-17
2	1539	MS-OPEN VALUE SUBSCRIPTION	065-06444	No Stock	56.09	2020-08-17
3	1539	MS-OPEN VALUE SUBSCRIPTION	076-04355	No Stock	217.60	2020-08-17
4	1582	MS-OPEN VALUE SUBSCRIPTION	077-05318	No Stock	56.09	2020-08-17

Dividing Dataset on basis of availablity of stock.

print('Stock availablity:')
df['status'].value_counts()

Stock availablity:





No Stock    1914138
In Stock     713788
Name: status, dtype: int64

df_st= df[df.status == 'In Stock']

df_nos= df[df.status == 'No Stock']

For data In Stock:

df_st.head()

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	status	qty	price	date
253	9920	CASE LOGIC-PERSONAL & PORTABLE	3200943	In Stock	7	99.00	2020-08-17
271	531	OMNITRON SYSTEMS	7206-0	In Stock	2	59.09	2020-08-17
320	9070	ZEBRA ENTERPRISE MCD-A1	SAC5070-800CR	In Stock	3	500.06	2020-08-17
324	9080	ZEBRA ENTERPRISE MCD-A1	KT-CLMPT-RS507-01R	In Stock	115	86.34	2020-08-17
326	9080	ZEBRA ENTERPRISE MCD-A1	KT-PAD-RS507-10R	In Stock	42	51.80	2020-08-17

df_st.drop(["status"], axis=1) # no need of stock column now so dropping

df_st

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	qty	price	date
253	9920	CASE LOGIC-PERSONAL & PORTABLE	3200943	7	99.00	2020-08-17
271	531	OMNITRON SYSTEMS	7206-0	2	59.09	2020-08-17
320	9070	ZEBRA ENTERPRISE MCD-A1	SAC5070-800CR	3	500.06	2020-08-17
324	9080	ZEBRA ENTERPRISE MCD-A1	KT-CLMPT-RS507-01R	115	86.34	2020-08-17
326	9080	ZEBRA ENTERPRISE MCD-A1	KT-PAD-RS507-10R	42	51.80	2020-08-17
...	...	...	...	...	...	...
2627921	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28
2627922	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28
2627923	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28
2627924	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28
2627925	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28

713788 rows × 6 columns

1. Which are the top Manufacturers with price, quantity?

mfg_grp = df_st.groupby('manufacturer').sum()

mfg_sort = mfg_grp.sort_values(by=['qty', 'price'],ascending= False)[:10]

mfg_sort_in = mfg_sort.reset_index()

mfg_sort_in

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	manufacturer	qty	price
0	STARTECH.COM	16069681	7266737.29
1	DELL CSG SERVICE WARRANTIES	13578833	150917.91
2	TRIPP LITE CONNECTIVITY	8283765	768146.62
3	LOGITECH - COMPUTER ACCESSORIES	8251159	1005042.67
4	JABRA BUSINESS	4829490	392963.51
5	PLANTRONICS INC	3825984	703634.71
6	CONFIG 6	3825896	2140.78
7	BELKIN - CABLES	3508335	262931.76
8	DELL ESG WARRANTIES	3149689	173018.32
9	EPOS	3127008	1127406.70

sns.set(rc={'figure.figsize':(11.7,8.27)})
top_mfg_qty_price = sns.barplot(x = 'qty',y = 'price',hue="manufacturer",  data= mfg_sort_in).set(title='Top Manufacturers with Price, Quantity')

2. Which are the most profitable products?

df_st['Total_Price'] =df_st['qty']*df_st['price']

C:\Users\Mohit\AppData\Local\Temp/ipykernel_7272/1120783067.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_st['Total_Price'] =df_st['qty']*df_st['price']

df_st

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	qty	price	date	Total_Price
253	9920	CASE LOGIC-PERSONAL & PORTABLE	3200943	7	99.00	2020-08-17	693.00
271	531	OMNITRON SYSTEMS	7206-0	2	59.09	2020-08-17	118.18
320	9070	ZEBRA ENTERPRISE MCD-A1	SAC5070-800CR	3	500.06	2020-08-17	1500.18
324	9080	ZEBRA ENTERPRISE MCD-A1	KT-CLMPT-RS507-01R	115	86.34	2020-08-17	9929.10
326	9080	ZEBRA ENTERPRISE MCD-A1	KT-PAD-RS507-10R	42	51.80	2020-08-17	2175.60
...	...	...	...	...	...	...	...
2627921	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627922	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627923	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627924	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627925	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36

713788 rows × 7 columns

pr_tot_grp = df_st.groupby(['product','Total_Price'])

top_sort = pr_tot_grp.sum().sort_values(by='Total_Price',ascending=[False])[:10]

most_profitable_products = top_sort.reset_index()

profit_product = sns.barplot(x = 'product',y = 'Total_Price',data = most_profitable_products).set(title='most_profitables_products')

3. Which is the product where you have seen a decrease in the price?

df_st

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	qty	price	date	Total_Price
253	9920	CASE LOGIC-PERSONAL & PORTABLE	3200943	7	99.00	2020-08-17	693.00
271	531	OMNITRON SYSTEMS	7206-0	2	59.09	2020-08-17	118.18
320	9070	ZEBRA ENTERPRISE MCD-A1	SAC5070-800CR	3	500.06	2020-08-17	1500.18
324	9080	ZEBRA ENTERPRISE MCD-A1	KT-CLMPT-RS507-01R	115	86.34	2020-08-17	9929.10
326	9080	ZEBRA ENTERPRISE MCD-A1	KT-PAD-RS507-10R	42	51.80	2020-08-17	2175.60
...	...	...	...	...	...	...	...
2627921	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627922	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627923	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627924	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627925	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36

713788 rows × 7 columns

pric_01 = df_st.drop(['manufacturer', 'category','qty',"Total_Price"], axis=1)

pric_01.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 713788 entries, 253 to 2627925
Data columns (total 3 columns):
 #   Column   Non-Null Count   Dtype         
---  ------   --------------   -----         
 0   product  713788 non-null  object        
 1   price    713788 non-null  float64       
 2   date     713788 non-null  datetime64[ns]
dtypes: datetime64[ns](1), float64(1), object(1)
memory usage: 21.8+ MB

p_max = pric_01.loc[pric_01.groupby('product').date.idxmax(),:]

p_min = pric_01.loc[pric_01.groupby('product').date.idxmin(),:]

p_max['price_max'] =p_max['price']

p_min['price_min'] =p_min['price']

p_max

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	product	price	date	price_max
1728303	0-00496	59.62	2020-08-24	59.62
463475	0-00498	96.65	2020-08-17	96.65
2098347	003-000884-01-OE	398.68	2020-08-24	398.68
1438732	00385	1.80	2020-08-24	1.80
1438760	00388	2.48	2020-08-24	2.48
...	...	...	...	...
1731234	ZT62062-T01A100Z	4001.17	2020-08-24	4001.17
2356040	ZT62063-T010100Z	3797.41	2020-08-25	3797.41
1731241	ZT62063-T01A100Z	4140.69	2020-08-24	4140.69
1731246	ZT62063-T110100Z	4344.44	2020-08-24	4344.44
1741603	ZUCUBJCK	16.74	2020-08-24	16.74

30672 rows × 4 columns

p_min

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	product	price	date	price_min
463476	0-00496	59.62	2020-08-17	59.62
463475	0-00498	96.65	2020-08-17	96.65
834073	003-000884-01-OE	398.68	2020-08-17	398.68
173447	00385	1.80	2020-08-17	1.80
173475	00388	2.48	2020-08-17	2.48
...	...	...	...	...
466408	ZT62062-T01A100Z	4001.17	2020-08-17	4001.17
466414	ZT62063-T010100Z	3797.41	2020-08-17	3797.41
466415	ZT62063-T01A100Z	4140.69	2020-08-17	4140.69
466420	ZT62063-T110100Z	4344.44	2020-08-17	4344.44
476823	ZUCUBJCK	16.74	2020-08-17	16.74

30672 rows × 4 columns

p_max = p_max.drop(['price','date'], axis=1)

p_max

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	product	price_max
1728303	0-00496	59.62
463475	0-00498	96.65
2098347	003-000884-01-OE	398.68
1438732	00385	1.80
1438760	00388	2.48
...	...	...
1731234	ZT62062-T01A100Z	4001.17
2356040	ZT62063-T010100Z	3797.41
1731241	ZT62063-T01A100Z	4140.69
1731246	ZT62063-T110100Z	4344.44
1741603	ZUCUBJCK	16.74

30672 rows × 2 columns

p_min = p_min.drop(['price','date'], axis=1)

p_min

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	product	price_min
463476	0-00496	59.62
463475	0-00498	96.65
834073	003-000884-01-OE	398.68
173447	00385	1.80
173475	00388	2.48
...	...	...
466408	ZT62062-T01A100Z	4001.17
466414	ZT62063-T010100Z	3797.41
466415	ZT62063-T01A100Z	4140.69
466420	ZT62063-T110100Z	4344.44
476823	ZUCUBJCK	16.74

30672 rows × 2 columns

final = pd.merge(p_max, p_min , on='product', how='outer')

final['final_price']= final['price_max']-final['price_min']

pric_Dec = final.sort_values(by=['final_price'], ascending=True ,ignore_index=True)[0:10]

sns.barplot(y = 'product',
            x = 'final_price',
            data = pric_Dec).set(title='Product whose Price Decreased')

[Text(0.5, 1.0, 'Product whose Price Decreased')]

4. Which is the product where you have seen an increase in the price?

pric_Inc =final.sort_values(by=['final_price'], ascending=False ,ignore_index=True)[0:10]

bar = sns.barplot(y = 'product',
            x = 'final_price',
            data = pric_Inc).set(title='Product whose Price Increased')

5. Which is the top category by price and quantity?

cat_grp = df_st.groupby('category').sum()

cat_grp.drop(['Total_Price'], axis=1, inplace=True)

cat_grp_srt_q_p = cat_grp.sort_values(by=['qty', 'price'],ascending= False)[:10]

cat_grp_srt_q_p_ind = cat_grp_srt_q_p.reset_index()

cat_grp_srt_q_p_ind

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	qty	price
0	1221	17378795	346527.07
1	2220	12851216	1741955.11
2	9970	8618185	1389155.66
3	2240	8264370	1520556.20
4	3350	7068656	580035.13
5	9995	5015339	970241.61
6	303	4902118	4544824.27
7	2201	4715114	558368.49
8	1251	4347126	13541.88
9	101	3877634	11079362.36

6. Which products are aging?

df_st

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	qty	price	date	Total_Price
253	9920	CASE LOGIC-PERSONAL & PORTABLE	3200943	7	99.00	2020-08-17	693.00
271	531	OMNITRON SYSTEMS	7206-0	2	59.09	2020-08-17	118.18
320	9070	ZEBRA ENTERPRISE MCD-A1	SAC5070-800CR	3	500.06	2020-08-17	1500.18
324	9080	ZEBRA ENTERPRISE MCD-A1	KT-CLMPT-RS507-01R	115	86.34	2020-08-17	9929.10
326	9080	ZEBRA ENTERPRISE MCD-A1	KT-PAD-RS507-10R	42	51.80	2020-08-17	2175.60
...	...	...	...	...	...	...	...
2627921	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627922	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627923	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627924	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36
2627925	9945	VERBATIM CORPORATION	93975	12	50.53	2020-08-28	606.36

713788 rows × 7 columns

qty_01 = df_st.drop(['manufacturer', 'category','price','Total_Price'], axis=1)

q_max = qty_01.loc[qty_01.groupby('qty').date.idxmax(),:]

qty_Inc =q_max.sort_values(by=['qty'], ascending=False ,ignore_index=True)[0:10]

sns.barplot(x = 'product',
            y = 'qty',
            data =qty_Inc ).set(title='Aged Product')

[Text(0.5, 1.0, 'Aged Product')]

7. How to Decrease Product aging?

For Data Not In Stock:

df_nos

df_nos.drop(["status"], axis=1) # no need of stock column now so dropping

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	qty	price	date
0	1539	MS-OPEN VALUE SUBSCRIPTION	021-09108	0	143.58	2020-08-17
1	1539	MS-OPEN VALUE SUBSCRIPTION	059-07014	0	56.09	2020-08-17
2	1539	MS-OPEN VALUE SUBSCRIPTION	065-06444	0	56.09	2020-08-17
3	1539	MS-OPEN VALUE SUBSCRIPTION	076-04355	0	217.60	2020-08-17
4	1582	MS-OPEN VALUE SUBSCRIPTION	077-05318	0	56.09	2020-08-17
...	...	...	...	...	...	...
2627418	3360	TARGUS	AMB580TT	0	19.53	2020-08-28
2627775	505	ENGENIUS	EAP1300	0	85.24	2020-08-28
2627822	9910	KINGSTON	HX-MPFS-SM	0	13.75	2020-08-28
2627853	291	KINGSTON TECHNOLOGY FLASH	SD10VG2/16GBCP	0	16.51	2020-08-28
2627857	291	KINGSTON	IKD300S/4GB	0	52.83	2020-08-28

1914138 rows × 6 columns

1. Which are the top Manufacturers with price, quantity?

mfg_grp_n = df_nos.groupby('manufacturer').sum()

mfg_sort = mfg_grp_n.sort_values(by=['qty', 'price'],ascending= False)[:10]

mfg_sort_in_n = mfg_sort.reset_index()

mfg_sort_in_n

sns.set(rc={'figure.figsize':(11.7,8.27)})
top_mfg_qty_price_n = sns.barplot(x = 'qty',y = 'price',hue="manufacturer",  data= mfg_sort_in_n).set(title='Top Manufacturers with Price, Quantity')

2. Which are the most profitable products?

pr_tot_grp_n = df_nos.groupby(['product','price'])

top_sort_n = pr_tot_grp_n.sum().sort_values(by='price',ascending=[False])[:10]

most_profitable_products_n = top_sort_n.reset_index()

profit_product_n = sns.barplot(x = 'product',y = 'price',data = most_profitable_products_n).set(title='most_profitables_products')

df_nos

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	manufacturer	product	status	qty	price	date
0	1539	MS-OPEN VALUE SUBSCRIPTION	021-09108	No Stock	0	143.58	2020-08-17
1	1539	MS-OPEN VALUE SUBSCRIPTION	059-07014	No Stock	0	56.09	2020-08-17
2	1539	MS-OPEN VALUE SUBSCRIPTION	065-06444	No Stock	0	56.09	2020-08-17
3	1539	MS-OPEN VALUE SUBSCRIPTION	076-04355	No Stock	0	217.60	2020-08-17
4	1582	MS-OPEN VALUE SUBSCRIPTION	077-05318	No Stock	0	56.09	2020-08-17
...	...	...	...	...	...	...	...
2627418	3360	TARGUS	AMB580TT	No Stock	0	19.53	2020-08-28
2627775	505	ENGENIUS	EAP1300	No Stock	0	85.24	2020-08-28
2627822	9910	KINGSTON	HX-MPFS-SM	No Stock	0	13.75	2020-08-28
2627853	291	KINGSTON TECHNOLOGY FLASH	SD10VG2/16GBCP	No Stock	0	16.51	2020-08-28
2627857	291	KINGSTON	IKD300S/4GB	No Stock	0	52.83	2020-08-28

1914138 rows × 7 columns

3. Which is the product where you have seen a decrease in the price?

df_nos

pric_02 = df_nos.drop(['manufacturer', 'category','qty'], axis=1)

p_max_n = pric_02.loc[pric_02.groupby('product').date.idxmax(),:]

p_min_n = pric_02.loc[pric_02.groupby('product').date.idxmin(),:]

p_max_n['price_max'] =p_max_n['price']

p_min_n['price_min'] =p_min_n['price'] 


p_max_n = p_max_n.drop(['price','date'], axis=1)


p_min_n = p_min_n.drop(['price','date'], axis=1)



final_n = pd.merge(p_max_n, p_min_n , on='product', how='outer')

final_n['final_price']= final_n['price_max']-final_n['price_min']

pric_Dec_n = final_n.sort_values(by=['final_price'], ascending=True ,ignore_index=True)[0:10]

sns.barplot(y = 'product',
            x = 'final_price',
            data = pric_Dec_n).set(title='Product whose Price Decreased')

[Text(0.5, 1.0, 'Product whose Price Decreased')]

4. Which is the product where you have seen an increase in the price?

pric_Inc_n =final_n.sort_values(by=['final_price'], ascending=False ,ignore_index=True)[0:10]

bar = sns.barplot(y = 'product',
            x = 'final_price',
            data = pric_Inc_n).set(title='Product whose Price Increased')

5. Which is the top category by price and quantity?

cat_grp_n = df_nos.groupby('category').sum()


cat_grp_srt_q_p_n = cat_grp_n.sort_values(by=['qty', 'price'],ascending= False)[:10]

cat_grp_srt_q_p_ind_n = cat_grp_srt_q_p_n.reset_index()

cat_grp_srt_q_p_ind_n

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

</style>

	category	price
0	1221	1.539977e+09
1	1241	6.036206e+08
2	1231	4.445039e+08
3	1569	3.213863e+08
4	525	2.871297e+08
5	1577	2.064920e+08
6	531	1.861489e+08
7	661	1.643297e+08
8	101	1.282264e+08
9	1582	1.120179e+08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

IMUS_Project.md

IMUS_Project.md

Project IMUS

Dataset description:

What is an IMU?

The columns of the dataset are:

Importing Related Resources

Pre-Processing

Intial Insepection

Concise Summary of a DataFrame

Cleaning of dataset for further analysis

Checking null type of data in dataframe

Droping extra columns

Renaming columns for convience

Dividing Dataset on basis of availablity of stock.

For data In Stock:

1. Which are the top Manufacturers with price, quantity?

2. Which are the most profitable products?

3. Which is the product where you have seen a decrease in the price?

4. Which is the product where you have seen an increase in the price?

5. Which is the top category by price and quantity?

6. Which products are aging?

7. How to Decrease Product aging?

For Data Not In Stock:

1. Which are the top Manufacturers with price, quantity?

2. Which are the most profitable products?

3. Which is the product where you have seen a decrease in the price?

4. Which is the product where you have seen an increase in the price?

5. Which is the top category by price and quantity?

Files

IMUS_Project.md

Latest commit

History

IMUS_Project.md

File metadata and controls

Project IMUS

Dataset description:

What is an IMU?

The columns of the dataset are:

Importing Related Resources

Pre-Processing

Intial Insepection

Concise Summary of a DataFrame

Cleaning of dataset for further analysis

Checking null type of data in dataframe

Droping extra columns

Renaming columns for convience

Dividing Dataset on basis of availablity of stock.

For data In Stock:

1. Which are the top Manufacturers with price, quantity?

2. Which are the most profitable products?

3. Which is the product where you have seen a decrease in the price?

4. Which is the product where you have seen an increase in the price?

5. Which is the top category by price and quantity?

6. Which products are aging?

7. How to Decrease Product aging?

For Data Not In Stock:

1. Which are the top Manufacturers with price, quantity?

2. Which are the most profitable products?

3. Which is the product where you have seen a decrease in the price?

4. Which is the product where you have seen an increase in the price?

5. Which is the top category by price and quantity?