Jupyter Notebook (Python) assignments

All tasks must be delivered in Jupyter Notebook format using Python (3 Jupyter Notebooks in total):

Task 1:

Using the attached file: clothing_store.csv

You are given a dataset from a customer – a clothing store chain in New England. Your task is to develop a model that will maximize profits for direct-mail marketing (i.e., a model that identifies customers who will respond to a direct-mail marketing promotion, based on information collected about the customers). However, it is not sufficient just to derive the model with the most accurate predictions. You should also consider how the problem fits into the business goal of maximizing profits.

Be creative. Think of this task as a larger project that might involve several directions and a complex path towards the final goal. Provide an in-depth solution. Simply applying a pre-built model from a library is not enough.

Support your decisions and intermediate/final solutions with performance metrics, graphs, etc. Provide comments and conclusions, and include Python code extracts and output in your final report in markdown PDF format.
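One way to connect a response model to the profit goal in Task 1 is to score mailing decisions by campaign profit rather than accuracy. The sketch below is only an illustration under stated assumptions: the response flags and model scores are made-up toy values, and `revenue_per_response` and `cost_per_mail` are hypothetical figures that would have to be estimated from clothing_store.csv and the business context.

```python
def campaign_profit(responded, scores, threshold, revenue_per_response, cost_per_mail):
    """Profit from mailing every customer whose model score is >= threshold."""
    mailed = [r for r, s in zip(responded, scores) if s >= threshold]
    # sum(mailed) counts responders among the mailed customers (flags are 0/1).
    return sum(mailed) * revenue_per_response - len(mailed) * cost_per_mail


def best_threshold(responded, scores, revenue_per_response, cost_per_mail):
    """Try every observed score as a cut-off and return (threshold, profit) with the highest profit."""
    candidates = sorted(set(scores))
    return max(
        ((t, campaign_profit(responded, scores, t, revenue_per_response, cost_per_mail))
         for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy data (assumption): 1 = customer responded, scores = predicted response probability.
responded = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]

# Hypothetical economics: 20.0 revenue per response, 3.0 cost per mailed piece.
threshold, profit = best_threshold(responded, scores, revenue_per_response=20.0, cost_per_mail=3.0)
print(threshold, profit)
```

In a real notebook the same comparison would be run on held-out data, so that the chosen cut-off is not tuned on the records used to fit the model.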

Task 2:

Using the attached file: churn.txt

You have to do both R and Python coding. Upload your report, including code, outputs, related tables and graphs with captions, discussion, and conclusions, in Jupyter notebook markdown format.

Use the churn data set. Filter out all variables except the following: VMail Plan, Intl Plan, CustServ Calls, and Churn. Set CustServ Calls to be ordinal. Allow the three predictors to be in either antecedent or consequent, but do not allow Churn to be in the antecedent.

Set the minimum antecedent support to 1% and the minimum rule confidence to 5%. Use rule confidence as your evaluation measure.

(a) Find the association rule with the greatest lift.
(b) Report the following for the rule in (a): Number of instances, Support %, Confidence %, Lift.
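The rule constraints in Task 2 can be sketched in plain Python as a brute-force enumeration. This is a minimal illustration, not a replacement for a rule-mining library: the ten records are invented toy rows (not taken from churn.txt), each CustServ Calls count is simply treated as a discrete level, consequents are limited to a single item, and "support" is taken to mean antecedent support to match the minimum-antecedent-support setting. All of these are assumptions.

```python
from itertools import combinations, product

PREDICTORS = ["VMail Plan", "Intl Plan", "CustServ Calls"]
TARGET = "Churn"
COLUMNS = PREDICTORS + [TARGET]


def matches(record, items):
    """True if the record has every (column, value) pair in items."""
    return all(record[col] == val for col, val in items)


def mine_rules(records, min_antecedent_support=0.01, min_confidence=0.05):
    n = len(records)
    values = {col: sorted({r[col] for r in records}) for col in COLUMNS}
    rules = []
    # Antecedents are built from predictors only, so Churn never appears in one.
    for size in range(1, len(PREDICTORS) + 1):
        for cols in combinations(PREDICTORS, size):
            for vals in product(*(values[c] for c in cols)):
                antecedent = tuple(zip(cols, vals))
                n_ante = sum(matches(r, antecedent) for r in records)
                if n_ante == 0 or n_ante / n < min_antecedent_support:
                    continue
                # Consequent: one item from any column not used in the antecedent.
                for col in COLUMNS:
                    if col in cols:
                        continue
                    for val in values[col]:
                        consequent = ((col, val),)
                        n_both = sum(matches(r, antecedent + consequent) for r in records)
                        n_cons = sum(matches(r, consequent) for r in records)
                        confidence = n_both / n_ante
                        if n_both == 0 or confidence < min_confidence:
                            continue
                        rules.append({
                            "antecedent": antecedent,
                            "consequent": consequent[0],
                            "instances": n_ante,        # records matching the antecedent
                            "support": n_ante / n,      # antecedent support (assumption)
                            "confidence": confidence,
                            "lift": confidence / (n_cons / n),
                        })
    return rules


# Toy rows (assumption): VMail Plan, Intl Plan, CustServ Calls, Churn.
rows = [
    ("no", "yes", 4, True), ("no", "yes", 4, True), ("no", "no", 1, False),
    ("no", "no", 1, False), ("yes", "no", 1, False), ("yes", "no", 2, False),
    ("no", "no", 2, False), ("yes", "no", 1, False), ("no", "no", 4, True),
    ("no", "no", 2, False),
]
records = [dict(zip(COLUMNS, row)) for row in rows]

rules = mine_rules(records)
best = max(rules, key=lambda r: r["lift"])
print(best)
```

Several rules can share the same lift, so `max` reports only the first one it meets; sorting the full list by lift and inspecting the top entries is a safer way to answer (a).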

Task 3:

Using the attached file: Loans_training.csv

Perform BIRCH clustering for the Loans data set and experiment with different numbers of clusters. Make a graph of the clusters, compute the silhouette (in addition, you can make a silhouette graph) and/or other metrics for measuring cluster goodness in both R and Python, and draw conclusions. Make a final report with code, outputs, graphs, captions, and basic descriptions / conclusions in a Jupyter notebook.
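The cluster-count experiment in Task 3 could look roughly like the following scikit-learn sketch. It uses synthetic blobs as a stand-in for the numeric columns of Loans_training.csv, and the blob centers, the `threshold=0.05` setting, and the range of cluster counts are all illustrative assumptions rather than recommended values.

```python
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): three well-separated 2-D groups.
X, _ = make_blobs(
    n_samples=300,
    centers=[(-5, -5), (0, 5), (5, -5)],
    cluster_std=0.5,
    random_state=0,
)

# BIRCH relies on distances, so features are put on a common scale first.
X_scaled = StandardScaler().fit_transform(X)

# Fit BIRCH for several cluster counts and record the mean silhouette of each.
scores = {}
for k in range(2, 7):
    labels = Birch(n_clusters=k, threshold=0.05).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")
print("best k by silhouette:", best_k)
```

On the real loans data the silhouette curve is unlikely to be this clear-cut, so it is worth plotting the clusters and comparing a second metric before settling on a cluster count.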
