数据仓库代写|BI代写

莫那什大学

FIT3003 Business intelligence and data warehousing

General Information and Submission

o This is an individual assignment.

o Submission method: Submission is online through Moodle.

o Penalty for late submission: 5% deduction for each day.

o Assignment FAQ: There is an Assignment Frequently Asked Questions page set up for the Assignment 2 on EdStem Forum.


Problem Description

M-Stay is a residential service that offers homestay and rental services to Monash students and staff around Melbourne. The company has an existing operational database that maintains and stores all of the business transactions information (e.g. properties, hosts, listings, booking, etc.) required for the management's daily operation. As the business grows, M-Stay has decided to build a Data Warehouse to improve their analysis and work efficiency. However, since the staff at M-Stay have limited Business Intelligence and Data Warehouse knowledge, they have decided to hire you to design, develop and quickly generate BI reports from a Data Warehouse.


The operational database tables can be found at the MStay account. You can, for example, execute the following query:

select * from MStay.<table_name>;


A. Transformation Stage

The first stage of this assignment is divided into TWO main tasks: 

1. Design a data warehouse for the above M-Stay database.

You are required to create a data warehouse for the M-Stay database. The management is especially interested in the following indicators:

● Number of reviews

● Number of listings

● Average booking cost

● Listing type

● Listing time [Month, Year]

● Listing season

● Listing maximum stay duration [short-term: less than 14 nights, medium-term: 14 to 30 nights, long-term: more than 30 nights]

 (find appropriate fact measures that can calculate the average booking cost)

 

The following is a list of dimension attributes that you should include in your data warehouse:

● Listing price range [low: less than $100, medium: $100 to $200, high: more than $200]

● Channels

● Booking duration [short-term: less than 30 nights, medium-term: 30 to 90 nights, long-term: more than 90 nights]

● Review time [Month, Year]

● Booking cost range [low: less than $5000, medium: $5000 to $10000, high: more than $10000]


For the attribute, ensure that it meets the requirements of the range or group specified in your submission, if required in the specification.


Preparation stage.

Before you start designing the data warehouse, you have to ensure that you have explored the operational database and have done sufficient data cleaning. Once you have done the data cleaning process, you are required to explain what strategies you have taken to explore and clean the data.


The outputs of this task for Report are:

a) If you have done the data cleaning process, explain the strategies you used in this process (you need to show the SQL to explore the operational database and SQL of the data cleaning, as well as the screenshot of data before and after data cleaning).

- Designing the data warehouse by drawing star/snowflake schema.

Design task A:

The star schema for this data warehouse may contains multi-facts. You need to identify the fact measures, dimensions, and attributes of the star/snowflake schema. The following queries might help you to determine the fact measures and dimensions:

● How many long-term stay duration listings are listed on Facebook?

● How many listings are listed in June 2015?

● How many listings are there in summer for an “Entire home/apt” in a medium price range?

● How much is the average booking cost in March 2013?

● How many bookings were there for “Private rooms” with a short-term stay duration in 2015?

● How many high-cost bookings were made in April 2014?

● How many reviews were given in February 2016?

Note: the star schema you created in Design Task A as the highest level of aggregation

咨询 Alpha 小助手,获取更多课业帮助