SQL代写|数据库代写

IEOR 4720 SQL and Databases

Part 1 (10 points)

This first part is about riders (clients) signup.

Given the below subset of Uber’s schema (the 3 tables below), write executable SQL queries to answer the questions. Please answer in a single query script for each question and you can use at most 1 temporary table (a CTE with a WITH statement or a nested query).

A. (5 pts) For each of the cities 'Qarth' and 'Meereen', calculate the average difference between Actual and Predicted ETA (estimated time of arrival) for all completed trips within the last 90 days.

B. (5 pts) A rider’s signup is defined as an event labeled ‘sign_up_success’ within the events table. Restrict the attention to signups in the first 7 days of 2024. For each city (‘Qarth’ and ‘Meereen’) and for each day in which there was a signup, determine the percentage of signups that resulted in a completed trip within 168 hours of the signup.


Data is not provided for this exercise, but you can generate fake data to test your queries.

Submission instructions: You will copy/paste your queries into Courseworks.

Table name: trips

Column name:

Datatype:

id

integer

client_id

Integer (Foreign keyed to events.rider_id)

driver_id

integer

city_id

Integer (Foreign keyed to cities.city_id)

client_rating

integer

driver_rating

integer

request_at

Timestamp

predicted_eta

Integer (in minutes)

actual_eta

Integer (in minutes)

status

String (can be ‘completed’, canceled’)



Table name:
cities

Column name:

Datatype:

city_id

integer

city_name

string

 

 

Table name: events

Column name:

Datatype:

device_id

integer

rider_id

integer

city_id

Integer

event_name

String (can be: ‘sign_up_success’, ‘attempted_sign_up’, ‘sign_up_failure’)

_ts

Timestamp of the event

 

 

Part 2 (20 points)

This second part is not related to the first part. In this second part we will study drivers signup.
Uber’s goal is to make sure that the drivers that start the signup process end up taking their first trip on the platform.

 

A snapshot of data from the  driver_signup table is available in the driver_signup.csv file (on Courseworks).

 

An Uber product manager needs help answering a few (intentionally ambiguous) questions:

1. Are there any issues with the data? Feel free to make assumptions to perform your analysis.

2. What fraction of the drivers that sign up also take a first trip? How long does it take them to take a first trip?

3. What makes a driver more likely to start driving?

4. How would you define the most important performance metric that our cross-functional team should track using this table? What are one or two additional performance metrics that we should track?

5. Do you have any ideas for our app to improve the performance metrics that you picked?

 

Submission instructions:

Please report the results of your analysis in a Google Slides deck of at most 12 pages (title page included). Optional: you can add additional pages or code in an appendix section of your deck (they won’t be graded). When done working on your slides deck, you will click on Share (top right) and change the General Access setting to LionMail (or Public or columbia.edu). You will then copy the url link to your deck and paste it in the Courseworks submission field.

 

Your analysis will be evaluated based on its content and the criteria for building an effective presentation, discussed throughout the course. The analysis can be a simple exploratory analysis (fancy statistical models are not necessary).

 


Table name: driver_signup

Column name:

Datatype:

id

integer

city_id

Integer (where did the driver sign up)

signup_os

String (can be: “android”, “ios”, “website”, …)

signup_channel

string (Can be: “offline”, “paid”, “organic”, “referral”)

signup_timestamp

Timestamp (timestamp of account creation)

bgc_date

Date (of background consent)

vehicle_added_date

Date when driver uploaded vehicle info

first_trip_date

Date of first trip as a driver

vehicle_make

String (make of vehicle uploaded)

vehicle_model

String (model of vehicle uploaded)

vehicle_year

Year that the car was made